Sex And Personality Studies In Masculinity And Femininity
Ebook, 755 pages, 6 hours


About this ebook

Lewis Madison Terman (1877-1956) was an American psychologist, noted as a pioneer in educational psychology in the early 20th century at the Stanford University School of Education.
Sex differences in personality and temperament are matters of universal human interest. Among all classes of people, from the most ignorant to the most cultivated, they provide an inexhaustible theme for light conversation and humorous comment. They have always been and perhaps always will be one of the chief concerns of novelists, dramatists, and poets. They are rapidly coming to be recognized as one of the central problems in anthropology, sociology, and psychology.
Language: English
Release date: Apr 4, 2013
ISBN: 9781447495789

    Book preview

    Sex And Personality Studies In Masculinity And Femininity - Lewis M. Terman

    CHAPTER I

    RATIONALE OF THE MASCULINITY-FEMININITY TEST

    The belief is all but universal that men and women as contrasting groups display characteristic sex differences in their behavior, and that these differences are so deep seated and pervasive as to lend distinctive character to the entire personality. That masculine and feminine types are a reality in all our highly developed cultures can hardly be questioned, although there is much difference of opinion as to the differentiae which mark them off and as to the extent to which overlapping of types occurs. It is true that many social trends, of which the recent development of psychological science is one, have operated to reduce in the minds of most people the differences that had long been supposed to separate the sexes in personality and temperament. Intelligence tests, for example, have demonstrated for all time the falsity of the once widely prevalent belief that women as a class are appreciably or at all inferior to men in the major aspects of intellect. The essential equality of the sexes has further been shown by psychometric methods to obtain also in various special fields, such as musical ability, artistic ability, mathematical ability, and even mechanical ability. The enfranchisement of women and their invasion of political, commercial, and other fields of action formerly reserved to men have afforded increasingly convincing evidence that sex differences in practical abilities are also either nonexistent or far less in magnitude than they have commonly been thought to be.

    But if there is a growing tendency to concede equality or near equality with respect to general intelligence and the majority of special talents, the belief remains that the sexes differ fundamentally in their instinctive and emotional equipment and in the sentiments, interests, attitudes, and modes of behavior which are the derivatives of such equipment. It will be recognized that these are important factors in shaping what is known as personality, hence the general acceptance of the dichotomy between the masculine and feminine personality types. The belief in the actuality of M-F types remains unshaken by the fact, abundantly attested, that observers do not agree in regard to the multitudinous attributes which are supposed to differentiate them. Although practically every attribute alleged to be characteristic of a given sex has been questioned, yet the composite pictures yielded by majority opinion stand out with considerable clearness.

    In modern Occidental cultures, at least, the typical woman is believed to differ from the typical man in the greater richness and variety of her emotional life and in the extent to which her everyday behavior is emotionally determined. In particular, she is believed to experience in greater degree than the average man the tender emotions, including sympathy, pity, and parental love; to be more given to cherishing and protective behavior of all kinds. Compared with man she is more timid and more readily overcome by fear. She is more religious and at the same time more prone to jealousy, suspicion, and injured feelings. Sexually she is by nature less promiscuous than man, is coy rather than aggressive, and her sexual feelings are less specifically localized in her body. Submissiveness, docility, inferior steadfastness of purpose, and a general lack of aggressiveness reflect her weaker conative tendencies. Her moral life is shaped less by principles than by personal relationships, but thanks to her lack of adventurousness she is much less subject than man to most types of criminal behavior. Her sentiments are more complex than man’s and dispose her personality to refinement, gentility, and preoccupation with the artistic and cultural.

    But along with the acceptance of M-F types of the sort we have sketchily delineated, there is an explicit recognition of the existence of individual variants from type: the effeminate man and the masculine woman. Grades of deviates are recognized ranging from the slightly variant to the genuine invert who is capable of romantic attachment only to members of his or her own sex, although, as we shall see later, judges rating their acquaintances on degree of masculinity and femininity of personality seldom show very close agreement. Lack of agreement in such ratings probably arises from two sources: (1) varying opinion as to what factors truly differentiate the M-F types, and (2) varying interpretations of specific kinds of observed behavior. Some raters base their estimates largely upon external factors, such as body build, features, voice, dress, and mannerisms; others, more penetrating, give larger weight to the subtle aspects of personality which manifest themselves in interests, attitudes, and thought trends.

    For many reasons, both practical and theoretical, it is highly desirable that our concepts of the M-F types existing in our present culture be made more definite and be given a more factual basis. Alleged differences between the sexes must give place to experimentally established differences. A measure is needed which can be applied to the individual and scored so as to locate the subject, with a fair degree of approximation, in terms of deviation from the mean of either sex. Range and overlap of the sexes must be more accurately determined than is possible by observational and clinical methods.

    It is to this end that the researches set forth in the present volume have been directed. As the result of empirical investigations, over a period of several years, a test of mental masculinity and femininity has been devised which is based upon actual differences between male and female groups ranging in age from early adolescence to life’s extreme. The reliability of the instrument, in the sense of consistency of its verdicts, qualifies it for the comparative study both of groups and of individuals. The test has been applied to samplings of populations of each sex differing in age, education, occupation, cultural interests, and familial background. Two groups of male and a few female homosexuals have been studied. Scores on the M-F test have been correlated with physical measurements and with rated estimates of several personality variables. The possibility of subjects faking their scores has been investigated. The mean, the range, and the overlapping of sex groups have been determined for numerous populations. The degree to which M-F traits enter into marital selection has been measured, and (by the use of a modified M-F test) spouse resemblance in M-F score has been correlated with marital happiness. The test items carrying masculine weights have been compared with those carrying feminine weights in an effort to arrive at a more exact and meaningful description of characteristic sex differences as they exist in the culture of our time and country.

    The M-F test is made up in two equivalent forms, A and B. It is entirely a pencil-and-paper test, of the questionnaire variety, composed of 910 items (Form A, 456; Form B, 454). In the interests of anonymity, as well as to insure speed in administration and scoring, all the items are responded to by checking one of four, three, or two multiple responses. The test is administered without a time limit, a single form requiring forty to fifty minutes for the majority of subjects, rarely an hour. It is hardly applicable to subjects of less than seventh-grade education and ability. The scoring is done by the use of stencils and is therefore entirely objective. Scoring by the Hollerith machine is possible but is not economical except for fairly large populations. Each response carries a weight of one and is scored either + or −, that is, masculine or feminine. Weighted scores were originally used but were discarded in favor of unit weights for reasons which will be given in Chapter IV.
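As an illustration only, the unit-weight scoring just described can be sketched in Python. The function name, the encoding of each keyed response as "+" or "-", the treatment of any other value as unscored, and the sample responses are all hypothetical; none of them is taken from the scoring stencils.

```python
from typing import Iterable


def mf_score(signs: Iterable[str]) -> int:
    """Total score under unit weights: +1 per masculine-keyed response,
    -1 per feminine-keyed response."""
    total = 0
    for sign in signs:
        if sign == "+":
            total += 1
        elif sign == "-":
            total -= 1
        # Assumption: any other value is treated as unscored and adds nothing.
    return total


# Hypothetical responses: three masculine-keyed, four feminine-keyed, one unscored.
example = ["+", "+", "-", "0", "+", "-", "-", "-"]
example_total = mf_score(example)
print(example_total)
```

With unit weights the total is simply the count of masculine-keyed responses minus the count of feminine-keyed responses, so a positive total falls on the masculine side and a negative total on the feminine side.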

    The purpose of the test has been disguised by the title, Attitude-Interest Analysis. This was made necessary by the fact that subjects who know what the test is intended to measure are able to influence their scores so greatly as to invalidate them entirely.¹

    Both forms of the test are reproduced in Appendix I, with + and − signs inserted to show how each item is scored. It will be noted that in subtests 2, 3, 5, and 6 omissions as well as actual responses are scored. There are seven subtests (or exercises) as follows:

    Examination of the seven exercises will show that with the exception of No. 6 each is composed of items of a fairly homogeneous type. Exercise 6 contains two parts, the first of which logically belongs with Exercise 5. The make-up of this exercise was influenced by space requirements in printing the blanks, and the same fact accounts for the addition to Exercise 4 of a small group of items which logically belong with Exercise 6. This does not greatly matter, as it was not the intention that each of the seven exercises should necessarily measure a unitary trait. All of them together present a wide sampling of sex differences and it is the total score with which we are chiefly concerned. The range of scores in the general population of adults is roughly as follows: for males, from +200 to −100, with a mean of +52 and S.D. of 50; for females, from +100 to −200, with a mean of −70 and S.D. of 47. The score, as we shall see, is influenced by age, intelligence, education, interests, and social background, and to such an extent that groups differing in these respects often show markedly contrasting score distributions.
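To give a rough sense of how much the two distributions overlap, the means and standard deviations quoted above can be run through a short calculation. This is a sketch only: it assumes that scores in each sex are normally distributed, which the text does not state, and the function and variable names are hypothetical.

```python
import math


def normal_cdf(x: float, mean: float, sd: float) -> float:
    """Cumulative probability of a normal distribution at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


# Means and S.D.s quoted for the general adult population.
MALE_MEAN, MALE_SD = 52.0, 50.0
FEMALE_MEAN, FEMALE_SD = -70.0, 47.0

# Assumption: normality within each sex. Fraction of males scoring
# below the female mean, and of females scoring above the male mean.
males_below_female_mean = normal_cdf(FEMALE_MEAN, MALE_MEAN, MALE_SD)
females_above_male_mean = 1.0 - normal_cdf(MALE_MEAN, FEMALE_MEAN, FEMALE_SD)

print(f"males below female mean: {males_below_female_mean:.4f}")
print(f"females above male mean: {females_above_male_mean:.4f}")
```

Under the normality assumption each fraction comes out below one percent, since the two means lie 122 points apart and each standard deviation is about 50 points. Actual score distributions may depart from normality, so these figures are illustrative rather than observed.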

    The M-F test does not represent any radical departure in principle from methods previously employed in the study of sex differences. A majority of the earlier studies also resulted in comparisons stated in quantitative terms. For example, the sexes were found to differ numerically in the number liking or playing particular games; the number liking or disliking particular colors, books, or school subjects; the number preferring this or that occupation, literary style, historical character, or ideal; or they were found to differ in the degree of introversion, dominance, inferiority feeling, conservatism, emotional stability, sense of humor, religious attitudes, or other alleged traits. The present study differs from its predecessors primarily in the fact that it represents a more systematic attempt to sample sex differences in a large variety of fields in which such differences are empirically demonstrable. The literature of the subject, both theoretical and experimental, was canvassed for clues as to the types of test responses most likely to reveal sex differences. Of the thousands of items which have been tried out only those have been retained which satisfied in some degree this criterion of discriminative capacity. The test is based, not upon some theory as to how the sexes may differ, but upon experimental findings as to how they do differ, at least in the present historical period of the Occidental culture of our own country.

    The purpose of the M-F test is to enable the clinician or other investigator to obtain a more exact and meaningful, as well as a more objective, rating of those aspects of personality in which the sexes tend to differ. More specifically, the purpose is to make possible a quantitative estimation of the amount and direction of a subject’s deviation from the mean of his or her sex, and to permit quantitative comparisons of groups differing in age, intelligence, education, interests, occupation, and cultural milieu. By comparing the responses made by groups of subjects on the several parts of the test we secure a basis for qualitative as well as quantitative studies of sex differences.

    The M-F test rests upon no assumption with reference to the causes operative in determining an individual’s score. These may be either physiological and biochemical, or psychological and cultural; or they may be the combined result of both types of influence. The aim has been merely to devise a test which would measure existing differences in mental masculinity and femininity, however caused. It is only when a test of this kind has been made available that it becomes possible to investigate with any degree of precision the influence of the numerous physical, social, and psychological factors that may affect a subject’s rating.

    At present extreme differences of opinion are to be found with reference to this question. Many if not the majority of students of homosexuality accept the theory that sexual inversion is not a product of psychological conditioning but is inborn. This seems to be also the almost universal belief of homosexuals themselves, though one would of course hesitate to give much weight to their unsupported opinion. So far as animals below man are concerned, the facts available point unmistakably to the conclusion that maleness and femaleness of behavior are biochemically determined. Transplantation of male gonadal tissue into the ovariectomized domestic fowl induces not only male secondary sexual characters but also typical male behavior. A corresponding effect in the reverse direction is produced by ovarian transplants in the castrated male fowl. Spontaneous sex reversals in pigeons and domestic fowls, ordinarily initiated by pathological conditions in the gonads, have been described in detail by Riddle, Crew, and others. In one case, reported by Crew, a hen which had been the mother of two broods of chicks lost her ovaries from a tubercular infection, developed male sexual organs, and later became the father of another brood of offspring.² The injection of the female hormone into the blood of virgin female white rats is followed by maternal behavior patterns otherwise observed only in late pregnancy and the postdelivery period, particularly nest building and retrieving of young. The impressive contrast between the personality of stallion and gelding, or of bull and steer, is familiar to every farm boy. In a majority of mammals and birds in which castrated subjects have been experimentally studied, the effect of this operation on the male is to produce a temperament (personality) more or less approximating that of the normal female. Castration of the female, on the other hand, has little effect except upon the specific patterns concerned with mating and with maternal behavior.

    Strong as the temptation is to draw inferences from lower mammals and fowls to man, it must be resisted in view of man’s enormously greater modifiability by psychological conditioning. Although early castration of the human male is well known to be followed by the development of a temperament lacking a number of the usual attributes of masculinity, we are not able to disentangle the biochemical from the psychological factors which may have combined to produce the total result.

    Our investigations offer considerable evidence of the influence of nurture on the masculinity and femininity of human personality, although we regard our results as far from conclusive. Perhaps cross-parent fixation followed by homosexuality is one of the most convincing illustrations of environmental influence, but even here biochemical abnormalities have not been ruled out as possible contributing causes of the mother-son or father-daughter attraction. In view of the scanty and conflicting data of strictly scientific nature it would seem that the only just course is to keep an open mind and to investigate every possible influence that is subject to even approximate measurement. One awaits with special interest the results of biochemical studies of homosexual subjects, particularly those which will establish the behavioral effects induced by synthetic preparations of male and female hormones. In these and other experiments it is hoped that the M-F test will prove a helpful research tool.

    In view of the myriad known physiological and biochemical differences between men and women, any degree of overlap of the sexes on a masculinity-femininity test of the type used in the investigations to be described must be regarded as psychologically and sociologically very significant. If it can be shown that despite the biological dichotomy between males and females of the genus Homo a few members of each sex rate as masculine or as feminine as the average member of the opposite sex, a heavy burden of proof devolves upon anyone who doubts the weighty influence of environment in shaping the patterns of male and female behavior. Certainly any such finding must constitute a challenge of the first order to the search for possible physiological and biochemical correlates of extreme deviation from the respective sex norms in M-F score. It is only by the discovery of such correlates that it will become possible to establish any definite limits to the effect of environmental influences. In the meantime, studies of sex differences by the use of subjective methods will remain of indeterminate value.

    At this point it may be well to give a few words of warning in regard to possible misuses of the M-F test.

    In the first place, one must bear in mind that since the test does not sample every conceivable kind of sex difference and does not sample with perfect reliability even those fields which it attempts to cover, the score it yields cannot be regarded as an adequate index of the totality of a subject’s mental masculinity and femininity. It would doubtless be possible to find enough additional valid items to make up several other tests as lengthy as the one we have devised. This would especially be the case if the test were not exclusively of the pencil-and-paper variety.

    Secondly, a more serious limitation to the present usefulness of the test lies in the fact that as yet too little is known about the behavior correlated with high and low scores. Painstaking clinical studies of large numbers of high-scoring and low-scoring subjects will be necessary to remedy this defect. Most emphatic warning is necessary against the assumption that an extremely feminine score for males or an extremely masculine score for females can serve as an adequate basis for the diagnosis of homosexuality, either overt or latent. It is true, as we shall show, that male homosexuals of the passive type as a rule earn markedly feminine scores, and that the small number of female homosexuals of the active type whom we have tested earned high masculine scores. That the converse of these rules is in accord with the facts, we have no assurance whatever; indeed, our findings indicate that probably a majority of subjects who test as variates in the direction of the opposite sex are capable of making a perfectly normal heterosexual adjustment. Mental masculinity or femininity is at most only one of a number of factors predisposing to homosexuality; one must even bear in mind the possibility that it may be a secondary rather than a primary condition, an effect rather than a cause.

    Used with suitable precaution we believe that the M-F test will be found valuable both as a clinical and as an investigational tool. The application of a single form is adequate for comparison of population groups and also for securing approximate ratings of individual subjects. When it is important that a subject’s rating be as accurate as possible, both forms of the test should be administered and the average taken of the two scores. Types of investigation in which the test should be helpful include, among others, the relationship of masculinity and femininity of temperament to body build, metabolic and other physiological factors, excess or deficiency of gonadal and other hormone stimulation, and homosexual behavior, and to such environmental influences as parent-child attachments, number and sex of siblings, sex of teachers, type of education, marital compatibility, and choice of friends or of occupations. It will be especially interesting to compare M-F differences in different cultures, and in the same culture at intervals of one or more generations. In short, the measurement of M-F differences will make it possible greatly to expand our knowledge of the causes which produce them.

    Our primary task has been to throw light on the meaning of the M-F score. For this reason we have presented only a relatively brief summary of the extensive experimental work which was necessary to bring the test to its present form, and have devoted the bulk of our volume to the relationships found to obtain between M-F score and other variables, including physique, personality traits as rated and measured, achievement, age, education, intelligence, occupational classification, interests, domestic milieu, delinquency, and homosexuality. It is hoped that the reader will thus be sufficiently impressed by the multiplicity of factors which go to determine an individual’s score and by the complexity of interaction among them. In the interest of concreteness and in order to illustrate some of the major questions that arise in score interpretation, three chapters have been devoted to case studies. We believe that many of these will be of surpassing interest to the general reader as well as to the professional student of personality and temperament.

    ¹ For data on this point see pp. 77 ff.

    ² CREW, F. A. E., Abnormal sexuality in animals. III. Sex reversal, Quart. Rev. Biol., 1927, 2, 427–441.

    CHAPTER II

    ORIGIN OF THE M-F TEST

    The idea of developing a test of masculinity and femininity first occurred to the senior author in 1922 in connection with an investigation of intellectually superior children. One division of that investigation had for its purpose comparison of gifted and unselected children with respect to their interest in, practice of, and knowledge about plays, games, and amusements. Each of 90 such activities was rated three times by each subject; first for acquaintance with it, secondly for interest in it, and thirdly for frequency with which it was practiced. There followed a list of 45 questions about experience or accomplishment in a wide variety of activities hardly to be classed as plays or games, such as Have you ever cooked a meal? Have you ever taken part in a play? etc. Finally, there were 123 information questions designed to test the subject’s actual knowledge about the plays, games, and other activities. (Examples: A singing game is follow-the-leader, London Bridge, poison; A game in which you must not smile is fruit-basket, old-witch, tin-tin.)

    The test was given to 303 boys and 251 girls of IQ 140 or above, ages 6 to 14, and to a control group of 225 unselected boys and 249 unselected girls, ages 8 to 17.¹

    Sex differences between unselected boys and girls were computed for the composite of the two ratings given by the subjects for interest in and practice of the 90 plays, games, and activities. Next a masculinity index was computed for each of the activities, based upon the sex differences found in the control group. These indices ranged from 2 to 24, those above 13 indicating greater preference by boys, those below 13 greater preference by girls. They are shown in Table 1.

    TABLE 1.—MASCULINITY AND FEMININITY INDICES OF NINETY ACTIVITIES

    By use of the above data it was possible to derive a masculinity rating of each subject based upon the score weights given to masculine and feminine activities. The distributions of these masculinity ratings are given in Table 2 separately by sex, for the gifted and control groups.

    A certain amount of historic interest attaches to the data presented in Table 2. Here, for the first time, we have distributions by sex of masculinity ratings based upon an extensive sampling of behavior responses. We see that the range for each sex is very wide and that there is a considerable amount of overlap between the sexes; that there are indeed a few members of each sex who test beyond the mean of the opposite sex. Incidentally, though the fact is not of prime interest for our present purpose, little difference in masculinity was found for the gifted and control groups at any age.

    TABLE 2.—MASCULINITY RATINGS OF GIFTED AND UNSELECTED CHILDREN*

    It is possible that the masculinity test would have been allowed to rest at this point but for facts which came to light in a comparison of scores and case-history data for a number of subjects deviating greatly from their sex norm. Some of these comparisons indicated that the scores tended to be correlated with general masculinity and femininity of behavior and to reveal an important line of cleavage in personality and temperament. One of the deviating cases in particular furnished considerable motivation to further experimentation in the field of M-F testing. This was the gifted boy who can be identified in Table 2 as receiving a masculinity rating not only below all the other boys, of either the gifted or the control groups, but also below that of any girl. The following account of the case offers spectacular evidence of the significance that may attach to M-F ratings of the kind in question.

    An assistant who was tabulating the masculinity scores of gifted boys noted what seemed to be an error, a score that was more feminine than any for the girls. The score was accordingly checked for error, as was also the sex classification of the subject, but no error was found. Reference to the field assistant’s case history revealed the fact that the boy in question (age nine) had become a problem to his mother because of his persistent and overpowering desire to play the role of a girl. Besides showing a distinct preference for feminine plays and games, X frequently dressed himself in girl’s clothes and dolled his face with rouge, lipstick, and other cosmetics. When he found that his feminine behavior was beginning to attract the attention and disapproval of his parents and playmates he cleverly found a way to continue it without criticism by writing little plays for neighborhood performance, each carefully provided with a feminine role for himself. A follow-up study six years later showed that the feminine inclinations of X had become more rather than less marked. For now, at the age of fifteen, one of his favorite amusements was to dress himself as a stylish young woman, apply cosmetics liberally, and walk down the street to see how many men he could lure into flirtation. It is practically certain that at this time X had no knowledge whatever about the existence of such a thing as homosexuality.

    So striking a case of inverted behavior in childhood naturally provoked speculation in regard to the course of development that would ensue. To secure the facts, however, was not easy. Letters of inquiry were addressed to the mother from time to time, but as these were couched in general terms to avoid the risk of causing offense or shock their import was not understood and they elicited little information. Finally, one of these letters which was somewhat more pointed than usual was shown to X by his mother. X understood immediately the purpose of the inquiry and informed his mother that there was something he had kept from her, namely, that his love interests were unlike those of other men in that only boys had the slightest attraction for him. He denied, however, that he had engaged in any kind of overt homosexual behavior. As a result of this confession X was asked to fill out the Attitude-Interest Analysis Test (M-F test). The score of −71 which he now earned places him near the 50th percentile for women, or more feminine than 999 men out of 1,000.

    A few weeks before this chapter was written the mother of X requested information about the possibility of normalizing her son’s emotional life by the use of testosterone, a recently developed synthetic preparation of male hormone.²

    The case we have just described raises in dramatic form the question as to the age when an individual’s M-F status becomes relatively fixed. To what extent is the adult M-F status of a subject foreshadowed in the years of middle childhood and preadolescence? Does the very feminine boy usually become a very feminine man, or is X an exception to the rule? Similarly, does the tomboy usually develop into the masculine type of woman, or is tomboyishness more commonly but a passing phase?

    The question is a very interesting one, but at present no answer can be given. It cannot be answered until a more satisfactory M-F test has been devised for use with young children. The Plays, Games, and Amusements test is both too crude and too limited in scope to afford a measure comparable in reliability and validity to the M-F test we have devised for adults. Even so, the reader will want to know to what extent this test, given in the preadolescent years, agrees with scores made by the same subjects on the M-F test several years later.

    Six years after the P. G. and A. test was administered to the gifted subjects, the M-F test described in this volume was given to 94 boys and 99 girls of the same group. The ages at the time of the earlier test averaged about 10 1/2 years and ranged from 6 to 13, giving a mean age of about 16 1/2 at the time of the M-F test.

    In the case of the boys the correlation between the two sets of scores was .30±.08, which is large enough to be statistically significant but too small to serve as a basis of prediction in the case of individual subjects. For the girls the correlation was .24. In the case of both boys and girls, however, there were striking differences in the M-F scores of subjects who scored at opposite extremes on the P. G. and A. test. For example, the 11 most masculine boys on the P. G. and A. test had a mean M-F score of +78 six years later, as compared with a mean of +37 for the 6 most feminine. The two M-F scores, +78 and +37, are separated by approximately one S.D. of the distribution of a typical male group. (See norms, Appendix I.) The 12 most masculine girls on the P. G. and A. test had a mean M-F score of −50 six years later, as compared with −106 for the 12 who had been rated most feminine by the P. G. and A. test. This difference is about 1.5 S.D. of a typical female distribution.

    On the whole, these results suggest that there is a certain amount of constancy with regard to a subject’s M-F status from middle childhood to the adult period. We are inclined to think that improved tests for the earlier years will reveal a considerable degree of constancy. The case history data presented in Chapters XIII to XV lend considerable support to this view. Because of the far-reaching effects which extreme M-F deviation may have upon an individual’s social and sexual adjustment, the problem should be thoroughly investigated. In this connection one would like to know whether the child’s progress toward the adult M-F status is related to early or late puberty, or influenced by forced association with older or younger children as in the case of pupils who are markedly accelerated or retarded in school.

    The facts that have been presented, representing, as they do, the only examples of subjects being given an objective M-F test and later followed on to the adult period, are surely challenging enough to justify all the labor that has been expended in improving the test, in securing norms for a large number of population groups, and in correlating the scores with other variables.

    ¹ Subjects in the control group could not read well enough to take the test below the age of 8 years. As the test was given only through the eighth grade, in which gifted children above the age of 14 are rarely found, the age range for the gifted group does not run as high as for the control group. The unselected children accordingly have a considerable advantage in the comparison of the entire gifted and control group; only for ages 8 to 14 inclusive are the two groups strictly comparable.

    ² This youth, now in his twenties, is well started on what promises to be a brilliant career in one of the arts. There were several factors which may have contributed to the development of his homosexual tendencies. The mother married in the late thirties a man some twenty years her senior. When X was born he was definitely marked to remain an only child, and the stage was thus set for an excessive attachment to his overcherishing mother. Masculine contacts were largely lacking, as he did not spend much time with other boys and had little in common with his elderly father. His artistic temperament and refined sensibilities may also have played a part. X is an example of the highest type of mental sexual inversion; he has high principles, is passionately devoted to his work, and seems to have rejected all overt expression of his homosexual inclinations.

    CHAPTER III

    EXTENSION OF THE M-F TEST TO NEW TECHNIQUES

    Each exercise of the M-F test in its present form required a large amount of preliminary investigation, including the examination of experimental literature in the search for clues, selection of tentative lists of items, trial for rejection of non-discriminating items, application of retained items to numerous groups of subjects, computations of reliability and of overlap of sex groups, trial of different response methods, and comparison of the merits of various scoring and weighting techniques. The reader would hardly be interested in a minute account of this exploratory work and we shall therefore summarize in a single chapter the thousand or more pages of material representing this stage of the investigation.

    It was not until the summer of 1925 that opportunity arose to provide a broader basis for the M-F test than sex differences in the field of plays, games, and amusements. Examination of the literature suggested many possible lines of extension, but consideration was given only to methods which would permit group testing, pencil-and-paper responses, and objective scoring. Within a few months experiments were begun with five techniques: word association, ink-blot association, information, interests (likes and dislikes), and tests of introvertive tendencies. Two other methods were added to the experiment in 1926, a test of emotional and ethical response and a test of opinions.

    Apart from a few minor experiments which were quickly found to be unpromising, the investigations at this stage were confined to the above seven techniques. All of these turned out favorably with the exception of the test of opinions, which netted only a small proportion of items showing reliable sex differences. Although the investigation proceeded upon several lines simultaneously, often with application of the various tests to identical groups, the account which follows will deal with the seven techniques successively.

    THE WORD ASSOCIATION TEST

    The so-called free association method employing words as stimuli has been used by numberless investigators since the work by Aschaffenburg was first published in 1889. Among these, Jastrow, Calkins, Washburn, and others had made note of sex differences revealed by the method.¹ The sex differences that had been found were usually the by-products of studies primarily designed either to establish the general laws of associative response or to bring out the influence of individual differences other than sex, including personality type, psychotic tendency, mental complexes, consciousness of guilt, etc. It was reasonable to suppose that if a list of stimulus words was chosen specifically on the basis of significant response differences yielded by male and female subjects, it would prove a valuable addition to the test of masculinity and femininity.

    Dr. J. B. Wyman, in connection with the Stanford study of gifted children,² had already demonstrated that a word association test can be constructed which is effective in discriminating between groups of subjects differing in interests, attitudes, and thought trends. Her investigation was not primarily concerned with sex comparisons, but with groups representing the extremes of intellectual interests, social interests, and activity interests. The same principle, however, was involved. If the free association technique is capable of differentiating between subjects having much or little of one of three types of interest, it should be capable of discriminating between subjects differing in mental masculinity and femininity. Dr. Wyman’s adaptation of the method to group testing, making feasible its application to large populations, was also an important consideration. Her procedure was to expose the stimulus word visually, printed in giant type on a card 3 by 12 in., and have the subjects write in a numbered space the one word it made them think of. The responses were scored by the use of weights assigned to them on the basis of the frequency difference between subjects rated high or low in a given type of interest. A majority of the stimulus words in the final form of her test yielded responses which could be weighted for two or all three of the types of interest the author was attempting to measure. Reliability coefficients found for the list of 120 stimulus words ranged from .83 to .87 for intellectual interests, from .82 to .87 for social interests, and from .48 to .87 for activity interests. After correction for attenuation the correlations of the scores with teacher ratings of the subjects for the three types of interests averaged, for six groups of subjects, .65 for intellectual interests, .50 for social interests, and .31 for activity interests.
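    The correction for attenuation mentioned above can be illustrated with a minimal sketch. It assumes the standard Spearman formula, in which an observed correlation is divided by the square root of the product of the two reliabilities; the text does not state which form Dr. Wyman applied, and the function name and the figures in the usage note are illustrative only.

```python
from math import sqrt


def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Estimate the correlation between two measures free of measurement error.

    r_xy: observed correlation between measure x and measure y
    r_xx: reliability of measure x
    r_yy: reliability of measure y
    Assumes the standard Spearman correction (an assumption, not taken
    from the source text).
    """
    if r_xx <= 0 or r_yy <= 0:
        raise ValueError("reliabilities must be positive")
    return r_xy / sqrt(r_xx * r_yy)
```

    With illustrative values, an observed correlation of .40 between a test of reliability .64 and a perfectly reliable criterion would be corrected to .40 / .80 = .50.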

    In devising the M-F association test the procedure followed was in all essential respects that used by Dr. Wyman. First a short English dictionary was scanned in the search for words which looked as though they might bring sex differences in responses if used as stimulus words in an association test. A tentative list of about 500 such words was selected by the senior author, with the assistance of Edith M. Sprague. The selection was based in part on investigational data in the field of sex differences, but to a greater extent on subjective hunches. Each word in this list was next rated by three judges for probable merit in bringing out sex differences, with the result that the list was reduced from 500 to 220 (110 in each of two forms). The 220 stimulus words were printed in large type on cards like those used by Dr. Wyman and were administered to 400 high-school and university subjects equally divided between the sexes. The words were divided into two forms, A and B, as shown on pages 20 and 21.

    A few examples of responses found to be predominantly masculine or feminine are shown on the following page; they give an idea of the possibilities of a test of this kind in the study of sex differences.

    Reliabilities were secured by application of the test to 128 boys and 134 girls in the junior high school and 87 boys and 92 girls in the senior high school. Even-numbered items were correlated with the odd-numbered and the coefficients were corrected by the Spearman-Brown formula. This was done for scores obtained by two methods: (a) by weighting responses from +6 to −6 according to the amount of sex difference shown, and (b) by weighting each response as either +1 or −1 (masculine or feminine). The two sets of scores will hereafter be designated by the terms weighted scores³ and unit scores. For single-sex groups the reliabilities (based on 220 words) ranged from .45 to .64 for weighted scores, and slightly lower for unit scores. Weighted score reliabilities for the sexes combined ranged from .60 to .81.
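    The split-half procedure just described can be sketched as follows. This is a minimal illustration, not the authors' computing routine: the function names and the data layout (one list of per-item scores for each subject) are assumptions made for the example.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_brown(r, factor=2.0):
    """Estimated reliability of a test lengthened by `factor`."""
    return factor * r / (1.0 + (factor - 1.0) * r)


def split_half_reliability(item_scores):
    """Odd-even reliability, stepped up to full length.

    item_scores: one list per subject holding that subject's score on
    each item (a hypothetical layout chosen for this sketch).
    """
    # Items in the 1st, 3rd, 5th ... positions form one half,
    # items in the 2nd, 4th, 6th ... positions form the other.
    odd_half = [sum(row[0::2]) for row in item_scores]
    even_half = [sum(row[1::2]) for row in item_scores]
    return spearman_brown(pearson_r(odd_half, even_half))
```

    As a check on the formula, a half-length correlation of .40 steps up to about .57, and one of .55 to about .71, when the length of the test is doubled.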

    An association test of the kind just described has three serious disadvantages. (1) A majority of the responses have such low frequency that very large numbers of subjects must be tested in order to secure reliable sex differences for a reasonable proportion of the responses occurring. This means that of the entire number of responses given by a subject, many will be responses which cannot be scored. This, of course, seriously reduces the reliability of the test. (2) Scoring is laborious and time consuming, as each of the responses must be looked up in a tabulated list to find the weight it carries. (3) Occasionally a subject with defective vision fails to perceive correctly the stimulus word, while others, failing to find a response to a given stimulus word in the time allowed (10 seconds), lose their place in the column of spaces and misplace succeeding responses. Such misplacement is likely to cause errors in scoring.

    Because of these defects the type of association test first used was finally abandoned in favor of one which required the subject to check that one of four given responses which seemed to go best with the stimulus word. The stimulus words were no longer presented on cards, but were printed in capital letters in a test booklet. Each stimulus word was followed by the four response words printed in lower case and smaller type. Response words were selected which had shown differences in frequency between male and female subjects, and of the four following each stimulus word two were masculine and two feminine. The masculine and feminine responses were arranged in chance order to allow for the tendency of some subjects to check more often the response standing in a particular serial order.

    An experiment was made to find out whether it is possible to make up valid items of this kind without first giving them to subjects in the form of a free association test in order to find response words that have different frequencies for males and females. A list of 51 untried stimulus words was made up, each followed by two response words selected as likely to be masculine and two as likely to be feminine. Of the 51 items devised in this way 28 proved to be usable and in no way inferior to those for which response words were selected on the basis of empirically ascertained sex differences in frequency. This method saves much time and could probably be used advantageously in adapting the multiple-response association test to purposes other than the study of sex differences.

    The multiple-response form of the association test is better adapted to group testing, requires less time for its administration, and can be scored with far greater rapidity. However, the reader will naturally raise the question whether the test thus altered does not lose its essential character as an association technique. Certainly the association is less free, since response is limited to choice among the four alternatives presented. Even if these have been selected from among responses showing high frequency by the earlier method it will nevertheless often happen that in the case of a particular subject no one of the four would have been given by the method of free association. However, the relative merits of the two methods can only be judged by their results. Our data show that by every criterion the multiple-response method is as good for the present purpose as the older method. It is fully as reliable, gives as wide separation of sex groups, and in fact correlates with scores obtained by the standard method almost as highly as the latter correlate with themselves.

    The experimental form of the multiple-response association test was composed of 171 items, 120 from the 220 of the original list and 51 items artificially constructed. It was given to 600 subjects: 100 of each sex in the seventh grade, the junior year of high school, and the university. Both weighted and unit scores, based upon the sex differences found by item tabulation, were used in computing reliabilities and sex overlap. The weighted scores showed only a slight superiority when judged by these two criteria and were discarded in favor of the simpler unit scores.
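    Unit scoring of the multiple-response form can be sketched as below. The answer key is hypothetical: the stimulus and response labels are placeholders rather than items from the test, and counting an omitted or unkeyed response as zero is an assumption of this sketch, not a rule stated in the text.

```python
# Hypothetical key: each stimulus word maps its four printed responses
# to +1 (masculine) or -1 (feminine), two of each, in mixed order.
ANSWER_KEY = {
    "STIMULUS_1": {"resp_a": +1, "resp_b": -1, "resp_c": -1, "resp_d": +1},
    "STIMULUS_2": {"resp_a": -1, "resp_b": +1, "resp_c": +1, "resp_d": -1},
    "STIMULUS_3": {"resp_a": -1, "resp_b": -1, "resp_c": +1, "resp_d": +1},
}


def unit_score(checked, key=ANSWER_KEY):
    """Sum the +1 / -1 unit weights of the responses a subject checked.

    checked: mapping of stimulus word -> the one response checked.
    Omitted items, or responses absent from the key, add 0
    (an assumption made for this sketch).
    """
    total = 0
    for stimulus, response in checked.items():
        total += key.get(stimulus, {}).get(response, 0)
    return total
```

    A subject who checks two masculine responses and one feminine response on these three placeholder items receives a unit score of +1. Because every weight is +1 or -1, scoring needs only a count of checked responses against the key, which is the practical advantage of unit over weighted scores.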

    All of the types of association tests we have used as measures of M-F differences have low reliability. It will be recalled that the free association list of 220 stimulus words used in the first experiment had a reliability of only .45 to .64 for single-sex groups, and of only .60 to .81 for the sexes combined. The reliabilities of the multiple-response association test, with 171 stimulus words, averaged .59 for single-sex groups and .78 for the sexes combined. The amount of sex overlap was close to 10 per cent for each of the three groups tested.

    For the final M-F test 120 words were selected from the 171 used in the experiment just described. These have been equally divided between Form A and Form B and are reproduced in Appendix IV. The words retained include only those from the original list which showed sex differences in the same direction for at least three out of the four multiple responses in all the groups tested. The reliabilities were then computed for the retained lists of 60 words, both by the split-half method and by the correlation of Form A against Form B. The two methods gave average reliabilities that did not differ significantly: about .40 for single-sex groups and .55 for the sexes combined. By application of the Spearman-Brown formula the reliability of the 120 words of Form A and Form B combined is estimated to be about .57 for single-sex groups and .71 for the sexes combined. These figures would have been somewhat higher if less homogeneous groups had been used, that is, if subjects from junior high school, senior high school, college, and a wide variety of adult populations had been thrown together. Even so, the reliability of a test of this type tends to
