Selected Portions of

Intelligence: Knowns and Unknowns

Report of a Task Force established by the Board of Scientific Affairs of the American Psychological Association
Released August 7, 1995

Ulric Neisser, PhD, Chair; Emory University

As presented by STALKING THE WILD TABOO

at http://www.lrainc.com/swtaboo/taboos/apa_01.html

[A slightly edited version was published in the American Psychologist, Feb 1996. Official Journal of the APA]

...

I. CONCEPTS OF INTELLIGENCE

Individuals differ from one another in their ability to understand complex ideas, to adapt effectively to the environment, to learn from experience, to engage in various forms of reasoning, to overcome obstacles by taking thought. Although these individual differences can be substantial, they are never entirely consistent: a given person's intellectual performance will vary on different occasions, in different domains, as judged by different criteria. Concepts of "intelligence" are attempts to clarify and organize this complex set of phenomena. Although considerable clarity has been achieved in some areas, no such conceptualization has yet answered all the important questions and none commands universal assent. Indeed, when two dozen prominent theorists were recently asked to define intelligence, they gave two dozen somewhat different definitions (Sternberg & Detterman, 1986). Such disagreements are not cause for dismay. Scientific research rarely begins with fully agreed definitions, though it may eventually lead to them.

This first section of our report reviews the approaches to intelligence that are currently influential, or that seem to be becoming so. Here (as in later sections) much of our discussion is devoted to the dominant psychometric approach, which has not only inspired the most research and attracted the most attention (up to this time) but is by far the most widely used in practical settings. Nevertheless, other points of view deserve serious consideration. Several current theorists argue that there are many different "intelligences" (systems of abilities), only a few of which can be captured by standard psychometric tests. Others emphasize the role of culture, both in establishing different conceptions of intelligence and in influencing the acquisition of intellectual skills. Developmental psychologists, taking yet another direction, often focus more on the processes by which all children come to think intelligently than on measuring individual differences among them. There is also a new interest in the neural and biological bases of intelligence, a field of research that seems certain to expand in the next few years.

In this brief report, we cannot do full justice to even one such approach. Rather than trying to do so, we focus here on a limited and rather specific set of questions:

What are the significant conceptualizations of intelligence at this time? (Section I)
What do intelligence test scores mean, what do they predict, and how well do they predict it? (Section II)
Why do individuals differ in intelligence, and especially in their scores on intelligence tests? Our discussion of these questions implicates both genetic factors (Section III) and environmental factors (Section IV).
Do various ethnic groups display different patterns of performance on intelligence tests, and if so what might explain those differences? (Section V)
What significant scientific issues are presently unresolved? (Section VI)

Public discussion of these issues has been especially vigorous since the 1994 publication of Herrnstein and Murray's The Bell Curve, a controversial volume which stimulated many equally controversial reviews and replies. Nevertheless, we do not directly enter that debate. Herrnstein and Murray (and many of their critics) have gone well beyond the scientific findings, making explicit recommendations on various aspects of public policy. Our concern here, however, is with science rather than policy. The charge to our Task Force was to prepare a dispassionate survey of the state of the art: to make clear what has been scientifically established, what is presently in dispute, and what is still unknown. In fulfilling that charge, the only recommendations we shall make are for further research and calmer debate.

The Psychometric Approach

Ever since Alfred Binet's great success in devising tests to distinguish mentally retarded children from those with behavior problems, psychometric instruments have played an important part in European and American life. Tests are used for many purposes, such as selection, diagnosis, and evaluation. Many of the most widely used tests are not intended to measure intelligence itself but some closely related construct: scholastic aptitude, school achievement, specific abilities, etc. Such tests are especially important for selection purposes. For preparatory school, it's the SSAT; for college, the SAT or ACT; for graduate school, the GRE; for medical school, the MCAT; for law school, the LSAT; for business school, the GMAT. Scores on intelligence-related tests matter, and the stakes can be high.

Intelligence tests. Tests of intelligence itself (in the psychometric sense) come in many forms. Some use only a single type of item or question; examples include the Peabody Picture Vocabulary Test (a measure of children's verbal intelligence) and Raven's Progressive Matrices (a nonverbal, untimed test that requires inductive reasoning about perceptual patterns). Although such instruments are useful for specific purposes, the more familiar measures of general intelligence, such as the Wechsler tests and the Stanford-Binet, include many different types of items, both verbal and nonverbal. Test-takers may be asked to give the meanings of words, to complete a series of pictures, to indicate which of several words does not belong with the others, and the like. Their performance can then be scored to yield several subscores as well as an overall score.

By convention, overall intelligence test scores are usually converted to a scale in which the mean is 100 and the standard deviation is 15. (The standard deviation is a measure of the variability of the distribution of scores.) Approximately 95% of the population has scores within two standard deviations of the mean, i.e. between 70 and 130. For historical reasons, the term "IQ" is often used to describe scores on tests of intelligence. It originally referred to an "Intelligence Quotient" that was formed by dividing a so-called mental age by a chronological age, but this procedure is no longer used.
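The scale convention just described can be illustrated with a short sketch. It assumes that scores follow a normal distribution, an idealization the report does not state explicitly, and the function names are illustrative only.

```python
import math

IQ_MEAN = 100.0  # conventional mean of the IQ scale
IQ_SD = 15.0     # conventional standard deviation

def to_iq(z):
    """Convert a standard (z) score to the conventional IQ scale."""
    return IQ_MEAN + IQ_SD * z

def normal_cdf(x, mean=IQ_MEAN, sd=IQ_SD):
    """Cumulative probability under a normal model (an assumption)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def share_between(low, high):
    """Proportion of the assumed normal population scoring between low and high."""
    return normal_cdf(high) - normal_cdf(low)

# Two standard deviations either side of the mean.
print(to_iq(-2.0), to_iq(2.0))            # 70.0 130.0
print(round(share_between(70, 130), 3))   # 0.954
```

Under this assumed model, the share of scores between 70 and 130 comes out close to the "approximately 95%" quoted above.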

Intercorrelations among Tests. Individuals rarely perform equally well on all the different kinds of items included in a test of intelligence. One person may do relatively better on verbal than on spatial items, for example, while another may show the opposite pattern. Nevertheless, subtests measuring different abilities tend to be positively correlated: people who score high on one such subtest are likely to be above average on others as well. These complex patterns of correlation can be clarified by factor analysis, but the results of such analyses are often controversial themselves. Some theorists (e.g., Spearman, 1927) have emphasized the importance of a general factor, g, which represents what all the tests have in common; others (e.g., Thurstone, 1938) focus on more specific group factors such as memory, verbal comprehension, or number facility. As we shall see in Section 2, one common view today envisages something like a hierarchy of factors with g at the apex. But there is no full agreement on what g actually means: it has been described as a mere statistical regularity (Thomson, 1939), a kind of mental energy (Spearman, 1927), a generalized abstract reasoning ability (Gustafsson, 1984), or an index measure of neural processing speed (Reed & Jensen, 1992).

There have been many disputes over the utility of IQ and g. Some theorists are critical of the entire psychometric approach (e.g., Ceci, 1990; Gardner, 1983; Gould, 1978), while others regard it as firmly established (e.g., Carroll, 1993; Eysenck, 1973; Herrnstein & Murray, 1994; Jensen, 1972). The critics do not dispute the stability of test scores, nor the fact that they predict certain forms of achievement--especially school achievement--rather effectively (see Section 2). They do argue, however, that to base a concept of intelligence on test scores alone is to ignore many important aspects of mental ability. Some of those aspects are emphasized in other approaches reviewed below.

Multiple Forms of Intelligence

Gardner's Theory. A relatively new approach is the theory of "multiple intelligences" proposed by Howard Gardner (1983). On this view, conceptions of intelligence should be informed not only by work with normal children and adults but also by studies of gifted individuals (including so-called "savants"), of persons who have suffered brain damage, of experts and virtuosos, and of individuals from diverse cultures. These considerations have led Gardner to include musical, bodily-kinesthetic, and various forms of personal intelligence as well as more familiar spatial, linguistic, and logical-mathematical abilities in the scope of his theory. He argues that psychometric tests address only linguistic and logical plus some aspects of spatial intelligence; other forms have been entirely ignored. Moreover, the paper-and-pencil format of most tests rules out many kinds of intelligent performance that matter in everyday life, such as giving an extemporaneous talk (linguistic) or being able to find one's way in a new town (spatial). While Gardner's arguments have attracted considerable interest, the stability and validity of performance tests in these new domains has yet to be conclusively demonstrated. It is also possible to doubt whether some of these abilities--"bodily-kinesthetic," for example--are appropriately described as forms of intelligence rather than as special talents.

Sternberg's Theory. Robert Sternberg's (1985) triarchic theory proposes three fundamental aspects of intelligence--analytic, creative, and practical--of which only the first is measured to any significant extent by mainstream tests. His investigations suggest the need for a balance between analytic intelligence, on the one hand, and creative and especially practical intelligence on the other. The distinction between analytic (or "academic") and practical intelligence has also been made by others (e.g., Neisser, 1976). Analytic problems, of the type suitable for test construction, tend to (a) have been formulated by other people, (b) be clearly defined, (c) come with all the information needed to solve them, (d) have only a single right answer, which can be reached by only a single method, (e) be disembodied from ordinary experience, and (f) have little or no intrinsic interest. Practical problems, in contrast, tend to (a) require problem recognition and formulation, (b) be poorly defined, (c) require information seeking, (d) have various acceptable solutions, (e) be embedded in and require prior everyday experience, and (f) require motivation and personal involvement.

As part of their study of practical intelligence, Sternberg and his collaborators have developed measures of "tacit knowledge" in various domains, especially business management. In these measures, individuals are given written scenarios of various work-related situations and then asked to rank a number of options for dealing with the situation presented. The results show that tacit knowledge predicts such criteria as job performance fairly well, even though it is relatively independent of intelligence test scores and other common selection measures (Sternberg & Wagner, 1993; Sternberg, Wagner, Williams & Horvath, in press). This work, too, has its critics (Jensen, 1993; Schmidt & Hunter, 1993).

Related Findings. Other investigators have also demonstrated the relative independence of academic and practical intelligence. Brazilian street children, for example, are quite capable of doing the math required for survival in their street business even though they have failed mathematics in school (Carraher, Carraher, and Schliemann, 1985). Similarly, women shoppers in California who had no difficulty in comparing product values at the supermarket were unable to carry out the same mathematical operations in paper-and-pencil tests (Lave, 1988). In a study of expertise in wagering on harness races, Ceci and Liker (1986) found that the skilled handicappers implicitly used a highly complex interactive model with as many as seven variables; the ability to do this successfully was unrelated to scores on intelligence tests.

Cultural Variation

It is very difficult to compare concepts of intelligence across cultures. English is not alone in having many words for different aspects of intellectual power and cognitive skill (wise, sensible, smart, bright, clever, cunning, etc.); if another language has just as many, which of them shall we say corresponds to its speakers' "concept of intelligence"? The few attempts to examine this issue directly have typically found that, even within a given society, different cognitive characteristics are emphasized from one situation to another and from one subculture to another (Serpell, 1974; Super, 1983; Wober, 1974). These differences extend not just to conceptions of intelligence but to what is considered adaptive or appropriate in a broader sense.

These issues have occasionally been addressed across sub-cultures and ethnic groups in America. In a study conducted in San Jose, California, Okagaki and Sternberg (1993) asked immigrant parents from Cambodia, Mexico, the Philippines and Vietnam, as well as native-born Anglo-Americans and Mexican-Americans, about their conceptions of child-rearing, appropriate teaching, and children's intelligence. Parents from all groups except Anglo-Americans indicated that such characteristics as motivation, social skills, and practical school skills were as important as or more important than cognitive characteristics for their conceptions of an intelligent first-grade child.

Heath (1983) found that different ethnic groups in North Carolina have different conceptions of intelligence. To be considered as intelligent or adaptive, one must excel in the skills valued by one's own group. One particularly interesting contrast was in the importance ascribed to verbal vs. nonverbal communication skills--to saying things explicitly as opposed to using and understanding gestures and facial expressions. Note that while both these forms of communicative skill have their uses, they are not equally well represented in psychometric tests.

How testing is done can have different effects in different cultural groups. This can happen for many reasons, including differential familiarity with the test materials themselves. Serpell (1979), for example, asked Zambian and English children to reproduce patterns in three media: wire models, clay models, or pencil and paper. The Zambian children excelled in the wire medium with which they were familiar, while the English children were best with pencil and paper. Both groups performed equally well with clay.

Developmental Progressions

Piaget's Theory. The best-known developmentally-based conception of intelligence is certainly that of the Swiss psychologist Jean Piaget (1972). Unlike most of the theorists considered here, Piaget had relatively little interest in individual differences. Intelligence develops in all children through the continually shifting balance between the assimilation of new information into existing cognitive structures and the accommodation of those structures themselves to the new information. To index the development of intelligence in this sense, Piaget devised methods that are rather different from conventional tests. To assess the understanding of "conservation," for example (roughly, the principle that material quantity is not affected by mere changes of shape), children who have watched water being poured from a shallow to a tall beaker may be asked if there is now more water than before. (A positive answer would suggest that the child has not yet mastered the principle of conservation.) Piaget's tasks can be modified to serve as measures of individual differences; when this is done, they correlate fairly well with standard psychometric tests (for a review see Jensen, 1980).

Vygotsky's Theory. The Russian psychologist Lev Vygotsky (1978) argued that all intellectual abilities are social in origin. Language and thought first appear in early interactions with parents, and continue to develop through contact with teachers and others. Traditional intelligence tests ignore what Vygotsky called the "zone of proximal development," i.e., the level of performance that a child might reach with appropriate help from a supportive adult. Such tests are "static," measuring only the intelligence that is already fully developed. "Dynamic" testing, in which the examiner provides guided and graded feedback, can go further to give some indication of the child's latent potential. These ideas are being developed and extended by a number of contemporary psychologists (Brown & French, 1979; Feuerstein, 1980; Pascual-Leone & Ijaz, 1989).

Biological Approaches

Some investigators have recently turned to the study of the brain as a basis for new ideas about what intelligence is and how to measure it. Many aspects of brain anatomy and physiology have been suggested as potentially relevant to intelligence: the arborization of cortical neurons (Ceci, 1990), cerebral glucose metabolism (Haier, 1993), evoked potentials (Caryl, 1994), nerve conduction velocity (Reed & Jensen, 1992), sex hormones (see Section 4), and still others (cf. Vernon, 1993). Advances in research methods, including new forms of brain imaging such as PET and MRI scans, will surely add to this list. In the not-too-distant future it may be possible to relate some aspects of test performance to specific characteristics of brain function.

This brief survey has revealed a wide range of contemporary conceptions of intelligence and of how it should be measured. The psychometric approach is the oldest and best established, but others also have much to contribute. We should be open to the possibility that our understanding of intelligence in the future will be rather different from what it is today.

II. INTELLIGENCE TESTS AND THEIR CORRELATES

...

Tests as Predictors

School Performance. Intelligence tests were originally devised by Alfred Binet to measure children's ability to succeed in school. They do in fact predict school performance fairly well: the correlation between IQ scores and grades is about .50. They also predict scores on school achievement tests, designed to measure knowledge of the curriculum. Note, however, that correlations of this magnitude account for only about 25% of the overall variance. Successful school learning depends on many personal characteristics other than intelligence, such as persistence, interest in school, and willingness to study. The encouragement for academic achievement that is received from peers, family and teachers may also be important, together with more general cultural factors (see Section 5).
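The step from a correlation of .50 to "about 25% of the overall variance" is simply the square of the correlation coefficient. A minimal sketch of that arithmetic follows, applied to correlations quoted in this section of the report; the function name is illustrative.

```python
def variance_explained(r):
    """Proportion of variance accounted for by a linear correlation r."""
    return r * r

# Correlations quoted in Section II of the report.
quoted = [
    ("IQ and school grades", 0.50),
    ("IQ and years of education", 0.55),
    ("IQ and job performance (corrected)", 0.54),
    ("upper bound for most negative outcomes", 0.20),
]

for label, r in quoted:
    print(f"{label}: r = {r:.2f}, variance explained = {variance_explained(r):.0%}")
```

Squaring is insensitive to sign, so a correlation of -.20 accounts for the same share of variance as one of +.20.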

The relationship between test scores and school performance seems to be ubiquitous. Wherever it has been studied, children with high scores on tests of intelligence tend to learn more of what is taught in school than their lower-scoring peers. There may be styles of teaching and methods of instruction that will decrease or increase this correlation, but none that consistently eliminates it has yet been found (Cronbach and Snow, 1977).

What children learn in school depends not only on their individual abilities but also on teaching practices and on what is actually taught. Recent comparisons among pupils attending school in different countries have made this especially obvious. Children in Japan and China, for example, know a great deal more math than American children even though their intelligence test scores are quite similar (see Section 5). This difference may result from many factors, including cultural attitudes toward schooling as well as the sheer amount of time devoted to the study of mathematics and how that study is organized (Stevenson & Stigler, 1992). In principle it is quite possible to improve the school learning of American children--even very substantially--without changing their intelligence test scores at all.

Years of Education. Some children stay in school longer than others; many go on to college and perhaps beyond. Two variables that can be measured as early as elementary school correlate with the total amount of education individuals will obtain: test scores and social class background. Correlations between IQ scores and total years of education are about .55, implying that differences in psychometric intelligence account for about 30% of the outcome variance. The correlations of years of education with social class background (as indexed by the occupation/education of a child's parents) are also positive, but somewhat lower.

There are a number of reasons why children with higher test scores tend to get more education. They are likely to get good grades, and to be encouraged by teachers and counselors; often they are placed in "college preparatory" classes, where they make friends who may also encourage them. In general, they are likely to find the process of education rewarding in a way that many low-scoring children do not (Rehberg and Rosenthal, 1978). These influences are not omnipotent: some high-scoring children do drop out of school. Many personal and social characteristics other than psychometric intelligence determine academic success and interest, and social privilege may also play a role. Nevertheless, test scores are the best single predictor of an individual's years of education.

In contemporary American society, the amount of schooling that adults complete is also somewhat predictive of their social status. Occupations considered high in prestige (e.g., law, medicine, even corporate business) usually require at least a college degree--16 or more years of education--as a condition of entry. It is partly because intelligence test scores predict years of education so well that they also predict occupational status, and even income to a smaller extent (Jencks, 1979). Moreover, many occupations can only be entered through professional schools which base their admissions at least partly on test scores: the MCAT, the GMAT, the LSAT, etc. Individual scores on admission-related tests such as these are certainly correlated with scores on tests of intelligence.

Social Status and Income. How well do IQ scores (which can be obtained before individuals enter the labor force) predict such outcome measures as the social status or income of adults? This question is complex, in part because another variable also predicts such outcomes: namely, the socioeconomic status (SES) of one's parents. Unsurprisingly, children of privileged families are more likely to attain high social status than those whose parents are poor and less educated. These two predictors (IQ and parental SES) are by no means independent of one another; the correlation between them is around .33 (White, 1982).

One way to look at these relationships is to begin with SES. According to Jencks (1979), measures of parental SES predict about one-third of the variance in young adults' social status and about one-fifth of the variance in their income. About half of this predictive effectiveness depends on the fact that the SES of parents also predicts children's intelligence test scores, which have their own predictive value for social outcomes; the other half comes about in other ways.

We can also begin with IQ scores, which by themselves account for about one-fourth of the social status variance and one-sixth of the income variance. Statistical controls for parental SES eliminate only about a quarter of this predictive power. One way to conceptualize this effect is by comparing the occupational status (or income) of adult brothers who grew up in the same family and hence have the same parental SES. In such cases, the brother with the higher adolescent IQ score is likely to have the higher adult social status and income (Jencks, 1979). This effect, in turn, is substantially mediated by education: the brother with the higher test scores is likely to get more schooling, and hence to be better credentialled as he enters the workplace.

Do these data imply that psychometric intelligence is a major determinant of social status or income? That depends on what one means by major. In fact, individuals who have the same test scores may differ widely in occupational status and even more widely in income. Consider for a moment the distribution of occupational status scores for all individuals in a population, and then consider the conditional distribution of such scores for just those individuals who test at some given IQ. Jencks (1979) notes that the standard deviation of the latter distribution may still be quite large; in some cases it amounts to about 88% of the standard deviation for the entire population. Viewed from this perspective, psychometric intelligence appears as only one of a great many factors that influence social outcomes.
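One way to see why the conditional spread remains so large is the textbook relation for a linear, bivariate normal model: among people with the same predictor score, the standard deviation of the outcome is sqrt(1 - r²) times its standard deviation in the whole population. That model is an assumption made here for illustration; the report does not say how Jencks arrived at his figure. The variance shares below are the ones quoted above for IQ alone.

```python
import math

def conditional_sd_ratio(variance_share):
    """Ratio of the outcome SD at a fixed predictor score to the overall SD,
    assuming a bivariate normal relationship in which the predictor
    accounts for `variance_share` (that is, r squared) of the variance."""
    if not 0.0 <= variance_share <= 1.0:
        raise ValueError("variance_share must lie between 0 and 1")
    return math.sqrt(1.0 - variance_share)

# IQ alone: about one-fourth of status variance, one-sixth of income variance.
print(round(conditional_sd_ratio(1 / 4), 2))  # 0.87
print(round(conditional_sd_ratio(1 / 6), 2))  # 0.91
```

Even a predictor that accounts for a quarter of the variance narrows the spread of outcomes by only about 13% under this model, which is in line with the figure of about 88% cited from Jencks.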

Job Performance. Scores on intelligence tests predict various measures of job performance: supervisor ratings, work samples, etc. Such correlations, which typically lie between r=.30 and r=.50, are partly restricted by the limited reliability of those measures themselves. They become higher when r is statistically corrected for this unreliability: in one survey of relevant studies (Hunter, 1983), the mean of the corrected correlations was .54. This implies that, across a wide range of occupations, intelligence test performance accounts for some 29% of the variance in job performance.
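The report does not say which correction Hunter applied. A commonly used one is Spearman's correction for attenuation, which divides the observed correlation by the square root of the product of the two measures' reliabilities. The sketch below illustrates that formula only; the observed correlation and the reliability value are hypothetical and are not taken from Hunter's survey.

```python
import math

def correct_for_attenuation(r_observed, reliability_x=1.0, reliability_y=1.0):
    """Spearman's correction: estimate the correlation that would be
    observed if both measures were perfectly reliable."""
    for rel in (reliability_x, reliability_y):
        if not 0.0 < rel <= 1.0:
            raise ValueError("reliabilities must lie in (0, 1]")
    return r_observed / math.sqrt(reliability_x * reliability_y)

r_obs = 0.40             # hypothetical observed test-performance correlation
rating_reliability = 0.60  # hypothetical reliability of supervisor ratings

corrected = correct_for_attenuation(r_obs, reliability_y=rating_reliability)
print(round(corrected, 2))  # 0.52
```

Because reliabilities cannot exceed 1, the corrected value is never smaller in magnitude than the observed one, which is why the corrected correlations run higher than the raw figures.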

Although these correlations can sometimes be modified by changing methods of training or aspects of the job itself, intelligence test scores are at least weakly related to job performance in most settings. Sometimes IQ scores are described as the "best available predictor" of that performance. It is worth noting, however, that such tests predict considerably less than half the variance of job-related measures. Other individual characteristics such as interpersonal skills, aspects of personality, etc., are probably of equal or greater importance, but at this point we do not have equally reliable instruments to measure them.

Social Outcomes. Psychometric intelligence is negatively correlated with certain socially undesirable outcomes. For example, children with high test scores are less likely than lower-scoring children to engage in juvenile crime. In one study, Moffitt, Gabrielli, Mednick & Schulsinger (1981) found a correlation of -.19 between IQ scores and number of juvenile offenses in a large Danish sample; with social class controlled, the correlation dropped to -.17. The correlations for most "negative outcome" variables are typically smaller than .20, which means that test scores are associated with less than 4% of their total variance. It is important to realize that the causal links between psychometric ability and social outcomes may be indirect. Children who are unsuccessful in--and hence alienated from--school may be more likely to engage in delinquent behaviors for that very reason, compared to other children who enjoy school and are doing well.

In summary, intelligence test scores predict a wide range of social outcomes with varying degrees of success. Correlations are highest for school achievement, where they account for about a quarter of the variance. They are somewhat lower for job performance, and very low for negatively valued outcomes such as criminality. In general, intelligence tests measure only some of the many personal characteristics that are relevant to life in contemporary America. Those characteristics are never the only influence on outcomes, though in the case of school performance they may well be the strongest.

...

III. THE GENES AND INTELLIGENCE

In this section of the report we first discuss individual differences generally, without reference to any particular trait. We then focus on intelligence, as measured by conventional IQ tests or other tests intended to measure general cognitive ability. The different and more controversial topic of group differences will be considered in Section V.

We focus here on the relative contributions of genes and environments to individual differences in particular traits. To avoid misunderstanding, it must be emphasized from the outset that gene action always involves an environment--at least a biochemical environment, and often an ecological one. (For humans, that ecology is usually interpersonal or cultural.) Thus all genetic effects on the development of observable traits are potentially modifiable by environmental input, though the practicability of making such modifications may be another matter. Conversely, all environmental effects on trait development involve the genes or structures to which the genes have contributed. Thus there is always a genetic aspect to the effects of the environment (cf. Plomin & Bergeman, 1991).

Sources of Individual Differences

Partitioning the Variation. Individuals differ from one another on a wide variety of traits: familiar examples include height, intelligence, and aspects of personality. Those differences are often of considerable social importance. Many interesting questions can be asked about their nature and origins. One such question is the extent to which they reflect differences among the genes of the individuals involved, as distinguished from differences among the environments to which those individuals have been exposed. The issue here is not whether genes and environments are both essential for the development of a given trait (this is always the case), and it is not about the genes or environment of any particular person. We are concerned only with the observed variation of the trait across individuals in a given population. A figure called the "heritability" (h²) of the trait represents the proportion of that variation that is associated with genetic differences among the individuals. The remaining variation (1 - h²) is associated with environmental differences and with errors of measurement. These proportions can be estimated by various methods described below.

Sometimes special interest attaches to those aspects of environments that family members have in common (for example, characteristics of the home). The part of the variation that derives from this source, called "shared" variation or c², can also be estimated....

A high heritability does not mean that the environment has no impact on the development of a trait, or that learning is not involved. Vocabulary size, for example, is very substantially heritable (and highly correlated with general intelligence) although every word in an individual's vocabulary is learned. In a society in which plenty of words are available in everyone's environment, especially for individuals who are motivated to seek them out, the number of words that individuals actually learn depends to a considerable extent on their genetic predispositions.

...

How Genetic Estimates are Made. Estimates of the magnitudes of these sources of individual differences are made by exploiting natural and social "experiments" that combine genotypes and environments in informative ways. Monozygotic (MZ) and dizygotic (DZ) twins, for example, can be regarded as experiments of nature. MZ twins are paired individuals of the same age growing up in the same family who have all their genes in common; DZ twins are otherwise similar pairs who have only half their genes in common. Adoptions, in contrast, are experiments of society. They allow one to compare genetically unrelated persons who are growing up in the same family as well as genetically related persons who are growing up in different families. They can also provide information about genotype-environment correlations: in ordinary families genes and environments are correlated because the same parents provide both, whereas in adoptive families one set of parents provides the genes and another the environment. An experiment involving both nature and society is the study of monozygotic twins who have been reared apart (Bouchard, Lykken, McGue, Segal & Tellegen, 1990; Pedersen, Plomin, Nesselroade & McClearn, 1992). Relationships in the families of monozygotic twins also offer unique possibilities for analysis (e.g., Rose, Harris, Christian, & Nance, 1979). Because these comparisons are subject to different sources of potential error, the results of studies involving several kinds of kinship are often analyzed together to arrive at robust overall conclusions. (For general discussions of behavior genetic methods, see Plomin, DeFries, & McClearn, 1990, or Hay, 1985.)
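The logic of the twin comparison can be sketched with the classical Falconer formulas, which are a simplification and not the model-fitting across several kinships that the studies cited here rely on. Assuming purely additive genetic effects, equally similar environments for MZ and DZ pairs, and no assortative mating (all assumptions introduced for this illustration), MZ and DZ correlations yield rough estimates of h², of the shared-environment share c², and of the remainder. The twin correlations used below are hypothetical.

```python
def falconer_estimates(r_mz, r_dz):
    """Rough variance shares from twin correlations (Falconer's formulas).

    Assumes r_mz = h2 + c2 and r_dz = h2 / 2 + c2, i.e. additive genes and
    equal shared environments for both twin types.
    Returns (h2, c2, e2); e2 includes measurement error.
    """
    h2 = 2.0 * (r_mz - r_dz)   # heritability
    c2 = 2.0 * r_dz - r_mz     # shared (between-family) environment
    e2 = 1.0 - r_mz            # non-shared environment plus error
    return h2, c2, e2

# Hypothetical correlations for twins reared together.
h2, c2, e2 = falconer_estimates(r_mz=0.86, r_dz=0.60)
print(round(h2, 2), round(c2, 2), round(e2, 2))  # 0.52 0.34 0.14
```

The three shares sum to 1 by construction. With real data the formulas can return values outside the 0 to 1 range, which is one reason analyses that combine several kinds of kinship are preferred.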

Results for IQ scores

Parameter Estimates. Across the ordinary range of environments in modern Western societies, a sizable part of the variation in intelligence test scores is associated with genetic differences among individuals. Quantitative estimates vary from one study to another, because many are based on small or selective samples. If one simply combines all available correlations in a single analysis, the heritability (h²) works out to about .50 and the between-family variance (c²) to about .25 (e.g., Chipuer, Rovine, & Plomin, 1990; Loehlin, 1989). These overall figures are misleading, however, because most of the relevant studies have been done with children. We now know that the heritability of IQ changes with age: h² goes up and c² goes down from infancy to adulthood (McCartney, Harris, & Bernieri, 1990; McGue, Bouchard, Iacono, & Lykken, 1993). In childhood h² and c² for IQ are of the order of .45 and .35; by late adolescence h² is around .75 and c² is quite low (zero in some studies). Substantial environmental variance remains, but it primarily reflects within-family rather than between-family differences.
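
The figures in this paragraph can be lined up to show what is left over once h² and c² are accounted for. The sketch below assumes, as is conventional but not stated in the report, that the shares sum to 1, so that the remainder lumps together within-family environmental variance and measurement error. The function name is hypothetical, and the adolescent c² is set to zero purely for illustration ("zero in some studies").

```python
def residual_share(h2, c2):
    """Share of variance not attributed to genes or between-family environment.

    Assumes h2 + c2 + remainder = 1; the remainder includes measurement error.
    """
    return 1.0 - h2 - c2


# (h2, c2) pairs as quoted in the text; the last c2 is an illustrative assumption.
reported = {
    "all studies pooled": (0.50, 0.25),
    "childhood": (0.45, 0.35),
    "late adolescence": (0.75, 0.00),
}

for label, (h2, c2) in reported.items():
    print(f"{label}: h2={h2:.2f} c2={c2:.2f} remainder={residual_share(h2, c2):.2f}")
```

On these numbers the remainder stays between about .20 and .25 at every age, while the between-family share shrinks from .35 to roughly zero.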

...

Implications. Estimates of h² and c² for IQ (or any other trait) are descriptive statistics for the populations studied. (In this respect they are like means and standard deviations.) They are outcome measures, summarizing the results of a great many diverse, intricate, individually variable events and processes, but they can nevertheless be quite useful. They can tell us how much of the variation in a given trait the genes and family environments explain, and changes in them place some constraints on theories of how this occurs. On the other hand they have little to say about specific mechanisms, i.e. about how genetic and environmental differences get translated into individual physiological and psychological differences. Many psychologists and neuroscientists are actively studying such processes; data on heritabilities may give them ideas about what to look for and where or when to look for it.

A common error is to assume that because something is heritable it is necessarily unchangeable. This is wrong. Heritability does not imply immutability. As previously noted, heritable traits can depend on learning, and they may be subject to other environmental effects as well.

...

IV. ENVIRONMENTAL EFFECTS ON INTELLIGENCE

The "environment" includes a wide range of influences on intelligence. Some of those variables affect whole populations, while others contribute to individual differences within a given group. Some of them are social, some are biological; at this point some are still mysterious. It may also happen that the proper interpretation of an environmental variable requires the simultaneous consideration of genetic effects. Nevertheless, a good deal of solid information is available.

Social Variables

It is obvious that the cultural environment - how people live, what they value, what they do - has a significant effect on the intellectual skills developed by individuals. Rice farmers in Liberia are good at estimating quantities of rice (Gay & Cole, 1967); children in Botswana, accustomed to storytelling, have excellent memories for stories (Dube, 1982). Both these groups were far ahead of American controls on the tasks in question. On the other hand Americans and other Westernized groups typically outperform members of traditional societies on psychometric tests, even those designed to be "culture-fair."

Cultures typically differ from one another in so many ways that particular differences can rarely be ascribed to single causes. Even comparisons between subpopulations are often difficult to interpret. If we find that groups living in different environments (e.g., middle-class and poor Americans) differ in their test scores, it is easy to suppose that the environmental difference causes the IQ difference. But there is also an opposite direction of causation: individuals may come to be in one environment or another because of differences in their own abilities, including the abilities measured by intelligence tests. Waller (1971) has shown, for example, that sons whose IQ scores are above those of their fathers also tend to achieve a higher social class status; conversely, those with scores below their fathers' tend to achieve lower status. Such an effect is not surprising, given the relation between IQ scores and years of education reviewed in Section II.

Occupation. In Section II we noted that intelligence test scores predict occupational level, not only because some occupations require more intelligence than others but also because admission to many professions depends on test scores in the first place. There can also be an effect in the opposite direction, i.e. workplaces may affect the intelligence of those who work in them. Kohn and Schooler (1973), who interviewed some 3000 men in various occupations (farmers, managers, machinists, porters...), argued that more "complex" jobs produce more "intellectual flexibility" in the individuals who hold them. Although the issue of direction of effects complicates the interpretation of their study, this remains a plausible suggestion.

Among other things, Kohn & Schooler's hypothesis may help us understand urban/rural differences. A generation ago these were substantial in the United States, averaging about six IQ points or 0.4 standard deviations (Terman & Merrill, 1937; Seashore, Wesman & Doppelt, 1950). In recent years the difference has declined to about two points (Kaufman & Doppelt, 1976; Reynolds, Chastain, Kaufman & McLean, 1987). In all likelihood this urban/rural convergence primarily reflects environmental changes: a decrease in rural isolation (due to increased travel and mass communications), an improvement in rural schools, the greater use of technology on farms. All these changes can be regarded as increasing the "complexity" of the rural environment in general or of farm work in particular. (However, processes with a genetic component, e.g., changes in the selectivity of migration from farm to city, cannot be completely excluded as contributing factors.)

Schooling. Attendance at school is both a dependent and an independent variable in relation to intelligence. On the one hand, children with higher test scores are less likely to drop out, more likely to be promoted from grade to grade and then to attend college. Thus the number of years of education that adults complete is roughly predictable from their childhood scores on intelligence tests. On the other hand schooling itself changes mental abilities, including those abilities measured on psychometric tests. This is obvious for tests like the SAT that are explicitly designed to assess school learning, but it is almost equally true of intelligence tests themselves.

The evidence for the effect of schooling on intelligence test scores takes many forms (Ceci, 1991). When children of nearly the same age go through school a year apart (because of birthday-related admission criteria), those who have been in school longer have higher mean scores. Children who attend school intermittently score below those who go regularly, and test performance tends to drop over the summer vacation. A striking demonstration of this effect appeared when the schools in one Virginia county closed for several years in the 1960s to avoid integration, leaving most Black children with no formal education at all. Compared to controls, the intelligence-test scores of these children dropped by about 0.4 standard deviations (6 points) per missed year of school (Green et al, 1964).
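
Effects in this section are quoted sometimes in IQ points and sometimes in standard deviations. The pairing "0.4 standard deviations (6 points)" implies a scale with a standard deviation of 15 points, and the small sketch below uses that value to convert between the two units. The constant and function names are hypothetical, and tests normed with a different standard deviation would need a different constant.

```python
IQ_SD = 15.0  # implied by the text's "0.4 standard deviations (6 points)"


def sd_to_points(sd_units, sd=IQ_SD):
    """Convert an effect in standard-deviation units to IQ points."""
    return sd_units * sd


def points_to_sd(points, sd=IQ_SD):
    """Convert an effect in IQ points to standard-deviation units."""
    return points / sd


print(f"0.4 SD is about {sd_to_points(0.4):.1f} points")
print(f"2 points is about {points_to_sd(2):.2f} SD")
```

On this scale the recent urban/rural difference of about two points corresponds to roughly 0.13 standard deviations.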

Schools affect intelligence in several ways, most obviously by transmitting information. The answers to questions like "Who wrote Hamlet?" and "What is the boiling point of water?" are typically learned in school, where some pupils learn them more easily and thoroughly than others. Perhaps at least as important are certain general skills and attitudes: systematic problem-solving, abstract thinking, categorization, sustained attention to material of little intrinsic interest, repeated manipulation of basic symbols and operations. There is no doubt that schools promote and permit the development of significant intellectual skills, which develop to different extents in different children. It is because tests of intelligence draw on many of those same skills that they predict school achievement as well as they do.

To achieve these results, the school experience must meet at least some minimum standard of quality. In very poor schools, children may learn so little that they fall farther behind the national IQ norms for every year of attendance. When this happens, older siblings have systematically lower scores than their younger counterparts. This pattern of scores appeared in at least one rural Georgia school system in the 1970s (Jensen, 1977). Before desegregation, it must have been characteristic of many of the schools attended by Black pupils in the South. In a study based on Black children who had moved to Philadelphia at various ages during this period, Lee (1951) found that their IQ scores went up more than half a point for each year that they were enrolled in the Philadelphia system.

Interventions. Intelligence test scores reflect a child's standing relative to others in his or her age cohort. Very poor or interrupted schooling can lower that standing substantially; are there also ways to raise it? In fact many interventions have been shown to raise test scores and mental ability "in the short run" (i.e. while the program itself was in progress), but long-run gains have proved more elusive. One noteworthy example of (at least short-run) success was the Venezuelan Intelligence Project (Herrnstein et al, 1986), in which hundreds of seventh-grade children from underprivileged backgrounds in that country were exposed to an extensive, theoretically based curriculum focused on thinking skills. The intervention produced substantial gains on a wide range of tests, but there has been no follow-up.

Children who participate in "Head Start" and similar programs are exposed to various school-related materials and experiences for one or two years. Their test scores often go up during the course of the program, but these gains fade with time. By the end of elementary school, there are usually no significant IQ or achievement-test differences between children who have been in such programs and controls who have not. There may, however, be other differences. Follow-up studies suggest that children who participated in such programs as preschoolers are less likely to be assigned to special education, less likely to be held back in grade, and more likely to finish high school than matched controls (Consortium for Longitudinal Studies, 1983; Darlington, 1986; but see Locurto, 1991).

More extensive interventions might be expected to produce larger and more lasting effects, but few such programs have been evaluated systematically. One of the more successful is the Carolina Abecedarian Project (Campbell & Ramey, 1994), which provided a group of children with enriched environments from early infancy through preschool and also maintained appropriate controls. The test scores of the enrichment-group children were already higher than those of controls at age two; they were still some five points higher at age twelve, seven years after the end of the intervention. Importantly, the enrichment group also outperformed the controls in academic achievement.

Family environment. No one doubts that normal child development requires a certain minimum level of responsible care. Severely deprived, neglectful, or abusive environments must have negative effects on a great many aspects of development, including intellectual aspects. Beyond that minimum, however, the role of family experience is now in serious dispute (Baumrind, 1993; Jackson, 1993; Scarr, 1992, 1993). Psychometric intelligence is a case in point. Do differences between children's family environments (within the normal range) produce differences in their intelligence test performance? The problem here is to disentangle causation from correlation. There is no doubt that such variables as resources of the home (Gottfried, 1984) and parents' use of language (Hart & Risley, 1992, in press) are correlated with children's IQ scores, but such correlations may be mediated by genetic as well as (or instead of) environmental factors.

...

These findings suggest that differences in the life styles of families, whatever their importance may be for many aspects of children's lives, make little long-term difference for the skills measured by intelligence tests. We should note, however, that low-income and non-white families are poorly represented in existing adoption studies as well as in most twin samples. Thus it is not yet clear whether these surprisingly small values of (adolescent) c² apply to the population as a whole. It remains possible that, across the full range of income and ethnicity, between-family differences have more lasting consequences for psychometric intelligence.

Biological Variables

Every individual has a biological as well as a social environment, one that begins in the womb and extends throughout life. Many aspects of that environment can affect intellectual development. We now know that a number of biological factors, including malnutrition, exposure to toxic substances, and various prenatal and perinatal stressors, result in lowered psychometric intelligence under at least some conditions.

Nutrition. There has been only one major study of the effects of prenatal malnutrition (i.e. malnutrition of the mother during pregnancy) on long-term intellectual development. Stein et al (1975) analyzed the test scores of Dutch 19-year-old males in relation to a wartime famine that had occurred in the winter of 1944-45, just before their birth. In this very large sample (made possible by a universal military induction requirement), exposure to the famine had no effect on adult intelligence. Note, however, that the famine itself lasted only a few months; the subjects were exposed to it prenatally but not after birth.

In contrast, prolonged malnutrition during childhood does have long-term intellectual effects. These have not been easy to establish, in part because many other unfavorable socioeconomic conditions are often associated with chronic malnutrition (Ricciuti, 1993; but cf. Sigman, 1995). In one intervention study, however, pre-schoolers in two Guatemalan villages (where undernourishment is common) were given ad lib access to a protein dietary supplement for several years. A decade later, many of these children (namely, those from the poorest socio-economic levels) scored significantly higher on school-related achievement tests than comparable controls (Pollitt et al, 1993). It is worth noting that the effects of poor nutrition on intelligence may well be indirect. Malnourished children are typically less responsive to adults, less motivated to learn, and less active in exploration than their more adequately nourished counterparts.

...

Lead. Certain toxins have well established negative effects on intelligence. Exposure to lead is one such factor. In one long-term study (McMichael et al, 1988; Baghurst et al, 1992), the blood lead levels of children growing up near a lead smelting plant were substantially and negatively correlated with intelligence test scores throughout childhood. No "threshold dose" for the effect of lead appears in such studies. Although ambient lead levels in the United States have been reduced in recent years, there is reason to believe that some American children - especially those in inner cities - may still be at risk from this source (cf. Needleman, Geiger & Frank, 1985).

Alcohol. Extensive prenatal exposure to alcohol (which occurs if the mother drinks heavily during pregnancy) can give rise to fetal alcohol syndrome, which includes mental retardation as well as a range of physical symptoms. Smaller "doses" of prenatal alcohol may have negative effects on intelligence even when the full syndrome does not appear. Streissguth et al (1989) found that mothers who reported consuming more than 1.5 oz. of alcohol daily during pregnancy had children who scored some five points below controls at age four. Prenatal exposure to aspirin and antibiotics had similar negative effects in this study.

Perinatal Factors. Complications at delivery and other negative perinatal factors may have serious consequences for development. Nevertheless, because they occur only rarely, they contribute relatively little to the population variance of intelligence (Broman et al, 1975). Down's syndrome, a chromosomal abnormality that produces serious mental retardation, is also rare enough to have little impact on the overall distribution of test scores.

The correlation between birth weight and later intelligence deserves particular discussion. In some cases low birth weight simply reflects premature delivery; in others, the infant's size is below normal for its gestational age. Both factors apparently contribute to the tendency of low-birth-weight infants to have lower test scores in later childhood (Lubchenko, 1976). These correlations are small, ranging from .05 to .13 in different groups (Broman et al, 1975). The effects of low birth weight are substantial only when it is very low indeed (less than 1500 gm). Premature babies born at these very low birth weights are behind controls on most developmental measures; they often have severe or permanent intellectual deficits (Rosetti, 1986).

Continuously Rising Test Scores

Perhaps the most striking of all environmental effects is the steady worldwide rise in intelligence test performance. Although many psychometricians had noted these gains, it was James Flynn (1984, 1987) who first described them systematically. His analysis shows that performance has been going up ever since testing began. The "Flynn Effect" is now very well documented, not only in the United States but in many other technologically advanced countries. The average gain is about three IQ points per decade; more than a full standard deviation since, say, 1940.

Although it is simplest to describe the gains as increases in population IQ, this is not exactly what happens. Most intelligence tests are "re-standardized" from time to time, in part to keep up with these very gains. As part of this process the mean score of the new standardization sample is typically set to 100 again, so the increase more or less disappears from view. In this context, the Flynn effect means that if twenty years have passed since the last time the test was standardized, people who now score 100 on the new version would probably average about 106 on the old one.
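
The re-standardization arithmetic can be written out explicitly. The sketch below simply applies the average rate of about three points per decade quoted above as a constant linear drift, which is a simplification (actual rates differ by test, country, and period). The function and parameter names are hypothetical.

```python
POINTS_PER_DECADE = 3.0  # average gain quoted in the text; treated here as constant


def score_on_old_norms(score_on_new_norms, years_since_old_norms,
                       points_per_decade=POINTS_PER_DECADE):
    """Estimate the score the same performance would earn on older norms.

    Assumes a steady linear gain, so the older (easier) norms yield a
    higher score by points_per_decade * (years / 10).
    """
    return score_on_new_norms + points_per_decade * years_since_old_norms / 10.0


# Someone scoring 100 today, rescored on norms set twenty years earlier.
print(score_on_old_norms(100, 20))  # prints 106.0
```

The same assumption gives a cumulative gain of 16.5 points over the 55 years from 1940 to the report's release in 1995, which is consistent with the statement that scores have risen by more than one 15-point standard deviation.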

The sheer extent of these increases is remarkable, and the rate of gain may even be increasing. The scores of nineteen-year-olds in the Netherlands, for example, went up more than 8 points (over half a standard deviation) between 1972 and 1982. What's more, the largest gains appear on the types of tests that were specifically designed to be free of cultural influence (Flynn, 1987). One of these is Raven's Progressive Matrices, an untimed non-verbal test that many psychometricians regard as a good measure of g.

These steady gains in intelligence test performance have not always been accompanied by corresponding gains in school achievement. Indeed, the relation between intelligence and achievement test scores can be complex. This is especially true for the Scholastic Aptitude Test (SAT), in part because the ability range of the students who take the SAT has broadened over time. That change explains some portion, but not all, of the prolonged decline in SAT scores that took place from the mid nineteen-sixties to the early eighties, even as IQ scores were continuing to rise (Flynn, 1984). Meanwhile, however, other more representative measures show that school achievement levels have held steady or in some cases actually increased (Herrnstein & Murray, 1994). The National Assessment of Educational Progress (NAEP), for example, shows that the average reading and math achievement of American 13- and 17-year-olds improved somewhat from the early nineteen-seventies to 1990 (Grissmer, Kirby, Berends & Williamson, 1994). An analysis of these data by ethnic group, reported in Section 5, shows that this small overall increase actually reflects very substantial gains by Blacks and Latinos combined with little or no gain by Whites.

The consistent IQ gains documented by Flynn seem much too large to result from simple increases in test sophistication. Their cause is presently unknown, but three interpretations deserve our consideration. Perhaps the most plausible of these is based on the striking cultural differences between successive generations. Daily life and occupational experience both seem more "complex" (Kohn & Schooler, 1973) today than in the time of our parents and grandparents. The population is increasingly urbanized; television exposes us to more information and more perspectives on more topics than ever before; children stay in school longer; almost everyone seems to be encountering new forms of experience. These changes in the complexity of life may have produced corresponding changes in complexity of mind, and hence in certain psychometric abilities.

A different hypothesis attributes the gains to modern improvements in nutrition. Lynn (1990) points out that large nutritionally-based increases in height have occurred during the same period as the IQ gains: perhaps there have been increases in brain size as well. As we have seen, however, the effects of nutrition on intelligence are themselves not firmly established.

The third interpretation addresses the very definition of intelligence. Flynn himself believes that real intelligence, whatever it may be, cannot have increased as much as these data would suggest. Consider, for example, the number of individuals who have IQ scores of 140 or more. (This is slightly above the cutoff used by L.M. Terman (1925) in his famous longitudinal study of "genius.") In 1952 only 0.38% of Dutch test takers had IQs over 140; in 1982, scored by the same norms, 9.12% exceeded this figure! Judging by these criteria, the Netherlands should now be experiencing "...a cultural renaissance too great to be overlooked" (Flynn, 1987, p.187). So too should France, Norway, the United States, and many other countries. Because Flynn (1987) finds this conclusion implausible or absurd, he argues that what has risen cannot be intelligence itself but only a minor sort of "abstract problem solving ability." The issue remains unresolved.
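
The jump from 0.38% to 9.12% shows how sensitive the upper tail is to a shift in the mean. The sketch below assumes scores are normally distributed with a standard deviation of 15, and it uses a 20-point rise in the mean as an illustrative value; the text above does not state the size of the shift, and that value is chosen here only because it reproduces the quoted percentages under these assumptions. The function name is hypothetical.

```python
import math


def share_above(cutoff, mean, sd=15.0):
    """Fraction of a normal(mean, sd) population scoring above cutoff."""
    z = (cutoff - mean) / sd
    # Upper-tail probability of the standard normal via the complementary
    # error function: P(Z > z) = 0.5 * erfc(z / sqrt(2)).
    return 0.5 * math.erfc(z / math.sqrt(2.0))


baseline = share_above(140, mean=100)  # population centered on the old norms
shifted = share_above(140, mean=120)   # same norms, mean assumed 20 points higher

print(f"baseline share above 140: {baseline:.2%}")
print(f"shifted share above 140:  {shifted:.2%}")
```

Under these assumptions the share above 140 rises from about 0.38% to about 9.12%, a roughly twenty-four-fold increase produced by a shift of one and a third standard deviations in the mean.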

Individual Life Experiences

Although the environmental variables that produce large differences in intelligence are not yet well understood, genetic studies assure us that they exist. With a heritability well below 1.00, IQ must be subject to substantial environmental influences. Moreover, available heritability estimates apply only within the range of environments that are well-represented in the present population. We already know that some relatively rare conditions, like those reviewed earlier, have large negative effects on intelligence. Whether there are (now equally rare) conditions that have large positive effects is not known.

As we have seen, there is both a biological and a social environment. For any given child, the social factors include not only an overall cultural/social/school setting and a particular family but also a unique "micro-environment" of experiences that are shared with no one else. The adoption studies reviewed in Section 3 show that family variables, such as differences in parenting style, in the resources of the home, etc., have smaller long-term effects than we once supposed. At least among people who share a given SES level and a given culture, it seems to be unique individual experience that makes the largest environmental contribution to adult IQ differences.

We do not yet know what the key features of those micro-environments may be. Are they biological? Social? Chronic? Acute? Is there something especially important in the earliest relations between the infant and its caretakers? Whatever the critical variables may be, do they interact with other aspects of family life? Of culture? At this point we cannot say, but these questions offer a fertile area for further research.

...

In this contentious arena, our most useful role may be to remind our readers that many of the critical questions about intelligence are still unanswered. Here are a few of those questions:

Differences in genetic endowment contribute substantially to individual differences in (psychometric) intelligence, but the pathway by which genes produce their effects is still unknown. The impact of genetic differences appears to increase with age, but we do not know why.
Environmental factors also contribute substantially to the development of intelligence, but we do not clearly understand what those factors are or how they work. Attendance at school is certainly important, for example, but we do not know what aspects of schooling are critical.
The role of nutrition in intelligence remains obscure. Severe childhood malnutrition has clear negative effects, but the hypothesis that particular "micro-nutrients" may affect intelligence in otherwise adequately-fed populations has not yet been convincingly demonstrated.
There are significant correlations between measures of information processing speed and psychometric intelligence, but the overall pattern of these findings yields no easy theoretical interpretation.
Mean scores on intelligence tests are rising steadily. They have gone up a full standard deviation in the last fifty years or so, and the rate of gain may be increasing. No one is sure why these gains are happening or what they mean.
The differential between the mean intelligence test scores of Blacks and Whites (about one standard deviation, although it may be diminishing) does not result from any obvious biases in test construction and administration, nor does it simply reflect differences in socio-economic status. Explanations based on factors of caste and culture may be appropriate, but so far have little direct empirical support. There is certainly no such support for a genetic interpretation. At present, no one knows what causes this differential.
It is widely agreed that standardized tests do not sample all forms of intelligence. Obvious examples include creativity, wisdom, practical sense and social sensitivity; there are surely others. Despite the importance of these abilities we know very little about them: how they develop, what factors influence that development, how they are related to more traditional measures.

In a field where so many issues are unresolved and so many questions unanswered, the confident tone that has characterized most of the debate on these topics is clearly out of place. The study of intelligence does not need politicized assertions and recriminations; it needs self-restraint, reflection, and a great deal more research. The questions that remain are socially as well as scientifically important. There is no reason to think them unanswerable, but finding the answers will require a shared and sustained effort as well as the commitment of substantial scientific resources. Just such a commitment is what we strongly recommend.