This research studied the relationship between Kohlberg’s stages of moral development and the process of decision making utilized by 214 subjects, aged 14 to 63 years, during the resolution of hypothetical moral dilemmas. Stage of moral development, as determined by the Ethical Reasoning Scale, was used to classify subjects responding to the Preferential Reasoning Profile, an instrument requiring subjects to weight reasons for pursuing a course of action. A multiple discriminant analysis indicated that it was not possible to distinguish moral development stages on the basis of optimum differential weighting of the variables of the decision-making matrix. Results indicate that the method of adjudicating moral claims is informed by a consistent rationality and sense of fairness across stages of moral development.
Enthusiasm for Kohlberg’s theory of moral development has been interrupted recently by the realization that few psychologists or educators understand Kohlberg in exactly the same way. Kohlberg’s (1971a, 1971b) major explications of this theory proposed a complex relationship between moral philosophy, developmental psychology, and subject responses to hypothetical moral dilemmas, though his attempted integration does not seem to have clarified the substantive moral changes that are supposed to be occurring between moral development stages. Uncertainty regarding this between-stage change is roughly equivalent to uncertainty about what is actually “developing.” The need for clarity here was noted by Aronfreed (1971), while Alston (1971) and Peters (1971) questioned whether or not Kohlberg’s theory describes development that is specifically moral in type or quality.
This article reviews the frequently unacknowledged complexity of Kohlberg’s position and attempts to locate a clear definition of between-stage change across moral development stages. Since an adequate definition of moral development cannot be formulated by pointing to changes in the individual’s knowledge, concrete values, or use of language, an attempt was made to provide empirical verification for Kohlberg’s key theoretical claim that the decision-making process during the adjudication of moral claims changes across moral development stages. Lacking such verification, the significance of a moral development stage score was thought to remain in doubt.
Kohlberg’s stages of moral development do not measure or reflect a simple hierarchy of values. A subject’s invocation of concrete values would be easy to measure but would not support a comprehensive theory of moral development, since higher stage values would not always generate a morally superior prescription for the specific hypothetical dilemma (cf. Gauthier, cited in Beck, Crittenden, & Sullivan, 1971, pp. 365-366). Rather, Kohlberg (1971a) has argued that “each new stage of moral judgment entails a new set of logical operations not present at the prior stage” (p. 186). These logical operations tend to reflect more and more of the characteristics of formal reasoning, involving impersonality, ideality, universality, and preemptiveness (Kohlberg, 1971b, p. 55). Moral development, then, does not refer precisely to changes in the content of personal values but to changes in the structure of the individual’s reasoning. “Each value is restructured in the course of development” (Kohlberg, 1973, p. 5).
Now Kohlberg (1971b) also contends that “the content of moral concerns and claims is always welfare” (p. 63), since “there is no moral situation that does not involve considerations of people’s happiness or welfare and considerations of equal treatment between people” (p. 59). Moral principles are born, therefore, from the increasing capacity to adjudicate welfare claims in a formally ideal way, where more concrete referents, such as egocentric sentiments or social stereotypes, have lost their persuasiveness for the developing individual. A formally adequate moral position requires, moreover, that there is only one moral principle that can resolve competing or conflicting claims to welfare, namely, the principle of justice (Kohlberg, 1971b, p. 63). Although arguments about the nature of justice may go on at all stages of development, Kohlberg (1971b) has proposed that a formally adequate principle of justice takes the form: “Consider each person’s welfare equally” and “Consider every man’s moral claims equally” (p. 64). It has become possible, then, for Kohlberg (1971a) to conclude that “the basic referent of the term ‘moral’ is a type of judgment or a type of decision making process” (p. 214).
It is not entirely clear how the scoring of subject responses to hypothetical moral dilemmas can be said to measure a process. Kohlberg (cf. 1971a, pp. 195-213) may have intended to clarify the relationship between content and process by referring to the concepts of differentiation and equilibrium. He theorized that there are discrete changes occurring within the individual that give rise to different structural wholes or ways of comprehending the world. At Stage 1, for example, only relative degrees of power or status are differentiable in the individual’s reasoning. At Stage 2, the individual can differentiate quantities of exchange, that is, equal or unequal exchanges of reward or penalty. By Stage 3, the individual can differentiate the actual exchange from an ideal exchange and so on for higher stages and capacities to differentiate aspects of a moral issue.
Prior to Stage 6, however, these differential capacities continue to lack formal and moral adequacy, since the individual’s version of justice generates inconsistencies or a certain kind of disequilibrium. Perhaps the guidelines for law maintainers are not reciprocally applied to lawmakers. The idea of respect for rules may interfere with the capacity to evaluate those rules. An individual may hold that a certain moral principle is unconditionally valid while simultaneously arguing that the application of that principle depends on cultural norms, and so on. So the individual who has not reached Stage 6 may attain a partial equilibrium only to have it topple over into disequilibrium as new differentiations are made. Only at the end of the developmental spectrum does a morally adequate concept of justice meet the psychological capacity to utilize that concept. “The formal psychological developmental criteria of differentiation and integration, of structural equilibrium, map into the formal moral criteria of prescriptiveness and universality” (Kohlberg, 1971a, p. 224).
If the foregoing remarks accurately and sufficiently summarize the components of Kohlberg’s theory, then the problem of defining the nature of between-stage change, that is, the problem of defining moral development, can be encapsulated by the following question: Does a mistaken view of the criteria that truly contribute to human welfare necessarily imply an underdeveloped sense of justice? Words remain ambiguous here, even when one tries to answer within Kohlberg’s own framework. It seems possible to define between-stage change by pointing to changes in the individual’s differential view of the criteria that contribute to welfare (e.g., from quantitative satisfactions at Stage 2 to interpersonal harmony at Stage 3). These changes, however, do not imply corresponding changes in the individual’s sense of fairness or rule reciprocity. Not only could increasingly comprehensive views of the criteria that contribute to welfare be succinctly explained as a matter of social learning (Hall & Davis, 1975), but such cognitive comprehensiveness would have little to do with an increasing use of and appreciation for the method of equal consideration for welfare claims. Sooner or later this attempt to define between-stage change will degenerate into a theory of values acquisition, a brand of moral realism that Kohlberg himself has consistently avoided.
It seems possible to approach the question from the other side and argue that between-stage change reflects an increasingly comprehensive understanding of the principle of justice. A sense of justice, however, is not to be confused with abstract ideas about moral problems. A person may make increasingly sophisticated arguments about the nature of right and wrong without caring about such issues in the concrete case. A person may abhor the imposition of arbitrary or inconsistent rules without being able to justify the abhorrence in a formally adequate way (Peters, 1971, p. 261). It seems needlessly complex and empirically unfounded to suggest that a sense of justice as a method for adjudicating welfare claims develops in proportion to the individual’s ability to give formal reasons for employing the method. A North American teenager is under no moral obligation to justify his or her reasons for action in the style of a European intellectual weltanschauung (Ausubel, 1971).
Thus far the idea that psychological criteria map into moral criteria during development appears to be a desirable but vague conclusion that provides little usable information. If Kohlberg is correct in asserting that a formally adequate prescription for justice takes the form, “Consider each person’s welfare and/or moral claims equally,” then one wants to know how moral development can be differentiated from the individual’s developing rationality. Any rule for social discourse and any prescription to be rational in one’s thinking requires that every person’s claims be given equal attention or respect, so it becomes all the more important to distinguish between apparent metaethical principles and exhortations for clearheadedness (cf. Hill, 1972). Kohlberg’s references to differentiation and equilibrium may simply point to the fact that new knowledge corrects previous mistakes. Ausubel (1971) has suggested that an implicit principle of rationality informs all prescription for action and eventually defines inconsistencies for the agent during the agent’s own development. No particularly moral significance, however, could be attached to the agent’s relative ability to justify the importance of rationality or consistency of thought in response to hypothetical dilemmas. Lacking further clarification, it would appear that Kohlberg’s definition of moral development as stages of differentiation and equilibrium could be used at will to construct additional theories of political development or religious development, that is, to construct developmental theories in any arena that a researcher cared to ask questions about.
Since neither possibility noted above yields a clear conception of between-stage change, there remains a third alternative, described in Kohlberg’s writings but in need of empirical verification. This alternative suggests that subjects at increasingly higher stages of moral development would be expected to engage in a decision-making process (cf. Kohlberg, 1971a, p. 214) that reflects the method of justice in an increasingly complete and ideal way. In empirical terms, subjects at higher stages should rank inadequate or inappropriate reasons for action less desirable in resolving hypothetical moral dilemmas than would subjects at lower stages. When the decision-making process is conceptualized as a procedure during which criteria thought to enhance welfare are evaluated relative to criteria thought to harm welfare, then between-stage change may be tentatively defined as a proportional change in the influence ascribed to harmful criteria. More developed subjects should rule out more and more of the many possible reasons for action on the grounds that, morally speaking, such reasons are at least irrelevant if not ultimately unjust.
In order to verify this tentative definition of moral development, a convenient sample was derived by asking each of 22 members of an undergraduate course in tests and measurements to complete two instruments, the Ethical Reasoning Scale and the Preferential Reasoning Profile (discussed below), and to administer both instruments to an additional 10 people (N = 242). This method of subject selection was intended to generate a reasonably wide variety of ages, levels of education, and social backgrounds.
A total of 214 subjects completed both instruments. Of this total, 46.7% were male, and 53.3% were female. Ages of subjects ranged from 14 to 63 years, with a median age of 31 years. Approximately 78% of the subjects were under the age of 40 years. The social status level of occupations, rated according to Warner (1957), showed a modal and median value of 4, corresponding to skilled clerks, owners of small businesses, independent skilled technicians, and the like. Nearly 65% of the sample had never been married, while an additional 24% were currently living with a spouse. Only 21% of the subjects had not completed some portion of a college education, while 13% had completed some type of degree at the postgraduate level. The majority of subjects (78%) had been born within 100 miles of New York City, while 10.3% had been born in Puerto Rico, and the remainder born in places scattered throughout the United States.
The first instrument completed by each subject, the Ethical Reasoning Scale (Sullivan, Note 1), was intended to measure each subject’s stage of moral development. This instrument contains 54 forced-choice items that are divided into roughly equal proportions among three of Kohlberg’s hypothetical dilemmas. Each item requires the subject to choose between two reasons for pursuing a course of action, where the paired statements have been arranged to correspond to prototypic answers for the first five of Kohlberg’s moral development stages. For example, an item dealing with the dilemma of Heinz and the drug presents two reasons for stealing the drug that reflect a prototypic Stage 1 response (“Heinz will suffer more from his wife’s death than from a prison term”) and a prototypic Stage 4 response (“Life is worth infinitely more than the druggist’s profit”). Another item requires a choice between a prototypic Stage 1 response and prototypic Stage 4 response, each suggesting that Heinz should not steal the drug: “Stealing brings punishment from God and from society” versus “Heinz should not take the matter into his own hands, but should refer it to the authorities.”
The item pool contrasts all possible combinations of prototypic responses for the first five of Kohlberg’s moral development stages. During the scoring of the instrument, the cumulative frequency of a subject’s higher stage choices peaks at the highest stage of the subject’s discriminating ability and then increases only as a matter of guessing for the remainder of higher stage discriminations. This cumulative frequency peak is scored, therefore, as the subject’s stage of moral development.
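As a small illustration of what “all possible combinations” amounts to, the following Python sketch enumerates the distinct pairings of the first five stages. It is only an aid to the reader: the name `stage_pairs` is hypothetical, and the text does not state how the 54 items are distributed over these pairings, the three dilemmas, or the two directions of action.

```python
from itertools import combinations

# The first five of Kohlberg's stages, as covered by the instrument.
STAGES = (1, 2, 3, 4, 5)

# Every unordered pairing of two different stages (hypothetical name).
stage_pairs = list(combinations(STAGES, 2))

for lower, higher in stage_pairs:
    print(f"Stage {lower} reason versus Stage {higher} reason")
```

The Stage 1 versus Stage 4 items quoted above correspond to the pairing `(1, 4)` in this list.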
The Ethical Reasoning Scale provides five subscale scores corresponding to the first five of Kohlberg’s moral development stages. The reliability of each subscale, using Kuder-Richardson 20, is given as follows: For Stage 1, r = .63; for Stage 2, r = .47; for Stage 3, r = .41; for Stage 4, r = .47; and for Stage 5, r = .63. Overall reliability of the instrument as determined by Kuder-Richardson 20 is .72 (N = 479). Average item difficulty for these prototypic statements was found to be .65 (n = 80), implying a maximum expected reliability of .73. When the median value for subscale reliability (.47) is used as an indicator of construct validity, then it can be noted that such an estimate is close to the maximum possible value for r² (.52).
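For readers unfamiliar with the Kuder-Richardson 20 coefficient, a minimal Python sketch of the standard formula follows. It assumes each forced-choice item is coded 0 or 1; the function name `kr20` and the toy data are hypothetical and are not drawn from the Ethical Reasoning Scale, so the result bears no relation to the reliabilities reported above.

```python
from statistics import pvariance

def kr20(responses):
    """Kuder-Richardson 20 for rows of 0/1 item scores (one row per respondent)."""
    n_people = len(responses)
    n_items = len(responses[0])
    # p is the proportion of respondents scoring 1 on each item; q = 1 - p.
    p = [sum(row[i] for row in responses) / n_people for i in range(n_items)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    # Population variance of total scores, consistent with the p*q item variances.
    total_variance = pvariance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - sum_pq / total_variance)

# Hypothetical data: 5 respondents by 4 dichotomous items.
toy = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
toy_kr20 = kr20(toy)
print(round(toy_kr20, 2))
```

The coefficient rises as the items covary more strongly relative to their individual variances, which is why short subscales such as those reported here tend to yield lower values than the full instrument.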
Scores obtained on the Ethical Reasoning Scale indicated that 7.2% of the 214 subjects in the study functioned at a moral development stage characteristic of Stage 1, 3.7% at Stage 2, 27.5% at Stage 3, 31.3% at Stage 4, and 30.3% at Stage 5 or higher.
Immediately following the completion of the Ethical Reasoning Scale, all subjects completed the Preferential Reasoning Profile (Ahlskog, Note 2). This instrument was designed to test for changes in the decision-making process during the resolution of hypothetical moral dilemmas. It contains 60 stated reasons for pursuing or refraining from a course of action, equally divided between three of Kohlberg’s hypothetical dilemmas: Heinz and the drug, the doctor and the wife, and the judge and the doctor. For each section of 20 items, 5 items provide reasons for pursuing a course of action (e.g., the judge should punish the doctor); 5 items provide reasons for avoiding that course of action; 5 items provide reasons for pursuing the alternate course of action (e.g., let the doctor go free); and 5 items provide reasons for avoiding that course. Sample items for these four categories are reported, respectively, as follows:
1. The judge is responsible for seeing that doctors do not take it upon themselves to kill people when they think there is some good reason to do it.
2. Even if the judge believes that the doctor was wrong, it would not benefit anyone to put the doctor in jail.
3. Since the doctor tried to do the right thing, the judge should not sentence him as if he were a criminal.
4. Our system of justice would be useless if judges ignored the law because of their personal views.
For each of the three dilemmas of the instrument, subjects were asked to make a decision (in yes or no terms) about the central question of the particular dilemma (e.g., “Should the judge punish the doctor?”) and then to weight the items of each section from zero to six, indicating the extent to which they endorsed the item. In effect, the instrument generated three “decisions of choice” and three sets of responses to the items grouped according to the four categories noted above. By summing the responses to each category across three hypothetical dilemmas, cumulative scores for four variables were obtained for each subject. These four variables may be visualized as a payoff matrix representing the weights each subject ascribed to (a) the benefits derived from the decision of choice, (b) the penalties paid for pursuing the decision of choice, (c) the benefits derived from pursuing the alternate decision, and (d) the penalties paid for pursuing the alternate decision.
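The summation described above can be sketched in Python as follows. The sketch rests on one assumption that the text implies but does not spell out: when a subject answers “no,” the alternate course of action is treated as the decision of choice, so the item categories swap roles. All names and ratings are hypothetical and serve only to show the bookkeeping.

```python
# Item categories in the order listed in the text, relative to the action
# named in the dilemma's central question (hypothetical constant names).
PURSUE, AVOID, PURSUE_ALT, AVOID_ALT = range(4)

def payoff_matrix(dilemmas):
    """Sum 0-6 weights into the four payoff variables across dilemmas.

    dilemmas: list of (decision_is_yes, weights), where weights maps each
    category to the subject's ratings for that category's items.
    """
    totals = {
        "a_choice_benefits": 0,
        "b_choice_penalties": 0,
        "c_alternate_benefits": 0,
        "d_alternate_penalties": 0,
    }
    for decision_is_yes, weights in dilemmas:
        # Assumption: a "no" decision makes the alternate action the choice.
        if decision_is_yes:
            order = (PURSUE, AVOID, PURSUE_ALT, AVOID_ALT)
        else:
            order = (PURSUE_ALT, AVOID_ALT, PURSUE, AVOID)
        for key, category in zip(totals, order):
            ratings = weights[category]
            if any(not 0 <= r <= 6 for r in ratings):
                raise ValueError("weights must lie between 0 and 6")
            totals[key] += sum(ratings)
    return totals

# Hypothetical ratings for one subject on three dilemmas (5 items per category).
one_subject = [
    (True, {PURSUE: [6, 5, 4, 6, 5], AVOID: [1, 0, 2, 1, 0],
            PURSUE_ALT: [2, 1, 0, 1, 2], AVOID_ALT: [5, 4, 6, 5, 4]}),
    (False, {PURSUE: [1, 1, 0, 2, 1], AVOID: [4, 5, 6, 4, 5],
             PURSUE_ALT: [6, 6, 5, 4, 5], AVOID_ALT: [0, 1, 1, 0, 2]}),
    (True, {PURSUE: [3, 3, 3, 3, 3], AVOID: [2, 2, 2, 2, 2],
            PURSUE_ALT: [1, 1, 1, 1, 1], AVOID_ALT: [4, 4, 4, 4, 4]}),
]
matrix = payoff_matrix(one_subject)
print(matrix)
```

With 5 items per category in each of three dilemmas and a maximum weight of 6, each of the four variables can range from 0 to 90.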
A pilot study determined that subject scores for each of the three dilemmas were correlated with subject scores on the instrument as a whole at levels of r that ranged from .73 to .90. Split-half reliability during this pilot study was calculated at .88 (n = 35). The responses of the 214 subjects in the subsequent study generated item-scale biserial correlations for the Preferential Reasoning Profile that ranged from .49 to .86, with a median value of .73.
Results and Discussion
Since between-stage change was tentatively defined as change in the decision-making process in response to hypothetical moral dilemmas, it was hypothesized that subjects at different stages of moral development would demonstrate variance in their usage of the four response categories of the Preferential Reasoning Profile. That is, subjects at higher stages of moral development should endorse fewer reasons as justification for their decision(s) of choice and should find almost no endorsable criteria in support of the alternate decision(s). While these subjects may have understood many if not all of the lower stage reasons for action, it was expected that they would not endorse such reasons as morally defensible, even when those reasons happened to support the decision of choice. Equal consideration of welfare claims would be increasingly restricted to claims regarding human worth and dignity, so that other potential welfare criteria (such as personal happiness or gain) would be ignored. Subjects at lower stages would be expected to endorse a wider variety of welfare claims (and penalties), including inconsistent claims as general evidence of disequilibrium.
The null hypothesis, therefore, stated that scores for the four variables of the payoff matrix for subjects grouped according to moral development stage would not be distinguishable from the matrices of subjects grouped at random from the same population. Evidence needed to refute this null hypothesis could be used to generate the coefficients of a decision-making function representing changes in the process of moral decision making across moral development stages.
On the basis of the assumption that the decision-making process for a subject at any particular stage of moral development should be identical to the decision-making process of all other subjects assessed at that moral development stage, a multiple discriminant analysis was chosen in order to maximize the ratio of between-group to within-group sums of squares. The multiple discriminant analysis failed to find a linear combination of variables to distinguish the five groups, as is further shown by an obtained lambda of .93 and an approximate F of .95 (p < .51). The results of this analysis indicate that it was not possible to distinguish moral development stages on the basis of optimum differential weighting of the variables of the payoff matrix. Mean values for the matrices of subjects grouped according to moral development stage are reported in Table 1.
[Table 1. Developmental stages: table body not recoverable from the source]
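The lambda reported for the discriminant analysis is presumably Wilks’ lambda, the ratio of the determinant of the pooled within-group sums-of-squares-and-cross-products matrix to that of the total matrix. The Python sketch below computes that ratio for toy data. It is an illustration under stated assumptions, not the analysis actually run: the function names are hypothetical, the data are invented, and two variables stand in for the four payoff variables to keep the example short.

```python
def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n, d = len(a), 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[pivot][c]) < 1e-12:
            return 0.0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            d = -d  # a row swap flips the sign of the determinant
        d *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return d

def mean_vector(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def sscp(rows, center):
    """Sums of squares and cross-products of the rows about a center."""
    p = len(center)
    out = [[0.0] * p for _ in range(p)]
    for x in rows:
        dev = [xi - ci for xi, ci in zip(x, center)]
        for i in range(p):
            for j in range(p):
                out[i][j] += dev[i] * dev[j]
    return out

def wilks_lambda(groups):
    """det(W) / det(T), with W the pooled within-group and T the total SSCP."""
    all_rows = [x for g in groups for x in g]
    p = len(all_rows[0])
    total = sscp(all_rows, mean_vector(all_rows))
    within = [[0.0] * p for _ in range(p)]
    for g in groups:
        s = sscp(g, mean_vector(g))
        for i in range(p):
            for j in range(p):
                within[i][j] += s[i][j]
    return det(within) / det(total)

# Hypothetical groups: identical centroids versus widely separated centroids.
group_a = [(1, 2), (3, 1), (2, 4)]
same_centroid = wilks_lambda([group_a, [(3, 1), (2, 4), (1, 2)]])
separated = wilks_lambda([group_a, [(11, 12), (13, 11), (12, 14)]])
print(round(same_centroid, 3), round(separated, 3))
```

A value near 1 means the group centroids are nearly indistinguishable relative to within-group scatter, while a value near 0 means the groups are well separated; the obtained lambda of .93 falls toward the former end.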
The inability to identify significant differences between the variables of the payoff matrix for subjects grouped according to moral development stage precluded any further investigation into the magnitude or direction of change in the decision-making process across five stages of moral development. In summation, the attempt to verify that between-stage change reflects change in the process of adjudicating moral claims was entirely unsuccessful, and one surmises that some new attempt to clarify what develops across the stages is required.
One wonders, of course, whether or not there is anything remarkable about a study that only requires subjects to “think what they think” in a consistent way. Closer inspection, however, indicates that the moral adequacy of a person’s judgment must now be differentiated from the rational adequacy of that same judgment. Moral development may well refer to developing knowledge, developing language, or the acquisition of new values. It apparently does not refer, however, to a developing sense of justice as rule reciprocity or the developing ability to state a rational case for human welfare. A similar sense of fairness seems to have informed the decision-making process of all subjects in the study, leaving a more precise understanding of specifically moral growth in doubt. Without a clearer understanding of between-stage change, it is difficult to assess the import of stage scores or their subsequent modification during the developmental and educational processes.
Reference Notes
1. Sullivan, A. The Ethical Reasoning Scale. Unpublished manuscript, 1975. (Available from A. Sullivan, Graduate School of Education, Fordham University, Lincoln Center, New York, New York 10023.)
2. Ahlskog, G. The Preferential Reasoning Profile. Unpublished manuscript, 1976. (Available from G. Ahlskog, 37 West 87th Street, New York, New York 10024.)
References
Alston, W. P. Comments on Kohlberg’s “From is to ought.” In T. Mischel (Ed.), Cognitive development and epistemology. New York: Academic Press, 1971.
Aronfreed, J. Some problems for a theory of the acquisition of conscience. In C. M. Beck, B. S. Crittenden, & E. V. Sullivan (Eds.), Moral education. New York: Newman Press, 1971.
Ausubel, D. Psychology’s undervaluation of the rational components in moral behaviour. In C. M. Beck, B. S. Crittenden, & E. V. Sullivan (Eds.), Moral education. New York: Newman Press, 1971.
Beck, C. M., Crittenden, B. S., & Sullivan, E. V. (Eds.). Moral education. New York: Newman Press, 1971.
Hall, R. T., & Davis, J. U. Moral education in theory and practice. Buffalo, N. Y.: Prometheus Books, 1975.
Hill, B. V. Education for rational morality or moral rationality? Educational Theory, 1972, 22, 286-292.
Kohlberg, L. From is to ought: How to commit the naturalistic fallacy and get away with it in the study of moral development. In T. Mischel (Ed.), Cognitive development and epistemology. New York: Academic Press, 1971. (a)
Kohlberg, L. Stages of moral development as a basis for moral education. In C. M. Beck, B. S. Crittenden, & E. V. Sullivan (Eds.), Moral education. New York: Newman Press, 1971. (b)
Kohlberg, L. The contribution of developmental psychology to education: Examples from moral education. Educational Psychologist, 1973, 10, 2-14.
Peters, R. S. Moral development: A plea for pluralism. In T. Mischel (Ed.), Cognitive development and epistemology. New York: Academic Press, 1971.
Warner, W. L. Social class in America: A manual of procedure for the measurement of social status. Gloucester, Mass.: P. Smith, 1957.