
TWO years ago, I idly surfed my way to a harmless-seeming article from 2004 by Denny Borsboom, Gideon Mellenbergh, and Jaap van Heerden entitled The Concept of Validity. More than a decade had passed since its publication, and I had never heard of it. Egocentrically, this seemed like reason enough to surf right past it. Then I skimmed the abstract. Intrigued, I proceeded to read the first few paragraphs. By that point, I was hooked: I scrapped my plans for the next couple of hours so I could give this article my complete attention. This was a paper I needed to read immediately.

I’ve thought about The Concept of Validity every day for the past two years. I have mentioned or discussed or recommended The Concept of Validity hundreds of times. My zeal for The Concept of Validity is the zeal of an ex-smoker. The concept of validity in The Concept of Validity has led to a complete reformatting of my understanding of validity, and of measurement in general—and not just in the psychological sciences, but in the rest of the sciences, too. And those effects have oozed out to influence just about everything else I believe about science. The Concept of Validity is the most important paper you’ve probably never heard of.*

The concept of validity in The Concept of Validity is so simple that it’s a bit embarrassing even to write it down, but its simplicity is what makes it so diabolical, and so very different from what most in the social sciences have believed validity to be for the past 60 years.

According to Borsboom and colleagues, a scientific device (let’s label it D) validly measures a trait or substance (which we will label T), if and only if two conditions are fulfilled:

(1) T must exist;

(2) T must cause the measurements on D.

That’s it. That is the concept of validity in The Concept of Validity.

This is a Device. There are invisible forces in the world that cause changes in the physical state of this Device. Those physical changes can be read off as representations of the states of those invisible forces. Thus, this Device is a valid measurement of those invisible forces.

What is most conspicuous about the concept of validity in The Concept of Validity is what it lacks. There is no talk of score meanings and interpretations (à la Cronbach and Meehl). There is no talk of integrative judgments involving considerations of the social or ethical consequences of how scores are put to use (à la Messick). There’s no talk of multitrait-multimethod matrixes (à la Campbell and Fiske), nomological nets (Cronbach and Meehl again), or any of the other theoretical provisos, addenda, riders, or doo-dads with which psychologists have been burdening their concepts of validity since the 1950s. Instead, all we need—and all we must have—for valid measurement is the fulfillment of two conditions: (1) a real force or trait or substance (2) whose presence exerts a causal influence on the physical state of a device. Once those conditions are fulfilled, a scientist can read off the physical changes to the device as measurements of T. And voila: We’ve got valid measurement.

Borsboom and colleagues’ position is such a departure from 20th century notions of validity precisely because they are committed to scientific realism, a stance to which many mid-20th-century philosophers of science were quite allergic. But most philosophers of science have gotten over their aversion to scientific realism now. In general, they’re mostly comfortable with the idea that there could be hidden realities that are responsible for observable experience. Realism seemed like a lot to swallow in 1950. It doesn’t in 2017.

As soon as you commit to scientific realism, there is a kind of data you will prize more highly than any other for assessing validity, and that’s causal evidence. What a realist wants more than anything else on earth or in the heavens is evidence that the hypothesized invisible reality (the trait, or substance, or whatever) is causally responsible for the measurements the device produces. Every other productive branch of science is already working from this definition of validity. Why aren’t the social sciences?

For some of the research areas I’ve messed around with over the past few years, the implications of embracing the concept of validity in The Concept of Validity are profound, and potentially nettlesome: If we follow Borsboom and colleagues’ advice, we can discover that some scientific devices do indeed provide valid measurement, precisely because the trait or substance T they supposedly measure actually seems to exist (fulfilling Condition #1) and because there is good evidence that T is causally responsible for physical features of the device that can be read off as measurements of T (fulfilling Condition #2). In other areas, the validity of certain devices as measures looks less certain because even though we can be reasonably confident that the trait or substance T exists, we cannot be sure that changes in T are responsible for the physical changes in the device. In still other areas, it’s not clear that T exists at all, in which case there’s no way that the device can be a measure of T.

I will look at some of these scenarios more closely in an upcoming post.

Borsboom, D., Mellenbergh, G. J., & van Heerden, J. (2004). The concept of validity. Psychological Review, 111, 1061-1071.

*Weirdly, The Concept of Validity does not come up in Google Scholar. I’ve seen this before, actually. Why does this happen?


Human Oxytocin Research Gets a Drubbing

There’s a new paper out by Gareth Leng and Mike Ludwig¹ that bears the coy title “Intranasal Oxytocin: Myths and Delusions” (get the full text here before it disappears behind a pay wall) that you need to know about if you’re interested in research on the links between oxytocin and human behavior (as I am; see my previous blog entries here, here, and here). Allow me to summarize some highlights, peppered with some of my own (I hope not intemperate) inferences. Caution: There be numbers below, and some back-of-the-envelope arithmetic. If you want to avoid all that, just go to the final paragraph where I quote directly from Gareth and Mike’s summary.

Fig 1. It’s complicated.

  1. In the brain, it’s the hypothalamus that makes OT, but it’s the pituitary that stores and distributes it to the periphery. I think those two facts are pretty commonly known, but here’s a fact I didn’t know: At any given point in time, the human pituitary gland contains about 14 International Units (IU) of OT (which is about 28 micrograms). So when you read that a researcher has administered 18 or 24 IU of oxytocin intranasally as part of a behavioral experiment, bear in mind that they have dumped more than an entire pituitary gland’s worth of OT into the body.
  2. To me, that seems like a lot of extra OT to be floating around out there without us knowing completely what its unintended effects might be. Most scientists who conduct behavioral work on OT with humans think, and of course hope, that this big payload of OT is benign, and to be clear, I know of no evidence that it is not benign. Even so, research on the use of OT for labor augmentation has found that labor can be stimulated with as little as 3.2 IU of intranasal OT during childbirth by virtue of its effects on the uterus. This is saying a lot about OT’s potential to influence the body’s peripheral tissues because that OT has to overcome the very high levels of oxytocinase (the enzyme that breaks up OT) that circulate during pregnancy. It of course bears repeating that behavioral scientists typically use 24 IU to study behavior, and 24 > 3.2.²
  3. Three decades ago, researchers found that rats that received injections of radiolabeled OT showed some uptake of the OT into regions of the brain that did not have much of a blood brain barrier, but in regions of the brain that did have a decent blood brain barrier, the concentrations were 30 times lower. Furthermore, there was no OT penetration deeper into the brain. Other researchers who have injected rats with subcutaneous doses of OT have managed to increase the rats’ plasma concentrations of OT to 500 times their baseline levels, but they found only threefold increases in the CSF levels. On the basis of these results and others, Leng and Ludwig speculate that as little as 0.002% of the peripherally administered OT is finding its way into the central nervous system, and it has not been proven that any of it is capable of reaching deep brain areas.
  4. The fact that very low levels of OT appear to make it into the central nervous system isn’t a problem in and of itself—if that OT reaches behaviorally interesting brain targets in concentrations that are high enough to produce behavioral effects. However, OT receptors in the brain are generally exposed to much higher levels of OT than are receptors in the periphery (where baseline levels generally range from 0 to 10 pg/ml). As a result, OT receptors in the brain need to be exposed to comparatively high amounts of OT to produce behavioral effects—sometimes as much as 5 to 100 nanograms.
  5. Can an intranasal dose of 24 IU deliver 5 – 100 nanograms of OT to behaviorally relevant brain areas? We can do a little arithmetic to arrive at a guess. The 24 IU that researchers use in intranasal administration studies on humans is equivalent to 48 micrograms, or 48,000 nanograms. Let’s assume (given Point 3 above) that only .002 percent of those 48,000 nanograms is going to get into the brain. If that assumption is OK, then we might expect that brain areas with lots of OT receptors could—as an upper limit—end up with no more than 48,000 nanograms * .00002 = .96 (~1) nanogram of OT. But if 5 – 100 nanograms is what’s needed to produce a behavioral effect, then it seems sensible to conclude that even a 24 IU bolus of OT (which, we must remember, is more than a pituitary gland’s worth of OT) administered peripherally is likely too little to produce enough brain activity to produce a behavioral change—assuming that it’s even able to get into deep brain regions.
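The back-of-the-envelope arithmetic in Points 1, 3, and 5 can be checked with a few lines of Python. This is only a sketch: the 2 micrograms-per-IU conversion is the one implied by the figures above (14 IU is about 28 micrograms), the 0.002 percent uptake fraction is Leng and Ludwig's speculative estimate, and the function and constant names are made up for illustration.

```python
# Rough check of the dose arithmetic; every constant comes from the points above.
UG_PER_IU = 2.0             # implied conversion: 14 IU is about 28 micrograms
PITUITARY_IU = 14.0         # approximate OT content of a human pituitary
CNS_FRACTION = 0.002 / 100  # speculative: 0.002 percent reaches the CNS
NEEDED_NG = (5.0, 100.0)    # range said to be needed for behavioral effects


def iu_to_ng(dose_iu: float) -> float:
    """Convert a dose in IU to nanograms (1 microgram = 1000 nanograms)."""
    return dose_iu * UG_PER_IU * 1000.0


def upper_bound_cns_ng(dose_iu: float) -> float:
    """Upper-limit estimate of the nanograms of OT reaching the CNS."""
    return iu_to_ng(dose_iu) * CNS_FRACTION


if __name__ == "__main__":
    dose = 24.0
    reached = upper_bound_cns_ng(dose)
    print(f"{dose:.0f} IU = {iu_to_ng(dose):,.0f} ng")
    print(f"Pituitary equivalents: {dose / PITUITARY_IU:.2f}")
    print(f"Upper bound reaching the CNS: {reached:.2f} ng")
    print(f"Meets the {NEEDED_NG[0]:.0f} ng minimum? {reached >= NEEDED_NG[0]}")
```

Changing `CNS_FRACTION` shows how sensitive the conclusion is to that one speculative number: the uptake fraction would have to be roughly five times larger before a 24 IU dose reached even the low end of the 5 to 100 nanogram range.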

Leng and Ludwig aren’t completely closed to the idea that intranasal oxytocin affects behavior via its effects on behaviorally relevant parts of the brain that use oxytocin, but they maintain a cautious stance. I can find no better way to summarize their position clearly than by quoting from their abstract:

The wish to believe in the effectiveness of intranasal oxytocin appears to be widespread, and needs to be guarded against with scepticism and rigor.

¹If you don’t know who Gareth Leng and Mike Ludwig are, by the way, and are wondering whether their judgment is backed up by real expertise, by all means have a look at their bona fides.

²A little bet-hedging: I think I read somewhere that there is upregulated gene expression for oxytocin receptors late in pregnancy, so this could explain the uterus’s heightened sensitivity to OT toward the end of pregnancy. Thus, it could be that the uterus becomes so sensitive to OT not because 3.2 IU is “a lot of OT” in any absolute sense, but because the uterus is going out of its way to “sense” it. Either way, 3.2 IU is clearly a detectable amount to any tissue that really “wants”* to detect it.

*If you’re having a hard time with my use of agentic language to refer to the uterus, give this a scan.


A P-Curve Exercise That Might Restore Some of Your Faith in Psychology

I teach my university’s Graduate Social Psychology course, and I start off the semester (as I assume many other professors who teach this course do) by talking about research methods in social psychology. Over the past several years, as the problems with reproducibility in science have become more and more central to the discussions going on in the field, my introductory lectures have gradually become more dismal. I’ve come to think that it’s important to teach students that most research findings are likely false, that there is very likely a high degree of publication bias in many areas of research, and that some of our most cherished ideas about how the mind works might be completely wrong.

In general, I think it’s hard to teach students what we have learned about the low reproducibility of many of the findings in social science without leaving them with a feeling of anomie, so this year, I decided to teach them how to do p-curve analyses so that they would at least have a tool that would help them to make up their own minds about particular areas of research. But I didn’t just teach them from the podium: I sent them away to form small groups of two to four students who would work together to conceptualize and conduct p-curve analysis projects of their own.

I had them follow the simple rules that are specified in the p-curve user’s guide, which can be obtained here, and I provided a few additional ideas that I thought would be helpful in a one-page rubric. I encouraged them to make sure they were sampling from the available population of studies in a representative way. Many of the groups cut down their workload by consulting recent meta-analyses to select the studies to include. Others used Google Scholar or Medline. They were all instructed to follow the p-curve manual chapter-and-verse, and to write a little paper in which they summarized their findings. The students told me that they were able to produce their p-curve analyses (and the short papers that I asked them to write up) in 15-20 person-hours or less. I cannot recommend this exercise highly enough. The students seemed to find it very empowering.

This past week, all ten groups of students presented the results of their analyses, and their findings were surprisingly (actually, puzzlingly) rosy: All ten of the analyses revealed that the literatures under consideration possessed evidentiary value. Ten out of ten. None of them showed evidence for intense p-hacking. On the basis of their conclusions (coupled with the conclusions that previous meta-analysts had made about the size of the effects in question), it does seem to me that there really is license to believe a few things about human behavior:

(1) Time-outs really do reduce undesirable behavior in children (parents with young kids take notice);

(2) Expressed Emotion (EE) during interactions between people with schizophrenia and their family members really does predict whether the patient will relapse in the subsequent 9-12 months (based on a p-curve analysis of a sample of the papers reviewed here);

(3) The amount of psychological distress that people with cancer experience is correlated with the amounts of psychological distress that their caregivers manifest (based on a p-curve analysis of a sample of the papers reviewed here);


(4) Men really do report more distress when they imagine their partners’ committing sexual infidelity than women do (based on a p-curve analysis of a sample of the papers reviewed here; caveats remain about what this finding actually means, of course…)
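For readers who have not seen a p-curve analysis, the simplest ingredient is a sign test on the statistically significant p-values: if there is no real effect, significant p-values should fall below .025 about as often as above it, whereas a real effect piles them up near zero. The sketch below illustrates only that binomial test, not the full procedure in the p-curve user's guide, and the p-values in it are invented for illustration; they are not my students' data.

```python
from math import comb


def binomial_right_skew_test(p_values, alpha=0.05):
    """One-sided binomial test for right skew among significant p-values.

    Returns (n_significant, n_below_half_alpha, binomial_p).
    """
    # Only statistically significant results enter a p-curve.
    significant = [p for p in p_values if p < alpha]
    n = len(significant)
    # Count the "low" significant p-values (below alpha / 2, i.e. .025).
    k = sum(1 for p in significant if p < alpha / 2)
    # P(X >= k) for X ~ Binomial(n, 0.5), the expectation under a flat p-curve.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return n, k, tail


if __name__ == "__main__":
    # Hypothetical p-values, made up for this example.
    made_up = [0.001, 0.003, 0.004, 0.008, 0.012, 0.019, 0.021, 0.032, 0.041, 0.20]
    n, k, p = binomial_right_skew_test(made_up)
    print(f"{k} of {n} significant p-values fall below .025 (binomial p = {p:.3f})")
```

A small binomial p suggests right skew, which is what evidentiary value looks like; a pile-up just under .05 would point the other way.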

I have to say that this was a very cheering exercise for my students as well as for me. But frankly, I wasn’t expecting all ten of the p-curve analyses to provide such rosy results, and I’m quite sure the students weren’t either. Ten non-p-hacked literatures out of ten? What are we supposed to make of that? Here are some ideas that my students and I came up with:

(1) Some of the literatures my students reviewed involved correlations between measured variables (for example, emotional states or personality traits) rather than experiments in which an independent variable was manipulated. They were, in a word, personality studies rather than “social psychology experiments.” The major personality journals (Journal of Personality, Journal of Research in Personality, and the “personality” section of JPSP) tend to publish studies with conspicuously higher statistical power than do the major journals that publish social psychology-type experiments (e.g., Psychological Science, JESP, and the two “experimental” sections of JPSP). One implication of this fact, as Chris Fraley and Simine Vazire just pointed out, is that the experiment-friendly journals are more likely, ceteris paribus, to have higher false positive rates than are the personality-type journals.

(2) Some of the literatures my students reviewed were not particularly “sexy” or “faddish”–at least not to my eye (Biologists refer to the large animals that get the general public excited about conservation and ecology as the “charismatic megafauna.” Perhaps we could begin talking about “charismatic” research topics rather than “sexy” or “faddish” ones? It might be perceived as slightly less derogatory…). Perhaps studies on less charismatic topics generate less temptation among researchers to capitalize on undisclosed researcher degrees of freedom? Just idle speculation…

(3) The students went into the exercise without any a priori prejudice against the research areas they chose. They wanted to know whether the literatures they focused on were p-hacked because they cared about the research topics and wanted to base their own research upon what had come before–not because they had read something seemingly fishy on a given topic that gave them impetus to do a full p-curve analysis. I wonder if this subjective component to the exercise of conducting a p-curve analysis is going to end up being really significant as this technique becomes more popular.

If you teach a graduate course in psychology and you’re into research methods, I cannot recommend this exercise highly enough. My students loved it, they found it extremely empowering, and it was the perfect positive ending to the course. If you have used a similar exercise in any of your courses, I’d love to hear about what your students found.

By the way, Sunday will be the 1-year anniversary of the Social Science Evolving Blog. I have appreciated your interest.  And if I don’t get anything up here before the end of 2014, happy holidays.

The Myth of Moral Outrage

This year, I am a senior scholar with the Chicago-based Center for Humans and Nature. If you are unfamiliar with this Center (as I was until recently), here’s how they describe their mission:

The Center for Humans and Nature partners with some of the brightest minds to explore humans and nature relationships. We bring together philosophers, biologists, ecologists, lawyers, artists, political scientists, anthropologists, poets and economists, among others, to think creatively about how people can make better decisions — in relationship with each other and the rest of nature.

In the year to come, I will be doing some writing for the Center, starting with a piece that has just appeared on their web site. In The Myth of Moral Outrage, I attack the winsome idea that humans’ moral progress over the past few centuries has ridden on the back of a natural human inclination to react with a special kind of anger–moral outrage–in response to moral violations against unrelated third parties:

It is commonly believed that moral progress is a surfer that rides on waves of a peculiar emotion: moral outrage. Moral outrage is thought to be a special type of anger, one that ignites when people recognize that a person or institution has violated a moral principle (for example, do not hurt others, do not fail to help people in need, do not lie) and must be prevented from continuing to do so . . . Borrowing anchorman Howard Beale’s tag line from the film Network, you can think of the notion that moral outrage is an engine for moral progress as the “I’m as mad as hell and I’m not going to take this anymore” theory of moral progress.

I think the “Mad as Hell” theory of moral action is probably quite flawed, despite the popularity that it has garnered among many social scientists who believe that humans possess “prosocial preferences” and a built-in (genetically group-selected? culturally group-selected?) appetite for punishing norm-violators. I go on to describe the typical experimental result that has given so many people the impression that we humans do indeed possess prosocial preferences that motivate us to spend our own resources for the purpose of punishing norm violators who have harmed people whom we don’t know or otherwise care about. Specialists will recognize that the empirical evidence that I am taking to task comes from that workhorse of experimental economics, the third-party punishment game:

…[R]esearch subjects are given some “experimental dollars” (which have real cash value). Next, they are informed that they are about to observe the results of a “game” to be played by two other strangers—call them Stranger 1 and Stranger 2. For this game, Stranger 1 has also been given some money and has the opportunity to share none, some, or all of it with Stranger 2 (who doesn’t have any money of her own). In advance of learning about the outcome of the game, subjects are given the opportunity to commit some of their experimental dollars toward the punishment of Stranger 1, should she fail to share her windfall with Stranger 2.

Most people who are put in this strange laboratory situation agree in advance to commit some of their experimental dollars to the purpose of punishing Stranger 1’s stingy behavior. And it is on the basis of this finding that many social scientists believe that humans have a capacity for moral outrage: We’re willing to pay good money to “buy” punishment for scoundrels.

In the rest of the piece, I go on to point out the rather serious inferential limitations of the third-party punishment game as it is typically carried out in experimental economists’ labs. I also point to some contradictory (and, in my opinion, better) experimental evidence, both from my lab and from other researchers’ labs, that gainsays the widely accepted belief in the reality of moral outrage. I end the piece with a proposal for explaining what the appearance of moral outrage might be for (in a strategic sense), even if moral outrage is actually not a unique emotion (that is, a “natural kind” of the type that we assume anger, happiness, grief, etc. to be) at all.

I don’t want to steal too much thunder from the Center’s own coverage of the piece, so I invite you to read the entire piece over on their site. Feel free to post a comment over there, or back over here, and I’ll be responding in both places over the next few days.

As I mentioned above, I’ll be doing some additional writing for the center in the coming six months or so, and I’ll be speaking at a Center event in New York City in a couple of months, which I will announce soon.

The Trouble with Oxytocin, Part III: The Noose Tightens for The Oxytocin–>Trust Hypothesis

Might be time to see about having that Oxytocin tattoo removed…

When I started blogging six months ago, I kicked off Social Science Evolving with a guided tour of the evidence for the hypothesis that oxytocin increases trusting behavior in the trust game (a laboratory workhorse of experimental economics). The first study on this topic, authored by Michael Kosfeld and his colleagues, created a big splash, but most of the studies in its wake failed to replicate the original finding. I summarized all of the replications in a box score format (I know, I know: Crude. So sue me.) like so:

[Figure: box score as of December 2013]

By my rough-and-ready calculations, at the end of 2013 there were about 1.25 studies’ worth of successful replications of the original Kosfeld results, but about 3.75 studies’ worth of failed replications (see the original post for details). Even six months ago, the empirical support for the hypothesis that oxytocin increases trust in the trust game was not looking so healthy.

I promised that I’d update my box score as I became aware of new data on the topic, and a brand new study has just surfaced. Shuxia Yao and colleagues had 104 healthy young men and women play the trust game with four anonymous trustees. One of those four trustees (the “fair” trustee) returned enough of the subject’s investment to cause the subject and the trustee to end up with equal amounts of money; the other three trustees (designated as the “unfair players”) declined to return any money to the subject at all.

Next, subjects were randomly assigned to receive either the standard dose of intranasal oxytocin, or a placebo. Forty-five minutes later, participants were told that they would receive an instant message from the four players to whom they had entrusted money during the earlier round of the trust game. The “fair” player from the earlier round, and one of the “unfair” players, sent no message at all. The second unfair player sent a cheap-talk sort of apology, and the third unfair player offered to make a compensatory monetary transfer to the subject that would make their payoffs equal.

Finally, study participants took part in a “surprise” round of the trust game with the same four strangers. The researchers’ key question was whether the subjects who had received oxytocin would behave in a more trusting fashion toward the four players from Round 1 than the participants who received a placebo instead.

They didn’t.

In fact, the only hint that oxytocin did anything at all to participants’ trust behaviors was a faint statistical signal that oxytocin caused female participants (but not male participants) to treat the players from Round 1 in a less trusting way. If anything, oxytocin reduced women’s trust. I should note, however, that this females-only effect for oxytocin was obtained using a statistically questionable procedure: The researchers did not find a statistical signal of an interaction between oxytocin and subjects’ sex, and without such a signal, their separation of the men’s and the women’s data for further analyses really wasn’t licensed. But regardless, the Yao data fail to support the idea that oxytocin increases trusting behavior in the trust game.

It’s time to update the box score:


In the wake of the original Kosfeld findings, 1.25 studies’ worth of results have accumulated to suggest that oxytocin does increase trust in the trust game, but 4.75 studies’ worth of results have accumulated to suggest that it doesn’t.
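The bookkeeping behind the box score is simple enough to write out. This sketch assumes, as the tallies above imply, that the Yao study counts as one full study's worth of failed replication; the variable names are hypothetical.

```python
# Box score as of the end of 2013, in "studies' worth" of results.
successes = 1.25
failures = 3.75

# Assumption: the Yao et al. study counts as one full failed replication.
failures += 1.0

total = successes + failures
print(f"Successful: {successes}, failed: {failures}")
print(f"Share of replication evidence supporting the effect: {successes / total:.0%}")
```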

It seems to me that the noose is getting tight for the hypothesis that intranasal oxytocin increases trusting behavior in the trust game. But let’s stay open-minded a while longer. As ever, if you know of some data out there that I should be including in my box score, please send me the details. I’ll continue updating from time to time.

Of Crackers and Quackers: Human-Duck Social Interaction is Regulated by Indirect Reciprocity (A Satire)

Watching the ducks on a neighborhood pond can be an entertaining and rewarding pastime. I myself, along with my nine-year-old co-investigator, have taken daily opportunities to feed some ducks on a nearby pond over the past several months. In doing so, we not only had fun but also managed to conduct some urban science that led us to a new scientific discovery: Mallards (Anas platyrhynchos L.) engage in indirect reciprocity with humans. Scientists have known for decades, of course, that indirect reciprocity was critical to the evolution of human social interaction in large-scale societies, but we believe we are the first to identify indirect reciprocity at work in human-duck social interaction.

Here’s how we made this discovery.

On random days, we take a soda cracker along with us to feed to a single lucky duck. On the other days, we take our walks without a cracker. What my young co-investigator and I have noticed is that on cracker days, after we’ve fed the cracker to the first duck that approaches us (the “focal duck,” which we also call “the recipient”), other ducks (which we call “entertainment ducks,” or “indirect reciprocators”) appear to take notice of our generosity toward the recipient. Almost immediately, the indirect reciprocators start to perform all sorts of entertaining behaviors: They swim toward us eagerly, they waddle up to us enthusiastically, they stare at us with their dead, obsidian eyes, they quack imploringly. It’s all very amusing and my co-investigator and I have a great time. Take note of the fact that we always bring only a single cracker with us on cracker days. As a result, the indirect reciprocators have absolutely nothing to gain from the entertainment they provide. In fact, they actually incur costs (in the form of energy expended and lost foraging time) when they do so. Thus, their indirect reciprocity behavior is altruistic.

Our experience with the indirect reciprocators is very different on non-cracker days. If a focal duck comes up to us on a non-cracker day, there’s just no cracker to be had, no matter how charming or insistent the request. Dejected, the focal duck typically waddles or paddles away within a few seconds. Now, what do you suppose the entertainment ducks do after we refuse to feed the focal duck? That’s right. They withhold their entertainment behaviors. This pattern, of course, is exactly as one would expect if the entertainment ducks were regulating their entertainment behaviors according to the logic of indirect reciprocity.

Theorists typically assume that the computational demands for indirect reciprocity to evolve are quite extensive. For instance, indirect reciprocators need to possess computational machinery that enables them to acquire information about the actions of donors, either through direct sensory experience of donor-recipient interactions, or (more rarely) language-based gossip, or (even more rarely) social information stored in an external medium, such as written records or the reputational information that’s often available in online markets. Indirect reciprocators also need to be able to tag donors’ actions toward recipients as either “beneficial” or “non-beneficial,” store that social information in memory, and then feed that information to motivational systems that can produce the indirect reciprocity behaviors that will serve as rewards to donors. However, the indirect reciprocity we’ve identified in our mallards suggests that those computational requirements may be fulfilled in vertebrates more commonly than theorists originally thought.

Neither of us could figure out for sure whether the focal ducks were transmitting information about our generosity/non-generosity to the indirect reciprocators through verbal (or non-verbal) communication, but we think it is unlikely. Instead, we suspect that the indirect reciprocators were directly observing our behavior and then using that sensory information to regulate their indirect reciprocity behavior.

In support of this interpretation, we note that on several cracker days, it was not only other ducks that engaged us as indirect reciprocators, but individuals from two different species of turtles (which we believe to be Trachemys scripta and Apalone ferox) as well. The turtles’ indirect reciprocity behaviors, of course, were different from those of the ducks, due to differences in life history and evolutionary constraints: The turtles didn’t reward our generosity through waddle-based or quack-based rewarding, but rather, by (a) rooting around in the mud where the focal duck had received the cracker earlier, and (b) trying to grab the focal duck by the leg and drag it to a gruesome, watery death. The fact that turtles engaged in their own forms of indirect reciprocity suggests that they, at least, were obtaining information about our generosity via direct sensory experience, rather than through duck-turtle communication or written or electronic records: It is widely accepted, after all, that turtles don’t understand Mallardese or use eBay.

The involvement of turtles as indirect reciprocators also suggests that indirect reciprocity might be even more prevalent–and more complex–than even we originally suspected. Not only does indirect reciprocity evolve to regulate interactions within species (viz., Homo sapiens), and between species (viz., between Homo sapiens and Anas platyrhynchos L., as we have documented here), but also among species (Homo sapiens as donors, Anas platyrhynchos L. as recipients, and Trachemys scripta and Apalone ferox as indirect reciprocators).

Finally, we should point out that although our results are consistent with the indirect reciprocity interpretation that we have proffered here, other interpretations are possible as well. We look forward to new work that can arbitrate between these two accounts (and perhaps others). We also see excellent opportunities for simulation studies that can shed light on the evolution of indirect reciprocity involving interactions between two or even three different species, which my co-Investigator thinks she might pursue after she has mastered long division.

h/t Eric P.

I’m feeling Edge-y about Human Evolutionary Exceptionalism

Unless your Internet has been broken for the past few days, by now you’re probably aware that John Brockman, via his Edge.org web site, has published the responses to his Annual Edge Question of the Year. For more than 15 years, John has been inviting people who think and write about science and the science-culture interface to respond to a provocative question. The question for 2014 was “What scientific idea is ready for retirement?” Brockman explains:

Science advances by discovering new things and developing new ideas. Few truly new ideas are developed without abandoning old ones first. As theoretical physicist Max Planck (1858-1947) noted, “A new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it.” In other words, science advances by a series of funerals. Why wait that long? What scientific idea is ready for retirement? Ideas change, and the times we live in change. Perhaps the biggest change today is the rate of change. What established scientific idea is ready to be moved aside so that science can advance?

I received an invitation to participate this year, and it didn’t take me long to settle on a topic. Something has been bugging me about the application of evolutionary thinking to human behavior for a while, and the Question of the Year was the perfect place to condense my thoughts into a 1000-word essay. What scientific idea, in my opinion, is ready for retirement? I nominated Human Evolutionary Exceptionalism. Here’s how I framed the problem:

Humans are biologically exceptional. We’re exceptionally long-lived and exceptionally cooperative with non-kin. We have exceptionally small guts and exceptionally large brains. We have an exceptional communication system and an exceptional ability to learn from other members of our species. Scientists love to study biologically exceptional human traits such as these, and that’s a perfectly reasonable research strategy. Human evolutionary exceptionalism, however—the tendency to assume that biologically exceptional human traits come into the world through exceptional processes of biological evolution—is a bad habit we need to break. Human evolutionary exceptionalism has sown misunderstanding in every area it has touched.

In my essay, I went on to describe examples of how human evolutionary exceptionalism has muddled the scientific literatures on niche construction, major evolutionary transitions, and cooperation. You can read my entire essay over here, but for this blog I’m reproducing what I had to say about the major evolutionary transitions—for reasons that I will make clear presently. The most critical part below is bolded and italicized. Here’s what I wrote:

Major Evolutionary Transitions. Over the past three billion years, natural selection has yielded several pivotal innovations in how genetic information gets assembled, packaged, and transmitted across generations. These so-called major evolutionary transitions have included the transition from RNA to DNA; the union of genes into chromosomes; the evolution of eukaryotic cells; the advent of sexual reproduction; the evolution of multicellular organisms; and the appearance of eusociality (notably, among ants, bees, and wasps) in which only a few individuals reproduce and the others work as servants, soldiers, or babysitters. The major evolutionary transitions concept, when properly applied, is useful and clarifying.

It is therefore regrettable that the concept’s originators made category mistakes by characterizing two distinctly human traits as outcomes of major evolutionary transitions. Their first category mistake was to liken human societies (which are exceptional among the primates for their nested levels of organization, their mating systems, and a hundred other features) to those of the eusocial insects because the individuals in both kinds of societies “can survive and transmit genes . . . only as part of a social group.”…

Their second category mistake was to hold up human language as the outcome of a major evolutionary transition. To be sure, human language, as the only communication system with unlimited expressive potential that natural selection ever devised, is biologically exceptional. However, the information that language conveys is contained in our minds, not in our chromosomes. We don’t yet know precisely where or when human language evolved, but we can be reasonably confident about how it evolved: via the gene-by-gene design process called natural selection. No major evolutionary transition was involved.

This past Monday morning, right as I was about to go to the Edge web site to check out some of the other essays, someone e-mailed me to let me know about an uncanny coincidence. Just hours before my Edge essay came out, in which I was calling for the retirement of the misconception (among others) that human language was the outcome of a major evolutionary transition, Martin Nowak had published an essay on the Templeton Big Questions web site in which he was pushing in exactly the opposite direction. Here’s what Nowak had to say (my emphasis in boldface and italics):

I would consider these to be the five major steps in evolution: (i) the origin of life; (ii) the origin of bacteria; (iii) the origin of higher cells; (iv) the origin of complex multi-cellularity and (v) the origin of human language. Bacteria discovered most of biochemistry, higher cells discovered unlimited genetics; complex multicellularity discovered intricate developmental processes and animals with a nervous system. Humans discovered language.

Human language gave rise to a new mode of evolution, which we call cultural or linguistic evolution. The enormous speed of human discovery and invention is driven by this new mode of evolution. An idea or concept that originates in one brain can quickly spread to others. Structural changes (memories) are imprinted from one brain to another. Prior to human language the most crucial information transfer of evolution was mostly in terms of genetic information. Now we have genetic and linguistic evolution. The latter is much faster. Presumably the collective information in human brains evolves at a much faster rate than any previous evolutionary system on earth. The growing world wide connectivity speeds up this linguistic evolutionary process.

Now, Nowak and I both agree that human language is a Very Special Way of transmitting information, but I say human language was not the outcome of a major evolutionary transition. Nowak says it was. We can’t both be right, so what’s going on? With respect, I think it’s Nowak who’s muddling things.

It was John Maynard Smith and Eörs Szathmáry who were actually responsible for popularizing the idea that human language was a major transition in evolution (see my essay and read between the lines; you’ll know whom I’m talking about even though I followed Brockman’s instructions to talk about ideas rather than the people who promote them). But as I wrote in my essay, Maynard Smith and Szathmáry made a category mistake when they did so. Here’s the first sentence from the description of their book, The Major Transitions in Evolution: “During evolution, there have been several major changes in the way that genetic information is organized and transmitted from one generation to the next.”

The critical word in the last sentence is “genetic.” Evolutionary transitions are about information stored in DNA, not about information in people’s minds. So, by Maynard Smith and Szathmáry’s own definition of major evolutionary transitions, human language categorically, absolutely cannot be one of them.

This has got to be incredibly obvious to anyone who takes a moment to think about it, so I’m not quite sure why influential people keep the misconception going. Equally puzzling to me, though, is why Maynard Smith and Szathmáry committed this error in the first place. Those gents are/were smart (Maynard Smith died in 2004; Szathmáry is still with us), and few people have ever had cause to doubt Maynard Smith’s judgment (though read Ullica Segerstrale’s biography of Bill Hamilton to learn about a striking exception to that rule). In any case, I can’t understand why Nowak continues to promulgate the notion that the evolution of human adaptations for nice things like human societies (as he did here) and human language (as in the Big Questions Online piece), often using Maynard Smith and Szathmáry’s book for citation firepower, is comparable to actual Major Evolutionary Transitions that involved actual “major changes in the way that genetic information is organized and transmitted from one generation to the next.”

Human language is fascinating, puzzling, and a prime target for theory-building and research. Ditto for human cooperation and human societies. But these interesting features of human life are made neither grander, nor more comprehensible, by trying to get them into The Major Evolutionary Transitions club. They just don’t have the proper credentials.