Universal Positivity Bias?

Another study that arouses my instinctual skepticism: “Human Language Reveals a Universal Positivity Bias,” by Peter Sheridan Dodds, Eric M. Clark, Suma Desu, et al., PNAS Online, 112.8 (2015). The abstract:

Using human evaluation of 100,000 words spread across 24 corpora in 10 languages diverse in origin and culture, we present evidence of a deep imprint of human sociality in language, observing that (i) the words of natural human language possess a universal positivity bias, (ii) the estimated emotional content of words is consistent between languages under translation, and (iii) this positivity bias is strongly independent of frequency of word use. Alongside these general regularities, we describe interlanguage variations in the emotional spectrum of languages that allow us to rank corpora. We also show how our word evaluations can be used to construct physical-like instruments for both real-time and offline measurement of the emotional content of large-scale texts.

Significance: “The most commonly used words of 24 corpora across 10 diverse human languages exhibit a clear positive bias, a big data confirmation of the Pollyanna hypothesis.” It may well be true, and as a pretty optimistic guy myself, I don’t find it inherently implausible, but I don’t know enough about statistics to evaluate it, and I can’t help but think the results depend on how you set up the study and define things like “positive.” But confirmation of the Pollyanna hypothesis (which apparently goes back to 1969) certainly sounds like a good thing. Always look on the bright side of life!

Comments

  1. I didn’t quite understand their method, but results like a p-value of 10^-118 do not inspire confidence. I am not sure about other languages, but English is dominated (by frequency) by function words, which, probably by definition, do not carry any emotional content. Also, it is not clear whether in their main graph they’ve included each word as 1 unit (dangerous, because the distribution on the happiness scale can be frequency dependent; the authors contend otherwise, but some real measure would be better; I think that was the case) or proportional to its frequency (function words would have swamped any useful signal). All that sad (I mean “said”), the overall thrust that people prefer to project happiness rather than gloom is reasonable and the paper provides some support for it. The Hedonometer, though, should be taken out and shot, if for nothing else than for its name.

  2. Start the year with a stupid study, I always say ! The authors take the utterly vague word “positivity”, then burble about methods and mathematics in order to bend it to mean what they want it to mean. Strike one against them: they draw the credulous reader in with vagueness, then send him on his way with spurious precision.

    Now let’s consider the simple math involved in these statements by the authors:

    “10 languages .. ~10,000 words for each language … Overall, we collected 50 ratings per word for a total of around 5 million individual human assessments. … We then paid native speakers to rate how they felt in response to individual words on a nine-point scale, with 1 corresponding to most negative or saddest, 5 to neutral, and 9 to most positive or happiest.”

    OK, indeed 10 * 10^4 * 50 = 5 million. But how did this work in practice ? Apparently at least 50 people “overall” must have been involved.

    If there were only 50 native speakers involved, it seems that each of them would have to “assess” ~10,000 words. But there are only 10 languages involved. Assuming the authors took the same number of native speakers for each language, then each of 5 native speakers per language would need to “assess” ~10,000 words. These 5 speakers could not have divided up the task for their common language into disjoint sets of 2,000 words for each speaker, since that would not yield 50 ratings per word, but only 1. In this scenario each word is “assessed” by only one speaker.

    So many more than 50 native speakers must be involved. If there were 5,000 native speakers involved, that makes 500 speakers per language (still assuming the same number of speakers for each of 10 languages). Would each of them need to “assess” the full ~10,000 words of their language ? No, the ~10,000 words could be separated into 10 disjoint sets of 1,000 words each, with one of 10 disjoint groups of 50 speakers assigned to each set of 1,000 words. This would yield the claimed 50 ratings per word “overall”.

    So, in this scenario, each speaker would have to “assess” each of 1,000 words for its “positivity”.

    Can any reader here seriously imagine himself/herself “assessing”, in any reproducible and reliable way, each of 1,000 words for its “positivity” ?

    Take Hat’s post for example. How would you “assess the positivity” of the words in this sentence:

    It may well be true, and as a pretty optimistic guy myself, I don’t find it inherently implausible, but I don’t know enough about statistics to evaluate it, and I can’t help but think the results depend on how you set up the study and define things like “positive.”

    Is “inherently” a word with positivity ? What about “‘positive’”, i.e. the quoted word ? Is “optimistic” a word with positivity, or is it a neutral word that means “positivitiness” ?

  3. Well, that’s where 50 assessments come in. If you have no idea whether a word is positive or negative, or how much, then simply hit the midrange. Or maybe take a random number. That’s the whole point of statistics – we live in a world with a good deal of noise, but if things are averaged properly, an interesting correlation may shine through.

    The whole idea of the paper remained somewhat unclear. I don’t think it’s a very hard task to come up with lists of “happy” and “sad” words. And apparently people have been doing it for some time and assessing various texts using these lists. Now our bored physicists or comp scientists or whoever ride in and tell everyone that it is unscientific and what we need is a data-driven approach. Fine. But as the outcome, you do not construct a silly hedonometer; you figure out what your statistical approach discovered that is new. The paper seems to be on the silly side of things, but that is different from wrong.

  4. Athel Cornish-Bowden says:

    The authors take the utterly vague word “positivity”, then burble about methods and mathematics in order to bend it to mean what they want it to mean. Strike one against them: they draw the credulous reader in with vagueness, then send him on his way with spurious precision.

    I’m reminded of Wassily Leontief’s comment about theoretical economists who have displayed a tendency to “lead … the reader from sets of more or less plausible but entirely arbitrary assumptions to precisely stated but irrelevant theoretical conclusions”. He was talking about economics, but we get a lot of that in theoretical biology.

  5. marie-lucie says:

    Athel: we get a lot of that in theoretical biology.

    I don’t think it is absent from some forms of linguistic theory.

  6. marie-lucie says:

    “The most commonly used words of 24 corpora across 10 diverse human languages exhibit a clear positive bias, a big data confirmation of the Pollyanna hypothesis.”

    I am skeptical, but not interested enough to want to read the article.

    Young people at the most linguistically creative time of their lives seem to reevaluate words with a currently positive bias as they become “bleached” from overuse, and to assign a very positive bias to others which were only mildly positive or even neutral (even negative!), before a few years of use require a new cycle of creativity. Words newly created for the purpose may or may not remain in the language (see the fate of “groovy” from the 60s, now hopelessly dated).

    Conversely, the same thing can happen at the negative end. I remember reading about a French study (probably pre-WWII) which found an extraordinary number of words for ‘stingy, avaricious, tight with money’ in urban slang as well as rural dialects, implying that the lack of generosity was seen as by far the worst attribute of a person, and no single word was negative enough to describe it.

  7. Jim (another one) says:

    “but English is dominated (by frequency) by function words, which, probably by definition, do not carry any emotional content.”

    Any study that distinguishes function words from syntactic affixes is flawed from the start. It’s just not a very serious approach.

    “I don’t think it is absent from some forms of linguistic theory.”

    And we can list a lot of them by name, can’t we?

  8. Any study that distinguishes function words from syntactic affixes is flawed from the start. It’s just not a very serious approach.

    I am not sure what you are saying. If it is that “it is” and “it’s” should not be treated differently, true, but trivial. If you also extend your comment to focusing on words as strings of characters separated by blank spaces and punctuation marks vs. stemming, then I don’t really know why. At any rate, the function words are very frequent and the emotional content ascribed to them is most probably just noise. Thus, omnivorous data analysis is at risk of being swamped by noise. If you want to make a rigorous study and by “rigorous” you mean “not excluding words that are obviously irrelevant without any formal procedure”, there is a possibility to exclude the function words on syntactic grounds. Yes, languages are messy, but ignoring this fact and hoping that the truth will out all by itself if enough data is collected is no way to deal with it. Both the data and the clear reasoning are necessary.

  9. > “The most commonly used words of 24 corpora across 10 diverse human languages exhibit a clear positive bias, a big data confirmation of the Pollyanna hypothesis.”

    It seems like this could just as easily confirm the, um, Tolstoy hypothesis: if “positive statements are all alike; every negative statement is negative in its own way”, then we’d expect there to be fewer distinct positive words, such that positive words have a better chance to be “most commonly used words”.

    (Disclaimer: I haven’t read the paper. Maybe it somehow rules out this interpretation.)
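
The rater arithmetic walked through in comment 2 can be checked with a few lines of Python. This is only a sketch: the figures (10 languages, roughly 10,000 words each, 50 ratings per word) come from the passage quoted there, while the equal split of raters across languages and the function names are assumptions made for illustration, not details taken from the paper.

```python
# Figures quoted in comment 2 (approximate in the original).
LANGUAGES = 10
WORDS_PER_LANGUAGE = 10_000
RATINGS_PER_WORD = 50


def total_ratings():
    """Total individual assessments implied by the quoted figures."""
    return LANGUAGES * WORDS_PER_LANGUAGE * RATINGS_PER_WORD


def words_per_rater(total_raters):
    """Words each rater must assess to reach 50 ratings per word.

    Assumes (hypothetically) that raters are split equally across
    languages and that the workload within a language is shared evenly.
    """
    raters_per_language = total_raters // LANGUAGES
    ratings_needed = WORDS_PER_LANGUAGE * RATINGS_PER_WORD
    return ratings_needed / raters_per_language


print(total_ratings())        # 5000000
print(words_per_rater(5000))  # 1000.0
print(words_per_rater(50))    # 100000.0, more than the 10,000 words available
```

Under these assumptions, 5,000 raters gives the 1,000 words per rater reached in comment 2, while 50 raters cannot work: each would need more ratings than there are words in the language. The smallest workable panel would be 50 raters per language, each assessing all 10,000 words.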
