A computer program can allegedly distinguish between male and female authors with 80% accuracy. If this can be independently verified, I guess you can’t argue with success, but I’m deeply suspicious of anything that runs on this kind of fuel:

“Women have a more interactive style,” said Shlomo Argamon, a computer scientist at the Illinois Institute of Technology in Chicago who developed the program. “They want to create a relationship between the writer and the reader.”
Men, on the other hand, use more numbers, adjectives and determiners – words such as “the,” “this” and “that” – because they apparently care more than women do about conveying specific information.

Uh huh. Anyway, read all about it in the Jewish World Review story (which I chose out of a bunch of identical ones from different newspapers because it has the URL of a site where you can examine Argamon’s research); thanks to Laputan Logic for the story (he gives a link to The Age, but it’s the same old applesauce).
Addendum. Related MetaFilter thread.


  1. Interesting. No reason to be suspicious, I think, given two recently popularized techniques for classifying text.
    Naïve Bayesian text classifiers are being very successfully employed to distinguish between spam and non-spam messages. Researchers have also recently been able to classify texts by author by compressing (with gzip) a corpus of text from each author, then appending the text to be classified to each corpus and re-compressing. The corpus whose compressed size grows the least turns out to be a good guess at the author.
    If there is any bias at all in writing style between genders (or, in this case, strictly sexes), either of these methods might have a good chance of detecting it.

  2. It has been suggested that Tiptree is female, a theory that I find absurd, for there is to me something ineluctably masculine about Tiptree’s writing. I don’t think the novels of Jane Austen could have been written by a man nor the stories of Ernest Hemingway by a woman, and in the same way I believe the author of the James Tiptree stories is male.

    Robert Silverberg, in his introduction to Warm Worlds and Otherwise (1975) by James Tiptree, Jr., the pseudonym of Alice Sheldon, woman.

  3. Reflecting upon this, it seems to me that we could quite easily conduct our own experiment. Choose N known female bloggers, and M known male bloggers. Take a month or two of the bloggage of each from which to build text corpuses.
    For each blogger, construct two corpuses (one male, one female) from all the bloggage EXCEPT that of the blogger in question, run the naïve Bayes test and the compression test on the held-out blogger's text, and record how each test classifies it.

  4. Jonathan, you get a gold star. That was perfect.
    Jim: An excellent idea.

  5. Mua-hahahahaha. I volunteer to be one of the female subjects. I get to pick the month, though.

  6. A sure thing about that test: the probability of one deliberately aiming to foil it increases the more contrary, perverse people read about it.

  7. I’m sure that any differences between female and male writers (in English? Hebrew?) can be determined by a computer program, but I’m not so sure about the proposed reasons, such as: “[Women] want to create a relationship between the writer and the reader.”
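
For anyone who wants to try the experiment proposed in comments 1 and 3, here is a minimal sketch of the compression test combined with the leave-one-blogger-out setup. It is not Argamon's method, and it is not taken from any published implementation. The function names (`compressed_size`, `classify`, `leave_one_out`) and the input layout (a dict mapping each blogger's name to a label and their text) are assumptions made for illustration. It uses Python's `zlib` module, which implements the same DEFLATE compression that gzip uses.

```python
import zlib


def compressed_size(text):
    """Size in bytes of the text after DEFLATE compression (as used by gzip)."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def classify(sample, corpora):
    """Return the label whose corpus grows least when the sample is appended.

    `corpora` is a hypothetical layout: a dict mapping label -> corpus text.
    """
    def growth(label):
        base = corpora[label]
        return compressed_size(base + "\n" + sample) - compressed_size(base)

    return min(corpora, key=growth)


def leave_one_out(posts_by_blogger):
    """Classify each blogger against corpora built from everyone else.

    `posts_by_blogger` (hypothetical layout) maps name -> (label, text).
    Returns a dict mapping name -> predicted label.
    """
    predictions = {}
    for name, (_, text) in posts_by_blogger.items():
        corpora = {}
        for other, (label, other_text) in posts_by_blogger.items():
            if other == name:
                continue  # hold out the blogger being tested
            corpora[label] = corpora.get(label, "") + "\n" + other_text
        predictions[name] = classify(text, corpora)
    return predictions
```

Comparing the returned predictions with the known labels gives the accuracy. Two caveats: DEFLATE only looks back about 32 KB, so with a month or two of posts per corpus most of the corpus will be out of reach of the appended sample, and a label with only one blogger has no corpus left once that blogger is held out. The naïve Bayes test would need a separate classifier but could reuse the same hold-out loop.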
