Gordon Smith has a Conglomerate post about a Utah Supreme Court case, State v. Rasabout, which turned on whether a man was properly convicted of 12 counts of “unlawful discharge”: was each shot a separate “discharge,” or should the 12 shots together be considered a single “discharge”? The court held that “each discrete shot” is one “discharge.” The interesting thing, though, is that Associate Chief Justice Tom Lee was uncomfortable resolving the statutory ambiguity by reference to the dictionary. Smith says that “the gist of the problem is that the dictionary definition of ‘discharge’ could mean ‘to shoot’ or it could mean ‘to unload.’ And the dictionary does not tell us the best meaning in this context. To resolve this problem, Justice Lee turns to corpus linguistics:”
In this age of information, we have ready access to means for testing our resolution of linguistic ambiguity. Instead of just relying on the limited capacities of the dictionary or our memory, we can access large bodies of real-world language to see how particular words or phrases are actually used in written or spoken English. Linguists have a name for this kind of analysis; it is known as corpus linguistics.
The fancy Latin name makes this enterprise seem esoteric and daunting. It is not. We all engage in it even if we don’t attach the technical label to it. A corpus is a body, and corpus linguistics analysis is no more than a study of language employing a body of language. When we communicate using words we naturally access a large corpus—the body of language we have been exposed to during our lifetimes—to decode the groups of letters or sounds we encounter. The most basic corpus linguistics analysis involves our split-second effort to access the body of language in our heads in our ongoing attempt to decode words or phrases we may be uncertain of. We all do that repeatedly every day.
It is a small step to utilize a tool to aid our linguistic memory. Judges do this with some frequency as well. Naturally. If judges are entitled to consult the corpus of language in our heads (and how could we not?), we must also be permitted to supplement and check our memory against publicly available sources of language.
As Smith says, “Yes, yes, yes!” Via Mark Liberman’s Language Log post, where you will find a good discussion (including a response from Smith, who has fixed a typo I pointed out).