The Importance of Data.

January 25, 2019 by languagehat 19 Comments

John Cowan sent me a link to “Methodological Thoughts from the Linguistic Field” by William Davies, whose abstract reads:

Data are the heart and soul of any linguistic research. Regardless of how incisive an analysis might be, or how clever, it can never be any better than the data it is based upon. For the field linguist gathering data, important considerations include the selection of informants, the number of informants selected, and data collection techniques. Different research objectives, be they descriptive, prescriptive or theory-driven, require techniques appropriate to those particular goals and should be evaluated within the context of inquiry. What follows is a consideration of the techniques generally used by field linguists with a general descriptive goal within the framework of generative linguistics.

JC’s comment:

This is a paper about field research into an obscure language, Madurese — by a generativist. Will wonders never cease? I’m particularly impressed that he talks about “acceptability judgments” instead of “grammaticality judgments”, which confirms my view that grammaticality is relative to a specific grammar and cannot be judged by informants; what they can tell you is whether the sentence is acceptable Madurese or not.

Very true!

Comments

David Eddyshaw says

January 25, 2019 at 8:01 pm

My impression is that field linguists in general are pretty sniffy about elicitation, regarding it basically as a pis aller when there’s no other way of filling out a paradigm or whatever. That’s for good reason, too: it’s a grand way of producing unnatural language of a kind never heard in the wild. It’s a bit surprising to see a different take here, and I wonder if this is in fact the result of the author’s generativism, which seems likely as an ideological position to make one less concerned about such things – it’s all about “acceptability.”

But many informants happily accept locutions they would never actually produce. (On a trivial level, it took me a while to realise that in West Africa I should never ask directly “Is that a correct way of saying it?”; it’s extremely impolite to say “no” to any leading question, especially from a high-status person or a guest.)

At any rate, that goes for syntax; when it comes to phonology and morphology life is too short for the perfectionist route of hoping you live long enough to hear someone say the right thing by chance one day. But even there you can get into trouble. Intelligent informants with a feel for the patterns of their language will fill in the gaps in defective paradigms for you just to be helpful …
David Eddyshaw says

January 25, 2019 at 8:42 pm

The opposite is also pretty common: lots of people will tell you that a construction is unacceptable, and then turn round and use it themselves when they’re not thinking about it. This is just as true for small languages with no prescriptive tradition as it is for French or English.
Piotr Gąsiorowski says

January 25, 2019 at 9:17 pm

As in, “The very idear of pronouncing an intrusive /r/ makes me shudder!”
Norvin says

January 25, 2019 at 10:12 pm

But many informants happily accept locutions they would never actually produce.

One standard way of addressing this, for us elicitation-using generativist field linguists (and yes, there are many of us) is to ask people to repeat the sentences that they’re claiming are fine. Often they’ll change them.

I have also had a lot of consultants who acted this way but who were willing to draw a distinction between things that I, the outsider, could say–which was pretty much anything–and things that they would say themselves. Sometimes it becomes a running joke: “Is this something I can say?” “Oh yes, certainly,” “So, can you say it?” “Oh, no!”
Bathrobe says

January 25, 2019 at 10:19 pm

I have never had to interview informants, but learning a foreign language from a native teacher can involve similar issues.

One phenomenon I find baffling is when speakers say: “According to grammar we could say that, but we don’t”. That is, a particular construction could be stretched in a particular direction in theory but isn’t in practice.

Sometimes this kind of objection can be circumvented with a specific situational or linguistic context, in which it becomes clear that the construction can be used in the desired way. But often it turns out that the proposed sentence simply isn’t valid. At what point do you accept that the sentence is invalid in any context and move on?

Prescriptivism is also a doughty foe. A well-meaning teacher could tell you that a structure is wrong because it doesn’t conform with what they were taught in school or what they regard as good logic. Such a person might have excellent insights into language but still have blind spots caused by judgemental attitudes. It takes a certain degree of sophistication to note that what is unacceptable in formal written language (with its prescriptive grammar) might be acceptable in loose everyday speech.
Bathrobe says

January 25, 2019 at 10:27 pm

“Is this something I can say?” “Oh yes, certainly,” “So, can you say it?” “Oh, no!”

“Is this something I can say?” might be par for the course in English but doesn’t necessarily work in other languages (e.g., pro-drop languages). If you are interviewing people in their own language you need to be alert to the nuances of how you ask about acceptability.

I would hate to interview English speakers about their language because prescriptivism would almost invariably raise its ugly head. “Could you say this?” “No, it’s wrong”. “Er…, ok, do you know any people who might say it?” “Yes but that’s sloppy English”.
David Eddyshaw says

January 25, 2019 at 10:32 pm

Yo, Norvin! Welcome back!

Do you think there’s anything in my feeling that the use of elicitation without getting hung up on its essential sinfulness is more characteristic of generativists than field linguists of other persuasions? (It seems likely to me a priori, but lots of things that seem likely to me a priori turn out to be – alas – false.)
You seem like a good person to ask.
Norvin says

January 25, 2019 at 10:40 pm

Hi, David!

I think you are probably on to something there, yes. There are generativists who are also worried about elicitation (Sandy Chung is one famous name that comes to mind), and even those of us who use it try to think hard about the best ways to use it and the best ways to interpret the results. But I think it would be hard to find a generativist who just outright won’t accept elicited data. For the things we work on, it’s hard to imagine a substitute.
Joel says

January 25, 2019 at 11:12 pm

What does it take for a language to merit the label “obscure”?
languagehat says

January 25, 2019 at 11:22 pm

Yo, Norvin! Welcome back!

Seconded! I was hoping this post would bring you out of the woodwork.
Y says

January 25, 2019 at 11:23 pm

I can’t imagine anyone doing fieldwork can be doctrinaire about corpus vs. elicitation techniques. It’s inescapable that you need both, and so recommend all linguistics fieldwork textbooks, including Samarin’s, written in 1967.
A speaker of a language without the pluperfect might not ever discover its existence in a language that does using elicitation alone. On the other hand, a corpus might not ever give you instances of ‘I died’ (which might happen to have an interesting inflection).
Bathrobe says

January 25, 2019 at 11:28 pm

@David

I suspect that the generativist use of elicitation is an extension of the generativist use of introspection. A lot of generativists, at least in the early days, seemed to gather linguistic data by looking at their own usage, going over in their minds the acceptability of this sentence or that, and making mental changes that might impact on its acceptability. This way they tried to arrive at a model of speaker ‘competence’ (their own) that would help them formulate the correct generative explanation.

Since you can’t do this with a language you aren’t a native speaker of, the next best thing is to put all kinds of alternatives to informants in order to test their acceptability. For instance, studies on the anaphora of reflexives might involve asking native speakers about a whole range of embedded reflexive forms to see what they are understood as referring back to. This means pushing the boundaries of language beyond natural spoken usage in an attempt to find what the underlying grammatical rules are.

In my understanding (and I am not totally sure of my ground here), the old structuralist tradition was perhaps less focused on this kind of constant probing and more interested in eliciting natural forms for the purpose of linguistic description. Confusing native speakers with seemingly nonsensical hypothetical forms would, I suspect, not have been regarded as good practice. If this is the case, perhaps generativists are more open to the creative use of elicitation techniques than structuralists were. But I am willing to be corrected by someone who has a better knowledge of structuralist field techniques.
David Marjanović says

January 25, 2019 at 11:41 pm

One phenomenon I find baffling is when speakers say: “According to grammar we could say that, but we don’t”. That is, a particular construction could be stretched in a particular direction in theory but isn’t in practice.

Why baffling? Plenty of things are grammatical but not idiomatic.

(And a few fixed phrases might even be idiomatic but not grammatical. But in such cases the grammar would most likely be defined as containing those phrases as exceptions.)
Norvin says

January 26, 2019 at 1:04 am

I miss Bill Davies–he was an easy person to talk to. Probably part of what made him so good at fieldwork. He died pretty recently, far too young:

https://linguistlist.org/issues/28/28-3515.html
David Eddyshaw says

January 26, 2019 at 7:11 am

I can’t imagine anyone doing fieldwork can be doctrinaire about corpus vs. elicitation techniques. It’s inescapable that you need both

Undeniable. But it does seem to me that teh hardcorez (including many of the authors of the most impressive descriptive grammars I’ve seen) do tend to regard elicitation as a necessary evil rather than an optimum.

Bathrobe’s comment neatly expresses why I wondered if a relatively unbuttoned take on elicitation might be encouraged by the generativist outlook.

Norvin is obviously right that a lot turns on how you conduct your elicitation. And a lot depends on your informant cottoning on to what you’re actually trying to do and becoming a partner in exploration rather than a sort of grammar-check (I was especially fortunate in that regard myself.)
Bathrobe says

January 26, 2019 at 8:10 am

@ David

I wasn’t thinking of idioms, rather of grammar. For instance, Mongolian has a passive, but it is not much like the English passive in either frequency or usage. This is not a matter of “idiomaticity”.

Even in English, a sentence like “That can’t be afforded by me” would probably put most native speakers off, but “That can’t be afforded by most people” might actually be found in use. Given that usage can shift, the point I want to make is, where is the line between “grammatically correct” and “idiomatic” usage? And more importantly, where and how do native speakers draw the line? It can be hard to tell if they are making judgements of pure, mechanical grammaticality, or judgements about “acceptability”.
David Marjanović says

January 26, 2019 at 8:38 am

Good point.
melboiko says

January 28, 2019 at 11:26 am

I do research on Japanese dialectal tones, and I just ask them to read written words aloud (the writing system doesn’t mark tone). Early data showed quite safely that they don’t reproduce the standard tones this way, but seem to apply local tone even when in “school pupil mode” (enunciating standard words in a more or less consciously prescriptive way). It had long been noticed in Japanese dialectology that tone appears to resist standardization; probably because tone classes are unpredictable cross-dialectally (the only way to reproduce another dialect’s tone is to relearn the pitch patterns for every single word in the lexicon, which is hard).

Admittedly, elicitation by reading might still have various kinds of interferences. But it’s a calculated gambit; our goal is to probe the development of Middle Japanese tone classes into the various dialects, and by doing this we can gather large-scale data on most of the known (tonally annotated) MJ vocabulary.

My favourite moment was when one particularly unceremonious elderly lady pointed out that this is a bad method because it’s hard to do dialect properly when looking at orthography (e.g. ido ‘well’ in local speech would be [e̝ⁿdo]; but if you are looking at letters which say ido, you’ll be tempted to say it the way it’s written). She was of course 100% correct, and I had to spend some time to convince her that it works for tones and that’s all we need right now.
David Eddyshaw says

January 28, 2019 at 12:47 pm

I suspect that in general, elicitation works best for exactly those features of language that speakers are least conscious of, or at least aren’t thinking about at that precise moment.

Kusaal distinguishes a ‘personal’ gender, appropriate for entities that might conceivably be referred to by first or second person pronouns (called ‘animate’ for convenience) from an impersonal ‘inanimate’ gender. The boundary is not altogether fixed; for example as in English, babies can be animate or inanimate depending on how empathetic you’re feeling, and higher animals in practice are often referred to as animate even though my informants unanimously refused to accept this in isolated expressions. But mass nouns referring to substances have (unsurprisingly) consistently inanimate gender. Nevertheless, when I was exploring whether it was possible to say salima la’ad ne butiis in the sense “silver [items and cups]” as opposed to “[silver items] and cups” (it was an ill-chosen example), my informant (who was very aware of the nuances of his language and understood what I was trying to do perfectly) said that this would have to be recast as salima la’ad ne o butiis, using the animate sg pronoun [“silver’s items and his cups”]. He only “corrected” the gender when I specifically drew his attention to it. I noticed a lot more of this sort of thing once I got better at eavesdropping on conversations; in such cases written texts (even informal ones) usually also conceal the linguistic reality, just as they conceal external sandhi phenomena. You need recordings of unselfconscious everyday conversation – if you can find a way of pulling it off. Heisenberg … And obviously there are questions of genre and varying levels of elevation of discourse which might complicate things further.

At the end of the day, the fact is that it is perfectly possible for careful and insightful speakers, uncorrupted by prescriptivism, to have clear insights about their own language which are, in point of fact, wrong.

Cyrus Gordon somewhere remarked on how he learnt early in his career that you cannot rely on Sprachgefühl even in your own mother tongue. I remember being quite shocked by this statement when I first encountered it. Whether you find it shocking or not probably says a great deal about your philosophy of linguistics; for some, I guess that it’s an outright contradiction in terms.
