Nothing Survives Transcription.

May 11, 2023 by languagehat 57 Comments

But also, nothing doesn’t survive transcription. So says Allison Parrish in a lecture delivered at Iona University’s Data Science Symposium in April; some excerpts:

My talk today is about transcription—how text comes to be. My goal is to trouble your understanding of what transcriptions are, how transcriptions work, and the stakes of this understanding (with particular reference to large language models). […]

By “transcription” I mean the result of adapting some stretch of language from one medium to another, in such a way that the adapted version is understood to have the same “content” as the “original.” Maybe more precisely: a linguistic artifact A is a transcription of a different linguistic artifact B if B precedes A causally and temporally, and A and B are understood to be identical in meaning, though they differ in material form.

The prototypical example of a transcription is a “transcript”—a written artifact that records the “content” of a stretch of language that was spoken out loud. And indeed, I’ll be talking about transcripts of this kind in more detail later. But I think the term “transcription” usefully applies to adaptations of language between any two modalities. For example, producing a typewritten copy of a handwritten manuscript is a kind of transcription. Taking notes on a lecture is a kind of transcription. Under this definition, even my verbal performance of this talk (reading from my speaker notes) is a variety of transcription. […]

There is what I call a “folk theory of transcription,” which is that transcription is, for the most part, a transparent process that mostly “just works,” and that a transcript of a stretch of language, or a digitized version of a text, is more or less the “same thing” as the original. This theory furthermore supposes that a transcription’s failure to reflect any given aspect of the original stems from either (a) the triviality of that aspect, or (b) an insufficiently “close” transcription process.

This folk theory, I think, is what underwrites the claims of large language models, like ChatGPT, which are mostly trained on plain text transcriptions of documents. The claim that these models can produce meaning relies on the assumption that plain text transcriptions of documents contain more or less the same “content” as their originals. […] This is (as I hope to demonstrate) not a very rich or very useful way of understanding texts.

With all of that in mind, I want to go back to Emily Dickinson. […] Dickinson’s punctuation is of particular concern to Howe and other scholars. She often eschewed periods, commas and semicolons in favor of dashes of varying lengths and directions, along with intentional variations in the spacing between words. Her earliest editors simply eliminated these, considering them anomalies, and typeset Dickinson’s poems according to poetic conventions of the time. But since the 1960s, there has been a simmering debate about the precise meaning of these marks and how they should be transcribed. Edith Wylder has proposed, for example, that these marks are “a form of punctuation,” of Dickinson’s invention, directed “to the reader’s inner ear.” The marks, she claims, register tonal modulation, pitch, and breath. Other scholars have suggested that Dickinson’s punctuation is “another means by which [she] takes us backstage to view the struggle of poetic process, a struggle to find the right word” (Wylder 210).

In the face of all this, Susan Howe concludes that there is no editorial approach to Dickinson that could be truly faithful to her poetic innovations. “Words are only frames. No comfortable conclusions. Letters are scrawls, turnabouts, astonishments, strokes, cuts, masks.… [Dickinson’s] manuscripts should be understood as visual productions. […] I think her poems need to be transcribed into type, although increasingly I wonder if this is possible.” I read Howe here as claiming that, in fact, the “crucial part” of Dickinson’s lyric did not survive transcription. Howe underlines the impossibility of survival with a political argument: “The production of meaning will be brought under the control of social authority” (Howe 140). (Howe also doesn’t hesitate to point out that although Dickinson’s poems themselves are in the public domain, you must pay a fee to quote the transcriptions of those texts.) […]

And transcriptions are institutions. We see this with Emily Dickinson’s work. The questions surrounding how to transcribe this text has breathed life not just into scholarship like Susan Howe’s, but also new editions of Dickinson’s work designed to better represent her linguistic innovation, including photographic facsimile editions. The institution of Dickinson’s transcription has also occasioned works of art, such as Jen Bervin’s The Dickinson Composites, a series of large-scale embroideries that reproduce Dickinson’s variant punctuation, leaving out the accompanying words. […]

Before we move on from Dickinson, I want to quote one of the Dickinson scholars that I read to prepare for this talk. Edith Wylder draws a conclusion much like Howe’s, writing that the “subtle refinements” of Dickinson’s punctuation “are lost in transcription” and that the “failure to transcribe these notations accurately… blur[s] the accent that distinguishes her persona and her story.” But then she makes an appeal: “Surely the infinite resources of modern technology will permit a more accurate transcription of Dickinson’s accentual notations” (Wylder 221–22). Now, Wylder was writing twenty years ago, long before the decidedly limited nature of technology’s resources was widely recognized. But I think this kind of technosolutionism is still prevalent when it comes to how we think about transcriptions.

She goes on to discuss conversation analysis (“I’m interested in dispelling the idea that there is such a thing as an objectively ‘accurate’ transcript that ‘preserves the essential aspects’ of some stretch of language”) and John Cage’s famous 4’33”:

The appearance of “nothing” in artistic works is often an appeal to look outside systems of transcription. The text of Cage’s piece (at least in certain editions) is simply the word “TACET,” written once for each of the piece’s three movements. But the experience of the piece—the “accidental sounds”—can’t be scored, or abstracted, or understood to be identical with any other experience. For me, the point of 4’33” is that “nothing” is always actually material. “Nothing” exists in real space and real time. In drawing attention to “nothing,” artists draw attention to this irreducible—and untranscribable—materiality.

Her conclusion begins:

So: nothing survives transcription, in the sense that no text makes it to the far side of the transcription process with its life intact. And also, nothing does not survive transcription: the empty parts of a text, the silent parts, the parts of the text that draw attention to its own materiality, specifically operate outside transcription’s capabilities. And all of us—whether as artists, poets, or everyday conversationalists—draw on the “nothing” that forms the gap between what can be transcribed and what cannot as a productive and creative resource.

It’s one of those essays that pushes an idea too far (in that it’s easy to push back against a lot of what she says) in the interests of forcing us to think differently; I think that’s a useful way to proceed, and the whole piece is worth reading.

Comments

David Eddyshaw says

May 11, 2023 at 8:55 pm

As Wendy Cope* says

Higgledy-piggledy
Emily Dickinson
Liked to use dashes
Instead of full stops.

Nowadays, faced with such
Idiosyncrasy,
Critics and editors
Send for the cops.

* Cope’s “Waste Land Limericks” are a distinct improvement on Eliot’s original.
AntC says

May 11, 2023 at 9:29 pm

… in the sense that no text makes it to the far side of the transcription process with its life intact.

Interesting Parrish should mention Cage’s 4’33” and yet not the long tradition of musical transcriptions.

Bach’s transcriptions for keyboard of (say) Vivaldi’s concertos are really more of a re-imagining.

Busoni’s transcriptions (to small ensemble) of Bach’s keyboard works (bloody Ave Maria) mostly, yes, leave little intact.

Stokowski’s transcription for orchestra of the Ciaccona from the solo violin Partita No. 2 somehow contrives to make the whole Philadelphia Symphony sound smaller than a single violin. (Bach himself transcribed the Partitas for lute; there’s various performances on YouTube, including one on a modern lute-alike with bolt-on sympathetic strings that is just perfect.)

Mussorgsky’s Pictures at an Exhibition sounds to me just perfect for piano. (I have a very old Melodia recording by Sviatoslav Richter, snuck out during Soviet times, when they were dirt cheap.) No orchestration has improved it imo — even though M himself intended to orchestrate.
AntC says

May 11, 2023 at 9:36 pm

The claim that these [large language like ChatGPT] models can produce meaning relies on the assumption that plain text transcriptions of documents contain more or less the same “content” as their originals.

There’s lots of hype around those models, but if you pay close attention, I don’t think they’re claiming to “produce meaning”. Rather, it’s the source texts that contain meaning (_if_ they do); ChatGPT mimics it or abbreviates it or something.

ChatGPT is to source texts as Busoni is to Bach. Discuss.
languagehat says

May 11, 2023 at 9:42 pm

Mussorgsky’s Pictures at an Exhibition sounds to me just perfect for piano.

I entirely agree.
Y says

May 11, 2023 at 10:09 pm

The piano PaaE is perfect, especially with the right pianist. That said, Ravel’s version is, if not perfect, flawless. No complaints at all.
Y says

May 11, 2023 at 10:14 pm

Recommended for the Dickinsonian: The Gorgeous Nothings, a beautiful facsimile edition of the poems she wrote on envelope scraps.
languagehat says

May 11, 2023 at 10:21 pm

Yes, it would be a great thing to have, but it’s a bit pricey for us common folk.
AntC says

May 11, 2023 at 10:30 pm

but it’s a bit pricey for us common folk.

Would they be cheaper if — oh I don’t know — they were transcribed on to the back of old envelopes?
David L says

May 11, 2023 at 11:33 pm

Emily Dickinson’s poetry would have been altogether different if she’d had access to full sheets of paper rather than a collection of scraps.
AntC says

May 12, 2023 at 12:15 am

William S. Burroughs similarly.
D.O. says

May 12, 2023 at 12:23 am

David L, way back when my friends and I used to write our calculations on the back side of some low quality election flyers (left over from the friends of the friends who had been making a bit of dough on some election campaign). Once, one of my friends had the temerity to show his calculations to his thesis advisor written on these pages. After some substantive criticism, the advisor ventured an opinion that the calculations might be improved if they were written on “real paper”.

Andrei Bitov (mentioned many times on this blog) once read from stage a rough draft of some poem by Pushkin including all crossings, corrections etc. The aesthetic value of this exercise was lost on me.

Transcription as well as transmission is lossy. People who dealt with it professionally knew it forever. Talk about forcing an open door.
Jongseong Park says

May 12, 2023 at 12:55 am

Under this definition, even my verbal performance of this talk (reading from my speaker notes) is a variety of transcription.

This doesn’t really work for me because my intuitive understanding of transcription in all its senses is that the adapted version has to be text, whether handwritten, printed, or stored digitally as a string of characters. Not sure how I would express the generalized concept though. Maybe transmodality?
Brett says

May 12, 2023 at 2:29 am

Howe also doesn’t hesitate to point out that although Dickinson’s poems themselves are in the public domain, you must pay a fee to quote the transcriptions of those texts.

Say what, now?

@Y: Although I love the original piano version of Pictures at an Exhibition, I cannot really argue that Ravel’s version is not better. However, I would say that it is certainly flawed, since it is missing the last (freestanding) promenade. After the sixth movement, Mussorgsky included a fifth promenade, which is extremely similar to the first. Ravel, who had been varying the promenades progressively more and more, through changes in the way they were orchestrated, may have felt that this reprise of the original was superfluous—that he didn’t know what to do with it. On the other hand, some critics feel it is important to present the original promenade between paintings one more time, before the promenades change again in a new way, becoming subsumed into the main movements themselves, with “Catacombs” and “The Bogatyr Gates” featuring promenade themes in which the observer and the gallery seem to become indistinguishable from the people in the paintings themselves.

We played the last movement of Ravel’s arrangement of Pictures at an Exhibition (and, I’m not sure, but maybe one of the promenades as well?) my senior year in high school. The Orchestra (and later Advanced Orchestra) class was normally a string-only group, although we did have collaborations with the band and the concert choir. My last two years, the collaboration was ramped up quite a bit, and we were playing multiple concerts every year with a full-ish symphony orchestra, with the two teachers splitting the conducting duties.

As we were rehearsing, the night before our performance at the (or rather a*)** All-Northwest festival and competition, I remember the band teacher, Mr. Nail, trying to get us excited about the upcoming show. He talked about how wonderful it was being able to play real symphonic music, and he said, “It’s really great, like this, to play the original of something, rather than an arrangement.” Obviously, he was thinking of his experience as a band teacher, commonly leading performances of pieces from various musical genres that had been arranged for concert band, frequently by other harried secondary-school music teachers; conducting a symphony orchestra was evidently something he had really come to miss. However, he got a lot of weird looks from the string players. We were used to playing a repertoire of original works with only occasional arrangements, and while Ravel’s orchestration of Mussorgsky has been called the greatest arrangement of all time, it was still obviously an arrangement.

* There were at least three “All-Northwest” high school music festivals under different auspices.

** For me, this has to be “a,” not “an,” here, but I’m not entirely sure why.
David Marjanović says

May 12, 2023 at 6:24 am

…Dashes and spaces of various lengths have long been available in Unicode (and for longer than that in MS Word, I think!), so I don’t understand that particular problem.

** For me, this has to be “a,” not “an,” here, but I’m not entirely sure why.

That, on the other hand, is fascinating.
Jongseong Park says

May 12, 2023 at 8:14 am

** For me, this has to be “a,” not “an,” here, but I’m not entirely sure why.

Are you stressing the “a” as [eɪ̯] for contrastive emphasis rather than as pronouncing it as an unstressed [ə]? That would explain it, because there is no hiatus to resolve with a euphonic [n] when the “a” is pronounced as a diphthong.
languagehat says

May 12, 2023 at 9:19 am

Say what, now?

As John Cowan said here:

As for Emily Dickinson, the unbowdlerized urtexts weren’t published until the 1920s, so they are still in copyright, although the versions published in her lifetime are in the public domain.
languagehat says

May 12, 2023 at 9:20 am

(Insert rant about stupid, venal US copyright laws here.)
Seong of Baekje says

May 12, 2023 at 10:43 am

FYI: Harvard Square’s Raven Used Books is moving out your way, Hat, to Shelburne Falls.
Terry K. says

May 12, 2023 at 11:26 am

at the (or rather a*)** All-Northwest festival and competition

** For me, this has to be “a,” not “an,” here, but I’m not entirely sure why.

I think because of the parentheses. Or rather, the speech equivalent that they represent. The pause you could say, but a certain kind of pause (different than a pause for emphasis).

I also feel like there’s something pleasing to the rhyme of “the, or rather a”, which would be lost if saying [eɪ̯] or “an” instead of a (stressed) [ə].

Or maybe it’s because the word is stressed. Not, though, because of “a” being a diphthong as Jongseong Park suggested, because, as noted, I read it as a stressed [ə], no diphthong, which for me works fine.

Interesting, in light of the topic, to note that, even where the written form is the original, there’s an element of transcribing speech.
languagehat says

May 12, 2023 at 1:14 pm

FYI: Harvard Square’s Raven Used Books is moving out your way, Hat, to Shelburne Falls.

Hooray!

I read it as a stressed [ə], no diphthong, which for me works fine.

Interesting. For me, that’s impossible; if it’s stressed it has to be [eɪ̯].
John Cowan says

May 12, 2023 at 2:24 pm

Emily Dickinson’s poetry would have been altogether different if she’d had access to full sheets of paper rather than a collection of scraps.

Then again, she might have simply torn the full sheets into scraps first. They might indeed have been different if they were written on pieces of toilet paper.

the advisor ventured an opinion that the calculations might be improved if they were written on “real paper”

To be fair to the advisor, it used to be that the appearance of mathematical material served as a rough gauge of how likely it was to be right, as no one would pay through the nose to have something set in type unless it had gone through review. The popularity of TeX, however, has allowed the proliferation of complete bollocks that is beautifully typeset. As an extreme example, see Doug Zongker’s paper “Chicken Chicken Chicken: Chicken Chicken”.

the adapted version has to be text, whether handwritten, printed, or stored digitally as a string of characters

Indeed. I do not think it is the etymological fallacy to say that transcriptio is necessarily a kind of scriptio.

multiple concerts every year

Presumably this is multiple-2 (or is it multiple-1? I forget already). My high-school orchestra included a subset of band players: in the band I performed the traditional role of a Second Trombone, whereas in the orchestra I was the only trombone, and my services were by no means always required. But the same was not true for the flutists, for example.

if it’s stressed it has to be [eɪ̯]

I agree; I have no trouble using stressed [æn] if a vowel follows, however: “not [ði] apple but [æn] apple”, for instance.
Keith Ivey says

May 12, 2023 at 2:46 pm

From listening to old radio programs, I know there’s at least one (now obsolete) usage of “transcribe” that doesn’t involve text. Being “transcribed” meant being recorded (on acetate discs), and programs were announced as being “transcribed” to distinguish them from live programs.
languagehat says

May 12, 2023 at 3:28 pm

I have no trouble using stressed [æn] if a vowel follows, however

Why “however”? That’s the same phenomenon. You don’t say [ən].
Brett says

May 12, 2023 at 4:01 pm

@languagehat: I’ll certainly believe that that is what the publishers would like to claim, but I would want to see some meaningful case law before I accept that it is true. I don’t doubt Howe and Parrish believe it, because it comports with their own views, that the manuscripts and publications are substantially different works. However, it seems to me to fly in the face of the historical understanding of copyright; so using their own belief in that copyright claim as evidence supporting the thesis that the works are indeed different is just τὸ ἐν ἀρχῇ αἰτεῖσθαι.

I can think of at least two counterarguments, which seem strong to me (but I am not a lawyer, much less a copyright expert). If the manuscripts and the original published editions of Dickinson’s poems are different enough creative works to be separately copyrightable, that implies the ἡ εις άτοπον απαγωγη that the copy editor (for Dickinson obviously had no participation in the editing of her own poems) must have made a significant enough creative contribution to be considered a coauthor. This is not (and, so far as I understand, has never been) a defensible position in copyright law. Moreover, even had the copyeditor legally merited coauthorship at the time of publication, they probably disclaimed authorship at the time of publication by listing only Dickinson as the author. It is certainly not possible to add new authors to a previously published work in order to change the term of the copyright.

@Terry K.: As I wrote “the (or rather a),” I wasn’t thinking of a rhyming pronunciation, but rather the strong pronunciations of both articles. Those make them close, but also clearly not rhyming. However, now that you mentioned it, the rhyming weak pronunciations sound even better.

@John Cowan: That’s definitely multiple². As I have indicated, I find multiple¹ to be ungrammatical in many cases, although it appears in a fair number of fixed constructions that remain in vibrant use.
languagehat says

May 12, 2023 at 4:03 pm

I’ll certainly believe that that is what the publishers would like to claim, but I would want to see some meaningful case law before I accept that it is true.

Try quoting a chunk of her poetry in print without permission and you’ll get a quick education.
Brett says

May 12, 2023 at 4:11 pm

I’m not going to pick a fight with Harvard’s deep pockets. However, until somebody produces case law, rather than cease-and-desist letters, as evidence, I will remain unconvinced.
John Cowan says

May 12, 2023 at 4:54 pm

Copyright Circular 14, which is one of the Copyright Office’s plain-English explanations of copyright law, says right at the top:

A derivative work is a work based on or derived from one or more already existing works. Common derivative works include translations, musical arrangements, motion picture versions of literary material or plays, art reproductions, abridgments, and condensations of preexisting works. Another common type of derivative work is a “new edition” of a preexisting work in which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work.

The use of the word editorial implies that the editor need not be a coauthor. Smith’s edition of Shakespeare has an independent copyright from Jones’s edition, but neither Smith nor Jones is an author of the plays and poems.

I believe (though IANAL either) that this is black-letter enough that there will not be case law about it.

That’s definitely multiple²

Right, we agree. I just couldn’t be bothered to look up which meaning gets which superscript.
languagehat says

May 12, 2023 at 5:09 pm

Your “Copyright Circular 14” link is null and void.
David Marjanović says

May 12, 2023 at 5:34 pm

That would explain it, because there is no hiatus to resolve with a euphonic [n] when the “a” is pronounced as a diphthong.

That’s only if an actual [j] gets inserted; some Americans do that (SCIYENCE!!!), but it’s not universal.
Brett says

May 12, 2023 at 5:37 pm

@John Cowan: My understanding is that there is no meaningful legal distinction between an author and an editor who provides a significant degree of creative input. The editor does not need to be listed as an “author”; a listing as an “editor” or something else is fine. However, if a contributor allows something to be published without their name anywhere, then the term of the copyright cannot be based on the lifetime of that contributor, instead falling under the “work for hire” rules.

However, that quibble is not really what Dickinson’s case is about. Editorial changes can produce a sufficiently novel work to merit a separate copyright. However, those changes must be substantial and creative. Choosing the order in which the poems in a collection are published can be sufficient for a compilation copyright, but a reprint that puts the poems in chronological order would not (barring some exceptional circumstances) be sufficient. Similarly, if editorial changes consist solely of reverting from the published versions to facsimiles of the poet’s handwritten submissions, that does not seem to entail any meaningful creative contribution on the part of the editor.
Y says

May 12, 2023 at 6:28 pm

The photographs of the manuscripts can certainly be copyrighted, though.
John Cowan says

May 13, 2023 at 12:23 am

The link is https://www.copyright.gov/circs/circ14.pdf

In this case it is the first edition that had the editorial input (namely bowdlerizing it) and the second edition did not, but that’s a distinction without a difference.
Brett says

May 13, 2023 at 2:07 am

@John Cowan: No, (unless I misunderstand you) that distinction is the critical point. The original editor(s) made a creative contribution to the first publication. I do not see how removing the work of one (or more) contributor can create a sufficiently novel work for it to acquire a new ©.

Take another ἡ εις άτοπον απαγωγη: If the first edition of a book were published with a foreword by another author, nobody believes that a reprint without the foreword would be entitled to a new copyright.
Bloix says

May 13, 2023 at 1:43 pm

Now for me, The Great Gate of Kiev demands the over-the-top Ravel orchestration.
rozele says

May 13, 2023 at 8:36 pm

Raven Used Books
a gain for hat, but what a loss for The Square*! Raven (does it take the direct article?) is a store that felt like a long-overdue step out of the gentrification-(a/k/a Harvard-)driven nightmare that left the Harvard Bookstore basement as (i think) the only place to find used books for a few years in an area that was spoiled for choice even after the first wave of closures in the late 1980s.

musical transcriptions
also important in the other senses: notating recorded music, especially from field recordings, and rendering handwritten notation into typeset versions.

anyone who’s interested in the latter should take a look at the Klezmer Institute’s incredible Kisselgof-Makonovetsky Digital Manuscript Project**, which has transcribed two massive manuscript collections of yiddish music (with marginalia in multiple languages), using the skills and time of musicians from across the klezmer world. they’re breaking new ground in the digital humanities in some exciting ways, from the collaborative transcription process to how they’re preparing the material to make new kinds of corpus research possible.

and the former is also a live subject in the yiddish music world. the best-regarded teachers of traditional yiddish song basically tell their students to work directly with recordings of older singers whenever possible***, since the transcriptions are often so incomplete or just plain bad in exactly the areas most important to the style. and we’re lucky in having quite a lot of transcribed melodies starting around 1900****, because a few of the founding yiddish folklorists didn’t think that texts were the only important things to collect from singers.

.
* generally non-rhotic for even rhotic townies like me.
** try to avoid calling it KMFDM, and try even harder to avoid pronouncing the acronym (it can only end badly).
*** more possible now than it used to be, thanks to online projects like the Ruth Rubin collection at YIVO.
**** and not just of stuff being fetishized for its age, either: shmuel lehman’s collections of underworld songs and songs sung by sex workers have melodies!
AntC says

May 13, 2023 at 10:58 pm

underworld songs and songs sung by sex workers have melodies!

“Seeräuberjenny”, What’s picking a lock compared to buying shares? What’s breaking into a bank compared to founding one?
Lars Mathiesen (he/him/his) says

May 14, 2023 at 3:14 am

AFAIR, the way it works in Europe sidesteps the question of whether audio recordings or sheet music should be covered by Berne Convention copyright — there are separate protection domains for this, and also for industrial design separate from patents.

(It made sense back when sheet music was printed from plates that someone had spent weeks producing with manual punches and whatnot. I haven’t tried to find out if a score for some Händel mass that Wilhelm Hansen typesets on a computer and prints is covered in the same way, ditto for a video of a concert performance of same. I think I’m allowed to type the score into Sibelius myself, or perform it myself if I could, without breaking those protections).
drasvi says

May 14, 2023 at 4:53 am

“someone had spent weeks producing with manual punches and whatnot”

Well, IF copyright existed to protect authors, translators and workers from businesses (in this case, regulating re-use of the same plates) I would be much less hostile to it.
Jongseong Park says

May 14, 2023 at 5:08 am

That would explain it, because there is no hiatus to resolve with a euphonic [n] when the “a” is pronounced as a diphthong.

That’s only if an actual [j] gets inserted; some Americans do that (SCIYENCE!!!), but it’s not universal.

I should perhaps have clarified that I was talking specifically about the types of hiatuses that English tends to avoid. English has no problem following segments ending in high vocoids like [iː, eɪ̯, aɪ̯, aʊ̯, oʊ̯, uː] with a vowel. Following “a” pronounced as [eɪ̯] with a vowel is therefore not a problem.

However, if you have a vowel such as [ə] that does not end in a high vocoid and follow it with another vowel, you end up with a hiatus that tends to be avoided in a number of ways. In non-rhotic accents of English, this is where [r] is typically inserted. You can also insert a phonetic glottal stop [ʔ].

In the case of the articles “a” [ə] and “the” [ðə], the usual outcome is the use of the alternate forms “an” [ən] and “the” [ði] that avoid the hiatus. It is also becoming common to hear “the” [ðə] even before a vowel, though in this case [ʔ] is inserted.
AntC says

May 14, 2023 at 6:58 am

In the case of the articles “a” [ə] and “the” [ðə], the usual outcome is the use of the alternate forms “an” [ən] …

I’ve just bumped into this example:

XXX lacks an ubiquitous language: some words used to describe the domain model are ambiguous or are not used consistently, …

“an ubiquitous” might follow the rules, but it sounds wrong to me. I’d say “a ʔ ubiquitous” — although it still sounds awkward. Inserting [r] sounds worse than “an”. I guess the dyseuphony is that adjective being 4 syllables.

Editorially, I’d rephrase to “XXX lacks ubiquitous terminology”. (The authors are software engineers; for some of them English is not their first language.)
Lars Mathiesen (he/him/his) says

May 14, 2023 at 8:21 am

@drasvi, I don’t think people went stealing printing plates from music publishers, but if you try selling photocopies of Wilhelm Hansen editions, you better not come to their notice.
AntC says

May 14, 2023 at 8:50 am

I don’t think people went stealing printing plates from music publishers …

When printing switched from wooden to metal plates, there was suddenly a flood of wood blocks. Good quality wood: tight-grained, water-resistant, sturdy.

There’s a barn somewhere in Sussex whose exterior walls are print-blocks from fabric printing. Beautiful flowery stuff/William Morris-like. (Or rather there was some 30 years ago.)
languagehat says

May 14, 2023 at 9:38 am

“an ubiquitous” might follow the rules, but it sounds wrong to me.

That’s because the “rules” have been applied by someone with a very primitive understanding of what vowels and consonants are. Yes, the letter u usually represents a vowel, but in general when it starts a word it is pronounced with the consonantal y preceding it, as is the case in ubiquitous (and usually). Since the (actual) a/an rule operates on the basis of sound rather than spelling, it should be “a ubiquitous” (i.e., “a yubiquitous”), but alas, the level of linguistic understanding is so primitive that editors and other people responsible for things getting into print look at u, see a vowel, and mechanically use “an,” regardless of the fact that that’s not how the actual language works.
PlasticPaddy says

May 14, 2023 at 11:52 am

I have one slight niggle. Shakespeare always writes “an eunuch” and “an union” (by contrast, both “a hundred” and “an hundred” appear). This could be just convention, or accidental (small sample) or because initial long u was aspirated or diphthongised (so “an hyunuch” or “an eeyunion”).
David Marjanović says

May 14, 2023 at 12:11 pm

“an hyunuch”

whaaaaat
FJ says

May 14, 2023 at 12:56 pm

Yeah, those words had /iu̯/ back then.
David Eddyshaw says

May 14, 2023 at 1:07 pm

Still do, in Welsh English.
Jongseong Park says

May 14, 2023 at 10:22 pm

I thought it would go without saying, but it’s maybe worth clarifying that r-insertion never takes place after the articles “a” and “the”.

Shakespeare always writes “an eunuch” and “an union”

He also writes “an universal”, “an urinal”, “an usurer”, and “an usurped”. These were probably pronounced with falling diphthongs in his time. Welsh English has retained a falling diphthong [ɪʊ̯], but I’m not familiar with its exact distribution. At least in some accents words such as “union” seem to be pronounced with [jɪʊ̯].

There is also an example of “an one” in Shakespeare.
languagehat says

May 15, 2023 at 8:23 am

When did “one” stop being pronounced as written (rhyming with “bone”) and start being pronounced /wʌn/?
David L says

May 15, 2023 at 8:48 am

One of the few tinges of Northern English that I acquired from my parents, particularly my father, is that I tend to* pronounce ‘one’ to rhyme with ‘gone.’

*[à la Brett]: when I say ‘tend to’ I mean that I’m not completely consistent, what with the competing influences of Southern English and US English.
Jongseong Park says

May 16, 2023 at 1:04 am

According to the Online Etymology Dictionary:

Originally pronounced as it still is in only, atone, alone, and in dialectal good ‘un, young ‘un, etc.; the now-standard pronunciation “wun” began c. 14c. in southwest and west England (Tyndale, a Gloucester man, spells it won in his Bible translation), and it began to be general 18c.
languagehat says

May 16, 2023 at 7:49 am

Thanks! So “an one” in Shakespeare is to be expected.
David Eddyshaw says

May 16, 2023 at 10:41 am

I thought it would go without saying, but it’s maybe worth clarifying that r-insertion never takes place after the articles “a” and “the”

https://www.youtube.com/watch?v=Ddhrz0XKySQ
Ryan says

May 16, 2023 at 11:52 am

> It is also becoming common to hear “the” [ðə] even before a vowel, though in this case [ʔ] is inserted.

I would say “the apple” commonly with long e or occasionally glottal stop, but “the orange” would be either long e or liaison, not glottal stop. (Glottal stop sounds like it puts emphasis on the orange, but I don’t think I would say it that way. I’d use long e if I was distinguishing the fruit.) I’m not positive, as these things are tough to monitor in oneself, but I feel like plural oranges is more likely to draw a long e, while liaison feels right for singular orange.

D’you eat thorange?
D’you eat thee oranges?

There’s something more about apple than ae, because “the anteater” comes out with liaison, not glottal stop, as well as lenition or even elision of the t in ant. If I try to say it with glottal stop, I end up pronouncing the t as well, and feel I’ve tripped myself up, because I would not normally pronounce the t.
J.W. Brewer says

May 16, 2023 at 1:57 pm

Just re “stealing plates” etc., the U.K. has since the 1950’s (and maybe some other Commonwealth nations have joined in) had a special type of copyright for “typographical arrangement” of a published edition (defined to include a musical score in addition to regular words-on-a-page), typically of an older work that is itself in the public domain. Unlike regular copyright these days, it has a much shorter term (25 years from the end of the year in which the edition was first published), but during that term it protects the rightsholder against rivals just printing and selling a rival edition scanned from theirs, thus (the theory goes) protecting an adequate return on the investment necessary to typeset and publish a new edition of a public-domain work and requiring a rival to make the same investment or pay a licensing fee. Obviously there have been some fairly dramatic changes in typesetting and publishing technology since the U.K. Parliament first did this in 1956 which have tended to lower the cost and capital-intensiveness of the process. Last I knew (although I may not have been keeping up) there was no comparable provision in U.S. law.
Brett says

May 16, 2023 at 4:17 pm

@J.W. Brewer: Yes, I know some jurisdictions offer copyright protection according to the “sweat of the brow” (that is, based on the labor required to create a work, irrespective of its originality). What you are describing sounds like a special case of that.
Jongseong Park says

May 17, 2023 at 1:49 am

@David Eddyshaw

Thanks for that Marx Brothers clip. I stand corrected, though I’m not entirely convinced about the elephants to our discussion.
