Speech Recognition for Newly Documented Languages.

Alexis Michaud writes for HimalCo (Himalayan Corpora, which “proposes to build parallel corpora for three sub-groups of the Sino-Tibetan family, covering a total of 8 little-described oral languages”):

Automatic speech recognition tools have strong potential for facilitating language documentation. This blog note reports on highly encouraging tests using automatic transcription in the documentation of Yongning Na, a Sino-Tibetan language of Southwest China. After twenty months of fieldwork (spread over twelve years, from 2006 to 2017), 14 hours of speech had been recorded, of which 5.5 hours were transcribed (200 minutes of narratives and 130 minutes of morphotonology elicitation sessions). Oliver Adams, the author of mam, an open-source software tool for developing multilingual acoustic models, volunteered to experiment with these data. He trained a single-speaker automatic speech transcription tool on the transcribed materials and applied it to untranscribed audio files. The error rate is low: on the order of 11% in phoneme identification. This makes the automatic transcriptions useful as a canvas for the linguist, who corrects mistakes and produces the translation in collaboration with language consultants.
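For readers wondering what an 11% figure of this kind measures, here is a minimal sketch. The note does not say how the error rate was computed, so the code assumes the usual convention: the edit distance (substitutions, insertions and deletions) between the reference transcription and the automatic one, divided by the number of phonemes in the reference. The function names and the placeholder symbols are invented for illustration and are not real Na data.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences of symbols."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j symbols of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def phoneme_error_rate(ref, hyp):
    """Edit distance divided by the length of the reference transcription."""
    if not ref:
        raise ValueError("reference transcription must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Placeholder symbols only (assumption: one list item per phoneme).
reference = ["a", "b", "c", "d", "e", "f", "g", "h", "i"]
hypothesis = ["a", "b", "x", "d", "e", "f", "g", "h", "i"]  # one substitution
rate = phoneme_error_rate(reference, hypothesis)
print(f"{rate:.1%}")
```

With one wrong phoneme out of nine, the rate comes out at roughly the level quoted above, which gives a sense of how much correcting is left to the linguist.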

There’s a detailed description of the results, as well as a link to the submitted version of the paper. Boy, a useful automatic transcription tool would be a godsend — think how much effort goes into doing it manually. (Thanks, Trevor!)


  1. And beyond the important work of documentation, there is the promise of developing materials to refine speech recognition & text output in recognized scripts & orthographies. The potential for speech-to-text in many non-dominant / “less-resourced” languages is also exciting.

  2. Hi Steve, many thanks for this very encouraging token of interest. There is clearly some demand. As pointed out by Don Osborn, this is only the tip of the iceberg: there is potential for many applications.
    This prompted us to explain in detail what we did: setting out the whole process for an audience of linguists, in ‘narrative mode’. A 23-page article is now up for comments on Academia.edu: “Integrating automatic transcription into the language documentation workflow”.
    This comes as a complement to the technical paper (hard for linguists to decipher), “Phonemic transcription of low-resource tonal languages”, which is now published, and available online here: https://halshs.archives-ouvertes.fr/halshs-01656683/
