Author makes case for data-driven language learning
LAWRENCE – You could consider Nina Vyatkina an evangelist for data-driven learning and open educational resources as they apply to helping students learn a second language.
The case for the synergistic benefits of those two approaches and the results of a new study demonstrating their effectiveness form the content of Vyatkina’s new book, “Corpus Applications in Language Teaching and Research: The Case of Data-Driven Learning of German” (Routledge, 2024).
The University of Kansas professor of German and applied linguistics is a believer in students directly using collections of word usage – corpora – to help them understand and gain fluency in their target language.
“Simply put, a corpus is a curated assembly of naturally occurring texts chosen to represent a specific state or variant of a language,” Vyatkina said. “That's still a mouthful, but the notion of corpora goes way, way back in literary scholarship.
“A great example of a corpus is the compilation of the Oxford English Dictionary. They wanted to have not only a list of words and their definitions, but usage examples. So in 1880, James Murray, the first editor-in-chief, put out the call to submit sentences with the words collected from books and over the years had collected millions of slips of paper, which he stored in something the size of a garden shed. And that was a prototype of our contemporary corpus.
“Of course, with modern electronics, this all became much, much easier,” Vyatkina said.
A popular corpus that Vyatkina cites as groundbreaking for its time (it debuted in 2008) and still useful today is COCA, the Corpus of Contemporary American English, which ranks and cross-references billions of words taken from newspapers, books, TV, webpages and other sources dated between 1990 and 2019. COCA has a user interface much like a Google search. For instance, learners of English can search to see which adjective, “big” or “large,” pairs most commonly – and thus best – with the noun “problem,” which will help them use English more idiomatically.
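For readers curious about what such a search does under the hood, here is a minimal Python sketch of the “big” versus “large” comparison. It is only an illustration: the five sentences are invented for the example and are not drawn from COCA, and the function name count_modifiers is hypothetical. A real corpus tool works on vastly more text and offers far richer queries.

```python
import re
from collections import Counter

# Toy sentences invented for illustration; they are NOT taken from COCA.
TOY_CORPUS = [
    "Traffic is a big problem in the city.",
    "That is a big problem for small farms.",
    "Funding remains a large problem this year.",
    "We have a big problem with the schedule.",
    "She carried a large box up the stairs.",
]


def count_modifiers(sentences, noun, candidates):
    """Count how often each candidate adjective directly precedes the noun."""
    # Start every candidate at zero so unattested pairings still show up.
    counts = Counter({adjective: 0 for adjective in candidates})
    for sentence in sentences:
        # Lowercase the sentence and split it into simple word tokens.
        tokens = re.findall(r"[a-z']+", sentence.lower())
        # Walk through adjacent word pairs (left word, right word).
        for left, right in zip(tokens, tokens[1:]):
            if right == noun and left in candidates:
                counts[left] += 1
    return counts


counts = count_modifiers(TOY_CORPUS, "problem", ("big", "large"))
for adjective, n in counts.most_common():
    print(adjective, n)
```

In this toy data, “big problem” turns up three times and “large problem” once, the kind of frequency contrast a learner would look for in a real corpus.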
But not all corpora are as well organized as COCA, Vyatkina said. Nor are there as many useful corpora for languages other than English (LOTE), even accounting for the smaller number of LOTE speakers. Nor were there many useful guides to teaching with these LOTE corpora, Vyatkina wrote.
So her book is an attempt to bridge all those divides.
“One part is a survey of what has been done in teaching German with this method and research on the effectiveness of teaching with this method,” the author said. “I have a new empirical study. I actually did use this method with KU students of German ... and it turned out to be very effective, and students liked the method.
“Then the final part is the introduction of the open educational resource,” Vyatkina said.
In cooperation with the KU Open Language Resource Center, Vyatkina and co-author Schirin Kourehpaz, a multiterm lecturer in German language, published a free online course in 2020 titled “Incorporating Corpora.” It includes links to active German corpora, making their use seamless for teacher and student.
“You have exercises, lesson plans, explanations to teachers how to use it, and then you can click on links and you will be taken directly to the corpus, and that is what is needed for teachers and learners to use it,” Vyatkina said.
The proof of this method’s success, she said, is in faster, better language acquisition.
“This is why we call it data-driven learning,” Vyatkina said, “because this teaching method is a little different from what we think of as traditional teaching, where the teacher presents a rule, and then the learners practice using the language. Here, it is the opposite procedure. They look at the examples first, and then, under the teacher's guidance, they infer a rule or a pattern.”
Vyatkina also believes that the proven success of data-driven learning in acquiring German can be repeated with other languages and perhaps other fields.
“Absolutely,” Vyatkina said. “I review a lot of research that has been done on other languages, and I try to make this connection in every chapter of my book.”
Image: Nina Vyatkina with her new book, “Corpus Applications in Language Teaching and Research: The Case of Data-Driven Learning of German” (Routledge, 2024). Credit: Rick Hellman, KU News Service.
Inset: James Murray in his scriptorium, sometime prior to 1910. Credit: Photographer unknown, public domain.