Working together towards an ideal infrastructure for language learner corpora

Abstract

In this article we give an overview of first-hand experiences and starting points for best practices from projects in seven European countries dedicated to learner corpus research and the creation of language learner corpora. The corpora and tools involved in LCR are becoming more and more important, and the careful preparation and easy retrieval, and reusability of corpora and tools has likewise become more important. But with a lack of agreed solutions for many aspects of LCR, interoperability between learner corpora or exchanging data from different learner corpus projects is still challenging. We will illustrate how concepts like metadata, anonymization, error taxonomies and linguistic annotations, as well as tools, toolchains or data formats can individually pose challenges and how they might be solved.

Publication
Widening the Scope of Learner Corpus Research. Selected Papers from the 4th Learner Corpus Research Conference 2017
Previous