Working together towards an ideal infrastructure for language learner corpora

In this article we give an overview of first-hand experiences and starting points for best practices from projects in seven European countries dedicated to learner corpus research and the creation of language learner corpora. The corpora and tools …

A Generic Data Workflow for Building Annotated Text Corpora

We present an abstract and generic workflow, and detail how it has been implemented to build and annotate learner corpora. This workflow has been developed through an interdisciplinary collaboration between linguists, who annotate and use corpora, …

Automated L1 identification in English learner essays and its implications for language transfer

This article focuses on automatic text classification which aims at identifying the first language (L1) background of learners of English. A particular question arising in the context of automated L1 identification is whether any features that are …

Challenges of building a CMC corpus for analyzing writer's style by age: The DiDi project

Special Issue: Building and annotating corpora of computer-mediated discourse. Issues and Challenges at the Interface of Corpus and Computational Linguistics

Establishing a Standardised Procedure for Building Learner Corpora

Decisions at the outset of preparing a learner corpus are of crucial importance for how the corpus can be built and how it can be analysed later on. This paper presents a generic workflow to build learner corpora while taking into account the needs …

Rapid Adaptation of NE Resolvers for Humanities Domains using Active Annotation