Egon W. Stemle
The PAISÀ Corpus of Italian Web Texts
PAISÀ is a Creative Commons licensed, large web corpus of contemporary Italian. We describe the design, harvesting, and processing …
Verena Lyding, Egon Stemle, Claudia Borghetti, Marco Brunello, Sara Castagnoli, Felice Dell'Orletta, Henrik Dittmann, Alessandro Lenci, Vito Pirrelli
Open Corpus Interface for Italian Language Learning
In this article, we present the multi-faceted interface to the open PAISÀ corpus of Italian. Created within the project PAISÀ …
Verena Lyding, Claudia Borghetti, Henrik Dittmann, Lionel Nicolas, Egon Stemle
High-Accuracy Phrase Translation Acquisition Through Battle-Royale Selection
In this paper, we report on an unsupervised greedy-style process for acquiring phrase translations from sentence-aligned parallel …
Lionel Nicolas, Egon W. Stemle, Klara Kranebitter, Verena Lyding
Constructing concept relation maps to support building concept systems in comparative legal terminology
Graphical tools to organise and represent knowledge are useful in terminology work to facilitate building concept systems. Creating and …
Klara Kranebitter, Egon W. Stemle
Towards high-accuracy bilingual phrase acquisition from parallel corpora
We report on on-going work to derive translations of phrases from parallel corpora. We describe an unsupervised and knowledge-free …
Lionel Nicolas, Egon W. Stemle, Klara Kranebitter
Annotating Archaeological Texts: An Example of Domain-Specific Annotation in the Humanities
Developing content extraction methods for Humanities domains raises a number of challenges, from the abundance of non-standard entity …
Francesca Bonin, Fabio Cavulli, Aronne Noriller, Massimo Poesio, Egon W. Stemle
The Humanities Research Portal: Human Language Technology Meets Humanities Publication Archives
Massimo Poesio, Eduard Barbu, Francesca Bonin, Fabio Cavulli, Asif Ekbal, Egon Stemle, Christian Girardi
PaddyWaC: A Minimally-Supervised Web-Corpus of Hiberno-English
Small, manually assembled corpora may be available for less dominant languages and dialects, but producing web-scale resources remains …
Brian Murphy, Egon W. Stemle
Structure-Preserving Pipelines for Digital Libraries
Most existing HLT pipelines assume the input is pure text or, at most, HTML and either ignore (logical) document structure or remove …
Massimo Poesio, Eduard Barbu, Egon W. Stemle, Christian Girardi
Anaphoric Annotation of Wikipedia and Blogs in the Live Memories Corpus
Kepa Joseba Rodríguez, Francesca Delogu, Jannick Versley, Egon W. Stemle, Massimo Poesio