I’m Elena. I’m a computational linguist: I’m interested in Linguistics, technology and the intersection between them. I currently work at the NLP&IR research group at UNED University, where I’m pursuing my PhD under the supervision of Julio Gonzalo and Constantine Lignos. I’m particularly interested in studying how we can use technology to understand language contact and language change. My research has led to the creation of Observatorio Lázaro, an observatory that automatically monitors anglicism usage in the Spanish press.
Prior to that, I spent a decade working on language technology projects at various organizations, including the Information Sciences Institute at the University of Southern California, Fundéu, Molino de Ideas, McLean Hospital and the UNED Digital Humanities Lab.
I am also highly involved in dissemination activities that bridge the gap between Linguistics and the general public: I write a column about language for the Spanish newspaper elDiario.es, which was awarded the Miguel Delibes National Journalism Award in 2017. I sometimes write for the linguistics magazine Archiletras, where I also serve on the editorial board. In 2016 I wrote the pop linguistics book Anatomía de la Lengua.
PhD in NLP, ongoing
UNED University
MS in Computational Linguistics
Brandeis University
BA in Linguistics
Universidad Complutense de Madrid (UCM)
A Python library that automatically detects lexical borrowings (or loanwords) in Spanish
COrpus of AngLicisms in the SpAnish PresS. With Constantine Lignos.
An observatory of anglicism usage in the Spanish press.
A Twitter bot that tweets new anglicisms found in the Spanish press.
A shared task on automatic detection of borrowings at IberLEF 2021. Organized with Luis Espinosa Anke, Julio Gonzalo, Constantine Lignos and Jordi Porta.
A PyTorch model that classifies Spanish text as being easy to read (plain language) or not.
A scraper for extracting the text of news articles via RSS.
Analysis and visualizations in Python of a corpus of Spanish political speeches from 1937 to 2019.
Named Entity Recognition for podcast transcripts. With Julian Fernandez, Kristen Sheets and Linxuan Yang.
A project on annotation and classification of non-literal tweets. With Qingwen Ye and Julia Cathcart.
A corpus of Spanish subtitles from LOTR, Star Wars, OITNB, GoT, HIMYM, etc.
A corpus linguistics project supported by Fundéu on the evolution of the Spanish language in the media during the 20th century. With Leticia Martín-Fuertes and Molino de Ideas.
A rule-based automatic language detector based on the syllable structure of words. Currently supported languages: Spanish, French, Italian, Portuguese, Catalan, Latin and Basque.
I occasionally write a column about language for the Spanish newspaper elDiario.es. The column was awarded the Miguel Delibes National Journalism Award (Premio Nacional de Periodismo Miguel Delibes) in 2017 for an article about conceptual metaphor and cancer (Metáforas peligrosas. El cáncer como lucha).
I also write for Archiletras, a pop Linguistics magazine where I’m a member of the editorial board.
In 2016 I wrote the pop linguistics book Anatomía de la Lengua.
From 2012 to 2015 I was a contributor to Spanish National Radio (RNE), with a weekly segment about language and Linguistics.
Some of my personal writing can be read on my old blog (in Spanish).
These are the columns and other journalistic contributions I have written so far: