I’m Elena. I’m a computational linguist: I’m interested in linguistics, technology and the intersection between them. I currently work as a research programmer at the Information Sciences Institute of the University of Southern California.
I am particularly interested in computational approaches to language contact and loanword detection: I did my MS thesis on automatic extraction of anglicisms under the supervision of Constantine Lignos, which led to the creation of Observatorio Lázaro, an observatory that monitors anglicism usage in the Spanish press. I’m also one of the organizers of ADoBo, the shared task on automatic detection of borrowings that will take place at IberLEF 2021.
I am also highly involved in activities that bridge the gap between linguistics and the general public: I write a column about language for the Spanish newspaper eldiario.es and for the linguistics magazine Archiletras, where I also serve on the editorial board. In 2017, I received the Miguel Delibes National Journalism Award for a column about metaphor and cancer. In 2016, I wrote the pop linguistics book Anatomía de la Lengua.
MS in Computational Linguistics, 2020
Brandeis University
BA in Linguistics, 2010
Universidad Complutense de Madrid
A shared task on automatic detection of borrowings at IberLEF 2021. Organized with Luis Espinosa Anke, Julio Gonzalo, Constantine Lignos and Jordi Porta.
A PyTorch model that classifies Spanish text as being easy to read (plain language) or not.
A scraper for extracting the text of news articles via RSS.
An observatory of anglicism usage in the Spanish press.
Analysis and visualizations in Python of a corpus of Spanish political speeches from 1937 to 2019.
Named Entity Recognition for podcast transcripts. With Julian Fernandez, Kristen Sheets and Linxuan Yang.
A project on annotation and classification of non-literal tweets. With Qingwen Ye and Julia Cathcart.
A corpus of Spanish subtitles from LOTR, Star Wars, OITNB, GoT, HIMYM, etc.
A corpus linguistics project supported by Fundeu on the evolution of the Spanish language in the media during the 20th century. With Leticia Martín-Fuertes and Molino de Ideas.
A rule-based automatic language detector based on the syllable structure of words. Currently supported languages: Spanish, French, Italian, Portuguese, Catalan, Latin and Basque.
I occasionally write a column about language for the Spanish newspaper eldiario.es. In 2017, the column received the Miguel Delibes National Journalism Award (Premio Nacional de Periodismo Miguel Delibes) for an article about conceptual metaphor and cancer (Metáforas peligrosas. El cáncer como lucha).
I also write for Archiletras, a pop linguistics magazine where I’m also a member of the editorial board.
In 2016 I wrote the pop linguistics book Anatomía de la Lengua.
From 2012 to 2015, I contributed a weekly segment about language and linguistics to Spanish National Radio (RNE).
Some of my personal writing can be read on my old blog (in Spanish).
These are the columns and other journalistic contributions I have written so far: