NLP Annotation Workshop


Saturday, September 29, 2012 (All day)


Atkinson Hall, Calit2 Auditorium, University of California, San Diego

View Agenda, Watch Recordings of Talks, and Download Posters from Workshop

Planning Committee: 

  • Wendy Chapman, UCSD
  • Guergana Savova, CHIP, Harvard-MIT
  • Noémie Elhadad, Columbia University
  • Brett South, University of Utah
  • Harry Hochheiser, University of Pittsburgh
  • Danielle Mowery, University of Pittsburgh


Goals of the Workshop

The goals of the workshop are to explore practical and research aspects of annotating biomedical text, with a focus on clinical text annotation. Topics will include, but are not limited to:

  • Shared resources, including lexical and semantic annotated corpora
  • Creation of layered annotations
  • Tools to support manual annotation of text
  • Techniques for improving the efficiency of manual annotation, including active learning and crowdsourcing
  • Techniques for improving the quality of annotation
  • Domain adaptation
  • Evaluation of annotation quality
  • Semantic models of schema, guidelines, and annotations


Keynote speaker: Bob Carpenter, Ph.D., Columbia University

Bob Carpenter received his Ph.D. in cognitive and computer science in 1989 from the University of Edinburgh. Between 1988 and 1996, he rose from post-doc to associate professor at Carnegie Mellon University. He worked in the multimedia communications research group at Bell Laboratories between 1996 and 2000, then at the speech-recognition startup SpeechWorks from 2000 until joining Alias-i (LingPipe) in early 2002. He is now a research scientist in the Department of Statistics at Columbia University, focusing on computation for Bayesian inference.


Inferring Gold Standards from Crowdsourced Annotations


In this talk, I'll show how model-based techniques originally developed for analyzing multiple diagnostic tests in epidemiology may be applied to inferring a gold-standard corpus from crowdsourced annotations. The standard models also infer annotator accuracies and biases. Hierarchical models extend these models to overall task difficulty. The surprising result is that neither high inter-annotator agreement nor high accuracy is required to derive corpora of measurably high quality. For example, a handful of very noisy annotators (e.g., 75% accuracy, substantial category bias, and less than 50% inter-annotator agreement) can be used to generate a near-perfect gold-standard corpus. A further advantage of the model-based approach to annotation is a calibration of posterior uncertainty on an item-by-item basis, with the obvious application to active learning and a less obvious application to learning and evaluating using a probabilistic notion of a corpus.
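The kind of aggregation the abstract describes can be sketched with a small two-category Dawid–Skene-style EM: simulate noisy annotators (including one with a category bias), then jointly infer item labels, prevalence, and each annotator's sensitivity and specificity. This is a minimal illustrative simulation under assumed parameters, not code or numbers from the talk.

```python
import random

random.seed(0)

# Simulate 500 items with hidden binary labels and 5 noisy annotators.
# Annotator (sensitivity, specificity) pairs are illustrative; the last
# annotator is strongly biased toward the positive category.
N_ITEMS, N_ANN = 500, 5
truth = [random.random() < 0.5 for _ in range(N_ITEMS)]
params = [(0.75, 0.75), (0.75, 0.75), (0.80, 0.70), (0.70, 0.80), (0.90, 0.55)]

def annotate(y, sens_j, spec_j):
    # Diagnostic-test model: accuracy depends on the true category.
    p_correct = sens_j if y else spec_j
    return y if random.random() < p_correct else (not y)

labels = [[annotate(t, *params[j]) for j in range(N_ANN)] for t in truth]

# EM for the two-category Dawid-Skene model.
pi = 0.5                 # prevalence of the positive category
sens = [0.7] * N_ANN     # per-annotator sensitivity estimates
spec = [0.7] * N_ANN     # per-annotator specificity estimates
for _ in range(50):
    # E-step: posterior probability that each item is positive.
    post = []
    for row in labels:
        p1, p0 = pi, 1 - pi
        for j, l in enumerate(row):
            p1 *= sens[j] if l else (1 - sens[j])
            p0 *= (1 - spec[j]) if l else spec[j]
        post.append(p1 / (p1 + p0))
    # M-step: re-estimate prevalence and annotator parameters.
    pi = sum(post) / N_ITEMS
    for j in range(N_ANN):
        pos = sum(post[i] for i in range(N_ITEMS) if labels[i][j])
        neg = sum(1 - post[i] for i in range(N_ITEMS) if not labels[i][j])
        sens[j] = pos / sum(post)
        spec[j] = neg / sum(1 - p for p in post)

inferred = [p > 0.5 for p in post]
accuracy = sum(i == t for i, t in zip(inferred, truth)) / N_ITEMS
print(f"inferred-label accuracy: {accuracy:.3f}")
```

Even though no single simulated annotator exceeds ~80% accuracy, the inferred labels are substantially more accurate than any individual, and the per-item posteriors (`post`) give exactly the item-level uncertainty calibration the abstract mentions as input to active learning.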


***This workshop was sponsored by iDASH and ShARe. It was co-located with the 2nd annual IEEE International Conference on Healthcare Informatics, Imaging and Systems Biology (HISB), and two parallel iDASH workshops for Privacy and Imaging.