Multi-criterion active learning in conditional random fields

Christopher T. Symons, Nagiza F. Samatova, Ramya Krishnamurthy, Byung H. Park, Tarik Umar, David Buttler, Terence Critchlow, David Hysom

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

18 Scopus citations

Abstract

Conditional Random Fields (CRFs), which are popular supervised learning models for many Natural Language Processing (NLP) tasks, typically require a large collection of labeled data for training. In practice, however, manual annotation of text documents is quite costly. Furthermore, even large labeled training sets can have arbitrarily limited performance peaks if they are not chosen with care. This paper considers the use of multi-criterion active learning for identification of a small but sufficient set of text samples for training CRFs. Our empirical results demonstrate that our method is capable of reducing the manual annotation costs, while also limiting the retraining costs that are often associated with active learning. In addition, we show that the generalization performance of CRFs can be enhanced through judicious selection of training examples.

Original language: English
Title of host publication: Proceedings - 18th IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2006
Pages: 323-331
Number of pages: 9
State: Published - 2006
Event: 18th IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2006 - Arlington, VA, United States
Duration: Oct 13 2006 → Oct 15 2006

Publication series

Name: Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI
ISSN (Print): 1082-3409

Conference

Conference: 18th IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2006
Country/Territory: United States
City: Arlington, VA
Period: 10/13/06 → 10/15/06
