Abstract
This paper investigates lexical chains and word sense disambiguation, and the strong connection between them. We propose a system that extracts words from unstructured text and produces sets of lexical chains together with the words' senses, disambiguated against WordNet synsets. We test three unsupervised algorithms, each with three similarity measures based on Information Content. To evaluate the system, we compare its output against manually annotated files containing disambiguated words.
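For readers unfamiliar with Information Content (IC), the sketch below illustrates the general idea. The abstract does not name the three measures used, so the choice of Resnik, Lin, and Jiang-Conrath (three widely used IC-based measures) is an assumption, and the taxonomy, counts, and function names are hypothetical toy data rather than WordNet or the paper's implementation.

```python
import math

# Hypothetical toy taxonomy (child -> parent); this is NOT WordNet data.
PARENT = {
    "entity": None,
    "animal": "entity",
    "artifact": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "artifact",
}

# Hypothetical corpus counts of each concept's own occurrences.
COUNTS = {"entity": 0, "animal": 2, "artifact": 1, "dog": 5, "cat": 4, "car": 8}


def ancestors(concept):
    """Return the concept followed by its ancestors up to the root."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def information_content(concept):
    """IC(c) = -log p(c), where p(c) counts c and all of its descendants."""
    total = sum(COUNTS.values())
    freq = sum(n for c, n in COUNTS.items() if concept in ancestors(c))
    return -math.log(freq / total)


def lowest_common_subsumer(a, b):
    """Return the deepest concept that is an ancestor of both a and b."""
    b_ancestors = set(ancestors(b))
    for c in ancestors(a):
        if c in b_ancestors:
            return c
    return None


def resnik(a, b):
    """Resnik similarity: IC of the lowest common subsumer."""
    return information_content(lowest_common_subsumer(a, b))


def lin(a, b):
    """Lin similarity: shared IC normalised by the concepts' own IC."""
    denom = information_content(a) + information_content(b)
    return 2 * resnik(a, b) / denom if denom > 0 else 0.0


def jiang_conrath_distance(a, b):
    """Jiang-Conrath distance: IC not shared by the two concepts."""
    return information_content(a) + information_content(b) - 2 * resnik(a, b)


if __name__ == "__main__":
    for pair in [("dog", "cat"), ("dog", "car")]:
        print(pair, resnik(*pair), lin(*pair), jiang_conrath_distance(*pair))
```

In this toy setup, "dog" and "cat" share the subsumer "animal" and so score higher than "dog" and "car", whose only common subsumer is the root with zero Information Content.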
Original language | English |
---|---|
Pages (from-to) | 197-212 |
Number of pages | 16 |
Journal | UPB Scientific Bulletin, Series C: Electrical Engineering |
Volume | 73 |
Issue number | 4 |
State | Published - 2011 |
Externally published | Yes |
Keywords
- Clustering algorithms
- Lexical chains
- Semantic distance
- Word sense disambiguation