Article Catalog (Journal Indexing)
Title: Corpus-based statistical screening for content-bearing terms
Authors: Kim, W; Wilbur, WJ
- Addresses:
- NIH, Natl Ctr Biotechnol Informat, Natl Lib Med, Bethesda, MD 20894 USA
- Journal Title:
- JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY
issue: 3,
volume: 52,
year: 2001,
pages: 247 - 259
- SICI:
- 1532-2882(20010201)52:3<247:CSSFCT>2.0.ZU;2-A
- Source:
- ISI
- Language:
- ENG
- Subject:
- RETRIEVAL; WORDS;
- Document type:
- Article
- Nature:
- Periodical
- Subject Area:
- Social & Behavioral Sciences
- Citations:
- 33
- Review:
- Reprint addresses:
- Address: Kim, W, NIH, Natl Ctr Biotechnol Informat, Natl Lib Med, Bldg 38A, Rm 8S806, 8600 Rockville Pike, Bethesda, MD 20894 USA
- Citation:
- W. Kim and W.J. Wilbur, "Corpus-based statistical screening for content-bearing terms", J AM SOC IN, 52(3), 2001, pp. 247-259
Abstract
An important problem in the indexing of natural language text is how to identify those words and phrases that reflect the content of the text. In general, automatic indexing has dealt with this problem by removing instances of a few hundred common words known as stop words, and treating the remaining words as though they were content bearing. This approach is acceptable for some applications such as statistical estimates of the similarity of queries and documents for the purpose of document retrieval. However, when the indexing terms are to be examined by a human as a means of accessing the literature, it greatly improves efficiency if most of the noncontent-bearing words and phrases can be eliminated from the indexing. Here we present three statistical techniques for identifying content-bearing phrases within a natural language database. We demonstrate the effectiveness of the methods on test data, and show how all three methods can be combined to produce a single improved method.
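The stop-word baseline that the abstract contrasts against can be illustrated with a minimal sketch. This is not one of the paper's three statistical techniques, which the abstract does not specify; the stop list, the tokenization rule, and the function name `candidate_index_terms` are all assumptions made for illustration only.

```python
import re

# Hypothetical, deliberately tiny stop list; the abstract mentions lists
# of "a few hundred" common words, which are not reproduced here.
STOP_WORDS = frozenset(
    {"a", "an", "and", "are", "as", "by", "for", "in", "is", "of", "the", "to", "with"}
)


def candidate_index_terms(text, stop_words=STOP_WORDS):
    """Return the tokens of `text` that are not stop words.

    Tokenization (lowercase, runs of ASCII letters) is an assumption;
    every surviving token is treated as if it were content bearing.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    return [token for token in tokens if token not in stop_words]


terms = candidate_index_terms("The indexing of natural language text is a problem")
print(terms)
```

Every word that survives the filter is kept, whether or not it reflects the content of the text, which is the limitation the abstract points to when the terms are meant to be read by a human.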
ASDD Area Sistemi Dipartimentali e Documentali, Università di Bologna, Catalog of journals and other periodicals
Document generated on 25/05/13 at 10:32:22