Two automatic query expansion techniques have become nearly universal and have been shown to increase performance [42]. The first is stemming, the process of removing suffixes from words to reduce them to their base forms. For example, the words ``helping,'' ``helped,'' and ``helps'' would all be stemmed to ``help.'' In a system that uses stemming, all words are stemmed before indexing, and all query terms are stemmed before retrieval. Thus, a query for the word ``helping'' would match a document that contains the word ``helps.''
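The effect of stemming at both indexing and query time can be illustrated with a minimal sketch. The suffix list and the functions stem, build_index, and search are hypothetical names introduced here for illustration only; the suffix stripper is a deliberately crude stand-in and not one of the published stemming algorithms.

```python
# Toy suffix-stripping stemmer. The suffix list is an illustrative
# assumption, not a published stemming algorithm.
SUFFIXES = ("ing", "ed", "s")


def stem(word):
    """Lowercase the word and strip the first matching suffix, if any."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so very short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(docs):
    """Map each stemmed term to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for token in text.split():
            index.setdefault(stem(token), set()).add(doc_id)
    return index


def search(index, query_term):
    """Stem the query term exactly as indexed terms were stemmed."""
    return index.get(stem(query_term), set())


# Hypothetical three-document collection.
docs = {1: "she helps everyone", 2: "nobody helped", 3: "unrelated text"}
index = build_index(docs)
matches = search(index, "helping")  # documents 1 and 2
```

Because the same stem function is applied on both sides, the query ``helping'' retrieves the documents containing ``helps'' and ``helped.'' A real system would substitute an established stemmer for the toy rule used here.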
The second technique is the use of stop lists, which are lists of words that occur so commonly in documents that they are not indexed. Articles and conjunctions, such as ``the,'' ``a,'' ``and,'' and ``but,'' typically make up the majority of a stop list.
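As a minimal sketch of stop-list filtering, the fragment below drops stop words from a text before its terms would be indexed. The four-word list simply reuses the examples above, and the names STOP_WORDS and index_terms are hypothetical; production stop lists are considerably longer.

```python
# Illustrative stop list built from the example words in the text.
# Real stop lists are much longer; this one is an assumption for the sketch.
STOP_WORDS = frozenset({"the", "a", "and", "but"})


def index_terms(text):
    """Return the lowercased tokens of text that are not on the stop list."""
    return [token for token in text.lower().split() if token not in STOP_WORDS]


terms = index_terms("The cat and a dog")  # ["cat", "dog"]
```

The same filter would normally be applied to query terms, so that a stop word in a query is not looked up in an index that never contained it.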
Most modern systems use both of these techniques, and their use has become so commonplace that mention of them is often omitted from publications describing systems that use them. While there is considerable literature on stemming algorithms and the construction of stop lists, their inclusion or exclusion in any system is orthogonal to the issues addressed in this paper.