The MetaCrawler Softbot was created to address the problems outlined above. MetaCrawler is a software robot that aggregates Web search services for users. It presents users with a single unified interface: users enter queries, and MetaCrawler forwards those queries in parallel to the various search services. MetaCrawler then collates the results and ranks them into a single list, returning to the user the sum of knowledge from the best Web search services. The key idea is that MetaCrawler allows the user to express what to search for and frees the user from having to remember where or how.
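The parallel forwarding and collation described above can be illustrated with a minimal Python sketch. It is not MetaCrawler's actual implementation: the two service functions, their URLs and scores, and the rule of summing scores for hits that share a URL are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for remote search services. Each returns
# (url, score) pairs; a real service would be queried over the network.
def service_a(query):
    return [("http://example.com/a", 0.9), ("http://example.com/b", 0.5)]

def service_b(query):
    return [("http://example.com/b", 0.8), ("http://example.com/c", 0.4)]

SERVICES = [service_a, service_b]

def metasearch(query, services=SERVICES):
    """Forward the query to every service in parallel, then merge the hits."""
    with ThreadPoolExecutor(max_workers=len(services)) as pool:
        result_lists = list(pool.map(lambda service: service(query), services))

    # Collate: add up the scores of hits that share a URL
    # (an assumed ranking rule, chosen only to keep the sketch short).
    totals = {}
    for hits in result_lists:
        for url, score in hits:
            totals[url] = totals.get(url, 0.0) + score

    # Rank: highest combined score first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Calling `metasearch("softbot")` returns one ranked list in which the page reported by both hypothetical services comes first, since its scores are combined.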
MetaCrawler has several features that enable the user to obtain high-quality results. Users can instruct MetaCrawler to find pages with all of the words in their query, any of the words in their query, or all of the words in their query as a phrase. Users can also specify whether a word must or must not appear in any reference returned, as well as whether a reference must come from a particular geographic or logical location (e.g., France or boeing.com). MetaCrawler has several additional features that will not be discussed here, but are explained in detail at the MetaCrawler WWW site [9].
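As a rough illustration of these query options, the following Python sketch checks one returned reference against them. The function name `matches`, its parameters, and the approximation of a location by a host-name suffix (for example `fr` for France) are assumptions for this example, not MetaCrawler's documented behavior.

```python
import re
from urllib.parse import urlparse

def words(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def matches(text, url, query, mode="all", require=(), exclude=(), site=None):
    """Return True if a reference satisfies the user's query options."""
    page_words = words(text)
    query_words = words(query)

    if mode == "all":        # every query word appears somewhere
        ok = all(w in page_words for w in query_words)
    elif mode == "any":      # at least one query word appears
        ok = any(w in page_words for w in query_words)
    elif mode == "phrase":   # query words appear adjacent and in order
        n = len(query_words)
        ok = n > 0 and any(
            page_words[i:i + n] == query_words
            for i in range(len(page_words) - n + 1)
        )
    else:
        raise ValueError("unknown mode: " + mode)
    if not ok:
        return False

    # Words that must, or must not, appear in the reference.
    if any(w.lower() not in page_words for w in require):
        return False
    if any(w.lower() in page_words for w in exclude):
        return False

    # Location filter, approximated here by a suffix of the host name.
    if site is not None:
        host = urlparse(url).hostname or ""
        if not (host == site or host.endswith("." + site)):
            return False
    return True
```

For instance, a page containing "Web search services" matches the query `search web` in `all` mode but not in `phrase` mode, because the words appear in a different order.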
To accomplish its goal, MetaCrawler must be adept at several tasks. It needs to take user queries and format them appropriately for each search service. Next, it must correctly parse the results from each service and aggregate them. Finally, it must analyze the results to eliminate duplicates and perform other checks to ensure quality. Figure 1 details the MetaCrawler's control flow.
It is important to note that accomplishing these tasks is sufficient only to get MetaCrawler to work. There are several additional requirements to make it usable, which in turn strongly impact how MetaCrawler is designed. First and foremost, MetaCrawler needs a user interface that the average person can use. Second, MetaCrawler must be able to perform its tasks for the user as quickly as possible. Finally, MetaCrawler must be able to adapt to a rapidly changing environment.
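Two of the core tasks listed above, formatting a query for each service and eliminating duplicate references, can be sketched in Python as follows. The service names, endpoints, and parameter names in `SERVICE_FORMATS` are invented for illustration, and treating two URLs as duplicates when they differ only in host-name case or a trailing slash is an assumed heuristic rather than MetaCrawler's actual rule. Result parsing is omitted because it depends on each service's output format.

```python
from urllib.parse import urlencode, urlsplit

# Hypothetical services: each entry is (endpoint, name of the query parameter).
SERVICE_FORMATS = {
    "service_a": ("http://a.example/search", "q"),
    "service_b": ("http://b.example/find", "terms"),
}

def format_query(service, query):
    """Build the request URL in the form a given service expects."""
    base, param = SERVICE_FORMATS[service]
    return base + "?" + urlencode({param: query})

def normalize(url):
    """Reduce a URL to a comparison key (assumed heuristic)."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    path = parts.path.rstrip("/") or "/"
    key = host + path
    if parts.query:
        key += "?" + parts.query
    return key

def deduplicate(urls):
    """Keep the first occurrence of each reference, preserving order."""
    seen = set()
    unique = []
    for url in urls:
        key = normalize(url)
        if key not in seen:
            seen.add(key)
            unique.append(url)
    return unique
```

Keeping the per-service details in a single table is one way to address the adaptability requirement: when a service changes its interface, only its table entry needs to be updated.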