MetaCrawler presents users with a single intelligent interface that controls several powerful information sources. It formats the user's queries for each service it uses, collects the references obtained from those services, and optionally downloads those references to ensure availability and quality. It then removes duplicate references and collates the rest into a single list for the user. The user need know only what he or she is looking for; the MetaCrawler Softbot takes care of how and where. We have paid special attention to performance, making MetaCrawler a practical tool for Web searching.
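The duplicate-removal and collation step described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than MetaCrawler's actual method: the `Reference` record, the URL normalization rules in `normalize_url`, and the choice to sum per-service scores when the same page is returned by more than one service.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass
class Reference:
    url: str
    title: str
    score: float  # hypothetical per-service relevance score
    source: str   # name of the service that returned the reference


def normalize_url(url: str) -> str:
    """Build a comparison key for a URL.

    Assumed rules (not taken from the source): ignore the scheme and the
    fragment, lowercase the host, and drop a trailing slash from the path.
    """
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    path = parts.path.rstrip("/")
    query = f"?{parts.query}" if parts.query else ""
    return f"{host}{path}{query}"


def collate(result_lists):
    """Merge per-service result lists into one list without duplicates.

    References whose normalized URLs match are merged; their scores are
    summed (an assumed policy), and the merged list is sorted by score,
    highest first.
    """
    merged = {}
    for results in result_lists:
        for ref in results:
            key = normalize_url(ref.url)
            if key in merged:
                kept = merged[key]
                kept.score += ref.score
                kept.source += "," + ref.source
            else:
                # Copy the record so the caller's lists are not mutated.
                merged[key] = Reference(ref.url, ref.title, ref.score, ref.source)
    return sorted(merged.values(), key=lambda r: r.score, reverse=True)
```

Summing scores rewards references that several services agree on; other policies, such as taking the maximum score, would fit the same structure.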
MetaCrawler's architecture is a modular harness: we can plug in, modify, and remove services easily. This lets us combine many different resources without difficulty and adapt to new and modified services. Because of its minimal resource requirements, MetaCrawler is very portable; it can run as a server on enterprise-scale UNIX machines or as a user application on a Windows PC. It also exhibits reasonable scaling properties, especially when viewed as a client application. The research service has been available to the public for over a year, with a current query rate of over 150,000 per day.
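One way to picture a modular harness of this kind is a registry of interchangeable service adapters that are queried in parallel. The sketch below is hypothetical throughout: the `Harness` class, its `plug_in`, `remove`, and `search` methods, and the two stub adapters are invented for illustration and make no network calls. It only shows how services might be added or removed without touching the rest of the system.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# A service adapter takes the user's query and returns a list of URLs.
# Each adapter is responsible for formatting the query for its own service.
Service = Callable[[str], List[str]]


class Harness:
    """Hypothetical registry of pluggable search-service adapters."""

    def __init__(self) -> None:
        self._services: Dict[str, Service] = {}

    def plug_in(self, name: str, service: Service) -> None:
        # Registering under an existing name replaces (modifies) that service.
        self._services[name] = service

    def remove(self, name: str) -> None:
        self._services.pop(name, None)

    def search(self, query: str) -> Dict[str, List[str]]:
        """Send the query to every registered service in parallel."""
        if not self._services:
            return {}
        results: Dict[str, List[str]] = {}
        with ThreadPoolExecutor(max_workers=len(self._services)) as pool:
            futures = {
                name: pool.submit(service, query)
                for name, service in self._services.items()
            }
            for name, future in futures.items():
                try:
                    results[name] = future.result()
                except Exception:
                    # A failing service yields no results instead of
                    # aborting the whole query.
                    results[name] = []
        return results


# Stub adapters standing in for real services; each formats the query
# in its own (made-up) way.
def stub_service_a(query: str) -> List[str]:
    return ["http://a.example/search?q=" + query.replace(" ", "+")]


def stub_service_b(query: str) -> List[str]:
    return ["http://b.example/find/" + query.replace(" ", "%20")]
```

A short usage example under the same assumptions:

```python
harness = Harness()
harness.plug_in("a", stub_service_a)
harness.plug_in("b", stub_service_b)
print(harness.search("web search"))
harness.remove("b")  # the remaining service keeps working unchanged
```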