
Cincinnati Company Helps Organize and Catalogue the Web

By Pete Shuler · February 24th, 2000 · Science & Technology
The Internet holds a seemingly all-encompassing collection of knowledge and opinion. Servers around the world house volumes of information on every imaginable topic -- history, art, science, religion, philosophy, literature, entertainment and how-to-do-anything.

But, perhaps because the Internet is so vast, the Web sites that contain a specific piece of information often are frustratingly elusive. The Web often seems like a library that holds every book ever published and stores them in a huge, jumbled pile.

Search engines and Internet directories go a long way toward organizing this pile. Yahoo!, AltaVista, Excite, Lycos and Hotbot are some of the most popular sites on the Web and serve as both doorway and map for millions of users.

Although each of these is a powerful tool individually, using multiple search engines increases Internet coverage and raises the probability of snaring the desired information. Because each employs proprietary methods to discover, index and rank Web sites, search engines provide results that vary greatly. Furthermore, they might not add new sites for several weeks. Combining the power of many search engines overcomes the shortcomings of each.

But using numerous search engines for involved research is a tedious, repetitive task. Searchers must bounce back and forth between result lists and Web sites and comb past already examined sites to find fresh information.

In May 1997, Mahendra Vora, recognizing that few companies had developed technology to improve Web-based research, founded IntelliSeek to automate and refine the process of information retrieval on the Internet. Since then, this Cincinnati-based company has developed automated search software that succeeds in improving both the accuracy and the efficiency of Internet-based research.

BullsEye 2 and BullsEye 2 Pro (www.intelliseek.com) use intelligent agent technology, meaning they're programmed to make decisions and perform tasks, such as interfacing with search engines, without human intervention. Specifically, BullsEye seamlessly and automatically employs multiple search engines by translating a user's query into the format required by each engine, submitting the query to each and returning the results, formatted for uniformity, to the user.
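For readers curious how such a fan-out might look in code, here is a minimal Python sketch of the translate, submit and merge steps described above. It is not IntelliSeek's implementation: the `Engine` and `Result` types, both made-up engines and their canned results are assumptions for illustration, and no real search engine is contacted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Result:
    title: str
    url: str
    engine: str


@dataclass(frozen=True)
class Engine:
    # Hypothetical adapter: each engine supplies its own query syntax.
    name: str
    format_query: Callable[[List[str]], str]        # terms -> engine-specific query
    search: Callable[[str], List[Dict[str, str]]]   # query -> raw hits (stubbed here)


def metasearch(terms: List[str], engines: List[Engine]) -> List[Result]:
    """Send the same terms to every engine and merge hits into one uniform list."""
    merged: List[Result] = []
    seen = set()
    for engine in engines:
        query = engine.format_query(terms)
        for hit in engine.search(query):
            # Treat URLs that differ only by case or a trailing slash as duplicates.
            key = hit["url"].rstrip("/").lower()
            if key in seen:
                continue
            seen.add(key)
            merged.append(Result(hit["title"], hit["url"], engine.name))
    return merged


# Two made-up engines with different query syntaxes and canned results.
plus_engine = Engine(
    name="PlusEngine",
    format_query=lambda terms: " ".join("+" + t for t in terms),
    search=lambda q: [{"title": "Pie Guide", "url": "http://example.com/pie"}],
)
and_engine = Engine(
    name="AndEngine",
    format_query=lambda terms: " AND ".join(terms),
    search=lambda q: [
        {"title": "Pie Guide", "url": "http://example.com/pie/"},
        {"title": "Diner List", "url": "http://example.org/diners"},
    ],
)

results = metasearch(["Cincinnati", "pecan"], [plus_engine, and_engine])
```

Here the second engine's duplicate of the pie page is dropped, so the merged list holds two entries, each in the same uniform shape regardless of which engine produced it.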

Just before displaying those results, BullsEye attempts to connect to each Web site. If no connection is made, perhaps because the site no longer exists, that "dead link" is eliminated from BullsEye's results.
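A dead-link check of this kind can be sketched in a few lines of Python. This is an illustration under stated assumptions, not BullsEye's actual logic: the function names are hypothetical, the check uses an HTTP HEAD request, and any error response (a 404, for instance) is counted as dead, which is a stricter rule than "no connection is made."

```python
import urllib.error
import urllib.request
from typing import Callable, List


def is_reachable(url: str, timeout: float = 5.0) -> bool:
    """Return True if the server answers a HEAD request without an error."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout):
            return True
    except (urllib.error.URLError, ValueError, OSError):
        # Malformed URL, DNS failure, refused connection, timeout or HTTP error.
        return False


def drop_dead_links(urls: List[str], check: Callable[[str], bool] = is_reachable) -> List[str]:
    """Keep only the URLs for which the reachability check succeeds."""
    return [url for url in urls if check(url)]
```

Passing the check in as a parameter lets the filter be tried without touching the network, for example `drop_dead_links(urls, check=lambda u: u in known_good)`.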

Other search software uses similar technology, as do metasearch engines such as Metacrawler (www.meta-crawler.com), Dogpile (www.dogpile.com) and Mamma.com (www.mamma.com). Most of these tools, however, submit queries to only a dozen or so search engines, while BullsEye uses more than 700 general-purpose and specialized search engines.

"I don't believe any other research product uses even 50 percent of the search engines BullsEye uses," said Kelly Baker, director of marketing communications at IntelliSeek, based in Sharonville.

BullsEye also differs from its competitors in that it's capable of analyzing the textual content of Web sites.

Embedded within BullsEye is linguistic analysis technology that "reads" the pages of each site returned by the search engines. Based on this reading, BullsEye then determines whether the site's content matches the user's query.

This process filters out Web sites that search engines categorize incorrectly. Like a lazy student reading Cliffs Notes, some search engines index a site based only on its title or certain codes in the site's programming. These titles and codes, both entered by Web page designers, might have little to do with the site's content. Since advertising revenue is generally determined by the number of visits to a site, designers often use popular search terms such as "sex" and "erotic" to increase traffic, even when a site has nothing to do with these topics.

And even search engines that accurately index a site's keywords might produce irrelevant results. A site, for example, that contains the words "Ohio" on one page, "sales" on another and "taxes" on a third might have nothing to do with sales taxes in Ohio.
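The "Ohio sales taxes" problem can be made concrete with a small Python sketch. It is a simple stand-in for the idea of reapplying the query to the page text, not the linguistic analysis technology BullsEye actually embeds: the rule that every term must appear on the same page, and all function names, are assumptions for illustration.

```python
import re
from typing import List


def contains_term(text: str, term: str) -> bool:
    """Case-insensitive whole-word (or whole-phrase) match."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def page_matches(text: str, terms: List[str]) -> bool:
    """A page matches only if it contains every query term."""
    return all(contains_term(text, term) for term in terms)


def site_matches(pages: List[str], terms: List[str]) -> bool:
    """A site matches only if at least one single page contains every term."""
    return any(page_matches(page, terms) for page in pages)


terms = ["Ohio", "sales", "taxes"]
scattered_site = ["Visit Ohio this summer.", "Spring sales event.", "Filing your taxes."]
focused_site = ["Ohio sales taxes rose in 1999."]

scattered_ok = site_matches(scattered_site, terms)
focused_ok = site_matches(focused_site, terms)
```

The scattered site contains all three words but never on one page, so it is rejected, while the focused site passes.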

Although not perfect, BullsEye's linguistic analysis technology dramatically improves the accuracy of some search engines. To find restaurants in Cincinnati that serve pecan pie, one might enter "Cincinnati," "restaurant" and "pecan pie" into a search engine. In response to this query, AltaVista returned 200 sites, including home pages for the Cincinnati Ballet, Bogart's, the Academy of Medicine of Cincinnati and numerous restaurants around the country.

Applied to AltaVista's results, BullsEye whittled away more than 170 irrelevant sites and returned 29 sites, all of which contained the requisite terms and 15 of which contained information on Cincinnati restaurants that served pecan pie. Even with 14 off-target sites remaining, the linguistic analysis technology in BullsEye narrowed the results down to a manageable number.

"We're able to get a fantastic degree of relevancy, because BullsEye can reapply the query to the actual text of documents as they exist on the Internet at the moment you performed the search," said Chris Connaughton, vice president of research and development at IntelliSeek.

This technology and the extensive collection of search engines provide BullsEye's power, while its integrated browser and split screen layout increase search efficiency. BullsEye displays search results, ranked according to relevance, in one of its two main screens. Clicking on a Web page title or address in this list opens that site in the second screen, a Web browser that's fully integrated into the product. This layout allows the user to view sites and the list of results simultaneously instead of switching back and forth between the two. The integrated browser also allows users to download files, bookmark sites and surf to other links without leaving BullsEye.

BullsEye further facilitates the task of scrolling through sites by highlighting and counting the occurrence of search terms in each document. A line above the browser screen shows the number of search terms in the site being viewed. This keyword count enables users to ignore sites containing only a few occurrences of the requested words. Forward and reverse arrows allow users to skip from keyword to keyword within a site, jumping over pages of text to examine the context in which the highlighted keywords appear.
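The keyword count and the forward and reverse arrows can be approximated in Python as follows. This is a sketch under assumptions, not BullsEye's code: positions are character offsets into plain text, matching is case-insensitive on whole words, and the function names are hypothetical.

```python
import re
from bisect import bisect_left, bisect_right
from typing import List, Optional


def keyword_positions(text: str, terms: List[str]) -> List[int]:
    """Return the sorted character offsets of every search-term occurrence."""
    if not terms:
        return []
    pattern = "|".join(r"\b" + re.escape(term) + r"\b" for term in terms)
    return [match.start() for match in re.finditer(pattern, text, flags=re.IGNORECASE)]


def next_hit(positions: List[int], cursor: int) -> Optional[int]:
    """Offset of the first occurrence after the cursor (the forward arrow)."""
    index = bisect_right(positions, cursor)
    return positions[index] if index < len(positions) else None


def previous_hit(positions: List[int], cursor: int) -> Optional[int]:
    """Offset of the last occurrence before the cursor (the reverse arrow)."""
    index = bisect_left(positions, cursor) - 1
    return positions[index] if index >= 0 else None


text = "Morphine the band released Cure for Pain. Morphine toured widely."
positions = keyword_positions(text, ["morphine", "band"])
keyword_count = len(positions)  # the number a status line could display
```

Because the offsets are kept sorted, each arrow press is a binary search rather than a rescan of the document.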

While researching the band Morphine, these features allowed me to quickly determine if a site warranted further examination or if it contained information on the narcotic instead of the band.

I could have further focused and accelerated my Morphine research by restricting my search to BullsEye's entertainment topic area. Searching within topics such as news, shopping, jobs and entertainment employs only category-specific search engines that are more likely to generate relevant results. These engines also reduce search and analysis time by returning fewer, more focused results to BullsEye.

The software topic area, for example, utilizes 67 search engines that allow users to pick through the Internet and find only those Web sites containing software reviews, developers' tools, downloadable software, drivers or games.

Beyond being a powerful and flexible research tool, BullsEye contains information management and tracking features. Both BullsEye 2 and BullsEye 2 Pro maintain a query history that allows users to rerun searches without reentering terms -- an invaluable feature to searchers who frequently enter long or complex queries. Both versions also allow users to save search results so they can be examined later without rerunning the search.

The Pro version also contains a tracking function that automatically searches the Web and updates results at user-specified intervals. The updated results contain only newly discovered sites or those that have changed since the last search. BullsEye 2 Pro can automatically transfer these updates to any device that accepts e-mail.
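One plausible way to report only new or changed sites is to compare content fingerprints between runs, as in the Python sketch below. The article does not say how BullsEye 2 Pro detects changes, so the SHA-256 hashing and every name here are assumptions for illustration.

```python
import hashlib
from typing import Dict, List


def fingerprint(content: str) -> str:
    """Hash a page's text so that any change in content changes the fingerprint."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def snapshot(pages: Dict[str, str]) -> Dict[str, str]:
    """Record url -> fingerprint for the pages seen in one search run."""
    return {url: fingerprint(content) for url, content in pages.items()}


def changed_or_new(previous: Dict[str, str], current_pages: Dict[str, str]) -> List[str]:
    """Return URLs that are absent from the last snapshot or whose content differs."""
    return [
        url
        for url, content in current_pages.items()
        if previous.get(url) != fingerprint(content)
    ]


last_run = snapshot({"http://example.com/a": "old text", "http://example.com/b": "same text"})
this_run = {
    "http://example.com/a": "new text",
    "http://example.com/b": "same text",
    "http://example.com/c": "fresh page",
}
updates = changed_or_new(last_run, this_run)
```

In this example the unchanged page is left out, and only the edited page and the newly discovered one would be passed along in an update.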

Users of both versions can create and print or e-mail reports containing search results. The Pro version allows users to type comments, notes and instructions next to each search result, while BullsEye 2 allows just one annotation at the top of the report. The Pro version also produces the reports in a format that can be read by Palm Pilots.

BullsEye 2, which generates revenue for IntelliSeek through a small ad banner above the two main screens, is available at no charge at www.intelliseek.com. BullsEye 2 Pro should be available at the same site by the end of March for $149. Version 1.5 is currently available for the same price.

Another IntelliSeek product worth noting is InvisibleWeb.com (www.invisibleweb.com), a directory of 12,000 databases that exist in formats unrecognized by search engines and, consequently, don't turn up in search engine results. This collection, which IntelliSeek employees catalogued while developing BullsEye, offers medical, expert and other specialty databases along with catalogs, dictionaries, guides and directories covering a variety of topics.

Dozens of companies have entered the market for Internet search solutions since IntelliSeek's establishment, but the Cincinnati company has remained at the forefront by providing both professional and casual researchers with the latest technology and user-friendly features.

 
 
 
 

 
