LinkSearch Prototype

Roughly speaking, the prototype works by viewing each hypertext link in the Web as a sample query --- namely the text on the link --- and a good answer to that query --- namely the URL that the link points to.

For example, if most hypertext links with "linux" in their text point to the same URL, then that URL will be the expected answer for the query "linux".
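
The idea of treating link text as a query and the link target as its answer can be illustrated with a minimal sketch. This is not the prototype's actual code: the function names, the per-word voting rule, and the example.org URLs are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def build_anchor_index(links):
    """Map each lowercase link-text word to a Counter of target URLs."""
    index = defaultdict(Counter)
    for anchor_text, target_url in links:
        # Count each word once per link so a repeated word does not vote twice.
        for word in set(anchor_text.lower().split()):
            index[word][target_url] += 1
    return index

def best_answer(index, word):
    """Return the URL most often pointed to by links containing the word."""
    counts = index.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Hypothetical sample data: (link text, target URL). The URLs are placeholders.
SAMPLE_LINKS = [
    ("Linux home page", "http://example.org/linux"),
    ("the linux kernel", "http://example.org/linux"),
    ("linux news", "http://example.org/news"),
]
```

With this sample data, two of the three links containing "linux" point to the same placeholder URL, so best_answer(build_anchor_index(SAMPLE_LINKS), "linux") returns that URL.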

Of course, a query may contain several words that are not all present in any single hypertext link. For example, users may wish to provide context words in a query such as "142 exams" where 142 provides context for the item of interest --- namely, exams for that course. To handle such queries, we look for paths of hypertext links in the Web graph containing all the query words.
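
One way to picture the path search is a breadth-first search over chains of links whose texts together cover all the query words. The sketch below is only an assumption about how such a search could look: the edge format, the max_links bound, the rule that every link on the path must mention at least one query word, and the example.edu URLs are all hypothetical.

```python
from collections import deque

def find_answer_by_path(edges, query, max_links=3):
    """Return the page at the end of the first chain of links (found by
    breadth-first search) whose texts jointly contain every query word."""
    wanted = frozenset(query.lower().split())
    if not wanted:
        return None

    # Keep only links that mention a query word (an assumption of this sketch).
    outgoing = {}
    for source, anchor_text, target in edges:
        matched = wanted & frozenset(anchor_text.lower().split())
        if matched:
            outgoing.setdefault(source, []).append((matched, target))

    # A search state is (page reached, query words covered so far).
    queue = deque()
    seen = set()
    for links in outgoing.values():
        for matched, target in links:
            if (target, matched) not in seen:
                seen.add((target, matched))
                queue.append((target, matched, 1))

    while queue:
        page, covered, length = queue.popleft()
        if covered == wanted:
            return page
        if length == max_links:
            continue
        for matched, target in outgoing.get(page, []):
            state = (target, covered | matched)
            if state not in seen:
                seen.add(state)
                queue.append((target, covered | matched, length + 1))
    return None

# Hypothetical sample edges: (source page, link text, target page).
SAMPLE_EDGES = [
    ("http://example.edu/", "CSE 142", "http://example.edu/142/"),
    ("http://example.edu/142/", "old exams", "http://example.edu/142/exams/"),
    ("http://example.edu/", "faculty", "http://example.edu/faculty/"),
]
```

For the query "142 exams", no single sample link contains both words, but the chain "CSE 142" followed by "old exams" does, so the search ends at the placeholder exams page.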

Finally, Web server logs are used to guide the search engine crawler to web pages of interest. In this way, we do not stop the crawling process at some arbitrary depth limit. Rather, we crawl the pages that people have browsed recently.
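
As a rough illustration of log-guided crawling, the sketch below turns server log lines into a list of pages to crawl, most requested first. The text above does not specify a log format or a definition of "recently", so this assumes Common Log Format lines and a hypothetical seven-day window; the function name and sample entries are likewise made up.

```python
import re
from collections import Counter
from datetime import datetime, timedelta, timezone

# Assumed format: host ident user [timestamp] "GET path protocol" status size
LOG_LINE = re.compile(r'\S+ \S+ \S+ \[([^\]]+)\] "GET (\S+) [^"]*" (\d{3}) \S+')

def crawl_frontier(log_lines, now, window=timedelta(days=7)):
    """Return paths requested successfully within the window, most hits first."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not GET requests in the assumed format
        timestamp, path, status = match.groups()
        when = datetime.strptime(timestamp, "%d/%b/%Y:%H:%M:%S %z")
        # Keep only successful (2xx) requests that are recent enough.
        if status.startswith("2") and now - when <= window:
            hits[path] += 1
    return [path for path, _ in hits.most_common()]

# Hypothetical sample log entries and reference time.
SAMPLE_LOG = [
    '10.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /142/exams/ HTTP/1.0" 200 2326',
    '10.0.0.2 - - [10/Oct/2000:14:02:10 -0700] "GET /142/exams/ HTTP/1.0" 200 2326',
    '10.0.0.3 - - [11/Oct/2000:09:00:00 -0700] "GET /faculty/ HTTP/1.0" 200 512',
    '10.0.0.4 - - [01/Jan/2000:09:00:00 -0700] "GET /old/ HTTP/1.0" 200 512',
    '10.0.0.5 - - [11/Oct/2000:09:05:00 -0700] "GET /missing/ HTTP/1.0" 404 0',
]
SAMPLE_NOW = datetime(2000, 10, 12, tzinfo=timezone.utc)
```

In this sketch the crawl list is driven by what readers actually requested rather than by link depth: the stale January entry and the failed request are dropped, and the remaining pages are ordered by how often they were browsed.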