The goal of the LinkSearch project is to build a Google-like search engine.
Like Google, we aim to provide the highest quality rankings possible by analyzing
hypertext links on the Web.
Unlike Google, LinkSearch...
- is an open source project under the GPL
- is an "open research" project where developers are encouraged to
experiment with new techniques
- is more aggressive in its use of links (and link paths) to provide more accurate results for a wide class of queries
- is a local search engine and will work in combination with the Apache server or Squid proxy
- is real-time and detects changes in the web site immediately by analyzing the Apache/Squid logs
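As a rough illustration of the log-driven approach in the list above, the sketch below pulls the paths of successfully served pages out of Apache access log lines. It assumes the logs are in Common Log Format; it is not taken from the LinkSearch source, and the function name `urls_from_apache_log` and the sample lines are made up for this example.

```python
import re

# Common Log Format: host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(\S+) (\S+) (\S+) \[([^\]]+)\] "(\S+) (\S+)(?: (\S+))?" (\d{3}) (\S+)'
)

def urls_from_apache_log(lines):
    """Return the paths of successful GET requests, in first-seen order.

    Hypothetical helper for illustration; not part of LinkSearch.
    """
    seen = set()
    paths = []
    for line in lines:
        match = CLF_PATTERN.match(line)
        if match is None:
            continue  # skip lines that are not in Common Log Format
        method, path, status = match.group(5), match.group(6), match.group(8)
        if method == "GET" and status == "200" and path not in seen:
            seen.add(path)
            paths.append(path)
    return paths

# Made-up sample lines; a real deployment would read new lines as they are appended.
sample = [
    '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '127.0.0.1 - - [10/Oct/2000:13:55:40 -0700] "GET /missing.html HTTP/1.0" 404 209',
    '127.0.0.1 - - [10/Oct/2000:13:56:02 -0700] "GET /docs/faq.html HTTP/1.0" 200 5120',
    '127.0.0.1 - - [10/Oct/2000:13:57:11 -0700] "GET /index.html HTTP/1.0" 200 2326',
]
print(urls_from_apache_log(sample))
```

Only requests that returned status 200 are kept here, so pages that no longer exist are not queued for indexing. That filter is a design choice of this sketch, not something the project description specifies.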
Although LinkSearch "crawls" only the URLs found in local log files, it
can also serve as a global search engine, particularly when the Squid proxy log is used.
The proxy log records the "slice" of the Web that is typically browsed by
people in the institution generating those logs, and that slice reflects
their particular interests. For example, a search for "weather" will return
weather pages for your own city, whereas a global search engine would return a general page on weather.
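To make the idea of a "slice" concrete, the sketch below counts how often each host appears in a Squid access log. It assumes Squid's native log format, in which the request method is the sixth whitespace-separated field and the URL is the seventh. The function name `slice_by_host`, the host names, and the sample lines are hypothetical, and this is not LinkSearch's own code.

```python
from collections import Counter
from urllib.parse import urlsplit

def slice_by_host(squid_lines):
    """Count requested URLs per host in Squid native-format log lines.

    Hypothetical helper for illustration; not part of LinkSearch.
    """
    hosts = Counter()
    for line in squid_lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # not a complete native-format line
        method, url = fields[5], fields[6]
        if method != "GET":
            continue  # e.g. CONNECT entries carry host:port, not a full URL
        host = urlsplit(url).hostname
        if host:
            hosts[host] += 1
    return hosts

# Made-up sample lines in Squid's native access.log layout.
sample = [
    "1066036250.603 11 192.168.0.1 TCP_MISS/200 1504 GET http://weather.example.org/today - DIRECT/10.0.0.5 text/html",
    "1066036251.120 35 192.168.0.2 TCP_HIT/200 980 GET http://weather.example.org/week - NONE/- text/html",
    "1066036252.481 20 192.168.0.1 TCP_MISS/200 2210 GET http://news.example.com/ - DIRECT/10.0.0.9 text/html",
    "1066036253.002 90 192.168.0.3 TCP_MISS/200 3900 CONNECT secure.example.net:443 - DIRECT/10.0.0.7 -",
]
counts = slice_by_host(sample)
print(counts["weather.example.org"], counts["news.example.com"])
```

Hosts that show up often in such a tally are the ones the institution's users actually visit, which is the sense in which the log defines a slice of the Web.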
Demos
You can try out the new open source version here:
Project Members
We are looking for people to test and optimize the current code.
Technique
Proposed Design of Open Source Version
Source Code
The source is in CVS, and you will need to use anonymous CVS to
download it. You should be able to compile it on most Linux
installations. To check out the code, type:

    cvs -z3 -d:pserver:anonymous@cvs.linksearch.sourceforge.net:/cvsroot/linksearch co src
IR Resources
Developer Resources
Some relevant papers