Project info for ASPseek

Created 18 Apr 2002 at 09:48 UTC by kir, last modified 18 Apr 2002 at 09:58 UTC by kir.


Freshmeat page:


ASPseek is a web search engine written in C++. It consists of an indexing robot, a search daemon, and a search frontend (CGI or Apache module). ASPseek uses a mix of SQL tables and binary files for storage. It can index up to a few million URLs and supports searching for words and phrases, wildcards, and Boolean queries. Search results can be limited to a given time period, site, or Web space (a set of sites), and sorted by relevance (using PageRank) or by date.
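To make the idea of a Boolean word search concrete, here is a minimal C++ sketch of an in-memory inverted index with AND and OR queries. It is only an illustration: the class ToyIndex and its methods are hypothetical names, and nothing here reflects ASPseek's actual code or its SQL/binary storage layout.

```cpp
#include <algorithm>
#include <cctype>
#include <iterator>
#include <map>
#include <set>
#include <sstream>
#include <string>

// Hypothetical toy index (not ASPseek's real data structure): maps each
// lowercased word to the set of document ids whose text contains it.
class ToyIndex {
public:
    // Index a document: split the text on whitespace and record every word.
    void add(int doc_id, const std::string& text) {
        std::istringstream in(text);
        std::string word;
        while (in >> word) {
            postings_[lower(word)].insert(doc_id);
        }
    }

    // Ids of documents containing the word; empty if the word is unknown.
    std::set<int> lookup(const std::string& word) const {
        std::set<int> result;
        auto it = postings_.find(lower(word));
        if (it != postings_.end()) {
            result = it->second;
        }
        return result;
    }

    // Boolean AND: documents containing both words.
    std::set<int> search_and(const std::string& a, const std::string& b) const {
        const std::set<int> left = lookup(a);
        const std::set<int> right = lookup(b);
        std::set<int> out;
        std::set_intersection(left.begin(), left.end(),
                              right.begin(), right.end(),
                              std::inserter(out, out.begin()));
        return out;
    }

    // Boolean OR: documents containing at least one of the words.
    std::set<int> search_or(const std::string& a, const std::string& b) const {
        const std::set<int> left = lookup(a);
        const std::set<int> right = lookup(b);
        std::set<int> out;
        std::set_union(left.begin(), left.end(),
                       right.begin(), right.end(),
                       std::inserter(out, out.begin()));
        return out;
    }

private:
    // Lowercase a word so that lookups are case-insensitive (ASCII only).
    static std::string lower(std::string s) {
        std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
            return static_cast<char>(std::tolower(c));
        });
        return s;
    }

    std::map<std::string, std::set<int>> postings_;
};
```

After adding a few documents with add(), search_and("web", "search") returns the ids of documents that contain both words, and search_or returns those that contain either. A real engine would also need phrase positions, wildcard expansion, and ranking, which this sketch leaves out.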

ASPseek is optimized for multiple sites (threaded indexing, asynchronous DNS lookups, grouping of results by site, Web spaces), but can also be used to search a single site. It can work with multiple languages and encodings at once (including multibyte encodings such as those used for Chinese) thanks to its Unicode storage mode.

Other features include stopwords and ispell support, a charset and language guesser, HTML templates for search results, excerpts, and highlighting of query words.

A full set of documentation is included.

License: GPL

This project has the following developers:
