I have a website that features a vast amount of static content, with mailing list archives totaling over 700,000 pages. I'm searching for a user-friendly, open-source site search engine that would be easy to manage. I've considered using Nutch, but it appears to be quite challenging to set up and maintain. Any suggestions would be greatly appreciated! Thanks in advance!
3 Answers
This is an intriguing topic! If you don’t find any reliable solutions, I’d be interested in creating a reusable tool for integrated searching within a domain. Keep us posted, OP; I’m on the lookout for my next side project!
I recommend checking out lunr.js! It's lightweight and runs well in a Cloudflare worker, which can help improve search performance without placing too much load on your servers.
If your data is organized in a structured format, you might find that using a Lucene-based index such as Solr, OpenSearch, or Elasticsearch is pretty straightforward. It’s a solid option for handling large sets of static web pages!
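To give a rough idea of what "structured" could mean here, below is a minimal sketch (not a tested pipeline) that turns one archived HTML page into a small JSON document and wraps it in the newline-delimited action/source pairs that the Elasticsearch and OpenSearch bulk API accepts. The index name `list-archive`, the field names `url`, `title`, and `body`, and the choice of the page URL as the document ID are all my own assumptions for illustration; only the Python standard library is used.

```python
import json
from html.parser import HTMLParser


class PageText(HTMLParser):
    """Collects the <title> and the visible text of one HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.chunks = []
        self._in_title = False
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if not text or self._skip_depth:
            return
        if self._in_title:
            self.title += text
        else:
            self.chunks.append(text)


def page_to_doc(url, html):
    """Build one search document (hypothetical field names) from raw HTML."""
    parser = PageText()
    parser.feed(html)
    parser.close()
    return {"url": url, "title": parser.title, "body": " ".join(parser.chunks)}


def bulk_lines(docs, index_name="list-archive"):
    """Yield NDJSON lines in bulk-API order: an action line, then the source."""
    for doc in docs:
        # Assumption: the page URL is unique and short enough to serve as _id.
        yield json.dumps({"index": {"_index": index_name, "_id": doc["url"]}})
        yield json.dumps(doc)


if __name__ == "__main__":
    sample = (
        "<html><head><title>Re: patch v2</title></head>"
        "<body><p>Looks good to me.</p></body></html>"
    )
    doc = page_to_doc("/archive/2009/msg00042.html", sample)
    print("\n".join(bulk_lines([doc])))
```

If you go this route, join the lines with `\n`, make sure the payload ends with a trailing newline, and send it to the `_bulk` endpoint in batches rather than all 700,000 pages at once. Real mailing list pages probably need more careful extraction (author, date, thread), but that depends on how your archive software lays out its HTML.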
