Text search engines

Text search engines allow you to index and search text documents.

Created: by Pradeep Gowda Updated: Sep 15, 2023 Tagged: databases · text-search

Smaller alternatives to Elasticsearch without all the features

valeriansaliou/sonic: 🦔 Fast, lightweight & schema-less search backend. An alternative to Elasticsearch that runs on a few MBs of RAM.

  • does not ship with a web frontend
  • from HN comment: “Sonic here only returns document identifiers so you will never be able to get document information back. This is very useful though if all you want to do is index text data and then get the stored information from another data store.”
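The pattern described in that comment (the search backend returns only identifiers, and the documents themselves live in a separate store) can be sketched as follows. This is a toy in-memory illustration, not Sonic's actual protocol or client API: the `IdOnlyIndex` class, its `push` and `query` methods, and the `documents` dict standing in for the primary data store are all hypothetical names chosen for the example.

```python
from collections import defaultdict


class IdOnlyIndex:
    """Toy inverted index that keeps document identifiers only, never the text.

    Hypothetical stand-in for an identifier-only search backend.
    """

    def __init__(self):
        # term -> set of document identifiers containing that term
        self._postings = defaultdict(set)

    def push(self, doc_id, text):
        """Index the words of `text` under `doc_id`; the text itself is discarded."""
        for term in text.lower().split():
            self._postings[term].add(doc_id)

    def query(self, terms):
        """Return sorted identifiers of documents containing every query word."""
        words = terms.lower().split()
        if not words:
            return []
        matches = set.intersection(*(self._postings.get(w, set()) for w in words))
        return sorted(matches)


# The primary data store (assumed here to be a plain dict; in practice a database).
documents = {
    "post:1": {"title": "Rust search libraries", "body": "tantivy is a search library"},
    "post:2": {"title": "Go search servers", "body": "zinc is a search server"},
}

index = IdOnlyIndex()
for doc_id, doc in documents.items():
    index.push(doc_id, doc["title"] + " " + doc["body"])

# Step 1: the index answers with identifiers only.
hits = index.query("search library")  # ["post:1"]

# Step 2: the application fetches the full records from the other store.
results = [documents[doc_id] for doc_id in hits]
```

The trade-off is an extra lookup per query in exchange for an index that never has to store or serve document bodies.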

quickwit-oss/tantivy: Tantivy is a full-text search engine library inspired by Apache Lucene and written in Rust.

  • written in Rust
  • quickwit-oss/tantivy-cli has a quick onboarding via command line and provides an API that returns JSON search results.
  • “UI” is the JSON search endpoint
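Since the only "UI" is a JSON endpoint, a small client is enough to query it. The sketch below assumes the address and parameters used in the tantivy-cli README example (`http://localhost:3000/api/` with `q` and `nhits`); check them against the version you run. The function names are hypothetical, and the shape of the returned JSON is not assumed here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed default address of a locally running `tantivy serve` instance.
DEFAULT_BASE = "http://localhost:3000/api/"


def build_search_url(query, nhits=10, base=DEFAULT_BASE):
    """Build the search URL; `q` and `nhits` are assumed parameter names."""
    return base + "?" + urlencode({"q": query, "nhits": nhits})


def search(query, nhits=10, base=DEFAULT_BASE):
    """Send the query and return the decoded JSON response as Python objects."""
    with urlopen(build_search_url(query, nhits, base)) as response:
        return json.load(response)
```

With a server running, `search("barack obama", nhits=20)` would return whatever JSON the endpoint produces for that query.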

meilisearch/meilisearch: A lightning-fast search engine that fits effortlessly into your apps, websites, and workflow.

nathell/smyrna

  • written in Clojure
  • “Smyrna is a concordancer and statistical analyzer for metadata-rich corpora in Polish.”

zinclabs/zinc: ZincSearch. A lightweight alternative to elasticsearch that requires minimal resources, written in Go.

  • written in Go

Manticore: a faster alternative to Elasticsearch in C++ with a 21-year history

Vespa.ai “store, search, organize and make machine-learned inferences over big data at serving time.”

Commercial offerings

  • Algolia - powers Hacker News search.

Pagefind: Static low-bandwidth search at scale

Pagefind is a fully static search library that aims to perform well on large sites, while using as little of your users’ bandwidth as possible, and without hosting any infrastructure.

Pagefind runs after Hugo, Eleventy, Jekyll, Next, Astro, SvelteKit, or any other website framework. The installation process is always the same: Pagefind only requires a folder containing the built static files of your website, so in most cases no configuration is needed to get started.

After indexing, Pagefind adds a static search bundle to your built files, which exposes a JavaScript search API that can be used anywhere on your site. Pagefind also provides a prebuilt UI that can be used with no configuration.

The goal of Pagefind is that websites with tens of thousands of pages should be searchable by someone in their browser, while consuming as little bandwidth as possible. Pagefind’s search index is split into chunks, so that searching in the browser only ever needs to load a small subset of the search index. Pagefind can run a full-text search on a 10,000 page site with a total network payload under 300kB, including the Pagefind library itself. For most sites, this will be closer to 100kB.
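The chunking idea can be illustrated with a toy model: terms are assigned to chunks, and a search loads only the chunks that its query words map to. This is a conceptual sketch under stated assumptions, not Pagefind's actual index format or API. The hash-based chunk assignment, `NUM_CHUNKS`, and the `ChunkedSearcher` class are all hypothetical, and a dict stands in for chunk files on a static host.

```python
import zlib
from collections import defaultdict

NUM_CHUNKS = 8  # hypothetical; chosen only for illustration


def chunk_for(term):
    """Map a term to a chunk id deterministically (assumed scheme: CRC32 mod N)."""
    return zlib.crc32(term.encode("utf-8")) % NUM_CHUNKS


def build_chunks(pages):
    """Build chunk_id -> {term -> set of page URLs} from {url: text}."""
    chunks = defaultdict(lambda: defaultdict(set))
    for url, text in pages.items():
        for term in text.lower().split():
            chunks[chunk_for(term)][term].add(url)
    return chunks


class ChunkedSearcher:
    """Loads index chunks lazily, the way a browser would fetch static files."""

    def __init__(self, chunks):
        self._remote = chunks  # stands in for chunk files on a static host
        self._loaded = {}      # chunks "downloaded" so far

    def _load(self, chunk_id):
        # Fetch a chunk once and cache it; untouched chunks are never loaded.
        if chunk_id not in self._loaded:
            self._loaded[chunk_id] = dict(self._remote.get(chunk_id, {}))
        return self._loaded[chunk_id]

    def search(self, query):
        """Return sorted URLs of pages containing every query word."""
        words = query.lower().split()
        if not words:
            return []
        postings = [self._load(chunk_for(w)).get(w, set()) for w in words]
        return sorted(set.intersection(*postings))

    @property
    def chunks_loaded(self):
        return len(self._loaded)


pages = {
    "/a/": "static search for hugo sites",
    "/b/": "static sites with jekyll",
    "/c/": "dynamic search servers",
}
searcher = ChunkedSearcher(build_chunks(pages))
matches = searcher.search("static search")  # ["/a/"]
```

A two-word query touches at most two chunks regardless of how large the full index is, which is the property that keeps the network payload small.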

Articles