Show HN: txtai: open-source, production-focused vector search and RAG

dmezzetti 12 days ago

Hello, author of txtai here. txtai was created back in 2020 starting with semantic search of medical literature. It has since grown into a framework for vector search, retrieval augmented generation (RAG) and large language model (LLM) orchestration/workflows.

The goal of txtai is to be simple, performant, innovative and easy to use. It had vector search before many current projects existed. Semantic graphs were added in 2022, before the generative AI wave of 2023/2024. GraphRAG is a hot topic, but txtai had examples of doing this earlier in 2024.

txtai has a commitment to quality and performance, especially with local models. For example, its vector embeddings component automatically streams vectors to disk during indexing and uses memory-mapped (mmap) arrays to enable indexing large datasets locally on a single node. txtai's BM25 component is built from the ground up to work efficiently in Python, leading to 6x better memory utilization and faster search performance than the most commonly used Python BM25 library.
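
To make the stream-to-disk and mmap idea concrete, here's a minimal standard-library sketch of the general technique. It is not txtai's actual code: the function names, the fixed DIM, the raw float32 file layout and the brute-force dot-product scoring are all assumptions made for illustration.

```python
import mmap
import os
import tempfile
from array import array

DIM = 4                        # hypothetical vector dimension for this sketch
ITEM = array("f").itemsize     # bytes per float32 value
ROW = DIM * ITEM               # bytes per stored vector


def stream_to_disk(batches, path):
    """Append each batch of vectors to a file as raw float32; return the row count.

    Only one batch is held in memory at a time.
    """
    count = 0
    with open(path, "wb") as f:
        for batch in batches:
            for vector in batch:
                array("f", vector).tofile(f)
                count += 1
    return count


def search(path, query, count, limit=1):
    """Brute-force dot-product search over the memory-mapped vector file.

    Assumes count > 0 (an empty file cannot be mapped). Returns a list of
    (score, row id) tuples, best first.
    """
    scores = []
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for i in range(count):
                # Slicing the map copies out one row; the OS pages data in on demand
                row = array("f")
                row.frombytes(mm[i * ROW:(i + 1) * ROW])
                scores.append((sum(q * v for q, v in zip(query, row)), i))
    return sorted(scores, reverse=True)[:limit]


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "vectors.bin")
    batches = [[[1, 0, 0, 0], [0, 1, 0, 0]], [[0.5, 0.5, 0, 0]]]
    count = stream_to_disk(batches, path)
    print(search(path, [0, 1, 0, 0], count, limit=2))
```

The point of the design is that neither indexing nor search needs the full vector set in RAM: writes go out batch by batch, and reads touch only the pages the operating system maps in.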

txtai is Apache 2.0 licensed and all code is available at https://github.com/neuml/txtai