Vector Database | SOFT CAT .ai

A vector database is a storage system designed for embedding vectors. It lets you store millions of vectors and quickly find the ones most similar to a query vector. This similarity search is the backbone of retrieval-augmented generation (RAG), semantic search, and recommendation systems.

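How it works: You store documents as vectors generated by an embedding model. When a query comes in, it is converted to a vector by the same model. The database then uses approximate nearest neighbour (ANN) algorithms to find the stored vectors closest to the query vector under a distance metric such as cosine similarity. ANN trades a small amount of accuracy for a large gain in speed, because it avoids comparing the query against every stored vector. Results come back ranked by similarity, typically in milliseconds.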
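A toy illustration of the ANN idea: the sketch below uses random-hyperplane hashing, a deliberately simple technique chosen for readability. It is not how any particular product works; production systems typically rely on more sophisticated indexes such as HNSW or IVF. The class name `HyperplaneIndex`, its parameters, and the sample vectors are all hypothetical.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class HyperplaneIndex:
    """Toy ANN index (hypothetical): vectors on the same side of every
    random hyperplane land in the same bucket."""

    def __init__(self, dim, n_planes=4, seed=0):
        rng = random.Random(seed)
        self.planes = [[rng.gauss(0, 1) for _ in range(dim)]
                       for _ in range(n_planes)]
        self.buckets = {}

    def _key(self, vec):
        # One boolean per hyperplane: which side of the plane is the vector on?
        return tuple(sum(p * v for p, v in zip(plane, vec)) >= 0
                     for plane in self.planes)

    def add(self, item_id, vec):
        self.buckets.setdefault(self._key(vec), []).append((item_id, vec))

    def search(self, query, k=3):
        # Only vectors in the query's bucket are scored. Close vectors that
        # fell into another bucket are missed: that is the "approximate" part.
        candidates = self.buckets.get(self._key(query), [])
        scored = [(item_id, cosine(query, vec)) for item_id, vec in candidates]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


index = HyperplaneIndex(dim=3)
index.add("a", [1.0, 0.0, 0.0])
index.add("b", [-1.0, 0.0, 0.0])
print(index.search([2.0, 0.0, 0.0], k=1))
```

The speed-up comes from scoring one bucket instead of the whole collection. The cost is recall: a true nearest neighbour can sit just across a hyperplane, which is why real indexes use several hash tables or graph-based structures.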

Popular options: Pinecone (managed, cloud-native), Weaviate (open source, hybrid search), Qdrant (open source, Rust-based), Chroma (lightweight, good for prototyping), and pgvector (a PostgreSQL extension, for when you want to keep everything in Postgres). Each makes different trade-offs on scale, features, and operational complexity.

When you need one: You need a vector database when you are building a RAG system, a semantic search feature, or any application that has to find “similar” items based on meaning rather than exact keywords. For small datasets (under 10,000 items), you can get away with brute-force cosine similarity in memory. Beyond that, a proper vector database pays for itself in speed and scalability.
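For the small-dataset case, the brute-force approach can be sketched in a few lines of standard-library Python. The function names and the two-dimensional sample vectors are made up for illustration; real embeddings usually have hundreds or thousands of dimensions and come from an embedding model.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def brute_force_top_k(query, items, k=3):
    """Score the query against every stored vector and keep the k best.

    `items` maps an item id to its vector. The cost grows linearly with
    the number of items, which is fine for a few thousand vectors.
    """
    scored = ((cosine_similarity(query, vec), item_id)
              for item_id, vec in items.items())
    return [(item_id, score) for score, item_id in heapq.nlargest(k, scored)]


# Hypothetical toy vectors standing in for real embeddings.
items = {
    "cat": [1.0, 0.0],
    "kitten": [0.9, 0.1],
    "car": [0.0, 1.0],
}
print(brute_force_top_k([1.0, 0.0], items, k=2))
```

Every query touches every vector, so latency scales with collection size. That linear cost is what an ANN index in a vector database avoids.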
