Vector database
A vector database stores embeddings (semantic vectors) and quickly finds the ones most similar to a query. It is a key component of retrieval-augmented generation (RAG) systems.
A vector database is a specialized database that stores embeddings, which are lists of numbers (vectors) that represent the meaning of text or images, and quickly retrieves items with similar meanings. It is like a library that shelves books with similar content close together rather than alphabetically by title, so that once you find one book, similar books are right next to it.
Keyword search in a conventional database finds a record only when the words match, whereas vector search finds relevant results even when the wording differs, for example matching "refund policy" with "how to get my money back". Because of this, vector databases have become a key part of RAG systems, which retrieve related documents and feed them to an AI model, and they are also widely used in recommendation systems and similar-image search.
However, search quality depends heavily on the embedding model and on how documents are split into chunks, so adopting a vector database does not by itself guarantee good search results.
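The similarity lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `TinyVectorStore` is a hypothetical class, the three-dimensional vectors are made up by hand rather than produced by an embedding model, and the search compares the query against every stored vector. Production vector databases generally use approximate nearest-neighbor indexes instead of this exhaustive scan.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """Hypothetical in-memory store: keeps (text, vector) pairs in a list."""

    def __init__(self):
        self._items = []

    def add(self, text, vector):
        self._items.append((text, list(vector)))

    def search(self, query_vector, k=2):
        # Brute force: score every stored vector, then keep the k best.
        scored = [
            (cosine_similarity(query_vector, vector), text)
            for text, vector in self._items
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


store = TinyVectorStore()
# Hand-made vectors (assumption): a real system would get these from an embedding model.
store.add("Refund policy", [0.9, 0.1, 0.0])
store.add("How to get your money back", [0.8, 0.2, 0.1])
store.add("Office opening hours", [0.0, 0.1, 0.9])

# Made-up embedding for a question such as "Can I return this for a refund?"
query = [0.85, 0.15, 0.05]
results = store.search(query, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

With these toy vectors, the two refund-related entries rank above the unrelated one even though their wording differs, which is the behavior the paragraphs above describe.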
✅ Why it matters
- Enables meaning-based search that works even when the wording differs
- It is a key part of building RAG systems and is essential for enterprise AI adoption
- It is used for similarity search across many kinds of data, such as text and images
⚠️ Limits and debates
- Search quality depends heavily on the embedding model and on how documents are split into chunks
- As data grows, storage and search costs become significant
- Conventional databases may be a better fit for exact keyword matches or condition-based filtering
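To make the point about document segmentation concrete, here is a minimal sketch of one common approach: fixed-size character chunks with overlap. The function name `chunk_text` and the default sizes are assumptions chosen for illustration, not recommended settings. Each chunk would be embedded and stored separately, so these parameters directly shape what a search can return.

```python
def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into fixed-size character chunks that overlap by `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if overlap < 0 or overlap >= chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks


document = (
    "Refunds are available within 30 days of purchase. "
    "To request one, contact support with your order number."
)
for index, chunk in enumerate(chunk_text(document)):
    print(index, repr(chunk))
```

Cutting at fixed character positions can split a sentence in the middle, which is one reason the segmentation method affects search quality; the overlap softens this by repeating the end of each chunk at the start of the next.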