Embedding
An embedding is a numerical representation of text (or an image) that lets AI systems measure semantic similarity.
Also known as: vector embedding, text embedding
An embedding is a numerical representation — a vector of hundreds or thousands of numbers — that captures the semantic meaning of a piece of text or an image. Two pieces of text with similar meaning get similar embedding vectors, even when they use different words. Embeddings are the foundation of RAG systems: when you ask a question, the question is converted to an embedding, which is then used to find documents with similar embeddings in a vector database. This is why modern AI search can find relevant documents even when none of the search terms appear in the document itself.
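The comparison step can be sketched with cosine similarity, the measure most vector databases use. The vectors below are tiny hypothetical stand-ins for real model outputs, which typically have hundreds or thousands of dimensions; only the similarity function itself is standard.

```python
import math

def cosine_similarity(a, b):
    """Similarity of two embedding vectors: 1.0 = same direction, 0.0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional vectors standing in for real embeddings.
query = [0.9, 0.1, 0.0, 0.2]          # e.g. "How do I reset my password?"
doc_similar = [0.8, 0.2, 0.1, 0.3]    # e.g. "Password recovery guide"
doc_unrelated = [0.0, 0.9, 0.8, 0.0]  # e.g. "Quarterly sales report"

# The semantically related document scores higher, even though the
# texts need not share any words.
print(cosine_similarity(query, doc_similar))    # high (close to 1.0)
print(cosine_similarity(query, doc_unrelated))  # low (close to 0.0)
```

A RAG retrieval step is essentially this comparison run between the query vector and every stored document vector, with the top-scoring documents returned.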
Related terms
- RAG (Retrieval-Augmented Generation) — RAG is a technique in which a language model answers questions using the business's own documents rather than relying only on its general training data.
- Vector database — A vector database is a database optimised for storing and searching embeddings — the foundation of RAG systems and AI search.
- LLM (Large Language Model) — An LLM is a large language model trained on enormous text volumes that can generate, summarise, and analyse text in a human-like way.