How we implemented vector search in Postgres Pro

Original author: LesnoyChelovek

Imagine you go to an online store to buy a pair of sneakers. You open the description of a model you like, and the site immediately suggests similar items — and they really are similar. How does this work?

The answer is both simple and complex: it's vector search — one of the most promising technologies changing the way we work with information.

From words and numbers to vectors

Traditional databases excel at searching for exact values: numbers, dates, strings. But what if you need to find not exact matches, but semantically similar objects?

Let's imagine we have two words: "automobile" and "car." For a human, it's obvious they are synonyms. But for a computer, they are just different sets of characters. With vector representation, every such word, or even an entire text, image, product, movie, or user, can be represented as a set of numbers — a vector.

Each number in such a vector encodes one characteristic of the object, so the vector as a whole is a point in a multi-dimensional feature space. The closer two vectors are to each other, the more similar the objects they represent.
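As a toy illustration, here is what comparing two such vectors could look like with the pgpro_vector operators introduced later in this article. The two-dimensional vectors and the bracketed literal syntax are assumptions made for the sake of the example; check the ganntype documentation for the exact text format:

    -- Assumed literal syntax: the smaller the distance, the more similar
    -- the objects. The vectors are invented 2-D stand-ins for real embeddings.
    SELECT '[0.90, 0.10]'::ganntype(2) <=> '[0.85, 0.12]'::ganntype(2) AS close_pair,
           '[0.90, 0.10]'::ganntype(2) <=> '[0.10, 0.95]'::ganntype(2) AS distant_pair;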

Why is this needed?

To understand how important vector search is in real life, let's look at a few specific scenarios.

  • Semantic search. A standard keyword search loses a lot of context. For example, the query "how to fix an automobile" would ignore an article titled "DIY car repair." Vector search, however, understands the meaning and finds relevant materials, even if the words don't match literally.

  • Recommendation systems. Services like Kinopoisk, Dzen, and VK successfully use vector search to recommend content. They compare user interests and content characteristics to offer the most suitable recommendations.

  • Computer vision and image recognition. For example, you upload a photo of a nice T-shirt to a store's app, and it immediately suggests similar models from the catalog. This also works thanks to the vector representation of images.

  • Generative AI and RAG (Retrieval-Augmented Generation). Modern chatbots (like ChatGPT + RAG) use vector search to quickly find relevant information before generating a response, making it more accurate and informative.
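For instance, the retrieval step of a RAG pipeline often boils down to a single nearest-neighbor query. A minimal sketch, assuming a hypothetical documents table with precomputed embeddings and the question's embedding passed in as the $1 parameter:

    -- Hypothetical RAG retrieval: fetch the passages whose embeddings are
    -- closest to the embedded user question, then hand them to the LLM.
    SELECT passage
    FROM documents
    ORDER BY embedding <=> $1  -- $1: the question's embedding, computed client-side
    LIMIT 5;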

How does vector search work?

The process of finding the nearest neighbors (Nearest Neighbor Search) in a massive dataset of vectors is a complex task. The naive approach, which involves iterating through and comparing all vectors, is inefficient and requires huge computational resources.

This is why special Approximate Nearest Neighbor (ANN) search algorithms are used, particularly HNSW (Hierarchical Navigable Small World). They allow for finding close vectors quickly by using a hierarchical data structure and approximate calculations, sacrificing a little accuracy for a significant gain in speed.

How ANN Works

Imagine you have millions or even billions of vectors. For instance, a vector for every product in a huge marketplace or for every article on the internet. If, for every search query (e.g., the vector of a product a user is looking at), you had to calculate the distance to every other vector in the database to find the absolute closest match, it would be incredibly slow. This exact search (Exact Nearest Neighbor, ENN) simply doesn't scale for real-world tasks.

This is where Approximate Nearest Neighbors (ANN) algorithms come into play. Their main idea is to find very close neighbors, perhaps without a 100% guarantee that the found neighbor is the absolute closest, but doing so orders of magnitude faster. For most practical applications (recommendations, semantic search), the small probability of missing the perfect match is more than compensated for by the colossal gain in speed and the ability to process huge volumes of data in real-time.
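The difference is easy to see on the word_embeddings table from the walkthrough below: without a vector index, an ORDER BY ... LIMIT query degenerates into a sequential scan that computes every distance, while with an HNSW index the same query is answered approximately through the index. A sketch (the exact plan depends on your data and settings):

    -- Exact (ENN) if no vector index exists: sequential scan over all rows.
    -- Approximate (ANN) once an HNSW index is in place: the planner uses it
    -- for the ORDER BY ... LIMIT pattern. EXPLAIN shows which plan you got.
    EXPLAIN ANALYZE
    SELECT word
    FROM word_embeddings
    ORDER BY embedding <=> (SELECT embedding FROM word_embeddings WHERE word = 'king')
    LIMIT 10;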

How HNSW Works

One of the most popular and effective ANN algorithms today is HNSW (Hierarchical Navigable Small World). Its trick lies in building a special multi-layer graph. Imagine several layers: on the very top layer there are very few "nodes" (vectors), but the connections between them are "long," allowing for rapid movement between distant regions of the vector space. With each subsequent level down, the number of nodes increases and the connections become shorter, providing more detailed navigation in a local area. When a search query (a new vector) arrives, the search begins on the topmost, sparse layer.

The algorithm quickly finds the nearest node on this level and then "descends" to the level below, continuing the search in the vicinity of the found node. This process is repeated, allowing it to efficiently "zoom in" on the target, skipping huge parts of the space where there are certainly no close vectors. Thanks to this hierarchical structure and "navigation in a small world," HNSW achieves an excellent balance between search speed and the accuracy of finding truly close neighbors. It is this algorithm, adapted to work with disks and filters, that forms the basis of pgpro_vector indexes.

Vector search inside Postgres Pro: pgpro_vector

Previously, implementing vector search required using separate, specialized databases like FAISS, Milvus, or Qdrant. This meant additional infrastructure costs, integration complexity, and maintenance overhead.

Now, the pgpro_vector extension is available in Postgres Pro, adding powerful vector search directly into the familiar PostgreSQL environment. This extension:

  • Implements the HNSW algorithm.

  • Allows creating special indexes for fast nearest neighbor search.

  • Supports working with filters and multi-column conditions.

A simple example of using pgpro_vector

Let's walk through a simple example of using pgpro_vector, based on word embeddings such as GloVe; it is a good way to get a feel for the extension's capabilities.

Before creating the table and index, you need to install the necessary extensions. It is important to note an architectural feature of pgpro_vector: the ganntype data type is moved to a separate extension. This ensures that your vector data in tables remains safe, even if you decide to modify or delete the extensions that implement specific search algorithms (e.g., gannhnsw). This improves data storage reliability.

  1. Installing extensions

    -- First, install the extension with the base data type
    CREATE EXTENSION ganntype;
    -- Then, install the main gann extension (required for index methods)
    CREATE EXTENSION gann;
    -- Next, the extensions for specific HNSW index types
    CREATE EXTENSION gannhnsw CASCADE;
    CREATE EXTENSION hnswstream CASCADE;
    CREATE EXTENSION mc_hnsw CASCADE;
  2. Creating a table

    CREATE TABLE word_embeddings (
        id BIGSERIAL PRIMARY KEY,
        word TEXT UNIQUE,
        embedding ganntype(50) -- 50 dimensions for the GloVe example
    );
  3. Generating and importing vectors. Generate word vectors using GloVe (or any other model) and load them into the table. The documentation includes a Python script for importing from a vectors.txt file.
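    If you would rather stay in SQL, a bulk load via COPY can serve the same purpose. A minimal sketch, assuming the vectors have been pre-converted to a CSV file whose second column uses ganntype's text representation (check the documentation for the exact literal format):

    -- Hypothetical alternative to the Python script: bulk-load word/vector
    -- pairs from a prepared CSV file.
    -- (Use \copy in psql instead to load a client-side file.)
    COPY word_embeddings (word, embedding)
    FROM '/tmp/vectors.csv'
    WITH (FORMAT csv);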

  4. Creating an index. Choose a suitable index type and metric. Cosine distance is often used for finding similar words. If filtering is needed, hnsw_stream or mc_hnsw are good choices.

    -- Example of an hnsw_stream index for cosine distance
    CREATE INDEX idx_word_embeddings_hnsw_stream_cos ON word_embeddings
    USING gann (embedding hnsw_stream_a) -- hnsw_stream_a for cosine
    WITH (M = 16, ef_construction = 100); -- M: links per node; ef_construction: build-time search width
  5. Searching. Perform a nearest neighbor search using SQL.

    -- Find the 10 words most similar to 'king'
    -- (note: 'king' itself comes back first with distance 0; add
    -- WHERE word <> 'king' to exclude it)
    SELECT word, embedding <=> (SELECT embedding FROM word_embeddings WHERE word = 'king') AS distance
    FROM word_embeddings
    ORDER BY embedding <=> (SELECT embedding FROM word_embeddings WHERE word = 'king')
    LIMIT 10;
    -- Example "king - man + woman = queen"
    SELECT word, embedding <=> (
        (SELECT embedding FROM word_embeddings WHERE word = 'king')
        - (SELECT embedding FROM word_embeddings WHERE word = 'man')
        + (SELECT embedding FROM word_embeddings WHERE word = 'woman')
    ) AS distance
    FROM word_embeddings
    ORDER BY distance
    LIMIT 10;

    This classic example of vector arithmetic shows how semantic relationships can be identified. In the search results, the word 'queen' will be among the closest vectors. If 'queen' isn't in the top spot, it's often due to the word's polysemy in the training corpus (e.g., 'queen' as a monarch, a music band, or a chess piece). Nevertheless, its appearance in the top results clearly demonstrates how vector analogies work.

  6. Don't forget to set the gann.hnsw_stream.efsearch parameter (or the equivalent for other index types) to control search quality before running queries:

    SET gann.hnsw_stream.efsearch = 100; -- Higher is more accurate, but slower

Technical details: different scenarios and different indexes

pgpro_vector provides three index types, optimized for different tasks:

  • gannhnsw — maximally fast search without filtering. Suitable for tasks like image recognition.

  • hnsw_stream — allows using WHERE conditions and returning an unlimited number of results. A good choice for product recommendations with filtering by category (see the sketch after this list).

  • mc_hnsw (multi-column) — a multi-column index, ideal for searching vector data with additional attributes. For example, finding similar users of a specific age and region.
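As an illustration of the filtered case, here is a sketch of a category-constrained search. The products table, its columns, and the reference item are invented for this example; the operator usage mirrors the word-embeddings queries above:

    -- Hypothetical filtered nearest-neighbor search with an hnsw_stream index:
    -- only rows matching the WHERE clause are returned, ordered by distance.
    SELECT name
    FROM products
    WHERE category = 'sneakers'
    ORDER BY embedding <=> (SELECT embedding FROM products WHERE name = 'reference-model')
    LIMIT 10;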

This makes pgpro_vector a versatile tool for a wide variety of tasks.

What to Pay Attention To

Like any technology, vector search has its nuances:

  • ANN search is approximate. The higher the accuracy you want, the lower the search speed might be. This is a delicate balance that needs to be tuned for the specific task.

  • On large datasets, creating and using indexes requires significant amounts of RAM. If there isn't enough memory, indexes are split into parts (clusters), which can reduce query speed.

  • The higher the dimensionality of the vectors, the more resources are needed to process them. For high-dimensional vectors (e.g., >1000), a binary quantization method is used to reduce their size and speed up the search.

Why vector search is the future

The main challenge for businesses today is to find relevant data quickly and efficiently. Vector search solves this problem by moving from simple character matching to understanding the meaning of the data.

The pgpro_vector extension makes this technology available to every Postgres Pro user right now. You no longer need separate, complex systems: all the power of vector search can be implemented directly in your familiar and reliable DBMS.

By integrating vector search into your projects, you can:

  • Increase the relevance of search results.

  • Improve the quality of recommendations.

  • Make the user experience more personal and enjoyable.
