Firestore Vector Search

You can now perform vector searches directly on transactional Firestore data, without duplicating it into a separate vector search solution. This preserves operational simplicity and efficiency.

This article shows how to use Cloud Firestore to perform efficient K-nearest neighbor (KNN) vector searches.

How does Firestore Vector Search work?

  • Vectors are mathematical objects that represent the magnitude and direction of a quantity. They can be used to represent data in a way that makes it easier to compare and search.
  • Embeddings are vectors that represent the meaning of a word or phrase. They are created by training a neural network on a large corpus of text and learning the relationships between words.
  • Vector databases are databases that are optimized for storing and searching vector data. They allow for efficient nearest neighbor search, which is the process of finding the most similar vectors to a given query vector.

Vector search works by comparing the query vector to all of the vectors in the database. The vectors that are most similar to the query vector are returned as the search results.
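
To make that comparison concrete, here is a minimal in-memory sketch of an exhaustive nearest-neighbor search in plain Python. It is not Firestore code: the `k_nearest` helper, the document IDs, and the choice of Euclidean distance are all assumptions made for illustration.

```python
import math

def k_nearest(query, vectors, k):
    """Return the k (distance, doc_id) pairs closest to `query`.

    `vectors` maps a document ID to its vector. Every stored vector is
    compared with the query, so the cost grows with the collection size.
    """
    scored = [(math.dist(query, vec), doc_id) for doc_id, vec in vectors.items()]
    return sorted(scored)[:k]

# Hypothetical three-document "database" of 2-dimensional vectors.
docs = {
    "espresso": [1.0, 0.0],
    "latte": [0.9, 0.1],
    "green-tea": [0.0, 1.0],
}

results = k_nearest([1.0, 0.05], docs, k=2)
print([doc_id for _, doc_id in results])  # ['espresso', 'latte']
```

The two coffee vectors sit close to the query, so they are returned first, while the tea vector points in a different direction and is left out.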

The similarity between two vectors can be measured with several distance metrics. A common choice is cosine similarity, which depends on the angle between the two vectors rather than on their magnitudes.
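
As a small illustration, cosine similarity can be computed by hand in a few lines of Python. This is a sketch of the standard formula (dot product divided by the product of the vector lengths), not the code Firestore runs internally. The function name is our own.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between a and b, in the range [-1, 1].

    Raises ZeroDivisionError if either vector has zero length.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Same direction, different magnitude: similarity is (approximately) 1.
print(cosine_similarity([1.0, 2.0], [2.0, 4.0]))
# Perpendicular vectors: similarity is 0.
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
# Opposite directions: similarity is -1.
print(cosine_similarity([1.0, 0.0], [-1.0, 0.0]))
```

Note that the first pair scores as a near-perfect match even though one vector is twice as long as the other, because only the direction matters.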

We can now:

  • Store vector values in Cloud Firestore
  • Create and manage KNN vector indexes for optimized querying
  • Execute KNN queries with a supported vector distance function to find the nearest neighbors of a given vector

Firestore introduced native integrations with popular orchestration frameworks like LangChain and LlamaIndex, making it seamless to utilize Firestore vector search. 

To further streamline your workflow, a new Firestore extension automatically computes vector embeddings and creates web services for effortless vector searches from web or mobile applications.


Vector Search with Firestore extension

  1. Click on the Extensions tab.
  2. Click on “Vector Search with Firestore extension”.
  3. Enable the required services.
  4. Configure the extension:

  • Select Vertex AI as the LLM
  • Collection path: notes
  • Default query limit: 3
  • Input field name: text
  • Output field name: embedding
  • Status field name: status
  • Embed existing documents: Yes
  • Update existing documents: Yes
  • Cloud Function location: us-central1

Write operation with a vector embedding

The following example shows how to store a vector embedding in a Cloud Firestore document:

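from google.cloud import firestore
from google.cloud.firestore_v1.vector import Vector

# Uses the project and credentials configured in the environment.
firestore_client = firestore.Client()
collection = firestore_client.collection("coffee-beans")

# Wrap the embedding in Vector so it is stored as a vector value.
doc = {
    "name": "Kahawa coffee beans",
    "description": "Information about the Kahawa coffee beans.",
    "embedding_field": Vector([1.0, 2.0, 3.0]),
}

# add() creates the document with an auto-generated ID.
collection.add(doc)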

Vector distances

Nearest-neighbor queries support the following options for vector distance:

  • EUCLIDEAN: Measures the Euclidean (straight-line) distance between the vectors. To learn more, see Euclidean.
  • COSINE: Compares vectors based on the angle between them, which lets you measure similarity that isn’t based on the vectors’ magnitude. We recommend using DOT_PRODUCT with unit normalized vectors instead of COSINE distance, which is mathematically equivalent and performs better. To learn more, see Cosine similarity.
  • DOT_PRODUCT: Similar to COSINE but is affected by the magnitude of the vectors. To learn more, see Dot product.
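
The relationship between COSINE and DOT_PRODUCT can be checked with a short plain-Python sketch. It assumes the common definition of cosine distance as one minus cosine similarity. The helper names and sample vectors are ours, and nothing here calls Firestore.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale v to unit length."""
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]

def cosine_distance(a, b):
    # Assumed definition: 1 - cosine similarity (smaller means more similar).
    return 1.0 - dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

query = [3.0, 4.0]
candidates = {"a": [6.0, 8.0], "b": [4.0, 3.0], "c": [30.0, 0.0]}

# Ranking by cosine distance, smallest first.
by_cosine = sorted(candidates, key=lambda k: cosine_distance(query, candidates[k]))

# Ranking by dot product of unit-normalized vectors, largest first.
unit_query = normalize(query)
by_unit_dot = sorted(
    candidates,
    key=lambda k: dot(unit_query, normalize(candidates[k])),
    reverse=True,
)

# Ranking by dot product of the raw vectors, largest first.
by_raw_dot = sorted(candidates, key=lambda k: dot(query, candidates[k]), reverse=True)

print(by_cosine)    # ['a', 'b', 'c']
print(by_unit_dot)  # ['a', 'b', 'c']
print(by_raw_dot)   # ['c', 'a', 'b']
```

Once the vectors are normalized, the dot product gives the same ordering as cosine distance. Without normalization, the long vector "c" jumps to the top purely because of its magnitude.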

Limitations

As you work with vector embeddings, note the following limitations:

  • Maximum Embedding Dimension: 2048. For larger embeddings, apply a dimensionality reduction technique before storing them.
  • Maximum Documents Returned: 1000 documents can be returned from a nearest-neighbor query.
  • Real-time Snapshot Listeners: Not supported in vector search.
  • Inequality Filters: These cannot be used to pre-filter data.
  • Supported Client Libraries: Only Python and Node.js client libraries support vector search.
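
The two numeric limits above are easy to check on the client before a write or a query is sent. The following is a minimal sketch in plain Python. The constant and function names are hypothetical and are not part of any Firestore client library.

```python
# Limits taken from the list above.
MAX_EMBEDDING_DIMENSION = 2048
MAX_QUERY_LIMIT = 1000

def check_embedding(values):
    """Return `values` unchanged, or raise if the embedding is too large to store."""
    if len(values) > MAX_EMBEDDING_DIMENSION:
        raise ValueError(
            f"embedding has {len(values)} dimensions; "
            f"the maximum is {MAX_EMBEDDING_DIMENSION}"
        )
    return values

def check_limit(limit):
    """Return `limit` unchanged, or raise if a query could not return that many documents."""
    if not 1 <= limit <= MAX_QUERY_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_QUERY_LIMIT}, got {limit}")
    return limit

check_embedding([0.0] * 768)  # passes
check_limit(3)                # passes
```

Failing early like this gives a clearer error message than waiting for the request to be rejected.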

Pricing

Firestore charges for the KNN vector index entries read while computing the query. Document reads are charged only for the documents returned as results.

For detailed pricing, please refer to the pricing page.
