One of Vertex AI Gemini API features is the function calling

Function Calling in Gemini

The Vertex AI Gemini API, developed by Google DeepMind, offers a range of generative AI models suitable for various multimodal applications. One of its key features is function calling, which simplifies the developer’s process of obtaining organized data outputs from generative models. 

This enables developers to integrate their models with external systems, ensuring the generated content remains accurate and up-to-date by leveraging relevant data from other APIs.

Function Calling in Gemini
Function Calling in Gemini

What is function calling in Gemini?

Function declarations serve as a guide for generative models, clarifying the purpose and parameters of a function. 

  • When you include function declarations in a query, the model responds with a structured object containing the relevant function names and their corresponding arguments, tailored to the user’s query.
  • Importantly, the model doesn’t execute the function but provides the necessary information, enabling you to call the function using your preferred language, library, or framework.

The following diagram illustrates how function calling works:

How Function Calling Works
Function Calling in Gemini

Setup and requirements – Function Calling in Gemini

Before you can start using function calling in Gemini, you need to enable the Vertex AI API and install the latest version of the Vertex AI Python client library.

Enable Vertex AI API

To enable the Vertex AI API, follow these steps:

  1. In your browser, navigate to the Vertex AI API Service Details page.
  2. Click the Enable button to enable the Vertex AI API in your Google Cloud project.

How function calling works

Before we get started with parameter extraction and function calling, let’s walk through the steps of function calling and which components are used at runtime.

Working of Function Calling
Working of Function Calling

User input to Gemini API

  • When a user submits a prompt, it is sent to the Gemini API, which includes one or more function declarations defined by the developer within a tool. 
  • These declarations inform the Gemini model about the available functions it can invoke and the correct syntax for calling them, enabling the model to generate accurate and relevant responses.
User Input -> Gemini API -> Function Call

The Gemini API returns a Function Call

In response to the user’s input and prompt, Gemini generates a Function Call response that provides structured data, which includes:

  • The name of the relevant function to invoke
  • The corresponding parameters to use with that function
Gemini API -> Function Call

Make an API request

  • Next, you’ll utilize the function name and parameters provided by Gemini to make an API request to an external system or API, retrieving the necessary information. 
  • This process occurs outside the scope of the Gemini API and SDK and is implemented by the developer in the application code. 

For instance, you might employ the requests library in Python to call a REST API and receive a JSON response. Alternatively, you can use your preferred approach and client library to call the function and retrieve the required data.

API Parameters -> External API -> API Response

Return the API Response to Gemini

  • Lastly, you’ll feed the API response back into the Gemini model, enabling it to generate a response to the end user’s initial prompt or trigger another Function Call response if the model requires additional information to refine its response. 
  • This iterative process allows the Gemini model to leverage external data and engage in a dynamic conversation with the user, providing more accurate and informative responses.
API Response -> Function Response -> Model Output

Use cases of function calling

Use cases
Use cases

You can use function calling for the following tasks:

  • Extract entities from natural language stories: Extract lists of characters, relationships, things, and places from a story.
  • Query and understand SQL databases using natural language: Ask the model to convert questions such as What percentage of orders are returned? into SQL queries and create functions that submit these queries to BigQuery.
  • Help customers interact with businesses: Create functions that connect to a business API, allowing the model to provide accurate answers to queries such as Do you have the Pixel 8 Pro in stock? or Is there a store in Mountain View, CA that I can visit to try it out?

Build generative AI applications by connecting to public APIs, such as:

  • Convert between currencies: Create a function that connects to a currency exchange app, allowing the model to provide accurate answers to queries such as What's the exchange rate for euros to dollars today?
  • Get the weather for a given location: Create a function that connects to an API of a meteorological service, allowing the model to provide accurate answers to queries such as What's the weather like in Paris?
  • Convert an address to latitude and longitude coordinates: Create a function that converts structured location data into latitude and longitude coordinates. Ask the model to identify the street address, city, state, and postal code in queries such as I want to get the lat/lon coordinates for the following address: 1600 Amphitheatre Pkwy, Mountain View, CA 94043, US. 

Create a function calling application – Function Calling in Gemini

Gemini-API-AI-function-calling
Gemini-API-AI-function-calling

To enable a user to interface with the model and use function calling, you must create code that performs the following tasks:

Pricing

The pricing for function calling is based on the number of characters within the text inputs and outputs. 

To learn more, see Vertex AI pricing.

Valuable comments