GeminiClient

The GeminiClient is an alternative to the default OpenAI backend and connects to Google’s Gemini API using the Google Generative AI SDK. It supports:

  • Full chat history handling
  • Tool/function calling
  • Configurable model selection

⚠️ Note: Summarization and token management are not yet implemented.


Requirements

You must set the GEMINI_API_KEY environment variable:

export GEMINI_API_KEY="your-secret-key"

The key is not configured in YAML for security reasons.
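
Internally, the client is expected to read the key from the environment at startup; a minimal sketch of that lookup (the error message is illustrative, not the client's actual wording):

import os

api_key = os.environ.get("GEMINI_API_KEY")
if not api_key:
    raise RuntimeError("GEMINI_API_KEY is not set; export it before starting the client.")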


Configuration

To enable the GeminiClient, override the AI client config in your config.yml:

ai_client:
  client: gemini
  config:
    model: "gemini-1.5-flash"
    system_content: "You are a helpful assistant."

Field explanation:

  • client: Set this to "gemini" to activate the Gemini backend.
  • model: Must be a valid Gemini model identifier. Examples:
    • "gemini-1.0-pro"
    • "gemini-1.5-pro"
    • "gemini-1.5-flash"
  • system_content: Initial system message injected into the context. Because Gemini's chat API has no explicit system role, this is currently sent as a plain message.
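
For illustration, here is one way such a config block could be wired up. The import path and the GeminiClient constructor signature below are assumptions, not the project's actual API:

import yaml  # PyYAML

# Parse config.yml and pull out the ai_client section.
with open("config.yml") as f:
    ai_cfg = yaml.safe_load(f)["ai_client"]

if ai_cfg["client"] == "gemini":
    # Hypothetical constructor -- the real signature lives in the project source.
    from myproject.clients import GeminiClient  # assumed import path
    client = GeminiClient(
        model=ai_cfg["config"]["model"],                   # e.g. "gemini-1.5-flash"
        system_content=ai_cfg["config"]["system_content"],
    )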

How it works

GeminiClient wraps a persistent chat session created with genai.Client().chats.create() and sends both user messages and function results through send_message().
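
Stripped of the surrounding class, the underlying pattern with the google-genai package looks roughly like this (model name and prompt are placeholders):

import os
from google import genai

# One client, one long-lived chat session.
client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
chat = client.chats.create(model="gemini-1.5-flash")

# Plain user text goes straight through send_message().
response = chat.send_message("Hello! What can you help me with?")
print(response.text)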

Each outgoing message contains either:

  • plain user text, or
  • a function result (structured as a Part), which the model interprets as the output of a previously requested tool call.

Function calling is handled via a tool carrying function_declarations, and responses are checked for pending calls through response.function_calls.
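
A round-trip sketch of that flow with the google-genai SDK; the get_weather tool and its result are invented for illustration:

import os
from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

# Declare one callable function for the model.
weather_tool = types.Tool(function_declarations=[
    types.FunctionDeclaration(
        name="get_weather",
        description="Look up the current weather for a city.",
        parameters=types.Schema(
            type=types.Type.OBJECT,
            properties={"city": types.Schema(type=types.Type.STRING)},
            required=["city"],
        ),
    )
])

chat = client.chats.create(
    model="gemini-1.5-flash",
    config=types.GenerateContentConfig(tools=[weather_tool]),
)

response = chat.send_message("What's the weather in Oslo?")
if response.function_calls:
    call = response.function_calls[0]           # .name and .args describe the request
    result = {"temp_c": 7, "sky": "overcast"}   # run the real function here
    # Feed the tool output back as a function-response Part.
    response = chat.send_message(
        types.Part.from_function_response(name=call.name, response=result)
    )
print(response.text)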


Features

  • Persistent chat thread
  • Function calling via tool integration
  • JSON-based memory export via store_memory()
  • Basic set_plan() support (sends the plan as a system-style message; see the sketch below)
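
A hypothetical usage sketch for these two helpers; the source only names the methods, so the argument, return type, and file name below are assumptions:

# Assumed signatures -- purely illustrative; check the project source.
client.set_plan("1. Collect logs. 2. Summarize failures.")  # delivered as a system-style message

memory = client.store_memory()          # assumed to return the chat history as JSON
with open("memory.json", "w") as f:
    f.write(memory)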

Limitations:

  • No token tracking or summarization
  • Limited support for anyOf types in schemas (e.g., Optional[int] is collapsed to plain int)
  • Requires schema post-processing to strip unsupported fields such as title, default, and certain union types (see the sketch below)
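
For illustration, a minimal post-processor in that spirit; it follows plain JSON Schema conventions, and the exact set of keys the real client strips may differ:

def sanitize_schema(schema):
    """Recursively drop fields Gemini rejects and collapse Optional-style
    anyOf unions (e.g. Optional[int] -> int)."""
    if isinstance(schema, list):
        return [sanitize_schema(item) for item in schema]
    if not isinstance(schema, dict):
        return schema
    # Drop keys Gemini's schema format does not accept.
    schema = {k: v for k, v in schema.items() if k not in ("title", "default")}
    # Collapse anyOf of the form [<type>, {"type": "null"}] to the bare type.
    variants = [v for v in schema.get("anyOf", []) if v.get("type") != "null"]
    if len(variants) == 1:
        schema.pop("anyOf")
        schema.update(variants[0])
    return {k: sanitize_schema(v) for k, v in schema.items()}

In practice, a pass like this would run on each tool's parameter schema before it is handed to function_declarations.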