Vertex

batchling is compatible with Vertex through any supported framework.

The following endpoints are made batch-compatible by Vertex:

  • /v1/projects/{project}/locations/{location}/publishers/google/models/{model}:generateContent
  • /v1beta1/projects/{project}/locations/{location}/publishers/google/models/{model}:generateContent

Check model support and batch pricing

Before sending batches, review the provider's official pricing page for supported models and batch pricing details.

The Batch API docs for Vertex can be found at the following URL:

https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/batch-prediction-gemini

Example Usage

Environment required

Vertex batches require VERTEX_PROJECT_ID, VERTEX_GCS_PREFIX, and either GEMINI_API_KEY or GOOGLE_API_KEY to be loaded in your environment before running batches.
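For reference, a minimal `.env` file covering these variables might look like the sketch below. All values shown are placeholders — substitute your own GCP project ID, GCS bucket path, and API key:

```shell
# Placeholder values — replace with your own project, bucket, and key
VERTEX_PROJECT_ID=my-gcp-project
VERTEX_GCS_PREFIX=gs://my-bucket/batchling
GEMINI_API_KEY=your-gemini-api-key
```

With `python-dotenv` installed, calling `load_dotenv()` (as in the example below) loads these into the process environment.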

Here's an example showing how to use batchling with Vertex:

vertex_example.py
import asyncio
import os

from dotenv import load_dotenv
from google import genai

from batchling import batchify

load_dotenv()


async def build_tasks() -> list:
    """Build Gemini requests."""
    client = genai.Client(
        vertexai=True,
        project=os.getenv("VERTEX_PROJECT_ID"),
        location="us-central1",
    ).aio
    questions = [
        "Who is the best French painter? Answer in one short sentence.",
        "What is the capital of France?",
    ]
    return [
        client.models.generate_content(
            model="gemini-2.5-flash-lite",
            contents=question,
        )
        for question in questions
    ]


async def main() -> None:
    """Run the Gemini example."""
    tasks = await build_tasks()
    responses = await asyncio.gather(*tasks)
    for response in responses:
        print(f"{response.model_version} answer:\n{response.text}\n")


async def run_with_batchify() -> None:
    """Run `main` inside `batchify` for direct script execution."""
    async with batchify(vertex_gcs_prefix=os.getenv("VERTEX_GCS_PREFIX")):
        await main()


if __name__ == "__main__":
    asyncio.run(run_with_batchify())