# Streaming Responses with Gemini
Get responses as they are generated, useful for real-time applications.
### TypeScript

```typescript
import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
const model = genAI.getGenerativeModel({ model: "gemini-2.0-flash" });

// Stream the response and print each chunk as it arrives
const result = await model.generateContentStream(
  "Tell a fairy tale about a wise owl."
);
for await (const chunk of result.stream) {
  process.stdout.write(chunk.text());
}
```
### Python

```python
import os

import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])

# Initialize the model
model = genai.GenerativeModel("gemini-2.0-flash")

# Request a streaming response and print chunks as they arrive
response = model.generate_content(
    "Tell a fairy tale about a wise owl.",
    stream=True,
)
for chunk in response:
    print(chunk.text, end="")
```
## Streaming Options

- `chunkSize`: size of each response chunk
- `timeout`: maximum time to wait for a chunk
## Best Practices for Streaming

- Implement proper error handling around the stream; a connection can drop mid-response
- Consider implementing a timeout so a stalled stream does not block indefinitely
- Handle partial responses appropriately, e.g. keep and use the text received so far
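The practices above can be combined into a small consumer that applies a per-chunk timeout and returns whatever partial text was received. This is a sketch, not an SDK feature: `fake_stream` and the `Chunk` class are stand-ins for a real Gemini streaming response (which similarly yields objects with a `.text` attribute), and the timeout is implemented here with `concurrent.futures` rather than any SDK option.

```python
import concurrent.futures
import time
from typing import Iterator


class Chunk:
    """Stand-in for an SDK response chunk (real chunks also expose .text)."""

    def __init__(self, text: str):
        self.text = text


def fake_stream() -> Iterator[Chunk]:
    """Simulated streaming response; replace with a real streaming call."""
    for part in ["Once upon ", "a time, ", "a wise owl..."]:
        time.sleep(0.01)  # simulate network latency between chunks
        yield Chunk(part)


def consume_stream(stream: Iterator[Chunk], chunk_timeout: float = 5.0) -> str:
    """Collect streamed text, giving up if any single chunk takes too long.

    Returns the partial text received before a timeout or error, so the
    caller can still use an incomplete response.
    """
    parts: list[str] = []
    it = iter(stream)
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        while True:
            # Fetch the next chunk in a worker thread so we can time it out;
            # next(it, None) yields None when the stream is exhausted.
            future = pool.submit(next, it, None)
            try:
                chunk = future.result(timeout=chunk_timeout)
            except concurrent.futures.TimeoutError:
                break  # stream stalled; fall through with partial text
            except Exception:
                break  # network/API error; fall through with partial text
            if chunk is None:
                break  # stream finished normally
            parts.append(chunk.text)
    return "".join(parts)


print(consume_stream(fake_stream(), chunk_timeout=2.0))
```

With a well-behaved stream this prints the full story text; if a chunk stalls past `chunk_timeout`, the partial text collected so far is returned instead of raising.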