This folder contains guides to help you explore all Gemini API features using complete end-to-end code examples.
Once you're comfortable with Gemini's capabilities, the examples folder is a good source of inspiration on how to combine them.
If you're new to the Gemini API, you should start with these two guides:
- Authentication: Start here to learn how to set up your API key so you can access the Gemini API.
- Get Started: Learn how to make your first calls to the Gemini API and get a quick overview of everything it can do.
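As a rough illustration of the API key setup that the Authentication guide covers, here is a minimal sketch that reads a key from an environment variable. The variable name `GEMINI_API_KEY` and the helper `load_api_key` are assumptions made for this example; follow the Authentication guide for the setup the SDK actually expects.

```python
import os


def load_api_key(env=None, name="GEMINI_API_KEY"):
    """Return the API key stored under `name`, or raise if it is missing.

    `name` is an assumed variable name used for illustration only.
    `env` defaults to the process environment; pass a dict to test.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable first.")
    return key


if __name__ == "__main__":
    try:
        load_api_key()
        # Never print the key itself; only report that it was found.
        print("API key found.")
    except RuntimeError as err:
        print(err)
```

Keeping the key in the environment rather than in source code avoids committing it to version control by accident.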
Then learn how to get started with the other models that you can use with the API:
- Veo: Get started with video generation using the Veo models.
- Imagen: Get started with our image generation model. (REST version).
- Thinking models: As their name implies, the thinking models are capable of deeper chains of thought than the classical models. This guide shows you how to use those thinking capabilities to solve complex problems.
- Lyria RealTime: The Lyria RealTime model lets you generate music and prompt the model in real time to have it mix the music for you live.
- Text-to-speech: The TTS models let you generate speech with one or even two speakers!
- More to come very soon...
There are multiple ways to call the models using the Gemini API. These Get Started guides show you the other ways:
- Get started with Live API: Get started with the Live API with this comprehensive overview of its capabilities.
- OpenAI compatibility: Did you know that you can use Gemini through the OpenAI SDK?
- More to come...
Finally, these guides will deep-dive into specific capabilities of the Gemini models and API:
- Grounding: Learn how to use different ways (Google Search, YouTube, URL context) to ground your answers with external sources.
- Search Grounding: Deep-dive into the Google Search grounding capabilities.
- Function Calling: Discover how to have Gemini call your own functions and enhance its capabilities.
- Image-out: Get to know how the Gemini model can directly output images and edit them through multi-turn discussion.
- Spatial understanding: Learn how to use Gemini's spatial understanding capabilities to detect what's in your images and reason about it.
- Video understanding: Learn how to use Gemini's video understanding capabilities to analyze what's in your videos.
- Get started with Live API tools: Now that you know everything about the Live API, go to the next level and learn how to use tools with it!
These guides will walk you through the various use cases of the Gemini API:
- Asynchronous requests: Learn how to use Python's async/await API with the Gemini SDK to parallelize calls.
- Counting Tokens: Tokens are the basic inputs to the Gemini models. Through this guide, learn how to count them.
- Models: Learn about the different models and parameters available in the Gemini API.
- Working with files: Use the Gemini API to upload files (audio, video, images, code, text) and perform actions with them through the Gemini models.
- Audio: Learn how to use the Gemini API with audio files.
- JSON mode: Discover how to use JSON mode.
- PDF files: Learn how to work with PDF files, and upload text and images.
- System Instructions: Give models additional context on how to respond by setting system instructions.
- Streaming: Learn how to use streaming for single interactions, and for chat.
- Embeddings: Create high-quality, task-specific embeddings.
- Tuning: Learn how to improve model performance on a specific task through tuning.
- Video: Upload a video to the Gemini API and use it in your prompt.
- AI Tutors with LearnLM: Demonstrates how to craft AI tutoring experiences using system instructions aligned with learning science principles.
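To make the idea behind the Asynchronous requests guide concrete, here is a minimal sketch of parallelizing calls with Python's async/await. It does not call the Gemini SDK: `fake_generate` is a hypothetical stand-in for an async model call, and the Asynchronous requests guide shows the real one.

```python
import asyncio


async def fake_generate(prompt: str) -> str:
    """Hypothetical stand-in for an async model call (not an SDK function)."""
    await asyncio.sleep(0.1)  # simulate network latency
    return f"response to: {prompt}"


async def generate_all(prompts: list[str]) -> list[str]:
    # gather() runs the coroutines concurrently and returns the results
    # in the same order as the inputs.
    return await asyncio.gather(*(fake_generate(p) for p in prompts))


if __name__ == "__main__":
    for result in asyncio.run(generate_all(["first", "second", "third"])):
        print(result)
```

Because the calls overlap while waiting, the three simulated requests finish in roughly the time of one rather than three in sequence.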