Gemini

The Gemini component is an AI component that allows users to connect to Google's Gemini multimodal AI models. It can carry out the following tasks:

- Chat
- Cache

Release Stage

Alpha

Configuration

The component definition and tasks are defined in the definition.yaml and tasks.yaml files respectively.

Setup

In order to communicate with Google, the following connection details need to be provided. You may specify them directly in a pipeline recipe as key-value pairs within the component's setup block, or you can create a Connection from the Integration Settings page and reference the whole setup as setup: ${connection.<my-connection-id>}.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| API Key | `api-key` | string | Fill in your Gemini API key. To find your keys, visit the Gemini API Keys page. |
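
For example, a minimal sketch of the setup block in a recipe might look like the following; `gemini-api-key` and `my-gemini-connection` are placeholder names, not values defined by the component:

component:
  gemini:
    type: gemini
    task: TASK_CHAT
    # Option 1: key-value pairs directly in the recipe
    setup:
      api-key: ${secret.gemini-api-key} # placeholder secret name
    # Option 2 (instead of the block above): reference a saved Connection
    # setup: ${connection.my-gemini-connection}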

Supported Tasks

Chat

Gemini's multimodal models understand text and images. They generate text outputs in response to prompts that can include text and images. The inputs to these models are also referred to as "prompts". Designing a prompt is how you guide the model, usually by providing instructions or examples to successfully complete a task.

| Input | Field ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_CHAT` |
| Model (required) | `model` | string | ID of the model to use. The value is one of the following: `gemini-2.5-pro`: Optimized for enhanced thinking and reasoning, multimodal understanding, advanced coding, and more. `gemini-2.5-flash`: Optimized for adaptive thinking, cost efficiency. `gemini-2.5-flash-lite`: Optimized for most cost-efficient model supporting high throughput. `gemini-2.5-flash-image-preview`: Optimized for precise, conversational image generation and editing. Enum values: `gemini-2.5-pro`, `gemini-2.5-flash`, `gemini-2.5-flash-lite`, `gemini-2.5-flash-image-preview`. |
| Stream | `stream` | boolean | Whether to incrementally stream the response using server-sent events (SSE). |
| Prompt | `prompt` | string | The main text instruction or query for the model. |
| Images | `images` | array[string] | URI references or base64 content of input images. |
| Audio | `audio` | array[string] | URI references or base64 content of input audio. |
| Videos | `videos` | array[string] | URI references or base64 content of input videos. |
| Documents | `documents` | array[string] | URI references or base64 content of input documents. Different vendors might have different constraints on the document format. For example, Gemini supports only PDF. |
| System Message | `system-message` | string | Instruction to set the assistant's behavior, tone, or persona. Different vendors might name this field differently. |
| Chat History | `chat-history` | array[object] | Conversation history; each message includes a role and content. |
| Max Output Tokens | `max-output-tokens` | integer | The maximum number of tokens to generate in the model output. |
| Temperature | `temperature` | number | A parameter that controls the randomness and creativity of a large language model's output by adjusting the probability of the next word it chooses. A low temperature (e.g., near 0) produces more deterministic, focused, and consistent text, while a high temperature (e.g., near 1) leads to more creative, random, and varied output. |
| Top-P | `top-p` | number | A parameter, also known as nucleus sampling, that controls the randomness and creativity of the generated text by selecting a dynamic subset of tokens. It works by sorting all possible next tokens by their probability, and then summing their probabilities from highest to lowest until the cumulative sum reaches the specified top-p value (a number between 0 and 1). The model then randomly selects the next token only from this "nucleus" of high-probability tokens. A higher top-p value creates a larger, more diverse set of possible words, leading to more creative and potentially unpredictable output, while a lower top-p value restricts the choice to a smaller, more focused set of highly probable words, resulting in more factual and conservative output. |
| Top-K | `top-k` | integer | A text generation parameter that limits the selection of the next token to the K most probable tokens, discarding the rest to control randomness and maintain coherence. By specifying a fixed number of top tokens, top-k acts as a "safety net," preventing nonsensical choices, but a small K can also stifle creativity and lead to repetitive outputs. It is often used in conjunction with other parameters like temperature and top-p to fine-tune the LLM's output. Note that OpenAI and Mistral models don't have top-k exposed. |
| Seed | `seed` | integer | A random seed used to control the stochasticity of text generation to produce repeatable outputs. |
| Contents | `contents` | array[object] | The input contents to the model. Each item represents a user or model turn composed of parts (text or images). |
| Tools | `tools` | array[object] | Tools available to the model, e.g., function declarations. |
| Tool Config | `tool-config` | object | Configuration for tool usage and function calling. |
| Safety Settings | `safety-settings` | array[object] | Safety settings for content filtering. |
| System Instruction | `system-instruction` | object | A system instruction to guide the model behavior. |
| Generation Config | `generation-config` | object | Generation configuration for the request. |
| Cached Content | `cached-content` | string | The name of a cached content to use as context. Format: `cachedContents/{cachedContent}`. |
Input Objects in Chat

Chat History

Conversation history; each message includes a role and content.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Parts | `parts` | array | Parts of the content. |
| Role | `role` | string | The producer of the content. Must be either 'user' or 'model'. Useful to set for multi-turn conversations, otherwise can be left blank or unset. Optional. The value is one of the following: USER: User content. MODEL: Model content. Enum values: `USER`, `MODEL`. |
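
As a sketch built only from the fields documented above, a two-turn history could be passed as:

chat-history:
  - role: USER
    parts:
      - text: What is the capital of France?
  - role: MODEL
    parts:
      - text: The capital of France is Paris.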

Parts

Parts of the content.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Thought | `thought` | boolean | Indicates if the part is a thought from the model. |
| Thought Signature | `thought-signature` | string | Opaque signature for the thought (base64-encoded bytes). |
| Video Metadata | `video-metadata` | object | Optional video metadata (only with blob or fileData video content). |

Video Metadata

Optional video metadata (only with blob or fileData video content).

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| End Offset | `end-offset` | string | The end offset of the video (duration string, e.g. "3.5s"). |
| FPS | `fps` | number | Frame rate of the video sent to the model. Range (0.0, 24.0]. |
| Start Offset | `start-offset` | string | The start offset of the video (duration string, e.g. "3.5s"). |

Contents

The input contents to the model. Each item represents a user or model turn composed of parts (text or images).

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Parts | `parts` | array | Parts of the content. |
| Role | `role` | string | The producer of the content. Must be either 'user' or 'model'. Useful to set for multi-turn conversations, otherwise can be left blank or unset. Optional. The value is one of the following: USER: User content. MODEL: Model content. Enum values: `USER`, `MODEL`. |

Parts

Parts of the content.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Thought | `thought` | boolean | Indicates if the part is a thought from the model. |
| Thought Signature | `thought-signature` | string | Opaque signature for the thought (base64-encoded bytes). |
| Video Metadata | `video-metadata` | object | Optional video metadata (only with blob or fileData video content). |

Video Metadata

Optional video metadata (only with blob or fileData video content).

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| End Offset | `end-offset` | string | The end offset of the video (duration string, e.g. "3.5s"). |
| FPS | `fps` | number | Frame rate of the video sent to the model. Range (0.0, 24.0]. |
| Start Offset | `start-offset` | string | The start offset of the video (duration string, e.g. "3.5s"). |

Tools

Tools available to the model, e.g., function declarations.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Code Execution | `code-execution` | object | Tool that executes code generated by the model, and automatically returns the result to the model. |
| Function Declarations | `function-declarations` | array | Functions the model may call. |
| Google Search | `google-search` | object | GoogleSearch tool type. Tool to support Google Search in Model. Powered by Google. |
| Google Search Retrieval | `google-search-retrieval` | object | Tool to retrieve public web data for grounding, powered by Google. |
| URL Context | `url-context` | object | Tool to support URL context retrieval. |

Function Declarations

Functions the model may call.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Description | `description` | string | A brief description of the function. |
| Name | `name` | string | The name of the function to call. |
| Parameters | `parameters` | object | Describes the parameters to this function. Reflects the Open API 3.03 Parameter Object. string Key: the name of the parameter. Parameter names are case sensitive. Schema Value: the Schema defining the type used for the parameter. |

Parameters

Describes the parameters to this function. Reflects the Open API 3.03 Parameter Object string Key: the name of the parameter. Parameter names are case sensitive. Schema Value: the Schema defining the type used for the parameter.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Any Of | `anyOf` | array | Value must satisfy any of the sub-schemas. |
| Default | `default` | object | Default value for the field (ignored for validation). |
| Description | `description` | string | Optional description of the schema. |
| Enum | `enum` | array | Enum values for STRING with enum format. |
| Format | `format` | string | Optional format of the data. |
| Items | `items` | object | Schema of elements for ARRAY type. |
| Max Items | `max-items` | integer | Maximum number of elements for ARRAY type. |
| Max Length | `max-length` | integer | Maximum length for STRING type. |
| Max Properties | `max-properties` | integer | Maximum number of properties for OBJECT type. |
| Maximum | `maximum` | number | Maximum value for INTEGER/NUMBER types. |
| Min Items | `min-items` | integer | Minimum number of elements for ARRAY type. |
| Min Length | `min-length` | integer | Minimum length for STRING type. |
| Min Properties | `min-properties` | integer | Minimum number of properties for OBJECT type. |
| Minimum | `minimum` | number | Minimum value for INTEGER/NUMBER types. |
| Nullable | `nullable` | boolean | Indicates if the value may be null. |
| Pattern | `pattern` | string | Regex pattern constraint for STRING type. |
| Properties | `properties` | object | Properties for OBJECT type. |
| Property Ordering | `property-ordering` | array | Order of properties for OBJECT type (non-standard). |
| Required | `required` | array | Required properties for OBJECT type. |
| Title | `title` | string | Optional title of the schema. |
| Type | `type` | string | Required data type of the schema. Enum values: `TYPE_UNSPECIFIED`, `STRING`, `NUMBER`, `INTEGER`, `BOOLEAN`, `ARRAY`, `OBJECT`. |
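
Putting the two tables above together, a hypothetical `get_weather` function could be declared as sketched below; the function name, description, and property names are illustrative and not part of the component:

tools:
  - function-declarations:
      - name: get_weather
        description: Look up the current weather for a city.
        parameters:
          type: OBJECT
          properties:
            city:
              type: STRING
              description: Name of the city to look up.
          required:
            - city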

Google Search Retrieval

Tool to retrieve public web data for grounding, powered by Google.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Dynamic Retrieval Config | `dynamic-retrieval-config` | object | Specifies the dynamic retrieval configuration for the given source. |

Dynamic Retrieval Config

Specifies the dynamic retrieval configuration for the given source.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Dynamic Threshold | `dynamic-threshold` | number | The threshold to be used in dynamic retrieval. If not set, a system default value is used. |
| Mode | `mode` | string | The mode of the predictor to be used in dynamic retrieval. The value is one of the following: MODE_UNSPECIFIED: Always trigger retrieval. MODE_DYNAMIC: Run retrieval only when system decides it is necessary. Enum values: `MODE_UNSPECIFIED`, `MODE_DYNAMIC`. |

Google Search

GoogleSearch tool type. Tool to support Google Search in Model. Powered by Google.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Time Range Filter | `time-range-filter` | object | Filter search results to a specific time range. If customers set a start time, they must set an end time (and vice versa). |

Time Range Filter

Filter search results to a specific time range. If customers set a start time, they must set an end time (and vice versa).

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| End Time | `end-time` | string | Exclusive end of the interval. If specified, a Timestamp matching this interval will have to be before the end. Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30". |
| Start Time | `start-time` | string | Inclusive start of the interval. If specified, a Timestamp matching this interval will have to be the same or after the start. Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30". |

Tool Config

Configuration for tool usage and function calling.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Function Calling Config | `function-calling-config` | object | Configuration for specifying function calling behavior. |

Function Calling Config

Configuration for specifying function calling behavior.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Allowed Function Names | `allowed-function-names` | array | A set of function names that, when provided, limits the functions the model will call. This should only be set when the Mode is ANY or VALIDATED. Function names should match [FunctionDeclaration.name]. When set, model will predict a function call from only allowed function names. |
| Mode | `mode` | string | Specifies the mode in which function calling should execute. If unspecified, the default value will be set to AUTO. The value is one of the following: MODE_UNSPECIFIED: Unspecified function calling mode. This value should not be used. AUTO: Default model behavior, model decides to predict either a function call or a natural language response. ANY: Model is constrained to always predicting a function call only. If "allowedFunctionNames" are set, the predicted function call will be limited to any one of "allowedFunctionNames", else the predicted function call will be any one of the provided "functionDeclarations". NONE: Model will not predict any function call. Model behavior is same as when not passing any function declarations. VALIDATED: Model decides to predict either a function call or a natural language response, but will validate function calls with constrained decoding. If "allowedFunctionNames" are set, the predicted function call will be limited to any one of "allowedFunctionNames", else the predicted function call will be any one of the provided "functionDeclarations". Enum values: `MODE_UNSPECIFIED`, `AUTO`, `ANY`, `NONE`. |
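
For instance, to force the model to call only the hypothetical `get_weather` function declared earlier, the tool config could be sketched as:

tool-config:
  function-calling-config:
    mode: ANY
    allowed-function-names:
      - get_weather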

Safety Settings

Safety settings for content filtering.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Harm Category | `category` | string | The category of a rating for safety. The value is one of the following: HARM_CATEGORY_UNSPECIFIED: Category is unspecified. HARM_CATEGORY_DEROGATORY: PaLM - Negative or harmful comments targeting identity and/or protected attribute. HARM_CATEGORY_TOXICITY: PaLM - Content that is rude, disrespectful, or profane. HARM_CATEGORY_VIOLENCE: PaLM - Describes scenarios depicting violence against an individual or group, or general descriptions of gore. HARM_CATEGORY_SEXUAL: PaLM - Contains references to sexual acts or other lewd content. HARM_CATEGORY_MEDICAL: PaLM - Promotes unchecked medical advice. HARM_CATEGORY_DANGEROUS: PaLM - Dangerous content that promotes, facilitates, or encourages harmful acts. HARM_CATEGORY_HARASSMENT: Gemini - Harassment content. HARM_CATEGORY_HATE_SPEECH: Gemini - Hate speech and content. HARM_CATEGORY_SEXUALLY_EXPLICIT: Gemini - Sexually explicit content. HARM_CATEGORY_DANGEROUS_CONTENT: Gemini - Dangerous content. HARM_CATEGORY_CIVIC_INTEGRITY: Gemini - Content that may be used to harm civic integrity. DEPRECATED: use enableEnhancedCivicAnswers instead. Enum values: `HARM_CATEGORY_UNSPECIFIED`, `HARM_CATEGORY_DEROGATORY`, `HARM_CATEGORY_TOXICITY`, `HARM_CATEGORY_VIOLENCE`, `HARM_CATEGORY_SEXUAL`, `HARM_CATEGORY_MEDICAL`, `HARM_CATEGORY_DANGEROUS`, `HARM_CATEGORY_HARASSMENT`, `HARM_CATEGORY_HATE_SPEECH`, `HARM_CATEGORY_SEXUALLY_EXPLICIT`, `HARM_CATEGORY_DANGEROUS_CONTENT`. |
| Harm Block Threshold | `threshold` | string | Block at and beyond a specified harm probability. The value is one of the following: HARM_BLOCK_THRESHOLD_UNSPECIFIED: Threshold is unspecified. BLOCK_LOW_AND_ABOVE: Content with NEGLIGIBLE will be allowed. BLOCK_MEDIUM_AND_ABOVE: Content with NEGLIGIBLE and LOW will be allowed. BLOCK_ONLY_HIGH: Content with NEGLIGIBLE, LOW, and MEDIUM will be allowed. BLOCK_NONE: All content will be allowed. OFF: Turn off the safety filter. Enum values: `HARM_BLOCK_THRESHOLD_UNSPECIFIED`, `BLOCK_LOW_AND_ABOVE`, `BLOCK_MEDIUM_AND_ABOVE`, `BLOCK_ONLY_HIGH`, `BLOCK_NONE`, `OFF`. |
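
A sketch using only the category and threshold field IDs above (the chosen categories and thresholds are illustrative):

safety-settings:
  - category: HARM_CATEGORY_HARASSMENT
    threshold: BLOCK_MEDIUM_AND_ABOVE
  - category: HARM_CATEGORY_DANGEROUS_CONTENT
    threshold: BLOCK_ONLY_HIGH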

System Instruction

A system instruction to guide the model behavior.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Parts | `parts` | array | Parts of the content. |
| Role | `role` | string | The producer of the content. Must be either 'user' or 'model'. Useful to set for multi-turn conversations, otherwise can be left blank or unset. Optional. The value is one of the following: USER: User content. MODEL: Model content. Enum values: `USER`, `MODEL`. |

Parts

Parts of the content.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Thought | `thought` | boolean | Indicates if the part is a thought from the model. |
| Thought Signature | `thought-signature` | string | Opaque signature for the thought (base64-encoded bytes). |
| Video Metadata | `video-metadata` | object | Optional video metadata (only with blob or fileData video content). |

Video Metadata

Optional video metadata (only with blob or fileData video content).

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| End Offset | `end-offset` | string | The end offset of the video (duration string, e.g. "3.5s"). |
| FPS | `fps` | number | Frame rate of the video sent to the model. Range (0.0, 24.0]. |
| Start Offset | `start-offset` | string | The start offset of the video (duration string, e.g. "3.5s"). |

Generation Config

Generation configuration for the request.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Candidate Count | `candidate-count` | integer | Number of candidates to generate. |
| Enable Enhanced Civic Answers | `enable-enhanced-civic-answers` | boolean | Enables enhanced civic answers. |
| Frequency Penalty | `frequency-penalty` | number | Frequency penalty applied proportional to the number of times a token has been seen. |
| Logprobs | `logprobs` | integer | Number of top logprobs to return at each decoding step (1-5). Only valid if `response-logprobs` is true. |
| Max Output Tokens | `max-output-tokens` | integer | The maximum number of tokens to generate in the response. |
| Media Resolution | `media-resolution` | string | Media resolution for multimodal generation. Controls how many tokens are budgeted for media understanding and reframing. The value is one of the following: MEDIA_RESOLUTION_UNSPECIFIED: Media resolution has not been set. MEDIA_RESOLUTION_LOW: Media resolution set to low (64 tokens). MEDIA_RESOLUTION_MEDIUM: Media resolution set to medium (256 tokens). MEDIA_RESOLUTION_HIGH: Media resolution set to high (zoomed reframing with 256 tokens). Enum values: `MEDIA_RESOLUTION_UNSPECIFIED`, `MEDIA_RESOLUTION_LOW`, `MEDIA_RESOLUTION_MEDIUM`, `MEDIA_RESOLUTION_HIGH`. |
| Presence Penalty | `presence-penalty` | number | Presence penalty applied to next-token logprobs if token already seen. |
| Response Logprobs | `response-logprobs` | boolean | If true, export the logprobs results in response. |
| Response MIME Type | `response-mime-type` | string | Desired response MIME type (e.g., application/json for JSON mode). |
| Response Modalities | `response-modalities` | array | Requested modalities of the response. Empty means text only. |
| Response Schema | `response-schema` | object | JSON Schema to constrain the response when using JSON mode. |
| Seed | `seed` | integer | Seed used in decoding. |
| Speech Config | `speech-config` | object | Speech generation configuration. |
| Stop Sequences | `stop-sequences` | array | List of sequences that will stop further token generation. |
| Temperature | `temperature` | number | Sampling temperature, controls randomness. |
| Thinking Config | `thinking-config` | object | Config for thinking features. |
| Top K | `top-k` | number | Top-k sampling cutoff. |
| Top P | `top-p` | number | Nucleus sampling probability mass. |
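
As an illustrative sketch, a generation-config combining several of the scalar options above (values chosen arbitrarily):

generation-config:
  temperature: 0.7
  top-p: 0.95
  max-output-tokens: 1024
  seed: 42
  stop-sequences:
    - "END"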

Response Schema

JSON Schema to constrain the response when using JSON mode.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Any Of | `anyOf` | array | Value must satisfy any of the sub-schemas. |
| Default | `default` | object | Default value for the field (ignored for validation). |
| Description | `description` | string | Optional description of the schema. |
| Enum | `enum` | array | Enum values for STRING with enum format. |
| Format | `format` | string | Optional format of the data. |
| Items | `items` | object | Schema of elements for ARRAY type. |
| Max Items | `max-items` | integer | Maximum number of elements for ARRAY type. |
| Max Length | `max-length` | integer | Maximum length for STRING type. |
| Max Properties | `max-properties` | integer | Maximum number of properties for OBJECT type. |
| Maximum | `maximum` | number | Maximum value for INTEGER/NUMBER types. |
| Min Items | `min-items` | integer | Minimum number of elements for ARRAY type. |
| Min Length | `min-length` | integer | Minimum length for STRING type. |
| Min Properties | `min-properties` | integer | Minimum number of properties for OBJECT type. |
| Minimum | `minimum` | number | Minimum value for INTEGER/NUMBER types. |
| Nullable | `nullable` | boolean | Indicates if the value may be null. |
| Pattern | `pattern` | string | Regex pattern constraint for STRING type. |
| Properties | `properties` | object | Properties for OBJECT type. |
| Property Ordering | `property-ordering` | array | Order of properties for OBJECT type (non-standard). |
| Required | `required` | array | Required properties for OBJECT type. |
| Title | `title` | string | Optional title of the schema. |
| Type | `type` | string | Required data type of the schema. Enum values: `TYPE_UNSPECIFIED`, `STRING`, `NUMBER`, `INTEGER`, `BOOLEAN`, `ARRAY`, `OBJECT`. |
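
For JSON mode, response-mime-type and response-schema can be combined. The sketch below constrains the output to a hypothetical object with two properties; the property names are illustrative:

generation-config:
  response-mime-type: application/json
  response-schema:
    type: OBJECT
    properties:
      title:
        type: STRING
      rating:
        type: INTEGER
    required:
      - title
      - rating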

Speech Config

Speech generation configuration.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Language Code | `language-code` | string | Language code (BCP 47) for speech synthesis. Enum values: `de-DE`, `en-AU`, `en-GB`, `en-IN`, `en-US`, `es-US`, `fr-FR`, `hi-IN`, `pt-BR`, `ar-XA`, `es-ES`, `fr-CA`, `id-ID`, `it-IT`, `ja-JP`, `tr-TR`, `vi-VN`, `bn-IN`, `gu-IN`, `kn-IN`, `ml-IN`, `mr-IN`, `ta-IN`, `te-IN`, `nl-NL`, `ko-KR`, `cmn-CN`, `pl-PL`, `ru-RU`, `th-TH`. |
| Multi Speaker Voice Config | `multi-speaker-voice-config` | object | Configuration for the multi-speaker setup. Mutually exclusive with `voice-config`. |
| Voice Config | `voice-config` | | Configuration for the voice to use. Union type. |

Multi Speaker Voice Config

Configuration for the multi-speaker setup. Mutually exclusive with voice-config.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Speaker Voice Configs | `speaker-voice-configs` | array | All the enabled speaker voices. |

Speaker Voice Configs

All the enabled speaker voices.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Speaker | `speaker` | string | The name of the speaker to use. Should match the name used in the prompt. |
| Voice Config | `voice-config` | | Configuration for the voice to use. Union type. |

Thinking Config

Config for thinking features.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Include Thoughts | `include-thoughts` | boolean | Whether to include thoughts in the response when available. |
| Thinking Budget | `thinking-budget` | integer | The number of thought tokens the model should generate. |
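
Thinking options slot into the same generation-config block, for example (the budget value is illustrative):

generation-config:
  thinking-config:
    include-thoughts: true
    thinking-budget: 1024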
The parts Object

Parts

parts must fulfill one of the following schemas:

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Text | `text` | string | Inline text content. |

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Blob | `blob` | object | Raw media bytes. Text should use the 'text' field instead. |

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Function Call | `function-call` | object | Predicted function call with name and arguments. |

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Function Response | `function-response` | object | Result of a function call with name and structured response. |

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| File Data | `file-data` | object | URI-based data reference with MIME type. |

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Executable Code | `executable-code` | object | Code generated by the model that is meant to be executed. |

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Code Execution Result | `code-execution-result` | object | Result of executing the ExecutableCode. |
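
As a sketch, a single contents turn mixing a text part with a file reference might look like the following; note that the nested keys of file-data (mime-type, file-uri) are an assumption based on the Gemini API's FileData object and should be checked against tasks.yaml:

contents:
  - role: USER
    parts:
      - text: Describe this diagram.
      - file-data:                                    # URI-based data reference with MIME type
          mime-type: image/png                        # assumed key name
          file-uri: https://example.com/diagram.png   # assumed key name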
| Output | Field ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Texts (optional) | `texts` | array[string] | Simplified text output extracted from candidates. Each string represents the concatenated text content from the corresponding candidate's parts, including thought processes when include-thoughts is enabled. This field provides easy access to the generated text without needing to traverse the candidate structure. Updated in real-time during streaming. |
| Images (optional) | `images` | array[image/webp] | Images output extracted and converted from candidates. This field provides easy access to the generated images as base64-encoded strings. The original binary data is removed from the candidates field to prevent raw binary exposure in JSON output. This field is only available when the model supports image generation. |
| Usage (optional) | `usage` | object | Token usage statistics: prompt tokens, completion tokens, total tokens, etc. |
| Candidates (optional) | `candidates` | array[object] | Complete candidate objects from the model containing rich metadata and structured content. Each candidate includes safety ratings, finish reason, token counts, citations, content parts (including thought processes when include-thoughts is enabled), and other detailed information. This provides full access to all response data beyond just text. Updated incrementally during streaming with accumulated content and latest metadata. |
| Usage Metadata (optional) | `usage-metadata` | object | Metadata on the generation request's token usage. |
| Prompt Feedback (optional) | `prompt-feedback` | object | Feedback on the prompt including any safety blocking information. |
| Model Version (optional) | `model-version` | string | The model version used to generate the response. |
| Response ID (optional) | `response-id` | string | Identifier for this response. |
Output Objects in Chat

Candidates

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Average Logprobs | `avg-logprobs` | number | Average log probability score of the candidate. |
| Citation Metadata | `citation-metadata` | object | Citation metadata for generated content, listing sources. |
| Content | `content` | object | Base structured datatype with producer role and ordered parts. |
| Finish Reason | `finish-reason` | string | Reason why the model stopped generating for a candidate. The value is one of the following: FINISH_REASON_UNSPECIFIED: Default value. This value is unused. STOP: Natural stop point of the model or provided stop sequence. MAX_TOKENS: The maximum number of tokens as specified in the request was reached. SAFETY: The response candidate content was flagged for safety reasons. RECITATION: The response candidate content was flagged for recitation reasons. LANGUAGE: The response candidate content was flagged for using an unsupported language. OTHER: Unknown reason. BLOCKLIST: Token generation stopped because the content contains forbidden terms. PROHIBITED_CONTENT: Token generation stopped for potentially containing prohibited content. SPII: Token generation stopped because the content potentially contains Sensitive Personally Identifiable Information (SPII). MALFORMED_FUNCTION_CALL: The function call generated by the model is invalid. IMAGE_SAFETY: Token generation stopped because generated images contain safety violations. UNEXPECTED_TOOL_CALL: Model generated a tool call but no tools were enabled in the request. TOO_MANY_TOOL_CALLS: Model called too many tools consecutively, thus the system exited execution. Enum values: `FINISH_REASON_UNSPECIFIED`, `STOP`, `MAX_TOKENS`, `SAFETY`, `RECITATION`, `LANGUAGE`, `OTHER`, `BLOCKLIST`, `PROHIBITED_CONTENT`, `SPII`, `MALFORMED_FUNCTION_CALL`, `IMAGE_SAFETY`, `UNEXPECTED_TOOL_CALL`, `TOO_MANY_TOOL_CALLS`. |
| Grounding Attributions | `grounding-attributions` | array | Attribution information for sources that contributed to a grounded answer. |
| Grounding Metadata | `grounding-metadata` | object | Metadata returned to client when grounding is enabled. |
| Index | `index` | integer | Position of the candidate in the returned list. |
| Logprobs Result | `logprobs-result` | object | Log probabilities for generated tokens. |
| Safety Ratings | `safety-ratings` | array | Safety ratings applied to this candidate. |
| Token Count | `token-count` | integer | Token count for this candidate. |
| URL Context Metadata | `url-context-metadata` | object | Metadata related to URL context retrieval tool. |

Content

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Parts | `parts` | array | Parts of the content. |
| Role | `role` | string | The producer of the content. Must be either 'user' or 'model'. Useful to set for multi-turn conversations, otherwise can be left blank or unset. Optional. The value is one of the following: USER: User content. MODEL: Model content. Enum values: `USER`, `MODEL`. |

Parts

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Thought | `thought` | boolean | Indicates if the part is a thought from the model. |
| Thought Signature | `thought-signature` | string | Opaque signature for the thought (base64-encoded bytes). |
| Video Metadata | `video-metadata` | object | Optional video metadata (only with blob or fileData video content). |

Video Metadata

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| End Offset | `end-offset` | string | The end offset of the video (duration string, e.g. "3.5s"). |
| FPS | `fps` | number | Frame rate of the video sent to the model. Range (0.0, 24.0]. |
| Start Offset | `start-offset` | string | The start offset of the video (duration string, e.g. "3.5s"). |

Safety Ratings

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Blocked | `blocked` | boolean | Whether the content was blocked by this rating. |
| Harm Category | `category` | string | Harm category. |
| Probability | `probability` | string | Probability level of harm. |

Citation Metadata

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Citations | `citations` | array | Citations to sources for a specific response. |

Citations

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| End Index | `end-index` | integer | Optional. End of the attributed segment, exclusive. |
| License | `license` | string | Optional. License for the GitHub project that is attributed as a source for segment. License info is required for code citations. |
| Start Index | `start-index` | integer | Optional. Start of segment of the response that is attributed to this source. Index indicates the start of the segment, measured in bytes. |
| URI | `uri` | string | Optional. URI that is attributed as a source for a portion of the text. |

Grounding Attributions

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Content | `content` | object | Grounding source content that makes up this attribution. |
| Source ID | `source-id` | object | Identifier for the source contributing to this attribution. |

Content

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Parts | `parts` | array | Parts of the content. |
| Role | `role` | string | The producer of the content. Must be either 'user' or 'model'. Useful to set for multi-turn conversations, otherwise can be left blank or unset. Optional. The value is one of the following: USER: User content. MODEL: Model content. Enum values: `USER`, `MODEL`. |

Parts

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Thought | `thought` | boolean | Indicates if the part is a thought from the model. |
| Thought Signature | `thought-signature` | string | Opaque signature for the thought (base64-encoded bytes). |
| Video Metadata | `video-metadata` | object | Optional video metadata (only with blob or fileData video content). |

Video Metadata

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| End Offset | `end-offset` | string | The end offset of the video (duration string, e.g. "3.5s"). |
| FPS | `fps` | number | Frame rate of the video sent to the model. Range (0.0, 24.0]. |
| Start Offset | `start-offset` | string | The start offset of the video (duration string, e.g. "3.5s"). |

Source ID

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Grounding Passage ID | `grounding-passage` | object | Identifier for an inline passage. |
| Semantic Retriever Chunk | `semantic-retriever-chunk` | object | Identifier for a Chunk fetched via Semantic Retriever. |

Grounding Passage ID

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Part Index | `part-index` | integer | Index of the part within the GroundingPassage.content. |
| Passage ID | `passage-id` | string | ID of the passage matching the request's GroundingPassage.id. |

Semantic Retriever Chunk

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Chunk | `chunk` | string | Name of the Chunk containing the attributed text. |
| Source | `source` | string | Name of the Semantic Retriever source (e.g. corpora/123). |

Logprobs Result

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Top Candidates | `top-candidates` | array | Length = total number of decoding steps. |

Top Candidates

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Log Probability | `logprob` | number | The candidate's log probability. |
| Token | `token` | string | The candidate's token string value. |
| Token ID | `token-id` | integer | The candidate's token id value. |

URL Context Metadata

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| URL Metadata | `url-metadata` | array | List of URL context. |

URL Metadata

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Retrieved URL | `retrieved-url` | string | URL retrieved by the tool. |
| URL Retrieval Status | `url-retrieval-status` | string | Retrieval status for URL-based context. The value is one of the following: URL_RETRIEVAL_STATUS_UNSPECIFIED: Default value. This value is unused. URL_RETRIEVAL_STATUS_SUCCESS: URL retrieval was successful. URL_RETRIEVAL_STATUS_ERROR: URL retrieval failed due to an error. URL_RETRIEVAL_STATUS_PAYWALL: URL retrieval failed because the content is behind a paywall. URL_RETRIEVAL_STATUS_UNSAFE: URL retrieval failed because the content is unsafe. Enum values: `URL_RETRIEVAL_STATUS_UNSPECIFIED`, `URL_RETRIEVAL_STATUS_SUCCESS`, `URL_RETRIEVAL_STATUS_ERROR`, `URL_RETRIEVAL_STATUS_PAYWALL`, `URL_RETRIEVAL_STATUS_UNSAFE`. |

Grounding Metadata

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Grounding Chunks | `grounding-chunks` | array | Supporting references retrieved from grounding source. |
| Grounding Supports | `grounding-supports` | array | List of grounding support. |
| Retrieval Metadata | `retrieval-metadata` | object | Retrieval metadata for grounding flow. |
| Search Entry Point | `search-entry-point` | object | Google search entry for follow-up web searches. |
| Web Search Queries | `web-search-queries` | array | Web search queries for follow-up search. |

Grounding Chunks

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Web | `web` | object | Web grounding chunk. |

Web

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Title | `title` | string | Title of the chunk. |
| URI | `uri` | string | URI reference of the chunk. |

Grounding Supports

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Confidence Scores | `confidence-scores` | array | Confidence scores aligned with groundingChunkIndices. |
| Grounding Chunk Indices | `grounding-chunk-indices` | array | Indices into groundingChunks that support the claim. |
| Segment | `segment` | object | Segment of the content. |

Segment

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| End Index | `end-index` | integer | End byte offset in the Part (exclusive). |
| Part Index | `part-index` | integer | Index of the Part in its parent Content. |
| Start Index | `start-index` | integer | Start byte offset in the Part (inclusive). |
| Text | `text` | string | Text of the segment. |

Search Entry Point

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Rendered Content | `rendered-content` | string | Web content snippet suitable for embedding. |
| SDK Blob | `sdk-blob` | string | Base64-encoded JSON of <search term, search url> tuples. |

Retrieval Metadata

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Google Search Dynamic Retrieval Score | `google-search-dynamic-retrieval-score` | number | Likelihood [0,1] that Google Search could help answer the prompt. |

Usage Metadata

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Cache Tokens Details | `cache-tokens-details` | array | Output only. List of modalities of the cached content in the request input. |
| Cached Content Token Count | `cached-content-token-count` | integer | Number of tokens in the cached part of the prompt (the cached content). |
| Candidates Token Count | `candidates-token-count` | integer | Total number of tokens across all the generated response candidates. |
| Candidates Tokens Details | `candidates-tokens-details` | array | List of modalities that were returned in the response. |
| Prompt Token Count | `prompt-token-count` | integer | Number of tokens in the prompt. When cachedContent is set, this is still the total effective prompt size, meaning this includes the number of tokens in the cached content. |
| Prompt Tokens Details | `prompt-tokens-details` | array | List of modalities that were processed in the request input. |
| Thoughts Token Count | `thoughts-token-count` | integer | Number of tokens of thoughts for thinking models. |
| Tool-use Prompt Token Count | `tool-use-prompt-token-count` | integer | Number of tokens present in tool-use prompt(s). |
| Tool-use Prompt Tokens Details | `tool-use-prompt-tokens-details` | array | List of modalities that were processed for tool-use request inputs. |
| Total Token Count | `total-token-count` | integer | Total token count for the generation request (prompt + response candidates). |

Prompt Tokens Details

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Modality | `modality` | string | Content Part modality. Indicates the media type of a content part. The value is one of the following: MODALITY_UNSPECIFIED: Unspecified modality. TEXT: Plain text. IMAGE: Image. VIDEO: Video. AUDIO: Audio. DOCUMENT: Document, e.g. PDF. Enum values: `MODALITY_UNSPECIFIED`, `TEXT`, `IMAGE`, `VIDEO`, `AUDIO`, `DOCUMENT`. |
| Token Count | `token-count` | integer | Number of tokens. |

Cache Tokens Details

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Modality | `modality` | string | Content Part modality. Indicates the media type of a content part. The value is one of the following: MODALITY_UNSPECIFIED: Unspecified modality. TEXT: Plain text. IMAGE: Image. VIDEO: Video. AUDIO: Audio. DOCUMENT: Document, e.g. PDF. Enum values: `MODALITY_UNSPECIFIED`, `TEXT`, `IMAGE`, `VIDEO`, `AUDIO`, `DOCUMENT`. |
| Token Count | `token-count` | integer | Number of tokens. |

Candidates Tokens Details

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Modality | `modality` | string | Content Part modality. Indicates the media type of a content part. The value is one of the following: MODALITY_UNSPECIFIED: Unspecified modality. TEXT: Plain text. IMAGE: Image. VIDEO: Video. AUDIO: Audio. DOCUMENT: Document, e.g. PDF. Enum values: `MODALITY_UNSPECIFIED`, `TEXT`, `IMAGE`, `VIDEO`, `AUDIO`, `DOCUMENT`. |
| Token Count | `token-count` | integer | Number of tokens. |

Tool Use Prompt Tokens Details

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Modality | `modality` | string | Content Part modality. Indicates the media type of a content part. The value is one of the following: MODALITY_UNSPECIFIED: Unspecified modality. TEXT: Plain text. IMAGE: Image. VIDEO: Video. AUDIO: Audio. DOCUMENT: Document, e.g. PDF. Enum values: `MODALITY_UNSPECIFIED`, `TEXT`, `IMAGE`, `VIDEO`, `AUDIO`, `DOCUMENT`. |
| Token Count | `token-count` | integer | Number of tokens. |

Prompt Feedback

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Block Reason | `block-reason` | string | Specifies the reason why the prompt was blocked. The value is one of the following: BLOCK_REASON_UNSPECIFIED: Default value. This value is unused. SAFETY: Prompt was blocked due to safety reasons. Inspect safetyRatings to understand which safety category blocked it. OTHER: Prompt was blocked due to unknown reasons. BLOCKLIST: Prompt was blocked due to the terms which are included from the terminology blocklist. PROHIBITED_CONTENT: Prompt was blocked due to prohibited content. IMAGE_SAFETY: Candidates blocked due to unsafe image generation content. Enum values: `BLOCK_REASON_UNSPECIFIED`, `SAFETY`, `OTHER`, `BLOCKLIST`, `PROHIBITED_CONTENT`, `IMAGE_SAFETY`. |
| Safety Ratings | `safety-ratings` | array | Safety rating for a piece of content. The safety rating contains the category of harm and the harm probability level in that category for a piece of content. Content is classified for safety across a number of harm categories and the probability of the harm classification is included here. |

Safety Ratings

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Blocked | `blocked` | boolean | Whether the content was blocked by this rating. |
| Harm Category | `category` | string | Harm category. |
| Probability | `probability` | string | Probability level of harm. |

Cache

Context caching allows you to cache input tokens and reference them in subsequent requests, reducing costs and improving performance for repetitive large contexts. This task supports creating, listing, getting, updating, and deleting cached content with proper time-to-live (TTL) management. The minimum input token count for context caching is 1,024 for 2.5 Flash and 4,096 for 2.5 Pro models.

| Input | Field ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_CACHE` |
| Operation (required) | `operation` | string | The cache operation to perform. The value is one of the following: create: Create a new cached content. list: List all cached contents. get: Retrieve a specific cached content. update: Update an existing cached content (only expiration time can be updated). delete: Delete a cached content. Enum values: `create`, `list`, `get`, `update`, `delete`. |
| Model (required) | `model` | string | ID of the model to use for caching. Required for create operations. The model is immutable after creation. The value is one of the following: `gemini-2.5-pro`: Optimized for enhanced thinking and reasoning, multimodal understanding, advanced coding, and more. `gemini-2.5-flash`: Optimized for adaptive thinking, cost efficiency. `gemini-2.0-flash-lite`: Optimized for most cost-efficient model supporting high throughput. Enum values: `gemini-2.5-pro`, `gemini-2.5-flash`, `gemini-2.0-flash-lite`. |
| Cache Name | `cache-name` | string | [GET, UPDATE, DELETE] The name of the cached content for get, update, or delete operations. Format: `cachedContents/{cachedContent}`. Required for get, update, and delete operations. |
| Prompt | `prompt` | string | [CREATE] The main text instruction or query to be cached for create operations. |
| Images | `images` | array[string] | [CREATE] URI references or base64 content of input images to be cached for create operations. |
| Audio | `audio` | array[string] | [CREATE] URI references or base64 content of input audio to be cached for create operations. |
| Videos | `videos` | array[string] | [CREATE] URI references or base64 content of input videos to be cached for create operations. |
| Documents | `documents` | array[string] | [CREATE] URI references or base64 content of input documents to be cached for create operations. Different vendors might have different constraints on the document format. For example, Gemini supports only PDF. |
| System Message | `system-message` | string | [CREATE] A system message to guide model behavior for create operations. Takes precedence over `system-instruction`. |
| Display Name | `display-name` | string | [CREATE] Optional. The user-provided name of the cached content for create operations. |
| System Instruction | `system-instruction` | object | [CREATE] Optional. A system instruction to guide the model behavior for create operations. |
| Contents | `contents` | array[object] | The input contents to cache for create operations. Each item represents a user or model turn composed of parts (text or images). This is the main content that will be cached for reuse in subsequent requests. |
| Tools | `tools` | array[object] | [CREATE] Optional. Tools available to the model for create operations, e.g., function declarations. |
| Tool Config | `tool-config` | object | Configuration for tool usage and function calling. |
| TTL | `ttl` | string | [CREATE, UPDATE] Time to live duration for the cached content in Google Duration format. A duration in seconds with up to nine fractional digits, ending with 's'. Example: "3.5s". Must be at least 60 seconds. Maximum is 7 days (604800 seconds). |
| Expire Time | `expire-time` | string | [CREATE, UPDATE] Absolute expiration time for the cached content in RFC3339 format. Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30". |
| Page Size | `page-size` | integer | [LIST] Optional. The maximum number of cached contents to return for list operations. Default is 50. |
| Page Token | `page-token` | string | [LIST] Optional. A page token from a previous list operation for pagination. |
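
A sketch of a create operation that caches a large document for one hour; the component ID, variable name, and display name are illustrative:

component:
  gemini-cache:
    type: gemini
    task: TASK_CACHE
    input:
      operation: create
      model: gemini-2.5-flash
      system-message: You are an expert on the attached manual.
      documents:
        - ${variable.manual:base64}
      display-name: manual-cache
      ttl: 3600s
    setup: ${connection.gemini}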
Input Objects in Cache

System Instruction

[CREATE] Optional. A system instruction to guide the model behavior for create operations.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Parts | `parts` | array | Parts of the content. |
| Role | `role` | string | The producer of the content. Must be either 'user' or 'model'. Useful to set for multi-turn conversations, otherwise can be left blank or unset. Optional. The value is one of the following: USER: User content. MODEL: Model content. Enum values: `USER`, `MODEL`. |

Parts

Parts of the content.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Thought | `thought` | boolean | Indicates if the part is a thought from the model. |
| Thought Signature | `thought-signature` | string | Opaque signature for the thought (base64-encoded bytes). |
| Video Metadata | `video-metadata` | object | Optional video metadata (only with blob or fileData video content). |

Video Metadata

Optional video metadata (only with blob or fileData video content).

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| End Offset | `end-offset` | string | The end offset of the video (duration string, e.g. "3.5s"). |
| FPS | `fps` | number | Frame rate of the video sent to the model. Range (0.0, 24.0]. |
| Start Offset | `start-offset` | string | The start offset of the video (duration string, e.g. "3.5s"). |

Contents

The input contents to cache for create operations. Each item represents a user or model turn composed of parts (text or images). This is the main content that will be cached for reuse in subsequent requests.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Parts | `parts` | array | Parts of the content. |
| Role | `role` | string | The producer of the content. Must be either 'user' or 'model'. Useful to set for multi-turn conversations, otherwise can be left blank or unset. Optional. The value is one of the following: USER: User content. MODEL: Model content. Enum values: `USER`, `MODEL`. |

Parts

Parts of the content.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Thought | `thought` | boolean | Indicates if the part is a thought from the model. |
| Thought Signature | `thought-signature` | string | Opaque signature for the thought (base64-encoded bytes). |
| Video Metadata | `video-metadata` | object | Optional video metadata (only with blob or fileData video content). |

Video Metadata

Optional video metadata (only with blob or fileData video content).

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| End Offset | `end-offset` | string | The end offset of the video (duration string, e.g. "3.5s"). |
| FPS | `fps` | number | Frame rate of the video sent to the model. Range (0.0, 24.0]. |
| Start Offset | `start-offset` | string | The start offset of the video (duration string, e.g. "3.5s"). |

Tools

[CREATE] Optional. Tools available to the model for create operations, e.g., function declarations.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Code Execution | `code-execution` | object | Tool that executes code generated by the model, and automatically returns the result to the model. |
| Function Declarations | `function-declarations` | array | Functions the model may call. |
| Google Search | `google-search` | object | GoogleSearch tool type. Tool to support Google Search in Model. Powered by Google. |
| Google Search Retrieval | `google-search-retrieval` | object | Tool to retrieve public web data for grounding, powered by Google. |
| URL Context | `url-context` | object | Tool to support URL context retrieval. |

Function Declarations

Functions the model may call.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Description | `description` | string | A brief description of the function. |
| Name | `name` | string | The name of the function to call. |
| Parameters | `parameters` | object | Describes the parameters to this function. Reflects the Open API 3.03 Parameter Object. string Key: the name of the parameter. Parameter names are case sensitive. Schema Value: the Schema defining the type used for the parameter. |

Parameters

Describes the parameters to this function. Reflects the Open API 3.03 Parameter Object string Key: the name of the parameter. Parameter names are case sensitive. Schema Value: the Schema defining the type used for the parameter.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Any Of | `anyOf` | array | Value must satisfy any of the sub-schemas. |
| Default | `default` | object | Default value for the field (ignored for validation). |
| Description | `description` | string | Optional description of the schema. |
| Enum | `enum` | array | Enum values for STRING with enum format. |
| Format | `format` | string | Optional format of the data. |
| Items | `items` | object | Schema of elements for ARRAY type. |
| Max Items | `max-items` | integer | Maximum number of elements for ARRAY type. |
| Max Length | `max-length` | integer | Maximum length for STRING type. |
| Max Properties | `max-properties` | integer | Maximum number of properties for OBJECT type. |
| Maximum | `maximum` | number | Maximum value for INTEGER/NUMBER types. |
| Min Items | `min-items` | integer | Minimum number of elements for ARRAY type. |
| Min Length | `min-length` | integer | Minimum length for STRING type. |
| Min Properties | `min-properties` | integer | Minimum number of properties for OBJECT type. |
| Minimum | `minimum` | number | Minimum value for INTEGER/NUMBER types. |
| Nullable | `nullable` | boolean | Indicates if the value may be null. |
| Pattern | `pattern` | string | Regex pattern constraint for STRING type. |
| Properties | `properties` | object | Properties for OBJECT type. |
| Property Ordering | `property-ordering` | array | Order of properties for OBJECT type (non-standard). |
| Required | `required` | array | Required properties for OBJECT type. |
| Title | `title` | string | Optional title of the schema. |
| Type | `type` | string | Required data type of the schema. Enum values: `TYPE_UNSPECIFIED`, `STRING`, `NUMBER`, `INTEGER`, `BOOLEAN`, `ARRAY`, `OBJECT`. |

Google Search Retrieval

Tool to retrieve public web data for grounding, powered by Google.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Dynamic Retrieval Config | `dynamic-retrieval-config` | object | Specifies the dynamic retrieval configuration for the given source. |

Dynamic Retrieval Config

Specifies the dynamic retrieval configuration for the given source.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Dynamic Threshold | `dynamic-threshold` | number | The threshold to be used in dynamic retrieval. If not set, a system default value is used. |
| Mode | `mode` | string | The mode of the predictor to be used in dynamic retrieval. The value is one of the following: MODE_UNSPECIFIED: Always trigger retrieval. MODE_DYNAMIC: Run retrieval only when system decides it is necessary. Enum values: `MODE_UNSPECIFIED`, `MODE_DYNAMIC`. |

Google Search

GoogleSearch tool type. Tool to support Google Search in Model. Powered by Google.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Time Range Filter | `time-range-filter` | object | Filter search results to a specific time range. If customers set a start time, they must set an end time (and vice versa). |

Time Range Filter

Filter search results to a specific time range. If customers set a start time, they must set an end time (and vice versa).

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| End Time | `end-time` | string | Exclusive end of the interval. If specified, a Timestamp matching this interval will have to be before the end. Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30". |
| Start Time | `start-time` | string | Inclusive start of the interval. If specified, a Timestamp matching this interval will have to be the same or after the start. Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30". |

Tool Config

Configuration for tool usage and function calling.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Function Calling Config | `function-calling-config` | object | Configuration for specifying function calling behavior. |

Function Calling Config

Configuration for specifying function calling behavior.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Allowed Function Names | `allowed-function-names` | array | A set of function names that, when provided, limits the functions the model will call. This should only be set when the Mode is ANY or VALIDATED. Function names should match [FunctionDeclaration.name]. When set, model will predict a function call from only allowed function names. |
| Mode | `mode` | string | Specifies the mode in which function calling should execute. If unspecified, the default value will be set to AUTO. The value is one of the following: MODE_UNSPECIFIED: Unspecified function calling mode. This value should not be used. AUTO: Default model behavior, model decides to predict either a function call or a natural language response. ANY: Model is constrained to always predicting a function call only. If "allowedFunctionNames" are set, the predicted function call will be limited to any one of "allowedFunctionNames", else the predicted function call will be any one of the provided "functionDeclarations". NONE: Model will not predict any function call. Model behavior is same as when not passing any function declarations. VALIDATED: Model decides to predict either a function call or a natural language response, but will validate function calls with constrained decoding. If "allowedFunctionNames" are set, the predicted function call will be limited to any one of "allowedFunctionNames", else the predicted function call will be any one of the provided "functionDeclarations". Enum values: `MODE_UNSPECIFIED`, `AUTO`, `ANY`, `NONE`. |

The parts Object

Parts

parts must fulfill one of the following schemas:

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Text | `text` | string | Inline text content. |

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Blob | `blob` | object | Raw media bytes. Text should use the 'text' field instead. |

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Function Call | `function-call` | object | Predicted function call with name and arguments. |

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Function Response | `function-response` | object | Result of a function call with name and structured response. |

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| File Data | `file-data` | object | URI-based data reference with MIME type. |

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Executable Code | `executable-code` | object | Code generated by the model that is meant to be executed. |

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Code Execution Result | `code-execution-result` | object | Result of executing the ExecutableCode. |
| Output | Field ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Operation (optional) | `operation` | string | The cache operation that was performed. |
| Cached Content (optional) | `cached-content` | object | [CREATE, GET, UPDATE] The cached content object for create, get, and update operations. Not returned for delete operations. |
| Cached Contents (optional) | `cached-contents` | array[object] | [LIST] List of cached contents for list operations. |
| Next Page Token (optional) | `next-page-token` | string | [LIST] Token for retrieving the next page of results for list operations. |
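
The resource name returned by a create operation can then be fed into the Chat task's cached-content field. The sketch below assumes the recipe engine can reference the nested name field of the cached-content output; component IDs and variable names are illustrative:

component:
  create-cache:
    type: gemini
    task: TASK_CACHE
    input:
      operation: create
      model: gemini-2.5-flash
      documents:
        - ${variable.manual:base64}
      ttl: 3600s
    setup: ${connection.gemini}
  chat:
    type: gemini
    task: TASK_CHAT
    input:
      model: gemini-2.5-flash  # should match the model the cache was created with
      prompt: ${variable.prompt}
      cached-content: ${create-cache.output.cached-content.name}
    setup: ${connection.gemini}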
Output Objects in Cache

Cached Content

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Create Time | `create-time` | string | Creation time of the cached content in RFC3339 format. |
| Display Name | `display-name` | string | Optional. The user-provided name of the cached content. |
| Expire Time | `expire-time` | string | Expiration time of the cached content in RFC3339 format. |
| Model | `model` | string | The name of the Model to use for cached content. |
| Name | `name` | string | The resource name referring to the cached content. Format: `cachedContents/{cachedContent}`. |
| Update Time | `update-time` | string | Last update time of the cached content in RFC3339 format. |
| Usage Metadata | `usage-metadata` | object | Token usage statistics for the cached content. |

Usage Metadata

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Audio Duration Seconds | `audio-duration-seconds` | integer | Duration of audio in seconds. |
| Image Count | `image-count` | integer | Number of images. |
| Text Count | `text-count` | integer | Number of text characters. |
| Total Token Count | `total-token-count` | integer | Total number of tokens that the cached content consumes. |
| Video Duration Seconds | `video-duration-seconds` | integer | Duration of video in seconds. |

Cached Contents

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Create Time | `create-time` | string | Creation time of the cached content in RFC3339 format. |
| Display Name | `display-name` | string | Optional. The user-provided name of the cached content. |
| Expire Time | `expire-time` | string | Expiration time of the cached content in RFC3339 format. |
| Model | `model` | string | The name of the Model to use for cached content. |
| Name | `name` | string | The resource name referring to the cached content. Format: `cachedContents/{cachedContent}`. |
| Update Time | `update-time` | string | Last update time of the cached content in RFC3339 format. |
| Usage Metadata | `usage-metadata` | object | Token usage statistics for the cached content. |

Usage Metadata

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Audio Duration Seconds | `audio-duration-seconds` | integer | Duration of audio in seconds. |
| Image Count | `image-count` | integer | Number of images. |
| Text Count | `text-count` | integer | Number of text characters. |
| Total Token Count | `total-token-count` | integer | Total number of tokens that the cached content consumes. |
| Video Duration Seconds | `video-duration-seconds` | integer | Duration of video in seconds. |

Example Recipes

version: v1beta
component:
  gemini:
    type: gemini
    task: TASK_CHAT
    input:
      model: gemini-2.5-pro
      stream: ${variable.stream}
      prompt: ${variable.prompt}
      images:
        - ${variable.image:base64}
      documents:
        - ${variable.document:base64}
      system-message: You are a helpful assistant.
      temperature: 1
      top-p: 1
    setup: ${connection.gemini}
variable:
  prompt:
    title: Prompt
    description: Prompt to instruct the model
    type: string
  document:
    title: Document
    description: Document to convert to Markdown
    type: document
  image:
    title: Image
    description: Image for the model to analyze
    type: image
  stream:
    title: Enable Stream
    description: Whether to enable streaming
    type: boolean
output:
  texts:
    title: texts[0]
    value: ${gemini.output.texts[0]}
  usage:
    title: usage
    value: ${gemini.output.usage}
  candidates:
    title: candidates
    value: ${gemini.output.candidates}
  usage-metadata:
    title: usage-metadata
    value: ${gemini.output.usage-metadata}
  prompt-feedback:
    title: prompt-feedback
    value: ${gemini.output.prompt-feedback}
  model-version:
    title: model-version
    value: ${gemini.output.model-version}
  response-id:
    title: response-id
    value: ${gemini.output.response-id}