The OpenAI component is an AI component that allows users to connect to the AI models served on the OpenAI Platform.
It can carry out the following tasks: Text Generation, Text Embeddings, Speech Recognition, Text to Speech, and Text to Image.
The component definition and tasks are defined in the definition.yaml and tasks.yaml files respectively.
Setup
In order to communicate with OpenAI, the following connection details need to be provided. You may specify them directly in a pipeline recipe as key-value pairs within the component's setup block, or you can create a Connection from the Integration Settings page and reference the whole setup as setup: ${connection.<my-connection-id>}.
| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| API Key | api-key | string | Fill in your OpenAI API key. To find your keys, visit the API Keys page of your OpenAI account. |
| Organization ID | organization | string | Specify which organization is used for the requests. Usage will count against the specified organization's subscription quota. |

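As a rough illustration of the two setup options described above, the fragment below sketches one component that sets the connection details inline and another that references a stored Connection. The component names, the secret name OPENAI_API_KEY, the organization value, and the connection ID my-openai are placeholders chosen for this sketch, not values taken from this document.

```yaml
component:
  # Option 1: connection details given inline as key-value pairs.
  openai-inline:
    type: openai
    task: TASK_TEXT_EMBEDDINGS
    input:
      model: text-embedding-3-small
      text: ${variable.text}
    setup:
      api-key: ${secret.OPENAI_API_KEY}  # placeholder secret name
      organization: org-placeholder      # placeholder organization ID
  # Option 2: reference a Connection created in Integration Settings.
  openai-connection:
    type: openai
    task: TASK_TEXT_EMBEDDINGS
    input:
      model: text-embedding-3-small
      text: ${variable.text}
    setup: ${connection.my-openai}       # placeholder connection ID
```

Referencing a Connection keeps credentials out of the recipe itself, which may be preferable when several pipelines share the same account.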
Supported Tasks
Text Generation
OpenAI's text generation models (often called generative pre-trained transformers or large language models) have been trained to understand natural language, code, and images. The models provide text outputs in response to their inputs. The inputs to these models are also referred to as "prompts". Designing a prompt is essentially how you “program” a large language model, usually by providing instructions or some examples of how to successfully complete a task.
| Input | Field ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | task | string | TASK_TEXT_GENERATION |
| Model (required) | model | string | ID of the model to use. Enum values: o1, o1-preview, o1-mini, gpt-4o-mini, gpt-4o, gpt-4o-2024-05-13, gpt-4o-2024-08-06, gpt-4-turbo, gpt-4-turbo-2024-04-09, gpt-4-0125-preview, gpt-4-turbo-preview, gpt-4-1106-preview, gpt-4-vision-preview, gpt-4, gpt-4-0314, gpt-4-0613, gpt-4-32k, gpt-4-32k-0314, gpt-4-32k-0613, gpt-3.5-turbo, gpt-3.5-turbo-16k, gpt-3.5-turbo-0301, gpt-3.5-turbo-0613, gpt-3.5-turbo-1106, gpt-3.5-turbo-0125, gpt-3.5-turbo-16k-0613 |
| Prompt (required) | prompt | string | The prompt text. |
| System Message | system-message | string | The system message helps set the behavior of the assistant. For example, you can modify the personality of the assistant or provide specific instructions about how it should behave throughout the conversation. By default, the model uses the generic system message "You are a helpful assistant." |
| Chat History | chat-history | array[object] | Incorporate external chat history, specifically previous messages within the conversation. Please note that System Message will be ignored and will not have any effect when this field is populated. Each message should adhere to the format {"role": "The message role, i.e. 'system', 'user' or 'assistant'", "content": "message content"}. |
| Temperature | temperature | number | What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top-p but not both. |
| N | n | integer | How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs. |
| Max Tokens | max-tokens | integer | The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length. |
| Top P | top-p | number | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both. |
| Presence Penalty | presence-penalty | number | Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. |
| Frequency Penalty | frequency-penalty | number | Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. |
| Prediction | prediction | object | Configuration for a Predicted Output, which can greatly improve response times when large parts of the model response are known ahead of time. This is most common when you are regenerating a file with only minor changes to most of the content. |
| Tools | tools | array[object] | A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A max of 128 functions are supported. |
| Tool Choice | tool-choice | any | Controls which (if any) tool is called by the model. 'none' means the model will not call any tool and instead generates a message. 'auto' means the model can pick between generating a message or calling one or more tools. 'required' means the model must call one or more tools. |

Input Objects in Text Generation
Chat History
Incorporate external chat history, specifically previous messages within the conversation. Please note that System Message will be ignored and will not have any effect when this field is populated. Each message should adhere to the format {"role": "The message role, i.e. 'system', 'user' or 'assistant'", "content": "message content"}.
Either a URL of the image or the base64 encoded image data.
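To make the message format above concrete, here is a minimal sketch of a Text Generation input block that passes two earlier turns as chat history. It assumes the field ID is chat-history and that content is accepted as a plain string, as the format description suggests; the actual component may expect a richer content structure, so treat the layout as illustrative.

```yaml
input:
  model: gpt-4o-mini
  prompt: And what is the capital of Japan?
  # Assumed field ID. System Message is ignored once this field is populated.
  chat-history:
    - role: user
      content: What is the capital of France?
    - role: assistant
      content: The capital of France is Paris.
```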
Prediction
Configuration for a Predicted Output, which can greatly improve response times when large parts of the model response are known ahead of time. This is most common when you are regenerating a file with only minor changes to most of the content.
| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Content | content | string | The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly. |

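A Predicted Output might be configured roughly as follows when asking the model to regenerate a file with a small change. The field ID prediction and the variable name source are assumptions made for this sketch; only the content field comes from the table above.

```yaml
input:
  model: gpt-4o
  prompt: |-
    Rename the function `add` to `sum_two` in this file and return the whole file:
    ${variable.source}
  # Assumed field ID for the Prediction object.
  prediction:
    content: ${variable.source}  # text that most of the response is expected to match
```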
Tools
A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A max of 128 functions are supported.
| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Description | description | string | A description of what the function does, used by the model to choose when and how to call the function. |
| Name | name | string | The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64. |
| Parameters | parameters | object | The parameters the function accepts, described as a JSON Schema object. Omitting parameters defines a function with an empty parameter list. |
| Strict | strict | boolean | Whether to enable strict schema adherence when generating the function call. If set to true, the model will follow the exact schema defined in the parameters field. |

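The function fields above could be combined into a tool definition along these lines. This is a sketch under assumptions: the nesting of the fields under a function key next to type: function mirrors OpenAI's API and is not confirmed by this document, and get_weather with its city parameter is a hypothetical function.

```yaml
input:
  model: gpt-4o-mini
  prompt: What is the weather like in Taipei?
  tools:
    - type: function          # assumed wrapper, mirroring OpenAI's API
      function:
        name: get_weather     # hypothetical function name
        description: Look up the current weather for a city.
        parameters:           # JSON Schema describing the arguments
          type: object
          properties:
            city:
              type: string
          required:
            - city
          additionalProperties: false
        strict: true
  tool-choice: auto           # let the model decide whether to call the tool
```

With tool-choice set to auto the model may answer directly instead of calling the function; required would force a tool call.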
The response-format Object
Response Format
response-format must fulfill one of the following schemas:
Total number of tokens used (prompt + completion).
Prompt Token Details
| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Audio tokens | audio-tokens | integer | Audio input tokens present in the prompt. |
| Cached tokens | cached-tokens | integer | Cached tokens present in the prompt. |

Completion Token Details
| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Accepted prediction tokens | accepted-prediction-tokens | integer | When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion. |
| Audio tokens | audio-tokens | integer | Audio tokens generated by the model. |
| Reasoning tokens | reasoning-tokens | integer | Tokens generated by the model for reasoning. |
| Rejected prediction tokens | rejected-prediction-tokens | integer | When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits. |

The type of the tool. Currently, only function is supported.
Function
| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Arguments | arguments | string | The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function. |
| Name | name | string | The name of the function to call. |

Text Embeddings
Turn text into numbers, unlocking use cases like search.
| Input | Field ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | task | string | TASK_TEXT_EMBEDDINGS |
| Model (required) | model | string | ID of the model to use. Enum values: text-embedding-ada-002, text-embedding-3-small, text-embedding-3-large |
| Text (required) | text | string | The text to embed. |
| Dimensions | dimensions | integer | The number of dimensions the resulting output embeddings should have. Only supported in text-embedding-3 and later models. |

| Output | Field ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Embedding | embedding | array[number] | Embedding of the input text. |

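Putting the input and output fields above together, a minimal embeddings recipe might look like the following sketch. The component name openai-0, the variable text, and the chosen dimensions value are illustrative; the secret name follows the convention used in the other recipes in this document.

```yaml
version: v1beta
component:
  openai-0:
    type: openai
    task: TASK_TEXT_EMBEDDINGS
    input:
      model: text-embedding-3-small
      text: ${variable.text}
      dimensions: 256  # optional; only for text-embedding-3 and later models
    setup:
      api-key: ${secret.INSTILL_SECRET}
variable:
  text:
    title: Text
    description: The text to embed
    type: string
output:
  embedding:
    title: Embedding
    value: ${openai-0.output.embedding}
```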
Speech Recognition
Turn audio into text.
| Input | Field ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | task | string | TASK_SPEECH_RECOGNITION |
| Model (required) | model | string | ID of the model to use. Only whisper-1 is currently available. Enum values: whisper-1 |
| Audio (required) | audio | audio/* | The audio file object (not file name) to transcribe, in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm. |
| Prompt | prompt | string | An optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language. |
| Language | language | string | The language of the input audio. Supplying the input language in ISO-639-1 format will improve accuracy and latency. |
| Temperature | temperature | number | The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit. |

| Output | Field ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Text | text | string | Generated text. |

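As an illustration of how these fields fit into a recipe, the fragment below sketches a transcription component and an output that exposes its text. The component name and the pipeline variable audio are hypothetical, and the declaration of that variable is left out because its type is not specified in this document.

```yaml
component:
  openai-0:
    type: openai
    task: TASK_SPEECH_RECOGNITION
    input:
      model: whisper-1
      audio: ${variable.audio}  # hypothetical variable holding the audio file
      language: en              # ISO-639-1 code of the spoken language
      temperature: 0
    setup:
      api-key: ${secret.INSTILL_SECRET}
output:
  transcript:
    title: Transcript
    value: ${openai-0.output.text}
```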
Text to Speech
Turn text into lifelike spoken audio.

| Input | Field ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | task | string | TASK_TEXT_TO_SPEECH |
| Model (required) | model | string | One of the available TTS models: tts-1 or tts-1-hd. Enum values: tts-1, tts-1-hd |
| Text (required) | text | string | The text to generate audio for. The maximum length is 4096 characters. |
| Voice (required) | voice | string | The voice to use when generating the audio. Supported voices are alloy, echo, fable, onyx, nova, and shimmer. Enum values: alloy, echo, fable, onyx, nova, shimmer |
| Response Format | response-format | string | The audio output format. Supported formats are mp3, opus, aac, and flac. Enum values: mp3, opus, aac, flac |
| Speed | speed | number | The speed of the generated audio. Select a value from 0.25 to 4.0. 1.0 is the default. |

| Output | Field ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Audio (optional) | audio | audio/wav | AI generated audio. |

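A speech synthesis component using the required fields above could be sketched as follows. The component name and the variable text are illustrative, and the optional format field is omitted so the component falls back to whatever default it applies.

```yaml
component:
  openai-0:
    type: openai
    task: TASK_TEXT_TO_SPEECH
    input:
      model: tts-1
      text: ${variable.text}  # at most 4096 characters
      voice: alloy
      speed: 1.0              # default speed; valid range is 0.25 to 4.0
    setup:
      api-key: ${secret.INSTILL_SECRET}
output:
  audio:
    title: Audio
    value: ${openai-0.output.audio}
```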
Text to Image
Generate or manipulate images with DALL·E.
| Input | Field ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | task | string | TASK_TEXT_TO_IMAGE |
| Model (required) | model | string | The model to use for image generation. Enum values: dall-e-2, dall-e-3 |
| Prompt (required) | prompt | string | A text description of the desired image(s). The maximum length is 1000 characters for dall-e-2 and 4000 characters for dall-e-3. |
| N | n | integer | The number of images to generate. Must be between 1 and 10. For dall-e-3, only n=1 is supported. |
| Quality | quality | string | The quality of the image that will be generated. hd creates images with finer details and greater consistency across the image. This parameter is only supported for dall-e-3. Enum values: standard, hd |
| Size | size | string | The size of the generated images. Must be one of 256x256, 512x512, or 1024x1024 for dall-e-2. Must be one of 1024x1024, 1792x1024, or 1024x1792 for dall-e-3 models. Enum values: 256x256, 512x512, 1024x1024, 1792x1024, 1024x1792 |
| Style | style | string | The style of the generated images. Must be one of vivid or natural. Vivid causes the model to lean towards generating hyper-real and dramatic images. Natural causes the model to produce more natural, less hyper-real looking images. This parameter is only supported for dall-e-3. Enum values: vivid, natural |

version: v1beta
component:
  mistral-0:
    type: mistral-ai
    task: TASK_TEXT_GENERATION_CHAT
    input:
      max-new-tokens: 100
      model-name: open-mixtral-8x22b
      prompt: |-
        Generate a Picasso-inspired image based on the following user input:
        ${variable.prompt}
        Using the specified Picasso period: ${variable.period}
        Transform this input into a detailed text-to-image prompt by:
        1. Identifying the key elements or subjects in the user's description
        2. Adding artistic elements and techniques specific to the ${variable.period} period of Picasso's work
        3. Including cubist or abstract features characteristic of the ${variable.period}
        4. Suggesting a composition or scene layout typical of Picasso's work from this era
        Enhance the prompt with vivid, descriptive language and specific Picasso-style elements from the ${variable.period}. The final prompt should begin with "Create an image in the style of Picasso's ${variable.period} period:" followed by the enhanced description.
      safe: false
      system-message: You are a helpful assistant.
      temperature: 0.7
      top-k: 10
      top-p: 0.5
    setup:
      api-key: ${secret.INSTILL_SECRET}
  openai-0:
    type: openai
    task: TASK_TEXT_TO_IMAGE
    input:
      model: dall-e-3
      n: 1
      prompt: |-
        Using this primary color palette: ${variable.colour}
        ${mistral-0.output.text}
      quality: standard
      size: 1024x1024
      style: vivid
    setup:
      api-key: ${secret.INSTILL_SECRET}
variable:
  colour:
    title: Colour
    description: Describe the main colour to use i.e. blue, random
    type: string
    instill-ui-order: 1
  period:
    title: Period
    description: |
      Input different Picasso periods i.e. Blue, Rose, African, Synthetic Cubism, etc.
    type: string
  prompt:
    title: Prompt
    description: Input prompt here i.e. "A cute baby wombat"
    type: string
output:
  image:
    title: Image
    value: ${openai-0.output.results}

version: v1beta
component:
  openai:
    type: openai
    task: TASK_TEXT_GENERATION
    input:
      model: gpt-4o-mini
      n: 1
      prompt: |-
        Talk about this topic in ${variable.language} in a concise and beginner-friendly way:
        ${variable.prompt}
      response-format:
        type: text
      system-message: You are a helpful assistant.
      temperature: 1
      top-p: 1
    setup:
      api-key: ${secret.INSTILL_SECRET}
variable:
  language:
    title: Language
    description: Input a language i.e. Chinese, Japanese, French, etc.
    type: string
  prompt:
    title: Prompt
    description: Write the topic you want to ask about here i.e. "Tell me about small LLMs"
    type: string
output:
  result:
    title: Result
    value: ${openai.output.texts}