The Instill Model component is an AI component that allows users to connect the Al models served in the Instill Core platform.
It can carry out the following tasks:
Alpha
The component definition and tasks are defined in the definition.yaml and tasks.yaml files respectively.
Classify images into predefined categories.
Input | Field ID | Type | Description |
---|
Task ID (required) | task | string | TASK_CLASSIFICATION |
Model Name (required) | model-name | string | The Instill Model model to be used. |
Image (required) | image-base64 | string | Image base64. |
Output | Field ID | Type | Description |
---|
Category | category | string | The predicted category of the input. |
Score | score | number | The confidence score of the predicted category of the input. |
Detect, localize and delineate multiple objects in images.
Input | Field ID | Type | Description |
---|
Task ID (required) | task | string | TASK_INSTANCE_SEGMENTATION |
Model Name (required) | model-name | string | The Instill Model model to be used. |
Image (required) | image-base64 | string | Image base64. |
Output | Field ID | Type | Description |
---|
Objects | objects | array[object] | A list of detected instance bounding boxes. |
Output Objects in Instance Segmentation
Objects
Field | Field ID | Type | Note |
---|
Bounding Box | bounding-box | object | The detected bounding box in (left, top, width, height) format. |
Category | category | string | The predicted category of the bounding box. |
RLE | rle | string | Run Length Encoding (RLE) of instance mask within the bounding box. |
Score | score | number | The confidence score of the predicted instance object. |
Bounding Box
Field | Field ID | Type | Note |
---|
Height | height | number | Bounding box height value |
Left | left | number | Bounding box left x-axis value |
Top | top | number | Bounding box top y-axis value |
Width | width | number | Bounding box width value |
Detect and localize multiple keypoints of objects in images.
Input | Field ID | Type | Description |
---|
Task ID (required) | task | string | TASK_KEYPOINT |
Model Name (required) | model-name | string | The Instill Model model to be used. |
Image (required) | image-base64 | string | Image base64. |
Output | Field ID | Type | Description |
---|
Objects | objects | array[object] | A list of keypoint objects, a keypoint object includes all the pre-defined keypoints of a detected object. |
Output Objects in Keypoint
Objects
Field | Field ID | Type | Note |
---|
Bounding Box | bounding-box | object | The detected bounding box in (left, top, width, height) format. |
Keypoints | keypoints | array | A keypoint group is composed of a list of pre-defined keypoints of a detected object. |
Score | score | number | The confidence score of the predicted object. |
Keypoints
Field | Field ID | Type | Note |
---|
Visibility Score | v | number | visibility score of the keypoint. |
X Coordinate | x | number | x coordinate of the keypoint. |
Y Coordinate | y | number | y coordinate of the keypoint. |
Bounding Box
Field | Field ID | Type | Note |
---|
Height | height | number | Bounding box height value |
Left | left | number | Bounding box left x-axis value |
Top | top | number | Bounding box top y-axis value |
Width | width | number | Bounding box width value |
Detect and localize multiple objects in images.
Input | Field ID | Type | Description |
---|
Task ID (required) | task | string | TASK_DETECTION |
Model Name (required) | model-name | string | The Instill Model model to be used. |
Image (required) | image-base64 | string | Image base64. |
Output | Field ID | Type | Description |
---|
Objects | objects | array[object] | A list of detected objects. |
Output Objects in Detection
Objects
Field | Field ID | Type | Note |
---|
Bounding box | bounding-box | object | The detected bounding box in (left, top, width, height) format. |
Category | category | string | The predicted category of the bounding box. |
Score | score | number | The confidence score of the predicted category of the bounding box. |
Bounding Box
Field | Field ID | Type | Note |
---|
Height | height | number | Bounding box height value |
Left | left | number | Bounding box left x-axis value |
Top | top | number | Bounding box top y-axis value |
Width | width | number | Bounding box width value |
Detect and recognize text in images.
Input | Field ID | Type | Description |
---|
Task ID (required) | task | string | TASK_OCR |
Model Name (required) | model-name | string | The Instill Model model to be used. |
Image (required) | image-base64 | string | Image base64. |
Output | Field ID | Type | Description |
---|
Objects | objects | array[object] | A list of detected bounding boxes. |
Output Objects in OCR
Objects
Field | Field ID | Type | Note |
---|
Bounding Box | bounding-box | object | The detected bounding box in (left, top, width, height) format. |
Score | score | number | The confidence score of the predicted object. |
Text | text | string | Text string recognised per bounding box. |
Bounding Box
Field | Field ID | Type | Note |
---|
Height | height | number | Bounding box height value |
Left | left | number | Bounding box left x-axis value |
Top | top | number | Bounding box top y-axis value |
Width | width | number | Bounding box width value |
Classify image pixels into predefined categories.
Input | Field ID | Type | Description |
---|
Task ID (required) | task | string | TASK_SEMANTIC_SEGMENTATION |
Model Name (required) | model-name | string | The Instill Model model to be used. |
Image (required) | image-base64 | string | Image base64. |
Output | Field ID | Type | Description |
---|
Stuffs | stuffs | array[object] | A list of RLE binary masks. |
Output Objects in Semantic Segmentation
Stuffs
Field | Field ID | Type | Note |
---|
Category | category | string | Category text string corresponding to each stuff mask. |
RLE | rle | string | Run Length Encoding (RLE) of each stuff mask within the image. |
Generate texts from input text prompts.
Input | Field ID | Type | Description |
---|
Task ID (required) | task | string | TASK_TEXT_GENERATION |
Model Name (required) | model-name | string | The Instill Model model to be used. |
Prompt (required) | prompt | string | The prompt text. |
System Message | system-message | string | The system message helps set the behavior of the assistant. For example, you can modify the personality of the assistant or provide specific instructions about how it should behave throughout the conversation. By default, the model’s behavior is using a generic message as "You are a helpful assistant.". |
Seed | seed | integer | The seed. |
Temperature | temperature | number | The temperature for sampling. |
Max New Tokens | max-new-tokens | integer | The maximum number of tokens for model to generate. |
Output | Field ID | Type | Description |
---|
Text | text | string | Text. |
Generate texts from input text prompts and chat history.
Input | Field ID | Type | Description |
---|
Task ID (required) | task | string | TASK_TEXT_GENERATION_CHAT |
Model Name (required) | model-name | string | The Instill Model model to be used. |
Prompt (required) | prompt | string | The prompt text. |
System Message | system-message | string | The system message helps set the behavior of the assistant. For example, you can modify the personality of the assistant or provide specific instructions about how it should behave throughout the conversation. By default, the model’s behavior is using a generic message as "You are a helpful assistant.". |
Prompt Images | prompt-images | array[string] | The prompt images. |
Chat History | chat-history | array[object] | Incorporate external chat history, specifically previous messages within the conversation. Please note that System Message will be ignored and will not have any effect when this field is populated. Each message should adhere to the format: {"role": "The message role, i.e. 'system', 'user' or 'assistant'", "content": "message content"}. |
Seed | seed | integer | The seed. |
Temperature | temperature | number | The temperature for sampling. |
Max New Tokens | max-new-tokens | integer | The maximum number of tokens for model to generate. |
Input Objects in Text Generation Chat
Chat History
Incorporate external chat history, specifically previous messages within the conversation. Please note that System Message will be ignored and will not have any effect when this field is populated. Each message should adhere to the format: {"role": "The message role, i.e. 'system', 'user' or 'assistant'", "content": "message content"}.
Field | Field ID | Type | Note |
---|
Content | content | array | The message content. |
Role | role | string | The message role, i.e. 'system', 'user' or 'assistant'. |
Content
The message content.
Field | Field ID | Type | Note |
---|
Image URL | image-url | object | The image URL |
Text | text | string | The text content. |
Type | type | string | The type of the content part.
Enum values |
Image URL
The image URL
Field | Field ID | Type | Note |
---|
URL | url | string | Either a URL of the image or the base64 encoded image data. |
Output | Field ID | Type | Description |
---|
Text | text | string | Text. |
Generate images from input text prompts.
Input | Field ID | Type | Description |
---|
Task ID (required) | task | string | TASK_TEXT_TO_IMAGE |
Model Name (required) | model-name | string | The Instill Model model to be used. |
Prompt (required) | prompt | string | The prompt text. |
Samples | samples | integer | The number of generated samples, default is 1. |
Seed | seed | integer | The seed, default is 0. |
Aspect Ratio | negative-prompt | string | Keywords of what you do not wish to see in the output image. |
Aspect Ratio | aspect-ratio | string | Controls the aspect ratio of the generated image. Defaults to 1:1.
Enum values16:9 1:1 21:9 2:3 3:2 4:5 5:4 9:16 9:21
|
Output | Field ID | Type | Description |
---|
Images | images | array[image/jpeg] | Images. |
Answer questions based on a prompt and an image.
Input | Field ID | Type | Description |
---|
Task ID (required) | task | string | TASK_VISUAL_QUESTION_ANSWERING |
Model Name (required) | model-name | string | The Instill Model model to be used. |
Prompt (required) | prompt | string | The prompt text. |
System Message | system-message | string | The system message helps set the behavior of the assistant. For example, you can modify the personality of the assistant or provide specific instructions about how it should behave throughout the conversation. By default, the model’s behavior is using a generic message as "You are a helpful assistant.". |
Prompt Images | prompt-images | array[string] | The prompt images. |
Chat History | chat-history | array[object] | Incorporate external chat history, specifically previous messages within the conversation. Please note that System Message will be ignored and will not have any effect when this field is populated. Each message should adhere to the format: {"role": "The message role, i.e. 'system', 'user' or 'assistant'", "content": "message content"}. |
Seed | seed | integer | The seed. |
Temperature | temperature | number | The temperature for sampling. |
Max New Tokens | max-new-tokens | integer | The maximum number of tokens for model to generate. |
Input Objects in Visual Question Answering
Chat History
Incorporate external chat history, specifically previous messages within the conversation. Please note that System Message will be ignored and will not have any effect when this field is populated. Each message should adhere to the format: {"role": "The message role, i.e. 'system', 'user' or 'assistant'", "content": "message content"}.
Field | Field ID | Type | Note |
---|
Content | content | array | The message content. |
Role | role | string | The message role, i.e. 'system', 'user' or 'assistant'. |
Content
The message content.
Field | Field ID | Type | Note |
---|
Image URL | image-url | object | The image URL |
Text | text | string | The text content. |
Type | type | string | The type of the content part.
Enum values |
Image URL
The image URL
Field | Field ID | Type | Note |
---|
URL | url | string | Either a URL of the image or the base64 encoded image data. |
Output | Field ID | Type | Description |
---|
Text | text | string | Text. |
Generate texts from input text prompts and chat history.
Input | Field ID | Type | Description |
---|
Task ID (required) | task | string | TASK_CHAT |
Model Name (required) | model-name | string | The Instill Model model to be used. |
Prompt (required) | prompt | string | The prompt text. |
System Message | system-message | string | The system message helps set the behavior of the assistant. For example, you can modify the personality of the assistant or provide specific instructions about how it should behave throughout the conversation. By default, the model’s behavior is using a generic message as "You are a helpful assistant.". |
Prompt Images | prompt-images | array[string] | The prompt images. |
Chat History | chat-history | array[object] | Incorporate external chat history, specifically previous messages within the conversation. Please note that System Message will be ignored and will not have any effect when this field is populated. Each message should adhere to the format: {"role": "The message role, i.e. 'system', 'user' or 'assistant'", "content": "message content"}. |
Seed | seed | integer | The seed. |
Temperature | temperature | number | The temperature for sampling. |
Max New Tokens | max-new-tokens | integer | The maximum number of tokens for model to generate. |
Input Objects in Chat
Chat History
Incorporate external chat history, specifically previous messages within the conversation. Please note that System Message will be ignored and will not have any effect when this field is populated. Each message should adhere to the format: {"role": "The message role, i.e. 'system', 'user' or 'assistant'", "content": "message content"}.
Field | Field ID | Type | Note |
---|
Content | content | array | The message content. |
Role | role | string | The message role, i.e. 'system', 'user' or 'assistant'. |
Content
The message content.
Field | Field ID | Type | Note |
---|
Image URL | image-url | object | The image URL |
Text | text | string | The text content. |
Type | type | string | The type of the content part.
Enum values |
Image URL
The image URL
Field | Field ID | Type | Note |
---|
URL | url | string | Either a URL of the image or the base64 encoded image data. |
Output | Field ID | Type | Description |
---|
Text | text | string | Text. |
This task refers to the process of generating vector embeddings from input data, which can be text or images. This transformation converts the data into a dense, fixed-length numerical representation that captures the essential features of the original input. These embeddings are typically used in machine learning tasks to represent complex data in a more structured, simplified form.
Input | Field ID | Type | Description |
---|
Task ID (required) | task | string | TASK_EMBEDDING |
Data (required) | data | object | Input data. |
Parameter | parameter | object | Input parameter. |
Input Objects in Embedding
Data
Input data.
Field | Field ID | Type | Note |
---|
Embeddings | embeddings | array | List of input data to be embedded. |
Model | model | string | The model to be used for generating embeddings. It should be namespace/model-name/version . i.e. abrc/yolov7-stomata/v0.1.0 . You can see the version from the Versions tab of Model page. |
Parameter
Input parameter.
Field | Field ID | Type | Note |
---|
Dimensions | dimensions | integer | Number of dimensions in the output embedding vectors. |
Data Format | format | string | The data format of the embeddings. Defaults to float.
Enum values |
Input Type | input-type | string | The type of input data to be embedded (e.g., query, document). |
Truncate | truncate | string | How to handle inputs longer than the max token length. Defaults to 'End'.
Enum values |
The embeddings
Object
Embeddings
embeddings
must fulfill one of the following schemas:
Text
Field | Field ID | Type | Note |
---|
Text Content | text | string | When the input is text, the raw text is tokenized and processed into a dense, fixed-length vector that captures semantic information such as word meanings and relationships. These text embeddings enable tasks like sentiment analysis, search, or classification. |
Text | type | string | Must be "text" |
Image URL
Field | Field ID | Type | Note |
---|
Image URL | image-url | string | When the input is an image from a URL, the image is first fetched from the URL and then decoded into its original format. It is then processed into a fixed-length vector representing essential visual features like shapes and colors. These image embeddings are useful for tasks like image classification or similarity search, providing structured numerical data for complex visual inputs. |
Image URL | type | string | Must be "image-url" |
Image Base64
Field | Field ID | Type | Note |
---|
Image File | image-base64 | string | When the input is an image in base64 format, the base64-encoded data is first decoded into its original image form. The image is then processed and transformed into a dense, fixed-length numerical vector, capturing key visual features like shapes, colors, or textures. |
Image File | type | string | Must be "image-base64" |
Output | Field ID | Type | Description |
---|
Data | data | object | Output data. |
Output Objects in Embedding
Data
Field | Field ID | Type | Note |
---|
Embeddings | embeddings | array | List of generated embeddings. |
Embeddings
Field | Field ID | Type | Note |
---|
Created | created | integer | The Unix timestamp (in seconds) of when the embedding was created. |
Index | index | integer | The index of the embedding vector in the array. |
Embedding Vector | vector | array | The embedding vector. |