The Instill Artifact component is a data component that allows users to access files and perform RAG-based search and retrieval through catalogs in the Instill Core platform.
It can carry out the following tasks:
To use Artifact Component, you will need to set up the OpenAI API key for self-hosted deployment of Instill Core.
You can do this by setting the OPENAI_API_KEY
environment variable.
Please refer to configuring-the-embedding-feature
p.s. In Instill Cloud case, you do not need to set up the OpenAI API key.
Alpha
The component definition and tasks are defined in the definition.yaml and tasks.yaml files respectively.
Upload and process the files into chunks into Catalog.
Input Field ID Type Description Task ID (required) task
string TASK_UPLOAD_FILE
Options (required)options
object Choose to upload the files to existing catalog or create a new catalog.
The options
Object Options options
must fulfill one of the following schemas:
Existing Catalog
Field Field ID Type Note Catalog ID catalog-id
string Catalog ID that you input in the Catalog. File file
string Base64 encoded PDF/DOCX/DOC/PPTX/PPT/HTML file to be uploaded into catalog. File Name file-name
string Name of the file, including the extension (e.g. example.pdf
). The length of this field is limited to 100 characters. Namespace namespace
string Fill in your namespace, you can get namespace through the tab of switching namespace. Option option
string Must be "existing catalog"
Create New Catalog
Field Field ID Type Note Catalog ID catalog-id
string Catalog ID for new catalog you want to create. Description description
string Description of the catalog. File file
string Base64 encoded PDF/DOCX/DOC/PPTX/PPT/HTML file to be uploaded into catalog. File Name file-name
string Name of the file, including the extension (e.g. example.pdf
). The length of this field is limited to 100 characters. Namespace namespace
string Fill in your namespace, you can get namespace through the tab of switching namespace. Option option
string Must be "create new catalog"
Tags tags
array Tags for the catalog.
Output Field ID Type Description File file
object Result of uploading file into catalog. Status status
boolean The status of trigger file processing, if succeeded, return true.
Output Objects in Upload File File Field Field ID Type Note Catalog ID catalog-id
string ID of the catalog that you upload files. Create Time create-time
string Creation time of the file in ISO 8601 format. File Name file-name
string Name of the file. Type file-type
string Type of the file. File UID file-uid
string Unique identifier of the file. Size size
number Size of the file in bytes. Update Time update-time
string Update time of the file in ISO 8601 format.
Upload and process the files into chunks into Catalog.
Input Field ID Type Description Task ID (required) task
string TASK_UPLOAD_FILES
Options (required)options
object Choose to upload the files to existing catalog or create a new catalog.
The options
Object Options options
must fulfill one of the following schemas:
Existing Catalog
Field Field ID Type Note Catalog ID catalog-id
string Catalog ID that you input in the Catalog. File Names file-names
array Name of the file, including the extension (e.g. example.pdf
). The length of this field is limited to 100 characters. Files files
array Base64 encoded PDF/DOCX/DOC/PPTX/PPT/HTML files to be uploaded into catalog. Namespace namespace
string Fill in your namespace, you can get namespace through the tab of switching namespace. Option option
string Must be "existing catalog"
Create New Catalog
Field Field ID Type Note Catalog ID catalog-id
string Catalog ID for new catalog you want to create. Description description
string Description of the catalog. File Names file-names
array Name of the file, including the extension (e.g. example.pdf
). The length of this field is limited to 100 characters. Files files
array Base64 encoded PDF/DOCX/DOC/PPTX/PPT/HTML files to be uploaded into catalog. Namespace namespace
string Fill in your namespace, you can get namespace through the tab of switching namespace. Option option
string Must be "create new catalog"
Tags tags
array Tags for the catalog.
Output Field ID Type Description Files files
array[object] Files metadata in catalog. Status status
boolean The status of trigger file processing, if ALL succeeded, return true.
Output Objects in Upload Files Files Field Field ID Type Note Catalog ID catalog-id
string ID of the catalog that you upload files. Create Time create-time
string Creation time of the file in ISO 8601 format. File Name file-name
string Name of the file. Type file-type
string Type of the file. File UID file-uid
string Unique identifier of the file. Size size
number Size of the file in bytes. Update Time update-time
string Update time of the file in ISO 8601 format.
get the metadata of the files in the catalog.
Input Field ID Type Description Task ID (required) task
string TASK_GET_FILES_METADATA
Namespace (required) namespace
string Fill in your namespace, you can get namespace through the tab of switching namespace. Catalog ID (required) catalog-id
string Catalog ID that you input to search files in the Catalog.
Output Field ID Type Description Files files
array[object] Files metadata in catalog.
Output Objects in Get Files Metadata Field Field ID Type Note Catalog ID catalog-id
string ID of the catalog that you upload files. Create Time create-time
string Creation time of the file in ISO 8601 format. File Name file-name
string Name of the file. Type file-type
string Type of the file. File UID file-uid
string Unique identifier of the file. Size size
number Size of the file in bytes. Update Time update-time
string Update time of the file in ISO 8601 format.
get the metadata of the chunks from a file in the catalog.
Input Field ID Type Description Task ID (required) task
string TASK_GET_CHUNKS_METADATA
Catalog ID (required) catalog-id
string Catalog ID that you input to search files in the Catalog. Namespace (required) namespace
string Fill in your namespace, you can get namespace through the tab of switching namespace. File UID (required) file-uid
string The unique identifier of the file.
Output Field ID Type Description Chunks chunks
array[object] Chunks metadata of the file in catalog.
Output Objects in Get Chunks Metadata Field Field ID Type Note Chunk UID chunk-uid
string The unique identifier of the chunk. Create Time create-time
string The creation time of the chunk in ISO 8601 format. End Position end-position
integer The end position of the chunk in the file. File UID original-file-uid
string The unique identifier of the file. Retrievable retrievable
boolean The retrievable status of the chunk. Start Position start-position
integer The start position of the chunk in the file. Token Count token-count
integer The token count of the chunk.
get the file content in markdown format.
Input Field ID Type Description Task ID (required) task
string TASK_GET_FILE_IN_MARKDOWN
Catalog ID (required) catalog-id
string Catalog ID that you input to search files in the Catalog. Namespace (required) namespace
string Fill in your namespace, you can get namespace through the tab of switching namespace. File UID (required) file-uid
string The unique identifier of the file.
Output Field ID Type Description File UID original-file-uid
string The unique identifier of the file. Content content
string The content of the file in markdown format. Create Time create-time
string The creation time of the source file in ISO 8601 format. Update Time update-time
string The update time of the source file in ISO 8601 format.
Check if the specified file's processing status is done.
Input Field ID Type Description Task ID (required) task
string TASK_MATCH_FILE_STATUS
Catalog ID (required) catalog-id
string Catalog ID that you input to check files' processing status in the Catalog. Namespace (required) namespace
string Fill in your namespace, you can get namespace through the tab of switching namespace. File UID (required) file-uid
string The unique identifier of the file.
Output Field ID Type Description Status succeeded
boolean The status of the file processing, if succeeded, return true.
search the chunks in the catalog.
Input Field ID Type Description Task ID (required) task
string TASK_RETRIEVE
Catalog ID (required) catalog-id
string Catalog ID that you input to search files in the Catalog. Namespace (required) namespace
string Fill in your namespace, you can get namespace through the tab of switching namespace. Text Prompt (required) text-prompt
string The prompt string to search the chunks. Top K top-k
integer The number of top chunks to return. The range is from 1~20, and default is 5. File UIDs file-uids
array[string] Optional list of file UIDs to filter the results by. The elements of the list must be UUID-formatted strings. File Media Type file-media-type
string The media type to filter, empty for all. Enum values Content Type content-type
string The content type to filter, empty for all. Enum values
Output Field ID Type Description Chunks chunks
array[object] Chunks data from smart search.
Output Objects in Retrieve Chunks Field Field ID Type Note Chunk UID chunk-uid
string The unique identifier of the chunk. Similarity similarity-score
number The similarity score of the chunk. Source File Name source-file-name
string The name of the source file. Source File UID source-file-uid
string The UID of the source file. Text Content text-content
string The text content of the chunk.
Reply the questions based on the files in the catalog.
Input Field ID Type Description Task ID (required) task
string TASK_ASK
Catalog ID (required) catalog-id
string Catalog ID that you input to search files in the Catalog. Namespace (required) namespace
string Fill in your namespace, you can get namespace through the tab of switching namespace. Question (required) question
string The question to reply. Top K top-k
integer The number of top answers to return. The range is from 1~20, and default is 5. File UIDs file-uids
array[string] Optional list of file UIDs to filter the results by. The elements of the list must be UUID-formatted strings.
Output Field ID Type Description Answer answer
string Answers data from smart search. Chunks (optional)chunks
array[object] Chunks data to answer question.
Output Objects in Ask Chunks Field Field ID Type Note Chunk UID chunk-uid
string The unique identifier of the chunk. Similarity similarity-score
number The similarity score of the chunk. Source File Name source-file-name
string The name of the source file. Source File UID source-file-uid
string The UID of the source file. Text Content text-content
string The text content of the chunk.
YAML
version: v1beta
component:
artifact-0:
type: instill-artifact
task: TASK_ASK
input:
catalog-id: ${variable.catalog_name}
namespace: ${variable.namespace}
question: ${variable.question}
top-k: 5
variable:
catalog_name:
title: catalog-name
description: The name of your catalog i.e. "instill-ai"
type: string
namespace:
title: namespace
description: The namespace of your catalog i.e. "instill-ai"
type: string
question:
title: question
description: The question to ask your catalog i.e. "What is Instill AI doing?", "What is Artifact?"
type: string
output:
answer:
title: answer
value: ${artifact-0.output.answer}
Sync files from Google Drive to Instill Catalog.
YAML
version: v1beta
variable:
namespace:
title: Namespace
type: string
catalog:
title: Catalog
type: string
folder-link:
title: Folder Link
type: string
component:
read-folder:
type: google-drive
input:
shared-link: ${variable.folder-link}
read-content: true
setup:
refresh-token: ${secret.refresh-token-gd}
task: TASK_READ_FOLDER
sync:
type: instill-artifact
input:
namespace: ${variable.namespace}
catalog-id: ${variable.catalog}
third-party-files: ${read-folder.output.files}
task: TASK_SYNC_FILES
output:
sync-result:
title: Sync Result
value: ${sync.output}