The Pinecone component is a data component that allows users to build and search vector datasets.
It can carry out the following tasks: Query, Upsert, Batch Upsert, and Rerank.
Release stage: Alpha.
The component definition and tasks are defined in the definition.yaml and tasks.yaml files respectively.
In order to communicate with Pinecone, the following connection details need to be provided. You may specify them directly in a pipeline recipe as key-value pairs within the component's setup block, or you can create a Connection from the Integration Settings page and reference the whole setup as setup: ${connection.<my-connection-id>}.
Field | Field ID | Type | Note |
--- | --- | --- | --- |
API Key (required) | api-key | string | Fill in your Pinecone AI API key. You can create an API key in the Pinecone Console. |
Pinecone Index URL | url | string | Fill in your Pinecone index URL. It is in the form. |
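As a rough illustration of the two options described above, the following Python sketch builds the setup value either inline or as a connection reference. The helper names `inline_setup` and `connection_setup` and the placeholder values are hypothetical; only the field IDs `api-key` and `url` and the `${connection.<my-connection-id>}` reference form come from this page.

```python
from typing import Optional


def inline_setup(api_key: str, url: Optional[str] = None) -> dict:
    """Build the setup block as key-value pairs (hypothetical helper)."""
    if not api_key:
        raise ValueError("api-key is required")
    setup = {"api-key": api_key}
    if url:  # the index URL is not marked as required in the table above
        setup["url"] = url
    return setup


def connection_setup(connection_id: str) -> str:
    """Reference a Connection created on the Integration Settings page."""
    return "${connection." + connection_id + "}"


# Placeholder values, not real credentials.
print(inline_setup("my-api-key"))
print(connection_setup("my-connection-id"))
```

Referencing a Connection keeps the API key out of the recipe itself, which is usually preferable when recipes are shared.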
The Query task retrieves the IDs of the most similar items in a namespace, along with their similarity scores.
Input | Field ID | Type | Description |
--- | --- | --- | --- |
Task ID (required) | task | string | TASK_QUERY |
ID | id | string | The unique ID of the vector to be used as a query vector. If present, the vector parameter will be ignored. |
Vector (required) | vector | array[number] | An array of dimensions for the query vector. |
Top K (required) | top-k | integer | The number of results to return for each query. |
Namespace | namespace | string | The namespace to query. |
Filter | filter | object | The filter to apply. You can use vector metadata to limit your search. See the Pinecone documentation on metadata filtering for details. |
Minimum Score | min-score | number | Exclude results whose score is below this value. |
Include Metadata | include-metadata | boolean | Indicates whether metadata is included in the response as well as the IDs. |
Include Values | include-values | boolean | Indicates whether vector values are included in the response. |
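To make the input fields concrete, here is a minimal Python sketch that assembles a query input from the field IDs in the table above. The function `build_query_input` is hypothetical, the flat dictionary layout is an assumption (how these fields are nested inside a pipeline recipe is not shown on this page), and the check that `top-k` is positive is an assumed sanity check rather than a documented rule.

```python
import json
from typing import Optional, Sequence


def build_query_input(
    vector: Sequence[float],
    top_k: int,
    namespace: Optional[str] = None,
    min_score: Optional[float] = None,
    include_metadata: bool = False,
    include_values: bool = False,
) -> dict:
    """Assemble a TASK_QUERY input (hypothetical helper, assumed layout)."""
    if top_k < 1:  # assumed sanity check, not a documented constraint
        raise ValueError("top-k must be a positive integer")
    task_input = {
        "task": "TASK_QUERY",
        "vector": [float(x) for x in vector],
        "top-k": top_k,
        "include-metadata": include_metadata,
        "include-values": include_values,
    }
    # Optional fields are only included when the caller sets them.
    if namespace is not None:
        task_input["namespace"] = namespace
    if min_score is not None:
        task_input["min-score"] = min_score
    return task_input


query = build_query_input([0.1, 0.2, 0.3], top_k=2, namespace="demo", include_metadata=True)
print(json.dumps(query, indent=2))
```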
Output | Field ID | Type | Description |
--- | --- | --- | --- |
Namespace | namespace | string | The namespace of the query. |
Matches | matches | array[object] | The matches returned for the query. |
Output Objects in Query
Matches
Field | Field ID | Type | Note |
--- | --- | --- | --- |
ID | id | string | The ID of the matched vector. |
Metadata | metadata | json | The metadata stored with the matched vector. |
Score | score | number | A measure of similarity between this vector and the query vector. The higher the score, the more similar they are. |
Values | values | array | Vector data values. |
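The sketch below shows one way a downstream step might read the query output. The sample data is made up for illustration, and `ranked_ids` is a hypothetical helper; it sorts explicitly by `score` rather than assuming the matches already arrive in order.

```python
# Made-up output shaped like the tables above; not real query results.
sample_output = {
    "namespace": "demo",
    "matches": [
        {"id": "vec-2", "score": 0.71, "metadata": {"genre": "news"}},
        {"id": "vec-1", "score": 0.93, "metadata": {"genre": "blog"}},
    ],
}


def ranked_ids(output: dict) -> list:
    """Return matched IDs ordered from most to least similar."""
    matches = sorted(output.get("matches", []), key=lambda m: m["score"], reverse=True)
    return [m["id"] for m in matches]


print(ranked_ids(sample_output))  # prints ['vec-1', 'vec-2']
```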
The Upsert task writes vectors into a namespace. If a new value is upserted for an existing vector ID, it overwrites the previous value. This task will soon be replaced by TASK_BATCH_UPSERT, which extends its functionality.
Input | Field ID | Type | Description |
--- | --- | --- | --- |
Task ID (required) | task | string | TASK_UPSERT |
ID (required) | id | string | The unique ID of the vector. |
Values (required) | values | array[number] | An array of dimensions for the vector to be saved. |
Namespace | namespace | string | The namespace to write the vector to. |
Metadata | metadata | object | The vector metadata. |
Output | Field ID | Type | Description |
--- | --- | --- | --- |
Upserted Count | upserted-count | integer | Number of records modified or added. |
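The overwrite behavior can be pictured with a toy in-memory model. This is not the Pinecone client or this component's implementation; `ToyNamespace` is a hypothetical stand-in that only mimics the semantics described above, where writing to an existing ID replaces the stored value.

```python
from typing import Dict, List, Optional


class ToyNamespace:
    """In-memory stand-in for a namespace; not the Pinecone client."""

    def __init__(self) -> None:
        self.vectors: Dict[str, dict] = {}

    def upsert(self, vector_id: str, values: List[float], metadata: Optional[dict] = None) -> int:
        # Insert a new vector or overwrite the one stored under the same ID.
        self.vectors[vector_id] = {"values": list(values), "metadata": metadata or {}}
        return 1  # one record added or modified


ns = ToyNamespace()
ns.upsert("vec-1", [0.1, 0.2])
ns.upsert("vec-1", [0.9, 0.8])  # same ID: the earlier values are replaced
print(len(ns.vectors), ns.vectors["vec-1"]["values"])  # prints 1 [0.9, 0.8]
```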
The Batch Upsert task writes multiple vectors into a namespace in a single call. If a new value is upserted for an existing vector ID, it overwrites the previous value.
Input | Field ID | Type | Description |
--- | --- | --- | --- |
Task ID (required) | task | string | TASK_BATCH_UPSERT |
Vectors (required) | vectors | array[object] | Array of vectors to upsert. |
Namespace | namespace | string | The namespace to write the vectors to. |
Input Objects in Batch Upsert
Vectors
Array of vectors to upsert
Field | Field ID | Type | Note |
--- | --- | --- | --- |
ID | id | string | The unique ID of the vector. |
Metadata | metadata | object | The vector metadata. This is a set of key-value pairs that can be used to store additional information about the vector. The values can have the following types: string, number, boolean, or array of strings. |
Values | values | array | An array of dimensions for the vector to be saved. |
Output | Field ID | Type | Description |
--- | --- | --- | --- |
Upserted Count | upserted-count | integer | Number of records modified or added. |
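As a sketch of how a batch input might be prepared, the Python below checks each vector's metadata against the value types listed above (string, number, boolean, or array of strings) before assembling the input. The helper names are hypothetical, and the flat dictionary layout is an assumption rather than the component's documented recipe format.

```python
from typing import Iterable, Optional


def is_valid_metadata_value(value) -> bool:
    """Check a value against the metadata types listed for this task."""
    if isinstance(value, (str, bool, int, float)):
        return True
    return isinstance(value, list) and all(isinstance(v, str) for v in value)


def build_batch_upsert_input(vectors: Iterable[dict], namespace: Optional[str] = None) -> dict:
    """Assemble a TASK_BATCH_UPSERT input (hypothetical helper, assumed layout)."""
    checked = []
    for vec in vectors:
        metadata = vec.get("metadata", {})
        bad = [k for k, v in metadata.items() if not is_valid_metadata_value(v)]
        if bad:
            raise ValueError(f"unsupported metadata type for keys: {bad}")
        checked.append({"id": vec["id"], "values": list(vec["values"]), "metadata": metadata})
    task_input = {"task": "TASK_BATCH_UPSERT", "vectors": checked}
    if namespace is not None:
        task_input["namespace"] = namespace
    return task_input


batch = build_batch_upsert_input(
    [
        {"id": "vec-1", "values": [0.1, 0.2], "metadata": {"genre": "blog", "tags": ["a", "b"]}},
        {"id": "vec-2", "values": [0.3, 0.4]},
    ],
    namespace="demo",
)
print(len(batch["vectors"]))  # prints 2
```

Validating metadata before the call surfaces type problems locally instead of as a failed upsert.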
The Rerank task orders documents, such as text passages, by their relevance to a query. It takes a query and a list of documents, and returns the documents sorted by relevance to the query.
Input | Field ID | Type | Description |
--- | --- | --- | --- |
Task ID (required) | task | string | TASK_RERANK |
Query (required) | query | string | The query to rerank the documents. |
Documents (required) | documents | array[string] | The documents to rerank. |
Top N | top-n | integer | The number of results to return sorted by relevance. Defaults to the number of inputs. |
Output | Field ID | Type | Description |
--- | --- | --- | --- |
Reranked Documents | documents | array[string] | Reranked documents. |
Scores | scores | array[number] | The relevance score of the documents normalized between 0 and 1. |
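A short Python sketch of working with this task follows. It assumes that `scores[i]` belongs to `documents[i]` in the output, which this page does not state explicitly; the helper names and the sample output are hypothetical and for illustration only.

```python
from typing import List, Optional, Tuple


def build_rerank_input(query: str, documents: List[str], top_n: Optional[int] = None) -> dict:
    """Assemble a TASK_RERANK input (hypothetical helper, assumed layout)."""
    task_input = {"task": "TASK_RERANK", "query": query, "documents": list(documents)}
    if top_n is not None:  # when omitted, top-n defaults to the number of inputs
        task_input["top-n"] = top_n
    return task_input


def pair_with_scores(output: dict) -> List[Tuple[str, float]]:
    """Pair each reranked document with its score (assumes index alignment)."""
    documents, scores = output["documents"], output["scores"]
    if len(documents) != len(scores):
        raise ValueError("documents and scores differ in length")
    return list(zip(documents, scores))


# Made-up output for illustration only.
sample_output = {"documents": ["doc about cats", "doc about dogs"], "scores": [0.82, 0.17]}
for text, score in pair_with_scores(sample_output):
    print(f"{score:.2f}  {text}")
```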