QueryPinecone

Description

Queries Pinecone for vectors that are similar to the input vector, or retrieves a vector by ID.

Tags

chatbot, gen ai, generative ai, llm, pinecone, query, similarity, vector

Properties

In the table below, required properties are marked with an asterisk (*). All other properties are optional. The table also lists any default values and indicates whether a property supports the NiFi Expression Language.

Display Name | API Name | Default Value | Allowable Values | Description
Pinecone API Key * | Pinecone API Key | | | The API key for the Pinecone service.
Pinecone Index * | Pinecone Index | | | The name of the Pinecone index to use. Supports Expression Language, using FlowFile attributes and Environment variables.
Pinecone Namespace * | Pinecone Namespace | default | | The name of the Pinecone namespace to use. Supports Expression Language, using FlowFile attributes and Environment variables.
Record Reader * | Record Reader | | Controller Service: RecordReaderFactory. Implementations: AvroReader, CEFReader, CSVReader, ExcelReader, GrokReader, JsonPathReader, JsonTreeReader, ReaderLookup, ScriptedReader, Syslog5424Reader, SyslogReader, WindowsEventLogReader, XMLReader, YamlTreeReader | The Record Reader to use for reading the FlowFile.
Record Writer * | Record Writer | | Controller Service: RecordSetWriterFactory. Implementations: AvroRecordSetWriter, CSVRecordSetWriter, FreeFormTextRecordSetWriter, JsonRecordSetWriter, RecordSetWriterLookup, ScriptedRecordSetWriter, XMLRecordSetWriter | The Record Writer to use for writing the results.
Query Strategy * | Query Strategy | Query by Vector | Query by Vector, Query by ID | The strategy to use for querying Pinecone.
Vector Record Path * | Vector Record Path | /embeddings | | The path to the vector field in the record. Supports Expression Language, using FlowFile attributes and Environment variables. This property is only considered if the property Query Strategy has a value of Query by Vector.
Sparse Vector Indices Path | Sparse Vector Indices Path | | | If sparse vectors are to be provided, this RecordPath points to the indices of the sparse data to use. Supports Expression Language, using FlowFile attributes and Environment variables. This property is only considered if the property Query Strategy has a value of Query by Vector.
Sparse Vector Values Path * | Sparse Vector Values Path | | | If sparse vectors are to be provided, this RecordPath points to the values of the sparse data to use. Supports Expression Language, using FlowFile attributes and Environment variables. This property is only considered if the property Query Strategy has a value of Query by Vector and the property Sparse Vector Indices Path has a value specified.
Sparse Dense Vector Weighting | Sparse Dense Vector Weighting | | | A value from 0.0 to 1.0 that weights the dense and sparse vectors in a hybrid search. (weight) is applied to the dense vector and (1 - weight) is applied to the values of the sparse vector. Supports Expression Language, using FlowFile attributes and Environment variables. This property is only considered if the property Sparse Vector Values Path has a value specified.
ID Record Path | ID Record Path | | | The path to the ID field in the record. Supports Expression Language, using FlowFile attributes and Environment variables. This property is only considered if the property Query Strategy has a value of Query by ID.
Number of Results * | Number of Results | 10 | | The number of results to return (i.e., Top K). Supports Expression Language, using FlowFile attributes and Environment variables.
Results Record Path * | Results Record Path | | | Specifies where in the record to place the results. Supports Expression Language, using FlowFile attributes and Environment variables.
Include Metadata * | Include Metadata | true | true, false | Specifies whether to include metadata in the results.
Include Vectors * | Include Vectors | true | true, false | Specifies whether to include vectors in the results.
Query Filter | Query Filter | | | A JSON representation of the query filter to use. Supports Expression Language, using FlowFile attributes and Environment variables.
Web Client Service * | Web Client Service | | Controller Service: WebClientServiceProvider. Implementations: StandardWebClientServiceProvider | The Web Client Service to use for communicating with Pinecone.
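
To make the interplay of Query Strategy, the sparse vector paths, Sparse Dense Vector Weighting, and Query Filter more concrete, here is a minimal Python sketch of how a single record might be turned into a query body. It is an illustration under stated assumptions, not the processor's actual implementation: the body field names (topK, includeValues, sparseVector, and so on) follow Pinecone's REST query endpoint as commonly documented, while the function name build_query_body, the record fields id, sparse_indices, and sparse_values, and the metadata field genre are hypothetical.

```python
import json


def build_query_body(record, strategy="Query by Vector", top_k=10,
                     namespace="default", include_metadata=True,
                     include_vectors=True, query_filter=None, weight=None):
    """Assemble a Pinecone-style query body from one record (illustrative only)."""
    body = {
        "namespace": namespace,
        "topK": top_k,                        # Number of Results
        "includeMetadata": include_metadata,  # Include Metadata
        "includeValues": include_vectors,     # Include Vectors
    }
    if strategy == "Query by ID":
        # ID Record Path; the field name "id" is an assumption.
        body["id"] = record["id"]
    else:
        # Vector Record Path, using the default of /embeddings.
        dense = list(record["embeddings"])
        # Sparse paths; both field names are hypothetical.
        indices = record.get("sparse_indices")
        values = record.get("sparse_values")
        if indices is not None and values is not None:
            values = list(values)
            if weight is not None:
                # Hybrid weighting as described in the table:
                # weight on the dense vector, (1 - weight) on sparse values.
                dense = [v * weight for v in dense]
                values = [v * (1 - weight) for v in values]
            body["sparseVector"] = {"indices": list(indices), "values": values}
        body["vector"] = dense
    if query_filter:
        # Query Filter is given as a JSON string.
        body["filter"] = json.loads(query_filter)
    return body


record = {
    "embeddings": [1.0, 2.0],
    "sparse_indices": [3, 7],
    "sparse_values": [4.0, 8.0],
}
body = build_query_body(
    record,
    weight=0.75,
    query_filter='{"genre": {"$eq": "news"}}',  # hypothetical metadata field
)
print(json.dumps(body, indent=2))
```

With a weight of 0.75, the dense vector is scaled by 0.75 and the sparse values by 0.25, so a weight of 1.0 would amount to a purely dense search.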

Dynamic Properties

This component does not support dynamic properties.

Relationships

Name | Description
failure | FlowFiles that cannot be sent to Pinecone, and for which a retry is not expected to be successful, are routed to this relationship
retry | FlowFiles that fail to be sent to Pinecone, but for which a retry may help, are routed to this relationship
success | FlowFiles that are successfully sent to Pinecone are routed to this relationship

Reads Attributes

This processor does not read attributes.

Writes Attributes

This processor does not write attributes.

State Management

This component does not store state.

Restricted

This component is not restricted.

Input Requirement

This component requires an incoming relationship.

Example Use Cases Involving Other Components

Multiprocessor Use Case 1

Query Pinecone for vectors that are similar to some input text

This flow places the results of the Pinecone query in the /results field of the output FlowFile. If text content is stored in the metadata, it can be retrieved from a field such as /metadata/text. Alternatively, the metadata may be used to retrieve the text content from some other location.
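
As a rough sketch of how a downstream consumer might pull that text out, the Python snippet below reads one JSON output record and collects the text metadata of each match. The exact shape of the /results field is an assumption here: the sketch accepts either a Pinecone-style response object with a "matches" list or a plain list of matches, and the function name texts_from_record and the sample record are hypothetical.

```python
import json


def texts_from_record(record_json, results_field="results"):
    """Collect metadata text from each match in one output record (illustrative)."""
    record = json.loads(record_json)
    results = record.get(results_field) or {}
    # Assumption: results is either {"matches": [...]} or a bare list of matches.
    matches = results.get("matches", []) if isinstance(results, dict) else results
    texts = []
    for match in matches:
        metadata = match.get("metadata") or {}
        if "text" in metadata:
            texts.append(metadata["text"])
    return texts


# Hypothetical output record, written as JSON by the Record Writer.
sample = json.dumps({
    "text": "What is NiFi?",
    "results": {
        "matches": [
            {"id": "a", "score": 0.91, "metadata": {"text": "NiFi moves data."}},
            {"id": "b", "score": 0.84, "metadata": {"source": "s3://bucket/doc"}},
        ]
    },
})
print(texts_from_record(sample))
```

Matches without a text entry are skipped, which corresponds to the case where the metadata only points at content stored elsewhere.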

Components Involved

  • CreateOpenAiEmbeddings
    1. Set "OpenAI API Key" to the API key for your OpenAI account.
    2. Set "Model" to the name of the OpenAI model to use for creating embeddings.
    3. Set "Record Writer" to a Record Writer that writes out data in the desired format, typically JSON.
    4. Set "Web Client Service" to a WebClientService that can be used to make requests to the OpenAI API.
    5. If the text to query on is the content of the FlowFile, leave the "Record Reader" property unset.
    6. Otherwise, if the text to query on is a field in the FlowFile, set "Record Reader" to a Record Reader that can read the format of the data in the FlowFile and set "Text Record Path" to the path to the field containing the text to query on (e.g., /text).
  • QueryPinecone
    1. Set "Pinecone API Key" to the API key for your Pinecone account.
    2. Set "Pinecone Index" to the name of the Pinecone index to query.
    3. Set "Pinecone Namespace" to the namespace to use for querying Pinecone.
    4. Set "Record Reader" to a Record Reader that can read the format of the data written by CreateOpenAiEmbeddings.
    5. Set "Record Writer" to a Record Writer appropriate for the desired output format.
    6. Set "Web Client Service" to the same WebClientService that was used in the CreateOpenAiEmbeddings processor.
    7. Set "Query Strategy" to Query by Vector.
    8. Set "Vector Record Path" to /embeddings.
    9. Set "Results Record Path" to /results.
    10. Set "Include Metadata" to true.
    11. Set "Include Vectors" to false.
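
The QueryPinecone steps above can be summarized as a property checklist. The Python sketch below is only a way of writing that checklist down and checking it against the required and conditional properties in the Properties table; it is not a NiFi API, and the names DEFAULTS, ALWAYS_REQUIRED, and missing_properties, the placeholder values, and the choice of JsonTreeReader and JsonRecordSetWriter are assumptions made for illustration.

```python
# Default values as listed in the Properties table.
DEFAULTS = {
    "Pinecone Namespace": "default",
    "Query Strategy": "Query by Vector",
    "Vector Record Path": "/embeddings",
    "Number of Results": "10",
    "Include Metadata": "true",
    "Include Vectors": "true",
}

# Properties marked with an asterisk that apply regardless of strategy.
ALWAYS_REQUIRED = [
    "Pinecone API Key", "Pinecone Index", "Pinecone Namespace",
    "Record Reader", "Record Writer", "Query Strategy",
    "Number of Results", "Results Record Path",
    "Include Metadata", "Include Vectors", "Web Client Service",
]

# Settings from the use case; angle-bracket values are placeholders.
QUERY_PINECONE_PROPERTIES = {
    "Pinecone API Key": "<your Pinecone API key>",
    "Pinecone Index": "<index name>",
    "Pinecone Namespace": "default",
    "Record Reader": "JsonTreeReader",        # assumes JSON from the previous step
    "Record Writer": "JsonRecordSetWriter",   # assumes JSON output is wanted
    "Web Client Service": "StandardWebClientServiceProvider",
    "Query Strategy": "Query by Vector",
    "Vector Record Path": "/embeddings",
    "Results Record Path": "/results",
    "Include Metadata": "true",
    "Include Vectors": "false",
}


def missing_properties(props):
    """Return required properties with no value after defaults are applied."""
    merged = {**DEFAULTS, **props}
    required = list(ALWAYS_REQUIRED)
    if merged["Query Strategy"] == "Query by Vector":
        required.append("Vector Record Path")
        # Sparse values are required only once sparse indices are configured.
        if merged.get("Sparse Vector Indices Path"):
            required.append("Sparse Vector Values Path")
    return [name for name in required if not merged.get(name)]


print(missing_properties(QUERY_PINECONE_PROPERTIES))
```

An empty list means every required property has a value; adding "Sparse Vector Indices Path" without "Sparse Vector Values Path" would report the latter as missing.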

System Resource Considerations

This component does not specify system resource considerations.
