Evaluates the correctness of generated answers in a Retrieval-Augmented Generation (RAG) context by computing metrics such as F1 score, cosine similarity, and answer correctness
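
For readers unfamiliar with two of the metrics named above, here is a minimal sketch of a token-overlap F1 score and a cosine similarity between two vectors. It is illustrative only: the function names, the whitespace tokenization, and the use of plain Python lists as embeddings are assumptions, not a description of how the processor computes its scores.

```python
import math
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a generated answer and a reference answer.

    Assumption: tokens are lowercase, whitespace-separated words.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Undefined for zero vectors; 0.0 is a convention here.
    return dot / (norm_a * norm_b)
```

In practice the vectors passed to `cosine_similarity` would be embeddings of the generated and reference answers, produced by whatever embedding model the flow uses.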

EvaluateRagFaithfulness

Evaluates the faithfulness of generated answers in a Retrieval-Augmented Generation (RAG) system by analyzing responses using an LLM (e.g., OpenAI's GPT)

EvaluateRagRetrieval

Calculates retrieval metrics (Precision@N, Recall@N, FScore@N, MAP@N, MRR) for a RAG system using an LLM as a judge
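
The retrieval metrics listed above have standard definitions that can be sketched in a few lines. The example below is a hedged illustration, not the processor's implementation: relevance is supplied as a known set of document IDs instead of being judged by an LLM, all function names are hypothetical, and conventions (such as the denominator used for average precision) vary between libraries.

```python
def precision_at_n(retrieved: list[str], relevant: set[str], n: int) -> float:
    """Fraction of the top-n retrieved documents that are relevant."""
    if n <= 0:
        return 0.0
    hits = sum(1 for doc in retrieved[:n] if doc in relevant)
    return hits / n


def recall_at_n(retrieved: list[str], relevant: set[str], n: int) -> float:
    """Fraction of all relevant documents found in the top n."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc in retrieved[:n] if doc in relevant)
    return hits / len(relevant)


def fscore_at_n(retrieved: list[str], relevant: set[str], n: int) -> float:
    """Harmonic mean of Precision@N and Recall@N."""
    p = precision_at_n(retrieved, relevant, n)
    r = recall_at_n(retrieved, relevant, n)
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)


def average_precision_at_n(retrieved: list[str], relevant: set[str], n: int) -> float:
    """Average of Precision@k over each rank k <= n holding a relevant document.

    Assumption: normalized by min(len(relevant), n); document IDs are unique.
    """
    if not relevant or n <= 0:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(retrieved[:n], start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / min(len(relevant), n)


def reciprocal_rank(retrieved: list[str], relevant: set[str]) -> float:
    """1 / rank of the first relevant document, or 0.0 if none is retrieved."""
    for rank, doc in enumerate(retrieved, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def mean(values: list[float]) -> float:
    """Mean over queries; turns per-query AP into MAP and RR into MRR."""
    return sum(values) / len(values) if values else 0.0
```

MAP@N and MRR are then the `mean` of `average_precision_at_n` and `reciprocal_rank` taken over all evaluated queries.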

ExecuteSQLStatement

Executes a SQL DDL or DML statement against a database

ExtractDocumentRawText

Extracts the text from a Document and writes it to the FlowFile content

FetchSharepointFile

Fetches the contents of a file from a SharePoint drive, optionally downloading a PDF or HTML version of the file when applicable

FormatWordDocument

Formats a Microsoft Word (.docx) file

GenerateAnswersFromContext

Generates synthetic answers for each question present in the incoming records using a Large Language Model (LLM)

GenerateAnswersFromGroundTruth

Generates synthetic answers for each question in the incoming records using an LLM

GetDBFSFile

Read a DBFS file.

GetHubSpotObject

Get a HubSpot object and its associations by ID or unique value.

GetUnityCatalogFile

Read a Unity Catalog file up to 5 GiB.

GetUnityCatalogFileMetadata

Checks for Unity Catalog file metadata.

ListDBFSDirectory

List file names in a DBFS directory and output a new FlowFile with each file name.

ListUnityCatalogDirectory

List file names in a Unity Catalog directory and output a new FlowFile with each file name.

MergeDocumentElements

Given a FlowFile that contains a full Document and one or more FlowFiles that contain additional data to merge into the Document, this Processor merges the additional data into the Document

OpenAiTranscribeAudio

Transcribes audio into English text

ParsePdfDocument

Parses a PDF file, extracting the text and additional information into a structured JSON document

ParseTableImage

Extracts the text from a table image and writes it to the FlowFile content in CSV format.

PerformOCR

Uses the Datavolo Tesseract OCR Service to extract text from a PDF or image, optionally providing metadata including the bounding box, page number and confidence level of the OCR.

PromptAnthropicAI

Sends a prompt to Anthropic, writing the response either as a FlowFile attribute or to the contents of the incoming FlowFile

PromptAzureOpenAI

Sends a prompt to Azure's OpenAI service, writing the response either as a FlowFile attribute or to the contents of the incoming FlowFile

PromptLLM

Sends a user-defined prompt to a Large Language Model (LLM) for a response.

PromptOllama

Sends a prompt to Ollama, writing the response either as a FlowFile attribute or to the contents of the incoming FlowFile

PromptOpenAI

Sends a prompt to OpenAI, writing the response either as a FlowFile attribute or to the contents of the incoming FlowFile

PromptSnowflakeCortex

Sends a prompt to Snowflake Cortex, writing the response either as a FlowFile attribute or to the contents of the incoming FlowFile

PromptVertexAI

Sends a prompt to Vertex AI, writing the response either as a FlowFile attribute or to the contents of the incoming FlowFile

PublishKafka

Sends the contents of a FlowFile as either a message or as individual records to Apache Kafka using the Kafka Producer API

PutDatabricksSQL

Submit a SQL execution using the Databricks REST API, then write the JSON response to the FlowFile content

PutDBFSFile

Write FlowFile content to DBFS.

PutHubSpot

Upsert a HubSpot object.

PutIcebergTable

Store records in Iceberg using a configurable catalog for managing namespaces and tables.

PutMLflow

Record metadata in MLflow

PutSnowflakeInternalStageFile

Puts files into a Snowflake internal stage

PutUnityCatalogFile

Write FlowFile content, with a maximum size of 5 GiB, to Unity Catalog.

PutVectaraDocument

Generate and upload a JSON document to Vectara's upload endpoint

PutVectaraFile

Upload FlowFile content to Vectara's index endpoint

PutVespaDocument

Uses the Vespa Document API to update a record in a specific namespace.

QueryDocument

Evaluates a SQL-like query against the incoming Datavolo Document JSON, producing the results on the outgoing FlowFile

QueryMilvus

Queries a given collection in a Milvus database using vectors

QueryPinecone

Queries Pinecone for vectors that are similar to the input vector, or retrieves a vector by ID.

RunDatabricksJob

Triggers a pre-defined Databricks job to run with custom parameters

SummarizeText

Uses a Large Language Model (LLM) to summarize the content of a FlowFile

UpsertMilvus

Upserts vectors into a Milvus database for a given collection

UpsertPinecone

Publishes vectors, including metadata, and optionally text, to a Pinecone index.