
EvaluateRagAnswerCorrectness

Description

Evaluates the correctness of generated answers in a Retrieval-Augmented Generation (RAG) context by computing metrics such as F1 score, cosine similarity, and answer correctness. The processor uses an LLM (e.g., OpenAI's GPT) to assess the generated answer against the ground truth.
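The processor's internal scoring code is not shown in this documentation. As a rough sketch, assuming that answer correctness is a weighted average of a token-level F1 score and the cosine similarity of the two embeddings (per the F1 Score Weight and Cosine Similarity Weight properties below), the Python below illustrates the arithmetic. It does not show the LLM-based assessment step, and all function and variable names are illustrative, not the processor's actual code.

    import math

    def token_f1(prediction: str, ground_truth: str) -> float:
        """Token-overlap F1 between a generated answer and the ground truth."""
        pred_tokens = prediction.lower().split()
        truth_tokens = ground_truth.lower().split()
        remaining = list(truth_tokens)
        common = 0
        for tok in pred_tokens:
            if tok in remaining:
                remaining.remove(tok)
                common += 1
        if common == 0:
            return 0.0
        precision = common / len(pred_tokens)
        recall = common / len(truth_tokens)
        return 2 * precision * recall / (precision + recall)

    def cosine_similarity(a: list[float], b: list[float]) -> float:
        """Cosine similarity between the answer and ground-truth embeddings."""
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def answer_correctness(f1: float, cos_sim: float,
                           f1_weight: float = 0.75, cos_weight: float = 0.25) -> float:
        """Weighted combination using the default property values (0.75 / 0.25)."""
        return f1_weight * f1 + cos_weight * cos_sim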

Tags

ai, answer correctness, evaluation, llm, nlp, openai, rag

Properties

In the list below, required properties are shown with an asterisk (*); other properties are optional. Each entry also indicates the default value, the allowable values, and whether the property supports the NiFi Expression Language.

Record Reader *
    API Name: Record Reader
    Allowable Values: Controller Service implementing RecordReaderFactory (implementations: AvroReader, CEFReader, CSVReader, ExcelReader, GrokReader, JsonPathReader, JsonTreeReader, ReaderLookup, ScriptedReader, Syslog5424Reader, SyslogReader, WindowsEventLogReader, XMLReader, YamlTreeReader)
    Description: The Record Reader to use for reading the FlowFile.

Record Writer *
    API Name: Record Writer
    Allowable Values: Controller Service implementing RecordSetWriterFactory (implementations: AvroRecordSetWriter, CSVRecordSetWriter, FreeFormTextRecordSetWriter, JsonRecordSetWriter, RecordSetWriterLookup, ScriptedRecordSetWriter, XMLRecordSetWriter)
    Description: The Record Writer to use for writing the results.

Question Record Path *
    API Name: Question Record Path
    Description: The RecordPath to the question field in the record (an illustrative input record is shown after this property list).
    Supports Expression Language, using FlowFile attributes and Environment variables.

Ground Truth Record Path *
    API Name: Ground Truth Record Path
    Description: The RecordPath to the ground truth field in the record.
    Supports Expression Language, using FlowFile attributes and Environment variables.

Ground Truth Vector Record Path *
    API Name: Ground Truth Vector Record Path
    Default Value: /ground_truth_embedding
    Description: The RecordPath to the ground truth vector field in the record.
    Supports Expression Language, using FlowFile attributes and Environment variables.

Generated Answer Record Path *
    API Name: Generated Answer Record Path
    Default Value: /generated_answer
    Description: The RecordPath to the generated answer field in the record.
    Supports Expression Language, using FlowFile attributes and Environment variables.

Generated Answer Vector Record Path *
    API Name: Generated Answer Vector Record Path
    Default Value: /generated_answer_embedding
    Description: The RecordPath to the generated answer vector field in the record.
    Supports Expression Language, using FlowFile attributes and Environment variables.

Evaluation Results Record Path *
    API Name: Evaluation Results Record Path
    Description: The RecordPath to write the results of the evaluation to.
    Supports Expression Language, using FlowFile attributes and Environment variables.

LLM Provider Service *
    API Name: LLM Provider Service
    Allowable Values: Controller Service implementing LLMService (implementations: StandardAnthropicLLMService, StandardOpenAILLMService)
    Description: The provider service used to send evaluation prompts to the LLM.

F1 Score Weight *
    API Name: F1 Score Weight
    Default Value: 0.75
    Description: The weight to apply to the F1 score when calculating answer correctness (between 0.0 and 1.0).
    Supports Expression Language, using FlowFile attributes and Environment variables.

Cosine Similarity Weight *
    API Name: Cosine Similarity Weight
    Default Value: 0.25
    Description: The weight to apply to the cosine similarity when calculating answer correctness (between 0.0 and 1.0).
    Supports Expression Language, using FlowFile attributes and Environment variables.
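The following is a hypothetical input record that matches the default record paths above. The Question and Ground Truth paths have no defaults, so /question and /ground_truth are shown purely for illustration, and the embedding vectors are truncated for readability (real embeddings typically have hundreds of dimensions).

    {
      "question": "What is the capital of France?",
      "ground_truth": "The capital of France is Paris.",
      "ground_truth_embedding": [0.021, -0.134, 0.087],
      "generated_answer": "Paris is the capital of France.",
      "generated_answer_embedding": [0.019, -0.128, 0.091]
    }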

Dynamic Properties

This component does not support dynamic properties.

Relationships

failure
    FlowFiles that cannot be processed are routed to this relationship.
success
    FlowFiles that are successfully processed are routed to this relationship.

Reads Attributes

This processor does not read attributes.

Writes Attributes

average.answerCorrectness
    The average answer correctness score computed over all records.
average.cosineSim
    The average cosine similarity between the ground truth and answer embeddings.
average.f1Score
    The average F1 score computed over all records.
json.parse.failures
    Number of JSON parse failures encountered.
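Because these values are written as FlowFile attributes, they can drive downstream routing with the NiFi Expression Language. For example, a RouteOnAttribute dynamic property with the value ${average.answerCorrectness:lt(0.7)} could divert FlowFiles whose average correctness falls below a threshold for manual review; the 0.7 threshold is only an example.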

State Management

This component does not store state.

Restricted

This component is not restricted.

Input Requirement

This component requires an incoming relationship.

Example Use Cases

Use Case 1

Use this processor to assess the quality of answers generated by an LLM in comparison to ground truth answers, providing metrics that can be used for monitoring and improving the performance of RAG systems.

Configuration

Configure the processor with the appropriate LLM Provider Service.
Set the Record Reader and Record Writer to read and write records in the desired format.
Specify the Record Paths for the question, ground truth, answer, and their corresponding embeddings.
The processor will read each record, compute the F1 score, cosine similarity, and answer correctness, and write the enriched records to the output.
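As a hypothetical illustration, if the Evaluation Results Record Path were set to /evaluation, an enriched output record might look like the one below. The exact field names written by the processor are not documented here; the names and values shown are assumptions based on the attribute names above and the default weights (0.75 * 0.86 + 0.25 * 0.97 = 0.89, rounded).

    {
      "question": "What is the capital of France?",
      "generated_answer": "Paris is the capital of France.",
      "evaluation": {
        "f1Score": 0.86,
        "cosineSim": 0.97,
        "answerCorrectness": 0.89
      }
    }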

System Resource Considerations

This component does not specify system resource considerations.

See Also