EvaluateRagAnswerCorrectness
Description
Evaluates the correctness of generated answers in a Retrieval-Augmented Generation (RAG) context by computing metrics such as F1 score, cosine similarity, and answer correctness. The processor uses an LLM (e.g., OpenAI's GPT) to assess the generated answer against the ground truth.
Tags
ai, answer correctness, evaluation, llm, nlp, openai, rag
Properties
In the table below, required properties are marked with an asterisk (*); all other properties are optional. The table also indicates any default values and whether a property supports the NiFi Expression Language.
Display Name | API Name | Default Value | Allowable Values | Description |
---|---|---|---|---|
Record Reader * | Record Reader | | Controller Service: RecordReaderFactory. Implementations: AvroReader, CEFReader, CSVReader, ExcelReader, GrokReader, JsonPathReader, JsonTreeReader, ReaderLookup, ScriptedReader, Syslog5424Reader, SyslogReader, WindowsEventLogReader, XMLReader, YamlTreeReader | The Record Reader to use for reading the FlowFile. |
Record Writer * | Record Writer | | Controller Service: RecordSetWriterFactory. Implementations: AvroRecordSetWriter, CSVRecordSetWriter, FreeFormTextRecordSetWriter, JsonRecordSetWriter, RecordSetWriterLookup, ScriptedRecordSetWriter, XMLRecordSetWriter | The Record Writer to use for writing the results. |
Question Record Path * | Question Record Path | | | The RecordPath to the question field in the record. Supports Expression Language, using FlowFile attributes and Environment variables. |
Ground Truth Record Path * | Ground Truth Record Path | | | The RecordPath to the ground truth field in the record. Supports Expression Language, using FlowFile attributes and Environment variables. |
Ground Truth Vector Record Path * | Ground Truth Vector Record Path | /ground_truth_embedding | | The RecordPath to the ground truth vector field in the record. Supports Expression Language, using FlowFile attributes and Environment variables. |
Generated Answer Record Path * | Generated Answer Record Path | /generated_answer | | The RecordPath to the answer field in the record. Supports Expression Language, using FlowFile attributes and Environment variables. |
Generated Answer Vector Record Path * | Generated Answer Vector Record Path | /generated_answer_embedding | | The RecordPath to the answer vector field in the record. Supports Expression Language, using FlowFile attributes and Environment variables. |
Evaluation Results Record Path * | Evaluation Results Record Path | | | The RecordPath to write the evaluation results to. Supports Expression Language, using FlowFile attributes and Environment variables. |
LLM Provider Service * | LLM Provider Service | | Controller Service: LLMService. Implementations: StandardAnthropicLLMService, StandardOpenAILLMService | The provider service used to send evaluation prompts to the LLM. |
F1 Score Weight * | F1 Score Weight | 0.75 | | The weight to apply to the F1 score when calculating answer correctness (between 0.0 and 1.0). Supports Expression Language, using FlowFile attributes and Environment variables. |
Cosine Similarity Weight * | Cosine Similarity Weight | 0.25 | | The weight to apply to the cosine similarity when calculating answer correctness (between 0.0 and 1.0). Supports Expression Language, using FlowFile attributes and Environment variables. |
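The two weight properties suggest that answer correctness is a weighted combination of the F1 score and the cosine similarity. A minimal sketch of that combination, using the property defaults above (the function name and exact formula are illustrative assumptions, not the processor's actual implementation):

```python
def answer_correctness(f1_score: float, cosine_sim: float,
                       f1_weight: float = 0.75,
                       cos_weight: float = 0.25) -> float:
    """Weighted answer-correctness score (assumed formula).

    Defaults mirror the F1 Score Weight and Cosine Similarity Weight
    properties; both inputs are expected to lie in [0.0, 1.0].
    """
    return f1_weight * f1_score + cos_weight * cosine_sim
```

With the defaults, an answer scoring 0.8 on F1 and 0.6 on cosine similarity would receive a correctness of 0.75 * 0.8 + 0.25 * 0.6 = 0.75.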
Dynamic Properties
This component does not support dynamic properties.
Relationships
Name | Description |
---|---|
failure | FlowFiles that cannot be processed are routed to this relationship |
success | FlowFiles that are successfully processed are routed to this relationship |
Reads Attributes
This processor does not read attributes.
Writes Attributes
Name | Description |
---|---|
average.answerCorrectness | The average answer correctness score computed over all records. |
average.cosineSim | The average cosine similarity between the ground truth and answer embeddings. |
average.f1Score | The average F1 score computed over all records. |
json.parse.failures | Number of JSON parse failures encountered. |
State Management
This component does not store state.
Restricted
This component is not restricted.
Input Requirement
This component requires an incoming relationship.
Example Use Cases
Use Case 1
Use this processor to assess the quality of answers generated by an LLM against ground truth answers, producing metrics that can be used to monitor and improve the performance of RAG systems.
Configuration
Configure the processor with the appropriate LLM Provider Service.
Set the Record Reader and Record Writer to read and write records in the desired format.
Specify the Record Paths for the question, ground truth, answer, and their corresponding embeddings.
The processor will read each record, compute the F1 score, cosine similarity, and answer correctness, and write the enriched records to the output.
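The per-record evaluation can be sketched in Python. This illustrative mock uses a token-overlap F1 and standard cosine similarity over the default record-path fields; the processor's actual tokenization, its LLM-based assessment step, and the exact output schema are not documented here and may differ:

```python
import math
from collections import Counter


def token_f1(truth: str, answer: str) -> float:
    """Token-overlap F1 (one plausible definition, assumed here)."""
    t, a = truth.lower().split(), answer.lower().split()
    overlap = sum((Counter(t) & Counter(a)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(a)
    recall = overlap / len(t)
    return 2 * precision * recall / (precision + recall)


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def evaluate(record: dict, f1_weight: float = 0.75,
             cos_weight: float = 0.25) -> dict:
    """Enrich one record with evaluation results (field names follow
    the default record paths; the output layout is an assumption)."""
    f1 = token_f1(record["ground_truth"], record["generated_answer"])
    cos = cosine_similarity(record["ground_truth_embedding"],
                            record["generated_answer_embedding"])
    record["evaluation"] = {
        "f1Score": f1,
        "cosineSim": cos,
        "answerCorrectness": f1_weight * f1 + cos_weight * cos,
    }
    return record
```

In the real flow, the averages of these per-record scores are what the processor writes to the `average.f1Score`, `average.cosineSim`, and `average.answerCorrectness` FlowFile attributes.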
System Resource Considerations
This component does not specify system resource considerations.