OpenAiTranscribeAudio

Description

Transcribes audio into English text. The audio data must be in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm.

Tags

audio, flac, m4a, mp3, mp4, mpeg, mpga, ogg, openai, speech-to-text, text, transcribe, translate, wav, webm

Properties

In the table below, required properties are marked with an asterisk (*). All other properties are optional. The table also lists any default values and indicates whether a property supports the NiFi Expression Language.

Display Name | API Name | Default Value | Allowable Values | Description
OpenAI API Key * | OpenAI API Key | | | The API Key for interacting with OpenAI
Model Name * | Model Name | whisper-1 | | The name of the OpenAI Model to use
Prompt | Prompt | | | Text that can be used to guide the model's style or continue a previous audio segment. The text must be in English. Supports Expression Language, using FlowFile attributes and Environment variables.
Response Format * | Response Format | JSON | JSON, Text, SRT, Verbose JSON, VTT | Specifies which format is desired for the output
Temperature * | Temperature | 0 | | The sampling temperature to use. The value must be a floating-point number between 0.0 and 1.0. A higher value, such as 0.8, will result in more of an interpreted translation, whereas a value of 0.0 will result in a more literal translation.
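
As a rough illustration of how these properties relate to each other, the following Python sketch validates a set of property values and turns them into the form fields that OpenAI's audio API accepts (model, prompt, response_format, temperature). It is not the processor's implementation. The helper names build_form_fields and is_supported_audio are hypothetical, and the mapping from the display names in the table to the API's lowercase values is an assumption.

```python
# Hypothetical helpers for illustration only; they are not part of NiFi
# or of any OpenAI client library.

# Assumed mapping from the "Response Format" display names to the
# lowercase values accepted by OpenAI's audio API.
RESPONSE_FORMATS = {
    "JSON": "json",
    "Text": "text",
    "SRT": "srt",
    "Verbose JSON": "verbose_json",
    "VTT": "vtt",
}

# Formats listed in the Description section.
SUPPORTED_EXTENSIONS = {
    "flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm",
}


def is_supported_audio(filename):
    """Return True if the file extension is one of the listed formats."""
    if "." not in filename:
        return False
    return filename.rsplit(".", 1)[-1].lower() in SUPPORTED_EXTENSIONS


def build_form_fields(model_name="whisper-1", prompt=None,
                      response_format="JSON", temperature=0.0):
    """Validate property values and return them as request form fields."""
    if response_format not in RESPONSE_FORMATS:
        raise ValueError(f"Unknown response format: {response_format!r}")
    temperature = float(temperature)
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("Temperature must be between 0.0 and 1.0")
    fields = {
        "model": model_name,
        "response_format": RESPONSE_FORMATS[response_format],
        "temperature": str(temperature),
    }
    # Prompt is optional, so it is only sent when it has a value.
    if prompt:
        fields["prompt"] = prompt
    return fields
```

The audio file itself would be sent as a separate multipart field alongside these values; that part is omitted here because it requires network access and an API key.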

Dynamic Properties

This component does not support dynamic properties.

Relationships

Name | Description
failure | FlowFiles that could not be transcribed are routed to this relationship.
success | FlowFiles that have been successfully transcribed will be transferred to this relationship.

Reads Attributes

This processor does not read attributes.

Writes Attributes

This processor does not write attributes.

State Management

This component does not store state.

Restricted

This component is not restricted.

Input Requirement

This component requires an incoming relationship.

Example Use Cases Involving Other Components

Multiprocessor Use Case 1

Create embeddings for audio data and insert them into Pinecone so that the audio can be made available to a large language model (LLM) such as OpenAI's GPT models.

Components Involved

  • OpenAiTranscribeAudio
    1. Set "OpenAI API Key" to the API Key for interacting with OpenAI
    2. Set "Model Name" to the name of the OpenAI Model to use, such as whisper-1
    3. Set "Response Format" to Verbose JSON
    4. Connect the 'success' Relationship to ConvertRecord.
  • ConvertRecord
    1. Set "Record Reader" to a JsonTreeReader.
    2. Set "Record Writer" to a FreeFormTextRecordSetWriter.
    3. On the FreeFormTextRecordSetWriter, set the "Text" property to ${text}.
    4. Connect the 'success' Relationship to ParseDocument.
  • ParseDocument
    1. Set "Input Format" to Plain Text.
  • ChunkDocument
    1. Set "Chunking Strategy" to Split by Character.
    2. Set "Chunk Size" to 3000.
    3. Set "Chunk Overlap" to 200.
  • CreateOpenAiEmbeddings
    1. Set "OpenAI API Key" to the API key for your OpenAI account.
    2. Set "Model" to the name of the OpenAI model to use for creating embeddings.
    3. Set "Record Writer" to a Record Writer that writes out data in the desired format, typically JSON.
    4. Set "Web Client Service" to a WebClientService that can be used to make requests to the OpenAI API.
    5. Set "Record Reader" to a JsonTreeReader.
    6. Set "Text Record Path" to /text.
    7. Connect the 'success' Relationship to UpsertPinecone.
  • UpsertPinecone
    1. Set "Pinecone API Key" to the API key for your Pinecone account.
    2. Set "Pinecone Index" to the name of the Pinecone index to publish the vectors to.
    3. Set "Pinecone Namespace" to the namespace to use when publishing the vectors.
    4. Set "Record Reader" to a Record Reader that can read the format of the data produced by the CreateOpenAiEmbeddings processor.
    5. Set "Vector Record Path" to /embeddings.
    6. Set "Metadata Record Path" to /metadata.
    7. Set "Web Client Service" to the same WebClientService that was used in the CreateOpenAiEmbeddings processor.
    8. To include the transcription of the audio in Pinecone, set "Text Record Path" to /text and set "Text Field Name" to text.
    9. Otherwise, leave both "Text Record Path" and "Text Field Name" unset.
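
To make the middle of this flow more concrete, the Python sketch below mimics two of its steps outside NiFi: pulling the text field out of a Verbose JSON transcription (what the ConvertRecord step does with ${text}) and splitting that text into chunks of 3000 characters with a 200-character overlap. It is a simplified fixed-window splitter written for illustration, not the algorithm ChunkDocument actually uses. The function names extract_text and split_by_character are hypothetical.

```python
import json


def extract_text(verbose_json):
    """Return the top-level "text" field of a Verbose JSON transcription."""
    return json.loads(verbose_json)["text"]


def split_by_character(text, chunk_size=3000, chunk_overlap=200):
    """Split text into fixed-size windows that overlap by chunk_overlap.

    Simplified sketch: each chunk starts (chunk_size - chunk_overlap)
    characters after the previous one.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    sample = json.dumps({"text": "word " * 1400, "language": "english"})
    pieces = split_by_character(extract_text(sample))
    print(len(pieces), [len(p) for p in pieces])
```

With these settings, the last 200 characters of each chunk are repeated at the start of the next one, so a sentence that straddles a chunk boundary still appears intact in at least one chunk when it is shorter than the overlap.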

System Resource Considerations

This component does not specify system resource considerations.
