
Overview

LLMTextProcessor aggregates LLMTextFrames into coherent text units before passing them to downstream services, such as TTS. Using text aggregators, it ensures that text is properly segmented and structured, improving the quality of subsequent processing. The processor expects LLMTextFrames as input and outputs AggregatedTextFrames containing the aggregated text.
If an LLMTextProcessor is in use, the text_aggregator parameter of TTS services will be ignored, as text aggregation is handled upstream.
The benefit of pre-aggregating LLM text frames is that it allows for more controlled and meaningful text synthesis, since downstream services can operate on complete sentences or logical text blocks. For TTS services, this means being able to customize how certain kinds of text are spoken (e.g., spelling out phone numbers, stripping out URL protocols, or inserting other TTS-specific annotations) or even skipping certain text segments entirely (e.g., code snippets or markup). For other services, such as RTVI, it allows these logical text units to be sent as separate bot-output messages, supporting custom client-side handling and rendering (e.g., collapsible code blocks, clickable links, etc.).
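For example, the URL-protocol stripping mentioned above could be handled by a custom aggregator. The following is only a sketch, not part of the library: the StripUrlProtocolAggregator name is hypothetical, the import paths are assumptions, and the BaseTextAggregator interface shown here (a text property plus async aggregate(), handle_interruption(), and reset() methods) should be checked against the base class in your installation.

import re
from typing import Optional

from pipecat.processors.aggregators.llm_text_processor import LLMTextProcessor

# Assumed import paths; adjust to match your installation.
from pipecat.utils.text.base_text_aggregator import BaseTextAggregator
from pipecat.utils.text.simple_text_aggregator import SimpleTextAggregator


class StripUrlProtocolAggregator(BaseTextAggregator):
    """Hypothetical aggregator: delegates sentence aggregation to
    SimpleTextAggregator, then strips "http://"/"https://" from each
    completed sentence so the TTS service doesn't speak URL protocols."""

    def __init__(self):
        self._aggregator = SimpleTextAggregator()

    @property
    def text(self) -> str:
        return self._aggregator.text

    async def aggregate(self, text: str) -> Optional[str]:
        sentence = await self._aggregator.aggregate(text)
        if sentence is not None:
            # Drop URL protocols before the text reaches TTS.
            sentence = re.sub(r"https?://", "", sentence)
        return sentence

    async def handle_interruption(self):
        await self._aggregator.handle_interruption()

    async def reset(self):
        await self._aggregator.reset()


llm_text_processor = LLMTextProcessor(text_aggregator=StripUrlProtocolAggregator())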

Constructor Parameters

text_aggregator
BaseTextAggregator
default: None
An instance of a text aggregator (e.g., PatternPairAggregator or a custom aggregator type) used to aggregate incoming text from LLMTextFrames. If None, a SimpleTextAggregator will be used by default, aggregating text based on sentence boundaries.
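For reference, leaving text_aggregator unset is equivalent to passing a SimpleTextAggregator explicitly. The SimpleTextAggregator import path below is an assumption; adjust it to match your installation.

from pipecat.processors.aggregators.llm_text_processor import LLMTextProcessor

# Assumed import path; adjust to match your installation.
from pipecat.utils.text.simple_text_aggregator import SimpleTextAggregator

# Equivalent: with no aggregator supplied, LLMTextProcessor falls back to
# SimpleTextAggregator and emits aggregated text at sentence boundaries.
processor_default = LLMTextProcessor()
processor_explicit = LLMTextProcessor(text_aggregator=SimpleTextAggregator())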

Usage

The LLMTextProcessor should be integrated into your pipeline after the LLM service and before any services that consume text, such as TTS. It processes incoming LLMTextFrames, aggregates their text content, and outputs AggregatedTextFrames.
For more usage examples, check out the docs for the PatternPairAggregator.
from pipecat.processors.aggregators.llm_text_processor import LLMTextProcessor
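# PatternPairAggregator and MatchAction are also needed here; see the
# PatternPairAggregator docs for their import paths.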

...

llm_text_aggregator = PatternPairAggregator()
llm_text_aggregator.add_pattern(
    type="code",
    start_pattern="<code>",
    end_pattern="</code>",
    action=MatchAction.AGGREGATE,
)
llm_text_processor = LLMTextProcessor(text_aggregator=llm_text_aggregator)

...

# Pipeline - The following pipeline is typical for an STT->LLM->TTS bot + RTVI,
#            with the addition of the LLMTextProcessor to handle special text segments.
pipeline = Pipeline(
    [
        transport.input(),
        rtvi,
        stt,
        transcript_processor.user(),
        context_aggregator.user(),
        llm,
        llm_text_processor,
        tts,
        transport.output(),
        transcript_processor.assistant(),
        context_aggregator.assistant(),
    ]
)