Messages and chat history
PydanticAI provides access to messages exchanged during an agent run. These messages can be used both to continue a coherent conversation, and to understand how an agent performed.
Accessing Messages from Results
After running an agent, you can access the messages exchanged during that run from the result object.
Both RunResult (returned by Agent.run, Agent.run_sync) and StreamedRunResult (returned by Agent.run_stream) have the following methods:

all_messages(): returns all messages, including messages from prior runs. There's also a variant that returns JSON bytes, all_messages_json().
new_messages(): returns only the messages from the current run. There's also a variant that returns JSON bytes, new_messages_json().
StreamedRunResult and complete messages
On StreamedRunResult, the messages returned from these methods will only include the final result message once the stream has finished.
E.g. you've awaited one of the following coroutines:
StreamedRunResult.stream()
StreamedRunResult.stream_text()
StreamedRunResult.stream_structured()
StreamedRunResult.get_output()
Note: The final result message will NOT be added to result messages if you use .stream_text(delta=True) since in this case the result content is never built as one string.
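The note above can be illustrated with plain Python (a sketch of the streaming behaviour, not pydantic_ai internals): with the default cumulative mode the last item already is the complete text, while in delta mode no complete string is ever produced by the framework.

```python
# Hypothetical chunks as they might arrive from a model stream.
chunks = ['Did you hear', ' about the toothpaste', ' scandal? They called', ' it Colgate.']

# stream_text() (default): each item is the full text so far, so the final
# item is the complete message and can be recorded in the message history.
cumulative = [''.join(chunks[:i + 1]) for i in range(len(chunks))]
assert cumulative[-1] == 'Did you hear about the toothpaste scandal? They called it Colgate.'

# stream_text(delta=True): each item is only the new text; the complete
# string is never assembled, so no final result message is recorded.
deltas = list(chunks)
assert ''.join(deltas) == cumulative[-1]  # the caller could join them, but the framework does not
```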
Example of accessing methods on a RunResult:
from pydantic_ai import Agent

agent = Agent('openai:gpt-4o', system_prompt='Be a helpful assistant.')

result = agent.run_sync('Tell me a joke.')
print(result.output)
#> Did you hear about the toothpaste scandal? They called it Colgate.

# all messages from the run
print(result.all_messages())
"""
[
    ModelRequest(
        parts=[
            SystemPromptPart(
                content='Be a helpful assistant.',
                timestamp=datetime.datetime(...),
            ),
            UserPromptPart(
                content='Tell me a joke.',
                timestamp=datetime.datetime(...),
            ),
        ]
    ),
    ModelResponse(
        parts=[
            TextPart(
                content='Did you hear about the toothpaste scandal? They called it Colgate.'
            )
        ],
        usage=Usage(requests=1, request_tokens=60, response_tokens=12, total_tokens=72),
        model_name='gpt-4o',
        timestamp=datetime.datetime(...),
    ),
]
"""
(This example is complete, it can be run "as is")
Example of accessing methods on a StreamedRunResult:
from pydantic_ai import Agent

agent = Agent('openai:gpt-4o', system_prompt='Be a helpful assistant.')


async def main():
    async with agent.run_stream('Tell me a joke.') as result:
        # incomplete messages before the stream finishes
        print(result.all_messages())
        """
        [
            ModelRequest(
                parts=[
                    SystemPromptPart(
                        content='Be a helpful assistant.',
                        timestamp=datetime.datetime(...),
                    ),
                    UserPromptPart(
                        content='Tell me a joke.',
                        timestamp=datetime.datetime(...),
                    ),
                ]
            )
        ]
        """

        async for text in result.stream_text():
            print(text)
            #> Did you hear
            #> Did you hear about the toothpaste
            #> Did you hear about the toothpaste scandal? They called
            #> Did you hear about the toothpaste scandal? They called it Colgate.

        # complete messages once the stream finishes
        print(result.all_messages())
        """
        [
            ModelRequest(
                parts=[
                    SystemPromptPart(
                        content='Be a helpful assistant.',
                        timestamp=datetime.datetime(...),
                    ),
                    UserPromptPart(
                        content='Tell me a joke.',
                        timestamp=datetime.datetime(...),
                    ),
                ]
            ),
            ModelResponse(
                parts=[
                    TextPart(
                        content='Did you hear about the toothpaste scandal? They called it Colgate.'
                    )
                ],
                usage=Usage(request_tokens=50, response_tokens=12, total_tokens=62),
                model_name='gpt-4o',
                timestamp=datetime.datetime(...),
            ),
        ]
        """
(This example is complete, it can be run "as is"; you'll need to add asyncio.run(main()) to run main)
Using Messages as Input for Further Agent Runs
The primary use of message histories in PydanticAI is to maintain context across multiple agent runs.
To use existing messages in a run, pass them to the message_history parameter of Agent.run, Agent.run_sync or Agent.run_stream.
If message_history is set and not empty, a new system prompt is not generated — we assume the existing message history includes a system prompt.
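A plain-Python sketch of that rule (illustrative only, not the actual PydanticAI source; the function and tuple representation here are hypothetical): a system prompt part is only generated when the history is empty.

```python
# Hypothetical sketch: a system prompt is only added when no history is passed.
def build_request(user_prompt: str, system_prompt: str, message_history=None):
    parts = []
    if not message_history:  # None or empty list
        parts.append(('system', system_prompt))
    parts.append(('user', user_prompt))
    return parts

# Fresh run: the system prompt is generated.
assert build_request('Hi', 'Be helpful') == [('system', 'Be helpful'), ('user', 'Hi')]
# Continued run: the existing history is assumed to contain the system prompt.
assert build_request('Hi', 'Be helpful', message_history=[('user', 'earlier')]) == [('user', 'Hi')]
```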
Reusing messages in a conversation
from pydantic_ai import Agent

agent = Agent('openai:gpt-4o', system_prompt='Be a helpful assistant.')

result1 = agent.run_sync('Tell me a joke.')
print(result1.output)
#> Did you hear about the toothpaste scandal? They called it Colgate.

result2 = agent.run_sync('Explain?', message_history=result1.new_messages())
print(result2.output)
#> This is an excellent joke invented by Samuel Colvin, it needs no explanation.

print(result2.all_messages())
"""
[
    ModelRequest(
        parts=[
            SystemPromptPart(
                content='Be a helpful assistant.',
                timestamp=datetime.datetime(...),
            ),
            UserPromptPart(
                content='Tell me a joke.',
                timestamp=datetime.datetime(...),
            ),
        ]
    ),
    ModelResponse(
        parts=[
            TextPart(
                content='Did you hear about the toothpaste scandal? They called it Colgate.'
            )
        ],
        usage=Usage(requests=1, request_tokens=60, response_tokens=12, total_tokens=72),
        model_name='gpt-4o',
        timestamp=datetime.datetime(...),
    ),
    ModelRequest(
        parts=[
            UserPromptPart(
                content='Explain?',
                timestamp=datetime.datetime(...),
            )
        ]
    ),
    ModelResponse(
        parts=[
            TextPart(
                content='This is an excellent joke invented by Samuel Colvin, it needs no explanation.'
            )
        ],
        usage=Usage(requests=1, request_tokens=61, response_tokens=26, total_tokens=87),
        model_name='gpt-4o',
        timestamp=datetime.datetime(...),
    ),
]
"""
(This example is complete, it can be run "as is")
Storing and loading messages (to JSON)
While maintaining conversation state in memory is enough for many applications, often you may want to store the message history of an agent run on disk or in a database. This might be for evals, for sharing data between Python and JavaScript/TypeScript, or any number of other use cases.
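Before the PydanticAI-specific tooling below, the basic round trip is ordinary file I/O; a minimal stdlib sketch (hypothetical file name, with plain dicts standing in for serialized messages):

```python
import json
import tempfile
from pathlib import Path

# Any JSON-serializable representation of a message history can be written
# to disk and read back; real message objects need a TypeAdapter (see below).
history = [{'role': 'user', 'content': 'Tell me a joke.'}]

path = Path(tempfile.mkdtemp()) / 'history.json'  # hypothetical storage location
path.write_text(json.dumps(history))

assert json.loads(path.read_text()) == history
```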
The intended way to do this is using a TypeAdapter.
We export ModelMessagesTypeAdapter that can be used for this, or you can create your own.
Here's an example showing how:
Serializing messages to JSON
from pydantic_core import to_jsonable_python

from pydantic_ai import Agent
from pydantic_ai.messages import ModelMessagesTypeAdapter

agent = Agent('openai:gpt-4o', system_prompt='Be a helpful assistant.')

result1 = agent.run_sync('Tell me a joke.')
history_step_1 = result1.all_messages()
as_python_objects = to_jsonable_python(history_step_1)
same_history_as_step_1 = ModelMessagesTypeAdapter.validate_python(as_python_objects)

result2 = agent.run_sync(
    'Tell me a different joke.', message_history=same_history_as_step_1
)
(This example is complete, it can be run "as is")
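The same pattern works if you create your own adapter for a custom type; a minimal sketch using a hypothetical Note dataclass (assumes pydantic v2's TypeAdapter API):

```python
from dataclasses import dataclass

from pydantic import TypeAdapter


@dataclass
class Note:
    """Hypothetical stand-in for a message type."""
    role: str
    content: str


adapter = TypeAdapter(list[Note])

# Dump to JSON bytes, then validate back into dataclass instances.
raw = adapter.dump_json([Note('user', 'Tell me a joke.')])
assert adapter.validate_json(raw) == [Note('user', 'Tell me a joke.')]
```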
Other ways of using messages
Since messages are defined by simple dataclasses, you can manually create and manipulate them, e.g. for testing.
The message format is independent of the model used, so you can use messages in different agents, or the same agent with different models.
In the example below, we reuse the message from the first agent run, which uses the openai:gpt-4o model, in a second agent run using the google-gla:gemini-1.5-pro model.
Reusing messages with a different model
from pydantic_ai import Agent

agent = Agent('openai:gpt-4o', system_prompt='Be a helpful assistant.')

result1 = agent.run_sync('Tell me a joke.')
print(result1.output)
#> Did you hear about the toothpaste scandal? They called it Colgate.

result2 = agent.run_sync(
    'Explain?',
    model='google-gla:gemini-1.5-pro',
    message_history=result1.new_messages(),
)
print(result2.output)
#> This is an excellent joke invented by Samuel Colvin, it needs no explanation.

print(result2.all_messages())
"""
[
    ModelRequest(
        parts=[
            SystemPromptPart(
                content='Be a helpful assistant.',
                timestamp=datetime.datetime(...),
            ),
            UserPromptPart(
                content='Tell me a joke.',
                timestamp=datetime.datetime(...),
            ),
        ]
    ),
    ModelResponse(
        parts=[
            TextPart(
                content='Did you hear about the toothpaste scandal? They called it Colgate.'
            )
        ],
        usage=Usage(requests=1, request_tokens=60, response_tokens=12, total_tokens=72),
        model_name='gpt-4o',
        timestamp=datetime.datetime(...),
    ),
    ModelRequest(
        parts=[
            UserPromptPart(
                content='Explain?',
                timestamp=datetime.datetime(...),
            )
        ]
    ),
    ModelResponse(
        parts=[
            TextPart(
                content='This is an excellent joke invented by Samuel Colvin, it needs no explanation.'
            )
        ],
        usage=Usage(requests=1, request_tokens=61, response_tokens=26, total_tokens=87),
        model_name='gemini-1.5-pro',
        timestamp=datetime.datetime(...),
    ),
]
"""
Processing Message History
Sometimes you may want to modify the message history before it's sent to the model. This could be for privacy reasons (filtering out sensitive information), to save costs on tokens, to give less context to the LLM, or to apply custom processing logic.
PydanticAI provides a history_processors parameter on Agent that allows you to intercept and modify the message history before each model request.
Usage
history_processors is a list of callables that take a list of ModelMessage and return a modified list of the same type.
Each processor is applied in sequence, and processors can be either synchronous or asynchronous.
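As an illustration of "applied in sequence, sync or async" (a sketch, not the actual pydantic_ai implementation), a chain of processors could be driven like this:

```python
import asyncio
import inspect


async def apply_processors(processors, messages):
    """Apply each processor in order; await the ones that are async."""
    for processor in processors:
        result = processor(messages)
        messages = await result if inspect.isawaitable(result) else result
    return messages


def drop_first(msgs):  # synchronous processor
    return msgs[1:]


async def keep_last_two(msgs):  # asynchronous processor
    return msgs[-2:]


# Strings stand in for ModelMessage objects.
out = asyncio.run(apply_processors([drop_first, keep_last_two], ['a', 'b', 'c', 'd']))
assert out == ['c', 'd']
```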
from pydantic_ai import Agent
from pydantic_ai.messages import (
    ModelMessage,
    ModelRequest,
    ModelResponse,
    TextPart,
    UserPromptPart,
)


def filter_responses(messages: list[ModelMessage]) -> list[ModelMessage]:
    """Remove all ModelResponse messages, keeping only ModelRequest messages."""
    return [msg for msg in messages if isinstance(msg, ModelRequest)]


# Create the agent with a history processor
agent = Agent('openai:gpt-4o', history_processors=[filter_responses])

# Example: create some conversation history
message_history = [
    ModelRequest(parts=[UserPromptPart(content='What is 2+2?')]),
    ModelResponse(parts=[TextPart(content='2+2 equals 4')]),  # This will be filtered out
]

# When you run the agent, the history processor will filter out ModelResponse messages
# result = agent.run_sync('What about 3+3?', message_history=message_history)
Keep Only Recent Messages
You can use the history_processor to only keep the recent messages:
from pydantic_ai import Agent
from pydantic_ai.messages import ModelMessage


async def keep_recent_messages(messages: list[ModelMessage]) -> list[ModelMessage]:
    """Keep only the last 5 messages to manage token usage."""
    return messages[-5:] if len(messages) > 5 else messages


agent = Agent('openai:gpt-4o', history_processors=[keep_recent_messages])

# Example: even with a long conversation history, only the last 5 messages are sent to the model
long_conversation_history: list[ModelMessage] = []  # your long conversation history here
# result = agent.run_sync('What did we discuss?', message_history=long_conversation_history)
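A side note on the slice above: Python slicing already returns a short list unchanged, so the explicit length check is defensive rather than required:

```python
# Slicing never raises on short lists; messages[-5:] alone would suffice.
assert ['a', 'b'][-5:] == ['a', 'b']          # fewer than 5 items: unchanged
assert list(range(8))[-5:] == [3, 4, 5, 6, 7]  # more than 5 items: last 5 kept
```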
RunContext parameter
History processors can optionally accept a RunContext parameter to access additional information about the current run, such as dependencies, model information, and usage statistics:
from pydantic_ai import Agent
from pydantic_ai.messages import ModelMessage
from pydantic_ai.tools import RunContext


def context_aware_processor(
    ctx: RunContext[None],
    messages: list[ModelMessage],
) -> list[ModelMessage]:
    # Access current usage
    current_tokens = ctx.usage.total_tokens
    # Filter messages based on context
    if current_tokens > 1000:
        return messages[-3:]  # keep only recent messages when token usage is high
    return messages


agent = Agent('openai:gpt-4o', history_processors=[context_aware_processor])
This allows for more sophisticated message processing based on the current state of the agent run.
Summarize Old Messages
Use an LLM to summarize older messages to preserve context while reducing tokens.
from pydantic_ai import Agent
from pydantic_ai.messages import ModelMessage

# Use a cheaper model to summarize old messages.
summarize_agent = Agent(
    'openai:gpt-4o-mini',
    instructions="""
Summarize this conversation, omitting small talk and unrelated topics.
Focus on the technical discussion and next steps.
""",
)


async def summarize_old_messages(messages: list[ModelMessage]) -> list[ModelMessage]:
    # Summarize the oldest 10 messages
    if len(messages) > 10:
        oldest_messages = messages[:10]
        summary = await summarize_agent.run(message_history=oldest_messages)
        # Return the summary followed by the last message
        return summary.new_messages() + messages[-1:]
    return messages


agent = Agent('openai:gpt-4o', history_processors=[summarize_old_messages])
Testing History Processors
You can test what messages are actually sent to the model provider using FunctionModel:
import pytest

from pydantic_ai import Agent
from pydantic_ai.messages import (
    ModelMessage,
    ModelRequest,
    ModelResponse,
    TextPart,
    UserPromptPart,
)
from pydantic_ai.models.function import AgentInfo, FunctionModel


@pytest.fixture
def received_messages() -> list[ModelMessage]:
    return []


@pytest.fixture
def function_model(received_messages: list[ModelMessage]) -> FunctionModel:
    def capture_model_function(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse:
        # Capture the messages that the provider actually receives
        received_messages.clear()
        received_messages.extend(messages)
        return ModelResponse(parts=[TextPart(content='Provider response')])

    return FunctionModel(capture_model_function)


def test_history_processor(function_model: FunctionModel, received_messages: list[ModelMessage]):
    def filter_responses(messages: list[ModelMessage]) -> list[ModelMessage]:
        return [msg for msg in messages if isinstance(msg, ModelRequest)]

    agent = Agent(function_model, history_processors=[filter_responses])

    message_history = [
        ModelRequest(parts=[UserPromptPart(content='Question 1')]),
        ModelResponse(parts=[TextPart(content='Answer 1')]),
    ]
    agent.run_sync('Question 2', message_history=message_history)

    assert received_messages == [
        ModelRequest(parts=[UserPromptPart(content='Question 1')]),
        ModelRequest(parts=[UserPromptPart(content='Question 2')]),
    ]
Multiple Processors
You can also use multiple processors:
from pydantic_ai import Agent
from pydantic_ai.messages import ModelMessage, ModelRequest


def filter_responses(messages: list[ModelMessage]) -> list[ModelMessage]:
    return [msg for msg in messages if isinstance(msg, ModelRequest)]


def summarize_old_messages(messages: list[ModelMessage]) -> list[ModelMessage]:
    return messages[-5:]


agent = Agent('openai:gpt-4o', history_processors=[filter_responses, summarize_old_messages])
In this case, the filter_responses processor will be applied first, and the summarize_old_messages processor will be applied second.
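A plain-Python illustration (with integers standing in for messages) of why processor order matters:

```python
def keep_even(messages):  # stand-in for a filtering processor
    return [m for m in messages if m % 2 == 0]


def last_two(messages):  # stand-in for a truncating processor
    return messages[-2:]


history = [1, 2, 3, 4, 5]

# Filter first, then truncate: both surviving even messages are kept.
assert last_two(keep_even(history)) == [2, 4]
# Truncate first, then filter: one even message is lost.
assert keep_even(last_two(history)) == [4]
```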
Examples
For a more complete example of using messages in conversations, see the chat app example.