GPTDesk - OpenAI Responses API Implementation Guide

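Overview

GPTDesk is the DeskLLM implementation built on the OpenAI Responses API and provides plain-text generation. The Responses API is OpenAI's newer interface; it supports a dedicated instructions field and structured output.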

Supported Models

Quick Start

Basic Usage

from tfrobot.brain.chain.llms.generation_llms.desk_llm.openai_desk import GPTDesk
from tfrobot.brain.chain.prompt.memo_prompt import MemoPrompt
from tfrobot.schema.message.conversation.message_dto import TextMessage

# Create a GPTDesk instance
desk_llm = GPTDesk(
    name="gpt-4o",
    openai_api_key="your_api_key"
)

# Configure the system prompt
desk_llm.system_prompt = [
    MemoPrompt(template="You are a professional coding assistant who specializes in Python development.")
]

# Call the model
result = desk_llm.complete(
    current_input=TextMessage(content="Write a quicksort function for me")
)
print(result.generations[0].text)

Configuring the API Key

Configure it through an environment variable (recommended):

export OPENAI_API_KEY="your_api_key"

Or pass it in directly:

desk_llm = GPTDesk(
    name="gpt-4o",
    openai_api_key="your_api_key"
)

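Core Parameters

Model Parameters

Parameter	Type	Default	Description
`name`	`str`	`gpt-4-turbo-preview`	Model name
`max_tokens`	`int`	-	Maximum number of tokens to generate (max_output_tokens)
`temperature`	`float`	`1.0`	Sampling temperature (0-2)
`top_p`	`float`	`1.0`	Nucleus sampling parameter
`stop`	`str | list[str]`	-	Stop sequences (up to 4). The Responses API does not support this parameter; see Notes
`stream`	`bool`	`False`	Whether to stream the response
`timeout`	`int`	`180`	Request timeout in seconds
`user`	`str`	-	User ID, used for monitoring and abuse detection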

Network Configuration

Parameter	Type	Default	Description
`openai_api_key`	`str`	Environment variable	OpenAI API key
`proxy_host`	`str`	-	Proxy host
`proxy_port`	`int`	-	Proxy port
`proxy_user`	`str`	-	Proxy username
`proxy_pass`	`str`	-	Proxy password

Special Features

The Instructions Field

The Responses API supports a separate instructions field for directing model behavior:

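from tfrobot.brain.chain.llms.generation_llms.desk_llm.openai_desk import GPTDesk
from tfrobot.brain.chain.prompt.memo_prompt import MemoPrompt
from tfrobot.schema.message.conversation.message_dto import TextMessage

desk_llm = GPTDesk(name="gpt-4o")

# System prompt: defines the role
desk_llm.system_prompt = [
    MemoPrompt(template="You are a coding assistant.")
]

# Instruction prompt: concrete behavioral guidelines
desk_llm.instruction_prompt = [
    MemoPrompt(template="""
Please follow these rules:
1. Code must have type annotations
2. Functions must have docstrings
3. Use clear variable names
""")
]

result = desk_llm.complete(current_input=TextMessage(content="Write a quicksort"))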

Multimodal Support

GPTDesk accepts image input:

from tfrobot.brain.chain.llms.generation_llms.desk_llm.openai_desk import GPTDesk
from tfrobot.schema.message.conversation.message_dto import MultiPartMessage
from tfrobot.schema.message.msg_part import TextPart, ImagePart
from tfrobot.schema.message.image_url import ImgUrl

desk_llm = GPTDesk(name="gpt-4o")

# Multimodal input
msg = MultiPartMessage(content=[
    TextPart(text="Describe the layout of this interface"),
    ImagePart(
        image_url=ImgUrl(url="screenshot.png", detail="high")
    ),
])

result = desk_llm.complete(current_input=msg)
print(result.generations[0].text)
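
The example above passes a bare file name as the image URL. If a local file has to be sent inline, one common approach is to encode it as a base64 data URL. The sketch below shows that encoding step only; `to_data_url` is a hypothetical helper, and whether `ImgUrl` accepts data URLs (or resolves local paths on its own) is an assumption that this guide does not confirm.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_url(path: str) -> str:
    """Encode a local image file as a base64 data URL (hypothetical helper)."""
    # Guess the MIME type from the file extension, e.g. ".png" -> "image/png".
    mime, _ = mimetypes.guess_type(path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"not a recognised image type: {path}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

Under that assumption, the result could be used as `ImgUrl(url=to_data_url("screenshot.png"), detail="high")`.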

Use Cases

Code Generation

desk_llm = GPTDesk(name="gpt-4o")

desk_llm.system_prompt = [
    MemoPrompt(template="You are a Python expert who writes high-quality, documented code.")
]

desk_llm.instruction_prompt = [
    MemoPrompt(template="""
Example:
Input: create a Person class
Output:
class Person:
    def __init__(self, name: str, age: int):
        self.name = name
        self.age = age
""")
]

desk_llm.purpose_prompt = [
    MemoPrompt(template="Create a Student class that inherits from Person.")
]

result = desk_llm.complete(current_input=TextMessage(content="Start generating"))

Image Understanding

desk_llm = GPTDesk(name="gpt-4o")

desk_llm.system_prompt = [
    MemoPrompt(template="You are an image analysis expert.")
]

msg = MultiPartMessage(content=[
    TextPart(text="Describe the content and layout of this image in detail"),
    ImagePart(
        image_url=ImgUrl(url="path/to/image.jpg", detail="high")
    ),
])

result = desk_llm.complete(current_input=msg)

Multi-Round Editing

desk_llm = GPTDesk(name="gpt-4o")

# Initial code
original_code = """
def add(a, b):
    return a + b
"""

desk_llm.original_desk_screenshot_prompt = [
    MemoPrompt(template=f"Original code:\n```python\n{original_code}\n```")
]

# Round 1: add type annotations
desk_llm.purpose_prompt = [
    MemoPrompt(template="Add type annotations.")
]

result1 = desk_llm.complete(current_input=TextMessage(content="Start editing"))
current_code = result1.generations[0].text

# Round 2: add docstrings
desk_llm.current_desk_screenshot_prompt = [
    MemoPrompt(template=f"Current code:\n```python\n{current_code}\n```")
]

desk_llm.intermediate_prompt = [
    MemoPrompt(template="Type annotations have been added.")
]

desk_llm.purpose_prompt = [
    MemoPrompt(template="Add docstrings.")
]

result2 = desk_llm.complete(current_input=TextMessage(content="Continue editing"))

Proxy Configuration

HTTP Proxy

desk_llm = GPTDesk(
    name="gpt-4o",
    proxy_host="127.0.0.1",
    proxy_port=7890
)

Proxy with Authentication

desk_llm = GPTDesk(
    name="gpt-4o",
    proxy_host="proxy.company.com",
    proxy_port=8080,
    proxy_user="username",
    proxy_pass="password"
)
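
HTTP clients usually expect proxy settings as a single URL rather than four separate fields. The sketch below shows one way the `proxy_host`, `proxy_port`, `proxy_user`, and `proxy_pass` values could be combined; `build_proxy_url` is a hypothetical helper, and how GPTDesk consumes these fields internally is not shown in this guide.

```python
from typing import Optional
from urllib.parse import quote


def build_proxy_url(host: str, port: int,
                    user: Optional[str] = None,
                    password: Optional[str] = None,
                    scheme: str = "http") -> str:
    """Combine proxy settings into a single URL (hypothetical helper)."""
    auth = ""
    if user:
        # Percent-encode credentials so '@' or ':' inside them stay unambiguous.
        auth = quote(user, safe="")
        if password:
            auth += ":" + quote(password, safe="")
        auth += "@"
    return f"{scheme}://{auth}{host}:{port}"
```

Percent-encoding matters mainly for passwords that contain reserved characters; plain alphanumeric credentials pass through unchanged.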

Error Handling

Context Too Long

GPTDesk automatically detects context-length errors and raises them as ContextTooLargeError:

from tfrobot.schema.exceptions import ContextTooLargeError

try:
    result = desk_llm.complete(current_input=very_long_input)
except ContextTooLargeError as e:
    print(f"Current size: {e.current_size}")
    print(f"Target size: {e.target_size}")
    print(f"Model: {e.model_name}")
    # Context compression is handled automatically by the Chain layer

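Retry Mechanism

GPTDesk uses tenacity to retry automatically:

Retry conditions: APITimeoutError, APIConnectionError
Retry attempts: 3
Retry strategy: exponential backoff (4s, 8s, 10s)

import logging

# Configure logging so that retries are recorded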
logging.basicConfig(level=logging.WARNING)
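
To make the backoff schedule concrete, here is a minimal standard-library sketch of capped exponential backoff. It is not GPTDesk's code and does not use tenacity; the base of 4 seconds and cap of 10 seconds are chosen to reproduce the 4s, 8s, 10s sequence listed above, and reading "3" as three retries after the first call is an assumption.

```python
import time
from typing import Callable, List, Tuple, Type, TypeVar

T = TypeVar("T")


def backoff_delays(retries: int, base: float = 4.0, cap: float = 10.0) -> List[float]:
    """Delay before each retry: base * 2**n, capped at `cap`."""
    return [min(base * 2 ** n, cap) for n in range(retries)]


def call_with_retry(fn: Callable[[], T],
                    retry_on: Tuple[Type[BaseException], ...],
                    retries: int = 3,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `fn`, retrying on the listed exception types with capped backoff."""
    delays = backoff_delays(retries)
    for attempt in range(retries + 1):
        try:
            return fn()
        except retry_on:
            if attempt == retries:
                raise  # retries exhausted: surface the last error
            sleep(delays[attempt])
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper testable without real waiting.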

Advanced Usage

Using a User ID

# Give each user a unique ID to support monitoring and abuse detection
desk_llm = GPTDesk(
    name="gpt-4o",
    user="user_12345"
)

result = desk_llm.complete(current_input=user_input)

Controlling the Output Format

# Use prefix_prompt to force the output format
desk_llm = GPTDesk(name="gpt-4o")

desk_llm.purpose_prompt = [
    MemoPrompt(template="Generate a JSON object with name and age fields.")
]

desk_llm.prefix_prompt = [
    MemoPrompt(template="{")
]

result = desk_llm.complete(current_input=TextMessage(content="Start"))
# The output will begin with {
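
When the prefix is used to obtain JSON, the generated text still has to be parsed. The sketch below is a defensive parser: it re-attaches the prefix if the text turns out to contain only the continuation. `parse_prefixed_json` is a hypothetical helper, and the continuation-only case is an assumption, since the comment above states that the output begins with `{`.

```python
import json
from typing import Any


def parse_prefixed_json(text: str, prefix: str = "{") -> Any:
    """Parse model output as JSON, re-attaching `prefix` if it is missing."""
    body = text.strip()
    if not body.startswith(prefix):
        body = prefix + body
    # json.loads raises json.JSONDecodeError if the text is still not valid JSON.
    return json.loads(body)
```

It could be applied as `parse_prefixed_json(result.generations[0].text)`.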

Asynchronous Calls

import asyncio

async def generate():
    desk_llm = GPTDesk(name="gpt-4o")
    result = await desk_llm.async_complete(
        current_input=TextMessage(content="Generate code")
    )
    return result.generations[0].text

result = asyncio.run(generate())

Best Practices

1. Choose the Right Model

# Simple tasks: use the mini model (fast and economical)
desk_llm = GPTDesk(name="gpt-4o-mini")

# Complex tasks: use GPT-4o
desk_llm = GPTDesk(name="gpt-4o")

# Code tasks: use GPT-4o (multimodal capability)
desk_llm = GPTDesk(name="gpt-4o")

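2. Use Instructions Sensibly

# System prompt: defines the role
desk_llm.system_prompt = [
    MemoPrompt(template="You are a Python developer.")
]

# Instruction prompt: concrete behavior
desk_llm.instruction_prompt = [
    MemoPrompt(template="""
Please follow these conventions:
- Use type annotations
- Write docstrings
- Follow PEP 8
""")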
]

3. Take Advantage of Multimodal Input

# Image understanding
desk_llm = GPTDesk(name="gpt-4o")

msg = MultiPartMessage(content=[
    TextPart(text="Analyze this interface"),
    ImagePart(image_url=ImgUrl(url="screenshot.png", detail="high"))
])

result = desk_llm.complete(current_input=msg)

4. Set the Temperature Appropriately

# Deterministic output (code generation)
desk_llm = GPTDesk(
    name="gpt-4o",
    temperature=0.0
)

# Creative output (copywriting)
desk_llm = GPTDesk(
    name="gpt-4o",
    temperature=1.2
)

Performance Optimization

1. Reduce Latency

# Use GPT-4o-mini to reduce latency
desk_llm = GPTDesk(
    name="gpt-4o-mini",
    temperature=0.0
)

2. Control Token Consumption

# Limit the output length
desk_llm = GPTDesk(
    name="gpt-4o",
    max_tokens=512
)

3. Asynchronous Calls

# Use asynchronous calls to increase throughput
async def batch_generate():
    desk_llm = GPTDesk(name="gpt-4o")
    tasks = [
        desk_llm.async_complete(current_input=TextMessage(content=f"Task {i}"))
        for i in range(10)
    ]
    results = await asyncio.gather(*tasks)
    return results
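
Launching every request at once can run into rate limits. A common refinement is to cap the number of in-flight calls with a semaphore. The sketch below uses only asyncio; `gather_limited` is a hypothetical helper, and the default limit of 3 is an arbitrary illustration rather than a recommended value.

```python
import asyncio
from typing import Awaitable, Callable, List, TypeVar

T = TypeVar("T")


async def gather_limited(factories: List[Callable[[], Awaitable[T]]],
                         limit: int = 3) -> List[T]:
    """Run coroutine factories concurrently, at most `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def run(factory: Callable[[], Awaitable[T]]) -> T:
        # The coroutine is only created once a slot is free.
        async with semaphore:
            return await factory()

    # gather returns results in input order, regardless of completion order.
    return await asyncio.gather(*(run(f) for f in factories))
```

Each factory would wrap one call, for example `lambda: desk_llm.async_complete(current_input=msg)`.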

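Differences from GPTGenDesk

GPTDesk uses the Responses API, while GPTGenDesk uses the Completions API:

Feature	GPTDesk	GPTGenDesk
API type	Responses API	Completions API
Instructions	✅ Supported	❌ Not supported
Stop sequences	⚠️ Not supported	✅ Supported (up to 4)
Model support	Latest models	Older models
Structured output	✅ Supported	❌ Not supported
Multimodal	✅ Supported	⚠️ Limited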

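Notes

1. Stop Sequences Are Not Supported

The Responses API does not support the stop parameter. If you need stop sequences, describe the stopping condition in the prompt instead:

# ❌ Not supported
desk_llm = GPTDesk(stop=["```"])

# ✅ Use prompts
desk_llm.prefix_prompt = [
    MemoPrompt(template="```python\n")
]

desk_llm.purpose_prompt = [
    MemoPrompt(template="Stop generating once the code block ends.")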
]
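
A prompt can only ask the model to stop; it cannot guarantee it. As a complement, the returned text can be truncated on the client side. The sketch below cuts the text at the earliest stop sequence; `truncate_at_stop` is a hypothetical helper and not part of GPTDesk.

```python
from typing import Iterable


def truncate_at_stop(text: str, stops: Iterable[str]) -> str:
    """Cut `text` at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stops:
        if not stop:
            continue  # an empty string would match at index 0
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]
```

Unlike a server-side stop parameter, this does not save output tokens; it only cleans up the text after generation.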

2. Token Counting

If the API does not return token usage, GPTDesk computes it automatically with tiktoken:

result = desk_llm.complete(current_input=user_input)

# Token usage
print(result.usage)
# {
#     "prompt_tokens": 100,
#     "completion_tokens": 50,
#     "total_tokens": 150,
#     "prompt_cost": 0.001,
#     "completion_cost": 0.002,
#     "total_cost": 0.003
# }
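
When several calls belong to one task, their usage can be added up to track the overall cost. The sketch below assumes each usage value behaves like a dict with the six keys printed above; `sum_usage` and `USAGE_KEYS` are hypothetical names introduced for illustration.

```python
from typing import Dict, Iterable

USAGE_KEYS = ("prompt_tokens", "completion_tokens", "total_tokens",
              "prompt_cost", "completion_cost", "total_cost")


def sum_usage(usages: Iterable[Dict[str, float]]) -> Dict[str, float]:
    """Add up usage dicts field by field; missing fields count as zero."""
    totals: Dict[str, float] = {key: 0 for key in USAGE_KEYS}
    for usage in usages:
        for key in USAGE_KEYS:
            totals[key] += usage.get(key, 0)
    return totals
```

For the multi-round editing example, this could be called as `sum_usage([result1.usage, result2.usage])`, under the same dict assumption.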

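Supported models and their context windows:

Model name	Context window	Characteristics
`gpt-4o`	128K	Strongest multimodal capability
`gpt-4o-mini`	128K	Fast and economical
`gpt-4-turbo`	128K	High performance
`gpt-3.5-turbo`	16K	Lightweight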