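TencentDSDesk - Tencent Hunyuan Implementation Guide

Overview

TencentDSDesk is a DeskLLM implementation built on the Tencent Hunyuan API. It provides plain-text generation. Tencent Hunyuan is Tencent's in-house, full-stack large language model, with support for multi-turn conversation, tool calling, and related features.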

Model support

The supported models are listed in the model table near the end of this guide.

Quick start

Basic usage

from tfrobot.brain.chain.llms.generation_llms.desk_llm.tencent_ds_desk import TencentDSDesk
from tfrobot.brain.chain.prompt.memo_prompt import MemoPrompt
from tfrobot.schema.message.conversation.message_dto import TextMessage

# Create a TencentDSDesk instance
desk_llm = TencentDSDesk(
    name="hunyuan-pro",
    api_key="your_api_key"
)

# Configure the system prompt
desk_llm.system_prompt = [
    MemoPrompt(template="You are a professional coding assistant who specializes in Python development.")
]

# Run a completion
result = desk_llm.complete(
    current_input=TextMessage(content="Write a quicksort function for me")
)
print(result.generations[0].text)

Configuring the API key

Configure it through an environment variable (recommended):

export TENCENT_API_KEY="your_api_key"

Or pass it in directly:

desk_llm = TencentDSDesk(
    name="hunyuan-pro",
    api_key="your_api_key"
)
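
The two options above can be combined in application code. The following is a minimal sketch, assuming the variable name TENCENT_API_KEY shown above; resolve_api_key is a hypothetical helper and not part of tfrobot.

```python
import os


def resolve_api_key(explicit_key=None, env_var="TENCENT_API_KEY"):
    """Return the explicit key if given, otherwise fall back to the environment."""
    key = explicit_key or os.environ.get(env_var)
    if not key:
        # Fail early with a clear message instead of sending an unauthenticated request.
        raise ValueError(f"No API key passed and {env_var} is not set")
    return key
```

The result can then be passed as the documented api_key argument, for example TencentDSDesk(name="hunyuan-pro", api_key=resolve_api_key()).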

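Core parameters

Model parameters

Parameter	Type	Default	Description
`name`	`str`	`deepseek-chat`	Model name
`max_tokens`	`int`	-	Maximum number of tokens to generate
`temperature`	`float`	`1.0`	Sampling temperature (0 to 2)
`top_p`	`float`	`1.0`	Nucleus sampling parameter
`stop`	`str | list[str]`	-	Stop sequence(s)
`stream`	`bool`	`False`	Whether to stream the output
`timeout`	`int`	`300`	Request timeout in seconds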

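Advanced sampling parameters

Parameter	Type	Default	Description
`frequency_penalty`	`float`	`0`	Frequency penalty (-2.0 to 2.0)
`presence_penalty`	`float`	`0`	Presence penalty (-2.0 to 2.0)
`logit_bias`	`dict`	-	Logit bias
`n`	`int`	`1`	Number of generations
`seed`	`int`	-	Seed (for deterministic sampling)
`tool_choice`	`str | dict`	-	Tool selection strategy
`tool_params_strict`	`bool`	-	Whether to strictly constrain tool-call parameters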

Network configuration

Parameter	Type	Default	Description
`api_key`	`str`	Environment variable	Tencent API key
`base_url`	`str`	`https://api.lkeap.cloud.tencent.com/v1`	API base URL
`proxy_host`	`str`	-	Proxy host
`proxy_port`	`int`	-	Proxy port
`proxy_user`	`str`	-	Proxy username
`proxy_pass`	`str`	-	Proxy password

Usage scenarios

Code generation

desk_llm = TencentDSDesk(name="hunyuan-pro")

desk_llm.system_prompt = [
    MemoPrompt(template="You are a Python expert who writes high-quality, documented code.")
]

desk_llm.purpose_prompt = [
    MemoPrompt(template="Create a Person dataclass with name and age attributes.")
]

desk_llm.prefix_prompt = [
    MemoPrompt(template="```python\nfrom dataclasses import dataclass\n\n")
]

result = desk_llm.complete(current_input=TextMessage(content="Start generating"))
print(result.generations[0].text)

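Chinese content generation

Tencent Hunyuan performs well on Chinese-language tasks:

desk_llm = TencentDSDesk(name="hunyuan-pro")

# The prompts stay in Chinese because this example targets Chinese output.
desk_llm.system_prompt = [
    # "You are a professional copywriting assistant."
    MemoPrompt(template="你是一个专业的文案撰写助手。")
]

desk_llm.purpose_prompt = [
    # "Write promotional copy for a smart-home product, highlighting convenience and intelligence."
    MemoPrompt(template="为一款智能家居产品撰写一段宣传文案，突出便捷性和智能化。")
]

# "Start writing"
result = desk_llm.complete(current_input=TextMessage(content="开始撰写"))
print(result.generations[0].text)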

Streaming output

desk_llm = TencentDSDesk(
    name="hunyuan-pro",
    stream=True
)

result = desk_llm.complete(current_input=TextMessage(content="Write a story"))
# The content is generated incrementally

Proxy configuration

HTTP proxy

desk_llm = TencentDSDesk(
    name="hunyuan-pro",
    proxy_host="127.0.0.1",
    proxy_port=7890
)

Proxy with authentication

desk_llm = TencentDSDesk(
    name="hunyuan-pro",
    proxy_host="proxy.company.com",
    proxy_port=8080,
    proxy_user="username",
    proxy_pass="password"
)
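
This guide does not state how TencentDSDesk combines the four proxy fields internally. As an illustration of what they conventionally describe, the sketch below composes them into a single proxy URL. build_proxy_url is a hypothetical helper, and the percent-encoding of credentials is an assumption about what a robust client would do.

```python
from urllib.parse import quote


def build_proxy_url(host, port, user=None, password=None, scheme="http"):
    """Compose a proxy URL, percent-encoding credentials so ':' and '@' stay unambiguous."""
    if user is None:
        return f"{scheme}://{host}:{port}"
    auth = quote(user, safe="")
    if password is not None:
        auth += ":" + quote(password, safe="")
    return f"{scheme}://{auth}@{host}:{port}"


print(build_proxy_url("127.0.0.1", 7890))
print(build_proxy_url("proxy.company.com", 8080, "username", "p@ss:word"))
```

Encoding matters when a password contains characters such as @ or :, which would otherwise be read as URL delimiters.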

Custom API endpoint

# Use a third-party service that is compatible with the OpenAI API
desk_llm = TencentDSDesk(
    name="hunyuan-pro",
    base_url="https://custom-api.example.com/v1",
    api_key="custom_key"
)

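Error handling

Context too long

TencentDSDesk automatically detects context-length errors and raises them as ContextTooLargeError:

from tfrobot.schema.exceptions import ContextTooLargeError

try:
    result = desk_llm.complete(current_input=very_long_input)
except ContextTooLargeError as e:
    print(f"Error: {e.message}")
    print(f"Model: {e.model_name}")
    # The Tencent API usually does not return the exact token counts
    # The Chain layer handles context compression automatically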
Recognized error codes

TencentDSDesk recognizes the following error signatures:

# Error code 20059: context too long
# Error message: input length too long
# Error message contains: context_length_exceeded

try:
    result = desk_llm.complete(current_input=long_input)
except ContextTooLargeError as e:
    if "20059" in str(e) or "input length too long" in str(e).lower():
        print("Detected a Tencent Hunyuan context-too-long error")
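
The three signatures listed above can be checked with a single helper. This is a minimal sketch of plain substring matching, not the library's actual detection logic; looks_like_context_overflow and the marker tuple are hypothetical names.

```python
CONTEXT_TOO_LARGE_MARKERS = (
    "20059",
    "input length too long",
    "context_length_exceeded",
)


def looks_like_context_overflow(error_text):
    """Return True if the text contains any of the documented overflow markers."""
    lowered = error_text.lower()
    return any(marker in lowered for marker in CONTEXT_TOO_LARGE_MARKERS)


print(looks_like_context_overflow("error_code=20059"))
print(looks_like_context_overflow("rate limit exceeded"))
```

Substring matching is deliberately loose. A bare "20059" could also appear in an unrelated identifier, so a stricter check may be preferable when the error object exposes a structured code.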

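Retry mechanism

TencentDSDesk uses tenacity for automatic retries:

Retry conditions: APITimeoutError, APIConnectionError
Retry attempts: 3
Retry strategy: exponential backoff (4s, 8s, 10s)

import logging

# Configure logging so that retries are recorded
logging.basicConfig(level=logging.WARNING)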
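
The backoff schedule listed above can be reproduced with a doubling delay that is capped at a maximum. The sketch below is plain Python for illustration; the base of 4 seconds and the cap of 10 seconds are inferred from the listed delays, and the library's actual tenacity configuration is not shown in this guide.

```python
def backoff_delays(attempts, base=4.0, cap=10.0):
    """Delay before each retry: base * 2**i, capped at `cap` seconds."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]


print(backoff_delays(3))  # [4.0, 8.0, 10.0]
```

The third delay would be 16 seconds without the cap, which is why the listed sequence ends at 10.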

Advanced usage

Multi-turn editing

desk_llm = TencentDSDesk(name="hunyuan-pro")

# Initial code
original_code = """
def add(a, b):
    return a + b
"""

desk_llm.original_desk_screenshot_prompt = [
    MemoPrompt(template=f"Original code:\n```python\n{original_code}\n```")
]

# Round 1: add type annotations
desk_llm.purpose_prompt = [
    MemoPrompt(template="Add type annotations.")
]

result1 = desk_llm.complete(current_input=TextMessage(content="Start editing"))
current_code = result1.generations[0].text

# Round 2: add a docstring
desk_llm.current_desk_screenshot_prompt = [
    MemoPrompt(template=f"Current code:\n```python\n{current_code}\n```")
]

desk_llm.intermediate_prompt = [
    MemoPrompt(template="Type annotations have been added.")
]

desk_llm.purpose_prompt = [
    MemoPrompt(template="Add a docstring.")
]

result2 = desk_llm.complete(current_input=TextMessage(content="Continue editing"))
print(result2.generations[0].text)

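Using sampling parameters

# Frequency penalty: lowers the probability of repeated content
desk_llm = TencentDSDesk(
    name="hunyuan-pro",
    frequency_penalty=0.5  # positive values penalize tokens that appear often
)

# Presence penalty: encourages new topics
desk_llm = TencentDSDesk(
    name="hunyuan-pro",
    presence_penalty=0.5  # positive values encourage moving on to new topics
)

# Seed: reproducible sampling
desk_llm = TencentDSDesk(
    name="hunyuan-pro",
    seed=42  # the same input and seed should generally produce the same output
)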
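
To make the two penalties concrete, the sketch below applies the formulation commonly documented for OpenAI-compatible APIs: each already-generated token loses count * frequency_penalty plus a flat presence_penalty from its logit. Whether the Tencent backend applies exactly this formula is an assumption, and apply_penalties is a hypothetical helper.

```python
from collections import Counter


def apply_penalties(logits, generated_ids, frequency_penalty=0.0, presence_penalty=0.0):
    """Return a copy of `logits` with penalties subtracted for already-generated tokens."""
    counts = Counter(generated_ids)
    adjusted = dict(logits)
    for token_id, count in counts.items():
        if token_id in adjusted:
            # The frequency term grows with every repeat; the presence term is a flat cost.
            adjusted[token_id] -= count * frequency_penalty + presence_penalty
    return adjusted


logits = {1: 2.0, 2: 1.0, 3: 0.5}
print(apply_penalties(logits, [1, 1, 2], frequency_penalty=0.5, presence_penalty=0.5))
```

Token 1 appeared twice and is penalized more heavily than token 2, while token 3 was never generated and keeps its original logit.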

Logit bias

# Adjust the generation probability of specific tokens
desk_llm = TencentDSDesk(
    name="hunyuan-pro",
    logit_bias={
        "1234": -10,  # suppress token ID 1234
        "5678": 10    # promote token ID 5678
    }
)
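
A bias is added to a token's logit before the softmax, which is why values such as -10 and 10 have such a strong effect. The sketch below illustrates this on a toy three-token vocabulary; the vocabulary, the equal starting logits, and softmax_with_bias are all illustrative assumptions rather than library behavior.

```python
import math


def softmax_with_bias(logits, logit_bias):
    """Add per-token biases to the logits, then normalize with softmax."""
    biased = {tok: value + logit_bias.get(tok, 0.0) for tok, value in logits.items()}
    peak = max(biased.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(value - peak) for tok, value in biased.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


probs = softmax_with_bias(
    {"1234": 1.0, "5678": 1.0, "9": 1.0},
    {"1234": -10, "5678": 10},
)
print(probs)
```

With equal starting logits, the promoted token takes almost all of the probability mass and the suppressed token is left with a negligible share.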

Async calls

import asyncio

async def generate():
    desk_llm = TencentDSDesk(name="hunyuan-pro")
    result = await desk_llm.async_complete(
        current_input=TextMessage(content="Generate code")
    )
    return result.generations[0].text

result = asyncio.run(generate())

Streaming output

desk_llm = TencentDSDesk(
    name="hunyuan-pro",
    stream=True
)

# With streaming enabled, the content is generated incrementally
result = desk_llm.complete(current_input=TextMessage(content="Write a story"))
print(result.generations[0].text)

Best practices

1. Choose an appropriate model

# Pro: strongest reasoning capability
desk_llm = TencentDSDesk(name="hunyuan-pro")

# Standard: balances performance and cost
desk_llm = TencentDSDesk(name="hunyuan-standard")

# Lite: fast responses
desk_llm = TencentDSDesk(name="hunyuan-lite")

2. Set the temperature sensibly

# Deterministic output (code generation)
desk_llm = TencentDSDesk(
    name="hunyuan-pro",
    temperature=0.0
)

# Creative output (copywriting)
desk_llm = TencentDSDesk(
    name="hunyuan-pro",
    temperature=1.2
)
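
The effect of the two settings above can be seen by dividing the logits by the temperature before the softmax. This is a minimal sketch of the standard technique on made-up logits; treating a temperature of 0 as greedy argmax is a common convention and an assumption about the backend, and temperature_probs is a hypothetical helper.

```python
import math


def temperature_probs(logits, temperature):
    """Softmax over logits / temperature; a temperature of 0 is treated as greedy argmax."""
    if temperature == 0:
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]
print(temperature_probs(logits, 0.0))
print(temperature_probs(logits, 1.2))
```

A lower temperature concentrates probability on the top token, while a higher one spreads it across alternatives, which is why low values suit code and higher values suit creative text.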

3. Use the penalty parameters

# Reduce repeated content
desk_llm = TencentDSDesk(
    name="hunyuan-pro",
    frequency_penalty=0.5,  # lowers the probability of repeated content
    presence_penalty=0.5     # encourages new topics
)

4. Use a seed for deterministic output

# The same input and seed should generally produce the same output
desk_llm = TencentDSDesk(
    name="hunyuan-pro",
    temperature=0.0,
    seed=42  # fixed seed
)

Performance optimization

1. Reduce prompt length

# Avoid long-winded prompts
desk_llm.system_prompt = [
    MemoPrompt(template="You are a Python expert.")  # short and clear
]

2. Use suitable sampling parameters

# Use a low temperature for code generation
desk_llm = TencentDSDesk(
    name="hunyuan-pro",
    temperature=0.0,
    top_p=0.95
)

3. Async calls

import asyncio

# Use async calls to increase throughput
async def batch_generate():
    desk_llm = TencentDSDesk(name="hunyuan-pro")
    tasks = [
        desk_llm.async_complete(current_input=TextMessage(content=f"Task {i}"))
        for i in range(10)
    ]
    results = await asyncio.gather(*tasks)
    return results
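
Launching every request at once can run into rate limits or connection caps, although this guide does not state what limits apply. The sketch below bounds the number of in-flight calls with asyncio.Semaphore. gather_bounded is a hypothetical helper, and fake_complete stands in for desk_llm.async_complete so that the example runs without network access.

```python
import asyncio


async def gather_bounded(coroutine_factories, limit=3):
    """Run the coroutines with at most `limit` in flight; results keep input order."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(factory):
        async with semaphore:
            return await factory()

    return await asyncio.gather(*(run_one(f) for f in coroutine_factories))


async def fake_complete(i):
    # Stand-in for desk_llm.async_complete (assumption: any awaitable call works here).
    await asyncio.sleep(0.01)
    return f"result {i}"


results = asyncio.run(
    gather_bounded([lambda i=i: fake_complete(i) for i in range(10)])
)
print(results)
```

Passing factories instead of already-created coroutines means each call is only started once a semaphore slot is free.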

Token counting

TencentDSDesk uses tiktoken to approximate token usage:

result = desk_llm.complete(current_input=user_input)

# Token usage
print(result.usage)
# {
#     "prompt_tokens": 100,
#     "completion_tokens": 50,
#     "total_tokens": 150,
#     "prompt_cost": 0.001,
#     "completion_cost": 0.002,
#     "total_cost": 0.003
# }
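
When several calls are made, the per-call figures can be added up to track total spend. The sketch below assumes that the usage value is dict-like with the keys shown above, which the printed output suggests but this guide does not confirm; sum_usage is a hypothetical helper.

```python
USAGE_KEYS = (
    "prompt_tokens", "completion_tokens", "total_tokens",
    "prompt_cost", "completion_cost", "total_cost",
)


def sum_usage(usages):
    """Add up a list of usage dicts key by key; missing keys count as zero."""
    return {key: sum(u.get(key, 0) for u in usages) for key in USAGE_KEYS}


one_call = {
    "prompt_tokens": 100, "completion_tokens": 50, "total_tokens": 150,
    "prompt_cost": 0.001, "completion_cost": 0.002, "total_cost": 0.003,
}
totals = sum_usage([one_call, one_call])
print(totals["total_tokens"])  # 300
```

Because the token counts are approximations, the summed costs are estimates rather than billing figures.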

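Differences from other implementations

Feature	TencentDSDesk	DeepSeekDesk	ClaudeDesk
Chinese capability	Strong	Strong	Medium
Code capability	Medium	Strong	Strong
API compatibility	OpenAI	OpenAI	Anthropic
Context window	128K	128K	200K
Streaming output	✅	✅	❌
Sampling parameters	Rich	Rich	Rich
Price	Medium	Low	High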

Notes

1. Incomplete token information

The Tencent Hunyuan API usually does not return the current token count (current_size) or the maximum limit (target_size). It provides only an error code and a message:

# Error code: 20059
# Error message: input length too long

try:
    result = desk_llm.complete(current_input=long_input)
except ContextTooLargeError as e:
    # e.current_size and e.target_size are usually None
    print(f"Model: {e.model_name}")
    print(f"Error: {e.message}")

2. The API format may change

Tencent may change its error format, so TencentDSDesk tries to recognize several formats:

# Supported error formats:
# 1. error_code=20059
# 2. input length too long
# 3. context_length_exceeded

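3. Streaming output

TencentDSDesk supports streaming output. As the example below shows, the streamed chunks are merged for you, so the result is read the same way as for a non-streaming call:

desk_llm = TencentDSDesk(
    name="hunyuan-pro",
    stream=True
)

result = desk_llm.complete(current_input=user_input)
# TencentDSDesk merges the streamed chunks automatically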
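
For intuition, merging streamed chunks amounts to concatenating the text deltas in arrival order. The sketch below is illustrative only: merge_chunks is a hypothetical helper, and it assumes that each chunk has already been reduced to its text delta, which may be empty or missing.

```python
def merge_chunks(chunks):
    """Concatenate the text deltas of a stream, skipping empty or missing pieces."""
    return "".join(chunk for chunk in chunks if chunk)


print(merge_chunks(["Once ", None, "upon ", "", "a time"]))  # Once upon a time
```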

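Recommended models

Model name	Context window	Characteristics
`hunyuan-pro`	128K	Pro, strongest reasoning capability
`hunyuan-standard`	128K	Standard, balances performance and cost
`hunyuan-lite`	128K	Lite, fast responses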
