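Context Cache Chat Completion

1. Overview

Call this endpoint to send a chat request to the model with a context cache attached. Before calling it, create a context cache through the context cache creation endpoint to obtain the cache ID, then pass that ID in the `context_id` field of this request to reference the cache.

Supported models: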

Doubao-1.5-pro-32k
Doubao-1.5-lite-32k
Doubao-pro-32k
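
A request naming a model outside the list above may be rejected, so a client can check the name before sending anything. The following is a minimal sketch of such a check; `isSupportedModel` is a hypothetical helper, and the exact-match, case-sensitive comparison is an assumption rather than documented gateway behavior.

```go
package main

import "fmt"

// supportedModels mirrors the model list in the overview above.
var supportedModels = map[string]bool{
	"Doubao-1.5-pro-32k":  true,
	"Doubao-1.5-lite-32k": true,
	"Doubao-pro-32k":      true,
}

// isSupportedModel reports whether name appears in the overview's model list.
// Hypothetical helper: exact, case-sensitive matching is an assumption.
func isSupportedModel(name string) bool {
	return supportedModels[name]
}

func main() {
	for _, m := range []string{"Doubao-1.5-pro-32k", "some-other-model"} {
		fmt.Printf("%s supported: %v\n", m, isSupportedModel(m))
	}
}
```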

2. Request

Request method: POST
Request URL

  https://gateway.theturbo.ai/v1/context/chat/completions

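3. Request Parameters

3.1 Header Parameters

Parameter	Type	Required	Description	Example value
`Content-Type`	string	Yes	Content type of the request. Must be `application/json`.	`application/json`
`Accept`	string	Yes	Expected response type. `application/json` is recommended.	`application/json`
`Authorization`	string	Yes	API key used for authentication, in the format `Bearer $YOUR_API_KEY`.	`Bearer $YOUR_API_KEY`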
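
The three headers above can be attached to a request as in the following sketch. `newChatRequest` and the `endpoint` constant are hypothetical names introduced for illustration; the URL is the one given in section 2.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// endpoint is the request URL from section 2.
const endpoint = "https://gateway.theturbo.ai/v1/context/chat/completions"

// newChatRequest builds a POST request carrying the three headers
// listed in the header table. It does not send the request.
func newChatRequest(apiKey, jsonBody string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, endpoint, strings.NewReader(jsonBody))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Accept", "application/json")
	req.Header.Set("Authorization", "Bearer "+apiKey)
	return req, nil
}

func main() {
	req, err := newChatRequest("YOUR_API_KEY", `{}`)
	if err != nil {
		fmt.Println("build request failed:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
	fmt.Println("Content-Type:", req.Header.Get("Content-Type"))
	fmt.Println("Accept:", req.Header.Get("Accept"))
}
```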

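3.2 Body Parameters (application/json)

Parameter	Type	Required	Description	Example
context_id	string	Yes	ID of the context cache, used to associate the request with the cached content.	`ctx-20241211104333-12345`
model	string	Yes	ID of the model to use. See the model list in the Overview for available versions, e.g. `Doubao-1.5-pro-32k`.	`Doubao-1.5-pro-32k`
messages	array	Yes	List of chat messages. Each object in the array contains `role` and `content`.	`[{"role": "user","content": "Hello"}]`
role	string	No	Message role. Allowed values: `system`, `user`, `assistant`.	`user`
content	string	No	Text of the message.	`Hello, please tell me a joke.`
temperature	number	No	Sampling temperature, in the range `0~2`. Higher values make the output more random; lower values make it more focused and deterministic.	`0.7`
top_p	number	No	An alternative way to shape the sampling distribution, in the range `0~1`. Usually set either this or `temperature`, not both.	`0.9`
n	number	No	Number of replies to generate for each input message.	`1`
stream	boolean	No	Whether to enable streaming output. When set to `true`, the response is returned as a ChatGPT-style data stream.	`false`
stop	string	No	Up to 4 strings can be specified. Generation stops as soon as any of them appears in the output.	`\n`
max_tokens	number	No	Maximum number of tokens that can be generated in a single reply, limited by the model's context length.	`1024`
presence_penalty	number	No	Range `-2.0 ~ 2.0`. Positive values encourage the model to introduce new topics; negative values make new topics less likely.	`0`
frequency_penalty	number	No	Range `-2.0 ~ 2.0`. Positive values make the model less likely to repeat the same phrases; negative values make repetition more likely.	`0`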
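
The body parameters above map naturally onto a typed request. The sketch below models a subset of them (`n`, `stream`, and `stop` are left out for brevity) and checks the documented ranges on the client side. The type and method names are hypothetical, and whether the gateway enforces the same checks, or rejects an empty `messages` array, is an assumption.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Message is one entry of the `messages` array.
type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// ChatRequest mirrors a subset of the body parameters table.
// Optional fields are pointers so unset values are omitted from the JSON.
type ChatRequest struct {
	ContextID        string    `json:"context_id"`
	Model            string    `json:"model"`
	Messages         []Message `json:"messages"`
	Temperature      *float64  `json:"temperature,omitempty"`
	TopP             *float64  `json:"top_p,omitempty"`
	MaxTokens        *int      `json:"max_tokens,omitempty"`
	PresencePenalty  *float64  `json:"presence_penalty,omitempty"`
	FrequencyPenalty *float64  `json:"frequency_penalty,omitempty"`
}

// inRange treats an unset (nil) value as valid.
func inRange(v *float64, lo, hi float64) bool {
	return v == nil || (*v >= lo && *v <= hi)
}

// Validate checks required fields and the ranges stated in the table.
// Rejecting an empty messages array is an assumption.
func (r ChatRequest) Validate() error {
	if r.ContextID == "" {
		return errors.New("context_id is required")
	}
	if r.Model == "" {
		return errors.New("model is required")
	}
	if len(r.Messages) == 0 {
		return errors.New("messages must not be empty")
	}
	if !inRange(r.Temperature, 0, 2) {
		return errors.New("temperature must be within 0~2")
	}
	if !inRange(r.TopP, 0, 1) {
		return errors.New("top_p must be within 0~1")
	}
	if !inRange(r.PresencePenalty, -2, 2) {
		return errors.New("presence_penalty must be within -2.0~2.0")
	}
	if !inRange(r.FrequencyPenalty, -2, 2) {
		return errors.New("frequency_penalty must be within -2.0~2.0")
	}
	return nil
}

func main() {
	temp := 0.7
	maxTokens := 1024
	req := ChatRequest{
		ContextID:   "ctx-20241211104333-12345",
		Model:       "Doubao-1.5-pro-32k",
		Messages:    []Message{{Role: "user", Content: "Hello"}},
		Temperature: &temp,
		MaxTokens:   &maxTokens,
	}
	if err := req.Validate(); err != nil {
		fmt.Println("invalid request:", err)
		return
	}
	body, err := json.Marshal(req)
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(body))
}
```

Using pointers for the optional fields keeps an explicit `0` (a valid value for `temperature` and the penalties) distinguishable from "not set".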

4. Request Examples

4.1 Chat Conversation

POST /v1/context/chat/completions
Content-Type: application/json
Accept: application/json
Authorization: Bearer $YOUR_API_KEY

{
	"context_id": "ctx-20241211104333-12345",
	"model": "Doubao-1.5-pro-32k",
	"messages": [
		{
			"role": "user",
			"content": "Hello, could you give me a quick introduction to quantum mechanics?"
		}
	],
	"temperature": 0.7,
	"max_tokens": 1024
}

curl https://gateway.theturbo.ai/v1/context/chat/completions \
	-H "Content-Type: application/json" \
	-H "Accept: application/json" \
	-H "Authorization: Bearer $YOUR_API_KEY" \
	-d '{
	"context_id": "ctx-20241211104333-12345",
	"model": "Doubao-1.5-pro-32k",
	"messages": [{
		"role": "user",
		"content": "Hello, could you give me a quick introduction to quantum mechanics?"
	}],
	"temperature": 0.7,
	"max_tokens": 1024
}'

package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

const (
	// Placeholder key; replace with your own API key.
	YOUR_API_KEY    = "sk-123456789012345678901234567890123456789012345678"
	REQUEST_PAYLOAD = `{
	"context_id": "ctx-20241211104333-12345",
	"model": "Doubao-1.5-pro-32k",
	"messages": [{
		"role": "user",
		"content": "Hello, could you give me a quick introduction to quantum mechanics?"
	}],
	"temperature": 0.7,
	"max_tokens": 1024
}`
)

func main() {

	requestURL := "https://gateway.theturbo.ai/v1/context/chat/completions"
	requestMethod := "POST"
	requestPayload := strings.NewReader(REQUEST_PAYLOAD)

	// Build the POST request with the JSON payload as its body.
	req, err := http.NewRequest(requestMethod, requestURL, requestPayload)
	if err != nil {
		fmt.Println("Create request failed, err: ", err)
		return
	}

	// Set the headers described in section 3.1.
	req.Header.Add("Content-Type", "application/json")
	req.Header.Add("Accept", "application/json")
	req.Header.Add("Authorization", "Bearer "+YOUR_API_KEY)

	client := &http.Client{}

	// Send the request and make sure the response body is closed.
	resp, err := client.Do(req)
	if err != nil {
		fmt.Println("Do request failed, err: ", err)
		return
	}
	defer resp.Body.Close()

	// Read the whole response body (io.ReadAll replaces the deprecated ioutil.ReadAll).
	respBodyBytes, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("Read response body failed, err: ", err)
		return
	}
	fmt.Println(string(respBodyBytes))
}
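
The examples above read the response in one piece. When `stream` is set to `true`, the parameter table only says the data is returned in a ChatGPT-style stream, so the exact wire format is not specified here. The sketch below assumes server-sent events in which each chunk arrives on a `data:` line and the stream ends with `data: [DONE]`; that format, and the helper names `readSSEData` and `parseSSE`, are assumptions for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// readSSEData collects the payload of every `data:` line until `[DONE]`.
// ASSUMPTION: the stream uses ChatGPT-style server-sent events; the
// source does not spell out the wire format.
func readSSEData(r io.Reader) ([]string, error) {
	var chunks []string
	scanner := bufio.NewScanner(r)
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if !strings.HasPrefix(line, "data:") {
			continue // skip blank separators and non-data fields
		}
		payload := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
		if payload == "[DONE]" {
			break
		}
		chunks = append(chunks, payload)
	}
	return chunks, scanner.Err()
}

// parseSSE is a convenience wrapper for in-memory input.
func parseSSE(s string) ([]string, error) {
	return readSSEData(strings.NewReader(s))
}

func main() {
	// Hand-written sample input, not captured from the real service.
	stream := "data: {\"chunk\":1}\n\ndata: {\"chunk\":2}\n\ndata: [DONE]\n"
	chunks, err := parseSSE(stream)
	if err != nil {
		fmt.Println("read stream failed:", err)
		return
	}
	for i, c := range chunks {
		fmt.Printf("chunk %d: %s\n", i, c)
	}
}
```

In a real client, `resp.Body` from the Go example above would be passed to `readSSEData` instead of being read with `io.ReadAll`.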

5. Response Example

{
	"id": "chatcmpl-1234567890",
	"object": "chat.completion",
	"created": 1699999999,
	"model": "Doubao-1.5-pro-32k",
	"choices": [
		{
			"message": {
				"role": "assistant",
				"content": "Quantum mechanics is the branch of physics that studies the microscopic world..."
			},
			"finish_reason": "stop"
		}
	],
	"usage": {
		"prompt_tokens": 64,
		"completion_tokens": 13,
		"total_tokens": 77,
		"prompt_tokens_details": {
			"cached_tokens": 50
		},
		"completion_tokens_details": {
			"reasoning_tokens": 0
		}
	}
}
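
The response above can be decoded into a struct to inspect token usage. The following sketch covers only the fields shown in the example; the type and function names are hypothetical, and reading `cached_tokens` as the number of prompt tokens served from the context cache is an assumption based on the field name.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ChatResponse covers the fields shown in the response example.
type ChatResponse struct {
	ID      string `json:"id"`
	Model   string `json:"model"`
	Choices []struct {
		Message struct {
			Role    string `json:"role"`
			Content string `json:"content"`
		} `json:"message"`
		FinishReason string `json:"finish_reason"`
	} `json:"choices"`
	Usage struct {
		PromptTokens        int `json:"prompt_tokens"`
		CompletionTokens    int `json:"completion_tokens"`
		TotalTokens         int `json:"total_tokens"`
		PromptTokensDetails struct {
			CachedTokens int `json:"cached_tokens"`
		} `json:"prompt_tokens_details"`
	} `json:"usage"`
}

// parseResponse decodes a JSON response body.
func parseResponse(body string) (ChatResponse, error) {
	var resp ChatResponse
	err := json.Unmarshal([]byte(body), &resp)
	return resp, err
}

// cachedRatio returns cached_tokens / prompt_tokens, or 0 when there
// are no prompt tokens.
func cachedRatio(resp ChatResponse) float64 {
	if resp.Usage.PromptTokens == 0 {
		return 0
	}
	return float64(resp.Usage.PromptTokensDetails.CachedTokens) /
		float64(resp.Usage.PromptTokens)
}

// sample is a trimmed copy of the response example.
const sample = `{
	"id": "chatcmpl-1234567890",
	"model": "Doubao-1.5-pro-32k",
	"choices": [{
		"message": {"role": "assistant", "content": "Quantum mechanics is..."},
		"finish_reason": "stop"
	}],
	"usage": {
		"prompt_tokens": 64,
		"completion_tokens": 13,
		"total_tokens": 77,
		"prompt_tokens_details": {"cached_tokens": 50}
	}
}`

func main() {
	resp, err := parseResponse(sample)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println("finish_reason:", resp.Choices[0].FinishReason)
	fmt.Printf("cached %d of %d prompt tokens (%.0f%%)\n",
		resp.Usage.PromptTokensDetails.CachedTokens,
		resp.Usage.PromptTokens,
		cachedRatio(resp)*100)
}
```

With the example's numbers, 50 of the 64 prompt tokens are reported as cached, roughly 78%.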
