How exactly are Gemini API tokens counted?

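While using Gemini recently, I noticed that generating a 7-line Xiaohongshu post took more than 4,000 tokens, while generating a 20-line explanation of a math problem took only a little over 1,000. Does anyone know how Gemini actually counts tokens?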

mumbler

29 days ago

Just ask Gemini directly.

yinmin

28 days ago

Thinking models also generate tokens during an intermediate thinking stage, so total tokens = input tokens + thinking tokens + output tokens.
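To make that formula concrete, here is a minimal Go sketch. The `Usage` struct and its field names are hypothetical (they are not taken from any Gemini SDK), and the numbers are made up purely to show how a short answer can still be expensive when most of the tokens go to thinking.

```go
package main

import "fmt"

// Usage is a hypothetical struct for this sketch only;
// it is not a type from any Gemini SDK.
type Usage struct {
	InputTokens    int // tokens in the prompt
	ThinkingTokens int // tokens spent in the intermediate thinking stage
	OutputTokens   int // tokens in the visible answer
}

// Total applies the formula from the reply above:
// total = input + thinking + output.
func (u Usage) Total() int {
	return u.InputTokens + u.ThinkingTokens + u.OutputTokens
}

func main() {
	// Illustrative, made-up numbers: a short answer with a long thinking stage.
	short := Usage{InputTokens: 50, ThinkingTokens: 3800, OutputTokens: 250}
	fmt.Println(short.Total()) // 4100
}
```

If the thinking stage dominates like this, the visible length of the answer says very little about the total.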

pike0002

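28 days ago

The basic formula: count all the tokens produced by the tokenizer, which include:
* Words or subwords
* Spaces (if the tokenizer treats a space as a token)
* Punctuation
* Special control tokens (e.g. <start>, <end>)
Chinese works much the same way; only the tokenization rules differ. With a character-based scheme, for example, each Chinese character and each punctuation mark counts as one token. There are also word-based schemes, which split the text into meaningful words.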
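The character-based idea described above can be sketched as a toy counter in Go. This is an assumption-laden illustration, not Gemini's actual tokenizer: the function name `countToyTokens` is hypothetical, and real tokenizers typically use learned subword vocabularies, so their counts will differ.

```go
package main

import (
	"fmt"
	"unicode"
)

// countToyTokens is a hypothetical, deliberately naive counter:
//   - each Han character is one token,
//   - each punctuation mark is one token,
//   - each run of other non-space characters (e.g. a Latin word) is one token,
//   - whitespace is not counted.
func countToyTokens(s string) int {
	count := 0
	inWord := false
	for _, r := range s {
		switch {
		case unicode.IsSpace(r):
			inWord = false
		case unicode.Is(unicode.Han, r) || unicode.IsPunct(r):
			count++
			inWord = false
		default:
			if !inWord {
				count++
				inWord = true
			}
		}
	}
	return count
}

func main() {
	fmt.Println(countToyTokens("Hello, world!")) // 4: Hello , world !
	fmt.Println(countToyTokens("你好，世界！"))        // 6: one per character and punctuation mark
}
```

For a real count, use the model's own tokenizer rather than a heuristic like this.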

People usually use a counting tool to get a rough number. Gemini, for example, provides an interface that lets you count tokens locally. Example: https://www.pixelstech.net/article/1735013847-calculating-token-count-for-claude-api-using-go%3a-a-step-by-step-guide?lang=chinese

There are also ready-made tools: https://www.pixelstech.net/application/tokencalculator

pike0002

28 days ago

@pike0002 I gave the wrong example. It should be:

package gemini

import (
	"context"
	"log"

	"cloud.google.com/go/vertexai/genai"
	"cloud.google.com/go/vertexai/genai/tokenizer"
)

// CalculateToken calculates the number of tokens in a given content using a specified encoding model.
func CalculateToken(ctx context.Context, content string, encoding string) (int, error) {
	client, err := tokenizer.New(encoding)
	if err != nil {
		log.Printf("Failed to get encoding: %v", err)
		return 0, err
	}

	resp, err := client.CountTokens(genai.Text(content))
	if err != nil {
		log.Printf("Failed to count tokens: %v", err)
		return 0, err
	}
	return int(resp.TotalTokens), nil
}
