A question about querying Elasticsearch from Java

2023-12-20 13:16:44 +08:00
 zshineee
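// Code
// Wall-clock time around the whole client call: request building, network, and response parsing.
long startTime = System.currentTimeMillis();
SearchResponse searchResponse = restHighlevelclient.search(searchRequest, RequestOptions.DEFAULT);
long endTime = System.currentTimeMillis();
// getTook() is the server-side query time reported by Elasticsearch.
log.info("took time:{},execution time:{}", searchResponse.getTook(), (endTime - startTime) + "ms");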

// Log output
took time:20ms ,execution time:80ms
took time:19ms ,execution time:77ms
took time:48ms ,execution time:349ms
took time:18ms ,execution time:65ms
took time:34ms ,execution time:884ms
took time:19ms ,execution time:59ms
took time:16ms ,execution time:55ms
took time:19ms ,execution time:1113ms
took time:24ms ,execution time:65ms
took time:16ms ,execution time:56ms
took time:16ms ,execution time:909ms

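The index holds 20,000 documents in total. The query requests 10,000 and actually returns a little over 6,000. The condition is simple, just a single match. I found that the actual execution time of the code is not very stable, and it should not need 1 s. What could be the cause, and how should I investigate?
1774 views
Node: Java
7 replies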
connor123
2023-12-20 14:05:54 +08:00
Slow deserialization? My guess is that too much data is being deserialized in one go. Try it with 10 documents first and see.
Morriaty
2023-12-20 14:22:40 +08:00
The time reported by elasticsearch in the "took" field is the time that it
took elasticsearch to process the query on its side. It doesn't include

- serializing the request into JSON on the client

- sending the request over the network

- deserializing the request from JSON on the server

- serializing the response into JSON on the server

- sending the response over the network

- deserializing the response from JSON on the client
lvtuyukuai
2023-12-20 14:25:43 +08:00
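took time should be the execution time on the Elasticsearch server side, and it looks fairly stable.
execution time roughly covers: building the request + sending the HTTP request to ES + took time + receiving the HTTP response from ES + parsing the response.
So the problem is probably in "sending the HTTP request to ES / receiving the HTTP response from ES" (unstable network?) or in parsing the response (response too large?).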
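To make that breakdown concrete, here is a minimal sketch that subtracts took from execution time for log lines in the format posted above. The class name `TookOverhead` and its method are hypothetical helpers, not part of any Elasticsearch client. The result is only the time spent outside the server-side query (request building, network, parsing); it cannot say which of those is responsible.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of any Elasticsearch client.
final class TookOverhead {
    // Matches log lines in the format shown in the original post.
    private static final Pattern LINE =
            Pattern.compile("took time:(\\d+)ms\\s*,\\s*execution time:(\\d+)ms");

    /** Returns execution time minus took: the time spent outside the server-side query. */
    static long overheadMs(String logLine) {
        Matcher m = LINE.matcher(logLine);
        if (!m.find()) {
            throw new IllegalArgumentException("Unrecognized log line: " + logLine);
        }
        long took = Long.parseLong(m.group(1));
        long execution = Long.parseLong(m.group(2));
        return execution - took;
    }

    public static void main(String[] args) {
        // Three lines copied from the log in the original post.
        List<String> lines = List.of(
                "took time:20ms ,execution time:80ms",
                "took time:48ms ,execution time:349ms",
                "took time:19ms ,execution time:1113ms");
        for (String line : lines) {
            System.out.println(line + " -> overhead " + overheadMs(line) + "ms");
        }
    }
}
```

In the slow samples nearly all of the time falls outside took, which is consistent with looking at the network and response parsing rather than at the query itself.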
lvtuyukuai
2023-12-20 14:26:23 +08:00
Well, that duplicates the second reply. I hadn't refreshed before posting 🤣
zshineee
2023-12-20 15:47:38 +08:00
@connor123 @lvtuyukuai @Morriaty

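I adjusted size: 10 seems fine, but with 1000 the problem shows up occasionally.

Then I tried the scroll API and found that with size 10 or 100, both the took time and the execution time show values above 800 ms.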
lvtuyukuai
2023-12-20 18:04:38 +08:00
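With the scroll API, do both the first request and the subsequent calls using the scrollId take 800+ ms? Note that with the scroll API, size is the number of result documents returned per page (per scroll request), not the total number of documents to return.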
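As a rough illustration of what a per-page size means, the sketch below computes how many scroll pages are needed to read a result set. `ScrollPages` is a hypothetical helper, and 6000 is just a round stand-in for the "a little over 6,000" hits mentioned in the question.

```java
// Hypothetical helper for illustration; it performs no Elasticsearch calls.
final class ScrollPages {
    /** Number of non-empty pages needed to read totalHits documents at pageSize per request. */
    static long pagesNeeded(long totalHits, int pageSize) {
        if (totalHits < 0) {
            throw new IllegalArgumentException("totalHits must be >= 0");
        }
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be > 0");
        }
        // Ceiling division without floating point.
        return (totalHits + pageSize - 1) / pageSize;
    }

    public static void main(String[] args) {
        long totalHits = 6000; // assumed round figure, see the lead-in
        for (int size : new int[] {10, 100, 1000}) {
            System.out.println("size=" + size + " -> " + pagesNeeded(totalHits, size) + " pages");
        }
    }
}
```

So a small size does not shrink the total work; it spreads it over more round trips. A loop that stops when it sees an empty page also issues one extra request beyond this count.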
Aresxue
2023-12-26 10:10:55 +08:00
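This is probably not a server-side problem. Try tracing it with arthas trace. In practice, both the number of ES connections and deserialization affect the final performance.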
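If attaching a tool is not an option, a cruder alternative (not what arthas does, just manual timing in code) is to wrap each phase in a timer. A minimal sketch, where `PhaseTimer` is a hypothetical helper and the lambdas in `main` are placeholders standing in for the real HTTP call and response parsing:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical helper for illustration; not part of arthas or any Elasticsearch client.
final class PhaseTimer {
    // Accumulated nanoseconds per phase name, in first-seen order.
    private final Map<String, Long> nanosByPhase = new LinkedHashMap<>();

    /** Runs work, records its elapsed time under the given phase name, and returns its result. */
    <T> T time(String phase, Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            nanosByPhase.merge(phase, System.nanoTime() - start, Long::sum);
        }
    }

    /** Returns the accumulated time per phase, truncated to whole milliseconds. */
    Map<String, Long> millisByPhase() {
        Map<String, Long> out = new LinkedHashMap<>();
        nanosByPhase.forEach((name, nanos) -> out.put(name, nanos / 1_000_000));
        return out;
    }

    public static void main(String[] args) {
        PhaseTimer timer = new PhaseTimer();
        // Placeholder work: replace with the real request and the real parsing step.
        String body = timer.time("request+network", () -> "{\"hits\":[]}");
        int length = timer.time("parse", body::length);
        System.out.println(timer.millisByPhase() + " parsed length=" + length);
    }
}
```

Splitting the call this way only helps if the client lets you separate fetching the raw response from parsing it; whether that is possible depends on which client API is in use.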


https://tanronggui.xyz/t/1001940
