Do indexes affect sort speed?

Take the following pymongo code, for example:

DB['todo'].find({'time': {'$lte': start}}).sort([('lv', 1), ('time', 1)])

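The sort logic: a smaller lv comes first, and within the same lv a smaller time comes first.
For this query, the current plan is to create two indexes.

One is {time:1}.
For the other, is a standalone {lv:1} better, or a compound {lv:1,time:1}? (Or should the field order be reversed?)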

Do indexes affect the efficiency of group in an aggregate?

For example, one stage of the aggregate pipeline is a group:

group = {'$group':{
    '_id': '$main_key',
    'key1': {'$max':'$key1'},
    'key2': {'$push':'$key2'},
    ...
}}

Would indexing main_key, key1, key2, ... improve efficiency here?

6 replies • 2020-08-10 18:59:24 +08:00

limboMu

2020-08-10 16:30:10 +08:00

1: {lv: 1, time: 1} is the better choice; see the documentation on compound indexes.
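A minimal pymongo sketch of that suggestion, assuming the collection is DB['todo'] as in the question. The helper name ensure_todo_indexes is made up for illustration; create_index itself accepts a list of (field, direction) pairs.

```python
# Key order matters in a compound index: lv first, then time,
# matching .sort([('lv', 1), ('time', 1)]) from the question.
SORT_INDEX = [('lv', 1), ('time', 1)]


def ensure_todo_indexes(collection):
    """Create the compound index on a pymongo Collection (hypothetical helper).

    Returns whatever index name create_index reports.
    """
    return collection.create_index(SORT_INDEX)
```

Usage would be something like ensure_todo_indexes(DB['todo']), run once before the query.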

limboMu

2020-08-10 16:31:47 +08:00

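2: If the aggregate does not produce a new document structure, the indexes on the original collection can generally still be used. In the specific group example, though, _id is unique, so an index is pretty much useless there.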

limboMu

2020-08-10 16:32:41 +08:00

@limboMu I misread it; an index can improve this.

JCZ2MkKb5S8ZX9pq

2020-08-10 16:44:52 +08:00

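@limboMu
My feeling is that in example 2, sort-like accumulators such as max/min might get a boost, but for push/addToSet an index probably doesn't help much. Not sure whether that understanding is correct.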

limboMu

2020-08-10 17:06:45 +08:00

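@JCZ2MkKb5S8ZX9pq What can use the index is the group operation itself. Databases commonly implement group by sorting, so using an already-sorted (indexed) field as the group key can improve efficiency.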
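The reasoning can be illustrated in plain Python. This is only a sketch of sort-based grouping, not MongoDB's actual implementation; the sample documents are invented, and whether a given MongoDB version really uses an index for a particular $group is something to confirm with explain.

```python
from itertools import groupby
from operator import itemgetter


def group_sorted(docs):
    """Group docs that are already ordered by main_key, in a single pass.

    Mirrors the $group in the question: $max on key1, $push on key2.
    """
    out = []
    for key, run in groupby(docs, key=itemgetter('main_key')):
        run = list(run)
        out.append({
            '_id': key,
            'key1': max(d['key1'] for d in run),
            'key2': [d['key2'] for d in run],
        })
    return out


# Invented sample data for illustration.
docs = [
    {'main_key': 'b', 'key1': 2, 'key2': 'x'},
    {'main_key': 'a', 'key1': 5, 'key2': 'y'},
    {'main_key': 'a', 'key1': 7, 'key2': 'z'},
]
# If the input already arrives ordered by main_key (as an index scan would
# deliver it), this explicit sort is the step that gets skipped.
ordered = sorted(docs, key=itemgetter('main_key'))
result = group_sorted(ordered)
```

Once the input is ordered by the group key, each group is a contiguous run, so max and push are both computed in the same pass.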

libook

2020-08-10 18:59:24 +08:00

https://docs.mongodb.com/manual/core/index-compound/

“The order of the fields listed in a compound index is important. The index will contain references to documents sorted first by the values of the item field and, within each value of the item field, sorted by values of the stock field. ”

The index is sorted by the leading field first; documents with the same value for that field are then sorted by the following field.
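A toy model of that ordering, in plain Python rather than MongoDB: the compound index is represented as a sorted list of (lv, time) key tuples, and the sample documents and start value are invented.

```python
def build_index(docs, fields):
    """Model a compound index as (key tuple, position) entries in key order."""
    entries = [(tuple(d[f] for f in fields), i) for i, d in enumerate(docs)]
    return sorted(entries)


# Invented sample data for illustration.
docs = [
    {'lv': 2, 'time': 10},
    {'lv': 1, 'time': 30},
    {'lv': 1, 'time': 20},
    {'lv': 2, 'time': 5},
]
start = 25

index = build_index(docs, ('lv', 'time'))
# Walk the index front to back and keep entries with time <= start.
# The survivors are already in (lv, time) order, so no extra sort is needed.
hits = [docs[pos] for (lv, time), pos in index if time <= start]
```

With the fields reversed, ('time', 'lv'), the same walk would return results ordered by time first, which is not the order the query asks for.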

You can run an experiment with explain:
create indexes for all four arrangements (time, lv, lv-time, time-lv), then execute:
db.getCollection('test').find({"time": {"$lte": 1}}).sort({"lv":1,"time":1}).explain()
The result that comes back is:
"winningPlan" : {
    "stage" : "FETCH",
    "filter" : {
        "time" : {
            "$lte" : 1.0
        }
    },
    "inputStage" : {
        "stage" : "IXSCAN",
        "keyPattern" : {
            "lv" : 1,
            "time" : 1
        },
        "indexName" : "lv1time1",
        "isMultiKey" : false,
        "multiKeyPaths" : {
            "lv" : [],
            "time" : []
        },
        "isUnique" : false,
        "isSparse" : false,
        "isPartial" : false,
        "indexVersion" : 2,
        "direction" : "forward",
        "indexBounds" : {
            "lv" : [
                "[MinKey, MaxKey]"
            ],
            "time" : [
                "[MinKey, MaxKey]"
            ]
        }
    }
},
You can see that the index chosen is the one with lv first and time second.
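To repeat the check from pymongo without reading the whole plan by eye, a small helper can pull the index name out of the plan. This is a sketch that assumes the plan has the same shape as the output above; index_used is a made-up name, and the dict below is a trimmed copy of that output.

```python
def index_used(plan):
    """Return the indexName of the first IXSCAN stage in a plan, or None."""
    if plan.get('stage') == 'IXSCAN':
        return plan.get('indexName')
    child = plan.get('inputStage')
    if child is not None:
        found = index_used(child)
        if found is not None:
            return found
    # Some stages carry several children under 'inputStages'.
    for child in plan.get('inputStages', []):
        found = index_used(child)
        if found is not None:
            return found
    return None


# Trimmed copy of the winningPlan from the explain output above.
winning_plan = {
    'stage': 'FETCH',
    'inputStage': {
        'stage': 'IXSCAN',
        'keyPattern': {'lv': 1, 'time': 1},
        'indexName': 'lv1time1',
    },
}
name = index_used(winning_plan)
```

With a live connection, the plan would come from something like DB['todo'].find(...).sort(...).explain()['queryPlanner']['winningPlan'], assuming the server reports the plan in this layout.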

A few small questions about MongoDB indexes
