且构网


MongoDB query based on the number of fields in a record

Updated: 2021-11-09 22:10:42

It is still not a nice query to run, but there is a slightly more modern way to do it, using $objectToArray and $redact:

db.collection.aggregate([
  { "$redact": {
    "$cond": {
      "if": {
        "$eq": [
          { "$size": { "$objectToArray": "$value" } },
          3
        ]
      },
      "then": "$$KEEP",
      "else": "$$PRUNE"
    }
  }}
])

Here $objectToArray coerces the object into an array with one { k, v } entry per field, much as a combination of Object.keys() and .map() would in JavaScript, and $size then counts those entries.
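For intuition, here is a plain JavaScript sketch of what the $redact condition computes for each document. It runs outside MongoDB; `objectToArray` and `hasFieldCount` are hypothetical helper names chosen for illustration, and the sample documents are taken from the ones listed in this answer.

```javascript
// Plain-JavaScript approximation of $objectToArray: one { k, v } pair per field.
// Hypothetical helper for illustration; it is not part of any MongoDB driver.
function objectToArray(obj) {
  return Object.keys(obj).map((k) => ({ k: k, v: obj[k] }));
}

// Mirrors the $redact condition: true only when "value" has exactly n fields.
function hasFieldCount(doc, n) {
  return objectToArray(doc.value).length === n;
}

const docs = [
  { _id: "number1", value: { a: 1, b: 2, f: 5 } },
  { _id: "number3", value: { i: 2, j: 22, z: 12, za: 111, zb: 114 } }
];

// Keep only the documents whose "value" object has three fields.
const kept = docs.filter((doc) => hasFieldCount(doc, 3));
console.log(kept.map((doc) => doc._id));
```

Every document still has to be visited to evaluate the condition, which is why this remains a full collection scan on the server.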

It is still not a fantastic idea, since it requires scanning the whole collection, but at least the aggregation framework operators run as "native code" rather than through the JavaScript interpretation that $where relies on.

So it is still generally advisable to change the data structure: use a natural array, and store a "size" property where possible, so that queries can be as efficient as possible.

Yes, it is possible, but not in the nicest way. You are essentially using a $where operator query, which uses JavaScript evaluation to match the contents. That is not the most efficient approach, since it can never use an index and has to test every document:

db.collection.find({ "$where": "return Object.keys(this.value).length == 3" })

This looks for documents whose "value" has exactly "three" fields, so only two of your listed documents would be returned:

{ "_id" : "number1", "value" : { "a" : 1, "b" : 2, "f" : 5 } }
{ "_id" : "number2", "value" : { "e" : 2, "f" : 114, "h" : 12 } }

Or for "five" fields or more you can do much the same:

db.numbers.find({ "$where": "return Object.keys(this.value).length >= 5" })

So the argument to that operator is effectively a JavaScript statement that is evaluated on the server for each document, and a document matches where it returns true.
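To make that per-document evaluation concrete, the sketch below imitates it in plain JavaScript. It assumes the string is treated as a function body and run with `this` bound to each document; `whereFilter` is a hypothetical name, and this is only an illustration, not how the server is actually implemented.

```javascript
// Rough stand-in for $where: compile the string as a function body and call it
// once per document with `this` bound to that document. Illustration only.
function whereFilter(docs, body) {
  const predicate = new Function(body);
  return docs.filter((doc) => Boolean(predicate.call(doc)));
}

const docs = [
  { _id: "number1", value: { a: 1, b: 2, f: 5 } },
  { _id: "number2", value: { e: 2, f: 114, h: 12 } },
  { _id: "number3", value: { i: 2, j: 22, z: 12, za: 111, zb: 114 } }
];

// The same predicate strings as in the queries above.
const exactlyThree = whereFilter(docs, "return Object.keys(this.value).length == 3");
const fiveOrMore = whereFilter(docs, "return Object.keys(this.value).length >= 5");

console.log(exactlyThree.map((doc) => doc._id));
console.log(fiveOrMore.map((doc) => doc._id));
```

Note that the predicate runs against every document in the input, which matches the point that such a query cannot use an index.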

A more efficient way is to store the "count" of the elements in the document itself. That way you can "index" the field, and queries are much more efficient because the documents selected by other conditions no longer need to be scanned one by one to determine the length:

{_id:'number1', value:{'a':1, 'b':2, 'f':5}, count: 3},
{_id:'number2', value:{'e':2, 'f':114, 'h':12}, count: 3},
{_id:'number3', value:{'i':2, 'j':22, 'z':12, 'za':111, 'zb':114}, count: 5}

Then to get the documents with "five" elements you only need a simple query:

db.collection.find({ "count": 5 })
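The stored count only helps if it stays in step with the data, so one option is to compute it at write time. The sketch below is a minimal example of that idea; `withCount` is a hypothetical helper name, and the shell calls in the trailing comment are just one way the result might be used.

```javascript
// Hypothetical helper: return a copy of the document with "count" set to the
// number of keys in "value", so the stored count is derived from the data.
function withCount(doc) {
  return Object.assign({}, doc, { count: Object.keys(doc.value).length });
}

const original = { _id: "number3", value: { i: 2, j: 22, z: 12, za: 111, zb: 114 } };
const prepared = withCount(original);
console.log(prepared.count);

// In the mongo shell the prepared document could then be stored and indexed, e.g.:
//   db.collection.insertOne(prepared)
//   db.collection.createIndex({ "count": 1 })
```

Any later update that adds or removes keys in "value" would need to adjust "count" in the same operation, otherwise the indexed value goes stale.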

That is generally the most optimal form. Another point, though, is that the general "Object" structure you might be happy with from general practice is not something MongoDB "plays well" with. The problem is "traversal" of the elements in the object, and MongoDB is much happier when you use an "array", even in this form:

{
    '_id': 'number1', 
    'values':[
        { 'key': 'a', 'value': 1 },
        { 'key': 'b', 'value': 2 }, 
        { 'key': 'f', 'value': 5 }
    ],
},
{
    '_id': 'number2', 
    'values':[
        { 'key': 'e', 'value': 2 }, 
        { 'key': 'f', 'value': 114 }, 
        { 'key': 'h', 'value': 12 }
    ],
},
{
    '_id':'number3', 
    'values': [
        { 'key': 'i', 'value': 2 },
        { 'key': 'j', 'value': 22 },
        { 'key': 'z', 'value': 12 },
        { 'key': 'za', 'value': 111 },
        { 'key': 'zb', 'value': 114 }
    ]
}
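If existing documents use the object form, they would need converting to this layout. A minimal sketch of that conversion in plain JavaScript follows; `toArrayForm` is a hypothetical helper name, and it assumes each source document has an `_id` and a `value` object as in the earlier examples.

```javascript
// Hypothetical migration helper: turn
//   { _id, value: { a: 1, ... } }
// into
//   { _id, values: [ { key: "a", value: 1 }, ... ] }
function toArrayForm(doc) {
  return {
    _id: doc._id,
    values: Object.keys(doc.value).map((key) => ({ key: key, value: doc.value[key] }))
  };
}

const converted = toArrayForm({ _id: "number1", value: { a: 1, b: 2, f: 5 } });
console.log(JSON.stringify(converted));
```

The array length now carries the field count directly, which is what the array operators work with.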

So if you actually switch to an "array" format like that, you can match an exact array length with the $size query operator:

db.collection.find({ "values": { "$size": 5 } })

That operator only works for an exact array length, since that is all it provides. What you cannot do with it, as is documented, is an "inequality" match. For that you need the "aggregation framework" for MongoDB, which is a better alternative to JavaScript and mapReduce operations:

db.collection.aggregate([
    // Project a size of the array
    { "$project": {
        "values": 1,
        "size": { "$size": "$values" }
    }},
    // Match on that size
    { "$match": { "size": { "$gte": 5 } } },
    // Project just the same fields
    { "$project": {
        "values": 1
    }}
])
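If the same kind of match is needed for different fields or thresholds, the pipeline can be built as ordinary data. The sketch below is one way to do that; `sizeAtLeastPipeline` is a hypothetical helper name, and it assumes the field being measured is not itself called "size", since that name is used for the temporary projection.

```javascript
// Hypothetical helper: build the three-stage pipeline for
// "array field has at least n elements".
function sizeAtLeastPipeline(field, n) {
  return [
    // Project the array together with its length
    { $project: { [field]: 1, size: { $size: "$" + field } } },
    // Match on that length
    { $match: { size: { $gte: n } } },
    // Project just the original field again
    { $project: { [field]: 1 } }
  ];
}

const pipeline = sizeAtLeastPipeline("values", 5);
console.log(JSON.stringify(pipeline, null, 2));

// In the mongo shell: db.collection.aggregate(pipeline)
```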

So those are the alternatives. There is a "native" method available with aggregation and an array type. It is fairly arguable, though, that JavaScript evaluation is also "native" to MongoDB, just not implemented in native code.