且构网

Extracting a list of substrings from MongoDB using a regular expression

Updated: 2023-02-21 20:40:17

In the upcoming version of MongoDB (at the time of this writing), it will be possible to do this with the aggregation framework and the $indexOfCP operator. Until then, your best bet is MapReduce.

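// Map step: emit each document's _id together with the part of fileName
// that starts at the first "." (for example ".doc").
var mapper = function() {
    emit(this._id, this.fileName.substring(this.fileName.indexOf(".")));
};

db.coll.mapReduce(
    mapper,
    // Each _id is emitted only once, so the reduce function is never called.
    function(key, values) {},
    { "out": { "inline": 1 } }
)["results"]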

Which yields:

[
    {
        "_id" : 12121,
        "value" : ".doc"
    },
    {
        "_id" : 12125,
        "value" : ".txt"
    },
    {
        "_id" : 12126,
        "value" : ".pdf"
    },
    {
        "_id" : 12127,
        "value" : ".txt"
    }
]
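To see what the map step computes for each document, here is a minimal sketch in plain JavaScript, runnable without a database. The file names are hypothetical (only the _id values and extensions come from the output above), and fromLastDot is an assumed variant, not part of the original answer. It illustrates that indexOf(".") cuts at the first dot and that a name without any dot comes back unchanged.

```javascript
// Hypothetical sample documents; only the _id values and the extensions
// are taken from the output shown above.
const docs = [
  { _id: 12121, fileName: "report.doc" },
  { _id: 12125, fileName: "notes.txt" },
  { _id: 12126, fileName: "manual.pdf" },
  { _id: 12127, fileName: "todo.txt" }
];

// Same expression the mapper emits: everything from the first "." onward.
// When there is no ".", indexOf returns -1 and substring(-1) returns the
// whole string.
function fromFirstDot(fileName) {
  return fileName.substring(fileName.indexOf("."));
}

// Assumed variant: everything from the last "." onward, or "" when the
// name contains no dot at all.
function fromLastDot(fileName) {
  const i = fileName.lastIndexOf(".");
  return i === -1 ? "" : fileName.substring(i);
}

// Mirror the shape of the inline mapReduce results.
const results = docs.map(d => ({ _id: d._id, value: fromFirstDot(d.fileName) }));
console.log(results);

console.log(fromFirstDot("archive.tar.gz")); // ".tar.gz"
console.log(fromLastDot("archive.tar.gz"));  // ".gz"
console.log(fromFirstDot("README"));         // "README"
```

If names with several dots or with no extension can occur in the collection, the last-dot variant is probably closer to what "file extension" usually means.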


For completeness, here is the solution using the aggregation framework*:

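db.coll.aggregate(
    [
        // Keep only documents whose fileName ends in an extension.
        { "$match": { "fileName": /\.[0-9a-z]+$/i } },
        // Collect every extension into a single array.
        { "$group": {
            "_id": null,
            "extensions": {
                "$push": {
                    "$substr": [
                        "$fileName",
                        { "$indexOfCP": [ "$fileName", "." ] },
                        -1  // negative length: take the rest of the string
                    ]
                }
            }
        }}
    ])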

Which produces:

{ 
    "_id" : null, 
    "extensions" : [ ".doc", ".txt", ".pdf", ".txt" ] 
}
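The two stages of the pipeline can be mimicked in plain JavaScript to make their roles clearer. This is only a sketch of the logic, not MongoDB's API: the collection array, its file names, and the extra document 12122 are hypothetical. The filter plays the role of the $match regular expression, and the map that follows stands in for $group with $push.

```javascript
// Hypothetical documents; 12122 is invented to show a name that the
// regular expression rejects.
const collection = [
  { _id: 12121, fileName: "report.doc" },
  { _id: 12122, fileName: "README" },
  { _id: 12125, fileName: "notes.txt" },
  { _id: 12126, fileName: "manual.pdf" },
  { _id: 12127, fileName: "todo.txt" }
];

// Same pattern as in the $match stage: a dot followed by one or more
// letters or digits at the end of the string, case-insensitive.
const hasExtension = /\.[0-9a-z]+$/i;

// "$match": drop documents whose file name has no extension-like suffix.
const matched = collection.filter(d => hasExtension.test(d.fileName));

// "$group" with "$push": gather one substring per remaining document,
// starting at the first "." and running to the end of the string.
const grouped = {
  _id: null,
  extensions: matched.map(d => d.fileName.substring(d.fileName.indexOf(".")))
};

console.log(grouped);
```

Because the filter runs first, every document that reaches the grouping step is known to contain a dot, so the index lookup never has to handle a missing one.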


* Available in the current development version of MongoDB (at the time of this writing).