且构网

Group by a field in DocumentDB

Updated: 2023-02-15 11:45:21

Is it possible, in some way, to group by a field in DocumentDB, whether through a stored procedure or not?

Let's say I have the following collection:

[
    {
        name: "Item A",
        priority: 1
    },
    {
        name: "Item B",
        priority: 2
    },
    {
        name: "Item C",
        priority: 2
    },
    {
        name: "Item D",
        priority: 1
    }
]

I would like to get all the items in the highest priority group (priority 2 in this case). I do not know the value of the highest priority in advance. That is, the result should be:

[
    {
        name: "Item B",
        priority: 2
    },
    {
        name: "Item C",
        priority: 2
    }
]

With some crude LINQ, it would look something like this:

var highestPriority = 
    collection
        .GroupBy(x => x.Priority)
        .OrderByDescending(x => x.Key)
        .First();

DocumentDB currently does not support GROUP BY or any other aggregation. It is the second most requested feature and is listed as "Under Review" on the DocumentDB UserVoice.

In the meantime, documentdb-lumenize is an aggregation library for DocumentDB written as a stored procedure. You load cube.string as a stored procedure, then call it with an aggregation configuration. It's a bit overkill for this example, but it's perfectly capable of doing what you are asking here. If you pass this into the stored procedure:

{cubeConfig: {groupBy: "name", field: "priority", f: "max"}}

that should do what you want.
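
To make the configuration concrete, the following is a minimal in-memory sketch of what a group-by with a `max` aggregate conceptually computes. It is not Lumenize's implementation and does not reproduce its output format: the `aggregate` helper, the `aggregators` table, and the shape of the returned rows are all hypothetical. Only the `groupBy`, `field`, and `f` keys are taken from the configuration above.

```javascript
// Hypothetical illustration only: none of these names belong to documentdb-lumenize.
var aggregators = {
  max: function (values) { return Math.max.apply(null, values); },
  min: function (values) { return Math.min.apply(null, values); },
  sum: function (values) { return values.reduce(function (a, b) { return a + b; }, 0); },
  count: function (values) { return values.length; }
};

// Group `docs` by config.groupBy, then reduce each group's
// config.field values with the aggregate function named by config.f.
function aggregate(docs, config) {
  var groups = new Map(); // a Map keeps the original key types and insertion order
  docs.forEach(function (doc) {
    var key = doc[config.groupBy];
    if (!groups.has(key)) {
      groups.set(key, []);
    }
    groups.get(key).push(doc[config.field]);
  });

  var rows = [];
  groups.forEach(function (values, key) {
    var row = {};
    row[config.groupBy] = key;
    row[config.f] = aggregators[config.f](values);
    rows.push(row);
  });
  return rows;
}

// Sample collection from the question.
var items = [
  { name: "Item A", priority: 1 },
  { name: "Item B", priority: 2 },
  { name: "Item C", priority: 2 },
  { name: "Item D", priority: 1 }
];

console.log(aggregate(items, { groupBy: "name", field: "priority", f: "max" }));
```

In this sketch, grouping the sample collection by `name` produces one row per item, because every name is unique; grouping by `priority` would produce one row per priority level instead.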

Note that Lumenize can do a lot more than that, including simple group-bys with other functions (sum, count, min, max, median, p75, etc.), pivot tables, and all the way up to complicated n-dimensional hypercubes with multiple metrics per cell.

I have never tried loading cube.string from .NET because we're on Node.js, but it is shipped as a string rather than as JavaScript, so you can easily load and send it.

Alternatively, you could write a stored procedure to do this simple aggregation.
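
A minimal sketch of such a stored procedure follows. It assumes the DocumentDB server-side JavaScript API (`getContext()`, `queryDocuments`, `setBody`) and that every document has a numeric `priority` field; the names `getHighestPriorityItems` and `accumulateHighestPriority` are made up for illustration. It makes a single pass, keeping only the documents that match the highest priority seen so far, and follows continuation tokens until the query is exhausted.

```javascript
// Hypothetical helper: fold one page of documents into the running state,
// keeping only the documents whose priority equals the maximum seen so far.
function accumulateHighestPriority(state, docs) {
  for (var i = 0; i < docs.length; i++) {
    var doc = docs[i];
    if (doc.priority > state.max) {
      // A new maximum discards everything collected so far.
      state.max = doc.priority;
      state.items = [doc];
    } else if (doc.priority === state.max) {
      state.items.push(doc);
    }
  }
  return state;
}

// Stored procedure body (sketch). `getContext` is provided by the
// DocumentDB server-side runtime; it is not defined in this file.
function getHighestPriorityItems() {
  var collection = getContext().getCollection();
  var response = getContext().getResponse();
  var state = { max: -Infinity, items: [] };

  fetchPage(undefined);

  function fetchPage(continuation) {
    var accepted = collection.queryDocuments(
      collection.getSelfLink(),
      "SELECT c.name, c.priority FROM c",
      { continuation: continuation },
      function (err, docs, options) {
        if (err) {
          throw err;
        }
        accumulateHighestPriority(state, docs);
        if (options && options.continuation) {
          // More pages remain: keep reading.
          fetchPage(options.continuation);
        } else {
          // All pages read: return the highest priority group.
          response.setBody(state.items);
        }
      }
    );
    if (!accepted) {
      throw new Error("The query was not accepted by the server.");
    }
  }
}
```

Stored procedures run under a bounded execution time, so for a large collection this sketch would need to hand its continuation token and partial state back to the client and resume from there.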