MongoDB - children and parent structure

Updated: 2023-02-05 14:03:08

You need to consider the types of queries you will need to perform and how frequently each type will be needed. When I was working on something similar, I came up with six possible actions:

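  • Do something with the parent
  • Do something with the children
  • Do something with the ancestors (parents of parents, parents of parents of parents, etc.)
  • Do something with the descendants (children of children, children of children of children, etc.)
  • Change relationships (add/move/delete nodes in the hierarchy)
  • Change the main data in the current node (e.g. changing the value in the "title" field)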

You'll want to estimate how important each of these is to your application.

If most of your work involves the stored data for a given article, including its immediate parent and children, the first idea is the most useful. Indeed, in MongoDB it is quite common to place all the information you need in the same document rather than referencing it externally, so that you only need to retrieve one document and work with that data. The last four actions in the list are trickier, though.

In particular, to retrieve ancestors and descendants in this case you will need to traverse the tree, moving through intermediary documents and following a path, even though you may only care about the last document in the path. This can be slow for long hierarchies. Changing relationships can require moving a lot of information around in multiple documents, because of all the data present in each one. Even changing a single field like "title" can be annoying, because that field is present in multiple different documents, either as a main field or under the parent or children fields.
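
To make the "title" problem concrete, here is a minimal sketch in plain Python. It assumes the first idea embeds copies of the parent's and children's titles in every document; that layout, the ids, and the `rename` helper are all hypothetical, and a dict stands in for the collection.

```python
# Hypothetical embedded layout: each article stores copies of its parent's
# and children's titles (an assumption about what "the first idea" looks like).
articles = {
    "a1": {"title": "Databases", "parent": None,
           "children": [{"_id": "a2", "title": "MongoDB"}]},
    "a2": {"title": "MongoDB",
           "parent": {"_id": "a1", "title": "Databases"},
           "children": [{"_id": "a3", "title": "Trees"}]},
    "a3": {"title": "Trees",
           "parent": {"_id": "a2", "title": "MongoDB"},
           "children": []},
}


def rename(articles, article_id, new_title):
    """Rename an article and every embedded copy of its title.

    Returns the sorted ids of all documents that had to be modified.
    """
    touched = set()
    for doc_id, doc in articles.items():
        # The article's own main field.
        if doc_id == article_id:
            doc["title"] = new_title
            touched.add(doc_id)
        # The copy stored under the "parent" field of each child.
        if doc["parent"] and doc["parent"]["_id"] == article_id:
            doc["parent"]["title"] = new_title
            touched.add(doc_id)
        # The copy stored in the "children" array of the parent.
        for child in doc["children"]:
            if child["_id"] == article_id:
                child["title"] = new_title
                touched.add(doc_id)
    return sorted(touched)


touched = rename(articles, "a2", "MongoDB Basics")
print(touched)  # ['a1', 'a2', 'a3']
```

Renaming one article touches its own document, its parent, and every child, which is the cost the paragraph above describes.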

Basically, your first idea works best in more static applications, where you won't be changing the data much after initially creating it but will need to read it regularly.

The MongoDB documentation has five recommended approaches for handling tree-like (hierarchical) structures. Each has different advantages and disadvantages, though all of them make it easy to update an article's main data, because the change only has to be made in one document.

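  • Parent References: each node contains a reference to its parent.
    • Advantages:
      • Fast parent lookup (look up by "_id" = your doc title, return the "parent" field)
      • Fast children lookup (look up by "parent" = your doc title, which returns all child documents)
      • Updating relationships is just a matter of changing the "parent" field
      • Changing the underlying data requires changes to only one document
    • Disadvantages:
      • Searching by ancestors and descendants is slow and requires a traversal
  • Child References: each node contains an array of references to its children.
    • Advantages:
      • Fast retrieval of children (return the children array)
      • Fast relationship updates (just update the children array where needed)
    • Disadvantages:
      • Finding a parent requires looking for your _id in the children arrays of all documents until you find it (since the parent will contain the current node as a child)
      • Ancestor and descendant searches require traversing the tree
  • Array of Ancestors: each node contains an array of references to its ancestors, plus a reference to its parent.
    • Advantages:
      • Fast retrieval of ancestors (no traversal required to find a specific one)
      • Easy to look up parent and children, as in the "Parent References" approach
      • To find descendants, just search by ancestor, because all descendants must contain that ancestor
    • Disadvantages:
      • You need to keep the ancestors array and the parent field up to date whenever relationships change, often across multiple documents
  • Materialized Paths: each node contains a path to itself.
    • Advantages:
      • Easy to find children and descendants using a regular expression
      • Can use the path to retrieve the parent and ancestors
      • Flexibility, such as finding nodes by partial paths
    • Disadvantages:
      • Relationship changes are difficult, because they may require changing paths across multiple documents
  • Nested Sets: each node contains "left" and "right" fields that help find sub-trees.
    • Advantages:
      • Easy retrieval of descendants in an optimal way, by searching between "left" and "right"
      • As with the "Parent References" approach, it is easy to find the parent and children
    • Disadvantages:
      • You need to traverse the structure to find ancestors
      • Relationship changes perform worse here than with any other option, because every document in the tree may need to change to make sure "left" and "right" still make sense after the hierarchy changes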
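
As an illustration of the trade-offs listed above, the following sketch contrasts a parent-reference traversal with an ancestors-array lookup. It is plain Python over an in-memory list rather than a real MongoDB collection; the sample tree and the helper names are hypothetical, and the "parent" and "ancestors" field names are assumptions based on the list.

```python
# In-memory stand-in for a collection. Each node carries a "parent"
# reference and, for comparison, a precomputed "ancestors" array.
nodes = [
    {"_id": "Books", "parent": None, "ancestors": []},
    {"_id": "Programming", "parent": "Books", "ancestors": ["Books"]},
    {"_id": "Databases", "parent": "Programming",
     "ancestors": ["Books", "Programming"]},
    {"_id": "MongoDB", "parent": "Databases",
     "ancestors": ["Books", "Programming", "Databases"]},
    {"_id": "dbm", "parent": "Databases",
     "ancestors": ["Books", "Programming", "Databases"]},
]
by_id = {n["_id"]: n for n in nodes}


def children(node_id):
    """Parent references: one filter on the "parent" field."""
    return [n["_id"] for n in nodes if n["parent"] == node_id]


def ancestors_by_traversal(node_id):
    """Parent references only: follow parent links one hop at a time.

    Each hop would be a separate lookup against a real database.
    """
    result = []
    parent = by_id[node_id]["parent"]
    while parent is not None:
        result.append(parent)
        parent = by_id[parent]["parent"]
    return result  # nearest ancestor first


def descendants_by_ancestor_array(node_id):
    """Array of ancestors: one filter, no traversal."""
    return [n["_id"] for n in nodes if node_id in n["ancestors"]]


print(children("Databases"))                         # ['MongoDB', 'dbm']
print(ancestors_by_traversal("MongoDB"))             # ['Databases', 'Programming', 'Books']
print(descendants_by_ancestor_array("Programming"))  # ['Databases', 'MongoDB', 'dbm']
```

The traversal needs one lookup per level of the tree, while the ancestors array answers the same kind of question with a single filter, at the price of more fields to maintain.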

These five approaches are discussed in more detail in the MongoDB documentation.
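
The path-based and "left"/"right"-based approaches from the list (Materialized Paths and Nested Sets in the MongoDB documentation's terminology) can be sketched the same way. This is plain Python over in-memory lists; the slash-separated path format, the sample data, and the helper names are assumptions made for illustration and are not the exact format the MongoDB documentation uses.

```python
import re

# Materialized paths: each node stores the full path from the root to itself
# (hypothetical "/"-separated format).
path_nodes = [
    {"_id": "Books", "path": "/Books"},
    {"_id": "Programming", "path": "/Books/Programming"},
    {"_id": "Databases", "path": "/Books/Programming/Databases"},
    {"_id": "MongoDB", "path": "/Books/Programming/Databases/MongoDB"},
]


def descendants_by_path(node_path):
    """Descendants are the nodes whose path starts with this path plus '/'."""
    pattern = re.compile("^" + re.escape(node_path) + "/")
    return [n["_id"] for n in path_nodes if pattern.match(n["path"])]


def ancestors_from_path(node_path):
    """Ancestors can be read straight off the path, root first."""
    return node_path.strip("/").split("/")[:-1]


# Nested sets: "left"/"right" are assigned by a depth-first walk, so a
# descendant's interval always lies strictly inside its ancestor's interval.
set_nodes = [
    {"_id": "Books", "left": 1, "right": 8},
    {"_id": "Programming", "left": 2, "right": 7},
    {"_id": "Languages", "left": 3, "right": 4},
    {"_id": "Databases", "left": 5, "right": 6},
]


def descendants_by_range(node):
    """A single range filter returns the whole sub-tree."""
    return [n["_id"] for n in set_nodes
            if node["left"] < n["left"] and n["right"] < node["right"]]


print(descendants_by_path("/Books/Programming"))            # ['Databases', 'MongoDB']
print(ancestors_from_path("/Books/Programming/Databases"))  # ['Books', 'Programming']
print(descendants_by_range(set_nodes[1]))                   # ['Languages', 'Databases']
```

Both lookups avoid traversal, but moving a node means rewriting paths, or renumbering "left" and "right", in many documents.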

Your second idea combines the "Parent References" and "Child References" approaches discussed above. It makes it easy to find both the children and the parent, and easy to update an article's relationships and main data (though a relationship change means updating both the parent and the children fields). You still need to traverse the tree to find ancestors and descendants.
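
A rough sketch of what a relationship change looks like under this combined layout, in plain Python with a dict standing in for the collection. The field names "parent" and "children", the sample tree, and the `move` helper are assumptions for illustration; the helper does not guard against moving a node under its own descendant.

```python
# Each document holds a reference to its parent and an array of child ids.
docs = {
    "root": {"parent": None, "children": ["a", "b"]},
    "a": {"parent": "root", "children": ["c"]},
    "b": {"parent": "root", "children": []},
    "c": {"parent": "a", "children": []},
}


def move(docs, node_id, new_parent_id):
    """Re-parent a node and return the ids of the documents that changed."""
    old_parent_id = docs[node_id]["parent"]
    # Remove the node from its old parent's children array.
    if old_parent_id is not None:
        docs[old_parent_id]["children"].remove(node_id)
    # Add it to the new parent's children array.
    docs[new_parent_id]["children"].append(node_id)
    # Point the node's own parent reference at the new parent.
    docs[node_id]["parent"] = new_parent_id
    return [d for d in (old_parent_id, new_parent_id, node_id) if d is not None]


touched = move(docs, "c", "b")
print(touched)  # ['a', 'b', 'c']
```

At most three documents change no matter how large the tree is, which is why relationship updates stay manageable with this layout.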

If you are interested in finding ancestors and descendants (and care about this more than being able to easily update relationships), you can consider adding an ancestors array to your second idea, which makes it easy to query for ancestors and descendants as well. Of course, updating relationships becomes a real pain if you do this.
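
To show why relationship updates get painful once an ancestors array is added, here is a minimal sketch in plain Python. The "parent" and "ancestors" field names, the sample tree, and the `move_subtree` helper are hypothetical, and the children arrays are left out for brevity.

```python
# Each document stores its parent and its full ancestor chain, root first.
docs = {
    "root": {"parent": None, "ancestors": []},
    "a": {"parent": "root", "ancestors": ["root"]},
    "b": {"parent": "root", "ancestors": ["root"]},
    "c": {"parent": "a", "ancestors": ["root", "a"]},
    "d": {"parent": "c", "ancestors": ["root", "a", "c"]},
}


def move_subtree(docs, node_id, new_parent_id):
    """Re-parent a node and rewrite the ancestors of its whole sub-tree.

    Returns the ids of the documents that changed.
    """
    new_prefix = docs[new_parent_id]["ancestors"] + [new_parent_id]
    touched = []
    for doc_id, doc in docs.items():
        if doc_id == node_id:
            doc["parent"] = new_parent_id
            doc["ancestors"] = list(new_prefix)
            touched.append(doc_id)
        elif node_id in doc["ancestors"]:
            # Keep the chain from the moved node downward, swap what is above it.
            tail = doc["ancestors"][doc["ancestors"].index(node_id):]
            doc["ancestors"] = new_prefix + tail
            touched.append(doc_id)
    return touched


touched = move_subtree(docs, "c", "b")
print(touched)                 # ['c', 'd']
print(docs["d"]["ancestors"])  # ['root', 'b', 'c']
```

Every descendant of the moved node has to be rewritten, so the cost of a move grows with the size of the sub-tree.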

Conclusion:

Ultimately, it all depends on which actions you need the most. Since you're working with articles, whose underlying data (like the title) can change frequently, you may want to avoid the first idea, since you would need to update not only the main document for that article but also all of its child documents and its parent.

Your second idea makes it easy to retrieve the immediate parent and children. Updating relationships is also not too difficult (it's certainly better than some of the other options available).

If you really want to make it easy to find ancestors and descendants, at the expense of easy relationship updates, include an array of ancestor references.

In general, try to minimize the number of traversals required, as they require running some kind of iteration or recursion to get to the data you want. If you value the ability to update relationships, you should also pick an option that changes fewer nodes in the tree (Parent References, Child References, and your second idea can do this).