
How to handle relationships when using MongoDB

Updated: 2023-09-17 15:02:16

Just found an answer from Brendan McAdams, a guy from 10gen who is obviously far more authoritative than me, and he recommends embedding documents.
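Embedding, as McAdams suggests, means the related documents live inside the parent document itself, so a single read returns everything. A minimal sketch, with plain Python dicts standing in for BSON documents (the field names and values are illustrative, not from the original answer):

```python
from datetime import datetime, timezone

# A blog post with its comments embedded directly in the document:
# fetching the post returns the comments too, with no second query.
post = {
    "_id": "4b866f08234ae01d21d89700",           # stand-in for an ObjectId
    "title": "Why MongoDB is fast",
    "comments": [
        {
            "text": "Great post!",
            "date": datetime(2010, 3, 1, tzinfo=timezone.utc),
            "user": "4b866f08234ae01d21d89604",  # author id, still inline
            "votes": 7,
        },
    ],
}

# One fetch, everything on hand.
assert len(post["comments"]) == 1
assert post["comments"][0]["votes"] == 7
```

The trade-off is that embedded comments are harder to query on their own and the parent document grows with every comment.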

The first option is to manually include in each comment the ObjectID of the user it belongs to.

comment: { text : "...", 
           date: "...", 
           user: ObjectId("4b866f08234ae01d21d89604"),
           votes: 7 }
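With a manual reference like the `user` field above, the application resolves the reference itself with a second query. A sketch of that application-level join, using an in-memory dict as a stand-in for the users collection (`find_user` is a hypothetical helper, not a driver API; with a real driver it would be `db.users.find_one({"_id": ...})`):

```python
# Toy "users collection": _id -> user document.
users = {
    "4b866f08234ae01d21d89604": {
        "_id": "4b866f08234ae01d21d89604",
        "name": "alice",                      # illustrative name
    },
}

comment = {
    "text": "...",
    "date": "...",
    "user": "4b866f08234ae01d21d89604",       # manual reference to a user
    "votes": 7,
}

def find_user(user_id):
    """Stand-in for db.users.find_one({"_id": user_id})."""
    return users.get(user_id)

# Query 1 fetched the comment; query 2 resolves the reference.
author = find_user(comment["user"])
assert author["name"] == "alice"
```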

The second, cleverer option is to use DBRefs.
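A DBRef is just a small sub-document in a conventional shape - the referenced collection's name plus its `_id` (and optionally a `$db` field) - which drivers know how to dereference for you. A sketch of that shape and of what dereferencing amounts to, again with plain dicts standing in for collections (real drivers wrap this in a DBRef type):

```python
comment = {
    "text": "...",
    "date": "...",
    # Conventional DBRef shape: which collection, which _id.
    "user": {"$ref": "users", "$id": "4b866f08234ae01d21d89604"},
    "votes": 7,
}

def dereference(db, ref):
    """Roughly what a driver does for a DBRef under the hood:
    one extra query against the referenced collection."""
    return db[ref["$ref"]].get(ref["$id"])

# Toy database: collection name -> {_id -> document}.
db = {"users": {"4b866f08234ae01d21d89604": {"name": "alice"}}}
assert dereference(db, comment["user"])["name"] == "alice"
```

Compared to a bare ObjectID, the DBRef also records *which* collection to look in, which is what lets the driver do the lookup generically.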

Doesn't that add extra disk I/O, so we lose performance? (I'm not sure how this works internally.) So we should avoid linking whenever possible, right?

Yes - there would be one more query, but the driver will do it for you - you can think of it as a kind of syntactic sugar. Does it affect performance? Actually, that depends too :) One of the reasons Mongo is so freakishly fast is that it uses memory-mapped files, and Mongo tries its best to keep the whole working set (plus indexes) directly in RAM. Every 60 seconds (by default) it syncs the RAM snapshot with the files on disk.

When I say working set, I mean the things you are actually working with: you can have three collections - foo, bar, baz - but if right now you are only using foo and bar, those get loaded into RAM while baz stays abandoned on disk. Moreover, memory-mapped files allow loading only part of a collection. So if you're building something like Engadget or TechCrunch, there is a high probability that the working set will be the comments from the last few days, while old pages are revisited far less often (their comments are paged into memory on demand), so it doesn't affect performance significantly.
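The back-of-the-envelope check implied here - does the working set fit in RAM? - can be sketched as follows. Every number below is a made-up illustrative assumption, not a figure from the original answer:

```python
# Rough working-set estimate for an Engadget/TechCrunch-style site,
# where only recent comments plus indexes need to stay resident in RAM.
avg_comment_bytes = 1_000      # assumed average document size
comments_per_day = 50_000      # assumed traffic
hot_days = 3                   # "comments for the last few days"
index_overhead = 0.20          # assume indexes add ~20%

working_set_bytes = avg_comment_bytes * comments_per_day * hot_days
working_set_bytes *= 1 + index_overhead

ram_bytes = 8 * 1024**3        # an 8 GiB machine
fits_in_ram = working_set_bytes < ram_bytes
assert fits_in_ram  # ~180 MB of hot data: extra queries stay cheap
```

Under these assumptions the hot data is a tiny fraction of RAM, which is why the extra query for a linked document stays cheap.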

So, to recap: as long as you keep the working set in memory (you can think of it as read/write caching), fetching those things is super fast, and one more query won't be a problem. If you are working with slices of data that don't fit into memory, there will be speed degradation - but I don't know your circumstances; it could be acceptable. So in both cases I tend to choose linking.