且构网

Sharing the things programmers deal with in development...

App Engine datastore - data model question

Updated: 2023-01-09 10:28:06



I need to design a data model for an Amazon S3-like application. Let's simplify the problem into 3 key concepts - users, buckets and objects. There are many ways to design this model - I'll list two.

  1. Three Kinds - User, Bucket and Object. Each Object has a Bucket as its parent. Each Bucket has a User as its parent. User is the root.

  2. Dynamic Kinds - Users are stored in the User kind and buckets are stored in the Bucket kind, same as #1. However, objects within a bucket are stored in a dynamic kind named "<BucketID>_Object". There is no longer a parent / child relationship between bucket and object entities. This relationship is established by the name of the object kind.

#1 is of course the more intuitive and traditional model. One can argue that #2 is radical while others may say ridiculous.

Why am I thinking about #2? In my application, properties defined on objects can vary from bucket to bucket. These properties are specified by the user at bucket creation time. Also, all properties on objects need to be queryable. A dynamic object kind per bucket allows me to support these requirements. Moreover, because my object kind is now a root kind, I no longer need to apply ancestor filters, which means I get an index on each object property for free. In model #1 I am forced to apply ancestor filters, which means that I need a custom index for every property I want to query against.

I apologize for the convoluted explanation. I'll try to explain it better if it's not clear.

My questions are: is #2 a totally outrageous model? With #2 the number of kinds can potentially run into the tens of thousands. Is that OK? I understand there's a limit on the number of custom indexes. However, I am not creating custom indexes on my dynamic kinds; I rely only on the automatic indexes.

Thanks, Keyur

There are issues with both. #1 is basically fine, except use reference properties instead of ancestors, and make your Object kind an Expando.

The problem with having buckets descend from users and objects descend from buckets is that this forces every bucket and object a user creates to live in the same entity group. This constrains performance and scalability, as all of an individual user's data has to be stored on the same datastore node. Entity groups are useful when you need to manipulate multiple entities in the same transaction. If you just need to model ownership, use a ReferenceProperty.
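To make the entity group point concrete, here is a small toy model in plain Python. It is not the App Engine API: `Key`, `root()` and `owner_of` are hypothetical names made up for this sketch, and the only assumption it encodes is that an entity's group is determined by the root of its ancestor path.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Key:
    """Toy stand-in for a datastore key: a kind, a name, and an optional parent."""
    kind: str
    name: str
    parent: Optional["Key"] = None

    def root(self) -> "Key":
        # The entity group is identified by the root of the ancestor path.
        key = self
        while key.parent is not None:
            key = key.parent
        return key


# Model #1 as asked: Bucket is a child of User, Object is a child of Bucket.
user = Key("User", "keyur")
bucket_child = Key("Bucket", "photos", parent=user)
object_child = Key("Object", "cat.jpg", parent=bucket_child)

# Suggested alternative: every entity is a root; ownership is a stored reference.
bucket_root = Key("Bucket", "photos")
object_root = Key("Object", "cat.jpg")
owner_of = {bucket_root: user, object_root: bucket_root}  # reference, not ancestry

# With ancestors, everything the user owns collapses into one group.
same_group = object_child.root() == bucket_child.root() == user
# With references, the user, the bucket and the object are three separate groups.
separate_groups = len({user, bucket_root.root(), object_root.root()}) == 3
```

In the real SDK the `owner_of` mapping would roughly correspond to a ReferenceProperty on the Bucket and Object models, so ownership can still be looked up without tying the entities to one group.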

"In my application, properties defined on objects can vary from bucket to bucket. These properties are specified by the user at bucket creation time. Also, all properties on objects need to be queryable."

An Expando gives you both of these. Your properties can be defined on the fly, and they're indexed automatically.

Nothing requires two entities of the same kind to have the same set of properties. Kinds are just names; they don't define or enforce any kind of schema. Creating a bunch of them on the fly just doesn't buy you anything.
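As a rough illustration of why a single schemaless kind is enough, the sketch below keeps all objects in one store, lets each bucket use its own property names, and indexes every property as it is written. It is an in-memory toy, not the datastore or the Expando class itself: `ToyStore`, `put` and `query` are hypothetical names, and only equality filters are modelled.

```python
from collections import defaultdict


class ToyStore:
    """In-memory stand-in for one kind with automatic single-property indexes."""

    def __init__(self):
        self.entities = {}              # entity id -> dict of properties
        self.index = defaultdict(set)   # (property, value) -> entity ids

    def put(self, entity_id, **props):
        # No schema: any property name is accepted and indexed as it arrives.
        self.entities[entity_id] = props
        for name, value in props.items():
            self.index[(name, value)].add(entity_id)

    def query(self, **filters):
        # Equality filters only: intersect the per-property index entries.
        matches = [self.index.get((n, v), set()) for n, v in filters.items()]
        if not matches:
            return sorted(self.entities)
        return sorted(set.intersection(*matches))


objects = ToyStore()  # a single "Object" kind shared by every bucket
objects.put("o1", bucket="photos", camera="X100", iso=400)
objects.put("o2", bucket="photos", camera="D750", iso=400)
objects.put("o3", bucket="invoices", customer="acme", paid=True)

photos_iso_400 = objects.query(bucket="photos", iso=400)
acme_invoices = objects.query(bucket="invoices", customer="acme")
```

The bucket is just another indexed property here, which is the role a reference to the Bucket entity would play on an Expando; a per-bucket kind name adds nothing that the filter on `bucket` does not already provide.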