更新时间:2023-02-15 11:14:49
如果您计划搜索特定属性,将其序列化为单个列是一个坏主意,因为您必须使用每行函数来获取信息 - 这从来不是很好。
If you ever plan on searching for specific attribtes, it's a bad idea to serialize them into a single column, since you'll have to use per-row functions to get the information out - this never scales well.
我会选择你的第二选择。在属性表中有一个属性列表,它们自己表中的对象,以及一个叫做对象属性的多对多关系表。
I would opt for your second choice. Have a list of attributes in an attribute table, the objects in their own table, and a many-to-many relationship table called object attributes.
例如:
objects:
object_id integer
object_name varchar(20)
primary key (object_id)
attributes:
attr_id integer
attr_name varchar(20)
primary key (attr_id)
object_attributes:
object_id integer references (objects.object_id)
attr_id integer references (attributes.attr_id)
primary key (object_id,attr_id)
关于性能是注意的,但在我的经验,拆分列总是比组合多列更昂贵。如果事实证明存在性能问题,出于性能原因,打破3NF是完全可以接受的。
Your concern about performance is noted but, in my experience, it's always more costly to split a column than to combine multiple columns. If it turns out that there are performance problems, it's perfectly acceptable to break 3NF for performance reasons.
在这种情况下,我将以相同的方式存储它,但也有一列与原始序列化数据。如果使用插入/更新触发器来保持列和合并数据同步,则不会有任何问题。但你不应该担心,直到一个实际的问题表面。
In that case I would store it the same way but also have a column with the raw serialized data. Provided you use insert/update triggers to keep the columnar and combined data in sync, you won't have any problems. But you shouldn't worry about that until an actual problem surfaces.
通过使用这些触发器,你最小化所需的工作,只有当数据更改。通过尝试提取子列信息,您对每个select都做不必要的工作。
By using those triggers, you minimize the work required to only when the data changes. By trying to extract sub-column information, you do unnecessary work on every select.