且构网

分享程序员开发的那些事...
且构网 - 分享程序员编程开发的那些事

设计“一排”关系的正确方法是什么?

更新时间:2023-01-19 15:53:56

将一列添加到Datasets表中,该表指定它所代表的数据集类型。那么1可能意味着HEA,2可能意味着CSO。检查约束将该字段限制为两个值之一。如果稍后添加新类型的数据集,则唯一需要改变的是更改约束。如果将其定义为数据集类型表的外键,则只需要在其中添加新的数据集类型。



在PK和新字段。



将相同的字段添加到每个子表。但是,检查约束将HEA表中的值限制为该值,并将CSO表仅限制该值。然后形成Datasets表的ID字段,将新字段形成为FK to Datasets表。



这将ID值限制为只有一个子表,它必须是一个在Datasets表中定义。也就是说,如果您定义了ID值为1000的HEA数据集条目和HEA类型值,则唯一可以包含ID值为1000的子表是HEA表。



创建表数据集(
ID int identity / auto_generate,
DSType char(3)检查(DSType in('HEA','CSO'),
[一切],
约束PK_Datasets主键(ID),
约束UQ_Dateset_Type唯一(ID,DSType) - 需要参考
);

创建表HEA(
ID int not null,
DSType char(3)检查(DSType ='HEA') - 使其为常量值
[其他HEA数据],
约束PK_HEA主键(ID),
约束FK_HEA_Dataset_PK外键(ID)
引用数据集(ID),
约束FK_HEA_Dataset_Type外键(ID,DSType)
引用数据集(ID,DSType)
);

与CSO子表相同的想法。 p>

我会推荐一个HEA和CSO视图,显示完整的数据集行,元数据和类型特定的数据,连接在一起。使用这些视图的触发器,它们可以是应用程序代码的DML点。然后,应用程序不必跟踪数据库中的数据是如何布局的,如果机会出现本身,则可以轻松进行改进。


I am trying to model the following in a postgres db.

I have N number of 'datasets'. These datasets are things like survey results, national statistics, aggregated data etc. They each have a name a source insitution a method etc. This is the meta data of a dataset and I have tables created for this and tables for codifying the research methods etc. The 'root' meta-data table is called 'Datasets'. Each row represents one dataset.

I then need to store and access the actual data associated with this dataset. So I need to create a table that contains that data. How do I represent the relationship between this table and its corresponding row in the 'Datasets' table?

an example

'hea' is a set of survey responses. it is unaggregated so each row is one survey response. I create a table called 'HeaData' that contains this data.

'cso' is a set of aggregated employment data. each row is a economic sector. I create a table called 'CsoData' that contains this data

I create a row for each of these in the 'datasets' table with the relevant meta data for each and they have ids of 1 & 2 respectively.

what is the best way to relate 1 to the HeaData table and 2 to the CsoData table?

I will eventually be accessing this data with scala slick so if the database design could just 'plug and play' with slick that would be ideal

Add a column to the Datasets table which designates which type of dataset it represents. Then a 1 may mean HEA and 2 may mean CSO. A check constraint would limit the field to one of the two values. If new types of datasets are added later, the only change needed is to change the constraint. If it is defined as a foreign key to a "type of dataset" table, you just need to add the new type of dataset there.

Form a unique index on the PK and the new field.

Add the same field to each of the subtables. But the check constraint limits the value in the HEA table to only that value and the CSO table to only that value. Then form the ID field of Datasets table and the new field as the FK to Datasets table.

This limits the ID value to only one of the subtables and it must be the one defined in the Datasets table. That is, if you define a HEA dataset entry with an ID value of 1000 and the HEA type value, the only subtable that can contain an ID value of 1000 is the HEA table.

create table Datasets(
    ID     int identity/auto_generate,
    DSType char( 3 ) check( DSType in( 'HEA', 'CSO' ),
    [everything else],
    constraint PK_Datasets primary key( ID ),
    constraint UQ_Dateset_Type unique( ID, DSType ) -- needed for references
);

create table HEA(
    ID     int not null,
    DSType char( 3 ) check( DSType = 'HEA' ) -- making this a constant value
    [other HEA data],
    constraint PK_HEA primary key( ID ),
    constraint FK_HEA_Dataset_PK foreign key( ID )
        references Dataset( ID ),
    constraint FK_HEA_Dataset_Type foreign key( ID, DSType )
        references Dataset( ID, DSType )
);

The same idea with the CSO subtable.

I would recommend an HEA and CSO view that would show the complete dataset rows, metadata and type-specific data, joined together. With triggers on those views, they can be the DML points for the application code. Then the apps don't have to keep track of how that data is laid out in the database, making it a lot easier to make improvements should the opportunity present itself.