且构网

分享程序员开发的那些事...
且构网 - 分享程序员编程开发的那些事

Inner Join vs Natural Join vs USING 子句:有什么优势吗?

更新时间:2023-10-23 14:15:34

现在,除了第一种形式有重复的列之外,其他两种形式是否有真正的优势?或者它们只是语法糖?

Now, apart from the fact that the first form has a duplicated column, is there a real advantage to the other two forms? Or are they just syntactic sugar?

TL;DR NATURAL JOIN 用于某种关系编程风格,比通常的 SQL 风格更简单.(尽管当嵌入到 SQL 中时,它会承担其余 SQL 查询语法的负担.)那是因为 1. 它直接使用谓词的简单运算符逻辑,工程(包括软件工程)、科学(包括计算机科学)和数学中的精确语言,此外 2.同时交替em>它直接使用关系代数简单运算符.

TL;DR NATURAL JOIN is used in a certain style of relational programming that is simpler than the usual SQL style. (Although when embedded in SQL it is burdened with the rest of SQL query syntax.) That's because 1. it directly uses the simple operators of predicate logic, the language of precision in engineering (including software engineering), science (including computer science) and mathematics, and moreover 2. simultaneously and alternatively it directly uses the simple operators of relational algebra.

关于 NATURAL JOIN 的常见抱怨是,由于共享列不是明确的,因此在架构更改后可能会出现不适当的列配对.在特定的开发环境中可能就是这种情况.但在那种情况下,有一个要求,即只加入某些列,而没有 PROJECT 的 NATURAL JOIN 是不合适的.所以这些论点假设 NATURAL JOIN 使用不当.此外,争论者甚至没有意识到他们忽略了需求.这样的抱怨是似是而非.(此外,健全的软件工程设计原则导致没有具有此类规范的接口.)

The common complaint about NATURAL JOIN is that since shared columns aren't explicit, after a schema change inappropriate column pairing may occur. And that may be the case in a particular development environment. But in that case there was a requirement that only certain columns be joined and NATURAL JOIN without PROJECT was not appropriate. So these arguments assume that NATURAL JOIN is being used inappropriately. Moreover the arguers aren't even aware that they are ignoring requirements. Such complaints are specious. (Moreover, sound software engineering design principles lead to not having interfaces with such specificiatons.)

来自同一阵营的另一个相关的误解似是而非的抱怨是 NATURAL JOIN 甚至不考虑外键关系".但是任何加入是因为表含义,而不是约束一>.查询不需要约束.如果添加了约束,则查询保持正确.如果删除约束,则依赖它的查询就会出错,并且必须更改为依赖它的措辞,不必改变.这与 NATURAL JOIN 无关.

Another related misconceived specious complaint from the same camp is that "NATURAL JOIN does not even take foreign key relationships into account". But any join is there because of the table meanings, not the constraints. Constraints are not needed to query. If a constraint is added then a query remains correct. If a constraint is dropped then a query relying on it becomes wrong and must be changed to a phrasing that doesn't rely on it that wouldn't have had to change. This has nothing to do with NATURAL JOIN.

您已经描述了效果上的差异:只返回每个公共列的一个副本.

You have described the difference in effect: just one copy of each common column is returned.

来自是否有根据人类可读的描述构造 SQL 查询的经验法则?:

事实证明,自然语言表达式和逻辑表达式以及关系代数表达式和 SQL 表达式(后两者的混合)以一种相当直接的方式对应.

It turns out that natural language expressions and logical expressions and relational algebra expressions and SQL expressions (a hybrid of the last two) correspond in a rather direct way.

例如来自 Codd 1970:

所描述的关系称为组件.[...] component(x, y,z) 的含义是x 是部件 y 的直接组件(或子组件),组装一个部件需要 z 个部件 xy 部分.

The relation depicted is called component. [...] The meaning of component(x, y,z) is that part x is an immediate component (or subassembly) of part y, and z units of part x are needed to assemble one unit of part y.

来自这个答案:

每个基表都有一个语句模板,又名谓词,由列名参数化,我们可以通过列名放入或删除一行.

Every base table has a statement template, aka predicate, parameterized by column names, by which we put a row in or leave it out.

将一行插入谓词给出了一个语句又名命题.提出真命题的行放在表中,提出假命题的行留在表中.(所以一个表格陈述了每个存在行的命题,而不陈述每个缺席行的命题.)

Plugging a row into a predicate gives a statement aka proposition. The rows that make a true proposition go in a table and the rows that make a false proposition stay out. (So a table states the proposition of each present row and states NOT the proposition of each absent row.)

但是每个表表达式值每个表达式都有一个谓词.关系模型的设计使得如果表 TU 包含 T(...) 和 U(...)(分别)的行,则:

But every table expression value has a predicate per its expression. The relational model is designed so that if tables T and U hold rows where T(...) and U(...) (respectively) then:

  • T NATURAL JOIN U 保存 T(...) AND U(...) 的行
  • T WHEREcondition 保存 T(...) AND condition
  • 的行
  • T UNION CORRESPONDING U 保存 T(...) OR U(...) 的行
  • T EXCEPT CORRESPONDING U 保存 T(...) AND NOT U(...) 的行
  • SELECT DISTINCT要保留的列FROM T 保存行,其中
    存在 要删除的列 这样的 T(...)
    • T NATURAL JOIN U holds rows where T(...) AND U(...)
    • T WHEREcondition holds rows where T(...) AND condition
    • T UNION CORRESPONDING U holds rows where T(...) OR U(...)
    • T EXCEPT CORRESPONDING U holds rows where T(...) AND NOT U(...)
    • SELECT DISTINCTcolumns to keepFROM T holds rows where
      THERE EXISTS columns to drop SUCH THAT T(...)
    • etc
    • 而对 SQL 的其他推理...不是自然的":

      Whereas reasoning about SQL otherwise is... not "natural":

      SQL SELECT 语句在代数上可以被认为是 1. 隐式地将表的每一列 C 重命名为(可能是隐式的)相关名称 TTC,然后 2. CROSS JOINing,然后 3. RESTRICTing per INNER ON,然后 4. RESTRICTing per WHERE,然后 5. PROJECTing per SELECT,然后 6. RENAMEing per SELECT,删除 T.s, then 7. 隐式重命名以删除剩余的 T.s T.-RENAMEings 之间的代数运算符也可以被认为是逻辑运算符和表名作为它们的谓词:T JOIN ... vs Employee T.EMPLOYEE 的名字是 T.NAME ... AND ....但从概念上讲,SELECT 语句内部是一个双重命名诱导 CROSS JOIN 表,列名使用 T.Cs,而外部表的列名使用 Cs.

      An SQL SELECT statement can be thought of algebraically as 1. implicitly RENAMEing each column C of a table with (possibly implicit) correlation name T to T.C, then 2. CROSS JOINing, then 3. RESTRICTing per INNER ON, then 4. RESTRICTing per WHERE, then 5. PROJECTing per SELECT, then 6. RENAMEing per SELECT, dropping T.s, then 7. implicitly RENAMEing to drop remaining T.s Between the T.-RENAMEings algebra operators can also be thought of as logic operators and table names as their predicates: T JOIN ... vs Employee T.EMPLOYEE has name T.NAME ... AND .... But conceptually inside a SELECT statement is a double-RENAME-inducing CROSS JOIN table with T.Cs for column names while outside tables have Cs for column names.

      或者,SQL SELECT 语句在逻辑上可以被认为是 1. 在每个相关名称 T 和基本名称或子查询 的整个语句周围引入 FORSOME T IN E>E,然后2.引用量化的T的值,用TC引用它的C部分,然后3. 根据 FROM 等从 TC 构建结果行,然后 4. 根据 SELECT 子句命名结果行列,然后 4. 离开 FORSOME 的范围.代数运算符再次被认为是逻辑运算符,而表名则是它们的谓词.同样,这在概念上有 T.C 在 SELECT 里面,但 C 在外面,相关名称来来去去.

      Alternatively an SQL SELECT statement can be thought of logically as 1. introducing FORSOME T IN E around the entire statement per correlation name T and base name or subquery E, then 2. referring to the value of quantified T by using T.C to refer to its C part, then 3. building result rows from T.Cs per FROM etc, then 4. naming the result row columns per the SELECT clause, then 4. leaving the scope of the FORSOMEs. Again the algebra operators are being thought of as logic operators and table names as their predicates. Again though, this conceptually has T.C inside SELECTs but C outside with correlation names coming and going.

      这两种 SQL 解释远不及使用 JOIN 或 AND 等简单,可互换.(您不必同意它更简单,但这种看法就是为什么 NATURAL JOIN 和 UNION/EXCEPT CORRESPONDING 存在的原因.)(在其预期用途之外批评这种风格的论点是似是而非.)

      These two SQL interpretations are nowhere near as straightforward as just using JOIN or AND, etc, interchangeably. (You don't have to agree that it's simpler, but that perception is why NATURAL JOIN and UNION/EXCEPT CORRESPONDING are there.) (Arguments criticizing this style outside the context of its intended use are specious.)

      USING 是一种中间地带孤儿,一只脚在 NATURAL JOIN 阵营,另一只脚在 CROSS JOIN.它在前者中没有实际作用,因为那里没有重复的列名.在后者中,它或多或少只是缩写 JOIN 条件和 SELECT 子句.

      USING is a kind of middle ground orphan with one foot in the NATURAL JOIN camp and one in the CROSS JOIN. It has no real role in the former because there are no duplicate column names there. In the latter it is more or less just abbreviating JOIN conditions and SELECT clauses.

      我可以看到后一种形式的缺点是您应该将主键和外键命名为相同的名称,这并不总是实用的.

      I can see the disadvantage in the latter forms is that you are expected to have named your primary and foreign keys the same, which is not always practical.

      PK(主键)、FK(外键)和查询不需要其他约束.(知道一列是其他人的函数允许标量子查询,但你总是可以不使用.)此外,任何两个表都可以有意义地连接.如果您需要两列与 NATURAL JOIN 具有相同的名称,您可以通过 SELECT AS 重命名.

      PKs (primary keys), FKs (foreign keys) & other constraints are not needed for querying. (Knowing a column is a function of others allows scalar subqueries, but you can always phrase without.) Moreover any two tables can be meaningfully joined. If you need two columns to have the same name with NATURAL JOIN you rename via SELECT AS.