且构网

分享程序员开发的那些事...
且构网 - 分享程序员编程开发的那些事

执行动态交叉表查询

更新时间:2023-02-05 17:10:18

您要的是 不可能 . SQL是一种严格类型化的语言. PostgreSQL函数需要在创建 时间声明返回类型(RETURNS ..).

解决此问题的一种有限方法是使用多态函数.如果您可以在函数调用 时提供返回类型.但这从您的问题中看不出来.

A limited way around this is with polymorphic functions. If you can provide the return type at the time of the function call. But that's not evident from your question.

可以返回带有匿名记录的完全动态的结果.但是随后您需要在每次调用时提供一个列定义列表.您如何知道返回的列?赶上22..

You can return a completely dynamic result with anonymous records. But then you are required to provide a column definition list with every call. And how do you know about the returned columns? Catch 22.

有多种解决方法,具体取决于您需要或可以使用的方法.由于您的所有数据列似乎都共享相同的数据类型,因此我建议返回一个 array :text[].或者,您可以返回hstorejson之类的文档类型.相关:

There are various workarounds, depending on what you need or can work with. Since all your data columns seem to share the same data type, I suggest to return an array: text[]. Or you could return a document type like hstore or json. Related:

将hstore键动态转换为一组未知键集的列

但是使用两个调用可能会更简单:1:让Postgres构建查询. 2:执行并检索返回的行.

But it might be simpler to just use two calls: 1: Let Postgres build the query. 2: Execute and retrieve returned rows.

我完全不会使用您的问题 中提出的Eric Minikel的功能.通过恶意格式错误的标识符来防止SQL注入是不安全的.使用 format() 进行构建查询字符串,除非您运行的版本早于Postgres 9.1.

I would not use the function from Eric Minikel as presented in your question at all. It is not safe against SQL injection by way of maliciously malformed identifiers. Use format() to build query strings unless you are running an outdated version older than Postgres 9.1.

更短,更简洁的实现可能看起来像这样:

A shorter and cleaner implementation could look like this:

CREATE OR REPLACE FUNCTION xtab(_tbl regclass, _row text, _cat text
                              , _expr text  -- still vulnerable to SQL injection!
                              , _type regtype)
  RETURNS text AS
$func$
DECLARE
   _cat_list text;
   _col_list text;
BEGIN

-- generate categories for xtab param and col definition list    
EXECUTE format(
 $$SELECT string_agg(quote_literal(x.cat), '), (')
        , string_agg(quote_ident  (x.cat), %L)
   FROM  (SELECT DISTINCT %I AS cat FROM %s ORDER BY 1) x$$
 , ' ' || _type || ', ', _cat, _tbl)
INTO  _cat_list, _col_list;

-- generate query string
RETURN format(
'SELECT * FROM crosstab(
   $q$SELECT %I, %I, %s
      FROM   %I
      GROUP  BY 1, 2  -- only works if the 3rd column is an aggregate expression
      ORDER  BY 1, 2$q$
 , $c$VALUES (%5$s)$c$
   ) ct(%1$I text, %6$s %7$s)'
, _row, _cat, _expr  -- expr must be an aggregate expression!
, _tbl, _cat_list, _col_list, _type
);

END
$func$ LANGUAGE plpgsql;

与原始版本相同的函数调用.功能crosstab()由附加模块 tablefunc .基础:

Same function call as your original version. The function crosstab() is provided by the additional module tablefunc which has to be installed. Basics:

这将安全地处理列名和表名.注意使用对象标识符类型regclassregtype.也适用于模式限定的名称.

This handles column and table names safely. Note the use of object identifier types regclass and regtype. Also works for schema-qualified names.

但是,当您传递要作为表达式执行的字符串(原始查询中的_expr-cellc)时, 并不完全安全 .这种输入本质上是不安全的,不能进行SQL注入,因此永远不要向公众公开.

However, it is not completely safe while you pass a string to be executed as expression (_expr - cellc in your original query). This kind of input is inherently unsafe against SQL injection and should never be exposed to the general public.

仅一次 扫描表中的两个类别列表,并且速度应更快一些.

Scans the table only once for both lists of categories and should be a bit faster.

仍然无法返回完全动态的行类型,因为那是绝对不可能的.

Still can't return completely dynamic row types since that's strictly not possible.