且构网

分享程序员开发的那些事...
且构网 - 分享程序员编程开发的那些事

使用regexp_split_to_table(Amazon Redshift)时出错

更新时间:2022-12-07 17:07:04

A1



E 是Posix样式的转义字符串的前缀。在现代的Postgres中,您通常不需要此功能。仅当您想解释字符串中的特殊字符时,才添加它。像 E'\n'换行符。详细信息和文档链接:

A1

E is a prefix for Posix-style escape strings. You don't normally need this in modern Postgres. Only prepend it if you want to interpret special characters in the string. Like E'\n' for a newline char.Details and links to documentation:

  • Insert text with single quotes in PostgreSQL
  • SQL select where column begins with \

E 是没有意义的,但仍然可以使用。 >答案恐怕您链接到的不是很好。

E is pointless noise in your query, but it should still work. The answer you are linking to is not very good, I am afraid.

应该工作照原样。但***不要使用 E

Should work as is. But better without the E.

SELECT id, regexp_split_to_table(fruits, '|') AS split_fruits
FROM   tbl;

对于简单的分隔符,您不需要昂贵的正则表达式。通常这会更快:

For simple delimiters, you don't need expensive regular expressions. This is typically faster:

SELECT id, unnest(string_to_array(fruits, '|')) AS split_fruits
FROM   tbl;

在Postgres 9.3+中,您宁愿使用 LATERAL 加入集合返回函数:

In Postgres 9.3+ you'd rather use a LATERAL join for set-returning functions:

SELECT t.id, f.split_fruits
FROM   tbl t
LEFT   JOIN LATERAL unnest(string_to_array(fruits, '|')) AS f(split_fruits)
                                                                   ON true;

详细信息:

  • What is the difference between LATERAL and a subquery in PostgreSQL?
  • PostgreSQL unnest() with element number

它仅实现了一组简化的功能,如其手册中记录的。特别是,没有没有表函数,包括基本函数 unnest() generate_series() regexp_split_to_table()当使用其计算节点(访问任何表)时。

It only implements a reduced set of features as documented in its manual. In particular, there are no table functions, including the essential functions unnest(), generate_series() or regexp_split_to_table() when working with its "compute nodes" (accessing any tables).

您应该以标准化表格布局作为开头(每行一个水果的额外表格)。

You should go with a normalized table layout to begin with (extra table with one fruit per row).

或者以下是一些用于创建在Redshift中设置行:

Or here are some options to create a set of rows in Redshift:

  • How to select multiple rows filled with constants in Amazon Redshift?

解决方法应该做到:


  1. 创建一个数字表,其中列的数量至少应与水果的数量一样多。临时或永久(如果您继续使用)。假设我们从来没有超过9:

  1. Create a table of numbers, with at least as many rows as there can be fruits in your column. Temporary or permanent if you'll keep using it. Say we never have more than 9:

CREATE TEMP TABLE nr9(i int);
INSERT INTO nr9(i) VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9);


  • 加入数字表并使用 split_part(),它是实际上是在Redshift中实现的

    SELECT *, split_part(t.fruits, '|', n.i) As fruit
    FROM   nr9 n
    JOIN   tbl t ON split_part(t.fruits, '|', n.i) <> ''
    


  • Voilá。