Updated: 2023-11-18 22:47:58
I'm trying to write a regex that picks out the words in a file whose letters all come from the letters of a given pattern word.
My problem is that the regex can't find accented words, but my text file contains a lot of them.
My command line is:
cat input/words.txt | grep '^[éra]{1,4}$' > output/words_era.txt
cat input/words.txt | grep '^[carroça]{1,7}$' > output/words_carroca.txt
And the content of file is:
carroça
éra
éssa
roça
roco
rato
onça
orça
roca
How can I fix it?
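If your file is encoded in ISO-8859-1 but your system locale is UTF-8, this will not work: the é in the pattern is the two-byte UTF-8 sequence 0xC3 0xA9, while the é in the file is the single byte 0xE9, which is not valid UTF-8 and so never matches.
Either convert the file to UTF-8 or change your system locale to ISO-8859-1.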
# convert from ISO-8859-1 to the environmental locale before grepping
# output will be in the current locale
$ iconv -f 8859_1 input/words.txt | grep ...

# run grep with an ISO-8859-1 locale
# output will be in ISO-8859-1 encoding
$ cat input/words.txt | env LC_ALL=en_US grep ...
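
To make the first option concrete, here is a minimal sketch of the conversion step. It assumes the input really is ISO-8859-1 and that the shell runs in a UTF-8 locale. The file names words_latin1.txt and words_utf8.txt are made up for the illustration and are not from the question. Note also that `{1,4}` is only an interval in extended regular expressions (with plain grep it would have to be written `\{1,4\}`), so the sketch uses grep -E.

```shell
# Build a tiny ISO-8859-1 sample (hypothetical file name):
# \351 is the single byte 0xE9 (é), \347 is the single byte 0xE7 (ç).
printf '\351ra\nro\347a\n' > words_latin1.txt

# Convert to UTF-8; é becomes the two bytes 0xC3 0xA9, ç becomes 0xC3 0xA7.
iconv -f ISO-8859-1 -t UTF-8 words_latin1.txt > words_utf8.txt

# Inspect the converted bytes to confirm the re-encoding happened.
od -An -tx1 words_utf8.txt

# Grep the converted file; -E makes {1,4} an interval rather than literal text.
grep -E '^[éra]{1,4}$' words_utf8.txt
```

Applied to the question, the same idea would be to pipe `iconv -f ISO-8859-1 -t UTF-8 input/words.txt` into `grep -E` with the original patterns; the output files would then be UTF-8 rather than ISO-8859-1.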