且构网

How do I remove repeated characters from a string in R?

Updated: 2023-01-12 21:23:39

I have not thought about this very carefully, but here is a quick solution that uses a backreference in a regular expression:

gsub('([[:alpha:]])\\1+', '\\1', 'Buenaaaaaaaaa Suerrrrte')
# [1] "Buena Suerte"

The parentheses () capture a single letter, \\1 is a backreference to that captured letter, and + matches the backreference one or more times. Put together, the pattern matches any letter that occurs two or more times in a row, and the replacement \\1 collapses the whole run to a single copy.
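
As a rough illustration of how these pieces behave, the sketch below wraps the same gsub() call in a small helper. The name squeeze_letters is made up for this example and is not part of base R. It shows two side effects worth knowing about: legitimate double letters are collapsed too, and the backreference is case-sensitive, so "A" followed by "a" does not count as a repeat.

```r
# Hypothetical helper (not a base R function): collapse runs of the
# same letter into a single letter, using the pattern from the answer.
squeeze_letters <- function(x) {
  gsub("([[:alpha:]])\\1+", "\\1", x)
}

squeeze_letters("Buenaaaaaaaaa Suerrrrte")
# [1] "Buena Suerte"

# Real double letters are collapsed as well; digits and spaces are untouched.
squeeze_letters("letter  1100")
# [1] "leter  1100"

# The backreference is case-sensitive: only the lowercase run shrinks.
squeeze_letters("Aaa")
# [1] "Aa"
```

If words with genuine double letters must survive, this pattern alone is not enough; one possible variation is to collapse only runs of three or more, which is an assumption about the input rather than something the answer states.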

To handle characters other than letters, replace [[:alpha:]] with a regex that matches whatever you wish to include, for example [[:alnum:]] for letters and digits.
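
A minimal sketch of that substitution, assuming you want to widen the match beyond letters. The input string is invented for illustration. The first call uses [[:alnum:]] to cover letters and digits, and the second uses . to collapse runs of any character, including punctuation and spaces.

```r
x <- "Hellooo!!!  1000 tiiimes"

# Letters and digits only: punctuation and the double space are left alone.
alnum_only <- gsub("([[:alnum:]])\\1+", "\\1", x)
alnum_only
# [1] "Helo!!!  10 times"

# Any character: runs of "!" and of spaces are collapsed as well.
any_char <- gsub("(.)\\1+", "\\1", x)
any_char
# [1] "Helo! 10 times"
```

Note that collapsing digits changes numbers (1000 becomes 10 here), so the wider classes are only appropriate when that loss is acceptable.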