且构网

R: Defining a text range using regular expressions

Updated: 2023-02-17 21:30:01

Define the inputs: the variables r010 etc., which we assume are scalars, and the string s.

Then define a pattern pat which matches the {...} part and a function Sum which accepts the 3 capture groups in pat (i.e. the strings matched to the parts of pat within parentheses) and performs the desired sum.
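To make the three capture groups concrete, here is a minimal base R sketch that applies the same pattern to each braced token on its own. The helper name groups is hypothetical and only for illustration; it is not part of the answer's solution.

```r
# Same pattern as in the answer: {name} or {name-suffix}
pat <- "[{](\\w+)(-(\\w+))?[}]"

# Hypothetical helper: full match followed by the three capture groups
groups <- function(token) regmatches(token, regexec(pat, token))[[1]]

groups("{r010-050}")  # "{r010-050}" "r010" "-050" "050"
groups("{r060}")      # "{r060}"     "r060" ""     ""
```

Elements 2 to 4 correspond to the arguments x1, x2 and x3 of Sum. When the optional "-suffix" part is absent, the second and third groups come back as empty strings, which is the case Sum checks for with x3 == "".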

Use gsubfn to match the pattern, passing the capture groups to Sum and replacing the match with the output of Sum. Then evaluate it.
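The replace-then-evaluate step can be sketched in base R without gsubfn. This is only an illustration of the mechanics: the totals vector is a hypothetical stand-in for what Sum would return for each token, and the simplified pattern just finds whole braced tokens.

```r
s <- "{r010-050} == {r060}"

# Locate every braced token in the string
m <- gregexpr("[{][^}]+[}]", s)
tokens <- regmatches(s, m)[[1]]   # "{r010-050}" "{r060}"

# Hypothetical precomputed totals, standing in for the output of Sum
totals <- c("{r010-050}" = 3, "{r060}" = 3)

# Replace each token with its total, leaving the rest of the string intact
out <- s
regmatches(out, m) <- list(as.character(totals[tokens]))
out                               # "3 == 3"

# Parse the resulting text as R code and evaluate it
eval(parse(text = out))           # TRUE
```

gsubfn collapses the locate, compute and replace steps into a single call by invoking the function on the capture groups of each match.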

In the example, the only variables in the global environment whose names fall between r010 and r050 inclusive are r010 and r020 (more would have been used had they existed), and since they sum to r060, the expression returns TRUE.
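The range selection relies on comparing variable names as strings. The sketch below shows that step in isolation, using a separate environment (the name env is an assumption made here so the result does not depend on whatever else is in the global environment). It assumes the names share a prefix and have zero-padded numbers of equal width; string comparison in R also depends on the locale's collation order.

```r
# Hypothetical scratch environment holding the example variables
env <- new.env()
assign("r010", 1, envir = env)
assign("r020", 2, envir = env)
assign("r060", 3, envir = env)

lst <- ls(env)                                   # "r010" "r020" "r060"

# Keep the names that sort between the two bounds, inclusive
in_range <- lst[lst >= "r010" & lst <= "r050"]
in_range                                         # "r010" "r020"

# Fetch the selected variables and add them up
total <- sum(unlist(mget(in_range, envir = env)))
total                                            # 3
```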

library(gsubfn)

# inputs
r010 <- 1; r020 <- 2; r060 <- 3
s <- "{r010-050} == {r060}"

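# matches {name} or {name-suffix}: group 1 = name, group 2 = "-suffix", group 3 = suffix
pat <- "[{](\\w+)(-(\\w+))?[}]"
Sum <- function(x1, x2, x3, env = .GlobalEnv) {
  # upper bound: x1 itself if no range was given, else x1's non-digit prefix plus x3
  x3 <- if(x3 == "") x1 else paste0(gsub("\\d", "", x1), x3)
  lst <- ls(env)
  # sum every variable in env whose name sorts between x1 and x3 inclusive
  sum(unlist(mget(lst[lst >= x1 & lst <= x3], envir = env)))
}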
eval(parse(text = gsubfn(pat, Sum, s)))
## [1] TRUE