Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I'm using regular expressions to replace some substrings. The replacement value reuses part of the match. I want to match case insensitively, but in the replacement, I want a lower case version of the thing that was matched.

library(stringi)
x <- "CatCATdog"
rx <- "(?i)(cat)(?-i)"
stri_replace_all_regex(x, rx, "{$1}")
# [1] "{Cat}{CAT}dog"

This is close to what I want, except the "cat"s should be lower case. That is, the output string should be "{cat}{cat}dog".

The following code doesn't work, but it shows my intention.

stri_replace_all_regex(x, rx, "{tolower($1)}") 

The following technique does work, but it's ugly, not very generalizable, and not very efficient. My idea was to change the regular expression so that it matches what I want but not the already-replaced values (that is, "cat" but not "{cat}"). Then, for each input string, I find the location of the first match, do a substring replacement, and repeat until there are no more matches. It's awful.

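x <- "CatCATdog"
# Match "cat" in any case, but not when it is already wrapped in braces.
# Backslashes must be doubled inside an R string literal.
rx <- "(?i)((?<!\\{)cat(?!\\}))(?-i)"
repeat{
  # Which elements still contain an unwrapped match?
  detected <- stri_detect_regex(x, rx)
  if(!any(detected))
  {
    break
  }
  # Locate the first remaining match in each of those elements
  index <- stri_locate_first_regex(x[detected], rx)
  match <- tolower(stri_match_first_regex(x[detected], rx)[, 2])
  # match is already restricted to the detected elements, so don't subset it again
  stri_sub(x[detected], index[, 1], index[, 2]) <- paste0("{", match, "}")
}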

I feel like there must be a better way.

How do I replace case insensitive matches with lower case values?


Thanks to inspiration from the comments, I discovered that the thing I'm looking for is "replacement text case conversion".
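
For illustration, here is a minimal sketch of what replacement text case conversion can look like in base R. It swaps stringi for gsub with perl = TRUE, which is not what the code above uses; with that option the replacement string may contain \\L (lower-case what follows) and \\E (end the conversion).

```r
x <- "CatCATdog"

# Case-insensitive match with one capture group. In the replacement,
# \\L lower-cases everything after it and \\E switches the conversion off,
# so only the captured text is affected.
result <- gsub("(?i)(cat)", "{\\L\\1\\E}", x, perl = TRUE)
print(result)
# [1] "{cat}{cat}dog"
```

Using \\U in place of \\L gives the upper-case equivalent.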


1 Answer

If you need to perform any kind of string manipulation on each match, you can use gsubfn, which takes a function as the replacement:

> library(gsubfn)
> rx <- "(?i)cat"
> s = "CatCATdog"
> gsubfn(rx, ~ paste0("{",tolower(x),"}"), s, backref=0)
[1] "{cat}{cat}dog"

You can use gsubfn much as you would use an anonymous callback inside String#replace in JavaScript: the function's return value becomes the replacement. You may specify arguments for the capturing groups with function(args), and do more sophisticated manipulations inside the function.
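
If installing gsubfn is not an option, the same callback idea can be sketched in base R with gregexpr and regmatches<-. This is a different technique from the gsubfn call above, and the helper name wrap_lower is made up for this example.

```r
# wrap_lower is a hypothetical helper, not part of any package.
# It applies a function to every match and writes the results back.
wrap_lower <- function(x, pattern) {
  # Find all case-insensitive matches in each element of x
  m <- gregexpr(pattern, x, ignore.case = TRUE)
  # Transform each element's matches; sprintf() returns a zero-length
  # result when an element has no matches, so nothing is inserted there
  regmatches(x, m) <- lapply(
    regmatches(x, m),
    function(hit) sprintf("{%s}", tolower(hit))
  )
  x
}

result <- wrap_lower("CatCATdog", "cat")
print(result)
# [1] "{cat}{cat}dog"
```

The function passed to lapply plays the role of the callback, so any transformation of the matched text can go there.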

