Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

(Related question that does not include sorting. It's easy to just use paste when you don't need to sort.)

I have a less-than-ideally-structured table with character columns that are generic "item1","item2" etc. I would like to create a new character variable that is the alphabetized, comma-separated concatenation of these columns. So for example, in row 5, if item1 = "milk", item2 = "eggs", and item3 = "butter", the new variable in row 5 might be "butter, eggs, milk"

I wrote a function f() below that works on two character variables. However, I am having trouble

  • Using mapply or other "vectorization" (I know it's really just a for loop)
  • Generalizing the function to an arbitrary number of columns

Any help much appreciated.

df <- data.frame(a =c("foo","bar"), 
                 b= c("baz","qux"))   
paste(df$a,df$b, sep=", ")
# returns [1] "foo, baz" "bar, qux" ... but I want [1] "baz, foo" "bar, qux"

f <- function(a,b) paste(c(a,b)[order(c(a,b))],collapse=", ")
f("foo","baz") 
# returns [1] "baz, foo" ... which is what I want ... how to vectorize?

df$new_var <- mapply(f, df$a, df$b)
df 
#     a   b new_var      <- new_var is not what I want
# 1 foo baz    1, 2
# 2 bar qux    1, 2

# Interestingly, data.table is smart enough to fix my bad mapply
library(data.table)
dt <- data.table(a =c("foo","bar"), 
                 b= c("baz","qux"))  
dt[,new_var:=mapply(f, a, b)]
dt
#     a    b  new_var    <- new var IS what I want
# 1: foo baz baz, foo
# 2: bar qux bar, qux
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
656 views
Welcome To Ask or Share your Answers For Others

1 Answer

Just apply down rows:

apply(df,1,function(x){
  paste(sort(x),collapse = ",")
})

Wrap it in a function if you want. You'll either have to define which columns to send or assume all. i.e. apply(df[ ,2:3],1,f()...

sort(x) is the same as x[order(x)]


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...