
mapping over the rows of a data frame

Tags:

r

Suppose I have a data frame with columns c1, ..., cn, and a function f that takes in the columns of this data frame as arguments. How can I apply f to each row of the data frame to get a new data frame?

For example,

x = data.frame(letter=c('a','b','c'), number=c(1,2,3))
# x is
# letter | number
#      a | 1
#      b | 2
#      c | 3

f = function(letter, number) { paste(letter, number, sep='') }

# desired output is
# a1
# b2
# c3

How do I do this? I'm guessing it's something along the lines of {s,l,t}apply(x, f), but I can't figure it out.

Brett asked Aug 13 '10 21:08



1 Answer

As @greg points out, paste() can do this on its own. I suspect your example is a simplification of a more general problem, though. After struggling with this in the past, as illustrated in a previous question, I ended up using the plyr package for this type of thing. plyr does a lot more, but for applying a function to each row it's easy:

> require(plyr)
> adply(x, 1, function(x) f(x$letter, x$number))
  X1 V1
1  1 a1
2  2 b2
3  3 c3

You'll want to rename the output columns, I'm sure.
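For comparison, here is a minimal base R sketch of the same idea without plyr. It assumes the x and f from the question; the output column name combined is a hypothetical choice, and stringsAsFactors = FALSE is set only so the letters stay plain character strings on older R versions.

```r
x <- data.frame(letter = c("a", "b", "c"), number = c(1, 2, 3),
                stringsAsFactors = FALSE)
f <- function(letter, number) paste(letter, number, sep = "")

# f happens to be vectorized (paste is), so it can take whole columns at once
vectorized <- f(x$letter, x$number)

# For an f that only handles one row at a time, mapply() walks the columns
# in parallel and calls f once per row
per_row <- mapply(f, x$letter, x$number, USE.NAMES = FALSE)

# Collect the result in a data frame with a descriptive column name
# ("combined" is a hypothetical name, not from the original post)
result <- data.frame(combined = per_row, stringsAsFactors = FALSE)
print(result)
```

When f is already vectorized, the direct call is the simplest option; mapply() is the fallback for functions that cannot take whole columns.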

While I was typing this, @joshua showed an alternative method using ddply. The difference in my example is that adply treats the input data frame as an array and applies the function over its rows (margin 1), so it does not need the "group by" variable row that @joshua created. His approach is exactly what I was doing until Hadley tipped me off to the adply() approach in the aforementioned question.
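The "group by a row variable" idea can be illustrated without plyr. The following is only a base R sketch of that split-apply-combine pattern, not @joshua's actual code; the row_id and value column names are hypothetical.

```r
x <- data.frame(letter = c("a", "b", "c"), number = c(1, 2, 3),
                stringsAsFactors = FALSE)
f <- function(letter, number) paste(letter, number, sep = "")

# Hypothetical helper column: a unique id per row, so each row is its own group
x$row_id <- seq_len(nrow(x))

# Split into one-row data frames, apply f to each piece, then recombine
pieces <- split(x, x$row_id)
values <- vapply(pieces, function(df) f(df$letter, df$number), character(1))

by_row <- data.frame(row_id = as.integer(names(values)),
                     value = unname(values),
                     stringsAsFactors = FALSE)
print(by_row)
```

Applying over margin 1 avoids the extra id column entirely, which is the convenience adply offers here.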

JD Long answered Sep 22 '22 14:09