How to use dplyr for programming

Tags: r, dplyr, cran

I like dplyr for data manipulation, but I don't understand how to use it for programming. For example, to rescale some variables, we could do:

mutate(cars, speed.scaled = scale(speed), dist.scaled = scale(dist))

Very cool. But now suppose I want to write a function that uses mutate to scale all variables in a data frame. How do I create the ... argument? The best thing I can come up with is something like:

fnargs <- lapply(names(cars), function(x){call("scale", as.name(x))})
names(fnargs) <- paste0(names(cars), ".scaled")
do.call(mutate, c(.data=as.name("cars"), fnargs))

Or is there an alternative interface that is more programming friendly?

Jeroen Ooms asked Jan 31 '14 02:01



2 Answers

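Easy peasy: use mutate_each(cars, funs(scale)) or apply(cars, 2, scale). Note that apply() returns a matrix rather than a data frame, and that mutate_each() and funs() are deprecated in later dplyr releases.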
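
For the "write a function" part of the question, here is a minimal sketch of the same idea using across(), assuming dplyr 1.0.0 or later is installed. The helper name scale_all is hypothetical, and as.numeric() is added only to drop the one-column matrix that scale() returns:

```r
library(dplyr)

# Hypothetical helper: scale every column of a data frame and append the
# results as new "<name>.scaled" columns, as in the question's mutate() call.
scale_all <- function(df) {
  mutate(
    df,
    across(
      everything(),
      ~ as.numeric(scale(.x)),     # scale() returns a matrix; keep a plain vector
      .names = "{.col}.scaled"     # e.g. speed -> speed.scaled
    )
  )
}

scaled <- scale_all(cars)
names(scaled)
```

Because the column names are generated inside across(), there is no need to build the ... argument by hand.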

crestor answered Sep 22 '22 19:09


This can be done in base R like this:

cars.scaled <- as.data.frame(scale(cars))

or

cars.scaled <- replace(cars, TRUE, lapply(cars, scale))

or

cars.scaled <- cars
cars.scaled[] <- lapply(cars, scale)

The first one above can be translated to work with %>% like this:

cars.scaled <- cars %>% scale %>% as.data.frame
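
Any of the base R variants can be wrapped in a function. One caveat: scale() returns a one-column matrix, so the lapply-based versions leave matrix columns in the result. A minimal sketch that flattens them, where scale_df is a hypothetical helper name:

```r
# Hypothetical helper: scale all columns of a data frame using base R only.
scale_df <- function(df) {
  out <- df
  # out[] <- ... replaces the columns but keeps the data.frame structure;
  # as.numeric() drops the matrix shape and attributes added by scale().
  out[] <- lapply(df, function(col) as.numeric(scale(col)))
  out
}

cars.scaled <- scale_df(cars)
str(cars.scaled)
```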

G. Grothendieck answered Sep 19 '22 19:09