Passing data.frame as an argument in function

Tags:

function

r

Context

As a followup to R: Pass data.frame by reference to a function and How to add a column in the data frame within a function

I am attempting the following, seemingly easy, function:

naToZero <- function(df) {
  df$Vol[is.na(df$Vol)] <- 0
}

Data.frame

> str(WFM)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   990571 obs. of  14 variables:
 $ Date      : chr  "04/12/2017" "04/12/2017" "04/12/2017" "04/12/2017" ...
 $ Time      :Classes 'hms', 'difftime'  atomic [1:990571] 41970 41969 41968 41967 41966 ...
  .. ..- attr(*, "units")= chr "secs"
 $ Bar#      : chr  "197953/197953" NA "197952/197953" NA ...
 $ Bar Index : int  0 NA -1 NA NA -2 NA NA -3 NA ...
 $ Tick Range: int  0 NA 0 NA NA 0 NA NA 0 NA ...
 $ Open      : num  33.9 NA 33.9 NA NA ...
 $ High      : num  33.9 NA 33.9 NA NA ...
 $ Low       : num  33.9 NA 33.9 NA NA ...
 $ Close     : num  33.9 NA 33.9 NA NA ...
 $ Vol       : int  100 NA 200 NA NA 100 NA NA 400 NA ...
 $ MACDHist  : num  -59 NA -87 NA NA ...
 $ MACD      : num  -450 NA -445 NA NA ...
 $ MACDSig   : num  -391 NA -358 NA NA ...
 $ ZScore1   : num  NA NA NA NA NA NA NA NA NA NA ...

Hoping to use this function to speed things up in data cleaning.

Problem

I define the function in the script editor and then call it on a data.frame, but it does not appear to do anything: when I View(WFM), the data is unchanged. However, when I run the command manually:

WFM$Vol[is.na(WFM$Vol)] <- 0

Then it works.

Things I tried

I experimented based on the two seemingly relevant questions linked above:

Using WFM <- naToZero(WFM) produces a vector with a single value, 0.

I also tried WFM <- data.table(WFM) and then running the function, with the same result.

I must be missing something basic.

Robert Tan asked Apr 19 '17 14:04



2 Answers

Virtually all objects in R are immutable: operations do not modify the original, they create a copy. So you need to assign that copy back to the original.

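<- does that, but it assigns to df inside your function, which is a copy of the argument (= WFM) you pass to your function. Because that assignment is also the last expression in your function, its value (0) is what the function returns, which is why WFM <- naToZero(WFM) gave you a single 0.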

So you need to modify your function:

naToZero <- function(df) {
    df$Vol[is.na(df$Vol)] <- 0
    df
}

… and how you call it:

WFM = naToZero(WFM)
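To make the difference concrete, here is a minimal sketch comparing the two versions on a toy data frame. The names toy, naToZeroOriginal and naToZeroFixed are made up for illustration, and toy is only a stand-in for the real WFM.

```r
# Toy stand-in for WFM (hypothetical data, not the asker's real table)
toy <- data.frame(Vol = c(100L, NA, 200L))

# Original version: the last expression is an assignment, so the function
# returns the assigned value (0) invisibly instead of the data frame
naToZeroOriginal <- function(df) {
  df$Vol[is.na(df$Vol)] <- 0
}

# Corrected version: explicitly return the modified local copy
naToZeroFixed <- function(df) {
  df$Vol[is.na(df$Vol)] <- 0
  df
}

resultOriginal <- naToZeroOriginal(toy)  # just the value 0
resultFixed <- naToZeroFixed(toy)        # data frame with the NA replaced

print(resultOriginal)
print(resultFixed)
print(toy)  # still contains NA: the functions only changed their own copy
```

In both cases toy itself is untouched; only assigning the returned data frame back, as in the call above, updates the original.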
Konrad Rudolph answered Sep 19 '22 23:09


We can make this more dynamic using the development version of dplyr (soon to be released as 0.6.0):

library(tidyverse)

naToZero <- function(df, Col) {
    Col <- enquo(Col)        # capture the column argument as a quosure
    ColN <- quo_name(Col)    # column name as a string, for the left-hand side
    df %>%
        mutate(!!ColN := replace(!!Col, is.na(!!Col), 0))
}

naToZero(WFM, Vol)
# A tibble: 3 × 3
#        Date   Vol  Open
#       <chr> <dbl> <dbl>
#1 04/12/2017     0  33.9
#2 04/12/2017    23    NA
#3 04/12/2017    40  32.0

Or for any other column:

naToZero(WFM, Open)
# A tibble: 3 × 3
#       Date   Vol  Open
#       <chr> <dbl> <dbl>
#1 04/12/2017    NA  33.9
#2 04/12/2017    23   0.0
#3 04/12/2017    40  32.0

enquo works much like substitute from base R: it captures the expression supplied as an argument and wraps it in a quosure. Inside mutate, we unquote (!! or UQ) both the quosure, so that it evaluates to the column, and the string created with quo_name, which names the column on the left-hand side of :=.
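For comparison, the same "any column" behaviour can be sketched in base R without tidy evaluation, assuming you are happy to pass the column name as a string. This is not part of the dplyr approach above; naToZeroBase, its col argument and the toy data frame are hypothetical names used only for illustration.

```r
# Hypothetical base R helper: `col` is a column name given as a string
naToZeroBase <- function(df, col) {
  df[[col]][is.na(df[[col]])] <- 0  # replace NA in the chosen column only
  df                                # return the modified copy
}

# Small example data in the same shape as the tibble used in this answer
toy <- data.frame(
  Date = rep("04/12/2017", 3),
  Vol  = c(NA, 23, 40),
  Open = c(33.9, NA, 32)
)

cleaned <- naToZeroBase(toy, "Open")
print(cleaned)
```

The trade-off is that the caller writes "Open" instead of the bare name Open, but nothing beyond base R is needed.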

data

WFM <- tibble(Date = rep("04/12/2017", 3), Vol = c(NA, 23, 40), Open = c(33.9, NA, 32))
akrun answered Sep 22 '22 23:09