
How can I divide one column of a data frame by another?

I want to divide one column by another to get the per-person time. How can I do this? I couldn't find anything on how to divide columns.

Here is some data that I want to use

     min  count2.freq
263807.0         1582
196190.5         1016
586689.0         3479

In the end I want to add a third column that holds the result of min / count2.freq,

e.g. 263807.0 / 1582 ≈ 166.76

user1741021 asked Oct 22 '12 14:10



2 Answers

There are a plethora of ways in which this can be done. The problem is how to make R aware of the locations of the variables you wish to divide.

Assuming

d <- read.table(text = "263807.0    1582
196190.5    1016
586689.0    3479
")
names(d) <- c("min", "count2.freq")
> d
       min count2.freq
1 263807.0        1582
2 196190.5        1016
3 586689.0        3479

My preferred way

To add the desired division as a third variable I would use transform()

> d <- transform(d, new = min / count2.freq)
> d
       min count2.freq      new
1 263807.0        1582 166.7554
2 196190.5        1016 193.1009
3 586689.0        3479 168.6373

The basic R way

If you are doing this in a function (i.e. you are programming), it is best to avoid the sugar shown above and index the data frame directly. In that case any of these would do what you want:

## 1. via `[` and character indexes
d[, "new"] <- d[, "min"] / d[, "count2.freq"]

## 2. via `[` with numeric indices
d[, 3] <- d[, 1] / d[, 2]

## 3. via `$`
d$new <- d$min / d$count2.freq

All of these can be used at the prompt too, but which is easier to read:

d <- transform(d, new = min / count2.freq) 

or

d$new <- d$min / d$count2.freq ## or any of the above examples 

Hopefully you think like I do and the first version is better ;-)

The reason we don't use the syntactic sugar of transform() et al. when programming is how they do their evaluation (they look for the named variables). At the top level (at the prompt, working interactively) transform() et al. work just fine. But buried in function calls or within a call to one of the apply() family of functions they can and often do break.
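A minimal sketch of the indexing approach inside a function, assuming the column names arrive as strings. The helper name add_ratio() and its arguments are hypothetical, chosen only for illustration:

```r
# Hypothetical helper: divide one column by another, with names passed as strings.
add_ratio <- function(df, num, den, new = "ratio") {
  # `[[` looks each column up by the value of the string,
  # so nothing depends on how variable names are evaluated
  df[[new]] <- df[[num]] / df[[den]]
  df
}

d <- data.frame(min = c(263807.0, 196190.5, 586689.0),
                count2.freq = c(1582, 1016, 3479))
d <- add_ratio(d, "min", "count2.freq", new = "new")
d$new
```

Because the column names are ordinary character values here, the same function can be called from inside other functions or loops without relying on how transform() finds its variables.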

Likewise, be careful using numeric indices (## 2. above); if you change the ordering of your data, you will select the wrong variables.
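To illustrate the risk with numeric indices, here is a small sketch that uses the same data with the two columns swapped (d_swapped, by_position and by_name are names made up for this example):

```r
d <- data.frame(min = c(263807.0, 196190.5, 586689.0),
                count2.freq = c(1582, 1016, 3479))

# Same data, but with the column order reversed
d_swapped <- d[, c("count2.freq", "min")]

# Position-based division now silently computes count2.freq / min
by_position <- d_swapped[, 1] / d_swapped[, 2]

# Name-based division still computes min / count2.freq
by_name <- d_swapped[, "min"] / d_swapped[, "count2.freq"]

by_position
by_name
```

No error or warning is raised in the position-based case, which is what makes this mistake easy to miss.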

The preferred way if you don't need replacement

If you just want to do the division (rather than insert the result back into the data frame), then use with(), which allows us to isolate the simple expression you wish to evaluate:

> with(d, min / count2.freq)
[1] 166.7554 193.1009 168.6373

This is again much cleaner code than the equivalent

> d$min / d$count2.freq
[1] 166.7554 193.1009 168.6373

as it explicitly states "using d, execute the code min / count2.freq". Your preference may differ from mine, so I have shown all options.

Gavin Simpson answered Oct 01 '22 19:10


Hadley Wickham's dplyr package is always a lifesaver when it comes to data wrangling. To add the desired division as a third variable I would use mutate():

library(dplyr)  # mutate() comes from the dplyr package
d <- mutate(d, new = min / count2.freq)

Azam Yahya answered Oct 01 '22 19:10