I need some help tidying my data. I'm trying to convert some integer columns to factors (but not all of them). I think I can do this by selecting the variables in question, but how do I add them back to the original data set? For example, I want to keep the columns NOT selected from my raw_data_tbl and use the mutated types from raw_data_tbl_int.
library(dplyr)
raw_data_tbl %>%
select_if(is.numeric) %>%
select(-c(contains("units"), PRO_ALLOW, RTL_ACTUAL, REAL_PRICE,
REAL_PRICE_HHU, REBATE, RETURN_UNITS, UNITS_PER_CASE, Profit, STR_COST, DCC,
CREDIT_AMT)) %>%
mutate_if(is.numeric, as.factor)
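You can change data types using the as.* functions, where * is the data type to change to (for example, as.numeric() or as.factor()). The other way is to assign the class directly: class(df$var) <- "numeric". Note that class names are lowercase.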
Use the lapply() Function to Convert Multiple Columns From Integer to Numeric Type in R. Base R's lapply() function allows us to apply a function to elements of a list.
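As a minimal sketch of the lapply() approach described above, the snippet below converts two integer columns to numeric and leaves the rest of the data frame alone. The data frame `df` and its column names are hypothetical and exist only for illustration.

```r
# Hypothetical data frame; the column names are illustrative only.
df <- data.frame(
  id    = 1:3,
  count = c(10L, 20L, 30L),
  label = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# Names of the integer columns to convert to numeric (double).
int_cols <- c("id", "count")

# lapply() returns a list with one converted vector per column;
# assigning it back replaces only those columns in df.
df[int_cols] <- lapply(df[int_cols], as.numeric)

str(df)
```

Because the assignment targets only `df[int_cols]`, the untouched columns (here `label`) stay in place, so nothing has to be joined back afterwards.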
The type.convert() function in R infers a suitable data type for a data object and converts it. It can convert a data object to logical, integer, numeric, or factor.
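Base R (the utils package) provides type.convert(), which guesses a suitable type for each column of a data frame. The sketch below assumes a hypothetical data frame `df` whose columns were all read in as character; `as.is = TRUE` keeps non-numeric text as character instead of turning it into a factor.

```r
# Hypothetical data frame where every column arrived as character.
df <- data.frame(
  a = c("1", "2", "3"),
  b = c("1.5", "2.5", "3.5"),
  c = c("x", "y", "z"),
  stringsAsFactors = FALSE
)

# type.convert() inspects each column and picks logical, integer,
# numeric, or complex where possible; as.is = TRUE leaves the
# remaining text columns as character rather than factor.
converted <- type.convert(df, as.is = TRUE)

str(converted)
```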
To rename a column in R, you can use the rename() function from dplyr. For example, to rename the column “A” to “B”, run: rename(dataframe, B = A).
How to Add Columns to a Data Frame in R Using dplyr: you can use the mutate() function from the dplyr package to add one or more columns to a data frame in R.
Existing columns can be modified by assigning new values to desired columns.
As of dplyr 1.0.0, released on CRAN 2020-06-01, the scoped functions mutate_at(), mutate_if() and mutate_all() have been superseded thanks to the more generalizable across(). This means you can stay with just mutate(). The introductory blog post from April explains why it took so long to discover.
Toy example:
library(dplyr)
iris %>%
mutate(across(c(Sepal.Width,
Sepal.Length),
factor))
In your case, you'd do this:
library(dplyr)
raw_data_tbl %>%
mutate(across(c(where(is.numeric), # wrap predicates in where(); bare predicates are deprecated
-contains("units"),
-c(PRO_ALLOW, RTL_ACTUAL, REAL_PRICE, REAL_PRICE_HHU,
REBATE, RETURN_UNITS, UNITS_PER_CASE, Profit,
STR_COST, DCC, CREDIT_AMT)),
factor))
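To show that mutate() with across() leaves the unselected columns in place (so nothing needs to be added back), here is a small self-contained sketch. The tibble below is a hypothetical stand-in for raw_data_tbl: its columns and values are invented for illustration, and where() assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical stand-in for raw_data_tbl; names and values are invented.
raw_data_tbl <- tibble(
  LOC_ID      = c(1L, 2L, 3L),
  UPC_CDE     = c(813L, 814L, 815L),
  sales_units = c(10L, 12L, 9L),
  REAL_PRICE  = c(1.99, 2.49, 0.99),
  STRS        = c("a", "b", "c")
)

# Convert every numeric column to a factor, except those whose name
# contains "units" and except REAL_PRICE. All other columns pass
# through unchanged, so the result has the same columns as the input.
result <- raw_data_tbl %>%
  mutate(across(
    c(where(is.numeric), -contains("units"), -REAL_PRICE),
    as.factor
  ))

glimpse(result)
```

Here LOC_ID and UPC_CDE become factors, while sales_units, REAL_PRICE and STRS keep their original types.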
You can use mutate_at instead. Here's an example using the iris dataframe:
library(dplyr)
iris_factor <- iris %>%
mutate_at(vars(Sepal.Width,
Sepal.Length),
funs(factor))
As of dplyr 0.8.0, funs() is deprecated. Use list() instead, as in
library(dplyr)
iris_factor <- iris %>%
mutate_at(vars(Sepal.Width,
Sepal.Length),
list(factor))
And the proof:
> str(iris_factor)
'data.frame': 150 obs. of 5 variables:
$ Sepal.Length: Factor w/ 35 levels "4.3","4.4","4.5",..: 9 7 5 4 8 12 4 8 2 7 ...
$ Sepal.Width : Factor w/ 23 levels "2","2.2","2.3",..: 15 10 12 11 16 19 14 14 9 11 ...
$ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
$ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
$ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
Honestly, I'd do it like this:
df = data.frame("LOC_ID" = c(1,2,3,4),
"STRS" = c("a","b","c","d"),
"UPC_CDE" = c(813,814,815,816))
df$LOC_ID = as.factor(df$LOC_ID)
df$UPC_CDE = as.factor(df$UPC_CDE)