I have a data frame, and I want to apply a transformation (say, take the log) to every column whose name matches a certain pattern. In the example below, I want to take the log of X.1 and X.2, but not of Y or Z.1.
df <- data.frame(
  Y = sample(0:1, 10, replace = TRUE),
  X.1 = sample(1:10),
  X.2 = sample(1:10),
  Z.1 = sample(151:160)
)
# option 1, won't work for dozens of fields
df$X.1 <- log(df$X.1)
df$X.2 <- log(df$X.2)
Is there a good, efficient way to do this when the data frame is several gigabytes?
If the function accepts a data.frame and returns a data.frame (as log does), you can apply it to the selected columns in one step:
cols <- c("X.1","X.2")
df[cols] <- log(df[cols])
Otherwise, you will need to use lapply or a loop over the columns. These solutions will be slower than the one above, so use them only if you must.
df[cols] <- lapply(df[cols], function(x) c(NA,diff(x)))
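for (col in cols) {
  # df[[col]] extracts the column as a vector; df[col] would hand diff() a one-column data.frame
  df[[col]] <- c(NA, diff(df[[col]]))
}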
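The snippets above hard-code the column names, while the question asks about columns that match a name. A minimal sketch of picking them by pattern with base R's grep follows; the pattern "^X\\." and the seeded example data are assumptions chosen to mirror the question, not part of the original answer.

```r
# Hypothetical example data mirroring the question's data frame.
set.seed(1)
df <- data.frame(
  Y   = sample(0:1, 10, replace = TRUE),
  X.1 = sample(1:10),
  X.2 = sample(1:10),
  Z.1 = sample(151:160)
)

# Select every column whose name starts with "X." (the dot is escaped
# because an unescaped "." matches any character in a regular expression).
cols <- grep("^X\\.", names(df), value = TRUE)

# Transform only the matched columns; Y and Z.1 are left untouched.
df[cols] <- lapply(df[cols], log)
```

With value = TRUE, grep returns the matching names rather than their positions, so cols can be reused with any of the approaches above.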
vars <- c("X.1", "X.2")
df[vars] <- lapply(df[vars], log)
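For the several-gigabyte case raised in the question, one option not mentioned in the answers is the data.table package, which can assign columns by reference. The sketch below assumes data.table is installed and that converting the data to a data.table is acceptable; whether it is actually faster on a given dataset would need to be measured.

```r
library(data.table)  # assumption: the data.table package is installed

# Hypothetical example data mirroring the question's data frame.
set.seed(1)
dt <- data.table(
  Y   = sample(0:1, 10, replace = TRUE),
  X.1 = sample(1:10),
  X.2 = sample(1:10),
  Z.1 = sample(151:160)
)

cols <- c("X.1", "X.2")

# := assigns by reference: only the listed columns are replaced, and the
# rest of the table is not copied. .SDcols restricts .SD to those columns.
dt[, (cols) := lapply(.SD, log), .SDcols = cols]
```

An existing data.frame can be converted without copying via setDT(df), after which the same := call applies.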