I'm using Rscript to run an R script from bash, and I want to pass arguments to functions within the script itself. Specifically, I want to pass arguments that specify the path to a data file (.csv) and a column name in that file. I run into a problem when the column names include the tilde sign (~). I've tried wrapping the column names in backticks, but without success.
I want to write a script that takes in a data file in .csv format and plots a histogram for one variable according to the user's choice.
plot_histogram <- function(path_to_input, x_var) {
  data_raw <- read.csv(file = path_to_input)
  path_to_output_folder <- dirname(path_to_input)
  png(filename = paste0(path_to_output_folder, "/", "output_plot.png"))
  hist(as.numeric(na.omit(data_raw[[x_var]])), main = "histogram", xlab = "my_var")
  # close the png device once; repeated dev.off() calls fail on the null device
  dev.off()
}
set.seed(123)
df <- data.frame(age = sample(20:80, size = 100, replace = TRUE))
write.csv(df, "some_age_data.csv")
plot_histogram(path_to_input = "some_age_data.csv",
               x_var = "age")
As intended, I get a .png file with the plot, saved to the same directory as the .csv file.
Next, I put the function into a script file, plot_histogram.R:
args <- commandArgs(trailingOnly = TRUE)
## same function as above
plot_histogram <- function(path_to_input, x_var) {
  data_raw <- read.csv(file = path_to_input)
  path_to_output_folder <- dirname(path_to_input)
  png(filename = paste0(path_to_output_folder, "/", "output_plot.png"))
  hist(as.numeric(na.omit(data_raw[[x_var]])), main = "histogram", xlab = "my_var")
  # close the png device once; repeated dev.off() calls fail on the null device
  dev.off()
}
plot_histogram(path_to_input = args[1], x_var = args[2])
Then run via command line using Rscript
$ Rscript --vanilla plot_histogram.R /../../../some_age_data.csv "age"
Works too!
Step 1: create fake data
library(tibble)
set.seed(123)
df <- tibble(`age-blah~value` = sample(20:80, size = 100, replace = TRUE))
write.csv(df, "some_age_data.csv")
Step 2: Using Rscript:
$ Rscript --vanilla plot_histogram.R /../../../some_age_data.csv "age-blah~value"
Error in hist.default(as.numeric(na.omit(data_raw[[x_var]])), main = "histogram", :
  invalid number of 'breaks'
Calls: plot_histogram -> hist -> hist.default
Execution halted
When using Rscript, how can I pass an argument that specifies a column name containing a tilde? Alternatively, how can I work around .csv files whose column names contain a tilde, within the framework of Rscript?
Thanks!
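You are successfully passing an argument that specifies a column name containing a tilde. However, read.csv has "fixed" the column names, so the column in the data frame no longer contains a tilde. As a result, data_raw[[x_var]] returns NULL, hist receives an empty vector, and you get the "invalid number of 'breaks'" error.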
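To see the mechanism in isolation, here is a minimal sketch that does not need any file. It assumes that the renaming done by read.csv matches what make.names() produces (the variable names demo, missing_col, values and err are only for illustration):

```r
# make.names() replaces characters that are invalid in syntactic names with "."
fixed_name <- make.names("age-blah~value")
fixed_name  # "age.blah.value"

# Illustrative data frame that carries the "fixed" column name
demo <- data.frame(x = c(25, 40, 61))
names(demo) <- fixed_name

# Looking up the original name finds nothing, so [[ returns NULL
missing_col <- demo[["age-blah~value"]]
is.null(missing_col)  # TRUE

# NULL becomes a zero-length numeric vector ...
values <- as.numeric(na.omit(missing_col))
length(values)  # 0

# ... and hist() cannot choose breaks for zero observations
err <- tryCatch(
  hist(values, plot = FALSE),
  error = function(e) conditionMessage(e)
)
err
```

The message captured in err should correspond to the error reported by Rscript in the question.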
read.csv is silently converting the column name to age.blah.value. Use check.names = FALSE to keep it as age-blah~value.
data_raw <- read.csv(file = path_to_input, check.names = FALSE)
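Putting this together, one possible version of the function might look like the sketch below. The name plot_histogram_checked, the explicit column check, and the use of on.exit() are my own additions rather than part of the original script, and the demo writes to a temporary directory instead of the path used in the question. It also assumes the png device is available in your R installation.

```r
# Hypothetical variant of plot_histogram() that keeps column names untouched
# and fails with a readable message when the requested column is missing.
plot_histogram_checked <- function(path_to_input, x_var) {
  data_raw <- read.csv(file = path_to_input, check.names = FALSE)

  if (!x_var %in% names(data_raw)) {
    stop("Column '", x_var, "' not found. Available columns: ",
         paste(names(data_raw), collapse = ", "), call. = FALSE)
  }

  path_to_output <- file.path(dirname(path_to_input), "output_plot.png")
  png(filename = path_to_output)
  on.exit(dev.off(), add = TRUE)  # close the device even if hist() fails

  hist(as.numeric(na.omit(data_raw[[x_var]])), main = "histogram", xlab = x_var)
  invisible(path_to_output)
}

# Demo with fake data in a temporary directory
demo_dir <- tempfile("hist_demo_")
dir.create(demo_dir)
csv_path <- file.path(demo_dir, "some_age_data.csv")

set.seed(123)
df <- data.frame(`age-blah~value` = sample(20:80, size = 100, replace = TRUE),
                 check.names = FALSE)
write.csv(df, csv_path, row.names = FALSE)

out_path <- plot_histogram_checked(csv_path, "age-blah~value")
file.exists(out_path)

# A misspelled column name now produces an explicit error instead of
# the "invalid number of 'breaks'" message
bad <- tryCatch(
  plot_histogram_checked(csv_path, "age.blah.value"),
  error = function(e) conditionMessage(e)
)
bad
```

In the Rscript version, the last line of the script would stay the same apart from the function name: the call still receives args[1] and args[2].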