
Dynamic variable names in plots, files and compatibility with loop

I am trying to write a function that makes a plot and saves it to a file automatically. The part I struggle with is doing both dynamically (plot name = variable name, file name = variable name) while keeping it compatible with being called from a loop.

# Create data
my_df = cbind(uni=runif(100), norm=rnorm(100), bino=rbinom(100, 20, 0.5))
head(my_df)
my_vec = my_df[,'uni']

# How to make the plot name and file name meaningful if you access the variable in a loop?

# If you index the column by its literal name, the plot title is informative. This is close to what I would like to see.
hist(my_df[,'bino'])


for (plotit in colnames(my_df)) {
    hist(my_df[,plotit])
    print (plotit)
    # the plot title is already not meaningful here: it shows my_df[, plotit] for every column
}



# Step 2: write it into files
hist_auto <-  function(variable, col ="gold1", ...) {
    if ( length (variable) > 0 ) {
        plotname = paste(substitute(variable), sep="", collapse = "_");     print (plotname); is (plotname)
        # I would like to define plotname, and later tune it according to my needs
        FnP = paste (getwd(),'/',plotname, '.hist.pdf', collapse = "", sep=""); print (FnP)
        hist (variable, main = plotname)
        # this is apparently not working: the title is not my_df[, "bino"] or anything similar
        dev.copy2pdf (file=FnP )
    } else { print ("var empty") }
}


hist_auto (my_vec)
# name works, and is meaningful [as much as the var name ... ]

hist_auto (my_df[,'bino'])
# name sort of works, but falls apart

assign (plotit, my_df[,'bino'])
hist_auto (get(plotit))
# name works, but meaningless


# Now in a loop

for (plotit in colnames(my_df)) {
    my_df[,plotit]
    hist(my_df[,plotit])
    ## name works, but meaningless and NOT UNIQUE > overwritten by next
}


for (plotit in colnames(my_df)) {
    hist_auto(my_df[,plotit])
    ## name works, but meaningless and NOT UNIQUE > overwritten by next
}

for (plotit in colnames(my_df)) {
    assign (plotit, my_df[,plotit])
    hist_auto (get(plotit))
    ## name works, but meaningless and NOT UNIQUE > overwritten by next
}

My aim is to have a function that iterates over e.g. the columns of a matrix, then plots and saves each one with a unique and meaningful name.

The solution will probably involve a smart combination of substitute(), parse(), eval() and paste(), but lacking a solid understanding of them I could not figure it out.

My basis of experimentation was: how to dynamically call a variable?

bud.dugong asked Nov 09 '22 21:11


1 Answer

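How about something like this? It passes the column name as a string, so the same string can be used for both the plot title and the file name. You may need to run install.packages("ggplot2") first.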

library(ggplot2)
my_df <- data.frame(uni=runif(100),
                    norm=rnorm(100),
                    bino=rbinom(100, 20, 0.5))
get_histogram <- function(df, varname, binwidth=1, save=TRUE) {
    stopifnot(varname %in% names(df))
    title <- sprintf("Histogram of %s", varname)
    # aes_string() is deprecated in current ggplot2;
    # .data[[varname]] looks the column up by its string name instead
    p <- (ggplot(df, aes(x=.data[[varname]])) +
          geom_histogram(binwidth=binwidth) +
          xlab(varname) +
          ggtitle(title))
    if(save) {
        # One file per variable, named after the column
        filename <- sprintf("histogram_%s.png", gsub(" ", "_", varname))
        ggsave(filename, p, width=10, height=8)
    }
    return(p)
}
for(var in names(my_df))
    get_histogram(my_df, var, binwidth=0.5)  # If you want to save them
get_histogram(my_df, "uni", binwidth=0.1, save=FALSE)  # If you want to look at a specific one
Adrian answered Nov 15 '22 05:11
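
For readers who want to stay with base graphics, the same idea (pass the column name as a string instead of trying to recover it with substitute()) can be sketched as follows. This is only an illustration under assumptions: hist_to_pdf and show_arg are hypothetical helper names that do not appear in the question or the answer, and the output directory defaults to the working directory as in the question. The first part shows why substitute() cannot help inside a loop: it captures the unevaluated expression, which is the same text on every iteration.

```r
my_df <- cbind(uni = runif(100), norm = rnorm(100), bino = rbinom(100, 20, 0.5))

# substitute() sees the expression as written at the call site, not the
# value of the loop variable, so every iteration yields the same label.
show_arg <- function(variable) deparse(substitute(variable))
for (plotit in colnames(my_df)) {
    print(show_arg(my_df[, plotit]))
}

# Hypothetical helper: take the data and the column name separately, so the
# name is available as a plain string for the title and the file name.
hist_to_pdf <- function(data, colname, dir = getwd(), col = "gold1", ...) {
    stopifnot(colname %in% colnames(data))
    values <- data[, colname]
    if (length(values) == 0) {
        message("column '", colname, "' is empty, nothing plotted")
        return(invisible(NA_character_))
    }
    # Replace characters that are awkward in file names
    safe_name <- gsub("[^A-Za-z0-9_.-]", "_", colname)
    out_file <- file.path(dir, paste0(safe_name, ".hist.pdf"))
    pdf(out_file)
    on.exit(dev.off(), add = TRUE)  # close the PDF device even if hist() fails
    hist(values, main = colname, xlab = colname, col = col, ...)
    invisible(out_file)
}

# One uniquely named PDF per column; out_files holds the paths that were written
out_files <- vapply(colnames(my_df), hist_to_pdf, character(1), data = my_df)
```

Opening the device with pdf() before plotting, rather than copying the screen device with dev.copy2pdf(), also means the function works in non-interactive sessions where no screen device is open.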