I am trying to plot the descriptive variables in the first row using the following procedure. I also tried quoting the column/row names, without success.
<ol>
<li>Rotate the rows and columns of the CSV data to get the corresponding data structure (tall table) required in the thread "A very simple histogram with R?" with <code>ggplot</code>.</li>
<li>Plot a histogram of events as the <code>Absolute</code> variable XOR (<code>Average</code>, <code>Min</code>, <code>Max</code>):
<ul>
<li>If there is only an absolute value, just draw the absolute value in the histogram.</li>
<li>If there are (average, min and max), draw them in the histogram with whiskers (= whisker plot), where the limits of the whiskers are the min and max.</li>
</ul>
</li>
</ol>
Data
<ol>
<li>Initially, <code>data.csv</code>
<pre class="prettyprint"><code>"Vars" , "Sleep", "Awake", "REM", "Deep"
"Absolute", , , 5 , 7
"Average" , 7 , 12 , ,
"Min" , 4 , 5 , ,
"Max" , 10 , 15 , ,
</code></pre>
</li>
<li>Data after reshaping, shown visually
<pre class="prettyprint"><code>      V1       V2      V3   V4
Vars  Absolute Average Min  Max
Sleep <NA>     7       4    10
Awake <NA>     12      5    15
REM   5        <NA>    <NA> <NA>
Deep  7        <NA>    <NA> <NA>
</code></pre>
</li>
<li>Data after reshaping, for R
<pre class="prettyprint"><code>data <- structure(list(
  V1 = structure(c(3L, NA, NA, 1L, 2L),
    .Names = c("Vars", "Sleep", "Awake", "REM", "Deep"),
    .Label = c(" 5", " 7", "Absolute" ), class = "factor"),
  V2 = structure(c(3L, 2L, 1L, NA, NA),
    .Names = c("Vars", "Sleep", "Awake", "REM", "Deep"),
    .Label = c("12", " 7", "Average " ), class = "factor"),
  V3 = structure(c(3L, 1L, 2L, NA, NA),
    .Names = c("Vars", "Sleep", "Awake", "REM", "Deep"),
    .Label = c(" 4", " 5", "Min " ), class = "factor"),
  V4 = structure(c(3L, 1L, 2L, NA, NA),
    .Names = c("Vars", "Sleep", "Awake", "REM", "Deep"),
    .Label = c("10", "15", "Max " ), class = "factor")),
  .Names = c("V1", "V2", "V3", "V4"),
  row.names = c("Vars", "Sleep", "Awake", "REM", "Deep"),
  class = "data.frame")
</code></pre>
</li>
</ol>
R code with debugging code
<pre class="prettyprint"><code>dat.m <- read.csv("data.csv")

# rotate rows and columns
dat.m <- as.data.frame(t(dat.m)) # https://stackoverflow.com/a/7342329/54964 Comment 42-

library("reshape2")
dat.m <- melt(dat.m, id.vars="Vars")

## Just plot values existing there correspondingly
library("ggplot2")
# https://stackoverflow.com/a/25584792/54964
# TODO following
#ggplot(dat.m, aes(x = "Vars", y = value,fill=variable))
</code></pre>
Error
<pre class="prettyprint"><code>Error: id variables not found in data: Vars
Execution halted
</code></pre>
R: 3.3.3, 3.4.0 (backports)
OS: Debian 8.7
R packages: reshape2, ggplot2, ... with <code>sessionInfo()</code> after loading the two packages
<pre class="prettyprint"><code>Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] ggplot2_2.1.0  reshape2_1.4.2

loaded via a namespace (and not attached):
 [1] colorspace_1.3-2 scales_0.4.1     magrittr_1.5     plyr_1.8.4
 [5] tools_3.3.3      gtable_0.2.0     Rcpp_0.12.10     stringi_1.1.5
 [9] grid_3.3.3       stringr_1.2.0    munsell_0.4.3
</code></pre>
<h3>Testing HaberdashPI's proposal </h3>
The output is in Fig. 1, where an absolute value is wrongly shown for <code>Sleep</code> and <code>Awake</code>. If a value is <code>NA</code>, just set it to zero.
Fig. 1 HaberdashPI's proposal output not as expected
<img src="https://i.stack.imgur.com/zxWcys.png" alt="enter image description here">
Data structure of <code>dat.m</code> before <code>melt</code>
<pre class="prettyprint"><code>'data.frame': 4 obs. of 5 variables:
 $ Absolute: Factor w/ 2 levels " 5"," 7": NA NA 1 2
  ..- attr(*, "names")= chr "Sleep" "Awake" "REM" "Deep"
 $ Average : Factor w/ 2 levels "12"," 7": 2 1 NA NA
  ..- attr(*, "names")= chr "Sleep" "Awake" "REM" "Deep"
 $ Min : Factor w/ 2 levels " 4"," 5": 1 2 NA NA
  ..- attr(*, "names")= chr "Sleep" "Awake" "REM" "Deep"
 $ Max : Factor w/ 2 levels "10","15": 1 2 NA NA
  ..- attr(*, "names")= chr "Sleep" "Awake" "REM" "Deep"
 $ Vars : chr "Sleep" "Awake" "REM" "Deep"
      Absolute Average  Min  Max  Vars
Sleep     <NA>       7    4   10 Sleep
Awake     <NA>      12    5   15 Awake
REM          5    <NA> <NA> <NA>   REM
Deep         7    <NA> <NA> <NA>  Deep
</code></pre>
Data structure of <code>dat.m</code> after <code>melt</code>
<pre class="prettyprint"><code>'data.frame': 16 obs. of 3 variables:
 $ Vars : chr "Sleep" "Awake" "REM" "Deep" ...
 $ variable: Factor w/ 4 levels "Absolute","Average ",..: 1 1 1 1 2 2 2 2 3 3 ...
 $ value : chr NA NA " 5" " 7" ...
    Vars variable value
1  Sleep Absolute  <NA>
2  Awake Absolute  <NA>
3    REM Absolute     5
4   Deep Absolute     7
5  Sleep  Average     7
6  Awake  Average    12
7    REM  Average  <NA>
8   Deep  Average  <NA>
9  Sleep      Min     4
10 Awake      Min     5
11   REM      Min  <NA>
12  Deep      Min  <NA>
13 Sleep      Max    10
14 Awake      Max    15
15   REM      Max  <NA>
16  Deep      Max  <NA>
</code></pre>
<h3>Testing akash87's proposal </h3>
Code
<pre class="prettyprint"><code>ds <- dat.m
str(ds)
ds
ds$variable
ds$variable %in% c("Min","Max")
</code></pre>
The output is wrong because every value is <code>FALSE</code> at the end
<pre class="prettyprint"><code> $ Vars : chr "Sleep" "Awake" "REM" "Deep" ...
 $ variable: Factor w/ 4 levels "Absolute","Average ",..: 1 1 1 1 2 2 2 2 3 3 ...
 $ value : chr NA NA " 5" " 7" ...
    Vars variable value
1  Sleep Absolute  <NA>
2  Awake Absolute  <NA>
3    REM Absolute     5
4   Deep Absolute     7
5  Sleep  Average     7
6  Awake  Average    12
7    REM  Average  <NA>
8   Deep  Average  <NA>
9  Sleep      Min     4
10 Awake      Min     5
11   REM      Min  <NA>
12  Deep      Min  <NA>
13 Sleep      Max    10
14 Awake      Max    15
15   REM      Max  <NA>
16  Deep      Max  <NA>
[1] "hello 3"
 [1] Absolute Absolute Absolute Absolute Average Average Average Average
 [9] Min Min Min Min Max Max Max Max
Levels: Absolute Average Min Max
 [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13] FALSE FALSE FALSE FALSE
</code></pre>
So <code>ds[ds$variable %in% c("Min","Max"), ]</code> will also give the wrong output, because the all-<code>FALSE</code> result is carried forward.
<h3>Testing Uwe's proposal </h3>
Code with explicit <code>data.table::dcast</code> and two calls to <code>data.table::melt</code>. I print <code>sessionInfo()</code> just before <code>molten <- ...</code>. Note that <code>library(ggplot2)</code> is not loaded yet because the error comes from the line <code>molten <- ...</code>.
<pre class="prettyprint"><code>$ Rscript test111.r
    Vars "Average" "Max" "Min" Absolute
1: Sleep         7    10     4       NA
2: Awake        12    15     5       NA
3:   REM        NA    NA    NA        5
4:  Deep        NA    NA    NA        7
R version 3.4.0 (2017-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 8 (jessie)

Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.12.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  base

other attached packages:
[1] data.table_1.10.4

loaded via a namespace (and not attached):
[1] compiler_3.4.0 methods_3.4.0
Error in melt.data.table(transposed, measure.vars = c("Absolute", "Average")) :
  One or more values in 'measure.vars' is invalid.
Calls: <Anonymous> -> melt.data.table
Execution halted
</code></pre>
<h3>Testing Uwe's proposal with test code 2</h3>
Code
<pre class="prettyprint"><code>molten <- structure(list(
  Vars = structure(c(1L, 2L, 1L, 2L, 1L, 2L), class = "factor",
    .Label = c("V1", "V2")),
  variable = structure(c(1L, 1L, 2L, 2L, 3L, 3L), class = "factor",
    .Label = c("ave", "ave_max", "lepo")),
  value = c(7L, 8L, 10L, 10L, 4L, 4L)),
  .Names = c("Vars", "variable", "value"),
  row.names = c(NA, -6L),
  class = c("data.table", "data.frame"))
print(molten)

library(ggplot2)
ggplot(molten, aes(x = Vars, y = value, fill = variable, ymin = lepo, ymax = ave_max)) +
  geom_col() +
  geom_errorbar(width = 0.2)
</code></pre>
Output
<pre class="prettyprint"><code>  Vars variable value
1   V1      ave     7
2   V2      ave     8
3   V1  ave_max    10
4   V2  ave_max    10
5   V1     lepo     4
6   V2     lepo     4
Error in FUN(X[[i]], ...) : object 'lepo' not found
Calls: <Anonymous> ... by_layer -> f -> <Anonymous> -> f -> lapply -> FUN -> FUN
Execution halted
</code></pre>

The problem with your code is that you used <code>"Vars"</code> in quotes instead of the bare name <code>Vars</code> in the ggplot <code>aes</code> function. Also, the header of your data set is messed up: Absolute, Average, ... should be the column names of the data set, not values in it. That is why you get the error from the <code>melt</code> function. Given your data set, here is my attempt:
<pre class="prettyprint"><code>library(reshape2)  # melt
library(ggplot2)

#Data
data = cbind.data.frame(c("Sleep", "Awake", "REM", "Deep"),
                        c(NA, NA, 5, 7),
                        c(7, 12, NA, NA),
                        c(4, 5, NA, NA),
                        c(10, 15, NA, NA))
colnames(data) = c("Vars", "Absolute", "Average", "Min", "Max")

#reshape
dat.m <- melt(data, id.vars="Vars")

#Stacked plot
ggplot(dat.m, aes(x = Vars, y = value)) +
  geom_bar(aes(fill=variable), stat = "identity")
</code></pre>
This will produce:
<img src="https://i.stack.imgur.com/WjRil.png" alt="stacked bar">
<pre class="prettyprint"><code>#Or multiple bars
ggplot(dat.m, aes(x = Vars, y = value)) +
  geom_bar(aes(fill=variable), stat = "identity", position="dodge")
</code></pre>
<img src="https://i.stack.imgur.com/cXSXx.png" alt="nonstacked">
<pre class="prettyprint"><code>#Or separated by Vars
ggplot(dat.m, aes(x = Vars, y = value)) +
  geom_bar(aes(fill=variable), stat = "identity", position="dodge") +
  facet_wrap( ~ Vars, scales="free")
</code></pre>
<img src="https://i.stack.imgur.com/JfuTu.png" alt="separatedbyvar">
I am adding another graph to the answer. This corroborates @Uwe's answer.
<pre class="prettyprint"><code>library(data.table)
library(ggplot2)

#data
data <- structure(list(
  Vars = structure(1:2, class = "factor", .Label = c("V1", "V2")),
  ave = c(7L, 8L), ave_max = c(10L, 10L), lepo = c(4L, 4L)),
  .Names = c("Vars", "ave", "ave_max", "lepo"),
  row.names = c(NA, -2L),
  class = c("data.table", "data.frame"), sorted = "Vars")

#Melt only "ave"; lepo and ave_max stay as columns for the error bars
mo = data.table::melt(data, measure.vars = c("ave"))

ggplot(mo, aes(x = Vars, y = value, fill = variable, ymin = lepo, ymax = ave_max)) +
  geom_col() +
  geom_errorbar(width = 0.2)
</code></pre>
This will produce:
<img src="https://i.stack.imgur.com/ztBLn.png" alt="enter image description here">
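One possible way to get the plot described in the question (a bar for the absolute value or the average, with whiskers from min to max) is to keep <code>Min</code> and <code>Max</code> as their own columns instead of melting them. This would also avoid the <code>object 'lepo' not found</code> error from the question's test code 2, where the whisker limits had been melted into rows and were no longer columns that <code>aes</code> could refer to. The following is only a sketch under assumptions: the helper columns <code>value</code> and <code>kind</code> are made up for illustration, and the numbers are copied from the question's table.

```r
library(ggplot2)

# Wide table: one row per state, one column per statistic (values from the question)
wide <- data.frame(
  Vars     = c("Sleep", "Awake", "REM", "Deep"),
  Absolute = c(NA, NA, 5, 7),
  Average  = c(7, 12, NA, NA),
  Min      = c(4, 5, NA, NA),
  Max      = c(10, 15, NA, NA)
)

# Hypothetical helper columns: bar height is the absolute value where present,
# otherwise the average; 'kind' records which of the two was used
wide$value <- ifelse(is.na(wide$Absolute), wide$Average, wide$Absolute)
wide$kind  <- ifelse(is.na(wide$Absolute), "Average", "Absolute")

# geom_bar(stat = "identity") is used so the sketch also works on older ggplot2;
# rows without Min/Max get no whisker (na.rm = TRUE drops them silently)
p <- ggplot(wide, aes(x = Vars, y = value, fill = kind)) +
  geom_bar(stat = "identity") +
  geom_errorbar(aes(ymin = Min, ymax = Max), width = 0.2, na.rm = TRUE)
print(p)
```

Only <code>Sleep</code> and <code>Awake</code> should get whiskers here, since <code>REM</code> and <code>Deep</code> have no <code>Min</code>/<code>Max</code>.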

Your basic problem is that your column and row names get messed up when you call <code>dat.m <- as.data.frame(t(dat.m))</code>. That is not really the right way to rearrange your data. Your terminology is a little confusing (do you really mean histogram?), so I'm not sure if this is what you want, but I believe that to solve the immediate problem you're having, you can do this:
<pre class="prettyprint"><code>library(ggplot2)
library(reshape2)

dat.m <- read.csv("data.csv")
m <- t(dat.m)                          # character matrix; row 1 holds the old "Vars" column
dat.m <- data.frame(m[2:nrow(m),])     # drop that first row
names(dat.m) <- m[1,]                  # and use it as the column names instead
dat.m$Vars <- rownames(m)[2:nrow(m)]   # old header becomes a real column
dat.m <- melt(dat.m, id.vars="Vars")

ggplot(dat.m, aes(x = Vars, y = value,fill=variable)) + geom_bar(stat='identity')
</code></pre>
Here's the output I get:
<img src="https://i.stack.imgur.com/G43XG.png" alt="enter image description here">
What I've done here is manually rename the columns (<code>names(dat.m) <- etc...</code>) and insert a new column called <code>Vars</code>, because you need those names as a column of <code>dat.m</code>, not as a set of row names, to refer to them in <code>melt</code> (which is why you get the error about not being able to find <code>Vars</code>). It isn't elegant, but it gets the job done.
It looks like you're making a lot more work for yourself than you may need to. It appears that you have already collected a summary of your data in some other program (Excel?), which makes me think there is probably a simpler solution to your problem if you simply load your raw data into R and calculate the average, min, max and so forth in R, or if you summarize your data in that external program in a format more canonical to R. Not knowing exactly what that raw data looks like, I can't give you a better answer.
Much of ggplot is built around a set of principles for how data ought to be organized: I recommend reading through this blog post on dplyr and this one on tidyr.
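The manual renaming above could probably be shortened by reading the file with surrounding whitespace stripped and transposing only the numeric columns. The <code>str()</code> output in the question shows factor levels such as <code>"Average "</code> with a trailing space, which would explain why the <code>%in% c("Min","Max")</code> test returned only <code>FALSE</code>. This is a minimal sketch, assuming the CSV layout shown in the question; the file content is inlined through <code>text =</code> so it runs without <code>data.csv</code>, and the names <code>csv_text</code>, <code>raw</code> and <code>wide</code> are illustrative.

```r
# CSV content copied from the question; inlined so the sketch runs without data.csv
csv_text <- '"Vars" , "Sleep", "Awake", "REM", "Deep"
"Absolute", , , 5 , 7
"Average" , 7 , 12 , ,
"Min" , 4 , 5 , ,
"Max" , 10 , 15 , ,'

# strip.white = TRUE drops the blanks around each field, so "Average " becomes "Average"
raw <- read.csv(text = csv_text, strip.white = TRUE, stringsAsFactors = FALSE)

# Transpose only the numeric columns so the values are not converted to text
wide <- as.data.frame(t(raw[, -1]))
names(wide) <- raw$Vars        # Absolute, Average, Min, Max become column names
wide$Vars <- rownames(wide)    # Sleep, Awake, REM, Deep become a real column

print(wide)

# With clean labels, the membership test from the question can match
print(raw$Vars %in% c("Min", "Max"))
```

From here <code>wide</code> has one row per state and numeric <code>Absolute</code>, <code>Average</code>, <code>Min</code> and <code>Max</code> columns, which is the shape that <code>melt(wide, id.vars = "Vars")</code> expects.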

How to do histograms of this row-column table in R ggplot?


Tags:

r

csv

ggplot2

statistics


Léo Léopold Hertz 준영

2 Answers

user1480478

HaberdashPI
