I have a data frame w like this:
>head(w,3)
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14
1 0.2446884 0.3173719 0.74258410 0.0000000 0 0.0000000 0.01962759 0.0000000 0.0000000 0.5995647 0 0.30201691 0.03109935 0.16897571
2 0.0000000 0.0000000 0.08592243 0.2254971 0 0.7381867 0.11936323 0.2076167 0.0000000 1.0587742 0 0.50226734 0.51295661 0.01298853
3 8.4293893 4.9985040 2.22526463 0.0000000 0 3.6600283 0.00000000 0.0000000 0.2573714 0.8069288 0 0.05074886 0.00000000 0.59403855
V15 V16 V17 V18 V19 V20 V21 V22 V23 V24 V25 V26 V27
1 0.00000000 0.0000000 0.000000 0.1250837 0.000000 0.5468143 0.3503245 0.000000 0.183144204 0.23026538 6.9868429 1.5774150 0.0000000
2 0.01732732 0.8064441 0.000000 0.0000000 0.000000 0.0000000 0.0000000 0.000000 0.015123385 0.07580794 0.6160713 0.7452335 0.0740328
3 2.66846151 0.0000000 1.453987 0.0000000 1.875298 0.0000000 0.0000000 0.893363 0.004249061 0.00000000 1.6185897 0.0000000 0.7792773
V28 V29 V30 V31 V32 V33 V34 V35 V36 V37 V38 V39 V40 refseq
1 0.5543028 0 0.00000 0.0000000 0.08293075 0.18261450 0.3211127 0.2765295 0 0.04230929 0.05017316 0.3340662 0.00000000 NM_000014
2 0.0000000 0 0.00000 0.0000000 0.00000000 0.03531411 0.0000000 0.4143325 0 0.14894716 0.58056304 0.3310173 0.09162460 NM_000015
3 0.8047882 0 0.88308 0.7207709 0.01574767 0.00000000 0.0000000 0.1183736 0 0.00000000 0.00000000 1.3529881 0.03720155 NM_000016
dim(w)
[1] 37126 41
I am trying to plot the density curve of each column (except the last one) on a single page. It seems that ggplot2 can do this.
Following this post, I tried:
ggplot(data=w[,-41], aes_string(x=colnames)) + geom_density()
But it fails with this error:
Error in as.character(x) :
cannot coerce type 'closure' to vector of type 'character'
I'm not sure how to convert this data frame into a format that ggplot2 accepts. Or is there another way to do this in R?
To create a density plot in R, plot the object returned by the density function; this draws a density curve in a new graphics window. You can also overlay the density curve on a histogram with the lines function. The curve is a kernel density estimate computed from the data.
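As a minimal sketch of the base R approach described above, the following estimates and plots the density of a single numeric vector. The vector x is simulated here and only stands in for one column of a real data frame.

```r
set.seed(1)

# Hypothetical sample standing in for one numeric column
x <- rexp(200)

# Kernel density estimate (Gaussian kernel and default bandwidth)
d <- density(x)

# Plotting the density object draws the estimated curve
plot(d, main = "Density of x", xlab = "x")
```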
To create histograms of all columns in an R data frame, we can use the hist.data.frame function of the Hmisc package. For example, if we have a data frame df that contains five columns, the histograms for all of them can be created with a single line of code that calls hist.data.frame on df.
To create histograms of two variables with ggplot2, first install and load the ggplot2 package, build a data frame containing the variables you want to plot, and then call geom_histogram with whatever parameters you need.
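A sketch of that ggplot2 approach might look like the following. The data frame df, its columns group and value, and the choice of 30 bins are all assumptions made for illustration.

```r
library(ggplot2)

set.seed(1)

# Hypothetical data: two variables stored in long form (one row per observation)
df <- data.frame(
  group = rep(c("a", "b"), each = 200),
  value = c(rnorm(200, mean = 0), rnorm(200, mean = 3))
)

# Overlaid, semi-transparent histograms, one per variable
p <- ggplot(df, aes(x = value, fill = group)) +
  geom_histogram(bins = 30, alpha = 0.5, position = "identity")

print(p)
```

Setting position = "identity" draws the two histograms on top of each other instead of stacking them, which is usually what you want when comparing distributions.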
To add a density curve over a histogram, use the density function to compute the underlying non-parametric (kernel) density estimate and the lines function to draw it. Bandwidth selection for non-parametric density estimates is an area of active research.
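The overlay can be sketched as follows. The sample x is simulated, and the two bandwidth choices (the default rule and the Sheather-Jones selector, bw = "SJ") are picked only to show that the bandwidth changes the curve.

```r
set.seed(42)

# Hypothetical skewed sample
x <- rexp(500, rate = 1)

# Two kernel density estimates that differ only in bandwidth selection
d_default <- density(x)             # default rule of thumb (bw.nrd0)
d_sj      <- density(x, bw = "SJ")  # Sheather-Jones selector

# freq = FALSE puts the histogram on the density scale so the curves line up
hist(x, freq = FALSE, breaks = 30, main = "Histogram with density curves", xlab = "x")
lines(d_default, lwd = 2)
lines(d_sj, lwd = 2, lty = 2)
```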
ggplot needs your data in a long format, like so:
variable value
1 V1 0.24468840
2 V1 0.00000000
3 V1 8.42938930
4 V2 0.31737190
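As an illustration of that shape, here is a sketch that builds it with base R's stack function. The name w_small is hypothetical and holds just two of the columns from the question.

```r
# Hypothetical two-column excerpt of the wide data frame
w_small <- data.frame(
  V1 = c(0.2446884, 0.0000000, 8.4293893),
  V2 = c(0.3173719, 0.0000000, 4.9985040)
)

# stack() returns columns named "values" and "ind"; rename and reorder them
long <- stack(w_small)
names(long) <- c("value", "variable")
long <- long[, c("variable", "value")]

print(long)
```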
Once it's melted into a long data frame, you can group all the density plots by variable. In the snippet below, ggplot uses the w.plot data frame for plotting. You don't need to drop the final refseq variable first: melt treats non-numeric columns as id variables and only stacks the numeric ones. You can modify the plot to use facets, different colors, fills, etc.
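library(ggplot2)
library(reshape2)

# Small reproducible stand-in for the first three columns of w
w <- data.frame(
  V1 = c(0.2446884, 0.0000000, 8.4293893),
  V2 = c(0.3173719, 0.0000000, 4.9985040),
  V3 = c(0.74258410, 0.08592243, 2.22526463))
w$refseq <- c("NM_000014", "NM_000015", "NM_000016")

# Reshape to long format; refseq is kept as the id column
w.plot <- melt(w, id.vars = "refseq")

# One density curve per original column, coloured by column name
p <- ggplot(w.plot, aes(x = value, colour = variable))
p + geom_density()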
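With 40 columns, overlaid curves may be hard to read, so facets are one option. The following is only a sketch: w_demo is simulated stand-in data with four columns, and scales = "free" is an assumption that the columns have different ranges.

```r
library(ggplot2)
library(reshape2)

set.seed(1)

# Hypothetical stand-in for w: four numeric columns plus an id column
w_demo <- data.frame(
  V1 = rexp(100),
  V2 = rexp(100, rate = 2),
  V3 = rnorm(100),
  V4 = runif(100)
)
w_demo$refseq <- sprintf("NM_%06d", 1:100)

# Long format: one row per (refseq, variable) pair
w_long <- melt(w_demo, id.vars = "refseq")

# One panel per original column, each with its own axis ranges
p_facets <- ggplot(w_long, aes(x = value)) +
  geom_density() +
  facet_wrap(~ variable, scales = "free")

print(p_facets)
```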