
Column names have periods inserted where there should be spaces

Tags:

r

read.table

In the plot generated by ggplot, each label along the x-axis is a string, e.g., “the product in 1990”. However, in the generated plot there is a period between each word. In other words, the above string is shown as “the.product.in.1990”.

How can I ensure the above “.” is not added?

The following code is what I used to add a string label for each point along the x-axis:

last_plot()+scale_x_discrete(limits=ddata$labels$text)

Sample code:

library(ggdendro)
x <- read.csv("test.csv",header=TRUE) 
d <- as.dist(x,diag=FALSE,upper=FALSE) 
hc <- hclust(d,"ave") 
dhc <- as.dendrogram(hc) 
ddata <- dendro_data(dhc,type="rectangle")
ggplot(segment(ddata)) + geom_segment(aes(x=x0,y=y0,xend=x1,yend=y1))
last_plot() + scale_x_discrete(limits=ddata$labels$text)

Each row of ddata$labels$text is a string, like "the product in 1990". I would like to keep the same format in the generated plot rather than "the.product.in.1990".

bit-question asked Dec 08 '11 16:12



1 Answer

The issue arises because you are trying to read data with column names that contain spaces.

When you read this data with read.csv, these column names are converted to syntactically valid R names. Here is an example to illustrate the issue:

some.file <- '
    "Col heading A", "Col heading B"
    A, 1
    B, 2
    C, 3
    '

Read it with the default read.csv settings:

> x1 <- read.csv(text=some.file)
> x1
  Col.heading.A Col.heading.B
1             A             1
2             B             2
3             C             3
4                          NA
> names(x1)
[1] "Col.heading.A" "Col.heading.B"
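As documented in ?read.csv, when check.names is TRUE the names are adjusted with make.names(). The short sketch below calls make.names() directly on two illustrative strings (one taken from the example above, one from the question) to show the same renaming in isolation; the variable names are only placeholders.

```r
# Illustrative labels containing spaces
labels <- c("Col heading A", "the product in 1990")

# make.names() replaces characters that are not valid in an R name with "."
fixed <- make.names(labels)
print(fixed)

# A name that starts with a digit additionally gets an "X" prefix
digit_first <- make.names("1990 sales")
print(digit_first)
```

So the periods come from the reading step, not from the plotting step.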

To avoid this, use the argument check.names=FALSE:

> x2 <- read.csv(text=some.file, check.names=FALSE)
> x2
  Col heading A Col heading B
1             A             1
2             B             2
3             C             3
4                          NA
> names(x2)
[1] "Col heading A" "Col heading B"

Now, the remaining issue is that a name containing spaces is not a syntactically valid R name. So to refer to these columns with $, you need to wrap the column name in backticks:

> x2$`Col heading A`
[1]     A     B     C      
Levels:          A     B     C
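Backticks are only needed where R expects a bare name. As a small sketch, assuming a data frame built by hand rather than read from a file, the same column can also be selected by passing its name as an ordinary string:

```r
# Build a small data frame and give it column names that contain spaces
x2 <- data.frame(a = c("A", "B", "C"), b = 1:3)
names(x2) <- c("Col heading A", "Col heading B")

# Backticks with $
by_backtick <- x2$`Col heading A`

# A plain string with [[ ]] or [ , ] needs no backticks
by_string <- x2[["Col heading A"]]
second_col <- x2[, "Col heading B"]

print(by_string)
print(second_col)
```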

For more information, see ?read.csv and specifically the information for check.names.

There is also some information about backticks in ?Quotes
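Applied to the question, the fix would be to add check.names=FALSE to the read.csv call. The sketch below is only an illustration: the contents of test.csv are not shown in the question, so it assumes a small symmetric distance matrix with made-up values and headings, and it inspects the labels with base R only instead of plotting them.

```r
# Hypothetical stand-in for test.csv: a symmetric distance matrix whose
# column headings contain spaces (values are invented for illustration)
csv_text <- '"the product in 1990","the product in 1991","the product in 1992"
0,2,5
2,0,3
5,3,0'

# check.names = FALSE keeps the headings exactly as written in the file
x <- read.csv(text = csv_text, header = TRUE, check.names = FALSE)

d <- as.dist(x, diag = FALSE, upper = FALSE)
hc <- hclust(d, "ave")

# The clustering labels are taken from the column names, spaces intact
print(hc$labels)
```

With a real file, the equivalent call would be read.csv("test.csv", header=TRUE, check.names=FALSE).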

Andrie answered Dec 01 '22 20:12
