I have a tab-delimited file that looks like this:
AG-AG AG-CA AT-AA AT-AC AT-AG ...
0.0142180094786 0.009478672985781 0.0142180094786 0.4218009478672 ...
When I read this into R using read.table i get:
nc.tab <- read.table("./percent_splice_pair.tab", sep="\t", header=TRUE)
AG.AG AG.CA AT.AA AT.AC AT.AG ...
1 0.01421801 0.009478673 0.01421801 0.4218009 0.03317536 ...
This feels somewhat awkward for me, because I am much more used to working with data if its like this:
splice.pair counts
AG.AG 0.01421801
AG.CA 0.009478673
AT.AA 0.01421801
AT.AG 0.03317536
... ...
so far, my attempts at trying to coerce the table into a data frame like this (using data.frame()
) have caused very odd results. I can't work out how to get each row of the table I have as a simple list, which I can then use as columns for the data frame. colnames(nc.tab)
works for the headers but things like nc.tab[1,]
just give me the table + headers again. Am I missing something obvious?
--edit--
Whilst @Andrie's answer gave me the data.frame I needed, I had to do a bit of extra work to coerse the counts values into numeric values so that they would work correctly in ggplot:
nc.tab <- read.table("./percent_splice_pair.tab", header=FALSE, sep="\t")
nc.mat <- t(as.matrix(nc.tab))
sp <- as.character(nc.tab[,2])
c <- as.numeric(as.character(nc.tab[,2]))
nc.dat <- data.frame(Splice.Pair=sp, count=c)
Splice.Pair count
1 AG-AG 0.014218009
2 AG-CA 0.009478673
3 AT-AA 0.014218009
4 AT-AC 0.421800948
5 AT-AG 0.033175355
You need the following to read and reshape your data in the way you want:
read.table
with the parameter header=FALSE
t()
Here is the code:
x <- read.table(..., header=FALSE)
df <- as.data.frame(t(x))
names(df) <- c("splice.pair", "counts")
df
splice.pair counts
V1 AG-AG 0.0142180094786
V2 AG-CA 0.009478672985781
V3 AT-AA 0.0142180094786
V4 AT-AC 0.4218009478672
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With