
Reading a CSV file organized horizontally

Tags: r, csv, read.table

In R, is there a function like read.csv that reads in files where the headers are on the left (or right) as opposed to the top and the data is organized from left to right?

So the data would look like:

var1,1,2,3,4,5

Looking at the documentation for read.table and read.csv, nothing stands out. The best option I see is to read the file with read.table and then build a second table whose columns are the rows of the original data.

Jon Claus asked Jun 25 '13 02:06


1 Answer

Let's say your file is called 'data.csv' and it contains:

var1,1,2,3,4,5,6
var2,2.1,3.9,4.6,5.2,6.1
var3,M,F,M,F,M,M

Note that var1 and var3 have 6 values but var2 has only 5. So, the idea is to read the raw lines, pad the shorter ones with NA, transpose them, and then parse the result with read.csv.
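The padding step can be looked at in isolation. This is a minimal sketch, assuming a comma separator and a made-up short line; the name `short` is only for illustration. Assigning to `length()` extends a vector with NA, and `paste()` then writes those slots as the literal text NA, which read.csv treats as missing by default.

```r
# Split a hypothetical short line into its fields
short <- unlist(strsplit("var2,2.1,3.9", split = ",", fixed = TRUE))

# Growing the vector fills the new slots with NA
length(short) <- 5

# Pasting the fields back together writes NA as the text "NA"
padded_line <- paste(short, collapse = ",")
print(short)
print(padded_line)
```

The function below applies the same padding to every line of the file before transposing.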

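read.tcsv = function(file, header=TRUE, sep=",", ...) {

  # The widest row sets the number of fields every row is padded to
  n = max(count.fields(file, sep=sep), na.rm=TRUE)
  x = readLines(file)

  # Split one line on sep (literally, not as a regex) and pad it with NA
  .splitvar = function(x, sep, n) {
    var = unlist(strsplit(x, split=sep, fixed=TRUE))
    length(var) = n
    return(var)
  }

  # One column per input line, then paste each row back into a CSV line
  x = do.call(cbind, lapply(x, .splitvar, sep=sep, n=n))
  x = apply(x, 1, paste, collapse=sep)

  # Parse the transposed text; the padding is read back as NA by default
  out = read.csv(text=x, sep=sep, header=header, ...)
  return(out)

}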

Then, you can do:

read.tcsv("data.csv")

  var1 var2 var3
1    1  2.1    M
2    2  3.9    F
3    3  4.6    M
4    4  5.2    F
5    5  6.1    M
6    6   NA    M
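For comparison, here is a sketch of the read-then-transpose route mentioned in the question, using only base R. It is not part of the function above. The temporary file and the names `path`, `raw`, `wide` and `out` are assumptions made for illustration. Everything is read as character first, so the column types have to be inferred again after transposing.

```r
# Write the sample data to a temporary file so the sketch is self-contained
path <- tempfile(fileext = ".csv")
writeLines(c("var1,1,2,3,4,5,6",
             "var2,2.1,3.9,4.6,5.2,6.1",
             "var3,M,F,M,F,M,M"), path)

# The widest row determines how many columns read.csv must allocate
n <- max(count.fields(path, sep = ","), na.rm = TRUE)

# Read every field as character; short rows are filled and become NA
raw <- read.csv(path, header = FALSE, fill = TRUE,
                col.names = paste0("V", seq_len(n)),
                colClasses = "character", na.strings = "")

# The first column holds the variable names; the rest is transposed
wide <- t(as.matrix(raw[, -1]))
out <- as.data.frame(wide, stringsAsFactors = FALSE)
names(out) <- raw[[1]]
rownames(out) <- NULL

# Infer column types again, since everything was read as character
out[] <- lapply(out, type.convert, as.is = TRUE)
str(out)
```

With `as.is = TRUE`, var3 stays a character column instead of becoming a factor.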
Ricardo Oliveros-Ramos answered Sep 21 '22 09:09