I have a text file to read in R, but it does not seem to be tab-delimited. The only structure is that each column always ends at the same character position (i.e. the columns are right-aligned).
So, first, is there a name for this type of data structure? And second, how can I read it in R?
2.37 2.03 2.38
5,397 5,082 5,609
13.0 21.6 15.2 15.2
128.0 103.1 134.2 133.4
Just using read.table() doesn't work: the missing value is not put in the right place.
# download data:
tmp <- tempfile()
f <- download.file("http://usda.mannlib.cornell.edu/usda/waob/wasde//1990s/1995/wasde-01-12-1995.txt", tmp)
D <- file(tmp)
data_enc <- readLines(D, warn=FALSE)
close(D)
dat <- sapply(strsplit(data_enc[232:236], ":"), function(x) x[2])
writeLines(dat, tmp)
## try to read data:
read.table(tmp, fill = TRUE, sep ="", header=FALSE)
Gives:
V1 V2 V3 V4
1 2.37 2.03 2.38 NA
2 5,397 5,082 5,609 NA
3 13.0 21.6 15.2 15.2
This looks like fixed-width formatted data. Maybe try using read.fwf(), which reads a table of fixed-width formatted data:
widths <- gregexpr("\\.\\d", readLines(tmp)[5])[[1]]+1L # line 5 looks complete
widths <- c(widths[1], diff(widths)) # end positions (the digit after each decimal point) converted to column widths
read.fwf(tmp, widths = widths)
# V1 V2 V3 V4
# 1 2.37 2.03 NA 2.38
# 2 5,397 5,082 NA 5,609
# 3 13.0 21.6 15.2 15.2
# 4 128.0 103.1 134.2 133.4
# 5 146.4 130.9 156.5 155.7
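For anyone who wants to try this without downloading the file, here is a minimal self-contained sketch of the same idea. The sample lines are made up to mimic the question's layout (four right-aligned columns, 7 characters each), and `to_num` is a hypothetical helper, not part of base R. It also assumes the commas are thousands separators, which the question does not confirm. Instead of searching for decimal points, it takes the column ends from the non-blank runs of one complete line, which may be a bit more forgiving when the number of decimals varies.

```r
# Hypothetical sample: four right-aligned columns, each 7 characters wide.
cells <- list(
  c("2.37",  "2.03",  "",      "2.38"),
  c("5,397", "5,082", "",      "5,609"),
  c("13.0",  "21.6",  "15.2",  "15.2"),
  c("128.0", "103.1", "134.2", "133.4")
)
lines <- vapply(cells, function(x) paste(sprintf("%7s", x), collapse = ""),
                character(1))
tmp <- tempfile()
writeLines(lines, tmp)

# Use a line with no missing fields to locate where each column ends.
m      <- gregexpr("[^ ]+", lines[3])[[1]]
ends   <- as.integer(m) + attr(m, "match.length") - 1L
widths <- diff(c(0L, ends))

# Read everything as text first so the commas do not interfere with typing.
df <- read.fwf(tmp, widths = widths, colClasses = "character")

# Assumption: "," is a thousands separator, so drop it before converting.
to_num <- function(x) as.numeric(gsub(",", "", trimws(x), fixed = TRUE))
df[]   <- lapply(df, to_num)
df
```

With this layout the blank third field in the first two rows should come back as NA while the fourth column stays in place. Note that this only works if at least one line has every column filled in, since that line is what defines the widths.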