I have a tab-delimited file that contains multiple tables, each headed by a title (for example "Azuay\n", "Bolivar\n", "Cotopaxi\n", etc.) and each separated from the next by two newlines. Within R, how can I read in this file and select only the table (i.e. the specified rows) corresponding to, e.g., "Bolivar", while ignoring the table above it ("Azuay") and the table below it ("Cotopaxi")?
NB. I'd prefer not to modify the table outside R.
The data looks like this. The file is tab-separated.
Azuay
region begin stop
1A 2017761 148749885
1A 148863885 150111299
1A 150329391 150346152
1A 150432847 247191037

Bolivar
region begin stop
2A 2785 242068364
2A 736640 198339289

Cotopaxi
region begin stop
4A 2282 9951846
4A 11672561 11906166
This seems to do the job:
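read.entry.table <- function(file, entry) {
  lines <- readLines(file)

  # Locate the title line; it must occur exactly once
  table.entry <- which(lines == entry)
  if (length(table.entry) == 0L) stop(paste(entry, "not found"))
  if (length(table.entry) > 1L) stop(paste(entry, "found more than once"))

  # Blank lines separate the tables; add a sentinel past the last line
  # so that the final table also has an end marker
  empty.lines <- c(which(lines == ""), length(lines) + 1L)

  # The table runs from the line after the title to the line
  # before the next blank line
  table.start <- table.entry + 1L
  table.end <- empty.lines[which(empty.lines > table.start)[1]] - 1L

  # text = lets read.table open and close its own connection
  read.table(text = lines[table.start:table.end], header = TRUE)
}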
read.entry.table("test.txt", "Bolivar")
#   region  begin      stop
# 1     2A   2785 242068364
# 2     2A 736640 198339289

read.entry.table("test.txt", "Cotopaxi")
#   region    begin     stop
# 1     4A     2282  9951846
# 2     4A 11672561 11906166
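
If you end up needing more than one of the tables, a variation is to split the file once and keep every table in a named list. This is only a minimal sketch: read.all.tables is a hypothetical helper name (not from any package), and it assumes that every block consists of a title line, a header line and data rows, that blocks are separated by one or more blank lines, and that the columns are tab-separated as described in the question. The sample data is written to a temporary file so the example runs on its own.

```r
# Hypothetical helper: read every table in the file into a named list.
# Assumes: title line, header line, data rows; blocks separated by
# blank lines; tab-separated columns.
read.all.tables <- function(file) {
  lines <- readLines(file)

  # The block id goes up by one at every blank line
  block <- cumsum(lines == "")
  keep <- lines != ""
  blocks <- split(lines[keep], block[keep])

  # The first line of each block is the title, the rest is the table
  tables <- lapply(blocks, function(b) {
    read.table(text = b[-1], header = TRUE, sep = "\t",
               stringsAsFactors = FALSE)
  })
  names(tables) <- vapply(blocks, function(b) b[1], character(1),
                          USE.NAMES = FALSE)
  tables
}

# Recreate the sample data from the question in a temporary file
sample.lines <- c(
  "Azuay",
  "region\tbegin\tstop",
  "1A\t2017761\t148749885",
  "1A\t148863885\t150111299",
  "1A\t150329391\t150346152",
  "1A\t150432847\t247191037",
  "",
  "Bolivar",
  "region\tbegin\tstop",
  "2A\t2785\t242068364",
  "2A\t736640\t198339289",
  "",
  "Cotopaxi",
  "region\tbegin\tstop",
  "4A\t2282\t9951846",
  "4A\t11672561\t11906166"
)
tmp <- tempfile(fileext = ".txt")
writeLines(sample.lines, tmp)

tables <- read.all.tables(tmp)
names(tables)         # titles found in the file
tables[["Bolivar"]]   # pick a single table by its title
```

The file is parsed only once this way, which may be convenient if several regions are needed, at the cost of reading tables you might never use.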