I need to to read a .txt file from an URL, but would like to skip the rows until a row with a certain value. The URL is https://fred.stlouisfed.org/data/HNOMFAQ027S.txt and the data takes the following form:
"
... (number of rows)
... (number of rows)
... (number of rows)
DATE VALUE
1945-01-01 144855
1946-01-01 138515
1947-01-01 136405
1948-01-01 135486
1949-01-01 142455
"
I would like to skip all rows until the row with "DATE // VALUE" and start importing the data from this line onwards (including "DATE // VALUE"). Is there a way to do this with data.table's fread() - or any other way, such as with dplyr?
Thank you very much in advance for your effort and your time!
Best,
c.
Here's a way to get to extract that info from those text files using readr::read_lines, dplyr, and string handling from stringr.
library(tidyverse)
library(stringr)
df <- data_frame(lines = read_lines("https://fred.stlouisfed.org/data/HNOMFAQ027S.txt")) %>%
filter(str_detect(lines, "^\\d{4}-\\d{2}-\\d{2}")) %>%
mutate(date = str_extract(lines, "^\\d{4}-\\d{2}-\\d{2}"),
value = as.numeric(str_extract(lines, "[\\d-]+$"))) %>%
select(-lines)
df
#> # A tibble: 286 x 2
#> date value
#> <chr> <dbl>
#> 1 1945-10-01 1245
#> 2 1946-01-01 NA
#> 3 1946-04-01 NA
#> 4 1946-07-01 NA
#> 5 1946-10-01 1298
#> 6 1947-01-01 NA
#> 7 1947-04-01 NA
#> 8 1947-07-01 NA
#> 9 1947-10-01 1413
#> 10 1948-01-01 NA
#> # ... with 276 more rows
I filtered for all the lines you want to keep using stringr::str_detect, then extracted out the info you want from the string using stringr::str_extract and regexes.
Combining fread with unix tools:
> fread("curl -s https://fred.stlouisfed.org/data/HNOMFAQ027S.txt | sed -n -e '/^DATE.*VALUE/,$p'")
DATE VALUE
1: 1945-10-01 1245
2: 1946-01-01 .
3: 1946-04-01 .
4: 1946-07-01 .
5: 1946-10-01 1298
---
282: 2016-01-01 6566888
283: 2016-04-01 6741075
284: 2016-07-01 7022321
285: 2016-10-01 6998898
286: 2017-01-01 7448792
>
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With