
R 3.5 - read.csv not able to read UTF-16 csv file

My code is as follows:

read.csv("http://asic.gov.au/Reports/YTD/2018/RR20180420-001-SSDailyYTD.csv", skip=1, fileEncoding = "UTF-16", sep = "\t", header = FALSE)
  • R 3.4.3 - Code executes cleanly
  • R 3.5.0 - gives the following error:

Error in read.table(file = file, header = header, sep = sep, quote = quote, : no lines available in input

@hrbrmstr - session info readout

sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252   
[3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C                      
[5] LC_TIME=English_Australia.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] stringr_1.3.0      RCurl_1.95-4.10    bitops_1.0-6       tidyr_0.8.0        lubridate_1.7.4   
 [6] zoo_1.8-1          ggplot2_2.2.1.9000 data.table_1.10.5  magrittr_1.5       dplyr_0.7.4       

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.16      bindr_0.1.1       munsell_0.4.3     colorspace_1.3-2  lattice_0.20-35  
 [6] R6_2.2.2          rlang_0.2.0       plyr_1.8.4        tools_3.5.0       grid_3.5.0       
[11] gtable_0.2.0      withr_2.1.2       yaml_2.1.18       lazyeval_0.2.1    assertthat_0.2.0 
[16] tibble_1.4.2      bindrcpp_0.2.2    purrr_0.2.4       glue_1.2.0        stringi_1.1.7    
[21] compiler_3.5.0    pillar_1.2.2      scales_0.5.0.9000 pkgconfig_2.0.1  
Is this a bug in the new R version, or am I missing something?

Please help me!

cephalopod asked Apr 27 '18 20:04



1 Answer

I received errors as well. This, however, works:

csv_url <- "http://asic.gov.au/Reports/YTD/2018/RR20180420-001-SSDailyYTD.csv"

download.file(csv_url, basename(csv_url))

read.csv(
  basename(csv_url), skip = 1, fileEncoding = "UTF-16", sep = "\t", header = FALSE
)
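The download-then-read workaround above can be wrapped in a small helper. This is only a sketch: `read_remote_utf16` is a hypothetical name, not a base R function, and the choice of a temporary file and `mode = "wb"` are my assumptions. On Windows, `download.file()` transfers in text mode by default for a `.csv`, so binary mode is requested here to keep the UTF-16 bytes intact.

```r
# Hypothetical helper (not part of base R): download a remote file to a
# temporary location, then parse it as UTF-16. Extra arguments in `...`
# are passed straight through to read.csv().
read_remote_utf16 <- function(url, ...) {
  local_path <- tempfile(fileext = ".csv")
  on.exit(unlink(local_path), add = TRUE)  # remove the temporary copy afterwards

  # mode = "wb" asks for a binary transfer so the bytes are not altered
  # by text-mode line-ending translation on Windows.
  download.file(url, local_path, mode = "wb", quiet = TRUE)

  read.csv(local_path, fileEncoding = "UTF-16", ...)
}
```

With the URL from the question it would be called as `read_remote_utf16(csv_url, skip = 1, sep = "\t", header = FALSE)`.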

If you do not have an account on https://bugs.r-project.org/bugzilla/ or it won't let you make one, let me know and I'll file a bug report for you after testing on a Linux system. (I'm on macOS; if you could add the output of sessionInfo() to your question, that'd be 👍)

I will note that even readr::read_tsv() has issues with this file, so there may be genuine encoding problems in the file itself that older R versions were less strict about.
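One way to investigate a suspected encoding problem is to look at the first bytes of the downloaded file for a byte-order mark (BOM). The sketch below is illustrative only: `guess_bom` is a hypothetical helper, it recognises just three common BOMs, and a file with no BOM (or a UTF-32 file, whose little-endian BOM also begins with FF FE) would need further checks.

```r
# Hypothetical helper (not part of base R): guess a Unicode encoding from the
# leading bytes of a file by looking for a byte-order mark (BOM).
guess_bom <- function(path) {
  bytes <- readBin(path, what = "raw", n = 3L)
  if (length(bytes) >= 3L &&
      identical(bytes[1:3], as.raw(c(0xEF, 0xBB, 0xBF)))) {
    "UTF-8"
  } else if (length(bytes) >= 2L &&
             identical(bytes[1:2], as.raw(c(0xFF, 0xFE)))) {
    "UTF-16LE"
  } else if (length(bytes) >= 2L &&
             identical(bytes[1:2], as.raw(c(0xFE, 0xFF)))) {
    "UTF-16BE"
  } else {
    NA_character_  # no recognised BOM
  }
}

# Self-contained demo: a temporary file holding a UTF-16LE BOM
# followed by the text "a<TAB>b<LF>" encoded as UTF-16LE.
demo_path <- tempfile(fileext = ".csv")
writeBin(
  as.raw(c(0xFF, 0xFE, 0x61, 0x00, 0x09, 0x00, 0x62, 0x00, 0x0A, 0x00)),
  demo_path
)
guess_bom(demo_path)
```

Running `guess_bom(basename(csv_url))` on the downloaded file would show whether it starts with a UTF-16 BOM at all, which helps decide whether `fileEncoding = "UTF-16"` is a plausible setting.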

hrbrmstr answered Sep 21 '22 15:09