
Encoding::UndefinedConversionError: "\xE4" from ASCII-8BIT to UTF-8

I tried to fetch this CSV file with Net::HTTP.

File.open(file, "w:UTF-8") do |f|
  content = Net::HTTP.get_response(URI.parse(url)).body
  f.write(content)
end

After reading my local CSV file back in, I got some weird output:

Nationalit\xE4t;Alter 0-5

I tried to encode it to UTF-8, but got the error Encoding::UndefinedConversionError: "\xE4" from ASCII-8BIT to UTF-8

The rchardet gem tells me the content is ISO-8859-2, but converting it to UTF-8 does not work.

When I open the file in a normal text editor, it is displayed with the correct encoding.

Mindbreaker asked May 07 '13 21:05



1 Answer

You can go with force_encoding:

require 'net/http'

url = "http://data.linz.gv.at/katalog/population/abstammung/2012/auslg_2012.csv"
File.open('output', "w:UTF-8") do |f|
  content = Net::HTTP.get_response(URI.parse(url)).body
  f.write(content.force_encoding("UTF-8"))
end

But this only relabels the bytes without converting them, so you will lose some accented characters (such as ä) in your .csv file.
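To see why, here is a minimal sketch comparing force_encoding with encode on the header bytes from the question. It assumes the source bytes are ISO-8859-15 (as in the second snippet of this answer); the variable names are only illustrative.

```ruby
# "\xE4" is the single byte for "ä" in ISO-8859-15.
# Net::HTTP bodies arrive tagged as ASCII-8BIT, which .b reproduces here.
raw = "Nationalit\xE4t".b

# force_encoding keeps the bytes and only changes the label.
relabeled = raw.dup.force_encoding("UTF-8")

# encode(dst, src) actually transcodes the bytes from src to dst.
converted = raw.encode("UTF-8", "ISO-8859-15")

puts relabeled.valid_encoding?  # false
puts converted.valid_encoding?  # true
puts converted                  # Nationalität
```

The relabeled string still contains the lone byte 0xE4, which is not valid UTF-8, so anything that later reads the file as UTF-8 will stumble over it.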

If you are absolutely sure that you will always use this URL as input, and that the file will always keep this encoding, you can do:

# encoding: utf-8
require 'net/http'

url = "http://data.linz.gv.at/katalog/population/abstammung/2012/auslg_2012.csv"
File.open('output', "w:UTF-8") do |f|
  content = Net::HTTP.get_response(URI.parse(url)).body
  f.write(content.encode("UTF-8", "ISO-8859-15"))
end

But this will only work for this file.
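If you need something slightly less tied to one file, a possible sketch is to take the charset from the response's Content-Type header and fall back to a guess only when the server does not send a usable one. The helpers to_utf8 and fetch_as_utf8 are hypothetical names, and the ISO-8859-15 fallback is an assumption carried over from the snippet above, not something the server guarantees.

```ruby
require 'net/http'

# Hypothetical helper: transcode a raw response body to UTF-8.
# charset is whatever the server declared (may be nil or unknown to Ruby);
# fallback is an assumed default for that case.
def to_utf8(body, charset = nil, fallback: "ISO-8859-15")
  source = begin
    Encoding.find(charset || fallback)
  rescue ArgumentError
    # The declared charset is not an encoding name Ruby recognises.
    Encoding.find(fallback)
  end
  # Replace bytes that cannot be converted instead of raising.
  body.encode("UTF-8", source, invalid: :replace, undef: :replace)
end

# Hypothetical wrapper: fetch a URL and return its body as UTF-8.
def fetch_as_utf8(url)
  response = Net::HTTP.get_response(URI.parse(url))
  # type_params holds the parameters of the Content-Type header, if any.
  to_utf8(response.body, response.type_params["charset"])
end
```

You would then write fetch_as_utf8(url) to the file opened with "w:UTF-8". This still depends on the server declaring its charset correctly, so treat the fallback as a best guess.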

fotanus answered Sep 29 '22 06:09
