
Mysterious leading "empty" character at the beginning of a string that came from a CSV file

While reading a CSV file into an array, I noticed that the very first array element, which is a string, appears to contain a leading empty character ("").

For example:

str = contacts[0][0]
p str

gives me...

"SalesRepName"

Then by sheer chance I happened to try:

str = contacts[0][0].split(//)
p str

and that gave me...

["", "S", "a", "l", "e", "s", "R", "e", "p", "N", "a", "m", "e"]

I've checked every other element in the array, and this is the only string that contains a leading "".

holaymolay asked Nov 08 '15 09:11


1 Answer

Before I could post this question, I stumbled upon the answer. Writing up the question gave me the idea of checking the Unicode code point of this "" character.

str = contacts[0][0].split(//)
p str[0].codepoints

gave me

[65279]

Upon looking up character 65279 (U+FEFF, which is a Unicode code point rather than an ASCII value), I found this answer: https://stackoverflow.com/a/6784805/3170942

According to SLaks:

It's a zero-width no-break space. It's more commonly used as a byte-order mark (BOM).
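
To make the quote concrete, here is a minimal sketch of how that character can be spotted and stripped from a string that has already been read. The `field` value is a made-up stand-in for `contacts[0][0]`, and the `BOM` constant is just a name chosen for this illustration.

```ruby
# U+FEFF is the byte-order mark; in UTF-8 it is encoded as the bytes EF BB BF.
BOM = "\uFEFF"

# Hypothetical stand-in for the first CSV field as read without BOM handling.
field = "#{BOM}SalesRepName"

p field.codepoints.first   # => 65279
p field.bytes.first(3)     # => [239, 187, 191]
p field.start_with?(BOM)   # => true

# Remove only a leading BOM; the rest of the string is left untouched.
clean = field.sub(/\A\uFEFF/, "")
p clean                    # => "SalesRepName"
```

This only patches a string after the fact. Handling the BOM when the file is opened avoids the problem at the source.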

This, in turn, led me to the solution here: https://stackoverflow.com/a/7780559/3170942
In this response, knut provided an elegant solution, which looked like this:

File.open('file.txt', "r:bom|utf-8"){|file|
  text_without_bom = file.read
}

The key element I was looking for was "r:bom|utf-8", which tells Ruby to skip a leading BOM and read the rest of the file as UTF-8. So I adapted it to my code, which became this:

CSV.foreach($csv_path + $csv_file, "r:bom|utf-8") do |row|
  contacts << row
end
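
For anyone who wants to reproduce this without my file, the following is a small self-contained sketch. It assumes the `csv` library is available, and the file contents and the variable names `raw` and `clean` are invented for the example. It writes a temporary CSV that starts with a BOM, then reads it once with a plain UTF-8 mode and once with the BOM-aware mode.

```ruby
require "csv"
require "tempfile"

# Hypothetical sample data: a UTF-8 CSV whose first three bytes are the BOM.
tmp = Tempfile.new(["contacts", ".csv"])
tmp.binmode
tmp.write("\xEF\xBB\xBFSalesRepName,Region\nAlice,West\n".b)
tmp.close

# Read without BOM handling, as in the original question.
raw = []
CSV.foreach(tmp.path, "r:utf-8") { |row| raw << row }

# Read with the BOM-aware mode from the answer.
clean = []
CSV.foreach(tmp.path, "r:bom|utf-8") { |row| clean << row }

# Compare the first code point of the first field from each read.
p raw[0][0].codepoints.first
p clean[0][0].codepoints.first
p clean[0][0]   # => "SalesRepName"

tmp.unlink
```

The only difference between the two reads is the mode string, so any difference in the first field comes from the BOM handling alone.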

I spent hours on this stupid problem. Hopefully, this will save you some time!

holaymolay answered Nov 09 '22 14:11