Excel (at least in Office 2007 on XP) can behave differently depending on whether a CSV file is imported by opening it from the File->Open menu or by double-clicking on the file in Explorer.
I have a CSV file that is in UTF-8 encoding and contains newlines in some cells. If I open this file from Excel's File->Open menu, the "import CSV" wizard pops up and the file cannot be correctly imported: the newlines start a new row even when quoted. If I open this file by double-clicking on it in an Explorer window, then it opens correctly without the intervention of the wizard.
None of the suggested solutions worked for me.
What actually works (with any encoding):
Copy/paste the data from the CSV file (opened in a text editor) into Excel, then run "Text to Columns" --> the data gets transformed incorrectly.
The next step is to go to the nearest empty column or an empty worksheet and paste again (the same data that is still in your clipboard) --> automagically works now.
If you are doing this manually, download LibreOffice and use LibreOffice Calc to import your CSV. It does a much better job of stuff like this than any version of Excel I've tried, and it can save to XLS or XLSX as required if you need to transfer to Excel afterwards.
But if you're stuck with Excel and need a better fix, there seems to be a way. It seems to be locale dependent (which seems idiotic, in my humble opinion). I don't have Excel 2007, but I have Excel 2010, and the example given:
ID,Name,Description
"12345","Smith, Joe","Hey.
My name is Joe."
doesn't work. I wrote it in Notepad and chose Save as..., and next to the Save button you can choose the encoding. I chose UTF-8 as suggested, but with no luck. Changing the commas to semicolons worked for me, though. I didn't change anything else, and it just worked. So I changed the example to look like this, and chose the UTF-8 encoding when saving in Notepad:
ID;Name;Description
"12345";"Smith, Joe";"Hey.
My name is Joe."
But there's a catch! The only way it works is if you double-click the CSV file to open it in Excel. If I try to import data from text and choose this CSV, then it still fails on quoted newlines.
But there's another catch! The working field separator (comma in the original example, semicolon in my case) seems to depend on the system's Regional Settings (set under Control Panel -> Region and Language). In Norway, comma is the decimal separator. Excel seems to avoid this character and prefer a semicolon instead. I have access to another computer set to the UK English locale, and on that computer, the first example with a comma separator works fine (only on double-click), and the one with semicolons actually fails! So much for interoperability. If you want to publish this CSV online and users may have Excel, I guess you have to publish both versions and suggest that people check which file gives the correct number of rows.
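If you do end up publishing both versions, generating them from the same data is easy to script. Here is a minimal sketch in Python, using the standard csv module; the names ROWS, render_csv and count_records are made up for illustration, and it only produces the two files' contents. It says nothing about which one a given Excel install will accept.

```python
import csv
import io

# Sample data from the example above; the last field contains a newline.
ROWS = [
    ["ID", "Name", "Description"],
    ["12345", "Smith, Joe", "Hey.\nMy name is Joe."],
]


def render_csv(rows, delimiter):
    """Render rows as CSV text with every field quoted and CRLF record ends."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(
        buffer,
        delimiter=delimiter,
        quoting=csv.QUOTE_ALL,
        lineterminator="\r\n",
    )
    writer.writerows(rows)
    return buffer.getvalue()


def count_records(csv_text, delimiter):
    """Count records the way a quote-aware parser sees them."""
    reader = csv.reader(io.StringIO(csv_text, newline=""), delimiter=delimiter)
    return sum(1 for _ in reader)


# One variant per separator, built from the same rows.
variants = {
    "comma": render_csv(ROWS, ","),
    "semicolon": render_csv(ROWS, ";"),
}
```

To save a variant, something like open(path, "w", encoding="utf-8", newline="") keeps the CRLF record endings intact. count_records gives the row count a reader should expect to see (2 here), which is the number to compare against what Excel shows.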
So all the details that I've been able to gather to get this to work are:
Hope this helps someone.
I have finally found the problem!
It turns out that we were writing the file using Unicode encoding, rather than ASCII or UTF-8. Changing the encoding on the FileStream seems to solve the problem.
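For anyone who can't change the program that writes the file: in .NET, "Unicode" encoding means UTF-16, so an existing file can be converted after the fact. A rough sketch in Python (not the FileStream fix itself, just the same idea applied to bytes already on disk; the function name is made up):

```python
def reencode_utf16_to_utf8(data: bytes) -> bytes:
    """Decode UTF-16 bytes and return the same text encoded as UTF-8.

    Assumption: the input starts with a byte order mark, as files written
    with .NET's Encoding.Unicode normally do. Without one, Python falls
    back to the platform's native byte order.
    """
    text = data.decode("utf-16")  # the BOM is consumed during decoding
    return text.encode("utf-8")


# Example: a small CSV payload as it would look when written as UTF-16.
utf16_bytes = 'ID,Name\r\n1,"Zoë"\r\n'.encode("utf-16")
utf8_bytes = reencode_utf16_to_utf8(utf16_bytes)
```

Read and write the file in binary mode ("rb" / "wb") so that the line endings are not touched along the way.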
Thanks everyone for all your suggestions!
Use Google Sheets and import the CSV file.
Then you can export the result for use in Excel.
Remove the newline/linefeed characters (\n) with Notepad++. Excel will still recognise the carriage return character (\r) to separate records.
As mentioned, newline characters are supported inside CSV fields, but Excel doesn't always handle them gracefully. I faced a similar issue with a third-party CSV that possibly had encoding issues, but it didn't improve with encoding changes.
What worked for me was removing all newline characters (\n). This has the effect of collapsing fields to a single record assuming that your records are separated by the combination of a carriage return and a newline (CR/LF). Excel will then properly import the file and recognise new records by the carriage return.
Obviously a cleaner solution is to first replace the real newlines (\r\n) with a temporary character combination, then replace the remaining newlines (\n) with your separating character of choice (e.g. comma in a semicolon file), and then replace the temporary characters with proper newlines again.
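The three-step replacement described above can be sketched in Python like this. It assumes that records end in CR/LF, that newlines inside fields are bare LF, and that the placeholder (a NUL character here, chosen arbitrarily) never occurs in the data; the function name is made up.

```python
def flatten_embedded_newlines(text, replacement=" "):
    """Replace bare LF inside fields while keeping CRLF record separators."""
    placeholder = "\x00"  # assumption: this character never appears in the data
    if placeholder in text:
        raise ValueError("placeholder already occurs in the input")
    # Step 1: protect the real record separators.
    protected = text.replace("\r\n", placeholder)
    # Step 2: whatever LF is left must be inside a field.
    flattened = protected.replace("\n", replacement)
    # Step 3: restore the record separators.
    return flattened.replace(placeholder, "\r\n")


sample = 'ID,Name,Description\r\n"12345","Smith, Joe","Hey.\nMy name is Joe."\r\n'
cleaned = flatten_embedded_newlines(sample)
```

When reading the file, open it with newline="" (or in binary mode and decode yourself); otherwise Python translates CR/LF to LF on the way in and the two kinds of newline can no longer be told apart.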
If the field contains a leading space, Excel ignores the double quote as a text qualifier. The solution is to eliminate leading spaces between the comma (field separator) and double-quote. For example:
Broken:
Name,Title,Description
"John", "Mr.", "My detailed description"
Working:
Name,Title,Description
"John","Mr.","My detailed description"