Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

ASP.NET CSV Excel issue with strange characters

I'm exporting a table of data to CSV format, e.g.:

    "COL1","COL2","COL3"
    "1","some text", "£232.00"
    "2","some more text", "£111.00"
    "3","other text", "£2.00"

The code to export is fairly simple using an ashx handler:

    context.Response.Clear()
    context.Response.ContentType = "text/csv"
    context.Response.AddHeader("Content-disposition", "attachment;filename=data.csv")
    context.Response.AddHeader("Cache-Control", "must-revalidate")
    context.Response.AddHeader("Pragma", "must-revalidate")
    context.Response.Write(data)
    context.Response.Flush()
    context.Response.End()

My issue is when Excel tries to open the exported file the character  appears before all £ signs, e.g. £232.00 when the value should be £232.00.

like image 462
user23048345 Avatar asked Mar 15 '10 18:03

user23048345


People also ask

Can CSV handle special characters?

So if you're working with CSV, what special characters are actually supported? Technically, all of them! There are no specified limits of what characters can be used in a CSV file.

Why is showing as  in Excel?

When opening the CSV file in Excel, the right single quotation mark character is converted into the "’" symbol. This is because Excel opens the CSV file using the default windows encoding (Windows-1252) and hence the character is encoded as per the Windows-1252 characterset.


1 Answers

You need to set the encoding i.e. the Response.ContentEncoding property.

After finally looking this up in Reflector, the default ContentEncoding is likely to be Encoding.Default or Encoding.UTF8, if a system.web/globalization config section exists, unless it is overridden by a responseEncoding property in that config section.

As to whether setting the ContentEncoding property at all performs differently to not doing so:

  • If you set it to the same object as its current setting, it explicitly catches that and does nothing;
  • However, if you change the setting, it calls HttpWriter.UpdateResponseEncoding, which calls HttpWriter.FlushCharBuffer before switching to the new encoding if there's anything buffered to write already.
  • If you do not set ContentEncoding it is set to the default I mentioned above by first reference to HttpWriter.UpdateResponseEncoding, which is definitely called before any content is written because HttpWriter._responseEncodingUpdated is checked in the writing methods and only updated to True at the end of HttpWriter.UpdateResponseEncoding.

As such, I believe the OP needed to change the encoding to Unicode and I assume Phil Hale's comment means he had a system.web/globalization config section that needed overriding.

Notes:

  • The existing code calling Response.Clear only affects content, not headers, so, contrary to my previous suggestion, I don't believe it is relevant, other than removing any chance of existing content to be present to be flushed before the ContentEncoding change.
  • This is based upon the Reflector 6 output from System.Web from .NET Framework 2.0.
like image 80
Mark Hurd Avatar answered Sep 24 '22 19:09

Mark Hurd