
Why do I have a million null characters at the end of my CSV file?

Tags:

c#

asp.net-mvc

I am generating a CSV export of data on the fly. I do this by getting a hashset of items, transforming them row by row, and writing them to a MemoryStream, which in turn gets sent to the client as a FileResult. The problem is that there are about a million NULL characters at the end of the file. I would guess the number of these characters is equal to the number of items in the hashset, but they are at the end of the file, not at the end of each line.

The code is as follows:

The Controller method:

public ActionResult ExportList(ListExportModel model)
{
  System.IO.MemoryStream ms = ls.ExportListToCsv(model,Server.MapPath("~/uploads"));
  return File(ms.GetBuffer(),"text/csv",model.MailingList.ListName + ".csv");
}

The ExportListToCsv method

public MemoryStream ExportListToCsv(ListExportModel model, string folderpath)
{
    MemoryStream stream = new MemoryStream();
    StreamWriter writer = new StreamWriter(stream);

    writer.WriteLine(string.Join(",", model.Columns));
    var data = GetListItemsFromCsv(model.ListId, folderpath);

    XmlDocument doc = new XmlDocument();
    // Parallel.ForEach(data, (li) =>
    foreach (var li in data)
    {
        string line = "";
        foreach (var field in model.Columns)
        {
            doc.LoadXml(li.CustomFields);
            switch (field)
            {
                // our standard fields
                case "email":
                    line += li.Email + ",";
                    break;
                case "tel":
                    line += li.Tel + ",";
                    break;

                default:
                    line += (doc.SelectNodes("//" + field))[0].Value + ",";
                    break;
            }
        }

        writer.WriteLine(line.TrimEnd(','));
    }
    writer.Flush();
    stream.Position = 0;
    return stream;
}

And the file (all dummy data, no actual persons were harmed during the making of the screenshot): [screenshot of the CSV file]

Note: I get the same results regardless of whether I use writer.Flush() and stream.Position = 0 or not.

Asked by Captain Kenpachi on Dec 03 '14 at 10:12




2 Answers

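You are calling GetBuffer() instead of ToArray(). GetBuffer() returns the stream's entire underlying buffer, including the allocated but unused bytes past the end of the data you wrote. Those unused bytes are zero, which is the run of NULL characters at the end of your file.

See: http://msdn.microsoft.com/en-us/library/system.io.memorystream.toarray

The documentation for ToArray() says:

This method omits unused bytes in MemoryStream from the array. To get the entire buffer, use the GetBuffer method.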

This method returns a copy of the contents of the MemoryStream as a byte array. If the current instance was constructed on a provided byte array, a copy of the section of the array to which this instance has access is returned. See the MemoryStream constructor for details.
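The difference is easy to see in isolation. The following is a minimal console sketch, not the asker's actual code: the class name BufferDemo and the helper BuildCsv are made up for illustration, and the two CSV rows are dummy data. It writes a small CSV into a MemoryStream the same way ExportListToCsv does, then compares what GetBuffer() and ToArray() hand back.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical demo class; the names here are not from the question.
public static class BufferDemo
{
    // Writes two CSV lines into a MemoryStream, mirroring the pattern in ExportListToCsv.
    public static MemoryStream BuildCsv()
    {
        var stream = new MemoryStream();
        // UTF-8 without a byte order mark, so only the text itself is written.
        var writer = new StreamWriter(stream, new UTF8Encoding(false));
        writer.NewLine = "\n";
        writer.WriteLine("email,tel");
        writer.WriteLine("a@example.com,123");
        // Flush pushes the buffered text into the stream. The writer is deliberately
        // not disposed, because disposing it would also close the MemoryStream.
        writer.Flush();
        stream.Position = 0;
        return stream;
    }

    public static void Main()
    {
        MemoryStream ms = BuildCsv();

        byte[] whole = ms.GetBuffer(); // the internal buffer, Capacity bytes long
        byte[] used = ms.ToArray();    // a copy of the written bytes only, Length bytes long

        Console.WriteLine("Length:             " + ms.Length);
        Console.WriteLine("Capacity:           " + ms.Capacity);
        Console.WriteLine("GetBuffer().Length: " + whole.Length);
        Console.WriteLine("ToArray().Length:   " + used.Length);

        // Count the zero bytes that sit past the written data in the raw buffer.
        int trailingZeros = 0;
        for (long i = ms.Length; i < whole.Length; i++)
        {
            if (whole[i] == 0) trailingZeros++;
        }
        Console.WriteLine("Trailing NULL bytes in GetBuffer(): " + trailingZeros);
    }
}
```

Applied to the controller in the question, this means passing ms.ToArray() rather than ms.GetBuffer() to File(...). The capacity grows in steps as the stream fills, so the amount of padding depends on how much data was written, not on the number of rows.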

Answered by leppie on Oct 09 '22 at 15:10


It looks like there could be a lot of blank lines at the end of your .csv file. This would cause the default: case to be executed numerous times at the end of processing.

Answered by Resource on Oct 09 '22 at 16:10