What could cause an XML file to be filled with null characters?

This is a tricky question. I suspect it will require some advanced knowledge of file systems to answer.

I have a WPF application, "App1," targeting .NET Framework 4.0. It has a Settings.settings file that generates a standard App1.exe.config file where default settings are stored. When the user modifies settings, the modifications go in AppData\Roaming\MyCompany\App1\X.X.0.0\user.config. This is all standard .NET behavior. However, on occasion, we've discovered that the user.config file on a customer's machine isn't what it's supposed to be, which causes the application to crash.

The problem looks like this: user.config is about the size it should be if it were filled with XML, but instead of XML it's just a bunch of NUL characters. It's character 0 repeated over and over again. We have no information about what had occurred leading up to this file modification.

We can fix that problem on a customer's device if we just delete user.config because the Common Language Runtime will just generate a new one. They'll lose the changes they've made to the settings, but the changes can be made again.
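The manual fix above could in principle be automated at startup. The following is only a minimal sketch, assuming the corrupted file really is nothing but 0x00 bytes as described; the class and method names (`NulFileCheck`, `IsNulFilled`, `DeleteIfNulFilled`) are hypothetical and not part of .NET or of either application.

```csharp
using System;
using System.IO;

public static class NulFileCheck
{
    // Returns true when the file exists, is non-empty, and every byte is 0x00.
    public static bool IsNulFilled(string path)
    {
        if (!File.Exists(path))
        {
            return false;
        }

        using (FileStream stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
        {
            if (stream.Length == 0)
            {
                return false;
            }

            byte[] buffer = new byte[4096];
            int read;
            while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
            {
                for (int i = 0; i < read; i++)
                {
                    if (buffer[i] != 0)
                    {
                        return false; // found real data, so the file is not NUL-filled
                    }
                }
            }
        }

        return true;
    }

    // Deletes the file if it is NUL-filled; returns true when a file was deleted.
    public static bool DeleteIfNulFilled(string path)
    {
        if (!IsNulFilled(path))
        {
            return false;
        }

        File.Delete(path);
        return true;
    }
}
```

Calling `NulFileCheck.DeleteIfNulFilled(path)` with the path to user.config before the first access to `Settings.Default` would let the runtime regenerate the file instead of crashing. How the application obtains that path is left open here.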

However, I've encountered this problem in another WPF application, "App2," with another XML file, info.xml. This time it's different because the file is generated by my own code rather than by the CLR. The common themes are that both are C# WPF applications, both are XML files, and in both cases we are completely unable to reproduce the problem in our testing. Could this have something to do with the way C# applications interact with XML files or files in general?

Not only can we not reproduce the problem in our current applications, but I can't even reproduce the problem by writing custom code that generates errors on purpose. I can't find a single XML serialization error or file access error that results in a file that's filled with nulls. So what could be going on?

App1 accesses user.config by calling Upgrade() and Save() and by getting and setting the properties. For example:

if (Settings.Default.UpgradeRequired)
{
    Settings.Default.Upgrade();
    Settings.Default.UpgradeRequired = false;
    Settings.Default.Save();
}

App2 accesses info.xml by serializing and deserializing the XML:

public Info Deserialize(string xmlFile)
{
    // Nothing to load if the file has not been written yet.
    if (!File.Exists(xmlFile))
    {
        return null;
    }

    XmlSerializer xmlReadSerializer = new XmlSerializer(typeof(Info));

    // The using block closes the reader, so no explicit Close() is needed.
    using (StreamReader file = new StreamReader(xmlFile))
    {
        return (Info)xmlReadSerializer.Deserialize(file);
    }
}

public void Serialize(Info infoObject, string fileName)
{
    XmlSerializer writer = new XmlSerializer(typeof(Info));

    // Disposing the StreamWriter flushes it to the OS file cache and closes the file.
    using (StreamWriter fileWrite = new StreamWriter(fileName))
    {
        writer.Serialize(fileWrite, infoObject);
    }
}

We've encountered the problem on both Windows 7 and Windows 10. When researching the problem, I came across this post where the same XML problem was encountered in Windows 8.1: Saved files sometime only contains NUL-characters

Is there something I could change in my code to prevent this, or is the problem too deep within the behavior of .NET?

It seems to me that there are three possibilities:

  1. The CLR is writing null characters to the XML files.
  2. The file's memory address pointer gets switched to another location without moving the file contents.
  3. The file system attempts to move the file to another memory address and the file contents get moved but the pointer doesn't get updated.

I feel like 2 and 3 are more likely than 1. This is why I said it may require advanced knowledge of file systems.

I would greatly appreciate any information that might help me reproduce, fix, or work around the problem. Thank you!

Asked by Kyle Delaney, Mar 13 '18 15:03


2 Answers

It's well known that this can happen after a power loss. Specifically, it happens when a cached write extends a file (new or existing) and power is lost shortly afterwards. In this scenario the file has three expected possible states when the machine comes back up:

1) The file doesn't exist at all or has its original length, as if the write never happened.

2) The file has the expected length as if the write happened, but the data is zeros.

3) The file has the expected length and the correct data that was written.

State 2 is what you are describing. It occurs because when you do the cached write, NTFS initially just extends the file size accordingly but leaves VDL (valid data length) untouched. Data beyond VDL always reads back as zeros. The data you were intending to write is sitting in memory in the file cache. It will eventually get written to disk, usually within a few seconds, and following that VDL will get advanced on disk to reflect the data written. If power loss occurs before the data is written or before VDL gets increased, you will end up in state 2.

This is fairly easy to repro, for example by copying a file (the copy engine uses cached writes), and then immediately pulling the power plug on your computer.
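As for what could change in the application code: one commonly used mitigation (not something the answer above prescribes) is to write the new contents to a temporary file, force them to disk, and only then swap the temporary file into place. The sketch below assumes that approach. `DurableFile`, `WriteAllBytesDurably`, and the `.tmp` suffix are hypothetical names, and it narrows the window rather than guaranteeing anything on hardware that ignores flush requests.

```csharp
using System;
using System.IO;

public static class DurableFile
{
    // Writes the bytes to a temporary file next to the target, forces them to
    // disk, then swaps the temporary file into place.
    public static void WriteAllBytesDurably(string path, byte[] bytes)
    {
        string tempPath = path + ".tmp"; // hypothetical naming convention

        using (FileStream stream = new FileStream(
            tempPath, FileMode.Create, FileAccess.Write, FileShare.None,
            4096, FileOptions.WriteThrough))
        {
            stream.Write(bytes, 0, bytes.Length);
            stream.Flush(true); // true = also ask the OS to flush its buffers to disk
        }

        if (File.Exists(path))
        {
            // Replace the old file; null means no backup copy is kept.
            File.Replace(tempPath, path, null);
        }
        else
        {
            File.Move(tempPath, path);
        }
    }
}
```

For a file like info.xml, the object could be serialized into a `MemoryStream` first and the result of `ToArray()` passed to `WriteAllBytesDurably`, so that the previous file stays untouched until the new data has been flushed. This does not apply to user.config, which is written by the framework's own settings code.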

Answered by Craig Barkhouse, Sep 28 '22 18:09


I had a similar problem, and I was able to trace it to a failing HDD.

Description of my setup (all related information):

  • Disk attached to mainboard (SATA):

    • SSD (system),

    • 3 * HDD.

      One of the HDDs had bad blocks, and there were even problems reading the disk structure (directories and file listings).

  • Operating system: Windows 7 x64

  • file system (on all disks): NTFS

When the system tried to read from or write to the failing disk (because of a user request, an automatic scan, or any other reason) and the attempt failed, all subsequent write operations to the other disks were incorrect. Files created on the system disk (mostly configuration files written by other applications) appeared valid when I checked their content directly, probably because the files were still cached in RAM.

Unfortunately, after a restart, all the files written after the failed read/write access to the failing drive had the correct size, but their content was all zero bytes (exactly like in your case).

Try to rule out hardware-related problems. For example, copy the file to a different machine right after a change (upload it to web/FTP), or save known, fixed content to a fixed file. If the copy on the other machine is correct, or if the fixed-content file comes back 'empty', the cause is probably on the local machine. Try replacing hardware components or reinstalling the system.

Answered by Julo, Sep 28 '22 18:09