
DataTable internal index is corrupted

Tags: c#, .net, datatable

I am working on a .NET WinForms app in C#, targeting .NET Framework 3.5. In this app, I set the Expression property of a DataColumn in a DataTable, like so:

DataColumn column = dtData.Columns["TestColumn"];
column.Expression = "some expression";

The second line, where I actually set Expression, sometimes throws the following exception:

FileName=
LineNumber=0
Source=System.Data
TargetSite=Int32 RBInsert(Int32, Int32, Int32, Int32, Boolean)
System.InvalidOperationException: DataTable internal index is corrupted: '5'.
   at System.Data.RBTree`1.RBInsert(Int32 root_id, Int32 x_id, Int32 mainTreeNodeID, Int32 position, Boolean append)
   at System.Data.RBTree`1.RBInsert(Int32 root_id, Int32 x_id, Int32 mainTreeNodeID, Int32 position, Boolean append)
   at System.Data.Index.InitRecords(IFilter filter)
   at System.Data.Index.Reset()
   at System.Data.DataTable.ResetInternalIndexes(DataColumn column)
   at System.Data.DataTable.EvaluateExpressions(DataColumn column)
   at System.Data.DataColumn.set_Expression(String value)

There is no perceptible rhyme or reason as to when the error occurs: loading a data set may work fine and reloading the same data may then fail, or vice versa. This leads me to think it is related to a race condition, where another write operation is occurring on the DataTable while I'm trying to modify one of its columns. However, the code relating to DataTables is not multi-threaded and runs only on the UI thread.

I have searched the web and the Microsoft forums, and there is much discussion and confusion over this issue. When the issue was first reported in 2006, the thinking was that it was a flaw in the .NET Framework, and some supposed hotfixes were released that were presumably rolled into later versions of the framework. However, people have reported mixed results from applying those hotfixes, which are no longer applicable to the current framework.

Another prevailing theory is that some operations on the DataTable, though seemingly innocuous, are actually write operations. For example, creating a new DataView based on a DataTable is a write operation on the table itself, because it creates an internal index in the DataTable for later reference. These write operations are not thread-safe, so a race condition can cause a non-thread-safe write to coincide with our access to the DataTable. This, in turn, corrupts the internal index of the DataTable, leading to the exception.

I have tried putting lock blocks around each DataView creation in the code but, as I mentioned before, the code that uses the DataTable is not threaded, and the locks had no effect in any case.

Has anyone seen this and successfully solved / worked around it?


No, unfortunately I cannot. The DataTable has already been loaded by the time I get hold of it to apply an Expression to one of its DataColumns. I could remove the column and then re-add it using the code you suggested, but is there a particular reason why that would solve the "internal index is corrupted" problem?

user30525 asked Jan 16 '09 15:01


3 Answers

I just had the same issue while importing rows. It seems that calling DataTable.BeginLoadData before the insert fixed it for me.

Edit: As it turns out, this only fixed it on one side; now adding rows throws this exception.

Edit2: Suspending binding, as suggested by Robert Rossney, fixed the adding problem for me too. I simply removed the DataSource from the DataGridView and re-added it after I was done with the DataTable.

Edit3: Still not fixed... the exception has kept creeping up in all sorts of places in my code since Thursday. This is by far the strangest and most f****ing bug I've encountered in the Framework so far (and I've seen many odd things in the 3 years I've been working with .NET 2.0, enough to ensure that not a single one of my future projects will be built on it). But enough ranting, back on topic.

I've read through the whole discussion at the Microsoft support forums and I'll give you a brief summary of it. The original bug report originates in '05.

  • March '06: The bug is reported for the first time and investigation starts. Over the course of the next year it is reported in different forms and with different manifestations.
  • March '07: A hotfix, KB 932491, is finally released (don't get your hopes up). It links to a download of a completely irrelevant-looking hotfix, or at least so it seems. Over the following months many report that the hotfix does not work, while some report success.
  • July '07: Last sign of life from Microsoft (with a completely useless answer). Beyond this point there is no further response from Microsoft: no further confirmations, no attempts at support, no requests for more information... nothing. From here on there is only community-provided information.

No seriously, this sums it up in my opinion. I was able to extract the following information from the whole discussion:

  • The DataTable is not thread-safe. You have to lock/synchronize it yourself if anything accesses it from multiple threads.
  • The corruption of the index happens some time before the actual exception is thrown.
  • One possible source of the corruption is an applied Expression or an applied Sort.
  • Another possible source is the ListChanged event (raised by the DataView over the table). Never modify data in this event or in any event that spawns from it. This includes the various Changed events of bound controls.
  • There are possible issues when binding the DefaultView to a control. Always use DataTable.BeginLoadData() and DataTable.EndLoadData().
  • Creating and manipulating the DefaultView is a write operation on the DataTable (and its index), the Flying Spaghetti Monster knows why.

The most likely cause is a race condition, either in our own source code or in the code of the framework. It seems that Microsoft is either unable to fix this bug or has no intention of doing so. Either way, check your code for race conditions; in my opinion it has something to do with the DefaultView. At some point an insert or a manipulation of the data corrupts the internal index because the changes are not properly propagated through the whole DataTable.
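To make the "lock/synchronize it yourself" point concrete, here is a minimal sketch, assuming some other thread really does touch the table. SynchronizedTable is a hypothetical wrapper, not a framework type. It routes reads through the same lock as writes because, as noted above, seemingly read-only operations such as creating a DataView or calling Select may touch the internal indexes.

```csharp
using System;
using System.Data;

// Hypothetical wrapper (not part of System.Data): every access to the
// wrapped DataTable goes through a single private lock object.
sealed class SynchronizedTable
{
    private readonly object _gate = new object();
    private readonly DataTable _table;

    public SynchronizedTable(DataTable table)
    {
        if (table == null) throw new ArgumentNullException("table");
        _table = table;
    }

    // Use for inserts, updates, setting Expression or Sort, and so on.
    public void Write(Action<DataTable> action)
    {
        lock (_gate) { action(_table); }
    }

    // Use for Select, DataView creation and other "reads"; they take the
    // same lock because they may build or reset internal indexes.
    public T Read<T>(Func<DataTable, T> func)
    {
        lock (_gate) { return func(_table); }
    }
}
```

This only helps if all code paths go through the wrapper; a bound control that reads the table directly bypasses the lock.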

I'll of course report back when I find further information or additional fixes. And sorry if I got a little emotional here, but I've spent three days trying to pinpoint this issue, and it is slowly starting to look like a good reason to get a new job.

Edit4: I was able to avoid this bug by completely removing the binding (control.DataSource = null;) and re-adding it after the data has finished loading. This fuels my suspicion that it has something to do with the DefaultView and the events raised by the bound controls.
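The workaround from Edit4, combined with BeginLoadData/EndLoadData, could be sketched roughly as follows. UnboundLoader and its detach/attach callbacks are hypothetical names used for illustration. In a WinForms app they would typically be something like () => grid.DataSource = null and () => grid.DataSource = table, where grid is whatever control is bound to the table.

```csharp
using System;
using System.Data;

// Hypothetical helper: runs a bulk load while the table is detached from
// its bound control and while notifications and index maintenance are
// suspended via BeginLoadData/EndLoadData.
static class UnboundLoader
{
    public static void LoadWhileUnbound(
        DataTable table, Action detach, Action attach, Action<DataTable> load)
    {
        detach();                      // e.g. grid.DataSource = null
        try
        {
            table.BeginLoadData();
            try
            {
                load(table);           // add or modify rows here
            }
            finally
            {
                table.EndLoadData();   // re-enables constraints and indexes
            }
        }
        finally
        {
            attach();                  // e.g. grid.DataSource = table
        }
    }
}
```

The nested try/finally blocks are there so that the binding is restored even if the load or EndLoadData throws.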

Bobby answered Sep 24 '22 14:09


Personally, this particular bug has been my nemesis for 3 weeks in various forms. I solve it in one part of my code base and it shows up elsewhere (I believe I finally squashed it tonight). The exception info is rather unhelpful, and a way to force a reindex would have been a nice feature, given Microsoft's failure to solve the problem.

I wouldn't look for MS's hotfix -- they have a KB article on it, then redirect you to an ASP.Net fix that is completely unrelated.

Ok - enough complaining. Let's see what has actually helped me in resolving this particular issue in the various places I've encountered it:

  • Avoid using default views, and avoid modifying the DefaultView, if possible. By the way, .NET 2.0 has a number of reader/writer locks around creating views, so they are not the issue they were before 2.0.
  • Call AcceptChanges() where possible.
  • Be careful with .Select(expression), since there is no reader/writer lock in that code, and it is reportedly the only place without one (according to a person on Usenet, so take it with a grain of salt). This is very similar to your issue, so using mutexes may help.
  • Set AllowDBNull on the column in question (questionable value, but reported on Usenet; I've used it only in places where it makes sense).
  • Make sure that you are not assigning null (C#) / Nothing (VB) to a DataRow field. Use DBNull.Value instead of null. In your case you may wish to check that the field is not null; the expression syntax does support the IsNull(val, alt_val) function.
  • This has probably helped me the most (absurd as it sounds): If a value is not changing, don't assign it. So in your case use this instead of your outright assignment:

    if (column.Expression != "some expression") column.Expression = "some expression";

(I removed the square brackets, not sure why they were there).
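A few of the points above (assign only when the value changes, use DBNull.Value rather than null, and guard the expression with IsNull) can be wrapped in small helpers. This is only a sketch: DataColumnHelpers and its method names are made up for illustration, and the Price/Total columns in the comments are hypothetical.

```csharp
using System;
using System.Data;

// Hypothetical helpers illustrating the defensive habits listed above.
static class DataColumnHelpers
{
    // Assigns the expression only if it differs from the current one.
    // Returns true when an assignment actually happened.
    public static bool SetExpressionIfChanged(DataColumn column, string expression)
    {
        if (column.Expression == expression) return false;
        column.Expression = expression;
        return true;
    }

    // Stores DBNull.Value instead of a null reference, and skips the
    // write entirely when the stored value would not change.
    public static bool SetValueIfChanged(DataRow row, string columnName, object value)
    {
        object newValue = value ?? DBNull.Value;
        if (Equals(row[columnName], newValue)) return false;
        row[columnName] = newValue;
        return true;
    }
}

// Example usage with hypothetical columns "Price" and "Total":
//   DataColumnHelpers.SetValueIfChanged(row, "Price", null);   // stores DBNull.Value
//   DataColumnHelpers.SetExpressionIfChanged(
//       table.Columns["Total"], "IsNull(Price, 0)");            // null-safe expression
```

Whether skipping redundant assignments actually avoids the corruption is, as said, an empirical observation rather than a documented guarantee.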

Edit (5/16/12): I just ran into this issue repeatedly (with an UltraGrid/UltraWinGrid). I followed the advice of removing the sort on the DataView and then added a sorted column that matched the DataView sort, and this resolved the issue.

torial answered Sep 23 '22 14:09


You mention "not thread-safe". Are you manipulating the object from different threads? If so, that might very well be the reason for the corruption.

Lasse V. Karlsen answered Sep 22 '22 14:09