
What can cause System.Move to occasionally give wrong results?

Tags:

delphi

For the last few days we have had some strange problems with our database components, which are developed by a third party. There have been no changes to these components for months. The code that HAS changed in the last few days is our own code, and we have also updated our GUI components, which are developed by another third party.

After debugging I have found that a call to System.Move in one of the database component procedures occasionally gives wrong results!

Please take a look at the code below from the database components and read my comments. How can this inconsistent behaviour happen? Can anyone give me an idea of how to proceed to find the cause? NB! I don't think there is anything wrong with THIS code; it is only shown to explain the problem "symptoms". My guess is that there is some sort of memory corruption, caused by our code or by the updated GUI component code.

Edit: Take a look at the blog post linked below. It seems that it could be related to my problem. At least as I read it, it confirms that System.Move can give wrong results: http://blog.excastle.com/2007/08/28/delphi-bug-of-the-day-fpu-stack-leak/

Edit: Sorry for not posting my "solution" earlier, but here it comes: when using Delphi 2007, my problem was solved by using FastMove, which replaces System.Move. After upgrading to Delphi 2010 I have yet to encounter the problem, and we are no longer using FastMove.

Procedure InternalDescribe;
var 
  cbufl: sb4; //sb4=LongInt
  cbuf: array[0..30] of char;
  cbufp: PChar;
  //....
begin
  //..Some code
  repeat
    //...Some code to initialize cbufp and cbufl

    //On the 15th iteration the values immediately before Move are always these:
    //cbufp = 'STDPRODUCTSTOREDELEMENTSCOUNT'
    //cbuf = ('S', 'T', 'A', 'T', 'U', 'S', #0, 'E', 'V', 'A', 'R', 'R', 'E', 'C', 'I', 'D', #0, 'D', 'U', 'C', 'T', 'I', 'D', #0, #0, #0, #0, #0, #0, #0, #0)
    //cbufl = 29

    Move(cbufp^, cbuf, cbufl);

    //Values immediately After Move should then be:
    //cbuf = ('S', 'T', 'D', 'P', 'R', 'O', 'D', 'U', 'C', 'T', 'S', 'T', 'O', 'R', 'E', 'D', 'E', 'L', 'E', 'M', 'E', 'N', 'T', 'S', 'C', 'O', 'U', 'N', 'T', #0, #0)

    //But sometimes (1 in 5..15 times) this Move results in this value instead:
    //cbuf = ('S', 'T', 'D', 'P', 'R', 'O', 'D', 'U', 'C', 'T', 'S', 'T', 'O', 'R', 'E', 'D', #0, #0, #0, #0, #0, 'N', 'T', 'S', 'C', 'O', 'U', 'N', 'T', #0, #0)

  until SomeCondition; 
  //...Some more code
end;

Fredrik Loftheim asked Sep 23 '09 14:09



4 Answers

Move doesn't give wrong results, or at least I've never seen any situation in which it did. It's more likely that you've got something unexpected in the buffer. Try adding calls to Windows.OutputDebugString in this routine to see what you're copying before and after.
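A minimal sketch of that kind of logging, in case it helps. DebugDumpBuffer is a hypothetical helper (not part of the RTL or of the database components); it writes a hex dump of a buffer to the debugger output through Windows.OutputDebugString. The demo buffers only imitate the sizes in the question and use AnsiChar so the byte count is the same in every Delphi version.

```delphi
program DumpDemo;
{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

const
  // Sample data taken from the question; 29 characters long.
  SampleName: AnsiString = 'STDPRODUCTSTOREDELEMENTSCOUNT';

// Hypothetical helper: sends ALabel followed by a hex dump of the
// first Len bytes of Buf to the debugger output window.
procedure DebugDumpBuffer(const ALabel: string; const Buf; Len: Integer);
var
  P: PByte;
  I: Integer;
  S: string;
begin
  S := ALabel + ':';
  P := @Buf;
  for I := 0 to Len - 1 do
  begin
    S := S + ' ' + IntToHex(P^, 2);
    Inc(P);
  end;
  OutputDebugString(PChar(S));
end;

var
  Src, Dst: array[0..30] of AnsiChar;
  Len: Integer;
begin
  FillChar(Src, SizeOf(Src), 0);
  FillChar(Dst, SizeOf(Dst), 0);
  Len := Length(SampleName);
  Move(SampleName[1], Src, Len);

  // Log the source before the copy and the destination after it,
  // so the two dumps can be compared byte by byte.
  DebugDumpBuffer('source before Move', Src, Len);
  Move(Src, Dst, Len);
  DebugDumpBuffer('dest after Move', Dst, Len);
end.
```

In the routine from the question, the equivalent calls would be placed directly around Move(cbufp^, cbuf, cbufl), dumping cbufp^ before and cbuf after. The output appears in the IDE event log or in a debug-output viewer.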

Mason Wheeler answered Nov 15 '22 03:11


Careful - you're assuming that a Char = 1 byte. That was fine before D2009, but in D2009 and D2010 a char is 2 bytes. Move always works with bytes. Is it possible these problems happened after you upgraded to D2009 or D2010?
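A small sketch of the point about bytes versus characters. It assumes the length variable holds a character count (the question does not say whether cbufl counts characters or bytes), and the buffer names here are made up for illustration. Multiplying by SizeOf(Char) yields the right byte count both where Char is 1 byte and where it is 2 bytes.

```delphi
program CharSizeDemo;
{$APPTYPE CONSOLE}

var
  Src, Dst: array[0..30] of Char;
  CharCount: Integer;
begin
  FillChar(Src, SizeOf(Src), 0);
  FillChar(Dst, SizeOf(Dst), 0);
  Src[0] := 'A';
  Src[1] := 'B';
  Src[2] := 'C';
  Src[3] := 'D';
  CharCount := 4;

  // Move counts bytes, not characters. Passing CharCount alone would
  // copy only half of the characters when Char is 2 bytes wide.
  Move(Src, Dst, CharCount * SizeOf(Char));

  Writeln('SizeOf(Char) = ', SizeOf(Char));
  Writeln('Copied: ', PChar(@Dst[0]));
end.
```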

Jim answered Nov 15 '22 04:11


I can confirm that it does fail sometimes. I've just spent a few days tracking it down. Could not believe it. In our case we have a .NET 2.0 web site running under IIS 6 or IIS 7 calling some COM components written in Delphi 2007, and under a moderate load it would all of a sudden start failing to move bytes 16-19 of 28 bytes, but only sometimes. Most of the time it works. You are most likely going to strike problems with moves of sizes in the range 9..31 bytes.

We ended up putting a CompareMem() check after each System.Move() and found that the CompareMem failed sometimes, and this was moving between two buffers/arrays/structures allocated on the stack! Boy was I surprised!
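Roughly, such a check could look like the sketch below. CheckedMove is a hypothetical wrapper name, not the code we actually shipped; it relies only on System.Move and SysUtils.CompareMem. It is meaningful only for non-overlapping buffers, because an overlapping move may legitimately change the source.

```delphi
program CheckedMoveDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical wrapper: copies Count bytes, then verifies that the
// destination matches the source. Only valid when the two buffers
// do not overlap.
procedure CheckedMove(const Source; var Dest; Count: Integer);
begin
  Move(Source, Dest, Count);
  if not CompareMem(@Source, @Dest, Count) then
    raise Exception.CreateFmt('Move mismatch while copying %d bytes', [Count]);
end;

var
  Src, Dst: array[0..27] of Byte;
  I: Integer;
begin
  // 28 bytes, the size mentioned above as one that failed under load.
  for I := Low(Src) to High(Src) do
    Src[I] := I + 1;
  FillChar(Dst, SizeOf(Dst), 0);

  try
    CheckedMove(Src, Dst, SizeOf(Src));
    Writeln('Copy verified');
  except
    on E: Exception do
      Writeln('Check failed: ', E.Message);
  end;
end.
```

A single run on an idle machine is not expected to fail; the check only pays off when it stays in place under the load that triggers the problem.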

Took ages to duplicate. In essence, System.Move from D2006 onwards is unreliable due to stuff getting left on the FPU stack. Would all be fine if the FPU stack was clear.

The blog post noted above is correct. However, whatever the fix is, it does not affect System.Move(), and therefore if you have a DLL/COM component written in Delphi 2006 or later you will have problems at some stage.

I checked out D2010 and the code in System.Move has not been changed. In our case, I'm going to revert System.Move to the Delphi 7 version - just recompile all the system units using the make file.

Myles Penlington answered Nov 15 '22 03:11


Just for your information (in case someone else has the same problem too): we did an upgrade of our software for a customer, and the complete touchscreen locked up when our application was started! Windows was completely frozen! The PC had to be restarted (power off). It took some time to figure out the cause of the complete freeze.

Fortunately we had one (only 1!) stack trace of an AV in FastMove.LargeSSEMove. I disabled the usage of SSE in FastMove, and the problem is gone.

By the way: the touchscreen has a VIA Nehemiah CPU with an S3 chipset.

André answered Nov 15 '22 05:11