
How to debug a (possible) RTL problem?

I'm asking this because I'm out of good ideas...hoping for someone else's fresh perspective.

I have a user running our 32-bit Delphi application (compiled with BDS 2006) on a Windows 7 64-bit system. Our software was "working fine" until a couple weeks ago. Now suddenly it isn't: it throws an Access Violation while initializing (instancing objects).

We've had him reinstall all our software--starting all over from scratch. Same AV error. We disabled his anti-virus software; same error.

Our stack tracing code (madExcept) for some reason wasn't able to provide a stack trace to the line of the error, so we've sent a couple error logging versions for the user to install and run, to isolate the line which generates the error...

Turns out, it's a line which instances a simple TStringList descendant (there's no overridden Create constructor, etc.--basically the Create is just instancing a TStringList which has a few custom methods associated with the descendant class.)

I'm tempted to send the user yet another test .EXE; one which just instances a plain-vanilla TStringList, to see what happens. But at this point I feel like I'm flailing at windmills, and risk wearing out the user's patience if I send too many more "things to try".

Any fresh ideas on a better approach to debugging this user's problem? (I don't like bailing out on a user's problems...those tend to be the ones which, if ignored, suddenly become an epidemic that 5 other users suddenly "find".)

EDIT, as Lasse requested:

procedure T_fmMain.AfterConstruction;
begin
  inherited;
  //Logging shows that we return from the inherited call above,
  //then AV in the following line...
  FActionList := TAActionList.Create;
  //...other code here...
end;

And here's the definition of the object being created...

type
  TAActionList = class(TStringList)
  private
    FShadowList: TStringList;              //UPPERCASE shadow list
    FIsDataLoaded : boolean;
  public
    procedure AfterConstruction; override;
    procedure BeforeDestruction; override;
    procedure DataLoaded;
    function Add(const S: string): Integer; override;
    procedure Delete(Index : integer); override;
    function IndexOf(const S : string) : Integer; override;
  end;

implementation

procedure TAActionList.AfterConstruction;
begin
  Sorted := False;              //until we're done loading
  FShadowList := TStringList.Create;
end;
Mark Wilsdorf asked May 13 '11 22:05


1 Answer

I hate these kinds of problems, but I reckon you should focus on what happens shortly BEFORE the object gets constructed.

The symptoms you describe sound like typical heap corruption, so maybe you have something like...

  • An array being written to outside bounds? (turn bounds checking on, if you have it off)
  • Code trying to access an object which has been deleted?
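To illustrate the first bullet, here is a minimal sketch of what range checking buys you. The program name, the `Buffer` array and the deliberate off-by-one loop are hypothetical and have nothing to do with the asker's code. With `{$R+}` the stray write is reported as an `ERangeError` at the offending line instead of silently overwriting whatever sits next to the array.

```pascal
program RangeCheckSketch;
{$APPTYPE CONSOLE}
{$R+}  // range checking ON: out-of-bounds indexes raise ERangeError

uses
  SysUtils;

var
  Buffer: array[0..3] of Integer;  // hypothetical buffer, for illustration only
  I: Integer;

begin
  try
    // Deliberate off-by-one: the last iteration writes to Buffer[4].
    for I := 0 to 4 do
      Buffer[I] := I;
    Writeln('No range error was raised');
  except
    on E: ERangeError do
      Writeln('Range error caught: ', E.Message);
  end;
end.
```

For the second bullet (use of an already-freed object), a debugging memory manager is the usual tool. FastMM4 in its full debug mode is one option I know of, but treat that as a suggestion rather than something confirmed for this project.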

Since my answer above, you've posted code snippets. These raise a couple of possible issues that I can see.

a: AfterConstruction vs. modified constructor: As others have mentioned, using AfterConstruction in this way is at best not idiomatic. I don't think it's truly "wrong", but it's a possible smell. There's a good intro to these methods on Dr. Bob's site here.
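For comparison, this is roughly what the more conventional constructor/destructor form might look like. It is only a sketch: it assumes the BDS 2006 RTL, where TStringList declares no constructor of its own, and it assumes the shadow list is simply freed on destruction (the original BeforeDestruction body wasn't posted, so that part is a guess).

```pascal
program CtorSketch;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

type
  TAActionList = class(TStringList)
  private
    FShadowList: TStringList;  // UPPERCASE shadow list, as in the question
  public
    constructor Create;
    destructor Destroy; override;
  end;

constructor TAActionList.Create;
begin
  inherited Create;
  // Create the helper before anything else can touch it.
  FShadowList := TStringList.Create;
  Sorted := False;  // until we're done loading
end;

destructor TAActionList.Destroy;
begin
  FShadowList.Free;  // Free is safe even if the field is still nil
  inherited Destroy;
end;

var
  List: TAActionList;

begin
  List := TAActionList.Create;
  try
    // Private fields are visible here because everything is in one module.
    Writeln('Shadow list created: ', Assigned(List.FShadowList));
  finally
    List.Free;
  end;
end.
```

I wouldn't expect this change alone to cure the AV, but it removes one unusual thing from the picture.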

b: overridden methods Add, Delete, IndexOf: I'm guessing these methods use FShadowList in some way. Is it remotely possible that they are being invoked (and thus using FShadowList) before FShadowList is created? This seems possible because you're creating it in AfterConstruction, by which time virtual methods should 'work'. Hopefully this is easy to check with a debugger by setting some breakpoints and seeing the order in which they get hit.
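If a debugger on the user's machine isn't an option, an assertion in each overridden method would turn "used before created" into a clear message instead of a bare AV. A rough sketch follows. The body of Add is my assumption (your real implementation wasn't posted), and I'm guessing from the "UPPERCASE shadow list" comment that the shadow list holds upper-cased copies.

```pascal
program ShadowGuardSketch;
{$APPTYPE CONSOLE}
{$ASSERTIONS ON}  // make sure Assert is compiled in

uses
  SysUtils, Classes;

type
  TAActionList = class(TStringList)
  private
    FShadowList: TStringList;
  public
    procedure AfterConstruction; override;
    procedure BeforeDestruction; override;
    function Add(const S: string): Integer; override;
  end;

procedure TAActionList.AfterConstruction;
begin
  inherited;
  Sorted := False;
  FShadowList := TStringList.Create;
end;

procedure TAActionList.BeforeDestruction;
begin
  FreeAndNil(FShadowList);
  inherited;
end;

function TAActionList.Add(const S: string): Integer;
begin
  // Object fields start out as nil, so this fires if Add runs too early
  // (or after BeforeDestruction has released the shadow list).
  Assert(Assigned(FShadowList), 'Add called while FShadowList is nil');
  Result := inherited Add(S);
  FShadowList.Add(UpperCase(S));  // assumed behaviour, see note above
end;

var
  List: TAActionList;

begin
  List := TAActionList.Create;
  try
    List.Add('Open');
    Writeln(List.Count, ' item(s) in list, ', List.FShadowList.Count, ' in shadow list');
  finally
    List.Free;
  end;
end.
```

The same guard would go into Delete and IndexOf. If none of the assertions ever fire on the user's machine, that points back toward heap corruption from somewhere earlier.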

Roddy answered Sep 17 '22 19:09