Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do I stop parsing an XML document with IVBSAXXMLReader in Delphi?

In order to quickly parse some large XML documents in a Delphi (2007) program, I have implemented the IVBSAXContentHandler interface and use it like this:

FXMLReader := CoSAXXMLReader60.Create;
FXMLReader.contentHandler := Self;
FXMLReader.parseURL(FXmlFile);

This works fine, as long as I simply parse the whole file, but I'd like to stop once I found the content that I am looking for. So my implementation of IVBSAXContentHandler.startElement checks for some condition and when it is true should abort further parsing. I tried this:

procedure TContentHandler.startElement(var strNamespaceURI, strLocalName,  strQName: WideString; const oAttributes: IVBSAXAttributes);
begin
  if SomeCondition then
    SysUtils.Abort;
end;

Unfortunately this raises the rather unhelpful EOleException "Catastrophic failure". (I also tried raising a custom exception with the same result.)

MSDN says the following:

The ErrorHandler interface essentially allows the XMLReader to signal the ContentHandler implementation that it wants to abort processing. Conversely, ContentHandler implementations can indicate to the XMLReader that it wants to abort processing. This can be accomplished by simply raising an application-specific exception. This is especially useful for aborting processing once the implementation finds what it is looking for:

Private Sub IVBSAXContentHandler_characters(ByVal strChars As String)
' I found what I was looking for, abort processing
  Err.Raise vbObjectError + errDone, "startElement", _
        "I got what I want, let's go play!"
End Sub

So, apparently somehow I also need to implement the IVBSAXErrorHandler interface somehow. This interface needs three methods:

procedure TContentHandler.error(const oLocator: IVBSAXLocator;
  var strErrorMessage: WideString; nErrorCode: Integer);
begin

end;

procedure TContentHandler.fatalError(const oLocator: IVBSAXLocator;
  var strErrorMessage: WideString; nErrorCode: Integer);
begin

end;

procedure TContentHandler.ignorableWarning(const oLocator: IVBSAXLocator;
  var strErrorMessage: WideString; nErrorCode: Integer);
begin

end;

and also must be assigned before calling the ParseURL method:

FXMLReader := CoSAXXMLReader60.Create;
FXMLReader.contentHandler := Self;
FXMLReader.errorHandler := Self;
FXMLReader.parseURL(FXmlFile);

Unfortunately that doesn't make any difference, because now the fatalError handler gets called with strErrorMessage = 'Catastrophic failure'. With an empty method body this still results in the above mentioned unhelpful EOleException "Catastrophic failure".

So, now I am out of ideas:

  • Do I need to implement something special in the errorhandler interface?
  • Do I need to raise a particular exception instead of EAbort?
  • Or am I missing something else?

EDIT:

Based on Ondrej Kelle's answer, here is the solution I finally used:

Declare the following constant:

const
  // idea taken from Delphi 10.1 unit System.Win.ComObj:
  EExceptionRaisedHRESULT = HResult(E_UNEXPECTED or (1 shl 29)); // turn on customer bit

Add two new fields to the TContentHandler class:

FExceptObject: TObject;
FExceptAddr: Pointer;

Add this code to the destructor:

FreeAndNil(FExceptObject);

Add a new method SafeCallException:

function TContentHandler.SafeCallException(ExceptObject: TObject; ExceptAddr: Pointer): HResult;
var
  GUID: TGUID;
  exc: Exception;
begin
  if ExceptObject is Exception then begin
    exc := Exception(ExceptObject);
    // Create a copy of the exception object and store it in the FExceptObject field
    FExceptObject := exc.NewInstance;
    Exception(FExceptObject).Create(exc.Message);
    Exception(FExceptObject).HelpContext := exc.HelpContext;
    // Store the exception address in the FExceptAddr field
    FExceptAddr := ExceptAddr;
    // return a custom HRESULT
    Result := EExceptionRaisedHRESULT;
  end else begin
    ZeroMemory(@GUID, SizeOf(GUID));
    Result := HandleSafeCallException(ExceptObject, ExceptAddr, GUID, '', '');
  end;
end;

Add an exception handler to the calling code:

var
  exc: Exception;
begin
  try
    FXMLReader := CoSAXXMLReader60.Create;
    FXMLReader.contentHandler := Self;
    // we do not need an errorHandler
    FXMLReader.parseURL(FXmlFile);
    FXMLReader := nil;
  except
    on e: EOleException do begin
      // Check for the custom HRESULT
      if e.ErrorCode = EExceptionRaisedHRESULT then begin
        // Check that the exception object is assigned
        if Assigned(FExceptObject) then begin
          exc := Exception(FExceptObject);
          // set the pointer to NIL
          FExceptObject := nil;
          // raise the exception a the given address
          raise exc at FExceptAddr;
        end;
      end;
      // fallback: raise the original exception
      raise;
    end;
  end;
end;

While this works for me, it has a serious flaw: It copies only the Message and the HelpContext property of the original exception. So, if there are more properties/fields, e.g.

EInOutError = class(Exception)
public
  ErrorCode: Integer;
end;

These will not be initialized when the exception is re-raised in the calling code.

The advantage is that you will get the correct exception address in the debugger. Beware that you won't get the correct call stack.

like image 204
dummzeuch Avatar asked Aug 17 '16 08:08

dummzeuch


1 Answers

Simply calling Abort; is fine. In this case, just override SafeCallException in your IVBSAXContentHandler implementor class:

function TContentHandler.SafeCallException(ExceptObject: TObject; ExceptAddr: Pointer): HRESULT;
begin
  Result := HandleSafeCallException(ExceptObject, ExceptAddr, TGUID.Empty, '', '');
end;

HandleSafeCallException supplied in ComObj will cause EAbort you're raising to be translated into a HRESULT value E_ABORT which will then be translated back to EAbort by SafeCallError.

Alternatively, you can raise your own exception class, override SafeCallException to translate it into your specific HRESULT value and replace SafeCallErrorProc with your own to translate it back into your Delphi exception which you can then handle on the calling side.

like image 68
Ondrej Kelle Avatar answered Nov 07 '22 10:11

Ondrej Kelle