
Remove null characters from WideString in Delphi 2006

Tags: unicode, delphi

I have a WideString variable containing some data, but when the string was assigned, some extra nulls were added at more or less random places in the data. I now need to strip these nulls out of the variable. If it had been a string, I would have checked each Char to see if Char(x) = 0, but as this is a WideString I don't think this works. How can I best strip these out?

I'm using Delphi 2006

Marius asked Dec 10 '22 14:12



2 Answers

What you're seeing probably aren't null characters. They're probably just the upper eight bits of a character with a code-point value less than 256.
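A minimal sketch of what that looks like, assuming a console project on x86 where each WideChar is a 16-bit value stored low byte first. The program name and the sample text 'AB' are made up for illustration. It prints the two bytes of each character, so the zero upper bytes become visible:

```pascal
program ShowWideBytes;
{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  s: WideString;
  i: Integer;
begin
  s := 'AB';  // two characters, both with code points below 256
  for i := 1 to Length(s) do
    // Low byte first, then high byte, matching the in-memory order on x86
    Write(IntToHex(Lo(Ord(s[i])), 2), ' ', IntToHex(Hi(Ord(s[i])), 2), ' ');
  Writeln;  // prints: 41 00 42 00
end.
```

The 00 bytes here belong to the characters themselves. Comparing s[i] with #0 tests the whole 16-bit character, so neither 'A' nor 'B' would count as a null character.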

If you really do have null characters in your string that aren't supposed to be there, the first thing you should do is figure out how they're getting there. There's probably a bug in your program if they're there when they shouldn't be.

If the code that generates the string is bug-free and you still have unwanted null characters, then you can remove them fairly easily. The common way to remove characters from a string is with the standard Delete procedure. You can specify any character by its numeric value with the # syntax, and the compiler can usually figure out whether it needs to represent an AnsiChar or a WideChar.

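procedure RemoveNullCharacters(var s: WideString);
var
  i: Integer;
begin
  i := 1;
  // Use <= so that the last character is checked too
  while i <= Length(s) do
    if s[i] = #0 then
      Delete(s, i, 1)  // the next character moves into position i, so don't advance
    else
      Inc(i);
end;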

But that may re-allocate the string many times (once for each null character). To avoid that, you can pack the string in-place:

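procedure RemoveNullCharacters(var s: WideString);
var
  i, j: Integer;
begin
  j := 0;  // index of the last character kept so far
  for i := 1 to Length(s) do
    if s[i] <> #0 then begin
      Inc(j);
      s[j] := s[i];  // move the kept character down over any removed ones
    end;
  // Shrink the string once, and only if something was removed
  if j < Length(s) then
    SetLength(s, j);
end;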

Those functions will work for any of Delphi's string types; just change the parameter type.
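As a quick check, here is a small console sketch that runs the packing version on a string with null characters deliberately placed in it. The program name and the test data are hypothetical. The string is built with SetLength and indexed assignment so that the embedded #0 characters are sure to be stored:

```pascal
program StripNullsDemo;
{$APPTYPE CONSOLE}

// Packing version from the answer above, repeated so the example is self-contained
procedure RemoveNullCharacters(var s: WideString);
var
  i, j: Integer;
begin
  j := 0;
  for i := 1 to Length(s) do
    if s[i] <> #0 then begin
      Inc(j);
      s[j] := s[i];
    end;
  if j < Length(s) then
    SetLength(s, j);
end;

var
  s: WideString;
begin
  // Build 'a', #0, 'b', #0, 'c' one character at a time
  SetLength(s, 5);
  s[1] := 'a';
  s[2] := #0;
  s[3] := 'b';
  s[4] := #0;
  s[5] := 'c';

  Writeln(Length(s));  // 5
  RemoveNullCharacters(s);
  Writeln(Length(s));  // 3
  Writeln(s);          // abc
end.
```

WideString is a length-counted type, so Length reports all five characters before the call even though two of them are #0.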

Rob Kennedy answered Mar 04 '23 03:03



Those are not extra nulls. They're part of the string.

You should do some reading on multi-byte characters, which is what a WideString holds. Each character is more than one byte in size (two bytes per WideChar), and for characters with a code point below 256 the extra byte is NULL.

You might start here with Nick Hodges' articles on Unicode, written when Delphi 2009 was first released to help people transition from single-byte characters to multi-byte ones. There are three articles in the series, IIRC.

Ken White answered Mar 04 '23 01:03
