
Cannot remove `Null Characters` from a string

I asked a similar question a couple of months ago. Thanks to Rob Kennedy, I could load my whole text into a RichEdit by using a stream, but I couldn't remove the null characters.


Now in this code:

var
  strm : TMemorystream;
  str  : UTF8string;
  ss   : TStringstream;

begin
  strm := tmemorystream.Create;

  try
    strm.LoadFromFile('C:\Text.txt');
    setstring(str,PAnsichar(strm.Memory),strm.Size);
    str := StringReplace(str, #0, '', [rfReplaceAll]);  //This line doesn't work at all
    ss  := tstringstream.Create(str);
    Richedit1.Lines.LoadFromStream(ss);
  finally
    strm.Free;
    ss.Free;
  end;
end;

I converted the TMemoryStream to a string to remove the null characters with StringReplace(), and then converted the result to a TStringStream to load it with RichEdit.Lines.LoadFromStream.

My problem is that I can't remove the null character using StringReplace(). I can replace other characters, but not #0.

Is there any way to remove null characters directly in the TMemoryStream and then load it into a RichEdit? How? If that is not possible or is very complex, how can I remove them when I convert my text to a string?

Thanks.

Sky asked Dec 01 '22 18:12


2 Answers

Sertac's answer is accurate and you should accept it. If performance is important, and you have a large string with frequent instances of the null character then you should try to reduce the number of heap allocations. Here is how I would implement this:

function RemoveNull(const Input: string): string;
var
  OutputLen, Index: Integer;
  C: Char;
begin
  SetLength(Result, Length(Input));
  OutputLen := 0;
  for Index := 1 to Length(Input) do
  begin
    C := Input[Index];   
    if C <> #0 then
    begin
      inc(OutputLen);
      Result[OutputLen] := C;
    end;
  end;
  SetLength(Result, OutputLen);
end;

If you want to do it directly in the memory stream, then you can do it like this:

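procedure RemoveNullFromMemoryStream(Stream: TMemoryStream);
var
  i: Integer;
  pIn, pOut: PByte;  // read and write cursors into the same buffer
begin
  pIn := Stream.Memory;
  pOut := pIn;
  for i := 0 to Stream.Size-1 do
  begin
    // Copy each non-zero byte down to the write cursor; zero bytes are skipped.
    if pIn^ <> 0 then
    begin
      pOut^ := pIn^;
      inc(pOut);
    end;
    inc(pIn);
  end;
  // Truncate the stream to the number of bytes that were kept.
  Stream.SetSize(NativeUInt(pOut)-NativeUInt(Stream.Memory));
end;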
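
To tie this back to the question, here is a minimal sketch of how the stream version might be used to fill the RichEdit. The helper name LoadFileWithoutNulls and its parameters are hypothetical (they are not in the question), and the sketch assumes a VCL application with Classes and ComCtrls in the uses clause. It strips the bytes without interpreting any encoding, so it only makes sense if the zero bytes really are junk rather than part of UTF-16 text.

```pascal
// Hypothetical helper: the name and parameters are illustrative only.
// Assumes RemoveNullFromMemoryStream (above) is in scope.
procedure LoadFileWithoutNulls(const FileName: string; RichEdit: TRichEdit);
var
  Stream: TMemoryStream;
begin
  Stream := TMemoryStream.Create;
  try
    Stream.LoadFromFile(FileName);
    RemoveNullFromMemoryStream(Stream);  // compacts the buffer in place
    Stream.Position := 0;                // make sure reading starts at the beginning
    RichEdit.Lines.LoadFromStream(Stream);
  finally
    Stream.Free;
  end;
end;
```

With this approach there is no intermediate string or TStringStream, so it could be called as LoadFileWithoutNulls('C:\Text.txt', Richedit1).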

David Heffernan answered Dec 10 '22 04:12


As far as I can see, all the search/replace utilities at some point cast the input to a PChar, for which #0 is the terminating character. Hence they never look past the part of the string that comes before the first null. You may need to devise your own mechanism. Just a quick example:

var
  i: Integer;
begin
  Assert(str <> '');
  i := 1;
  while i <= Length(str) do
    if str[i] = #0 then
      Delete(str, i, 1)  // the next character moves into position i, so do not advance
    else
      Inc(i);
end;

Removing nulls in the stream would similarly involve testing each byte and, whenever you decide to delete one, adjusting the stream accordingly before moving on.

Sertac Akyuz answered Dec 10 '22 04:12