
StrUtils.SplitString not working as expected

I use StrUtils.SplitString to split a string into a TStringDynArray, but the output is not what I expected. Let me explain the issue:

I have a string str: 'a'; 'b'; 'c'
I then called StrUtils.SplitString(str, '; ') to split the string, expecting an array with three elements: 'a', 'b', 'c'

But what I got is an array with five elements: 'a', '', 'b', '', 'c'.
When I split with just ';' instead of '; ', I get three elements, but with a leading blank on the elements that follow a separator.

So why do I get empty strings in my first solution?

Obl Tobl asked Mar 07 '16 09:03


2 Answers

This function is designed not to merge consecutive separators. For instance, consider splitting the following string on commas:

foo,,bar

What would you expect SplitString('foo,,bar', ',') to return? Would you be looking for ('foo', 'bar') or should the answer be ('foo', '', 'bar')? It's not clear a priori which is right, and different use cases might want different output.

In your case, you specified two delimiters, ';' and ' '. This means that

'a'; 'b'

splits at ';' and again at ' '. Between those two delimiters there is nothing, and hence an empty string is returned in between 'a' and 'b'.
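
The behaviour described above can be reproduced with a small console program. This is only a sketch: it uses plain letters rather than the quoted items from the question, and it assumes the XE2-style unit scope names System.Types and System.StrUtils.

```pascal
{$APPTYPE CONSOLE}

uses
  System.Types, System.StrUtils;

var
  Parts: TStringDynArray;
  I: Integer;

begin
  // ';' and ' ' are treated as two independent single-character delimiters,
  // so every '; ' pair produces an empty item between its two characters.
  Parts := SplitString('a; b; c', '; ');
  for I := 0 to High(Parts) do
    Writeln(I, ': [', Parts[I], ']');
end.
```

The brackets make the empty items visible: indexes 1 and 3 print as [], which gives the five elements reported in the question.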

The Split method from the string helper introduced in XE3 has a TStringSplitOptions parameter. If you pass ExcludeEmpty for that parameter then consecutive separators are treated as a single separator. This program:

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

var
  S: string;

begin
  for S in '''a''; ''b''; ''c'''.Split([';', ' '], ExcludeEmpty) do begin
    Writeln(S);
  end;
end.

outputs:

'a'
'b'
'c'

But this is not available to you in XE2, so I think you will have to roll your own split function, which might look like this:

function IsSeparator(const C: Char; const Separators: string): Boolean;
var
  sep: Char;
begin
  for sep in Separators do begin
    if sep=C then begin
      Result := True;
      exit;
    end;
  end;
  Result := False;
end;

function Split(const Str, Separators: string): TArray<string>;
var
  CharIndex, ItemIndex: Integer;
  len: Integer;
  SeparatorCount: Integer;
  Start: Integer;
begin
  len := Length(Str);
  if len=0 then begin
    Result := nil;
    exit;
  end;

  // First pass: count separators to get an upper bound on the item count.
  SeparatorCount := 0;
  for CharIndex := 1 to len do begin
    if IsSeparator(Str[CharIndex], Separators) then begin
      inc(SeparatorCount);
    end;
  end;

  SetLength(Result, SeparatorCount+1); // potentially an over-allocation
  ItemIndex := 0;
  Start := 1;
  // Second pass: copy each non-empty run of characters between separators.
  for CharIndex := 1 to len do begin
    if IsSeparator(Str[CharIndex], Separators) then begin
      if CharIndex>Start then begin
        Result[ItemIndex] := Copy(Str, Start, CharIndex-Start);
        inc(ItemIndex);
      end;
      Start := CharIndex+1;
    end;
  end;

  // Trailing item: use >= so that a final item of length one is not dropped.
  if len>=Start then begin
    Result[ItemIndex] := Copy(Str, Start, len-Start+1);
    inc(ItemIndex);
  end;

  SetLength(Result, ItemIndex); // trim the over-allocation
end;

Of course, all of this assumes that you want a space to act as a separator. You've asked for that in the code, but perhaps you actually want just ; to act as a separator. In that case you probably want to pass ';' as the separator, and trim the strings that are returned.
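
A minimal sketch of that alternative, assuming only ';' should separate items and that surrounding whitespace is not significant. It uses the stock SplitString together with Trim from System.SysUtils, so it should also work in XE2:

```pascal
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Types, System.StrUtils;

var
  Parts: TStringDynArray;
  I: Integer;

begin
  // Split on ';' only, then strip the leading blank left on later items.
  Parts := SplitString('''a''; ''b''; ''c''', ';');
  for I := 0 to High(Parts) do
    Writeln(Trim(Parts[I]));
end.
```

Note that empty items are still preserved here: two consecutive ';' characters would yield an empty string, which may or may not be what you want.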

David Heffernan answered Nov 08 '22 03:11


SplitString is defined as

function SplitString(const S, Delimiters: string): TStringDynArray;

One might think that Delimiters denotes a single delimiter string used to split the input, but it actually denotes a set of single characters. Each character in the Delimiters string is used as one of the possible delimiters.

SplitString

Splits a string into different parts delimited by the specified delimiter characters. S is the string to be split. Delimiters is a string containing the characters defined as delimiters.
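
If the intent really is to split on the whole string '; ' as one delimiter, a small helper built on PosEx is one possible approach. The function name SplitOnString below is hypothetical and not part of the RTL; treat this as a sketch rather than a drop-in replacement.

```pascal
{$APPTYPE CONSOLE}

uses
  System.StrUtils;

// Hypothetical helper (not an RTL routine): splits S wherever the whole
// Delimiter string occurs. Empty items are kept, as with SplitString.
function SplitOnString(const S, Delimiter: string): TArray<string>;
var
  Start, P, Count: Integer;
begin
  Result := nil;
  if Delimiter = '' then begin
    // Nothing to split on: return the input as the single item.
    SetLength(Result, 1);
    Result[0] := S;
    exit;
  end;

  Count := 0;
  Start := 1;
  P := PosEx(Delimiter, S, Start);
  while P > 0 do begin
    SetLength(Result, Count + 1);
    Result[Count] := Copy(S, Start, P - Start);
    inc(Count);
    Start := P + Length(Delimiter);
    P := PosEx(Delimiter, S, Start);
  end;

  // Whatever follows the last delimiter (possibly empty) is the final item.
  SetLength(Result, Count + 1);
  Result[Count] := Copy(S, Start, Length(S) - Start + 1);
end;

var
  Item: string;

begin
  for Item in SplitOnString('''a''; ''b''; ''c''', '; ') do
    Writeln(Item);
end.
```

For the string from the question this prints the three quoted items 'a', 'b' and 'c', one per line. Growing the array one element at a time is simple but not the fastest option for very long inputs.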

Dalija Prasnikar answered Nov 08 '22 01:11