Find and Count Words in a String in Delphi?

I have a string comprising numerous words. How do I find and count the number of times that a particular word appears?

E.g. "hello-apple-banana-hello-pear"

How would I go about finding every "hello" in the example above?

Thanks.

john12 asked Sep 02 '11 12:09


4 Answers

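In Delphi XE you can use StrUtils.SplitString.

Something like this:

// Requires StrUtils (SplitString) and Types (TStringDynArray) in the uses clause.
var
    Words: TStringDynArray;
    Word: string;
    WordCount: Integer;
begin
    WordCount := 0;
    // Split on '-' and count the tokens that are exactly 'hello' (case-sensitive).
    Words := SplitString('hello-apple-banana-hello-pear', '-');
    for Word in Words do
    begin
        if Word = 'hello' then
            Inc(WordCount);
    end;
end;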

Mikael Eriksson answered Nov 02 '22 13:11


This would depend entirely on how you define a word and the text from which you wish to pull the words. If a "word" is everything between spaces, or "-" in your example, then it becomes a fairly simple task. If, however, you want to deal with hyphenated words, abbreviations, contractions, etc. then it becomes a lot more difficult.

More information please.

EDIT: After rereading your post: if the example you give is the only case you need to handle, then I'd suggest this:

// PosEx is declared in StrUtils.
function CountStr(const ASearchFor, ASearchIn : string) : Integer;
var
  Start : Integer;
begin
  Result := 0;
  Start := Pos(ASearchFor, ASearchIn);
  while Start > 0 do
    begin
      Inc(Result);
      // Resume one character past the last match, so overlapping matches count too.
      Start := PosEx(ASearchFor, ASearchIn, Start + 1);
    end;
end;

This will catch ALL instances of a sequence of characters, including overlapping ones and matches inside longer words (for example the "hello" in "othello").
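
If only whole tokens should be counted, one possible variation is to wrap both the text and the search term in the delimiter before searching. This is a sketch rather than part of the answer above: CountToken is a hypothetical helper name, it assumes a single-character delimiter, and the comparison is case-sensitive.

```pascal
program CountWholeTokens;

{$APPTYPE CONSOLE}

uses
  SysUtils, StrUtils;

// Hypothetical helper: counts only the tokens that equal ASearchFor exactly,
// treating ADelimiter as the sole separator.
function CountToken(const ASearchFor, ASearchIn: string; ADelimiter: Char): Integer;
var
  Padded, Needle: string;
  Start: Integer;
begin
  Result := 0;
  if ASearchFor = '' then
    Exit;
  // Surround both strings with delimiters so every token has one on each side.
  Padded := ADelimiter + ASearchIn + ADelimiter;
  Needle := ADelimiter + ASearchFor + ADelimiter;
  Start := Pos(Needle, Padded);
  while Start > 0 do
  begin
    Inc(Result);
    // Resume at the trailing delimiter, which the next token may share.
    Start := PosEx(Needle, Padded, Start + Length(Needle) - 1);
  end;
end;

begin
  Writeln(CountToken('hello', 'hello-apple-banana-hello-pear', '-')); // 2
  Writeln(CountToken('hello', 'othello-hello-hellos', '-'));          // 1
end.
```

Resuming the search at the trailing delimiter, instead of just past it, is what lets two adjacent tokens such as 'hello-hello' both be counted.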

Jerry Gagnon answered Nov 02 '22 13:11


I'm sure there is plenty of code around to do this sort of thing, but it's easy enough to do it yourself with the help of Generics.Collections.TDictionary<K,V>.

program WordCount;

{$APPTYPE CONSOLE}

uses
  SysUtils, Character, Generics.Collections;

function IsSeparator(const c: char): Boolean;
begin
  Result := TCharacter.IsWhiteSpace(c); // replace this with whatever you want, e.g. c = '-'
end;

procedure PopulateWordDictionary(const s: string; dict: TDictionary<string, Integer>);

  procedure AddItem(Item: string);
  var
    Count: Integer;
  begin
    if Item='' then
      exit;
    Item := LowerCase(Item);
    if dict.TryGetValue(Item, Count) then
      dict[Item] := Count+1
    else
      dict.Add(Item, 1);
  end;

var
  i, len, Start: Integer;
begin
  len := Length(s);
  Start := 1;
  for i := 1 to len do begin
    if IsSeparator(s[i]) then begin
      AddItem(Copy(s, Start, i-Start));
      Start := i+1;
    end;
  end;
  AddItem(Copy(s, Start, len-Start+1));
end;

procedure Main;
var
  dict: TDictionary<string, Integer>;
  pair: TPair<string, Integer>;
begin
  dict := TDictionary<string, Integer>.Create;
  try
    PopulateWordDictionary('hello  apple banana Hello pear', dict);
    for pair in dict do
      Writeln(pair.Key, ': ', pair.Value);
  finally
    dict.Free;
  end;
end;

begin
  try
    Main;
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
end.

Output (TDictionary is unordered, so the order of the lines may differ):

hello: 2
banana: 1
apple: 1
pear: 1

Note: I'm working with Delphi 2010 and don't have SplitString() available.
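
For Delphi versions without SplitString, another option is to let TStringList do the splitting through its DelimitedText property. The sketch below is only an illustration: CountWord is a hypothetical helper name, and it assumes the input contains no double quotes, because DelimitedText treats the QuoteChar specially.

```pascal
program CountWithStringList;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: splits AText on ADelimiter and counts the tokens
// that match AWord, ignoring case.
function CountWord(const AWord, AText: string; ADelimiter: Char): Integer;
var
  Parts: TStringList;
  i: Integer;
begin
  Result := 0;
  Parts := TStringList.Create;
  try
    Parts.Delimiter := ADelimiter;
    Parts.StrictDelimiter := True; // split on ADelimiter only, not on spaces
    Parts.DelimitedText := AText;
    for i := 0 to Parts.Count - 1 do
      if SameText(Parts[i], AWord) then // case-insensitive comparison
        Inc(Result);
  finally
    Parts.Free;
  end;
end;

begin
  Writeln(CountWord('hello', 'hello-apple-banana-Hello-pear', '-')); // 2
end.
```

Setting StrictDelimiter matters here: without it, DelimitedText also splits on spaces, which would break tokens that contain them.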

David Heffernan answered Nov 02 '22 12:11


A very clever implementation I saw somewhere on the web:

{ Returns a count of the number of occurrences of SubText in Text.
  StringReplace is declared in SysUtils. }
function CountOccurences( const SubText: string;
                          const Text: string): Integer;
begin
  if (SubText = '') or (Text = '') or (Pos(SubText, Text) = 0) then
    Result := 0
  else
    // Removing every match shortens Text by (count * Length(SubText)) characters.
    Result := (Length(Text) - Length(StringReplace(Text, SubText, '', [rfReplaceAll]))) div Length(SubText);
end;  { CountOccurences }
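
To see why the length arithmetic works, it may help to print the intermediate values for the string from the question. This is a small illustrative program, not part of the implementation above; the constant names Source and Needle are made up for the example.

```pascal
program ReplaceCountDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  Source = 'hello-apple-banana-hello-pear';
  Needle = 'hello';
var
  Stripped: string;
begin
  // Remove every occurrence of Needle, then compare the lengths.
  Stripped := StringReplace(Source, Needle, '', [rfReplaceAll]);
  Writeln(Length(Source));    // 29
  Writeln(Stripped);          // -apple-banana--pear
  Writeln(Length(Stripped));  // 19
  // 10 characters were removed, 5 per match, so there were 2 matches.
  Writeln((Length(Source) - Length(Stripped)) div Length(Needle)); // 2
end.
```

StringReplace removes non-overlapping matches, so this approach counts 'aa' once in 'aaa', whereas a Pos/PosEx loop that advances one character at a time counts it twice.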

RobertFrank answered Nov 02 '22 12:11