I have developed an application that scans basically everywhere for a file or list of files. When I scan small folders, say 10,000 files and subfolders, there is no problem. But when I scan, for instance, my entire Users folder with more than 100,000 items, it is very heavy on my processor: it takes about 40% of my processor's power.
Is there a way to optimize this code so that it uses less CPU?
procedure GetAllSubFolders(sPath: String);
var
  Path: String;
  Rec: TSearchRec;
begin
  try
    Path := IncludeTrailingBackslash(sPath);
    if FindFirst(Path + '*.*', faAnyFile, Rec) = 0 then
      try
        repeat
          Application.ProcessMessages;
          if (Rec.Name <> '.') and (Rec.Name <> '..') then
          begin
            if (ExtractFileExt(Path + Rec.Name) <> '') and
               (ExtractFileExt(Path + Rec.Name).ToLower <> '.lnk') and
               (DirectoryExists(Path + Rec.Name + '\') = False) then
            begin
              if (Pos(Path + Rec.Name, main.Memo1.Lines.Text) = 0) then
              begin
                main.ListBox1.Items.Add(Path + Rec.Name);
                main.Memo1.Lines.Add(Path + Rec.Name);
              end;
            end;
            GetAllSubFolders(Path + Rec.Name);
          end;
        until FindNext(Rec) <> 0;
      finally
        FindClose(Rec);
      end;
  except
    on e: Exception do
      ShowMessage(e.Message);
  end;
end;
My app searches for all the files in a selected folder and its sub-folders, zips them, and copies them to another location you specify.
The Application.ProcessMessages call is there to make sure the application doesn't look like it is hanging, so the user doesn't close it, because finding 100,000 files can take an hour or so...
I am concerned about the processor usage; the memory is not really affected.
Note: the memo is there to make sure the same files are not selected twice.
I see the following performance problems:
Application.ProcessMessages is somewhat expensive. You are polling for messages rather than using a blocking wait, i.e. GetMessage. As well as the performance issue, the use of Application.ProcessMessages is generally an indication of poor design for various reasons and one should, in general, avoid the need to call it.
Memo1.Lines.Text is extraordinarily expensive.
Pos is likewise massively expensive.
DirectoryExists is expensive and spurious. The attributes returned in the search record contain that information.
I would make the following changes:
Move the work into a background thread and drop the call to ProcessMessages. You'll need to devise some way to transport the information back to the main thread for display in the GUI.
Keep the names you have already seen in a dictionary, which gives you O(1) lookup. Take care with case-insensitivity of file names, an issue that you have perhaps neglected so far. This replaces the memo.
Detect directories from Rec.Attr. That is, check that Rec.Attr and faDirectory <> 0.
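Putting those three changes together, a minimal sketch might look like the code below. It is illustrative only: TScanThread, FSeen, Scan and Report are names I've made up, and it assumes the main form and ListBox1 from your question.
// Needs System.SysUtils, System.Classes and System.Generics.Collections
// in the uses clause. All names in this sketch are illustrative.
type
  TScanThread = class(TThread)
  private
    FRoot: string;
    FSeen: TDictionary<string, Boolean>; // lower-cased names, O(1) duplicate lookup
    procedure Scan(const sPath: string);
    procedure Report(const AFileName: string);
  protected
    procedure Execute; override;
  public
    constructor Create(const ARoot: string);
    destructor Destroy; override;
  end;

constructor TScanThread.Create(const ARoot: string);
begin
  inherited Create(True); // created suspended; the caller calls Start
  FRoot := ARoot;
  FSeen := TDictionary<string, Boolean>.Create;
  FreeOnTerminate := True;
end;

destructor TScanThread.Destroy;
begin
  FSeen.Free;
  inherited;
end;

procedure TScanThread.Report(const AFileName: string);
begin
  // Queue marshals the GUI update onto the main thread without blocking the scan
  TThread.Queue(nil,
    procedure
    begin
      main.ListBox1.Items.Add(AFileName);
    end);
end;

procedure TScanThread.Scan(const sPath: string);
var
  Path: string;
  Rec: TSearchRec;
begin
  Path := IncludeTrailingBackslash(sPath);
  if FindFirst(Path + '*.*', faAnyFile, Rec) = 0 then
  try
    repeat
      if (Rec.Name <> '.') and (Rec.Name <> '..') then
      begin
        if Rec.Attr and faDirectory <> 0 then
          Scan(Path + Rec.Name) // the search record already tells us it's a directory
        else if not FSeen.ContainsKey(LowerCase(Path + Rec.Name)) then
        begin
          FSeen.Add(LowerCase(Path + Rec.Name), True); // case-insensitive dedup
          Report(Path + Rec.Name);
        end;
      end;
    until FindNext(Rec) <> 0;
  finally
    FindClose(Rec);
  end;
end;

procedure TScanThread.Execute;
begin
  Scan(FRoot);
end;
You would then start it with something like TScanThread.Create('C:\Users').Start; and the main thread stays free to pump messages normally, with no ProcessMessages needed.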
I agree with the answer which says you would do best to do what you're doing in a background thread, and I don't want to encourage you to persist in doing it in your main thread.
However, if you go to a command prompt and do this:
dir c:\*.* /s > dump.txt & notepad dump.txt
you may be surprised quite how quickly Notepad pops into view.
So there are a few things you could do to speed up your GetAllSubFolders even if you keep it in your main thread, e.g. bracketing the code with calls to main.Memo1.Lines.BeginUpdate and main.Memo1.Lines.EndUpdate, and likewise main.ListBox1.Items.BeginUpdate and EndUpdate. This stops these controls being updated while it executes, which is actually what your code is spending most of its time doing (that, and the "if Pos( ...)" business I've commented on below). And, if you haven't gathered already, Application.ProcessMessages is evil (mostly).
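For instance, a minimal bracketing of your existing call might look like this (sFolder here just stands for whatever path you pass in):
main.Memo1.Lines.BeginUpdate;
main.ListBox1.Items.BeginUpdate;
try
  GetAllSubFolders(sFolder); // the controls don't repaint until EndUpdate
finally
  main.ListBox1.Items.EndUpdate;
  main.Memo1.Lines.EndUpdate;
end;
The try/finally ensures the controls start updating again even if the scan raises an exception.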
I did some timings on my D: drive, which is a 500GB SSD with 263,562 files in 35,949 directories.
The reason I've included the code below in this answer is that you'll find it much easier to execute in a thread because it gathers the results into a TStringlist, whose contents you can then assign to your memo and listbox once the thread has completed.
A few comments on the code in your question, which I imagine you might have got from somewhere:
It pointlessly recurses even when the current entry in Rec is a plain file. The code below only recurses if the current Rec entry is a directory.
It apparently tries to avoid duplicates by the "if Pos( ...)" business, which shouldn't be necessary (except maybe if there's a symbolic link, e.g. created with the MkLink command, somewhere that points elsewhere on the drive) and does it in a highly inefficient manner, i.e. by searching for the filename in the memo contents, which get longer and longer as it finds more files. In the code below, the stringlist is set up to discard duplicates and has its Sorted property set to True, which makes its checking for duplicates much quicker, because it can then do a binary search through its contents rather than a serial one.
It calculates Path + Rec.Name six times for each thing it finds, which is avoidably inefficient at run-time and inflates the source code. This is only a minor point, though, compared to the first two.
Code:
function GetAllSubFolders(sPath: String): TStringList;

  procedure GetAllSubFoldersInner(sPath: String);
  var
    Path,
    AFileName: String;
    Rec: TSearchRec;
    Done: Boolean;
  begin
    Path := IncludeTrailingBackslash(sPath);
    if FindFirst(Path + '*.*', faAnyFile, Rec) = 0 then begin
      Done := False;
      while not Done do begin
        if (Rec.Name <> '.') and (Rec.Name <> '..') then begin
          AFileName := Path + Rec.Name; // calculate Path + Rec.Name only once
          if (Rec.Attr and faDirectory) = 0 then
            Result.Add(AFileName)
          else
            GetAllSubFoldersInner(AFileName); // recurse only into directories
        end;
        Done := FindNext(Rec) <> 0;
      end;
      FindClose(Rec);
    end;
  end;

begin
  Result := TStringList.Create;
  Result.BeginUpdate;
  Result.Sorted := True;
  Result.Duplicates := dupIgnore; // don't add duplicate filenames to the list
  GetAllSubFoldersInner(sPath);
  Result.EndUpdate;
end;
procedure TMain.Button1Click(Sender: TObject);
var
  T1,
  T2: Integer;
  TL: TStringList;
begin
  T1 := GetTickCount;
  TL := GetAllSubFolders('D:\');
  try
    Memo1.Lines.BeginUpdate;
    try
      Memo1.Lines.Text := TL.Text;
    finally
      Memo1.Lines.EndUpdate;
    end;
    T2 := GetTickCount;
    Caption := Format('GetAll: %d, Load: %d, Files: %d', [T2 - T1, GetTickCount - T2, TL.Count]);
  finally
    TL.Free;
  end;
end;
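And, as I said, because GetAllSubFolders gathers everything into a TStringList, moving it off the main thread only needs a thin wrapper. This is a sketch, not tested against your project; Button2Click is just an illustrative handler name, and it assumes System.Classes is in the uses clause:
procedure TMain.Button2Click(Sender: TObject);
begin
  TThread.CreateAnonymousThread(
    procedure
    var
      TL: TStringList;
    begin
      TL := GetAllSubFolders('D:\'); // runs off the main thread, so the GUI stays responsive
      try
        TThread.Synchronize(nil,
          procedure
          begin
            Memo1.Lines.Assign(TL); // GUI updates must happen on the main thread
          end);
      finally
        TL.Free;
      end;
    end).Start;
end;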