i want to process a text file line by line. In the olden days i loaded the file into a StringList
:
slFile := TStringList.Create();
slFile.LoadFromFile(filename);
for i := 0 to slFile.Count-1 do
begin
oneLine := slFile.Strings[i];
//process the line
end;
Problem with that is once the file gets to be a few hundred megabytes, i have to allocate a huge chunk of memory; when really i only need enough memory to hold one line at a time. (Plus, you can't really indicate progress when you the system is locked up loading the file in step 1).
The i tried using the native, and recommended, file I/O routines provided by Delphi:
var
f: TextFile;
begin
Reset(f, filename);
while ReadLn(f, oneLine) do
begin
//process the line
end;
Problem withAssign
is that there is no option to read the file without locking (i.e. fmShareDenyNone
). The former stringlist
example doesn't support no-lock either, unless you change it to LoadFromStream
:
slFile := TStringList.Create;
stream := TFileStream.Create(filename, fmOpenRead or fmShareDenyNone);
slFile.LoadFromStream(stream);
stream.Free;
for i := 0 to slFile.Count-1 do
begin
oneLine := slFile.Strings[i];
//process the line
end;
So now even though i've gained no locks being held, i'm back to loading the entire file into memory.
Is there some alternative to Assign
/ReadLn
, where i can read a file line-by-line, without taking a sharing lock?
i'd rather not get directly into Win32 CreateFile
/ReadFile
, and having to deal with allocating buffers and detecting CR
, LF
, CRLF
's.
i thought about memory mapped files, but there's the difficulty if the entire file doesn't fit (map) into virtual memory, and having to maps views (pieces) of the file at a time. Starts to get ugly.
i just want Reset
with fmShareDenyNone
!
With recent Delphi versions, you can use TStreamReader
. Construct it with your file stream, and then call its ReadLine
method (inherited from TTextReader
).
An option for all Delphi versions is to use Peter Below's StreamIO unit, which gives you AssignStream
. It works just like AssignFile
, but for streams instead of file names. Once you've used that function to associate a stream with a TextFile
variable, you can call ReadLn
and the other I/O functions on it just like any other file.
You can use this sample code:
TTextStream = class(TObject)
private
FHost: TStream;
FOffset,FSize: Integer;
FBuffer: array[0..1023] of Char;
FEOF: Boolean;
function FillBuffer: Boolean;
protected
property Host: TStream read FHost;
public
constructor Create(AHost: TStream);
destructor Destroy; override;
function ReadLn: string; overload;
function ReadLn(out Data: string): Boolean; overload;
property EOF: Boolean read FEOF;
property HostStream: TStream read FHost;
property Offset: Integer read FOffset write FOffset;
end;
{ TTextStream }
constructor TTextStream.Create(AHost: TStream);
begin
FHost := AHost;
FillBuffer;
end;
destructor TTextStream.Destroy;
begin
FHost.Free;
inherited Destroy;
end;
function TTextStream.FillBuffer: Boolean;
begin
FOffset := 0;
FSize := FHost.Read(FBuffer,SizeOf(FBuffer));
Result := FSize > 0;
FEOF := Result;
end;
function TTextStream.ReadLn(out Data: string): Boolean;
var
Len, Start: Integer;
EOLChar: Char;
begin
Data:='';
Result:=False;
repeat
if FOffset>=FSize then
if not FillBuffer then
Exit; // no more data to read from stream -> exit
Result:=True;
Start:=FOffset;
while (FOffset<FSize) and (not (FBuffer[FOffset] in [#13,#10])) do
Inc(FOffset);
Len:=FOffset-Start;
if Len>0 then begin
SetLength(Data,Length(Data)+Len);
Move(FBuffer[Start],Data[Succ(Length(Data)-Len)],Len);
end else
Data:='';
until FOffset<>FSize; // EOL char found
EOLChar:=FBuffer[FOffset];
Inc(FOffset);
if (FOffset=FSize) then
if not FillBuffer then
Exit;
if FBuffer[FOffset] in ([#13,#10]-[EOLChar]) then begin
Inc(FOffset);
if (FOffset=FSize) then
FillBuffer;
end;
end;
function TTextStream.ReadLn: string;
begin
ReadLn(Result);
end;
Usage:
procedure ReadFileByLine(Filename: string);
var
sLine: string;
tsFile: TTextStream;
begin
tsFile := TTextStream.Create(TFileStream.Create(Filename, fmOpenRead or fmShareDenyWrite));
try
while tsFile.ReadLn(sLine) do
begin
//sLine is your line
end;
finally
tsFile.Free;
end;
end;
If you need support for ansi and Unicode in older Delphis, you can use my GpTextFile or GpTextStream.
As it seems the FileMode variable is not valid for Textfiles, but my tests showed that multiple reading from the file is no problem. You didn't mention it in your question, but if you are not going to write to the textfile while it is read you should be good.
What I do is use a TFileStream but I buffer the input into fairly large blocks (e.g. a few megabytes each) and read and process one block at a time. That way I don't have to load the whole file at once.
It works quite quickly that way, even for large files.
I do have a progress indicator. As I load each block, I increment it by the fraction of the file that has additionally been loaded.
Reading one line at a time, without something to do your buffering, is simply too slow for large files.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With