
Get all lines containing a string in a huge text file - as fast as possible?

Tags:

powershell

In PowerShell, what is the fastest way to read a huge text file (about 200,000 lines / 30 MB) and get the last line (or all the lines) containing a specific string? I'm using:

get-content myfile.txt | select-string -pattern "my_string" -encoding ASCII | select -last 1

But it's very slow (about 16-18 seconds). I also tested without the last pipeline stage, "select -last 1", and it takes the same time.

Is there a faster way to get the last occurrence (or all occurrences) of a specific string in a huge file?

Perhaps that is simply how long it takes. Or is there any way to read the file faster from the end, since I only want the last occurrence? Thanks.

SA345 asked Jan 23 '14 14:01




1 Answer

Try this:

# Each pipeline object is an array of up to 1000 lines; -match keeps the matching ones
Get-Content myfile.txt -ReadCount 1000 |
    ForEach-Object { $_ -match "my_string" }

That will read your file in chunks of 1000 lines at a time and find the matches in each chunk. Because each chunk is an array, -match returns the lines in it that match rather than a single true/false. This gives you better performance because you aren't wasting a lot of CPU time on memory management: the pipeline handles one array per 1000 lines instead of one object per line.
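Since the question asks for the last occurrence, here is a minimal sketch of how the chunked read above could be narrowed to a single line. The function name Get-LastMatchingLine and its parameter names are hypothetical, chosen only for illustration; the body uses just Get-Content -ReadCount, ForEach-Object and -match. It still reads the whole file from the start, so it is not a "read from the end" solution, and I haven't timed it against the original file.

```powershell
# Hypothetical helper (name and parameters are illustrative assumptions).
function Get-LastMatchingLine {
    param(
        [Parameter(Mandatory = $true)] [string] $Path,
        [Parameter(Mandatory = $true)] [string] $Pattern,  # a regex, as with -match
        [int] $ReadCount = 1000                            # lines per chunk
    )

    $last = $null
    Get-Content -Path $Path -ReadCount $ReadCount | ForEach-Object {
        # @($_) guarantees an array, so -match filters lines
        # instead of returning a single boolean.
        $hits = @(@($_) -match $Pattern)
        if ($hits.Count -gt 0) {
            # Remember the last match of this chunk; later chunks overwrite it.
            $last = $hits[-1]
        }
    }
    $last   # $null if nothing matched
}
```

Usage, with the file name and search string from the question:

```powershell
Get-LastMatchingLine -Path myfile.txt -Pattern "my_string"
```

Keeping only the latest hit in a variable avoids holding every matching line until the end, which is what piping all the matches into "select -last 1" would otherwise do.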

mjolinor answered Sep 19 '22 18:09