I have a log file that is 2.5 GB in size. Is there any way to split this file into smaller files using the Windows command prompt?


How to split large text file in windows?

4 Answers

If you have installed Git for Windows, you should have Git Bash installed, since that comes with Git.

Use the split command in Git Bash to split a file:

into files of size 500MB each: split myLargeFile.txt -b 500m
into files with 10000 lines each: split myLargeFile.txt -l 10000
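To try the line-based form without touching a real log, here is a minimal sketch that builds a 25-line sample file and splits it into pieces of at most 10 lines. The directory name split_demo and the file name sample.txt are made up for illustration, and it assumes the GNU tools (seq, split, wc) that Git Bash provides.

```shell
# Scratch directory and sample file (hypothetical names, for illustration only)
mkdir -p split_demo
seq 1 25 > split_demo/sample.txt

# Run split inside the scratch directory so the default-named pieces land there
(
  cd split_demo || exit 1
  split -l 10 sample.txt   # at most 10 lines per piece
  wc -l x??                # line count of each piece
)

# Concatenating the pieces in name order gives back all 25 lines
cat split_demo/x?? | wc -l
```

The last piece simply holds whatever is left over, so the pieces can be joined back together with cat in name order.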

Tips:

If you don't have Git/Git Bash, download it from https://git-scm.com/download
If you lost the shortcut to Git Bash, you can run it using C:\Program Files\Git\git-bash.exe

That's it!

I always like examples though...

Example:

The files generated by split are named xaa, xab, xac, etc.

These names are made up of a prefix and a suffix, which you can specify. Since I didn't specify what I want the prefix or suffix to look like, the prefix defaulted to x, and the suffix defaulted to a two-character alphabetical enumeration.

Another Example:

This example demonstrates:

using a filename prefix of MySlice (instead of the default x),
the -d flag for using numerical suffixes (instead of aa, ab, ac, etc.),
and the option -a 5 to make the suffixes 5 digits long.
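A sketch of a command line that combines these three options follows. The sample file, the 10-line chunk size, and the split_demo_named directory are assumptions for illustration, not the exact command from the original screenshot. The -d flag is specific to GNU split, which is the version Git Bash ships.

```shell
# Scratch directory and sample file (hypothetical names, for illustration only)
mkdir -p split_demo_named
seq 1 25 > split_demo_named/sample.txt

# -l 10 : at most 10 lines per piece
# -d    : numeric suffixes instead of aa, ab, ac, ...
# -a 5  : suffixes are 5 characters long
# The last argument is the output prefix (here including the directory)
split -l 10 -d -a 5 split_demo_named/sample.txt split_demo_named/MySlice

# List the resulting pieces
ls split_demo_named
```

With these options the pieces are named MySlice00000, MySlice00001, and so on.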

answered Oct 24 '22 22:10

Josh Withee

' filter.vbs: reads lines from stdin and writes the selected lines to stdout.
' Arg(0) is the command name (cut), Arg(1) is t|b, Arg(2) is i|x, Arg(3) is the number of lines.
Set Arg = WScript.Arguments
Set WshShell = CreateObject("Wscript.Shell")
Set Inp = WScript.StdIn
Set Outp = WScript.StdOut
Set rs = CreateObject("ADODB.Recordset")
With rs
    ' In-memory recordset: a line number plus the line text
    .Fields.Append "LineNumber", 4
    .Fields.Append "Txt", 201, 5000
    .Open
    LineCount = 0
    ' Load every input line into the recordset
    Do Until Inp.AtEndOfStream
        LineCount = LineCount + 1
        .AddNew
        .Fields("LineNumber").Value = LineCount
        .Fields("Txt").Value = Inp.ReadLine
        .Update
    Loop

    .Sort = "LineNumber ASC"

    ' Pick the filter: top/bottom of the file, include/exclude n lines
    If LCase(Arg(1)) = "t" Then
        If LCase(Arg(2)) = "i" Then
            .Filter = "LineNumber < " & LCase(Arg(3)) + 1
        ElseIf LCase(Arg(2)) = "x" Then
            .Filter = "LineNumber > " & LCase(Arg(3))
        End If
    ElseIf LCase(Arg(1)) = "b" Then
        If LCase(Arg(2)) = "i" Then
            .Filter = "LineNumber > " & LineCount - LCase(Arg(3))
        ElseIf LCase(Arg(2)) = "x" Then
            .Filter = "LineNumber < " & LineCount - LCase(Arg(3)) + 1
        End If
    End If

    ' Write the lines that pass the filter
    Do While Not .EOF
        Outp.WriteLine .Fields("Txt").Value
        .MoveNext
    Loop
End With

Cut

filter cut {t|b} {i|x} NumOfLines

Cuts the number of lines from the top or bottom of file.

t - top of the file
b - bottom of the file
i - include n lines
x - exclude n lines

Example

cscript /nologo filter.vbs cut t i 5 < "%systemroot%\win.ini"
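For readers who already have Git Bash available (see the first answer), the four modes of this filter map roughly onto head and tail. This is only a sketch under that assumption: the cut_demo directory and file names are hypothetical, and head -n with a negative count is a GNU extension.

```shell
# Scratch directory and a 10-line sample file (hypothetical names)
mkdir -p cut_demo
seq 1 10 > cut_demo/sample.txt
n=3

# t i : include the first n lines
head -n "$n" cut_demo/sample.txt > cut_demo/top_include.txt
# t x : exclude the first n lines (start printing at line n+1)
tail -n +"$((n + 1))" cut_demo/sample.txt > cut_demo/top_exclude.txt
# b i : include the last n lines
tail -n "$n" cut_demo/sample.txt > cut_demo/bottom_include.txt
# b x : exclude the last n lines (GNU head only)
head -n -"$n" cut_demo/sample.txt > cut_demo/bottom_exclude.txt

wc -l cut_demo/*_include.txt cut_demo/*_exclude.txt
```

Unlike the recordset approach, head and tail stream the input, so they do not need to hold the whole file in memory.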

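Another way: this outputs lines 5001 onward; adapt it for your use. It uses almost no memory.

Set Inp = WScript.StdIn
Set Outp = WScript.StdOut
Count = 0
Do Until Inp.AtEndOfStream
    Count = Count + 1
    ' Always read the line so the stream advances, even when it is skipped
    Line = Inp.ReadLine
    If Count > 5000 Then
        Outp.WriteLine Line
    End If
Loop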

answered Oct 24 '22 22:10

bill

The code below splits the file every LPF lines (15000 in this example):

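@echo off
setlocal ENABLEDELAYEDEXPANSION
REM Edit this value to change the name of the file that needs splitting. Include the extension.
SET BFN=upload.txt
REM Edit this value to change the number of lines per file.
SET LPF=15000
REM Edit this value to change the name of each short file. It will be followed by a number indicating where it is in the list.
SET SFN=SplitFile

REM Do not change beyond this line.

REM Reuse the last three characters of the input name as the output extension.
SET SFX=%BFN:~-3%

SET /A LineNum=0
SET /A FileNum=1

REM "delims=" keeps each line whole. FOR /F skips blank lines, so they are dropped from the output.
REM Lines containing ! are altered because delayed expansion is enabled.
For /F "usebackq delims=" %%l in ("%BFN%") Do (
SET /A LineNum+=1

REM Redirect first so that no trailing space is appended to the line.
>>"%SFN%!FileNum!.%SFX%" echo(%%l

if !LineNum! EQU !LPF! (
SET /A LineNum=0
SET /A FileNum+=1
)

)
endlocal
Pause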

See: https://forums.techguy.org/threads/solved-split-a-100000-line-csv-into-5000-line-csv-files-with-dos-batch.1023949/
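If Git Bash is available (see the first answer), the same "N lines per numbered file" idea can be sketched with awk. This is not the batch author's method, only a comparable approach; the awk_demo directory, the 25-line sample, and the 10-line chunk size are assumptions for illustration, while upload.txt and SplitFile mirror the names used in the batch script.

```shell
# Scratch directory and a 25-line sample file (hypothetical, for illustration only)
mkdir -p awk_demo
seq 1 25 > awk_demo/upload.txt

# lpf    = lines per file
# prefix = output name without the running number and extension
awk -v lpf=10 -v prefix="awk_demo/SplitFile" '
  (NR - 1) % lpf == 0 {            # first line of a new chunk
    if (out != "") close(out)      # release the previous output file
    out = sprintf("%s%d.txt", prefix, int((NR - 1) / lpf) + 1)
  }
  { print > out }                  # append the current line to the current chunk
' awk_demo/upload.txt

wc -l awk_demo/SplitFile*.txt
```

Closing each finished chunk keeps the number of open files at one, and blank lines are written like any other line.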

answered Oct 24 '22 22:10

Bhanu Sinha

Of course there is! Win CMD can do a lot more than just split text files :)

Split a text file into separate files of 'max' lines each. The !var! syntax needs delayed expansion, so run this at a prompt started with cmd /v:on; inside a batch file, write %%i instead of %i.

REM Initialize
set input=file.txt
set max=10000

set /a line=1 >nul
set /a file=1 >nul
set out=!file!_%input%
set /a max+=1 >nul

echo Number of lines in %input%:
find /c /v "" < %input%

REM Split file
for /f "tokens=* delims=[" %i in ('type "%input%" ^| find /v /n ""') do (

if !line!==%max% (
set /a line=1 >nul
set /a file+=1 >nul
set out=!file!_%input%
echo Writing file: !out!
)

REM Strip the [n] prefix added by find /n, then append the line to the current file
set a=%i
set a=!a:*]=]!
echo:!a:~1!>>!out!
set /a line+=1 >nul
)

If the above code hangs or crashes, this approach splits files faster (by writing data to intermediate files instead of keeping everything in memory).

E.g., to split a file with 7,600 lines into smaller files of at most 3,000 lines each:

1. Generate the regexp pattern files (with the set command) to be fed to the /g flag of findstr:

list1.txt

\[[0-9]\]
\[[0-9][0-9]\]
\[[0-9][0-9][0-9]\]
\[[0-2][0-9][0-9][0-9]\]

list2.txt

\[[3-5][0-9][0-9][0-9]\]

list3.txt

\[[6-9][0-9][0-9][0-9]\]

2. Split the file into smaller files:

type "%input%" | find /v /n "" | findstr /b /r /g:list1.txt > file1.txt
type "%input%" | find /v /n "" | findstr /b /r /g:list2.txt > file2.txt
type "%input%" | find /v /n "" | findstr /b /r /g:list3.txt > file3.txt

3. Remove the prefixed line numbers from each split file, e.g. for the 1st file:

for /f "tokens=* delims=[" %i in ('type "%cd%\file1.txt"') do (
set a=%i
set a=!a:*]=]!
echo:!a:~1!>>file_1.txt)
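The pattern files above select lines by their [n] prefix: list1.txt covers lines 1 to 2999, list2.txt covers 3000 to 5999, and list3.txt covers 6000 to 9999. For comparison, the same line ranges can be sketched in Git Bash with sed, which addresses lines by number directly and so needs no prefix to add or strip. The range_demo directory and the generated 7,600-line input are assumptions for illustration.

```shell
# Scratch directory and a 7,600-line sample file (hypothetical, for illustration only)
mkdir -p range_demo
seq 1 7600 > range_demo/input.txt

# -n suppresses default output; 'A,Bp' prints only lines A through B
sed -n '1,2999p'    range_demo/input.txt > range_demo/file_1.txt
sed -n '3000,5999p' range_demo/input.txt > range_demo/file_2.txt
sed -n '6000,9999p' range_demo/input.txt > range_demo/file_3.txt

wc -l range_demo/file_?.txt
```

As with the findstr version, each output file requires its own pass over the input.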

Notes:
Works with leading whitespace, blank lines & whitespace lines.

Tested on Win 10 x64 CMD, on 4.4GB text file, 5651982 lines.

answered Oct 24 '22 21:10

Zimba


Tags: text, split, windows, cmd, size

Albin
