I have a log file with size of 2.5 GB. Is there any way to split this file into smaller files using windows command prompt?
To be able to open such large CSV files, you need to download and use a third-party application. If all you want is to view such files, then Large Text File Viewer is the best choice for you. For actually editing them, you can try a feature-rich text editor like Emacs, or go for a premium tool like CSV Explorer.
Click Content > Split to open the window in the snapshot below. Click the Files tab and press Add to select a PDF to split. Select the Options tab, and enter a value in the Split by the number of files box. That's the number of split files you'll get.
To split a file into pieces, you simply use the split command. By default, the split command uses a very simple naming scheme. The file chunks will be named xaa, xab, xac, etc., and, presumably, if you break up a file that is sufficiently large, you might even get chunks named xza and xzz.
If you have installed Git for Windows, you should have Git Bash installed, since that comes with Git.
Use the split
command in Git Bash to split a file:
into files of size 500MB each: split myLargeFile.txt -b 500m
into files with 10000 lines each: split myLargeFile.txt -l 10000
Tips:
If you don't have Git/Git Bash, download at https://git-scm.com/download
If you lost the shortcut to Git Bash, you can run it using C:\Program Files\Git\git-bash.exe
I always like examples though...
Example:
You can see in this image that the files generated by split
are named xaa
, xab
, xac
, etc.
These names are made up of a prefix and a suffix, which you can specify. Since I didn't specify what I want the prefix or suffix to look like, the prefix defaulted to x
, and the suffix defaulted to a two-character alphabetical enumeration.
Another Example:
This example demonstrates
MySlice
(instead of the default x
),-d
flag for using numerical suffixes (instead of aa
, ab
, ac
, etc...),-a 5
to tell it I want the suffixes to be 5 digits long:Set Arg = WScript.Arguments
set WshShell = createObject("Wscript.Shell")
Set Inp = WScript.Stdin
Set Outp = Wscript.Stdout
Set rs = CreateObject("ADODB.Recordset")
With rs
.Fields.Append "LineNumber", 4
.Fields.Append "Txt", 201, 5000
.Open
LineCount = 0
Do Until Inp.AtEndOfStream
LineCount = LineCount + 1
.AddNew
.Fields("LineNumber").value = LineCount
.Fields("Txt").value = Inp.readline
.UpDate
Loop
.Sort = "LineNumber ASC"
If LCase(Arg(1)) = "t" then
If LCase(Arg(2)) = "i" then
.filter = "LineNumber < " & LCase(Arg(3)) + 1
ElseIf LCase(Arg(2)) = "x" then
.filter = "LineNumber > " & LCase(Arg(3))
End If
ElseIf LCase(Arg(1)) = "b" then
If LCase(Arg(2)) = "i" then
.filter = "LineNumber > " & LineCount - LCase(Arg(3))
ElseIf LCase(Arg(2)) = "x" then
.filter = "LineNumber < " & LineCount - LCase(Arg(3)) + 1
End If
End If
Do While not .EOF
Outp.writeline .Fields("Txt").Value
.MoveNext
Loop
End With
Cut
filter cut {t|b} {i|x} NumOfLines
Cuts the number of lines from the top or bottom of file.
t - top of the file
b - bottom of the file
i - include n lines
x - exclude n lines
Example
cscript /nologo filter.vbs cut t i 5 < "%systemroot%\win.ini"
Another way This outputs lines 5001+, adapt for your use. This uses almost no memory.
Do Until Inp.AtEndOfStream
Count = Count + 1
If count > 5000 then
OutP.WriteLine Inp.Readline
End If
Loop
Below code split file every 500
@echo off
setlocal ENABLEDELAYEDEXPANSION
REM Edit this value to change the name of the file that needs splitting. Include the extension.
SET BFN=upload.txt
REM Edit this value to change the number of lines per file.
SET LPF=15000
REM Edit this value to change the name of each short file. It will be followed by a number indicating where it is in the list.
SET SFN=SplitFile
REM Do not change beyond this line.
SET SFX=%BFN:~-3%
SET /A LineNum=0
SET /A FileNum=1
For /F "delims==" %%l in (%BFN%) Do (
SET /A LineNum+=1
echo %%l >> %SFN%!FileNum!.%SFX%
if !LineNum! EQU !LPF! (
SET /A LineNum=0
SET /A FileNum+=1
)
)
endlocal
Pause
See below: https://forums.techguy.org/threads/solved-split-a-100000-line-csv-into-5000-line-csv-files-with-dos-batch.1023949/
Of course there is! Win CMD can do a lot more than just split text files :)
Split a text file into separate files of 'max' lines each:
Split text file (max lines each):
: Initialize
set input=file.txt
set max=10000
set /a line=1 >nul
set /a file=1 >nul
set out=!file!_%input%
set /a max+=1 >nul
echo Number of lines in %input%:
find /c /v "" < %input%
: Split file
for /f "tokens=* delims=[" %i in ('type "%input%" ^| find /v /n ""') do (
if !line!==%max% (
set /a line=1 >nul
set /a file+=1 >nul
set out=!file!_%input%
echo Writing file: !out!
)
REM Write next file
set a=%i
set a=!a:*]=]!
echo:!a:~1!>>out!
set /a line+=1 >nul
)
If above code hangs or crashes, this example code splits files faster (by writing data to intermediate files instead of keeping everything in memory):
eg. To split a file with 7,600 lines into smaller files of maximum 3000 lines.
set
command to be fed to /g
flag of findstr
list1.txt
\[[0-9]\]
\[[0-9][0-9]\]
\[[0-9][0-9][0-9]\]
\[[0-2][0-9][0-9][0-9]\]
list2.txt
\[[3-5][0-9][0-9][0-9]\]
list3.txt
\[[6-9][0-9][0-9][0-9]\]
type "%input%" | find /v /n "" | findstr /b /r /g:list1.txt > file1.txt type "%input%" | find /v /n "" | findstr /b /r /g:list2.txt > file2.txt type "%input%" | find /v /n "" | findstr /b /r /g:list3.txt > file3.txt
for /f "tokens=* delims=[" %i in ('type "%cd%\file1.txt"') do ( set a=%i set a=!a:*]=]! echo:!a:~1!>>file_1.txt)
Notes:
Works with leading whitespace, blank lines & whitespace lines.
Tested on Win 10 x64 CMD, on 4.4GB text file, 5651982 lines.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With