 

Quickest way to split a large file based on text within the file in Linux

Tags:

linux

bash

sed

awk

I have a large file which contains data for 10 years. I want to split it into files that contain 1 year of data each.

The data in the file is in the following format:

GBPUSD,20100201,000200,1.5969,1.5969,1.5967,1.5967,4
GBPUSD,20100201,000300,1.5967,1.5967,1.5960,1.5962,4

Characters 8-11 contain the year. I would like to use that as the filename, with .txt on the end, e.g. 2011.txt, 2012.txt, etc.

The file contains around 4 million rows.

I'm using Ubuntu Linux.

asked Feb 03 '13 by zio

People also ask

How do I break a large file into smaller parts in Linux?

To split large files into small pieces, use the split command. By default it writes output files of a fixed size: 1,000 lines per piece, named with the prefix 'x' (xaa, xab, xac, and so on).
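For example, with the defaults (bigfile.txt is an illustrative name, not from the question):

split bigfile.txt
# writes xaa, xab, xac, ... with 1,000 lines each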

How do I split a large text file in Unix?

If you use the -l (a lowercase L) option, as in split -l linenumber filename, replace linenumber with the number of lines you'd like in each of the smaller files (the default is 1,000). If you use the -b option, as in split -b bytes filename, replace bytes with the number of bytes you'd like in each of the smaller files.
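For instance (bigfile.txt and the chunk_ prefix are illustrative names, not from the question):

split -l 100000 bigfile.txt chunk_   # pieces of 100,000 lines: chunk_aa, chunk_ab, ...
split -b 50M bigfile.txt chunk_      # pieces of about 50 MB each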


1 Answer

Here's one way using awk: each input line is written to a file named after characters 8-11 of the line (the year):

awk '{ print > (substr($0,8,4) ".txt") }' file

If the length of the first field can vary, you may prefer:

awk -F, '{ print > (substr($2,1,4) ".txt") }' file
answered Oct 14 '22 by Steve
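As a side note (not part of the original answer): the approach above keeps every output file open at once, which is fine for ten years of data. If you ever split on a key with many more distinct values and the input is sorted on that key, a variation that closes each file once its block ends stays under awk's open-file-descriptor limit. A sketch, assuming the same comma-separated layout:

awk -F, '{
    out = substr($2, 1, 4) ".txt"                      # year taken from the second field, e.g. 2010.txt
    if (out != prev) { if (prev != "") close(prev); prev = out }
    print > out
}' file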