I have a roughly 10 GB file on a Linux system. It contains 20,000,000 binary records, each separated by an ASCII "$" delimiter. I would like to use the split command, or some combination of tools, to chunk the file into smaller parts. Ideally I could tell the command to split at every 1,000 records (and therefore at every 1,000th delimiter) into separate files. Can anyone help with this?
To split a file into pieces, you simply use the split command. By default, split uses a very simple naming scheme: the chunks are named xaa, xab, xac, and so on. Two-letter suffixes allow up to 676 chunks (xaa through xzz); GNU split automatically lengthens the suffix once those run out, and you can also set the suffix length explicitly with -a.
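For example, to pick your own prefix and suffix length (a sketch; input.txt and the chunk_ prefix are placeholder names):

# three-letter suffixes and a custom prefix: chunk_aaa, chunk_aab, chunk_aac, ...
split -l 1000 -a 3 input.txt chunk_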
The only unorthodox part of the problem seems to be the record separator. I'm sure this is fixable in awk pretty simply, but I happen to hate awk.
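(For completeness, here is roughly what that awk version might look like. A sketch only: it assumes GNU awk, that the records contain no NUL bytes (awk is not reliably binary-safe), and a hypothetical chunk_ naming scheme; note it also re-appends a trailing "$" after the final record.)

gawk 'BEGIN { RS = ORS = "$" }       # treat "$" as the record separator
{
    # 1,000 records per output file: chunk_00000, chunk_00001, ...
    f = sprintf("chunk_%05d", int((NR - 1) / 1000))
    if (f != prev) { if (prev != "") close(prev); prev = f }   # avoid exhausting file descriptors
    print > f
}' large_records.txt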
Personally, I would translate it into the realm of 'normal' problems first:
# turn each "$" delimiter into a newline, then cut every 1,000 lines
tr '$' '\n' < large_records.txt | split -l 1000
This will by default create xaa, xab, xac, ... files; look at man split for more options.
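One caveat: this assumes the binary payload itself never contains newline (0x0A) bytes, since split -l would count those as record boundaries too. If you need the original "$" delimiters restored in each chunk, a sketch (assuming the default xaa, xab, ... output names):

# swap the newlines back to "$" in each chunk, in place
for f in x??; do
    tr '\n' '$' < "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done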