
Split a csv file into multiple files based on a pattern

I have a csv file with the following structure:

time,magnitude
0,13517
292.5669,370
620.8469,528
0,377
832.3269,50187
5633.9419,3088
20795.0950,2922
21395.6879,2498
21768.2139,647
21881.2049,194
0,3566
292.5669,370
504.1510,712
1639.4800,287
46709.1749,365
46803.4400,500

I'd like to split this csv file into separate csv files, like the following:

File 1:

time,magnitude
0,13517
292.5669,370
620.8469,528

File 2:

time,magnitude
0,377
832.3269,50187
5633.9419,3088
20795.0950,2922
21395.6879,2498

and so on.

I've read several similar posts (e.g., this, this, or this one), but they all search for specific values in a column and save each group of values into a separate file. In my case, however, the values in the time column are not the same from one group to the next. I'd like to split based on a condition: if time = 0, save that row and all subsequent rows to a new file, until the next row where time = 0.

Can someone please let me know how to do this?

asked May 22 '26 22:05 by mOna


2 Answers

With pandas, you can group the rows by a running count of the time = 0 rows (a cumulative sum over the boolean mask) and write each group to its own file:

#pip install pandas
import pandas as pd

df = pd.read_csv("input_file.csv", sep=",") # <- change the sep if needed

for n, g in df.groupby(df["time"].eq(0).cumsum()):
    g.to_csv(f"file_{n}.csv", index=False, sep=",")

Output:

    time  magnitude   # <- file_1.csv
  0.0000      13517
292.5669        370
620.8469        528

      time  magnitude # <- file_2.csv
    0.0000        377
  832.3269      50187
 5633.9419       3088
20795.0950       2922
21395.6879       2498
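
To see why the grouping key works, here is a minimal sketch on a shortened copy of the question's data. The names `sample`, `starts` and `group_id` are only illustrative and not part of the answer above: each row with time = 0 bumps the running count by one, so every row in a block ends up with the same label.

```python
import io
import pandas as pd

# A shortened copy of the question's data, held in memory for the demo
sample = io.StringIO(
    "time,magnitude\n"
    "0,13517\n"
    "292.5669,370\n"
    "620.8469,528\n"
    "0,377\n"
    "832.3269,50187\n"
)
df = pd.read_csv(sample)

# True on each row where a new block starts (time == 0)
starts = df["time"].eq(0)

# The running count of block starts gives every row its block number
group_id = starts.cumsum()

print(group_id.tolist())  # [1, 1, 1, 2, 2]
```

Because the labels start at 1 when the first data row has time = 0, the files come out as file_1.csv, file_2.csv, and so on.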
answered May 24 '26 11:05 by Timeless


datasplit.awk

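#!/usr/bin/awk -f

# The opening brace must be on the same line as BEGIN,
# otherwise awk reports a syntax error.
BEGIN {
    filename = "output_file_"
    fileext = ".csv"
    FS = ","

    c = 0
    file = filename c fileext
    # Read the first line of the input and keep it as the header.
    getline
    header = $0
}
{
    if ($1 == 0) {
        # A row with time = 0 starts a new output file.
        close(file)
        c = c + 1
        file = filename c fileext
        print header > file
    }
    # ">" truncates only when the file is first opened, so later rows append.
    print > file
}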

Make the file executable:

chmod +x datasplit.awk

Run it from the folder where the output files should be written (adjust the path to the script if it lives elsewhere):

./datasplit.awk datafile
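
If awk is not available, the same line-by-line idea can be sketched with Python's standard library only. This is an assumption-laden sketch, not part of the answer above: the function name `split_on_zero` and its parameters are hypothetical, and unlike the awk script it puts any rows that appear before the first time = 0 row into the first file, with the header.

```python
import csv
from pathlib import Path


def split_on_zero(src, out_dir=".", prefix="output_file_"):
    """Write one CSV per block; a block starts at each row whose time is 0."""
    out_dir = Path(out_dir)
    written = []   # paths of the files created so far
    out = None     # currently open output file
    writer = None
    with open(src, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, None)
        if header is None:
            return written  # empty input, nothing to split
        for row in reader:
            if not row:
                continue  # skip blank lines
            # Start a new file on time == 0, or for the very first data row
            if writer is None or float(row[0]) == 0:
                if out is not None:
                    out.close()
                path = out_dir / f"{prefix}{len(written) + 1}.csv"
                out = open(path, "w", newline="")
                writer = csv.writer(out)
                writer.writerow(header)
                written.append(path)
            writer.writerow(row)
    if out is not None:
        out.close()
    return written
```

Calling `split_on_zero("datafile")` would write output_file_1.csv, output_file_2.csv, and so on into the current folder and return their paths. Only one output file is open at a time, so the number of blocks is not limited by the open-file limit.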
answered May 24 '26 10:05 by dodrg


