I'd like to read the contents of multiple files, process their data individually (because of performance and hardware resources), and write my results into one 'big' netCDF4 file.
Right now I'm able to read the files and process their data, but I struggle with the resulting multiple arrays; I haven't been able to merge them correctly.
I've got a 3-D array (time, long, lat) containing my calculated value for each day. What I'd like to do is merge all the arrays I've got into one big array before I write it into my netCDF4 file (all days in one array).
Here are two example arrays:
My expected result is:
How can I achieve that structure?

When I use:

allDays = day1 + day2

my data is aggregated (the values are added element-wise instead of being concatenated). When I use:
allDays = []
allDays.append(day1)
allDays.append(day2)
my data is wrapped in a new outer array (an extra dimension) instead of being joined along the time axis.
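For illustration, here is a minimal example with small dummy arrays (the shapes are made up; my real data is much larger) that reproduces both behaviours:

import numpy as np

# Two dummy days, each shaped (time, long, lat) = (1, 2, 2)
day1 = np.ones((1, 2, 2))
day2 = np.ones((1, 2, 2)) * 2

print((day1 + day2).shape)      # (1, 2, 2) -- values summed, nothing merged

allDays = []
allDays.append(day1)
allDays.append(day2)
print(np.array(allDays).shape)  # (2, 1, 2, 2) -- an extra outer dimension appears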
FYI: I'm using Ubuntu 14.04 and Python 3.5 (Anaconda).
When you do:

allDays = []
allDays.append(day1)
allDays.append(day2)

you are making a list of references to the existing data rather than repackaging the data. You could instead do:
allDays = []
allDays.append(day1[:])
allDays.append(day2[:])
Now the data is copied out of day1 and into the new allDays list (one caveat: slicing copies a plain Python list, but on a NumPy array day1[:] returns a view rather than a copy, so use day1.copy() there). This will double your memory usage, so it is probably best to issue a del day1 after each addition to allDays.
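As a sketch of that pattern (filenames and process_file are placeholders for however you list and load your input files):

allDays = []
for fname in filenames:
    day = process_file(fname)   # hypothetical helper returning one day's array
    allDays.append(day.copy())  # keep our own copy of the data
    del day                     # release the original to limit peak memory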
Having said all that, if you use Pandas (usually recommended for time-series data) or NumPy, this whole thing would be a lot quicker and use a lot less memory. NumPy arrays cannot hold references the way Python lists can, so the copy there is implied. Hope that clears some things up for you :) I can also highly recommend this video by Ned.
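For instance, a minimal NumPy sketch of the merging step (assuming each dayN is already a (1, long, lat) array, as in the question):

import numpy as np

# Join the per-day arrays along the time axis (axis 0)
allDays = np.concatenate([day1, day2], axis=0)   # shape (2, long, lat)

# If each day were a 2-D (long, lat) slab instead, stack to create the time axis:
# allDays = np.stack([day1, day2], axis=0)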
With Python 3 you can now do something like this:
tst1 = [1, 2, 3]
tst2 = [4, 5, 6]
tst3 = [*tst1, *tst2]

which results in: [1, 2, 3, 4, 5, 6]
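Note that this iterable unpacking was added by PEP 448 in Python 3.5, so it matches the version in the question. It flattens exactly one level, which concatenates plain lists; for the 3-D arrays themselves, joining along the time axis with np.concatenate (as shown above) is the more direct tool.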