I'm newish to Python and even newer to pandas and NumPy. I'm trying to format a GPS RINEX file so that the data is split by satellite (32 in total). Each file (i.e. each satellite) should then be organized by epoch (30-second intervals), with each signal's data (7 signals in total) displayed in the corresponding column. For example:
SV1
2014-11-07 00:00:00 L1 L2 P1 P2 C1 S1 S2
2014-11-07 00:00:30 L1 L2 P1 P2 C1 S1 S2
2014-11-07 00:01:00 L1 L2 P1 P2 C1 S1 S2
The code, in particular the function, which I'm working on is:
def read_data_chunk(self, RINEXfile, CHUNK_SIZE = 10000):
    obss = np.empty((CHUNK_SIZE, TOTAL_SATS, len(self.obs_types)), dtype=np.float64) * np.NaN
    llis = np.zeros((CHUNK_SIZE, TOTAL_SATS, len(self.obs_types)), dtype=np.uint8)
    signal_strengths = np.zeros((CHUNK_SIZE, TOTAL_SATS, len(self.obs_types)), dtype=np.uint8)
    epochs = np.zeros(CHUNK_SIZE, dtype='datetime64[us]')
    flags = np.zeros(CHUNK_SIZE, dtype=np.uint8)
    i = 0
    while True:
        hdr = self.read_epoch_header(RINEXfile)
        #print hdr
        if hdr is None:
            break
        epoch, flags[i], sats = hdr
        epochs[i] = np.datetime64(epoch)
        sat_map = np.ones(len(sats)) * -1
        for n, sat in enumerate(sats):
            if sat[0] == 'G':
                sat_map[n] = int(sat[1:]) - 1
        obss[i], llis[i], signal_strengths[i] = self.read_obs(RINEXfile, len(sats), sat_map)
        i += 1
        if i >= CHUNK_SIZE:
            break
    print "obss.ndim: {0}".format(obss.ndim)
    print "obss.shape: {0}".format(obss.shape)
    print "obss.size: {0}".format(obss.size)
    print "obss.dtype: {0}".format(obss.dtype)
    print "obss.itemsize: {0}".format(obss.itemsize)
    print "obss: {0}".format(obss)
    y = np.split(obss, 32, 1)
    print "y.ndim: {0}".format(y[3].ndim)
    print "y.shape: {0}".format(y[3].shape)
    print "y.size: {0}".format(y[3].size)
    print "y_0: {0}".format(y[3])
    return obss[:i], llis[:i], signal_strengths[:i], epochs[:i], flags[:i]
The print statements are there just to understand the dimensions involved, the results of which:
obss.ndim: 3
obss.shape: (10000L, 32L, 7L)
obss.size: 2240000
obss.dtype: float64
obss.itemsize: 8
y.ndim: 3
y.shape: (10000L, 1L, 7L)
y.size: 70000
The exact problem I'm encountering is how to manipulate the array so that it is split into its 32 constituent parts (i.e. the satellites). Below is an example of the output so far:
sats = np.rollaxis(obss, 1, 0)
sat = sats[5] #sv6
sat.shape: (10000L, 7L)
sat.ndim: 2
sat.size: 70000
sat.dtype: float64
sat.itemsize: 8
sat: [[ -7.28308440e+06 -5.66279406e+06 2.38582902e+07 ..., 2.38582906e+07 4.70000000e+01 4.20000000e+01]
 [ -7.32362993e+06 -5.69438797e+06 2.38505736e+07 ..., 2.38505742e+07 4.70000000e+01 4.20000000e+01]
 [ -7.36367675e+06 -5.72559325e+06 2.38429526e+07 ..., 2.38429528e+07 4.60000000e+01 4.20000000e+01]
The output above is for the 6th satellite ("sat") and shows the signals for the first 3 epochs. I then tried the code below to write each satellite to its own file, but the resulting text files contained only the output shown underneath:
Code:
for i in range(32):
    sat = obss[:, i]
    open(((("sv{0}").format(sat)),'w').writelines(sat))
Output in text file:
ø ø ø ø ø ø ø
So obviously there's something wrong with the manipulation of the array that I'm overlooking. The read_data_chunk function is called from the read_data function:
def read_data(self, RINEXfile):
    obs_data_chunks = []
    while True:
        obss, _, _, epochs, _ = self.read_data_chunk(RINEXfile)
        if obss.shape[0] == 0:
            break
        obs_data_chunks.append(pd.Panel(
            np.rollaxis(obss, 1, 0),
            items=['G%02d' % d for d in range(1, 33)],
            major_axis=epochs,
            minor_axis=self.obs_types
        ).dropna(axis=0, how='all').dropna(axis=2, how='all'))
    print "obs_data_chunks: {0}".format(obs_data_chunks)
    self.data = pd.concat(obs_data_chunks, axis=1)
Next I tried working with the code above, since I figured obs_data_chunks is perhaps the right object to manipulate. The final print statement outputs:
obs_data_chunks: [<class 'pandas.core.panel.Panel'>
Dimensions: 32 (items) x 2880 (major_axis) x 7 (minor_axis)
Items axis: G01 to G32
Major_axis axis: 2014-04-27 00:00:00 to 2014-04-27 23:59:30
Minor_axis axis: L1 to S2]
I tried to figure out how to deal with the obs_data_chunks array using:
odc = np.rollaxis(obs_data_chunks, 1)
odc_temp = odc[5]
but received an error: AttributeError: 'list' object has no attribute 'ndim'
It depends on what exactly you want to do with these 32 satellite subsets. As far as I can tell, the way you currently have obss, with shape (10000, 32, 7), you already have it "split" in a way. Here's how you can access them:
Slice along the 'satellite' dimension, which is axis=1:
sat = obss[:, 0] # all the data for satellite 0, with shape (10000, 7)
sat = obss[:, i] # for any i from 0 through 31.
sats = obss[:, :3] # the first three satellites
If you find that you are mainly indexing by satellite, you can move its axis to the front with np.rollaxis:
sats = np.rollaxis(obss, 1)
sats.shape
# (32, 10000, 7)
sat = sats[i] # satellite i, equivalent to obss[:, i]
sat = sats[:3] # first three satellites
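To make the axis move concrete, here is a minimal sketch on a small stand-in array (4 epochs, 3 satellites, 2 observation types; these sizes are made up for illustration and are not the real data). It also shows np.moveaxis, the newer NumPy function that gives the same result here, and checks that the rolled array is a view sharing memory with obss rather than a copy:

```python
import numpy as np

# Small stand-in for obss: 4 epochs, 3 satellites, 2 observation types.
# (Hypothetical sizes; the real array is (10000, 32, 7).)
obss = np.arange(4 * 3 * 2, dtype=np.float64).reshape(4, 3, 2)

# Move the satellite axis (axis 1) to the front.
sats = np.rollaxis(obss, 1)
# np.moveaxis is the newer spelling of the same operation.
sats_alt = np.moveaxis(obss, 1, 0)

print(sats.shape)

# sats[i] holds the same values as obss[:, i] ...
same_values = np.array_equal(sats[1], obss[:, 1])
# ... and no data is copied: sats is a view onto obss.
is_view = np.shares_memory(sats, obss)
print(same_values, is_view)
```

Because the result is a view, modifying sats[i] in place also modifies obss.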
If you want to loop through the satellites, as you would in your y = np.split(obss) example, an easier way to do that is:
for i in range(32):
    sat = obss[:, i]
    ...
or, if you roll the axis for sats, you can just do:
sats = np.rollaxis(obss, 1)
for sat in sats:
    ...
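As one possible loop body, the sketch below builds a pandas DataFrame per satellite, indexed by epoch with one column per signal, which matches the layout described in the question. The data here is a made-up stand-in (only satellite index 5, i.e. SV6, has values), and the names frames, n_epochs and n_sats are introduced just for this illustration:

```python
import numpy as np
import pandas as pd

obs_types = ['L1', 'L2', 'P1', 'P2', 'C1', 'S1', 'S2']
n_epochs, n_sats = 4, 32  # hypothetical small example

# Stand-in data: NaN everywhere except satellite index 5 (SV6).
obss = np.full((n_epochs, n_sats, len(obs_types)), np.nan)
obss[:, 5, :] = np.arange(n_epochs * len(obs_types)).reshape(n_epochs, -1)

# Epochs at 30-second intervals, as in the question.
epochs = (np.datetime64('2014-11-07T00:00:00')
          + np.arange(n_epochs) * np.timedelta64(30, 's'))

frames = {}
for i, sat in enumerate(np.rollaxis(obss, 1)):
    # Rows are epochs, columns are signals; drop epochs with no data at all.
    df = pd.DataFrame(sat, index=epochs, columns=obs_types).dropna(how='all')
    if not df.empty:
        frames['G%02d' % (i + 1)] = df

print(sorted(frames))
print(frames['G06'])
```

A dict of DataFrames keyed by satellite name is only one design choice; it keeps each satellite's table independent, so each can be inspected or saved separately.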
Finally, if you really want a list of the satellites, you can do
sats = np.rollaxis(obss, 1)
satlist = list(sats)
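Regarding the file-writing attempt in the question: that code formats the file name with the array sat instead of the index i, and passes rows of floats to writelines, which expects strings (the unreadable characters in the output file are most likely the raw bytes of the floats). A minimal sketch of one alternative is to let np.savetxt format the numbers as text. The stand-in data, the output directory, the '.txt' suffix and the '%.3f' format are all assumptions made for this example:

```python
import os
import tempfile

import numpy as np

obs_types = ['L1', 'L2', 'P1', 'P2', 'C1', 'S1', 'S2']

# Stand-in data: 3 epochs, 32 satellites, 7 signals; only SV6 has values.
obss = np.full((3, 32, len(obs_types)), np.nan)
obss[:, 5, :] = 1.0

out_dir = tempfile.mkdtemp()  # hypothetical output location
written = []
for i, sat in enumerate(np.rollaxis(obss, 1)):
    if np.isnan(sat).all():
        continue  # skip satellites with no observations
    # Name the file after the satellite number, not the array itself.
    path = os.path.join(out_dir, 'sv{0}.txt'.format(i + 1))
    # savetxt writes one text row per epoch, with the signal names as a header.
    np.savetxt(path, sat, fmt='%.3f', header=' '.join(obs_types))
    written.append(path)

print(written)
```

This writes only the observation values. If the epoch timestamps should appear in each row as well, building a DataFrame per satellite and calling its to_csv method may be more convenient.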