I have a text file of temperature data that looks like this:
3438012868.0 0.0 21.7 22.6 22.5 22.5 21.2
3438012875.0 0.0 21.6 22.6 22.5 22.5 21.2
3438012881.9 0.0 21.7 22.5 22.5 22.5 21.2
3438012888.9 0.0 21.6 22.6 22.5 22.5 21.2
3438012895.8 0.0 21.6 22.5 22.6 22.5 21.3
3438012902.8 0.0 21.6 22.5 22.5 22.5 21.2
3438012909.7 0.0 21.6 22.5 22.5 22.5 21.2
3438012916.6 0.0 21.6 22.5 22.5 22.5 21.2
3438012923.6 0.0 21.6 22.6 22.5 22.5 21.2
3438012930.5 0.0 21.6 22.5 22.5 22.5 21.2
3438012937.5 0.0 21.7 22.5 22.5 22.5 21.2
3438012944.5 0.0 21.6 22.5 22.5 22.5 21.3
3438012951.4 0.0 21.6 22.5 22.5 22.5 21.2
3438012958.4 0.0 21.6 22.5 22.5 22.5 21.3
3438012965.3 0.0 21.6 22.6 22.5 22.5 21.2
3438012972.3 0.0 21.6 22.5 22.5 22.5 21.3
3438012979.2 0.0 21.6 22.6 22.5 22.5 21.2
3438012986.1 0.0 21.6 22.5 22.5 22.5 21.3
3438012993.1 0.0 21.6 22.5 22.6 22.5 21.2
3438013000.0 0.0 21.6 0.0 22.5 22.5 21.3
3438013006.9 0.0 21.6 22.6 22.5 22.5 21.2
3438013014.4 0.0 21.6 22.5 22.5 22.5 21.3
3438013021.9 0.0 21.6 22.5 22.5 22.5 21.3
3438013029.9 0.0 21.6 22.5 22.5 22.5 21.2
3438013036.9 0.0 21.6 22.6 22.5 22.5 21.2
3438013044.6 0.0 21.6 22.5 22.5 22.5 21.2
The full file is much longer; these are only the first few lines. The first column is a timestamp and the next six columns are temperature readings. I need to write a loop that finds the average of the six measurements but ignores readings of 0.0, because that just means the sensor wasn't turned on. Later in the file, the first temperature column does contain real readings. Is there a way to write an if statement, or some other approach, to average only the non-zero numbers in a list? Right now, I have:
from datetime import datetime, timedelta
import numpy as np

time = []
t1 = []
t2 = []
t3 = []
t4 = []
t5 = []
t6 = []
newdate = []
temps = open('file_path','r')
sepfile = temps.read().replace('\n','').split('\r')
temps.close()
for plotpair in sepfile:
    data = plotpair.split('\t')
    time.append(float(data[0]))
    t1.append(float(data[1]))
    t2.append(float(data[2]))
    t3.append(float(data[3]))
    t4.append(float(data[4]))
    t5.append(float(data[5]))
    t6.append(float(data[6]))
for data_seconds in time:
    date = datetime(1904,1,1,5,26,2)
    delta = timedelta(seconds=data_seconds)
    newdate.append(date+delta)
for datapoint in t2,t3,t4,t5,t6:
    temperatures = np.array([t2,t3,t4,t5,t6]).mean(0).tolist()
This only averages the last five measurements. I'm hoping to find a better method that ignores 0.0 values and includes the first temperature column when it is non-zero.
You can divide the sum() of a list of numbers by its len() to find the average, or use the mean() function from Python's statistics module.
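As a minimal sketch of the sum()/len() approach applied to this question, the zeros can be filtered out before averaging. The helper name nonzero_mean is made up for illustration, and returning None for an all-zero row is an assumption about what you would want in that case:

```python
def nonzero_mean(values):
    """Average the non-zero entries of a list (hypothetical helper)."""
    # Keep only readings from sensors that were switched on
    readings = [v for v in values if v != 0.0]
    # Assumption: an all-zero row has no meaningful average, so return None
    if not readings:
        return None
    return sum(readings) / len(readings)


# Temperature columns from one of the sample lines in the question
row = [0.0, 21.6, 0.0, 22.5, 22.5, 21.3]
print(nonzero_mean(row))  # approximately 21.975
```

Because the filter runs per list, the first temperature column is picked up automatically once it starts reporting non-zero values.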
Prior questions show you have NumPy installed. Using NumPy, you could set the zeros to NaN and then call np.nanmean to take the mean, ignoring NaNs:
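import numpy as np
# Load the whitespace-delimited file into a 2-D float array
data = np.genfromtxt('data')
# Mark zero readings (sensor off) as missing
data[data == 0] = np.nan
# Average the temperature columns of each row, skipping NaNs
means = np.nanmean(data[:, 1:], axis=1)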
yields
array([ 22.1 , 22.08 , 22.08 , 22.08 , 22.1 , 22.06 , 22.06 ,
22.06 , 22.08 , 22.06 , 22.08 , 22.08 , 22.06 , 22.08 ,
22.08 , 22.08 , 22.08 , 22.08 , 22.08 , 21.975, 22.08 ,
22.08 , 22.08 , 22.06 , 22.08 , 22.06 ])
You can compute a truncated (trimmed) mean using scipy.stats.tmean.
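A rough sketch of how that could look for one row, assuming SciPy is installed and using sample values copied from the question. Setting the lower limit to 0 and marking it as exclusive drops the zero readings. An all-zero row has no values inside the limits, so it would need separate handling:

```python
from scipy import stats

# Temperature columns from one of the sample lines in the question
row = [0.0, 21.6, 0.0, 22.5, 22.5, 21.3]

# Lower limit 0 with no upper limit; inclusive=(False, True) excludes
# values equal to the lower limit, i.e. the 0.0 readings themselves
row_mean = stats.tmean(row, limits=(0, None), inclusive=(False, True))
print(row_mean)  # approximately 21.975
```

This is applied to a single row here; for the whole file it could be called once per row.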
Alternatively, you can check whether float(data[X]) is equal to 0 before appending it to the corresponding list.
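A sketch of that check inside a parsing loop, under a few assumptions: fields are split on any whitespace rather than the tab and carriage-return handling in the question, the function name row_averages is invented for illustration, and rows with no active sensors get None. Collecting the non-zero readings per row, rather than appending to t1 through t6 separately, keeps each average aligned with its timestamp:

```python
def row_averages(lines):
    """Return (timestamps, averages), ignoring 0.0 readings (hypothetical helper)."""
    timestamps = []
    averages = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        timestamps.append(float(fields[0]))
        readings = []
        for field in fields[1:]:
            value = float(field)
            if value != 0.0:  # 0.0 means the sensor was off
                readings.append(value)
        # Assumption: None marks a row where every sensor was off
        averages.append(sum(readings) / len(readings) if readings else None)
    return timestamps, averages


# Two lines copied from the sample data in the question
sample = [
    "3438012993.1 0.0 21.6 22.5 22.6 22.5 21.2",
    "3438013000.0 0.0 21.6 0.0 22.5 22.5 21.3",
]
timestamps, averages = row_averages(sample)
print(averages)  # approximately [22.08, 21.975]
```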