I have some acceleration data and want to count the length of sequences that meet a set of conditions. Specifically, a sequence starts when the acceleration rises above 2.78 and ends when it drops back below 0.
An example would be
[-1.1, -1, 0, 1.2, 1.8, 2, 2.88, 2.86, 2.53, 1.98, 1.21, 0.89, 0.11, -0.21]
The return result here would be a count of 7 (2.88, 2.86, 2.53, 1.98, 1.21, 0.89, 0.11)
I have done this previously to identify the length of sequences strictly >2.78 using the following code. I need to build on this to provide lengths using 0 as the endpoint.
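import pandas as pd

def get_Accel_lengths(array):
    # Encode each sample as '1' if it is strictly above 2.78, otherwise '0'
    s = ''.join(['1' if i > 2.78 else '0' for i in array])
    # Splitting on '0' leaves only the unbroken runs of '1's
    parts = s.split('0')
    return [len(p) for p in parts if len(p) > 0]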
Q4Accel = get_Accel_lengths(resultsQ4['AccelInt'])
Q4Accel = pd.DataFrame(Q4Accel)
Q4Accel 
Using the above example, the result for this code would be 2 (2.88, 2.86)
The built-in len() function takes a sequence or collection (a list, tuple, dictionary, and so on) as its argument and returns the number of items in it.
Using itertools.dropwhile and takewhile:
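from itertools import dropwhile, takewhile

l = [-1.1, -1, 0, 1.2, 1.8, 2, 2.88, 2.86, 2.53, 1.98, 1.21, 0.89, 0.11, -0.21]
# Skip everything up to the first value above 2.78, then keep values until one drops below 0
list(takewhile(lambda x: x >= 0, dropwhile(lambda x: x <= 2.78, l)))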
Output:
[2.88, 2.86, 2.53, 1.98, 1.21, 0.89, 0.11]
Or just to get len:
sum(1 for _ in takewhile(lambda x: x >= 0, dropwhile(lambda x: x <= 2.78, l)))
# 7
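One thing worth checking is what this pair does when the pattern occurs more than once. The sketch below is only an illustration: it repeats the example list twice (the name doubled is made up here) and shows that dropwhile/takewhile stops after the first qualifying run, so the second run is never reported.

```python
from itertools import dropwhile, takewhile

run = [-1.1, -1, 0, 1.2, 1.8, 2, 2.88, 2.86, 2.53, 1.98, 1.21, 0.89, 0.11, -0.21]
doubled = run + run  # hypothetical test data: the same pattern twice in a row

# dropwhile skips to the first value above 2.78;
# takewhile then stops for good at the first value below 0.
first_only = list(takewhile(lambda x: x >= 0, dropwhile(lambda x: x <= 2.78, doubled)))

print(len(first_only))  # 7: only the first run is captured
```

So this is fine for a single sequence, but a different approach is needed to get one length per occurrence.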
Will this work if this occurs multiple times in the dataset? I want to identify each occurrence.
Let's switch from takewhile and dropwhile to groupby with a global boolean flag to identify multiple sequences.  I'm simply going to concatenate your data onto itself to simulate two sequences:
from itertools import groupby
def keyfunc(datum):
    global in_sequence
    if datum < 0:
        in_sequence = False
    elif datum > 2.78:
        in_sequence = True
    return in_sequence
data = [
    -1.1, -1, 0, 1.2, 1.8, 2, 2.88, 2.86,
    2.53, 1.98, 1.21, 0.89, 0.11, -0.21,
    -1.1, -1, 0, 1.2, 1.8, 2, 2.88, 2.86,
    2.53, 1.98, 1.21, 0.89, 0.11, -0.21,
]
sequences = []
in_sequence = False
for valid, sequence in groupby(data, keyfunc):
    if valid:
        sequences.append(list(sequence))
print(*sequences, sep='\n')
print(*map(len, sequences), sep='\n')
OUTPUT
> python3 test.py
[2.88, 2.86, 2.53, 1.98, 1.21, 0.89, 0.11]
[2.88, 2.86, 2.53, 1.98, 1.21, 0.89, 0.11]
7
7
> 
Is it possible to tighten it up so that it only returns the lengths? I want to convert them into a DataFrame and export to CSV.
Perhaps something like this:
from itertools import groupby
data = [
    -1.1, -1, 0, 1.2, 1.8, 2, 2.88, 2.86,
    2.53, 1.98, 1.21, 0.89, 0.11, -0.21,
    -1.1, -1, 0, 1.2, 1.8, 2, 2.88, 2.86,
    2.53, 1.98, 1.21, 0.89, 0.11, -0.21,
]
def sequence_lengths(data):
    in_sequence = False
    def keyfunc(datum):
        nonlocal in_sequence
        if datum < 0:
            in_sequence = False
        elif datum > 2.78:
            in_sequence = True
        return in_sequence
    lengths = []
    for valid, sequence in groupby(data, keyfunc):
        if valid:
            lengths.append(len(list(sequence)))
    return lengths
print(sequence_lengths(data))
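If the stateful key function feels a bit indirect, the same start/stop rule can also be written as a plain loop. This is just a sketch under the thresholds from the question: run_lengths and its start/stop parameters are names invented for this example, and the CSV is written to an in-memory buffer with the standard csv module so the snippet runs without any files or third-party packages.

```python
import csv
import io

def run_lengths(values, start=2.78, stop=0.0):
    """Length of each run that begins above `start` and ends when a value falls below `stop`."""
    lengths = []
    count = 0  # 0 means we are not currently inside a run
    for v in values:
        if count and v < stop:
            # The run has ended; record its length and reset
            lengths.append(count)
            count = 0
        elif count or v > start:
            # Either already inside a run, or this value starts one
            count += 1
    if count:
        # A run that is still open when the data ends is counted too
        lengths.append(count)
    return lengths

data = [
    -1.1, -1, 0, 1.2, 1.8, 2, 2.88, 2.86,
    2.53, 1.98, 1.21, 0.89, 0.11, -0.21,
] * 2

lengths = run_lengths(data)
print(lengths)  # [7, 7]

# Stand-in for open("lengths.csv", "w", newline="") when writing a real file
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["length"])
writer.writerows([n] for n in lengths)
print(buffer.getvalue())
```

If you would rather stay in pandas, pd.DataFrame({'length': lengths}).to_csv('lengths.csv', index=False) should give the same single-column file.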