Length (count) of sequences with start and end condition Python

Tags:

python

python-3.x

conditional-statements

I have some acceleration data and want to count the length of each sequence that meets a pair of conditions. In this case, a sequence starts when the acceleration rises above 2.78 and ends when it drops back below 0.

An example would be

[-1.1, -1, 0, 1.2, 1.8, 2, 2.88, 2.86, 2.53, 1.98, 1.21, 0.89, 0.11, -0.21]

The expected result here is a count of 7 (2.88, 2.86, 2.53, 1.98, 1.21, 0.89, 0.11).

I have done this previously to identify the length of sequences strictly >2.78 using the following code. I need to build on this to provide lengths using 0 as the endpoint.

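import pandas as pd

def get_Accel_lengths(array):
    # Encode each sample as '1' (strictly above 2.78) or '0', then measure the runs of '1's.
    # Iterate over the argument rather than the global resultsQ4 column.
    s = ''.join(['1' if i > 2.78 else '0' for i in array])
    parts = s.split('0')
    return [len(p) for p in parts if len(p) > 0]

Q4Accel = get_Accel_lengths(resultsQ4['AccelInt'])
Q4Accel = pd.DataFrame(Q4Accel)
Q4Accel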

Using the above example, the result for this code would be 2 (2.88, 2.86)


asked Jul 10 '20 04:07

Jake

2 Answers

Using itertools.dropwhile and takewhile:

from itertools import dropwhile, takewhile

l = [-1.1, -1, 0, 1.2, 1.8, 2, 2.88, 2.86, 2.53, 1.98, 1.21, 0.89, 0.11, -0.21]
list(takewhile(lambda x: x > 0, dropwhile(lambda x: x < 2.78, l)))

Output:

[2.88, 2.86, 2.53, 1.98, 1.21, 0.89, 0.11]

Or just to get len:

sum(1 for _ in takewhile(lambda x: x > 0, dropwhile(lambda x: x < 2.78, l)))
# 7
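As written, this finds only the first sequence. One possible extension, sketched here and not part of the original answer, is to run the same dropwhile/takewhile pair repeatedly over a single shared iterator. The function name run_lengths and its start and end parameters are made up for illustration, and the sketch assumes start is greater than end, so an empty result can only mean the data is exhausted.

```python
from itertools import dropwhile, takewhile

def run_lengths(values, start=2.78, end=0):
    """Lengths of runs that begin at a value >= start and last while values stay > end."""
    it = iter(values)  # one shared iterator, so each pass resumes where the last one stopped
    lengths = []
    while True:
        run = list(takewhile(lambda x: x > end,
                             dropwhile(lambda x: x < start, it)))
        if not run:  # nothing left to find
            break
        lengths.append(len(run))
    return lengths

l = [-1.1, -1, 0, 1.2, 1.8, 2, 2.88, 2.86, 2.53, 1.98, 1.21, 0.89, 0.11, -0.21]
print(run_lengths(l + l))  # [7, 7]
```

takewhile consumes the first value that fails its test, which is harmless here because that value is at or below end and so could never start a new run.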


answered Nov 27 '22 01:11

Chris

Will this work if this occurs multiple times in the dataset? I want to identify each one.

Let's switch from takewhile and dropwhile to groupby with a global boolean flag to identify multiple sequences. I'm simply going to concatenate your data onto itself to simulate two sequences:

from itertools import groupby

def keyfunc(datum):
    global in_sequence

    if datum < 0:
        in_sequence = False
    elif datum > 2.78:
        in_sequence = True

    return in_sequence

data = [
    -1.1, -1, 0, 1.2, 1.8, 2, 2.88, 2.86,
    2.53, 1.98, 1.21, 0.89, 0.11, -0.21,
    -1.1, -1, 0, 1.2, 1.8, 2, 2.88, 2.86,
    2.53, 1.98, 1.21, 0.89, 0.11, -0.21,
]

sequences = []
in_sequence = False

for valid, sequence in groupby(data, keyfunc):
    if valid:
        sequences.append(list(sequence))

print(*sequences, sep='\n')
print(*map(len, sequences), sep='\n')

OUTPUT

> python3 test.py
[2.88, 2.86, 2.53, 1.98, 1.21, 0.89, 0.11]
[2.88, 2.86, 2.53, 1.98, 1.21, 0.89, 0.11]
7
7
>

Is it possible to tighten it up, though, so that it provides only the lengths? I want to convert them into a df and export to csv.

Perhaps something like this:

from itertools import groupby

data = [
    -1.1, -1, 0, 1.2, 1.8, 2, 2.88, 2.86,
    2.53, 1.98, 1.21, 0.89, 0.11, -0.21,
    -1.1, -1, 0, 1.2, 1.8, 2, 2.88, 2.86,
    2.53, 1.98, 1.21, 0.89, 0.11, -0.21,
]

def sequence_lengths(data):
    in_sequence = False

    def keyfunc(datum):
        nonlocal in_sequence

        if datum < 0:
            in_sequence = False
        elif datum > 2.78:
            in_sequence = True

        return in_sequence

    lengths = []

    for valid, sequence in groupby(data, keyfunc):
        if valid:
            lengths.append(len(list(sequence)))

    return lengths

print(sequence_lengths(data))
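If the stateful key function feels too indirect, the same rule can be written as a plain loop. This is a rough sketch under the same conditions as above (a sequence starts above 2.78 and ends below 0). The name sequence_lengths_loop and the start and end parameters are assumptions made for illustration.

```python
def sequence_lengths_loop(data, start=2.78, end=0):
    """Count the length of each sequence that starts above `start` and ends below `end`."""
    lengths = []
    count = 0  # 0 means we are not currently inside a sequence
    for datum in data:
        if datum < end:
            if count:  # a sequence just ended
                lengths.append(count)
            count = 0
        elif count or datum > start:
            count += 1  # either continuing a sequence or starting a new one
    if count:  # the data ended while still inside a sequence
        lengths.append(count)
    return lengths

data = [
    -1.1, -1, 0, 1.2, 1.8, 2, 2.88, 2.86,
    2.53, 1.98, 1.21, 0.89, 0.11, -0.21,
] * 2

print(sequence_lengths_loop(data))  # [7, 7]
```

The returned list can be passed to pd.DataFrame in the same way as the result of get_Accel_lengths in the question.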

answered Nov 27 '22 01:11

cdlane
