When slicing a DataFrame with loc, as in df.loc[start:end], both start and end are included. Is there an easy way to exclude the end when using loc?


Pandas slicing excluding the end

3 Answers

Easiest I can think of is df.loc[start:end].iloc[:-1].

Chops off the last one.
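A minimal sketch of this approach on a small, made-up frame (the data and variable names are assumptions for illustration). It also shows a caveat: when the end label is missing from a sorted index, loc already stops before it, so trimming the last row drops a row that should have been kept.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4]}, index=["a", "b", "c", "d"])

# loc is label-based and includes both endpoints.
inclusive = df.loc["a":"c"]

# Dropping the last row by position leaves the half-open range [a, c).
exclusive = df.loc["a":"c"].iloc[:-1]

print(list(inclusive.index))  # ['a', 'b', 'c']
print(list(exclusive.index))  # ['a', 'b']

# Caveat: 'c' is not in this (sorted) index, so loc returns rows 'a' and 'b',
# and iloc[:-1] then removes 'b' even though 'b' < 'c'.
df2 = pd.DataFrame({"A": [1, 2, 3]}, index=["a", "b", "d"])
too_short = df2.loc["a":"c"].iloc[:-1]
print(list(too_short.index))  # ['a']
```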

answered Oct 23 '22 09:10

WillZ

loc includes both the start and the end. A less-than-ideal workaround is to look up the integer positions of the labels with get_loc and slice with iloc (this assumes the index has no duplicate labels):

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4]}, index=['a', 'b', 'c', 'd'])

df.iloc[df.index.get_loc('a'):df.index.get_loc('c')]

#   A
#a  1
#b  2

df.loc['a':'c']

#   A
#a  1
#b  2
#c  3
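get_loc raises a KeyError when a label is not in the index. As a variation on the idea above (not the answerer's method), one possible sketch uses Index.searchsorted to find the positions instead, assuming the index is sorted in increasing order. The helper name loc_exclusive and the integer-labelled frame are hypothetical.

```python
import pandas as pd


def loc_exclusive(df, start, end):
    """Return rows with start <= label < end.

    Assumes df.index is sorted in increasing order; the labels themselves
    do not need to be present in the index.
    """
    idx = df.index
    lo = idx.searchsorted(start, side="left")  # first position with label >= start
    hi = idx.searchsorted(end, side="left")    # first position with label >= end
    return df.iloc[lo:hi]


df = pd.DataFrame({"A": [1, 2, 3, 4]}, index=[10, 20, 30, 40])

in_index = loc_exclusive(df, 10, 30)      # end label exists and is excluded
not_in_index = loc_exclusive(df, 10, 35)  # end label is absent from the index

print(list(in_index.index))      # [10, 20]
print(list(not_in_index.index))  # [10, 20, 30]
```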

answered Oct 23 '22 10:10

Psidom

None of the other answers addresses the case where end is not in the index. A more general solution is to compare the index to start and end directly, which lets you make either bound inclusive or exclusive.

df[(df.index >= start) & (df.index < end)]

For instance:

>>> import pandas as pd
>>> import numpy as np

>>> df = pd.DataFrame(
    {
        "x": np.arange(48),
        "y": np.arange(48) * 2,
    },
    index=pd.date_range("2020-01-01 00:00:00", freq="1h", periods=48)
)

>>> start = "2020-01-01 14:00"
>>> end = "2020-01-01 19:30" # this is not in the index

>>> df[(df.index >= start) & (df.index < end)]

                      x   y
2020-01-01 14:00:00  14  28
2020-01-01 15:00:00  15  30
2020-01-01 16:00:00  16  32
2020-01-01 17:00:00  17  34
2020-01-01 18:00:00  18  36
2020-01-01 19:00:00  19  38
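To make the choice of inclusive or exclusive bounds explicit, the comparison can be wrapped in a small helper. This is only a sketch: the function name between_index and its include_start / include_end parameters are hypothetical, not part of pandas.

```python
import numpy as np
import pandas as pd


def between_index(df, start, end, include_start=True, include_end=False):
    """Select rows whose index lies between start and end.

    Each bound can be made inclusive or exclusive independently.
    Works by comparison, so neither bound has to be in the index.
    """
    idx = df.index
    lower = idx >= start if include_start else idx > start
    upper = idx <= end if include_end else idx < end
    return df[lower & upper]


df = pd.DataFrame(
    {"x": np.arange(48)},
    # A Timedelta is used for the frequency to avoid version-specific aliases.
    index=pd.date_range("2020-01-01", freq=pd.Timedelta(hours=1), periods=48),
)
start = pd.Timestamp("2020-01-01 14:00")
end = pd.Timestamp("2020-01-01 19:00")

half_open = between_index(df, start, end)                # 14:00 <= t < 19:00
closed = between_index(df, start, end, include_end=True)  # 14:00 <= t <= 19:00

print(half_open["x"].tolist())  # [14, 15, 16, 17, 18]
print(closed["x"].tolist())     # [14, 15, 16, 17, 18, 19]
```

Unlike positional slicing, the boolean mask does not require a sorted or unique index, at the cost of comparing every index entry.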

answered Oct 23 '22 08:10

Giorgio Balestrieri
