
What is the NumPy equivalent of DataFrame.loc in pandas?

I have a 120,000 x 4 NumPy array, as shown below. Each row is a sample. The first column is the time in seconds, or the index in pandas terminology.

0.014      14.175  -29.97  -22.68 
0.022      13.905  -29.835 -22.68
0.030      12.257  -29.32  -22.67
... ...
1259.980   -0.405   2.205   3.825
1259.991   -0.495   2.115   3.735

I want to select the rows recorded between 100.000 and 200.000 seconds and save them into a new array. If this were a pandas DataFrame, I would simply write df.loc[100:200]. What is the equivalent operation in NumPy?

This is NOT a question of feasibility. I simply wonder if there are any pythonic one-line solutions.

asked Jul 24 '18 23:07 by F.S.



2 Answers

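If I understand correctly, a boolean mask on the first column does this. Note that it matches df.loc[100:200] only when the index is sorted: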

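import numpy as np

x = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12],
              [13, 14, 15, 16]])

# keep the rows whose first column lies in [5, 9], both ends inclusive
x[(x[:, 0] >= 5) & (x[:, 0] <= 9)]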

For your data, use 100 and 200 instead of 5 and 9.
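Applied to data shaped like the question's, the mask might look like the sketch below. The array is synthetic, since the real samples are not available, and the names `data`, `t`, `selected`, and `selected_sorted` are only illustrative. Because the time column is sorted, `np.searchsorted` is shown as an alternative that finds the slice bounds by binary search and returns a view instead of a copy.

```python
import numpy as np

# Synthetic stand-in for the 120,000 x 4 array: column 0 is time in seconds.
rng = np.random.default_rng(0)
t = np.linspace(0.014, 1259.991, 120_000)
data = np.column_stack([t, rng.normal(size=(t.size, 3))])

# Boolean mask: rows with 100 <= time <= 200 (both ends inclusive, like .loc).
mask = (data[:, 0] >= 100) & (data[:, 0] <= 200)
selected = data[mask]

# Alternative for a sorted time column: binary search for the slice bounds.
start = np.searchsorted(data[:, 0], 100, side="left")   # first row with time >= 100
stop = np.searchsorted(data[:, 0], 200, side="right")   # first row with time > 200
selected_sorted = data[start:stop]
```

Both results contain the same rows. The mask also works on unsorted data, while the `searchsorted` version relies on the time column being sorted.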


For a more general solution, check Wen's answer.

answered Sep 30 '22 04:09 by rafaelc


Using the data from rafaelc's answer:

x[np.where(x[:,0]==5)[0][0]:np.where(x[:,0]==9)[0][0]+1,:]
Out[341]: 
array([[ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

Note that greater-than and less-than comparisons alone cannot fully replace .loc. A .loc slice is resolved by the positions of the start and end labels in the index, not by a range of values.

For example, with an unsorted index:

df
Out[348]: 
       0   1   2   3
0      1   2   3   4
1      5   6   7   8
4444   9  10  11  12
3     13  14  15  16

df.loc[1:3]
Out[347]: 
       0   1   2   3
1      5   6   7   8
4444   9  10  11  12
3     13  14  15  16
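The difference can be reproduced with the sketch below, which rebuilds the frame above and compares the label slice with a value-range mask and with a position-based slice. The variable names (`values`, `labels`, `by_label`, `by_value`, `by_position`) are illustrative, and the row labels are kept in a separate array here rather than in the first column.

```python
import numpy as np
import pandas as pd

values = np.array([[1, 2, 3, 4],
                   [5, 6, 7, 8],
                   [9, 10, 11, 12],
                   [13, 14, 15, 16]])
labels = np.array([0, 1, 4444, 3])  # unsorted row labels, as in the example
df = pd.DataFrame(values, index=labels)

# Label slice: everything between the positions of labels 1 and 3, inclusive.
by_label = df.loc[1:3].to_numpy()

# Value-range mask: keeps only labels within [1, 3], so the 4444 row is dropped.
by_value = values[(labels >= 1) & (labels <= 3)]

# Position-based slice: find where each label sits, then slice between them.
start = np.flatnonzero(labels == 1)[0]
stop = np.flatnonzero(labels == 3)[0]
by_position = values[start:stop + 1]
```

Here `by_position` holds the same three rows as `by_label`, while `by_value` holds only two.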
answered Sep 30 '22 04:09 by BENY