I have a dataframe x:
x = pd.DataFrame(np.random.randn(3,3), index=[1,2,3], columns=['A', 'B', 'C'])
x
          A         B         C
1  0.256668 -0.338741  0.733561
2  0.200978  0.145738 -0.409657
3 -0.891879  0.039337  0.400449
and I would like to select a bunch of (index, column) pairs to populate a new Series. For example, I could select [(1, 'A'), (1, 'B'), (1, 'A'), (3, 'C')], which would generate a list, array, or Series with 4 elements:
[0.256668, -0.338741, 0.256668, 0.400449]
Any idea of how I should do that?
.iloc selects rows by integer position, so to select the 5th row in a DataFrame you would use df.iloc[[4]], since the first row is at position 0, the second row is at position 1, and so on. .loc selects rows based on a labeled index.
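To make the position/label distinction concrete, here is a minimal sketch. The frame `df`, its `value` column, and the labels 101 to 105 are made up for illustration and are not part of the question's data.

```python
import pandas as pd

# Hypothetical frame whose index labels do not start at 0,
# so that "position" and "label" are clearly different things.
df = pd.DataFrame({"value": [10, 20, 30, 40, 50]},
                  index=[101, 102, 103, 104, 105])

fifth_by_position = df.iloc[[4]]  # 5th row: positions start at 0
row_by_label = df.loc[[103]]      # the row whose index label is 103

print(fifth_by_position)
print(row_by_label)
```

Passing a list (`[[4]]`, `[[103]]`) keeps the result a DataFrame; a bare scalar would return the row as a Series instead.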
To select a single column, use square brackets [] with the column name of the column of interest.
You can get the unique values of one or more columns of a pandas DataFrame using Series.unique() or the top-level pandas.unique() function. Series.unique() returns the unique values of a single column; pandas.unique() accepts a one-dimensional array, so values from several columns have to be flattened into one array first.
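A short sketch of both calls, using a made-up two-column frame (the names `df`, `A`, and `B` here are illustrative only). The multi-column case assumes you want unique values across the columns combined, not unique rows.

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({"A": ["x", "y", "x"], "B": ["y", "z", "z"]})

# Unique values of a single column
unique_a = df["A"].unique()

# Unique values across two columns: flatten to 1-D first,
# because pd.unique expects a one-dimensional input.
unique_ab = pd.unique(df[["A", "B"]].to_numpy().ravel())

print(unique_a)
print(unique_ab)
```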
Often you may want to select the columns of a pandas DataFrame based on their index value. If you’d like to select columns based on integer position, you can use the .iloc indexer. If you’d like to select columns based on their labels, you can use the .loc indexer.
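For example, on a frame shaped like the one in the question (filled with `np.arange` here instead of random numbers, purely so the output is reproducible), the first and third columns can be picked either way:

```python
import numpy as np
import pandas as pd

# Same shape and labels as the question's frame, deterministic values
x = pd.DataFrame(np.arange(9).reshape(3, 3),
                 index=[1, 2, 3], columns=["A", "B", "C"])

by_position = x.iloc[:, [0, 2]]   # columns at positions 0 and 2
by_label = x.loc[:, ["A", "C"]]   # columns labelled "A" and "C"

print(by_position)
print(by_label)
```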
A pandas DataFrame is a structure that represents data in a tabular format. It contains columns and rows, and each column can hold a different data type. You can select specific columns from a DataFrame using the column name.
DataFrame.loc[]: this indexer is used for labels. Collectively, these accessors are called the indexers, and they are by far the most common ways to index data. These four functions help in getting the elements, rows, and columns from a DataFrame. The term "indexing operator" refers to the square brackets following an object.
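I think get_value() and lookup() are faster. Note that both have since been removed from pandas: get_value() in 1.0 (use .at instead) and lookup() in 2.0, so the code below only runs on older versions: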
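import numpy as np
import pandas as pd
x = pd.DataFrame(np.random.randn(3,3), index=[1,2,3], columns=['A', 'B', 'C'])
locations = [(1, "A"), (1, "B"), (1, "A"), (3, "C")]
# single value by label (get_value() needs pandas < 1.0; x.at[1, "A"] is the replacement)
print(x.get_value(1, "A"))
# split the pairs into parallel tuples of row labels and column labels
row_labels, col_labels = zip(*locations)
# one value per (row, column) pair (lookup() needs pandas < 2.0)
print(x.lookup(row_labels, col_labels))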
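On pandas versions where get_value() and lookup() are no longer available, roughly the same result can be obtained with `.at` for a single value and `Index.get_indexer` plus NumPy integer indexing for many pairs. This is a sketch of one possible replacement, not the only one. It assumes the index and columns are unique, and the frame is filled with `np.arange` rather than random numbers so the result is predictable.

```python
import numpy as np
import pandas as pd

x = pd.DataFrame(np.arange(9.0).reshape(3, 3),
                 index=[1, 2, 3], columns=["A", "B", "C"])
locations = [(1, "A"), (1, "B"), (1, "A"), (3, "C")]

# Scalar access by label
single = x.at[1, "A"]

# Translate labels to integer positions, then index the underlying array.
# get_indexer returns -1 for a label that is not found, which would
# silently pick the last row/column, so check for it if labels may be missing.
row_labels, col_labels = zip(*locations)
row_idx = x.index.get_indexer(list(row_labels))
col_idx = x.columns.get_indexer(list(col_labels))
values = x.to_numpy()[row_idx, col_idx]

print(single)
print(values)
```

Wrapping `values` in `pd.Series(values)` gives the Series the question asks for.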
If your pairs are positions instead of index/column names, index the underlying array directly:
row_position = [0,0,0,2]
col_position = [0,1,0,2]
x.values[row_position, col_position]
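Or get the positions from np.searchsorted. With a sorter, the result refers to the sorted order, so map it back through the sorter:
sorter = np.argsort(x.index.values)
row_position = sorter[np.searchsorted(x.index.values, row_labels, sorter=sorter)]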
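One thing to watch with the searchsorted route: when a `sorter` is passed, the returned integers are positions in the sorted order of the index, not in the frame itself. For the sorted index [1, 2, 3] the two coincide, but for an unsorted index they do not. The sketch below uses a deliberately unsorted index (my own variation on the question's frame) and assumes every label is actually present in the index.

```python
import numpy as np
import pandas as pd

# Same columns as before, but with the index deliberately out of order
x = pd.DataFrame(np.arange(9.0).reshape(3, 3),
                 index=[3, 1, 2], columns=["A", "B", "C"])
row_labels = [1, 1, 1, 3]

index_values = x.index.to_numpy()
sorter = np.argsort(index_values)  # positions that would sort the index

# Positions within the *sorted* view of the index
in_sorted = np.searchsorted(index_values, row_labels, sorter=sorter)

# Map back to positions in the original, unsorted index
row_position = sorter[in_sorted]

print(row_position)
print(x.index[row_position].tolist())
```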