I have the following dataframe:
import pandas as pd

df = pd.DataFrame([[0, 1, 7, 0, 1, 8, 3, 0],
                   [7, 3, 4, 0, 4, 9, 7, 0]],
                  columns=pd.MultiIndex.from_product([["first", "second"],
                                                      ["A", "B", "C", "D"]]))
print(df)
  first          second
      A  B  C  D      A  B  C  D
0     0  1  7  0      1  8  3  0
1     7  3  4  0      4  9  7  0
I want to check whether each value in first is present in any of the columns of second. Only values in the same row should be compared.
The resulting dataframe should look like this:
      A      B      C     D
0  True   True  False  True
1  True  False   True  True
What is the best way of doing this? I have already played around with df["first"].isin(df["second"]), but it only compares A with A, B with B, and so on. I also tried combining it with .any(), but I can't seem to make it work.
Your help is greatly appreciated!
Thank you in advance.
np.any(df['first'].T.values[:, :, None] == df['second'].values, axis=-1).T
array([[ True,  True, False,  True],
       [ True, False,  True,  True]])
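The broadcast comparison returns a bare NumPy array, so the row and column labels are lost. Here is a minimal sketch of how it could be wrapped back into a labelled DataFrame, assuming the df from the question. The names first, second, matches and result are only illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[0, 1, 7, 0, 1, 8, 3, 0],
                   [7, 3, 4, 0, 4, 9, 7, 0]],
                  columns=pd.MultiIndex.from_product([["first", "second"],
                                                      ["A", "B", "C", "D"]]))

first = df["first"].to_numpy()    # shape (rows, 4)
second = df["second"].to_numpy()  # shape (rows, 4)

# Compare every value of `first` with every value of `second` in the same row:
# (rows, 4, 1) == (rows, 1, 4) broadcasts to (rows, 4, 4).
matches = first[:, :, None] == second[:, None, :]

# A value counts as present if it equals at least one value in its row of `second`.
result = pd.DataFrame(matches.any(axis=-1),
                      index=df.index,
                      columns=df["first"].columns)
print(result)
```

This variant avoids the two transposes in the one-liner. The intermediate boolean array has one entry per row for each pair of first/second columns, which may matter for very wide frames.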
Another solution:
df['first'].apply(lambda x: x.isin(df.loc[x.name, 'second']), axis=1)
Output:
      A      B      C     D
0  True   True  False  True
1  True  False   True  True
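As for why the attempt in the question only compared A with A: when DataFrame.isin is given another DataFrame, it matches on both index and column labels rather than searching the whole row. Here is a small sketch reproducing that behaviour, assuming the same df as in the question (the name aligned is only illustrative):

```python
import pandas as pd

df = pd.DataFrame([[0, 1, 7, 0, 1, 8, 3, 0],
                   [7, 3, 4, 0, 4, 9, 7, 0]],
                  columns=pd.MultiIndex.from_product([["first", "second"],
                                                      ["A", "B", "C", "D"]]))

# With a DataFrame argument, isin aligns on index and column labels,
# so first/A is only checked against second/A, first/B against second/B, etc.
aligned = df["first"].isin(df["second"])
print(aligned)
```

Only column D comes out True here, because it is the only column where first and second hold the same value in the same position. That is why a row-wise comparison, as in the two solutions above, is needed.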