
How to extract tuple values in a pandas dataframe for use with matplotlib?

I have the following dataframe:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

x = np.arange(10)
x = np.concatenate((x,x))
y = []
for i in range(2):
    # random_integers is gone from recent NumPy; randint excludes the upper bound
    y.append(np.random.randint(0, 11, 20))

d = {'A': [(x[i], y[0][i]) for i in range(20)],
    'B': [(x[i], y[1][i]) for i in range(20)]} 
df = pd.DataFrame(d, index = list('aaaaaaaaaabbbbbbbbbb'))

df

    A        B
a  (0, 2)  (0, 10)
a  (1, 0)   (1, 8)
a  (2, 3)   (2, 8)
a  (3, 7)   (3, 8)
a  (4, 8)  (4, 10)
a  (5, 2)   (5, 0)
a  (6, 1)   (6, 4)
a  (7, 3)   (7, 9)
a  (8, 4)   (8, 4)
a  (9, 4)  (9, 10)
b  (0, 0)   (0, 3)
b  (1, 2)  (1, 10)
b  (2, 8)   (2, 3)
b  (3, 1)   (3, 7)
b  (4, 6)   (4, 1)
b  (5, 8)   (5, 3)
b  (6, 1)   (6, 4)
b  (7, 1)   (7, 1)
b  (8, 2)   (8, 7)
b  (9, 9)   (9, 3)

How do I make the following plots?

Plot 1 is for column 'A' and has 2 lines (one for index = a, the other for index = b). The x values are the first elements of the tuples and the y values are the second elements.

Plot 2 is for column 'B'; the rest is the same as plot 1.

I cannot figure out how I can extract values from the tuples in the dataframe.

In addition, will groupby be helpful in this case?

In reality, I have about a thousand columns of data and 5 groups of ~500 rows each, so I'm looking for a quick way to solve this (dataframe size ~2500 x 1000).

Thanks a lot

Asked May 03 '16 06:05 by HP Peng



2 Answers

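Here is how to unpack your tuples using zip. The * passes each tuple in the column to zip as a separate argument, and zip regroups them into one sequence of first elements and one sequence of second elements.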

df['A.x'], df['A.y'] = zip(*df.A)
df['B.x'], df['B.y'] = zip(*df.B)

>>> df.head()
        A       B  A.x  A.y  B.x  B.y
a  (0, 6)  (0, 0)    0    6    0    0
a  (1, 8)  (1, 4)    1    8    1    4
a  (2, 8)  (2, 5)    2    8    2    5
a  (3, 5)  (3, 2)    3    5    3    2
a  (4, 2)  (4, 4)    4    2    4    4
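
The question also asks about the plots and about groupby, which the snippet above does not cover. What follows is only a rough sketch of one way to combine the same zip unpacking with a groupby on the index. It rebuilds a small frame shaped like the one in the question; the seeded generator, the Agg backend, and the figure layout are my own assumptions, not part of the original post.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Rebuild a frame like the one in the question (seed chosen arbitrarily).
rng = np.random.default_rng(0)
x = np.tile(np.arange(10), 2)
df = pd.DataFrame(
    {
        "A": list(zip(x, rng.integers(0, 11, 20))),
        "B": list(zip(x, rng.integers(0, 11, 20))),
    },
    index=list("a" * 10 + "b" * 10),
)

# One subplot per column, one line per index label.
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
for ax, col in zip(axes, df.columns):
    for label, group in df[col].groupby(level=0):
        xs, ys = zip(*group)  # split the (x, y) tuples of this group
        ax.plot(xs, ys, marker="o", label=label)
    ax.set_title(f"Column {col}")
    ax.legend()
```

In an interactive session you would drop the matplotlib.use line and call plt.show(), or call fig.savefig with a file name of your choice. With about a thousand columns you would probably loop over the columns in batches instead of drawing them all in one figure.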
Answered Sep 30 '22 09:09 by Alexander


I think you can use indexing with the .str accessor alone:

df['a1'], df['a2'] = df['A'].str[0], df['A'].str[1]
df['b1'], df['b2'] = df['B'].str[0], df['B'].str[1]

print (df)
         A       B  a1  a2  b1  b2
a   (0, 5)  (0, 1)   0   5   0   1
a   (1, 0)  (1, 5)   1   0   1   5
a   (2, 3)  (2, 9)   2   3   2   9
a   (3, 3)  (3, 8)   3   3   3   8
a   (4, 7)  (4, 9)   4   7   4   9
a   (5, 9)  (5, 4)   5   9   5   4
a   (6, 3)  (6, 3)   6   3   6   3
a   (7, 5)  (7, 0)   7   5   7   0
a   (8, 2)  (8, 3)   8   2   8   3
a   (9, 4)  (9, 5)   9   4   9   5
b   (0, 7)  (0, 0)   0   7   0   0
b   (1, 6)  (1, 2)   1   6   1   2
b   (2, 8)  (2, 3)   2   8   2   3
b   (3, 8)  (3, 8)   3   8   3   8
b  (4, 10)  (4, 1)   4  10   4   1
b   (5, 1)  (5, 3)   5   1   5   3
b   (6, 6)  (6, 3)   6   6   6   3
b   (7, 7)  (7, 3)   7   7   7   3
b   (8, 7)  (8, 7)   8   7   8   7
b   (9, 8)  (9, 0)   9   8   9   0
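
Since the question mentions about a thousand columns, naming each new column by hand will not scale. Below is a minimal sketch, under my own assumptions, that applies the same .str indexing in a loop and collects the results under two-level column labels. The labels "x" and "y", the name xy, and the seeded sample data are all hypothetical choices for illustration.

```python
import numpy as np
import pandas as pd

# Small stand-in for the frame in the question (seed chosen arbitrarily).
rng = np.random.default_rng(0)
x = np.tile(np.arange(10), 2)
df = pd.DataFrame(
    {c: list(zip(x, rng.integers(0, 11, 20))) for c in "AB"},
    index=list("a" * 10 + "b" * 10),
)

# .str[0] and .str[1] index into each tuple; the tuple keys become
# two-level column labels such as ("A", "x") and ("A", "y").
parts = {}
for col in df.columns:
    parts[(col, "x")] = df[col].str[0].to_numpy()
    parts[(col, "y")] = df[col].str[1].to_numpy()
xy = pd.DataFrame(parts, index=df.index)

# xy["A"] is now a two-column frame with the x and y values of column A.
```

Keeping the extracted values in a separate frame leaves the original tuple columns untouched, and xy[col] gives the pair of series needed for one plot.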
Answered Sep 30 '22 11:09 by jezrael