How to create a dataframe column from values with frequency count?

Question

Given a problem set, with values and their associated frequencies, how can the sample be created in a dataframe?

Find the mean of this dataset
Value: 1 | 2 | 3
Freq:  3 | 4 | 2

This represents the sample [1, 1, 1, 2, 2, 2, 2, 3, 3].

I input this into Python:

>>> import pandas as pd
>>> df = pd.DataFrame({'value':[1, 2, 3], 'freq':[3, 4, 2]})
>>> df
   value  freq
0      1     3
1      2     4
2      3     2

It's not difficult to compute basic statistics in this format. For example, the mean of this dataset is (df['value'] * df['freq']).sum() / df['freq'].sum(). However, it would be nice to use built-in methods such as .mean(). To do this, I need to enter the value/freq data into the data frame as raw values. My end goal is a data frame that holds the raw sample [1, 1, 1, 2, 2, 2, 2, 3, 3].

Does anybody know how to take a dataset given in value/frequency form and create a data frame of raw data from it? Thank you.

O Pardal · Accepted Answer

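One option is to use np.repeat:

import numpy as np
import pandas as pd

values = [1, 2, 3]
frequency = [3, 4, 2]

# Repeat each value by its frequency to rebuild the raw sample
df = pd.DataFrame(np.repeat(values, frequency), columns=['data'])

# Mean of the raw sample, via the built-in method
df.mean()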
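
If the data already sits in a value/freq data frame like the one in the question, a similar expansion can be done without leaving pandas. The following is only a sketch: the names raw, raw_mean, and weighted_mean are illustrative, and the comparison against np.average is added here as a sanity check rather than taken from the original answer.

```python
import numpy as np
import pandas as pd

# Value/frequency table as given in the question
df = pd.DataFrame({'value': [1, 2, 3], 'freq': [3, 4, 2]})

# Repeat each row label 'freq' times, then select those rows;
# reset_index gives the expanded sample a clean 0..n-1 index
raw = df.loc[df.index.repeat(df['freq']), ['value']].reset_index(drop=True)

# Mean of the expanded (raw) sample, using the built-in method
raw_mean = raw['value'].mean()

# Weighted mean computed directly from the table, for comparison
weighted_mean = np.average(df['value'], weights=df['freq'])

print(raw['value'].tolist())  # [1, 1, 1, 2, 2, 2, 2, 3, 3]
print(np.isclose(raw_mean, weighted_mean))
```

Expanding the rows is convenient for small tables, but the raw sample has as many rows as the sum of the frequencies, so for large counts a weighted calculation may be the more economical choice.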

Tags:

python

pandas

Farzad Saif
