Fill NaN values of DataFrame with random values from the column, depending on frequency

I am trying to fill the NaN values in a pandas DataFrame with random values drawn from each column, so that each value is chosen with a probability proportional to its frequency in that column. I have this:

def MissingRandom(dataframe):
    import random
    dataframe = dataframe.apply(lambda x: x.fillna(
        random.choices(x.value_counts().keys(),
                       weights=list(x.value_counts()))[0]))
    return dataframe

The DataFrame gets filled with random data, but every missing value in a column receives the same value. I would like each missing value in a column to be drawn separately, but I have not been able to do it. Could anybody help me?

Thank you very much

Deco1998 asked Jan 01 '26 07:01



1 Answer

Please see my solution below. First, I created a function that fills a Series according to your criteria (using the frequencies as weights in random.choices); then we apply this function to every column of the DataFrame:

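import random
from collections import Counter

def fillcolumn(ser):
    ser = ser.copy()             # work on a copy of the column
    cna = int(ser.isna().sum())  # number of missing values
    d = Counter(ser.dropna())    # observed values and their frequencies
    if cna and d:                # skip columns with no NaN or no observed values
        # one weighted draw per missing entry
        m = random.choices(list(d.keys()), weights=list(d.values()), k=cna)
        ser[ser.isna()] = m
    return ser

for i in df.columns:
    df[i] = fillcolumn(df[i])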
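
The version in the question repeats one value per column because random.choices(...)[0] is evaluated once and fillna then broadcasts that single scalar to every NaN. Passing k to random.choices gives one draw per missing entry instead. A minimal sketch of the difference, assuming a made-up sample column (the data and the variable names here are hypothetical, not from the question):

```python
import random

import numpy as np
import pandas as pd

# Hypothetical sample column, for illustration only.
ser = pd.Series(["a", "a", "b", np.nan, np.nan, np.nan])
counts = ser.value_counts()

random.seed(1)  # seeded only so repeated runs behave the same

# One draw, taken once: fillna broadcasts this single scalar to every NaN.
single = random.choices(list(counts.index), weights=list(counts))[0]
same_everywhere = ser.fillna(single)

# k draws: one independent weighted draw per missing entry.
n_missing = int(ser.isna().sum())
draws = random.choices(list(counts.index), weights=list(counts), k=n_missing)
varied = ser.copy()
varied[varied.isna()] = draws

print(same_everywhere.tolist())
print(varied.tolist())
```

The draws are independent, so on a small column they can still happen to coincide; what changes is that each missing entry gets its own draw.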

Your full code:

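def MissingRandom(dataframe):
    import random
    from collections import Counter

    def fillcolumn(ser):
        ser = ser.copy()             # work on a copy of the column
        cna = int(ser.isna().sum())  # number of missing values
        d = Counter(ser.dropna())    # observed values and their frequencies
        if cna and d:                # skip columns with no NaN or no observed values
            # one weighted draw per missing entry
            m = random.choices(list(d.keys()), weights=list(d.values()), k=cna)
            ser[ser.isna()] = m
        return ser

    for i in dataframe.columns:
        dataframe[i] = fillcolumn(dataframe[i])
    return dataframe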
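
A quick way to check the behaviour is to run the same idea on a small frame and confirm that no NaN remains and that only values already present in each column were used. This is a self-contained sketch: fill_column is a hypothetical standalone restatement of the helper above, and the sample data is invented for illustration.

```python
import random
from collections import Counter

import numpy as np
import pandas as pd


def fill_column(ser):
    """Return a copy of `ser` with each NaN replaced by an independent draw
    from the observed values, weighted by how often each value occurs."""
    ser = ser.copy()
    mask = ser.isna()
    counts = Counter(ser[~mask])
    if mask.any() and counts:
        ser[mask] = random.choices(list(counts.keys()),
                                   weights=list(counts.values()),
                                   k=int(mask.sum()))
    return ser


# Hypothetical sample data, for illustration only.
df = pd.DataFrame({
    "colour": ["red", "red", "red", "blue", np.nan, np.nan, np.nan, np.nan],
    "size": [1.0, 2.0, 2.0, np.nan, np.nan, 2.0, 1.0, np.nan],
})

random.seed(0)  # seeded only so repeated runs give the same fill
filled = df.apply(fill_column)  # apply calls fill_column once per column

print(filled)
print(filled.isna().sum())
```

Because fill_column copies the column before assigning, df itself keeps its NaN values and the filled result is a new DataFrame.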

IoaTzimas answered Jan 02 '26 20:01
