
Python: appending to a DataFrame from external methods

Tags: python, pandas

In my project I have a Strategy class that stores LOGS in a DataFrame. A Strategy is a succession of Block instances, and each Block can call writeLog().

What I want is for each Block to point to Strategy.LOGS so that it can append rows to it.

Here is my minimal code:

import pandas as pd

class Block:
    def __init__(self, logger):
        self.logger = logger
        
    def check(self):
        self.writeLog('test!')
        
    def writeLog(self, message):
        self.logger = self.logger.append({'Date':10, 'Message':message, 'uid':11}, ignore_index=True)


class Strategy:
    def __init__(self):
        self.LOGS = pd.DataFrame(columns=['Date', 'Message', 'uid'])
        
        self.blk1 = Block(logger=self.LOGS)
        self.blk2 = Block(logger=self.LOGS)
        
    def nxt(self):
        self.blk1.check()
        self.blk2.check()   
       
        
strat = Strategy()

for i in range(0,5):
    strat.nxt()
    
print(strat.LOGS)

print(strat.blk1.logger)
print(strat.blk2.logger)

These are the outputs:

>>Empty DataFrame
Columns: [Date, Message, uid]
Index: []

>>  Date Message uid
0   10   test!  11
1   10   test!  11
2   10   test!  11
3   10   test!  11
4   10   test!  11

>>  Date Message uid
0   10   test!  11
1   10   test!  11
2   10   test!  11
3   10   test!  11
4   10   test!  11

I don't understand why the LOGS attribute of Strategy is not appended to. I thought I was pointing to strat.LOGS by writing logger=self.LOGS.

Thanks for your answers.

Asked Feb 20 '26 03:02 by Graphotux


1 Answer

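That is because DataFrame.append returns a new object every time; it never modifies the DataFrame it is called on. In writeLog, the assignment self.logger = self.logger.append(...) therefore rebinds only that Block's logger attribute to the new DataFrame, while Strategy.LOGS still refers to the original, empty one. See the pandas documentation on DataFrame.append. Note that DataFrame.append was deprecated in pandas 1.4 and removed in 2.0; pd.concat is the replacement, and it also returns a new object.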
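The effect can be reproduced without pandas. In the minimal sketch below, the names Holder, shared, rebinder and mutator are illustrative and do not come from the question: shared plays the role of Strategy.LOGS and Holder the role of Block. Reassigning the attribute to a new object leaves the original untouched, while an in-place mutation is visible through every reference.

```python
class Holder:
    """Stand-in for Block: keeps a reference to an externally owned log."""

    def __init__(self, log):
        self.log = log

    def write_rebinding(self, message):
        # Like `self.logger = self.logger.append(...)`: builds a NEW list and
        # rebinds the attribute; the caller's object is not modified.
        self.log = self.log + [message]

    def write_in_place(self, message):
        # Mutates the one object that both names refer to.
        self.log.append(message)


shared = []

rebinder = Holder(shared)
rebinder.write_rebinding("test!")

mutator = Holder(shared)
mutator.write_in_place("test!")

print(shared)                  # only the in-place write reached `shared`
print(rebinder.log is shared)  # False: the attribute now names a different list
print(mutator.log is shared)   # True: still the same object
```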

Most pandas operations return a new object rather than modifying the existing one in place. Because of this, I would recommend changing the code slightly so that each Block has its own logger. This also avoids several blocks writing to one shared object, which helps if you later introduce multithreading. You can then add a get_log method to Strategy that concatenates the current contents of both blocks' logger attributes. Any sorting by date can happen there as well, if needed.

import pandas as pd

class Block:
    def __init__(self):
        self.logger = pd.DataFrame(columns=['Date', 'Message', 'uid'])
        
    def check(self):
        self.writeLog('test!')
        
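    def writeLog(self, message):
        # DataFrame.append was removed in pandas 2.0; pd.concat is the replacement.
        row = pd.DataFrame([{'Date':10, 'Message':message, 'uid':11}])
        self.logger = pd.concat([self.logger, row], ignore_index=True)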


class Strategy:
    def __init__(self):
       
        self.blk1 = Block()
        self.blk2 = Block()
        
    def nxt(self):
        self.blk1.check()
        self.blk2.check()   
        
    def get_log(self):
        return pd.concat((self.blk1.logger, self.blk2.logger), ignore_index=True)
       
        
strat = Strategy()

for i in range(0,5):
    strat.nxt()
    
print(strat.get_log())

print(strat.blk1.logger)
print(strat.blk2.logger)

Output

  Date Message uid
0   10   test!  11
1   10   test!  11
2   10   test!  11
3   10   test!  11
4   10   test!  11
5   10   test!  11
6   10   test!  11
7   10   test!  11
8   10   test!  11
9   10   test!  11
  Date Message uid
0   10   test!  11
1   10   test!  11
2   10   test!  11
3   10   test!  11
4   10   test!  11
  Date Message uid
0   10   test!  11
1   10   test!  11
2   10   test!  11
3   10   test!  11
4   10   test!  11
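
If a single shared log is the goal, as in the question, one alternative is to share a mutable container instead of a DataFrame. The following is only a sketch under that assumption, not the approach taken in the code above: each Block appends a dict to a plain Python list owned by Strategy, and the DataFrame is built on demand. The rows attribute, the COLUMNS constant and the sort step in get_log are illustrative choices.

```python
import pandas as pd

COLUMNS = ['Date', 'Message', 'uid']


class Block:
    def __init__(self, rows):
        # `rows` is a list owned by Strategy; every Block holds the same object.
        self.rows = rows

    def check(self):
        self.writeLog('test!')

    def writeLog(self, message):
        # list.append mutates in place, so Strategy sees the new row as well.
        self.rows.append({'Date': 10, 'Message': message, 'uid': 11})


class Strategy:
    def __init__(self):
        self.rows = []
        self.blk1 = Block(self.rows)
        self.blk2 = Block(self.rows)

    def nxt(self):
        self.blk1.check()
        self.blk2.check()

    def get_log(self):
        # Build the DataFrame only when it is needed; sorting is optional.
        logs = pd.DataFrame(self.rows, columns=COLUMNS)
        return logs.sort_values('Date', kind='mergesort', ignore_index=True)


strat = Strategy()
for _ in range(5):
    strat.nxt()

print(strat.get_log())  # rows written by both blocks, in one frame
```

Building the frame once from a list also avoids copying the whole DataFrame each time a log line is written.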

Answered Feb 27 '26 09:02 by lane


