Scenario:
I am working with Django and Django REST Framework, handling data in a pandas DataFrame.
On the front end there are functions that can be applied to the same data. The user applies a first function, the dataset changes accordingly, then the user applies another function, and so on.
On the back end I keep all the functions in order, loop over them, and return the response. My issue is that I rerun the whole chain, from the first function to the last, on every request, which makes the process slow.
Is there a good way to keep state and process only the last function?
You have a few ways to achieve that.
You can use something like django-channels to keep a connection open. That way a DataFrame instance is tied to the connection, and you can keep changing it for as long as the connection lives.
Sample code:
    import json

    import pandas
    from channels.generic.websocket import WebsocketConsumer


    class DataframeWebsocketHandler(WebsocketConsumer):
        def connect(self):
            self.accept()
            self.df = pandas.DataFrame(data=d)  # your own implementation here
            # send your initial data
            self.send_dataframe()

        def disconnect(self, close_code):
            pass

        def receive(self, text_data):
            text_data_json = json.loads(text_data)
            # you will receive the actions to perform here;
            # everything done to self.df persists until the websocket is closed
            operation = text_data_json['operation']
            # assumes perform_operation returns the updated DataFrame
            self.df = perform_operation(self.df, operation)
            # send the changed data to the client
            self.send_dataframe()

        def send_dataframe(self):
            # json.dumps cannot serialize a DataFrame directly, so convert it first
            records = json.loads(self.df.to_json(orient='records'))
            self.send(text_data=json.dumps({'data': records}))
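The handler above calls perform_operation without defining it. A minimal sketch of what it could look like, assuming each operation arrives as a dict with a "name" and optional "params" (that message shape, the operation names, and the helper functions are all hypothetical, not part of the original answer):

```python
import pandas as pd

# Hypothetical operations; the real set depends on what the front end offers.
def _filter_gt(df, column, value):
    # keep rows where df[column] is strictly greater than value
    return df[df[column] > value]

def _sort(df, column, ascending=True):
    return df.sort_values(column, ascending=ascending)

def _rename(df, old, new):
    return df.rename(columns={old: new})

OPERATIONS = {"filter_gt": _filter_gt, "sort": _sort, "rename": _rename}

def perform_operation(df, operation):
    """Apply a single operation and return the resulting DataFrame."""
    name = operation["name"]
    if name not in OPERATIONS:
        raise ValueError(f"unknown operation: {name}")
    return OPERATIONS[name](df, **operation.get("params", {}))

# Usage: each call applies only the newest operation to the current state.
df = pd.DataFrame({"a": [3, 1, 2], "b": [10, 20, 30]})
df = perform_operation(df, {"name": "filter_gt", "params": {"column": "a", "value": 1}})
df = perform_operation(df, {"name": "sort", "params": {"column": "a"}})
```

Because most pandas methods return a new DataFrame rather than changing the existing one, this version returns the result and expects the caller to store it back (for example on self.df).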
You can pickle the current modified DataFrame and store it in a cache, then load it again when asked to modify the same data.
Sample code:
    import pickle

    import pandas
    from django.core.cache import cache

    # ... your code

    def new_dataframe_page(request):
        # ... your code
        df = pandas.DataFrame(data=d)
        # 3000 is the cache timeout in seconds
        cache.set(some_inst_specific_key, pickle.dumps(df), 3000)
        request.session['dframe_cache'] = some_inst_specific_key

    def update_dataframe(request):
        # ... your code
        cache_key = request.session.get('dframe_cache')
        cached = cache.get(cache_key) if cache_key else None
        if cached is not None:
            df = pickle.loads(cached)
        else:
            # generate the dataframe normally and store its key in the session, just like above
            df = generate_dataframe(request)
            cache_key = some_inst_specific_key
            request.session['dframe_cache'] = cache_key
        # perform your current action on the dataframe
        cache.set(cache_key, pickle.dumps(df), 3000)
        # return your modified dataframe
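The load, apply, store cycle can be tried outside Django. The following is only a sketch: a plain dict stands in for Django's cache, and apply_latest and build_initial are hypothetical names introduced for the demonstration.

```python
import pickle

import pandas as pd

fake_cache = {}  # stand-in for django.core.cache.cache (assumption for this demo)

def apply_latest(cache_key, operation, build_initial):
    """Load the cached frame, apply only the newest operation, store it back."""
    raw = fake_cache.get(cache_key)
    # fall back to building the frame from scratch only on a cache miss
    df = pickle.loads(raw) if raw is not None else build_initial()
    df = operation(df)
    fake_cache[cache_key] = pickle.dumps(df)
    return df

calls = []  # records how often the expensive initial build runs

def build_initial():
    calls.append("build")
    return pd.DataFrame({"x": [1, 2, 3]})

# First request: cache miss, so the frame is built and then modified.
apply_latest("user-1", lambda d: d.assign(y=d["x"] * 2), build_initial)
# Second request: cache hit, so only the new operation runs.
result = apply_latest("user-1", lambda d: d[d["y"] > 2], build_initial)
```

After the second call, calls still holds a single entry, which is the point of the approach: earlier functions are not replayed.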
Maintain a global map variable that stores the various states, and when the user asks for a modification, use the entry in that global map directly. This method is easy and far less complicated, but unfortunately it does not work in a production environment. To serve Django you usually run multiple instances of it, and each instance has its own runtime. For example, if you are running 15 instances of Django, each of the 15 has its own separate global variables, and a change made in one is not reflected in the others.