
How to preprocess a text stream on the fly in Python?

What I need is a Python 3 function (or something similar) that takes a text stream (such as sys.stdin or the object returned by open(file_name, "rt")) and returns a text stream to be consumed by some other function. The returned stream should remove all spaces, replace all tabs with commas and convert all letters to lowercase on the fly (the "lazy" way), as the consumer code reads the data.

I assume there is a reasonably easy way to do this in Python 3, something similar to list comprehensions, but so far I don't know what exactly it might be.

Ivan asked Feb 04 '18 06:02


1 Answer

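I am not sure this is what you mean, but the easiest way I can think of is to inherit from io.TextIOWrapper (the type that open returns in text mode; Python 3 has no built-in file type) and override the read method to do all the things you want after reading the data. Note that this only changes read(); readline() and iteration still return the raw text unless you override them too. A simple implementation would be:

import io

class MyFile(io.TextIOWrapper):
    # Wrap a binary stream, e.g. MyFile(open(file_name, "rb"))
    def read(self, *args, **kwargs):
        data = super().read(*args, **kwargs)
        # Drop spaces, turn tabs into commas, lowercase the rest
        return data.replace(' ', '').replace('\t', ',').lower()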
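
If the consumer only needs to iterate over lines rather than call read(), a generator may be closer to the "lazy" behaviour the question asks for. The following is a minimal sketch under that assumption; the name preprocess and the sample input are made up for illustration and are not part of any library. It uses str.translate so that the spaces and tabs are handled in a single pass per line.

```python
import io

# Translation table: delete spaces, map tabs to commas.
_TABLE = str.maketrans({" ": None, "\t": ","})

def preprocess(stream):
    """Lazily yield cleaned lines from any iterable text stream.

    Nothing is read from `stream` until the consumer asks for the next line.
    """
    for line in stream:
        yield line.translate(_TABLE).lower()

# Hypothetical usage with an in-memory stream; sys.stdin or
# open(file_name, "rt") can be passed in the same way.
source = io.StringIO("Hello World\tFoo\nBar Baz\tQux\n")
for cleaned in preprocess(source):
    print(cleaned, end="")
```

The result is an iterator of strings, not a full file object, so it suits consumers that loop over lines (for example csv.reader) but not code that calls read() or seek().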
Gal Bashan answered Oct 14 '22 21:10