I wish to read several log files as they are written and process their input with asyncio. The code will have to run on Windows. From what I understand from searching around both Stack Overflow and the web, asynchronous file I/O is tricky on most operating systems (select will not work as intended, for example). While I'm sure I could do this with other methods (e.g. threads), I thought I would try out asyncio to see what it is like.
The most helpful answer would probably be one that describes what the "architecture" of a solution to this problem should look like, i.e. how different functions and coroutines should be called or scheduled.
The following gives me a generator that reads the files line by line (through polling, which is acceptable):
    import time

    def line_reader(f):
        while True:
            line = f.readline()
            if not line:
                time.sleep(POLL_INTERVAL)
                continue
            process_line(line)
With several files to monitor and process, this sort of code would require threads. I have modified it slightly to be more usable with asyncio:
    import asyncio

    def line_reader(f):
        while True:
            line = f.readline()
            if not line:
                yield from asyncio.sleep(POLL_INTERVAL)
                continue
            process_line(line)
This sort of works when I schedule it through the asyncio event loop, but if process_data blocks, then that is of course not good. When starting out, I imagined the solution would look something like
    def process_data():
        ...
        while True:
            ...
            line = yield from line_reader()
            ...
but I could not figure out how to make that work (at least not without process_data managing quite a bit of state).
Any ideas on how I should structure this kind of code?
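One possible structure, offered as a sketch rather than a definitive answer: turn the polling reader into an async generator and let each consumer pull lines from it with `async for`, which is close to the `line = yield from line_reader()` shape the question was reaching for. It uses `async def` because the generator-based `yield from` coroutine style is no longer supported since Python 3.11. The names `follow` and `watch`, the `results` list, and the `duration` cut-off are assumptions added for illustration.

```python
import asyncio


async def follow(f, poll_interval=0.1):
    """Async generator: yield lines as they are appended to the open file f."""
    while True:
        line = f.readline()
        if line:
            yield line
        else:
            # Nothing new yet: give other coroutines a turn, then poll again.
            await asyncio.sleep(poll_interval)


async def process_data(name, f, results, poll_interval=0.1):
    """Consumer: per-file state lives in ordinary local variables."""
    count = 0
    async for line in follow(f, poll_interval):
        count += 1
        results.append((name, count, line.rstrip("\n")))


async def watch(files, results, duration, poll_interval=0.1):
    """Run one consumer task per file, then cancel them after `duration` seconds."""
    tasks = [
        asyncio.ensure_future(process_data(name, f, results, poll_interval))
        for name, f in files.items()
    ]
    await asyncio.sleep(duration)
    for task in tasks:
        task.cancel()
    # return_exceptions=True collects the CancelledError of each stopped task.
    await asyncio.gather(*tasks, return_exceptions=True)
```

A real monitor would drop the `duration` cut-off and let the tasks run until the program exits; it is only there so the sketch terminates.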
Method 1: Read a file line by line using readlines()
readlines() reads all the lines in one go and returns them as a list in which each line is a string element. It is suitable for small files, because it reads the whole file content into memory and then splits it into separate lines.
The Python 2 file method next() is used when a file is treated as an iterator: typically in a loop, next() is called repeatedly. It returns the next input line, or raises StopIteration when EOF is hit. Because Python 2 uses a hidden read-ahead buffer for this, combining next() with other file methods such as readline() does not work right. In Python 3, file objects have no next() method; use the built-in next(f) instead.
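As a small illustration of the iterator protocol on a file in Python 3, the sketch below uses the built-in `next()`. The helper name `first_two_lines` is hypothetical.

```python
def first_two_lines(path):
    """Return the first two lines of a text file; None where a line is missing."""
    with open(path) as f:
        # The default argument makes next() return None instead of
        # raising StopIteration when the end of the file is reached.
        first = next(f, None)
        second = next(f, None)
    return first, second
```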
Use readlines() to read a range of lines from a file
The readlines() method reads all lines from a file and stores them in a list. You can use a list index (zero-based) as a line number to extract a set of lines from it. This is the most straightforward way to read specific lines from a file in Python.
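The indexing described above can be sketched as follows. The helper name `read_line_range` and its 1-based, inclusive line numbering are choices made for this example only.

```python
def read_line_range(path, start, stop):
    """Return lines start..stop (1-based, inclusive) of a text file."""
    with open(path) as f:
        lines = f.readlines()  # loads the whole file into memory
    # List indices are zero-based, so line number n lives at index n - 1.
    return lines[start - 1:stop]
```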
Using aiofiles (a third-party package):
    import aiofiles

    # Must run inside a coroutine (an async def function).
    async with aiofiles.open('filename', mode='r') as f:
        async for line in f:
            print(line)
EDIT 1
As @Jashandeep mentioned, you should take care with blocking operations.
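One common way to keep a blocking call from stalling the event loop is to hand it to a thread pool with `loop.run_in_executor()`. The sketch below assumes a hypothetical blocking handler, `slow_process_line`, where `time.sleep` stands in for slow work.

```python
import asyncio
import time


def slow_process_line(line):
    """Hypothetical blocking handler; time.sleep stands in for slow work."""
    time.sleep(0.05)
    return line.upper()


async def process_without_blocking(lines):
    """Process each line in the default thread pool and collect the results."""
    loop = asyncio.get_running_loop()
    results = []
    for line in lines:
        # The event loop stays free to run other coroutines while the
        # worker thread executes slow_process_line.
        results.append(await loop.run_in_executor(None, slow_process_line, line))
    return results
```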
Another method is select and/or epoll:
    from select import select

    # select() accepts the timeout as a positional argument only.
    files_to_read, files_to_write, exceptions = select([f1, f2], [f1, f2], [f1, f2], 0.1)
The timeout parameter is important here.
See: https://docs.python.org/3/library/select.html#select.select
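To show the timeout in action, here is a minimal sketch that uses sockets rather than log files. On Windows, select() accepts only sockets, and on Unix a regular file always reports as ready, so this does not by itself solve tailing a log. The helper name `wait_readable` is hypothetical.

```python
import select
import socket


def wait_readable(socks, timeout):
    """Return the sockets in socks that have data ready within timeout seconds."""
    # The timeout is passed positionally; without it the call can block forever.
    readable, _, _ = select.select(socks, [], [], timeout)
    return readable
```

With a connected pair from `socket.socketpair()`, the call returns an empty list until the peer sends data, and a list containing the receiving socket afterwards.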
EDIT 2
You can register a file descriptor for reading with loop.add_reader() (and for writing with loop.add_writer()).
Internally, the loop hands the descriptor to its selector, which is epoll on Linux.
EDIT 3
But remember that epoll will not work with regular files.
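Since regular files are ruled out, a sketch of `loop.add_reader()` has to use something pollable. The one below uses a socket pair and assumes a selector-based event loop; the default loop on Windows since Python 3.8 (the proactor loop) does not implement `add_reader()`. The function name `read_once_with_add_reader` is hypothetical.

```python
import asyncio
import socket


async def read_once_with_add_reader():
    """Wait for one chunk of data on a socket using loop.add_reader()."""
    loop = asyncio.get_running_loop()
    rsock, wsock = socket.socketpair()
    rsock.setblocking(False)
    received = loop.create_future()

    def on_readable():
        # Called by the loop once rsock has data; unregister and deliver it.
        loop.remove_reader(rsock)
        received.set_result(rsock.recv(1024))

    loop.add_reader(rsock, on_readable)
    wsock.send(b"hello\n")
    try:
        return await asyncio.wait_for(received, timeout=1.0)
    finally:
        rsock.close()
        wsock.close()
```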