Read file line by line with asyncio

I wish to read several log files as they are written and process their input with asyncio. The code will have to run on Windows. From what I understand from searching around both Stack Overflow and the web, asynchronous file I/O is tricky on most operating systems (select will not work as intended, for example). While I'm sure I could do this with other methods (e.g. threads), I thought I would try out asyncio to see what it is like. The most helpful answer would probably be one that describes what the "architecture" of a solution to this problem should look like, i.e. how different functions and coroutines should be called or scheduled.

The following function reads a file line by line (through polling, which is acceptable):

    import time

    def line_reader(f):
        while True:
            line = f.readline()
            if not line:
                time.sleep(POLL_INTERVAL)
                continue
            process_line(line)

With several files to monitor and process, this sort of code would require threads. I have modified it slightly to be more usable with asyncio:

    import asyncio

    async def line_reader(f):
        while True:
            line = f.readline()
            if not line:
                # Yield control to the event loop while waiting for new data.
                await asyncio.sleep(POLL_INTERVAL)
                continue
            process_line(line)

This sort of works when I schedule it through the asyncio event loop, but if process_line blocks, then that is of course not good. When starting out, I imagined the solution would look something like

    async def process_data():
        ...
        while True:
            ...
            line = await line_reader()
            ...

but I could not figure out how to make that work (at least not without process_data managing quite a bit of state).

Any ideas on how I should structure this kind of code?

josteinb asked Nov 20 '15 10:11

1 Answer

Using aiofiles:

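    import aiofiles

    async def print_lines():
        # async with / async for are only valid inside a coroutine.
        async with aiofiles.open('filename', mode='r') as f:
            async for line in f:
                print(line)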

EDIT 1

As @Jashandeep mentioned, you should take care with blocking operations: anything that blocks inside a coroutine stalls the whole event loop.
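One way to keep a blocking handler from stalling the loop is to hand it to a thread pool with `loop.run_in_executor`. The following is only a sketch: `process_line` stands in for the asker's per-line handler, its body is a made-up placeholder, and `handle` and `main` are hypothetical names.

```python
import asyncio
import time


def process_line(line):
    """Placeholder for a slow, blocking handler (hypothetical body)."""
    time.sleep(0.05)  # simulate blocking work
    return line.strip().upper()


async def handle(line):
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default ThreadPoolExecutor.
    return await loop.run_in_executor(None, process_line, line)


async def main(lines):
    # Each blocking call runs in a worker thread, so the loop stays responsive.
    return await asyncio.gather(*(handle(line) for line in lines))
```

It could be driven with `asyncio.run(main(["a\n", "b\n"]))`; `gather` returns the results in the same order as the inputs.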

Another method is select and/or epoll:

    from select import select

    # select() takes no keyword arguments; the timeout (in seconds) is the fourth positional argument.
    files_to_read, files_to_write, exceptions = select([f1, f2], [f1, f2], [f1, f2], 0.1)

The timeout argument is important here: without it, select blocks until at least one file descriptor is ready.

see: https://docs.python.org/3/library/select.html#select.select
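To see what the timeout does, here is a small sketch using a socket pair rather than log files (on Windows, select only accepts sockets). The function name `demo` is hypothetical.

```python
import socket
from select import select


def demo():
    rsock, wsock = socket.socketpair()
    try:
        # Nothing has been written yet, so select gives up after 0.1 s
        # and returns empty lists instead of blocking forever.
        readable, _, _ = select([rsock], [], [], 0.1)
        idle = (readable == [])

        wsock.send(b"line\n")
        # Now there is data to read, so rsock is reported as readable.
        readable, _, _ = select([rsock], [], [], 0.1)
        ready = (readable == [rsock])
        return idle, ready
    finally:
        rsock.close()
        wsock.close()
```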

EDIT 2

You can register a file descriptor for reading or writing with loop.add_reader() or loop.add_writer().

On Linux, it uses the epoll selector inside the loop.
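A rough sketch of `add_reader` in use, with a socket pair standing in for the descriptor. It assumes a selector-based event loop; the default loop on Windows (the proactor loop) does not implement `add_reader`. The names `read_once` and `on_readable` are hypothetical.

```python
import asyncio
import socket


async def read_once():
    loop = asyncio.get_running_loop()
    rsock, wsock = socket.socketpair()
    rsock.setblocking(False)
    received = loop.create_future()

    def on_readable():
        # Called by the loop when rsock has data; it must not block.
        loop.remove_reader(rsock)
        received.set_result(rsock.recv(1024))

    loop.add_reader(rsock, on_readable)
    wsock.send(b"hello\n")
    try:
        return await received
    finally:
        rsock.close()
        wsock.close()
```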

EDIT 3

But remember that epoll will not work with regular files.
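Since regular files cannot be registered with epoll, polling with `asyncio.sleep` (as in the question) is a reasonable fallback. Below is one possible structure, not a definitive design: one polling task per file feeds a shared `asyncio.Queue`, and a single consumer processes the lines. The names `follow`, `watch`, `consume`, `monitor` and the `POLL_INTERVAL` value are all assumptions made for illustration.

```python
import asyncio

POLL_INTERVAL = 0.1  # seconds; assumed value


async def follow(path, stop):
    """Yield complete lines appended to `path` until `stop` is set."""
    with open(path, "r") as f:
        buffer = ""
        while True:
            chunk = f.readline()  # synchronous, but short on local files
            if chunk:
                buffer += chunk
                if buffer.endswith("\n"):  # only emit finished lines
                    yield buffer
                    buffer = ""
                continue
            if stop.is_set():
                return  # an unfinished trailing line is dropped
            await asyncio.sleep(POLL_INTERVAL)


async def watch(path, queue, stop):
    async for line in follow(path, stop):
        await queue.put((path, line))


async def consume(queue, handle_line):
    while True:
        path, line = await queue.get()
        handle_line(path, line)
        queue.task_done()


async def monitor(paths, handle_line, stop):
    queue = asyncio.Queue()
    consumer = asyncio.create_task(consume(queue, handle_line))
    # One watcher per file; they all finish once `stop` is set.
    await asyncio.gather(*(watch(p, queue, stop) for p in paths))
    await queue.join()  # let the consumer drain what is left
    consumer.cancel()
    await asyncio.gather(consumer, return_exceptions=True)
```

Here `stop` is an `asyncio.Event` created inside the running loop, and `handle_line(path, line)` is a plain callable. If it can block for long, it could be moved to a thread pool with `loop.run_in_executor`.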

pylover answered Sep 29 '22 11:09