Parsing Mbox from an open file-like object in Python?

This works:

import mailbox

x = mailbox.mbox('filename.mbox')  # works

but what if I only have an open handle to the file, instead of a filename?

fp = open('filename.mbox', mode='rb')  # for example; there are many ways to get a file-like object
x = mailbox.mbox(fp)  # doesn't work

Question: What's the best (cleanest, fastest) way to open an mbox from a byte stream (i.e. an open binary file handle), without first copying the bytes into a named file?

user124114 asked Jun 29 '19 20:06



1 Answer

mailbox.mbox() has to call the builtin function open() at some point. Thus a hacky solution would be to intercept that call and return the pre-existing file-like object. A draft solution follows:

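import builtins
import mailbox
import os

# FLO stands for file-like object

class MboxFromFLO:

    def __init__(self, flo):
        self.flo = flo
        # mailbox.mbox normalizes its path with os.path.abspath(), so the
        # fake path is normalized the same way to keep the comparison valid
        self.fake_path = os.path.abspath('/tmp/MboxFromFLO')
        self.original_open = None

    def _open_proxy(self, file, *args, **kwargs):
        print('open_proxy{} was called:'.format((file,) + args))
        if file == self.fake_path:
            print('Call to open() was intercepted')
            return self.flo
        print('Call to open() was let through')
        # Forward keyword arguments too (e.g. encoding=...)
        return self.original_open(file, *args, **kwargs)

    def __enter__(self):
        # Patch open() only while the context is active
        print('Instrumenting open()')
        self.original_open = builtins.open
        builtins.open = self._open_proxy
        try:
            return mailbox.mbox(self.fake_path)
        except BaseException:
            # __exit__ is not called if __enter__ fails, so restore here
            builtins.open = self.original_open
            raise

    def __exit__(self, exc_type, exc_value, traceback):
        print('Restoring open()')
        builtins.open = self.original_open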


# Demonstration
import mailbox

# Create an mbox file so that we can use it later
b = mailbox.mbox('test.mbox')
key = b.add('This is a MboxFromFLO test message')
b.close()  # make sure the message is on disk before reopening the file

with open('test.mbox', 'rb') as f:
    with MboxFromFLO(f) as b:
        print('Msg#{}:'.format(key), b.get(key))

Some caveats with regard to possible future changes in the implementation of mailbox.mbox:

  1. mailbox.mbox may open extra files besides the one passed to its constructor. Even if it doesn't, the monkey-patched open() is used by all other Python code executed while the patch is in effect (i.e. as long as the context managed by MboxFromFLO is active). The fake path is what lets you recognize the right call to open() when there is more than one, so you must ensure that it doesn't collide with the path of any file that is actually opened during that time.

  2. mailbox.mbox may decide to check the specified path before opening it (e.g. using os.path.exists(), os.path.isfile(), etc.), in which case it will fail because that path doesn't exist.
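One way to soften the first caveat is sketched below. It is a rough variation, not a tested recipe. It rests on two implementation details of CPython's mailbox module that could change: that the module reaches open() through an ordinary global name lookup, and that it normalizes the path it is given with os.path.abspath(). Under those assumptions, the proxy can be installed as an attribute of the mailbox module only, so other code keeps seeing the real open(), and the fake path can be made unique per call. The names mbox_from_flo and the 'mbox-flo-' prefix are made up for this illustration.

```python
import io
import mailbox
import os
import tempfile
import uuid
from contextlib import contextmanager
from unittest import mock


@contextmanager
def mbox_from_flo(flo):
    """Yield a mailbox.mbox backed by flo (hypothetical helper)."""
    # A unique placeholder path, so that it cannot clash with a real file.
    # abspath() mirrors the normalization mailbox applies to its path.
    fake_path = os.path.abspath(
        os.path.join(tempfile.gettempdir(), 'mbox-flo-' + uuid.uuid4().hex))

    def open_proxy(file, *args, **kwargs):
        if file == fake_path:
            return flo
        # 'open' here is the real builtin: only the mailbox module is patched
        return open(file, *args, **kwargs)

    # create=True because the mailbox module defines no 'open' of its own;
    # mock removes the attribute again when the block exits.
    with mock.patch.object(mailbox, 'open', open_proxy, create=True):
        yield mailbox.mbox(fake_path, create=False)


# Usage: an in-memory mbox holding a single message
raw = b'From sender@example.com Sat Jun 29 20:06:00 2019\nSubject: hi\n\nbody\n'
with mbox_from_flo(io.BytesIO(raw)) as box:
    for key in box.keys():
        print(key, box[key]['Subject'])
```

The second caveat still applies unchanged: if mailbox.mbox ever starts checking the path before opening it, this sketch breaks in the same way as the class above.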

Leon answered Oct 14 '22 07:10
