As mentioned in the Camelot documentation, we can extract tables from a particular region like this:
tables = camelot.read_pdf('table_regions.pdf', table_regions=['170,370,560,270'])
But how can I find these regions for my PDF?
You can find these regions with Camelot's visual debugging: plot the page and read the coordinates off the axes.
https://camelot-py.readthedocs.io/en/master/user/advanced.html#visual-debugging
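As a rough sketch of what that looks like in practice (the function name and the default page are my own placeholders, and it assumes matplotlib is installed alongside camelot), you can plot the outline of a detected table and read the corner coordinates from the plot:

```python
def show_table_outline(pdf_path, page="1"):
    """Plot the outline of the first table Camelot detects on a page.

    Hypothetical helper for illustration; the corner coordinates can be
    read off the plot axes, which are in PDF coordinate space.
    """
    # Imported here so this file still loads where camelot is not installed.
    import camelot

    tables = camelot.read_pdf(pdf_path, pages=page)
    if len(tables) == 0:
        raise ValueError("Camelot did not detect any table on this page")

    # kind="contour" draws the detected table boundary over the page.
    fig = camelot.plot(tables[0], kind="contour")
    fig.show()
    return fig
```

If no table is detected at all, plotting with kind="text" instead may still help, since it shows where the text sits on the page.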
I know it's a late reply - but I just came across a possible solution.
If you're looking for an automated extraction method, you could run Camelot with the lattice flavor as a first step, retrieve the table boundaries with tables[0]._bbox, and pass these numbers to the table_areas argument in a second call to camelot.read_pdf().
Be aware that the values in _bbox are not in the order table_areas expects, so check them and reorder them before passing them on.
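Here is a minimal sketch of that two-step idea. Some assumptions on my side: _bbox is a private attribute, and from what I can tell it holds (x_left, y_bottom, x_right, y_top) in PDF coordinates, while table_areas wants "x_left,y_top,x_right,y_bottom". The helper names are made up, and using the stream flavor for the second call is just my choice, so verify all of this against your Camelot version.

```python
def bbox_to_table_area(bbox):
    """Reorder a Camelot _bbox tuple into a table_areas string.

    Assumes bbox is (x_left, y_bottom, x_right, y_top); table_areas
    expects the top-left corner first, then the bottom-right corner.
    """
    x_left, y_bottom, x_right, y_top = (float(v) for v in bbox)
    return f"{x_left},{y_top},{x_right},{y_bottom}"


def extract_with_detected_areas(pdf_path, page="1"):
    """Detect tables with lattice, then re-read the same areas (hypothetical helper)."""
    # Imported here so bbox_to_table_area can be used without camelot installed.
    import camelot

    # Step 1: let lattice find the table boundaries.
    detected = camelot.read_pdf(pdf_path, pages=page, flavor="lattice")
    areas = [bbox_to_table_area(table._bbox) for table in detected]
    if not areas:
        return detected  # nothing found, nothing to re-read

    # Step 2: read again, restricted to the detected areas.
    return camelot.read_pdf(pdf_path, pages=page, flavor="stream", table_areas=areas)
```

For example, bbox_to_table_area((170, 270, 560, 370)) gives "170.0,370.0,560.0,270.0", which matches the ordering of the region string in the question. Keep it to one page at a time, because table_areas is applied to every page you pass in.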