
How to find table regions for Camelot

As mentioned in the Camelot documentation, we can extract tables from a particular region like this:

tables = camelot.read_pdf('table_regions.pdf', table_regions=['170,370,560,270'])

But how can I find these regions for my own PDF?

Asked by Shubham Mishra on Oct 21 '25 02:10


2 Answers

You can find these regions with Camelot's visual debugging, which plots the page so you can read the coordinates off the axes:

https://camelot-py.readthedocs.io/en/master/user/advanced.html#visual-debugging
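
A minimal sketch of that workflow, assuming camelot-py is installed with its plotting dependencies (matplotlib). The helper names show_text_plot and region_from_corners are made up for illustration; camelot.read_pdf and camelot.plot(..., kind='text') are the calls described on the linked page. Depending on your Camelot and matplotlib versions, you may need matplotlib.pyplot.show() instead of calling .show() on the returned figure.

```python
def region_from_corners(left, top, right, bottom):
    """Build an 'x1,y1,x2,y2' string from corners read off the plot.

    Camelot expects (x1, y1) to be the left-top corner and (x2, y2) the
    right-bottom corner, in PDF coordinates (origin at the bottom-left
    of the page, so the top edge has the larger y value).
    """
    return f"{left},{top},{right},{bottom}"


def show_text_plot(pdf_path, pages="1"):
    """Plot the text on a page so the axis ticks can be used as coordinates.

    Hypothetical helper: it only wraps the documented read_pdf/plot calls.
    """
    import camelot  # imported here so the helper above works without Camelot

    tables = camelot.read_pdf(pdf_path, pages=pages)
    if tables.n == 0:
        raise ValueError("No table detected; try flavor='stream' for the first pass.")
    camelot.plot(tables[0], kind="text").show()


# Example, using the file name from the question:
# show_text_plot("table_regions.pdf")
# regions = [region_from_corners(170, 370, 560, 270)]
# tables = camelot.read_pdf("table_regions.pdf", table_regions=regions)
```

Reading the four numbers off the plot and passing them through region_from_corners gives the same kind of string as in the question.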

Answered by Stefano Fiorucci - anakin87 on Oct 23 '25 14:10


I know it's a late reply - but I just came across a possible solution.

If you're looking for an automated extraction method, you could run Camelot with the lattice flavor as a first step, retrieve the table boundaries with tables[0]._bbox, and pass these numbers to the table_areas argument in a second call to camelot.read_pdf().

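Be aware that the values in _bbox are not in the order table_areas expects. table_areas takes a string "x1,y1,x2,y2" where (x1, y1) is the left-top corner and (x2, y2) the right-bottom corner in PDF coordinates, so the numbers may need reordering first.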
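
Here is a rough sketch of the two-pass idea. It assumes _bbox holds four numbers describing two opposite corners of the table; since _bbox is a private attribute, its exact order may differ between Camelot versions, so the helper sorts the values instead of relying on it. The names bbox_to_area and extract_with_detected_areas are hypothetical, and the choice of flavor for the second pass is left as a parameter.

```python
def bbox_to_area(bbox):
    """Reorder a 4-number bbox into the 'x1,y1,x2,y2' string for table_areas.

    table_areas wants left-top then right-bottom in PDF coordinates, where
    the origin is at the bottom-left, so "top" is the larger y value.
    Sorting makes the result independent of the input corner order.
    """
    xa, ya, xb, yb = bbox
    left, right = sorted((xa, xb))
    bottom, top = sorted((ya, yb))
    return f"{left},{top},{right},{bottom}"


def extract_with_detected_areas(pdf_path, pages="1", second_flavor="stream"):
    """First pass finds table boundaries, second pass extracts within them."""
    import camelot  # imported lazily so bbox_to_area is usable on its own

    first_pass = camelot.read_pdf(pdf_path, pages=pages, flavor="lattice")
    # _bbox is a private attribute and may change between versions.
    areas = [bbox_to_area(table._bbox) for table in first_pass]
    if not areas:
        return first_pass  # nothing detected, nothing to refine
    return camelot.read_pdf(
        pdf_path, pages=pages, flavor=second_flavor, table_areas=areas
    )
```

Note that table_areas applies to every page you pass, so this is easiest to use one page at a time.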

Answered by Benedict Witzenberger on Oct 23 '25 15:10


