Python3 : module 'tabula' has no attribute 'read_pdf'

A .py program works, but the exact same code doesn't work when exposed as an API.

The code reads the PDF with Tabula and prints the table content as output.

I've tried:

import tabula
df = tabula.read_pdf("my_pdf")
print(df)

and

from tabula import wrapper
df = wrapper.read_pdf("my_pdf")
print(df)

I've installed tabula-py (not tabula) on AWS EC2 running Ubuntu.

More than read_pdf, what I actually want is to convert the PDF to CSV and return the output. But that doesn't work either: I get the same no-attribute error, i.e. module 'tabula' has no attribute 'convert_into'.

The .py file and the API file (.py as well) are in the same directory and are accessed with the same user.

Any help will be highly appreciated.

EDIT: I also tried running the same Python file from the API as an OS command (os.system("python3 /home/ubuntu/flaskapp/tabler.py")), but that didn't work either.

Sukhi asked Feb 24 '20 13:02



2 Answers

Make sure that you installed tabula-py, not just tabula. To install it, use

!pip install tabula-py

and to import it use

from tabula.io import read_pdf
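
Since the script works but the API does not, it may also be worth checking whether the two run under the same interpreter and pick up the same `tabula` package. Below is a minimal sketch for that check, using only the standard library. The helper name `describe_module` is hypothetical, not part of tabula-py.

```python
import importlib
import sys


def describe_module(name, attr):
    """Report where `name` is imported from and whether it exposes `attr`.

    Hypothetical helper for debugging; it only inspects the import,
    it does not read any PDF.
    """
    info = {"interpreter": sys.executable, "file": None, "has_attr": False, "error": None}
    try:
        module = importlib.import_module(name)
    except ImportError as exc:
        # The package is not importable at all from this interpreter.
        info["error"] = str(exc)
        return info
    info["file"] = getattr(module, "__file__", None)
    info["has_attr"] = hasattr(module, attr)
    return info


# Example call (run it once from the shell and once inside the API process):
# print(describe_module("tabula", "read_pdf"))
```

If the two runs report a different `interpreter` or `file`, or `has_attr` is False in only one of them, the API is probably loading a different environment or the conflicting `tabula` package.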
yasmine_chelly answered Sep 20 '22 15:09



There is actually an entry in the tabula-py FAQ about this specific issue:

If you’ve installed tabula, it will be conflict the namespace. You should install tabula-py after removing tabula.

Although using read_pdf() from tabula.io worked, as suggested by other answers, I was also able to use tabula.read_pdf() directly after removing tabula and reinstalling tabula-py (using pip install --force-reinstall tabula-py).
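
To see whether the conflict described in the FAQ applies to a given environment, something like the following sketch may help. It assumes Python 3.8+ (for `importlib.metadata`); the function names `installed_version` and `conflict_report` are made up for illustration.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def conflict_report():
    """Summarise whether both `tabula` and `tabula-py` are installed.

    Both distributions provide a top-level `tabula` module, so having
    both installed is the situation the FAQ warns about.
    """
    report = {name: installed_version(name) for name in ("tabula", "tabula-py")}
    report["conflict"] = report["tabula"] is not None and report["tabula-py"] is not None
    return report


# Example call, to be run with the same interpreter the API uses:
# print(conflict_report())
```

If `conflict` comes back True, uninstalling `tabula` and then reinstalling `tabula-py`, as described above, would be the thing to try.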

Skippy le Grand Gourou answered Sep 17 '22 15:09
