I have a bunch of images like
What would be a good way to extract just the table structure from the image? I'm only interested in extracting the straight lines.
I have been toying around with the OpenCV Finding Contours code sample, and the results are quite promising. I'm just wondering whether there is a better way.
OpenCV has a nice way to detect line segments. Here is a code snippet in Python:
import math
import cv2

img = cv2.imread('page2.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Create a line segment detector (0 = no refinement) and run it on the grayscale image.
lsd = cv2.createLineSegmentDetector(0)
dlines = lsd.detect(gray)

# dlines[0] holds the detected segments, each stored as [[x0, y0, x1, y1]].
for dline in dlines[0]:
    x0 = int(round(dline[0][0]))
    y0 = int(round(dline[0][1]))
    x1 = int(round(dline[0][2]))
    y1 = int(round(dline[0][3]))
    # Draw the segment on the original image (a scalar 255 is blue in BGR order).
    cv2.line(img, (x0, y0), (x1, y1), 255, 1, cv2.LINE_AA)

    # Print the line segment length.
    a = (x0 - x1) * (x0 - x1)
    b = (y0 - y1) * (y0 - y1)
    c = a + b
    print(math.sqrt(c))

cv2.imwrite('page2_lines.png', img)
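Since only the table's straight lines are of interest, the detected segments can be filtered by length and orientation to drop the short segments that come from text. Here is a minimal sketch, assuming each segment is an (x0, y0, x1, y1) sequence such as dline[0] above. The function name and the thresholds min_length and max_angle_deg are made up for illustration and are not part of OpenCV; the values would need tuning per image.

```python
import math


def filter_table_segments(segments, min_length=40.0, max_angle_deg=2.0):
    """Split segments into near-horizontal and near-vertical ones.

    segments: iterable of (x0, y0, x1, y1).
    Segments shorter than min_length, or tilted more than max_angle_deg
    away from both axes, are discarded.
    """
    horizontal, vertical = [], []
    for x0, y0, x1, y1 in segments:
        dx, dy = x1 - x0, y1 - y0
        if math.hypot(dx, dy) < min_length:
            continue  # too short, most likely part of a character
        # Angle to the x-axis, folded into the range [0, 90] degrees.
        angle = math.degrees(math.atan2(abs(dy), abs(dx)))
        if angle <= max_angle_deg:
            horizontal.append((x0, y0, x1, y1))
        elif angle >= 90.0 - max_angle_deg:
            vertical.append((x0, y0, x1, y1))
    return horizontal, vertical
```

With the snippet above, it could be called as filter_table_segments(d[0] for d in dlines[0]), and only the returned segments would then be drawn.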
Please take a look at my GitHub repository, Code for table extraction.
The code detects tables and extracts their contents while keeping the spatial coordinates intact.
It detects the lines of a table, as shown in the image below. I hope it solves your problem.

The extracted output, in table form, is shown below.
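The repository's actual method is not described in this answer. Purely as an illustration of how detected lines can be turned into cells without losing their coordinates, here is a rough sketch. It assumes a fully ruled table with no merged cells, and it assumes the horizontal and vertical segments are already available as (x0, y0, x1, y1) tuples. The names cluster_positions and grid_cells and the tolerance value are hypothetical.

```python
def cluster_positions(values, tolerance=5.0):
    """Merge nearby coordinates into one representative per grid line."""
    clusters = []
    for value in sorted(values):
        # Join the current cluster if close to its last member, else start a new one.
        if clusters and value - clusters[-1][-1] <= tolerance:
            clusters[-1].append(value)
        else:
            clusters.append([value])
    return [sum(c) / len(c) for c in clusters]


def grid_cells(horizontal, vertical, tolerance=5.0):
    """Return cell boxes (left, top, right, bottom), ordered row by row."""
    # A horizontal line is located by its y coordinate, a vertical one by its x.
    ys = cluster_positions([(y0 + y1) / 2 for _, y0, _, y1 in horizontal], tolerance)
    xs = cluster_positions([(x0 + x1) / 2 for x0, _, x1, _ in vertical], tolerance)
    return [
        (left, top, right, bottom)
        for top, bottom in zip(ys, ys[1:])
        for left, right in zip(xs, xs[1:])
    ]
```

Each returned box can be used to crop the original image (for example, to run OCR on one cell at a time), so the cell's position in the page is preserved.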