
Python: Identifying undulating patterns in 1d distribution

My question in brief: given a 1d distribution in Python, how can one identify regions of that distribution that have a sine-like, undulating pattern?

I'm working to identify images within page scans of historic documents. These images are essentially always full-width within the scans (that is, they're basically never juxtaposed with text). This led me to believe that the simplest solution would be to remove the regions of a page scan that contain text lines.

Using the following snippet, one can read an image into memory and measure the aggregate pixel brightness for each row across the image, top to bottom, transforming an input image into the plot below:

import sys

import matplotlib.pyplot as plt
import numpy as np

# scipy.ndimage.imread no longer exists in current SciPy; plt.imread reads the file instead
img = plt.imread(sys.argv[1])
# mean brightness of each pixel row (assumes a 2D grayscale image array)
row_sums = img.mean(axis=1)

# moving average; mode='same' keeps the output the same length as row_sums
window_size = 150
running_average_y = np.convolve(row_sums, np.ones((window_size,))/window_size, mode='same')

# plot the y dimension pixel distribution
plt.plot(running_average_y)
plt.show()

Input image: [image not included]

Output plot: [image not included]

Given this distribution, I now want to identify the regions of the curve that show the regular undulating pattern visible in roughly the first and last thirds of the plot. Do others have ideas on how that task should be approached?

At first I tried fitting a linear model to the whole 1d distribution, but that fails for all sorts of reasons. I'm now thinking it might make sense to fit something like a sine wave to segments of the curve, but that seems like overkill. Any suggestions or insights would be much appreciated!

asked Aug 20 '17 19:08 by duhaime



1 Answer

This doesn't answer your question but maybe solves your problem. Smoothing the row sums hides the fact that the lines of text in your images are well separated by white space -- as would be expected for a movable type print.
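A rough illustration of that point, using a made-up brightness profile rather than the actual scan: each synthetic "line" is 10 dark rows followed by 6 bright rows (both values are assumptions chosen for the example), smoothed with the same 150-row window as in the question.

```python
import numpy as np

# Synthetic row-brightness profile (invented values, not taken from the scan):
# each text line is 10 dark rows (0.4) followed by 6 bright blank rows (1.0).
line = np.r_[np.full(10, 0.4), np.full(6, 1.0)]
row_means = np.tile(line, 20)  # 320 rows in total

window_size = 150  # same window as in the question
kernel = np.ones(window_size) / window_size
smoothed = np.convolve(row_means, kernel, mode='same')

# Compare the peak-to-peak swing away from the borders,
# where the averaging window lies fully inside the profile.
interior = slice(80, 240)
raw_swing = row_means[interior].max() - row_means[interior].min()
smoothed_swing = smoothed[interior].max() - smoothed[interior].min()
print(raw_swing, smoothed_swing)
```

In this toy case the raw profile swings between text and blank rows on every line, while the smoothed curve barely moves, so the gaps between lines are no longer individually visible.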

You can use the white space as a separator to partition your image into blocks. In most cases, a block corresponds to a single line. Very large blocks correspond to images.

[image not included]

import sys
import numpy as np
import matplotlib.pyplot as plt

MIN_BLOCK_SIZE = 100 # pixels

img = plt.imread(sys.argv[1])

# find blank rows
row_sums = np.mean(img, axis=1)
threshold = np.percentile(row_sums, 75)
is_blank = row_sums > threshold

# find blocks between blank rows
block_edges = np.diff(is_blank.astype(int))  # np.int was removed from NumPy; builtin int is equivalent here
starts, = np.where(block_edges == -1)
stops, = np.where(block_edges == 1)
blocks = np.c_[starts, stops]

# plot steps
fig, axes = plt.subplots(3,1, sharex=True, figsize=(6.85, 6))
axes[0].plot(row_sums)
axes[0].axhline(threshold, c='r', ls='--')
axes[1].plot(is_blank)
for (start, stop) in blocks:
    if stop - start > MIN_BLOCK_SIZE:
        axes[2].axvspan(start, stop, facecolor='red')
plt.show()
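
One caveat about the snippet above: pairing `starts` with `stops` assumes the page begins and ends with blank rows. If text touches the top or bottom edge of the scan, the two arrays can end up misaligned or of unequal length. A minimal sketch of one way around that, assuming it is acceptable to treat the area outside the image as blank (`find_blocks` is a hypothetical helper name, not part of the original answer):

```python
import numpy as np

def find_blocks(is_blank):
    """Return an (n, 2) array of [start, stop) row indices of non-blank runs."""
    # Pad with "blank" on both sides so that runs touching the image
    # border still produce a matching start/stop pair.
    padded = np.concatenate(([1], np.asarray(is_blank, dtype=int), [1]))
    edges = np.diff(padded)
    starts, = np.where(edges == -1)  # blank -> non-blank transition
    stops, = np.where(edges == 1)    # non-blank -> blank transition
    return np.c_[starts, stops]

# Toy mask: rows 0-2 are text (touching the top), 3-4 blank,
# 5-9 text, 10 blank, 11 text (touching the bottom).
is_blank = np.array([0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0], dtype=bool)
blocks = find_blocks(is_blank)
print(blocks)
```

Note that with the padding, `start` is the first non-blank row and `stop` is one past the last, so `stop - start` is still the block height and the `MIN_BLOCK_SIZE` comparison carries over unchanged.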
answered Oct 13 '22 18:10 by Paul Brodersen
