Python: Identifying undulating patterns in 1d distribution

Question

My question in brief: given a 1d distribution in Python, how can one identify regions of that distribution that have a sine-like, undulating pattern?

I'm working to identify images within page scans of historic documents. These images are essentially always full-width within the scans (that is, they're basically never juxtaposed with text). This led me to believe that the simplest solution would be to remove the regions of a page scan that contain text lines.

Using the following snippet, one can read an image into memory and measure the aggregate pixel brightness for each row across the image, top to bottom, transforming an input image into the plot below:

import sys
import numpy as np
import matplotlib.pyplot as plt

# scipy.ndimage.imread was removed in SciPy 1.2, so matplotlib's reader is used instead
img = plt.imread(sys.argv[1])
# mean brightness of each pixel row, top to bottom (assumes a greyscale image)
row_sums = np.mean(img, axis=1)

# the size of the returned array = size of row_sums input array
window_size = 150
running_average_y = np.convolve(row_sums, np.ones((window_size,))/window_size, mode='same')

# plot the y dimension pixel distribution
plt.plot(running_average_y)
plt.show()

[Figure: input image, a page scan]

[Figure: output plot, the smoothed row brightness from top to bottom]

Given this distribution, I now want to identify the regions of the curve that show the regular undulating pattern visible in roughly the first and last thirds of the plot.

At first I tried fitting a linear model to the whole 1d distribution, but that fails for all sorts of reasons. I'm now thinking it might make sense to try and fit something like a sine-wave to segments of the curve, but that seems like overkill. Do others have ideas on how best to approach this task? Any suggestions or insights would be very appreciated!

Paul Brodersen · Accepted Answer

This doesn't answer your question but maybe solves your problem. Smoothing the row sums hides the fact that the lines of text in your images are well separated by white space -- as would be expected for a movable type print.
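To illustrate the point about smoothing, here is a minimal sketch on synthetic data. The profile below is an assumption for illustration, not taken from the scan: 30 dark rows of text followed by 20 bright rows of white space, repeated. The 150-row window is chosen to span exactly three of these made-up lines, which is the extreme case where the gaps vanish completely; on a real scan the moving average would more likely attenuate them than erase them.

```python
import numpy as np

# Synthetic row-brightness profile (hypothetical values, not from the scan):
# each text line is 30 dark rows followed by 20 bright rows of white space.
one_line = np.r_[np.full(30, 0.4), np.full(20, 1.0)]
row_means = np.tile(one_line, 40)

# Same kind of moving average as in the question; 150 rows = three full lines here.
window_size = 150
kernel = np.ones(window_size) / window_size
# mode="valid" avoids the zero-padded edges that mode="same" would introduce
smoothed = np.convolve(row_means, kernel, mode="valid")

# Peak-to-peak range before and after smoothing
raw_range = row_means.max() - row_means.min()
smoothed_range = smoothed.max() - smoothed.min()
print(f"raw range: {raw_range:.3f}, smoothed range: {smoothed_range:.3f}")
```

In the raw profile the white-space rows stand out clearly as the brightest values, which is what makes a simple threshold on the unsmoothed row means workable.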

You can use the white space as a separator to partition your image into blocks. In most cases, a block corresponds to a single line. Very large blocks correspond to images.

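[Figure: three stacked panels showing the row sums with the threshold, the blank-row mask, and the blocks larger than MIN_BLOCK_SIZE]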

import sys
import numpy as np
import matplotlib.pyplot as plt

MIN_BLOCK_SIZE = 100 # pixels

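img = plt.imread(sys.argv[1])
if img.ndim == 3:
    # collapse colour channels (ignoring any alpha channel) so each row gives one value
    img = img[..., :3].mean(axis=2)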

# find blank rows
row_sums = np.mean(img, axis=1)
threshold = np.percentile(row_sums, 75)
is_blank = row_sums > threshold

# find blocks between blank rows
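# pad with blank rows so a block touching the top or bottom edge gets both a start and a stop
# (np.int was removed in NumPy 1.24; the builtin int is equivalent here)
padded = np.concatenate(([True], is_blank, [True]))
block_edges = np.diff(padded.astype(int))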
starts, = np.where(block_edges == -1)
stops, = np.where(block_edges == 1)
blocks = np.c_[starts, stops]

# plot steps
fig, axes = plt.subplots(3,1, sharex=True, figsize=(6.85, 6))
axes[0].plot(row_sums)
axes[0].axhline(threshold, c='r', ls='--')
axes[1].plot(is_blank)
for (start, stop) in blocks:
    if stop - start > MIN_BLOCK_SIZE:
        axes[2].axvspan(start, stop, facecolor='red')
plt.show()
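
The block-finding step can also be wrapped in small functions that return only the large blocks, i.e. the image candidates. This is a sketch under assumptions rather than part of the original answer: the names find_blocks and large_blocks are hypothetical, the mask is synthetic, and the 100-row cutoff simply mirrors MIN_BLOCK_SIZE.

```python
import numpy as np

def find_blocks(is_blank):
    """Return (start, stop) row indices of runs of non-blank rows, stop exclusive."""
    # Pad with blank rows so runs touching either edge still get both boundaries.
    padded = np.concatenate(([True], np.asarray(is_blank, dtype=bool), [True]))
    edges = np.diff(padded.astype(int))
    starts, = np.where(edges == -1)  # blank -> non-blank transition
    stops, = np.where(edges == 1)    # non-blank -> blank transition
    return np.c_[starts, stops]

def large_blocks(is_blank, min_block_size=100):
    """Keep only blocks taller than min_block_size rows (likely images, not text lines)."""
    blocks = find_blocks(is_blank)
    sizes = blocks[:, 1] - blocks[:, 0]
    return blocks[sizes > min_block_size]

# Synthetic mask: two 30-row text lines and one 300-row picture, separated by blank rows.
is_blank = np.ones(500, dtype=bool)
is_blank[10:40] = False
is_blank[60:90] = False
is_blank[120:420] = False

candidates = large_blocks(is_blank)
print(candidates)
```

With the mask computed from a real scan, each returned pair could then be used to crop a candidate region, for example img[start:stop].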

Tags: python, opencv, machine-learning, numpy, classification

duhaime
