Mitch's Technical Blog

Programming Pearls: A Scanning Algorithm

August 13, 2018
algorithms, notes, python (revised August 15, 2018)
permalink

The book Programming Pearls, by Jon Bentley, should be on every serious programmer's bookshelf. (Or in my case, somewhere in my book piles.)

Chapter 8 provides a sequence of algorithms that go from O(n^3) to O(n) to solve the same problem.

The problem statement is as follows.

You're given a vector of integers in sequence. You can think of them as an array. The problem is to find the largest contiguous sum.

If all integers are positive, the solution is always the same: the sum of all array elements.

But negative values add a twist.

Here are some example data sets with associated solutions shown with ^^^.

10 -10 5 15 -3 12 14
               ^^^^^

-100 -100 201 -200 1
          ^^^^^^^^^^

2 -1 2 -1 2 -1
^^^^^^^^^^^

The key insight: When a series is "terminated" with a negative number, it can never be resurrected by a future positive number. Why? Because the subsequent positive number (or sum) will always be a higher value by itself. The opposite is also true: if a negative number does not force any prior series below zero, then the series should be kept "alive" since it might grow larger than the prior larger sum it had before it was just reduced by the negative number.

I adapted the book's implementation and extended it to return not only the maximum sum, but also the corresponding start and end indices in the array.

The lesson here is that a single creative insight can accomplish two things: 1. Dramatically improve performance; 2. Radically simplify the logic. In this case, it's also intensely beautiful, like a great poem.

The book used C, I used Python.

"""
See p.81, Programming Pearls
This extends the O(n) solution provided there by also returning
the positions of the elements which yield the greatest sum.
"""

def scan(somelist):
    """
    >>> scan([-1, 5, 6, -20, 15, 0])
    (15, [4])
    >>> scan([0, 5, 6, -20, 10, 0])
    (11, [1, 2])
    >>> scan([-5, -10, -20, -20])
    (None, [])
    >>> scan([-5, 20, -10, 22])
    (32, [1, 2, 3])
    >>> scan([])
    (None, [])
    >>> scan([0, -1, 0, -1, 0])
    (None, [])
    >>> scan([2, -1, 2, -1])
    (3, [0, 1, 2])
    """
    max_so_far = 0
    max_before_here = 0
    prev_max_so_far = 0

    max_in_prog = []
    leader = []

    for i in range(len(somelist)):
        max_before_here = max(max_before_here + somelist[i], 0)
        max_so_far = max(max_before_here, max_so_far)

        if max_before_here == 0:
            max_in_prog = []
        else:
            max_in_prog.append(i)

        if max_so_far > prev_max_so_far:
            leader = list(max_in_prog)

        prev_max_so_far = max_so_far

    if leader == []:
        max_so_far = None

    return (max_so_far, leader)


if __name__ == '__main__':
    from doctest import testmod
    testmod()


Contact: hello at escapefromsql.net