Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Any way to keep track of the last 5 data points in python

Tags:

python

So I have an array that holds several numbers. As my script runs, more and more numbers are appended to this array. However, I am not interested in all the numbers but just want to keep track of the last 5 numbers.

Currently, I just store all the numbers in the array. However, this array gets really big and it's full of unnecessary information.

I have thought about making a function that when it adds an element to the array, also removes the last element if the array already contains 5 numbers.

I also thought about making a new class to create a data structure that does what I want. However, I only need to reference this array occasionally and is only a small part of the script. So I think it is overkill if I create a whole new class to do this.

What is the best way to do this?

like image 607
user Avatar asked Jul 16 '10 00:07

user


4 Answers

I fully agree with the idea of using Python's limited-length deque if it's available, and if not, Michael Anderson's simple solution is quite adequate. (I upvoted both) But I just wanted to mention the third option of a ring buffer, which is often used for this kind of task when low memory footprint and high execution speed are important. (In other words, in situations when you probably wouldn't be using Python :-p) For example, the Linux kernel uses this structure to store log messages generated during the boot process, before the system logger starts.

A Python implementation could look like this:

class RingBuffer(object):
    def __init__(self, n):
        self._buf = [None] * n
        self._index = 0
        self._valid = 0
    def add(self, obj):
        n = len(self._buf)
        self._buf[self._index] = obj
        self._index += 1
        if self._index == n
            self._index = 0
        if self._valid < n:
            self._valid += 1
    def __len__(self):
        return self._valid
    # could include other methods for accessing or modifying the contents

Basically what it does is preallocate an array (in Python, a list) of the desired length and fill it with dummy values. The buffer also contains an "index" which points to the next spot in the list that should be filled with a value. Each time a value is added, it's stored in that spot and the index is incremented. When the index reaches the length of the array, it's reset back to zero. Here's an example (I'm using 0 instead of None for the dummy value just because it's quicker to type):

[0,0,0,0,0]
 ^
# add 1
[1,0,0,0,0]
   ^
# add 2
[1,2,0,0,0]
     ^
# add 3
[1,2,3,0,0]
       ^
# add 4
[1,2,3,4,0]
         ^
# add 5
[1,2,3,4,5]
 ^
# add 6
[6,2,3,4,5]
   ^
# add 7
[6,7,3,4,5]
     ^

and so on.

like image 92
David Z Avatar answered Oct 04 '22 13:10

David Z


Try using a deque: http://docs.python.org/library/collections.html#deque-objects

"If maxlen is not specified or is None, deques may grow to an arbitrary length. Otherwise, the deque is bounded to the specified maximum length. Once a bounded length deque is full, when new items are added, a corresponding number of items are discarded from the opposite end. Bounded length deques provide functionality similar to the tail filter in Unix. They are also useful for tracking transactions and other pools of data where only the most recent activity is of interest."

like image 32
shookster Avatar answered Oct 04 '22 14:10

shookster


The class can be pretty trivial:

class ListOfFive:
  def __init__(self):
    self.data = []
  def add(self,val):
    if len(self.data)==5:
      self.data=self.data[1:]+[val]
    else:
      self.data+=[val]

l = ListOfFive()
for i in range(1,10):
  l.add(i)
  print l.data

Output is:

[1]
[1, 2]
[1, 2, 3]
[1, 2, 3, 4]
[1, 2, 3, 4, 5]
[2, 3, 4, 5, 6]
[3, 4, 5, 6, 7]
[4, 5, 6, 7, 8]
[5, 6, 7, 8, 9]
like image 37
Michael Anderson Avatar answered Oct 04 '22 14:10

Michael Anderson


Another neat ring buffer implementation can be found in the ActiveState Recipes - your ring buffer object starts out as an instance of RingBuffer while filling up at first, and then your instance changes its class to RingBufferFull, an optimized full implementation. It always makes me smile.

class RingBuffer:
    def __init__(self,size_max):
        self.max = size_max
        self.data = []
    def append(self,x):
        """append an element at the end of the buffer"""
        self.data.append(x)
        if len(self.data) == self.max:
            self.cur=0
            self.__class__ = RingBufferFull
    def get(self):
        """ return a list of elements from the oldest to the newest"""
        return self.data


class RingBufferFull:
    def __init__(self,n):
        raise "you should use RingBuffer"
    def append(self,x):     
        self.data[self.cur]=x
        self.cur=(self.cur+1) % self.max
    def get(self):
        return self.data[self.cur:]+self.data[:self.cur]
like image 40
Kiv Avatar answered Oct 04 '22 15:10

Kiv