Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the most efficient data structure to store time series of partially changing array?

My problem is I have an array of objects of size N. after every time (t+1), some of the values of the array may or may not change. so lets say t+1 index 15 changes but everything else stays the same.

What is the most efficient way to store something like this (in memory ) apart from just having an array of arrays of course? I want to be able to get all the values of the array for any time, say getValues(long time) in the most efficient way possible.

Say for 4 arrays

time 1 null null null xyz

time 2 null null abc xyz

(note that only abc changed here) but we still keep the values of the last index from time 1.

like image 537
Dez Udezue Avatar asked Oct 14 '22 23:10

Dez Udezue


1 Answers

What you are asking for is known in the academic CS field as "partially persistent arrays". For more information about persistence, see the survey of Kaplan.

One simple solution is to use search trees instead of arrays. You can use purely functional balanced search trees to guarantee that each read or write takes O(lg n) worst-case time (where n is the size of the array), instead of O(1) time. (If you keep the time-stamped versions in an extendible array, adding a new version is O(1) amortized and accessing an old version is O(1) worst-case.)

If you are not familiar with purely functional search trees, you might read this introduction to Clojure's persistent arrays. They are purely functional trees indexed fixed-width integers (in the form of a trie), and thus have fixed depth. This might be the fastest solution to your problem, both worst-case and real-world.

The only other partially persistent arrays that I am aware of may take longer in the worst case. Some have better amortized complexity and some have better expected complexity if arrays are accessed in various restricted ways.

like image 156
jbapple Avatar answered Oct 20 '22 17:10

jbapple