So there is a very elegant answer to a similar problem. The problem there was to build an array tree where every array had only 1, 2, 4, 8, 16, or 32 items, and where every item was at the same nesting level. I formulated this problem without having the entire system in mind (doing rapid prototyping I guess), but the current system I don't think will really work for deleting items from the middle of the array, or adding items into the middle of the array. Unfortunately.
I need the ability to add/remove items in the middle of the array because this will be used for arrays in bucketed hash tables, or general arrays which items are rapidly added and removed (like managing memory blocks). So I am thinking how to balance that with the desire to have memory block sizes of 1, 2, 4, 8, 16, or 32 items only. Hence the tree, but I think the tree needs to work slightly differently from the problem posed in that question.
What I'm thinking is having a system like follows. Each array in the nested array tree can have 1, 2, 4, 8, 16, or 32 items, but the items don't need to sit at the same level. The reason for putting items at the same level is because there is a very efficient algorithm for getItemAt(index)
if they are at the same level. But it has the problem of not allowing efficient inserts/deletes. But I think this can be solved where items in an array are at different levels by having each parent array "container" keep track of how many deeply nested children it has. It would essentially keep track of the size of the subtree. Then to find the item with getItemAt(index)
, you would traverse the top level of the tree, count the top-level tree sizes, and then narrow your search down the tree like that.
In addition, the leaf arrays (which have 1, 2, 4, 8, 16, or 32 items each) can have items removed, and then you only have to adjust that short array item's positions. So you'd go from this:
[1, 2, 3, 4, 5, 6, 7, 8]
...delete 6
, and get this (where -
is null
):
[1, 2, 3, 4, 5, 7, 8, -]
Then if you add an item 9
at say position 3, it would result in:
[1, 2, 9, 3, 4, 5, 7, 8]
This is nice because say you have a million items array. You now only have to adjust a single array with up to 32 items, rather than shifting a million items.
BUT, it gets a bit complicated when you add an item in the "middle of this tree array", but at the end of a 32-item array. You would first think you would have to shift every single subsequent array. But there is a way to make it so you don't have to do this shifting! Here is one case.
We start here:
[
[ 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 ,
9 , 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24,
25, 26, 27, 28, 29, 30, 31, 32],
[ 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 ,
9 , 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24,
25, 26, 27, 28, 29, 30, 31, 32]
]
Now we add an item 90
at the 16th position. We should end up with this, since this array must grow to be 4 in length:
[
[ 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 ,
9 , 10, 11, 12, 13, 14, 15, 90,
16, 17, 18, 19, 20, 21, 22, 23,
24, 25, 26, 27, 28, 29, 30, 21],
[32],
-,
[ 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 ,
9 , 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24,
25, 26, 27, 28, 29, 30, 31, 32]
]
If we delete 90
now, we would end with this:
[
[ 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 ,
9 , 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24,
25, 26, 27, 28, 29, 30, 31, - ],
[32],
-,
[ 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 ,
9 , 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24,
25, 26, 27, 28, 29, 30, 31, 32]
]
Basically, it is minimizing the changes that are made. To getByIndex(index)
it would work like this, with more metadata on the arrays:
{
size: 64,
list: [
{
size: 31,
list: [
1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 ,
9 , 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24,
25, 26, 27, 28, 29, 30, 31, - ] },
{
size: 1,
list: [32] },
null,
{
size: 32,
list: [
1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 ,
9 , 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24,
25, 26, 27, 28, 29, 30, 31, 32] }
]
}
So my question is, how can you build such a tree that only has 1, 2, 4, 8, 16, or 32 nodes at each level, which allows for inserting or removing nodes at any place in the overall conceptual "array", where the leaf nodes in the tree don't need to be at any specific level? How to implement the splice
method?
For this question, don't worry about compactification just yet, I will try and see if I can figure that out on my own first. For this question, just leave junk and nulls around wherever they end up being, which is less than ideal. I mean, if you know how to compactify things easily, then by all means include it, but I think it will add significantly to the answer, so the default should be to leave it out of the answer :)
Also note, the arrays should be treated as if they are static arrays, i.e. they can't dynamically grow in size.
The algorithm for insertItemAt(index)
would work something like this:
The algorithm for removeItemAt(index)
(the second piece of functionality of splice
) would do something like this:
Here is what I have basically so far.
const createTree = () => createTreeLeaf()
const createTreeContainer = () => {
return {
type: 'container',
list: [],
size: 0
}
}
const createTreeLeaf = () => {
return {
type: 'leaf',
list: [],
size: 0
}
}
const setItemAt = (tree, item, index) => {
let nodes = [tree]
let startSize = 0
a:
while (true) {
b:
for (let i = 0, n = nodes.length; i < n; i++) {
let node = nodes[i]
let endSize = startSize + node.size
if (startSize <= index && index < endSize) {
// it's within here.
if (node.type == 'container') {
nodes = node.list
break b
} else {
let relativeIndex = index - startSize
// grow if less than max
if (relativeIndex > node.size - 1) {
if (node.size == 32) {
// grow somehow
} else {
let newArray = new Array(node.size * 2)
node.list.forEach((x, i) => newArray[i] = x)
node.list = newArray
}
}
if (node.list[relativeIndex] == null) {
node.size++
}
node.list[relativeIndex] = item
break a
}
}
}
}
}
const insertItemAt = (tree, item, index) => {
let nodes = [tree]
let startSize = 0
a:
while (true) {
b:
for (let i = 0, n = nodes.length; i < n; i++) {
let node = nodes[i]
let endSize = startSize + node.size
if (startSize <= index && index < endSize) {
// it's within here.
if (node.type == 'container') {
nodes = node.list
break b
} else {
let relativeIndex = index - startSize
// grow if less than max
if (relativeIndex > node.size - 1 || isPowerOfTwo(node.size)) {
if (node.size == 32) {
// grow somehow
} else {
let newArray = new Array(node.size * 2)
node.list.forEach((x, i) => newArray[i] = x)
node.list = newArray
}
}
// todo: shift the items over to make room for this item
break a
}
}
}
}
}
const removeItemAt = (tree, item, index) => {
}
const appendItemToEndOfTree = (tree, item) => {
}
const getLeafContainingIndex = (tree, index) => {
if (index > tree.size - 1) {
return { node: null, index: -1 }
}
let nodes = [tree]
let startSize = 0
a:
while (true) {
b:
for (let i = 0, n = nodes.length; i < n; i++) {
let node = nodes[i]
let endSize = startSize + node.size
if (startSize <= index && index < endSize) {
if (node.type == 'container') {
nodes = node.list
break b
} else {
let relativeIndex = index - startSize
return { node, index: relativeIndex }
}
} else {
startSize = endSize
}
}
}
}
const getItemAt = (tree, getIndex) => {
const { node, index } = getLeafContainingIndex(tree, getIndex)
if (!node) return null
return node.list[index]
}
const isPowerOfTwo = (x) => {
return (x != 0) && ((x & (x - 1)) == 0)
}
const log = tree => console.log(JSON.stringify(tree.list))
// demo
const tree = createTree()
setItemAt(tree, { pos: '1st', attempt: 1 }, 0)
log(tree)
setItemAt(tree, { pos: '2nd', attempt: 2 }, 1)
log(tree)
setItemAt(tree, { pos: '2nd', attempt: 3 }, 1)
log(tree)
setItemAt(tree, { pos: '3rd', attempt: 4 }, 2)
log(tree)
The "grow somehow" section would work sort of like this I think...
NOTE: This should work like a compact list. This means it never has any blank spaces inside of it (except the trailing nulls, which aren't spaces in the array, only placeholders to support future inserts according to the 1, 2, 4, 8, 16, 32 constraints). This means you can only (1) squeeze an item in between two existing, (2) append after an item, (3) prepend before an item, and (4) delete an item (only having the empty space filled by shifting the leaf items to the left).
Because an array's length is fixed at compile time, if we use an array to implement a tree we have to set a limit on the number of nodes we will permit in the tree.
Step 1 − store data in order traversal of a binary tree into array arr[]. Step 2 − sort the array arr[] using any sorting technique. Step 3 − Now, do the inorder traversal of the tree and e copy the elements of an array to the nodes of the tree one by one.
In computer science, a hashed array tree (HAT) is a dynamic array data-structure published by Edward Sitarski in 1996, maintaining an array of separate memory fragments (or "leaves") to store the data elements, unlike simple dynamic arrays which maintain their data in one contiguous memory area.
The requirements are close to what a B+ tree offers. A 32-ary B+ tree would provide these properties:
null
fillers)null
fillers)In addition it has this useful feature:
B+ trees are search trees, which is a feature that does not match your requirements. So that means the typical keys, that are stored in the internal nodes of a B+ tree, are not needed here.
On the other hand, you require that elements can be identified by index. As you already suggested, you can extend each node with a property that provides the total count of data values that are stored in the leaves of the subtree rooted in that node. This will allow to find a data value by index in logarithmic time.
As to the requirement that node sizes should be powers of 2 up to 32: B+ trees do not provide variable node sizes. But note that all nodes in this B+ tree are guaranteed to have at least 16 used entries. The only exception is the root, which could have any number of used entries.
So in a first version of my answer I did not focus too much on this requirement: implementing it would mean that sometimes you could save some space in an non-root node by limiting its size to 16 instead of 32. But the very next insertion in that node would require it to extend (again) to a size of 32. So I considered it might not be worth the effort. Adapting the record size for the root would also not contribute much gain as it just applies to that single node.
After a comment about this, I adapted the implementation, so each node will reduce or extend its size to the previous/next power of 2 as soon as possible/needed. This means that non-root nodes may sometimes get their size reduced to 16 (while they have 16 items/children), and the root can have any of the possible powers (1, 2, 4, 8, 16, or 32) as size.
In line with your preference I avoided the use of recursion.
I opted to include the following properties in each node:
children
: this is the list of either child nodes or (in case of a leaf) of data values. This list is an array with 1, 2, 4, 8, 16 or 32 slots, where the non-used slots are filled with null
. This is very un-JavaScript like, but I understand you are actually targeting a different language, so I went with it.childCount
: this indicates how many slots in the above array are actually used. We could do without this, if we could assume that null
can never be used as data, and an occurrence would indicate the end of the real content. But anyway, I went for this property. So in theory, the content could now actually include intended null
values.treeSize
. this is the total number of data items that are in the leaves of the subtree that is rooted in this node. If this is itself a leaf node, then it will always be equal to childCount
.parent
. B+ trees don't really need a back reference from a child to a parent, but I went with it for this implementation, also because it makes it somewhat easier to provide a non-recursion based implementation (which you seem to prefer).prev
, next
: these properties reference sibling nodes (in the same level), so that each level of the tree has its nodes in one, doubly linked list. B+ trees commonly have this at the bottom level only, but it is also handy to have at the other levels.You already sketched an algorithm for insertion. It would indeed go like this:
If the root needs to split, then create a new root node that will have the two split nodes as its children.
During the execution of this algorithm, the properties of the impacted nodes should of course be well maintained.
If the root ends up with just one child node, then make the root to be that single child node, removing the top level.
Below you'll find an implementation of a Tree
class. It includes the methods you were asking for:
getItemAt
setItemAt
removeItemAt
insertItemAt
It also includes some extra methods:
Symbol.iterator
: this makes the tree instance iterable. This allows for easy iteration of values, using the linked list of the bottom level of the tree.
print
: speaks for itself
verify
: this visits every node of the tree in a breadth-first traversal, checking that all conditions are met, and no inconsistencies exist. Among many other tests, it also verifies that each node has a fill factor of more than 50%, or otherwise put: that the array size is the least possible power of two to host the content. If a test fails, a simple exception will be thrown. I didn't put any effort in providing context to the error: they should not ever occur.
Running the snippet below will create one tree instance, and perform 1000 insertions, 1000 update/retrievals, and 1000 deletions. In parallel the same actions are done on a 1-dimensional array, using splice
. After each step, the values from the tree iteration are compared with the array, and the consistency of the tree is verified. The test is performed with a maximum node capacity of 8 (instead of 32), so the tree grows faster vertically, and a lot more shifting, splitting and merging needs to happen.
class Node {
constructor(capacity) {
// Mimic fixed-size array (avoid accidentally growing it)
this.children = Object.seal(Array(capacity).fill(null));
this.childCount = 0; // Number of used slots in children array
this.treeSize = 0; // Total number of values in this subtree
// Maintain back-link to parent.
this.parent = null;
// Per level in the tree, maintain a doubly linked list
this.prev = this.next = null;
}
setCapacity(capacity) {
if (capacity < 1) return;
// Here we make a new array, and copy the data into it
let children = Object.seal(Array(capacity).fill(null));
for (let i = 0; i < this.childCount; i++) children[i] = this.children[i];
this.children = children;
}
isLeaf() {
return !(this.children[0] instanceof Node);
}
index() {
return this.parent.children.indexOf(this);
}
updateTreeSize(start, end, sign=1) {
let sum = 0;
if (this.isLeaf()) {
sum = end - start;
} else {
for (let i = start; i < end; i++) sum += this.children[i].treeSize;
}
if (!sum) return;
sum *= sign;
// Apply the sum change to this node and all its ancestors
for (let node = this; node; node = node.parent) {
node.treeSize += sum;
}
}
wipe(start, end) {
this.updateTreeSize(start, end, -1);
this.children.copyWithin(start, end, this.childCount);
for (let i = this.childCount - end + start; i < this.childCount; i++) {
this.children[i] = null;
}
this.childCount -= end - start;
// Reduce allocated size if possible
if (this.childCount * 2 <= this.children.length) this.setCapacity(this.children.length / 2);
}
moveFrom(neighbor, target, start, count=1) {
// Note: `start` can have two meanings:
// if neighbor is null, it is the value/Node to move to the target
// if neighbor is a Node, it is the index from where value(s) have to be moved to the target
// Make room in target node
if (this.childCount + count > this.children.length) this.setCapacity(this.children.length * 2);
this.children.copyWithin(target + count, target, Math.max(target + count, this.childCount));
this.childCount += count;
if (neighbor !== null) {
// Copy the children
for (let i = 0; i < count; i++) {
this.children[target + i] = neighbor.children[start + i];
}
// Remove the original references
neighbor.wipe(start, start + count);
} else {
this.children[target] = start; // start is value to insert
}
this.updateTreeSize(target, target + count, 1);
// Set parent link(s)
if (!this.isLeaf()) {
for (let i = 0; i < count; i++) {
this.children[target + i].parent = this;
}
}
}
moveToNext(count) {
this.next.moveFrom(this, 0, this.childCount - count, count);
}
moveFromNext(count) {
this.moveFrom(this.next, this.childCount, 0, count);
}
basicRemove(index) {
if (!this.isLeaf()) {
// Take node out of the level's linked list
let prev = this.children[index].prev;
let next = this.children[index].next;
if (prev) prev.next = next;
if (next) next.prev = prev;
}
this.wipe(index, index + 1);
}
basicInsert(index, value) {
this.moveFrom(null, index, value);
if (value instanceof Node) {
// Insert node in the level's linked list
if (index > 0) {
value.prev = this.children[index-1];
value.next = value.prev.next;
} else if (this.childCount > 1) {
value.next = this.children[1];
value.prev = value.next.prev;
}
if (value.prev) value.prev.next = value;
if (value.next) value.next.prev = value;
}
}
pairWithSmallest() {
return this.prev && (!this.next || this.next.childCount > this.prev.childCount)
? [this.prev, this] : [this, this.next];
}
toString() {
return "[" + this.children.map(v => v??"-").join() + "]";
}
}
class Tree {
constructor(nodeCapacity=32) {
this.nodeCapacity = nodeCapacity;
this.root = new Node(1);
this.first = this.root; // Head of doubly linked list at bottom level
}
locate(offset) {
let node = this.root;
// Normalise argument
offset = offset < 0 ? Math.max(0, node.treeSize + offset) : Math.min(offset, node.treeSize);
while (!node.isLeaf()) {
let index = 0;
let child = node.children[index];
while (offset > child.treeSize || offset === child.treeSize && child.next) {
offset -= child.treeSize;
child = node.children[++index];
}
node = child;
}
return [node, offset];
}
getItemAt(offset) {
let [node, index] = this.locate(offset);
if (index < node.childCount) return node.children[index];
}
setItemAt(offset, value) {
let [node, index] = this.locate(offset);
if (index < node.childCount) node.children[index] = value;
}
removeItemAt(offset) {
let [node, index] = this.locate(offset);
if (index >= node.childCount) return;
while (true) {
console.assert(node.isLeaf() || node.children[index].treeSize === 0);
node.basicRemove(index);
// Exit when node's fill ratio is fine
if (!node.parent || node.childCount * 2 > this.nodeCapacity) return;
// Node has potentially too few children, we should either merge or redistribute
let [left, right] = node.pairWithSmallest();
if (!left || !right) { // A node with no siblings? Must become the root!
this.root = node;
node.parent = null;
return;
}
let sumCount = left.childCount + right.childCount;
let childCount = sumCount >> 1;
// Check whether to merge or to redistribute
if (sumCount > this.nodeCapacity) { // redistribute
// Move some data from the bigger to the smaller node
let shift = childCount - node.childCount;
if (!shift) { // Boundary case: when a redistribution would bring no improvement
console.assert(node.childCount * 2 === this.nodeCapacity && sumCount === this.nodeCapacity + 1);
return;
}
if (node === left) { // move some children from right to left
left.moveFromNext(shift);
} else { // move some children from left to right
left.moveToNext(shift);
}
return;
}
// Merge:
// Move all data from the right to the left
left.moveFromNext(right.childCount);
// Prepare to delete right node
node = right.parent;
index = right.index();
}
}
insertItemAt(offset, value) {
let [node, index] = this.locate(offset);
while (node.childCount === this.nodeCapacity) { // No room here
if (index === 0 && node.prev && node.prev.childCount < this.nodeCapacity) {
return node.prev.basicInsert(node.prev.childCount, value);
}
// Check whether we can redistribute (to avoid a split)
if (node !== this.root) {
let [left, right] = node.pairWithSmallest();
let joinedIndex = left === node ? index : left.childCount + index;
let sumCount = left.childCount + right.childCount + 1;
if (sumCount <= 2 * this.nodeCapacity) { // redistribute
let childCount = sumCount >> 1;
if (node === right) { // redistribute to the left
let insertInLeft = joinedIndex < childCount;
left.moveFromNext(childCount - left.childCount - +insertInLeft);
} else { // redistribute to the right
let insertInRight = index >= sumCount - childCount;
left.moveToNext(childCount - right.childCount - +insertInRight);
}
if (joinedIndex > left.childCount ||
joinedIndex === left.childCount && left.childCount > right.childCount) {
right.basicInsert(joinedIndex - left.childCount, value);
} else {
left.basicInsert(joinedIndex, value);
}
return;
}
}
// Cannot redistribute: split node
let childCount = node.childCount >> 1;
// Create a new node that will later become the right sibling of this node
let sibling = new Node(childCount);
// Move half of node node's data to it
sibling.moveFrom(node, 0, childCount, childCount);
// Insert the value in either the current node or the new one
if (index > node.childCount) {
sibling.basicInsert(index - node.childCount, value);
} else {
node.basicInsert(index, value);
}
// Is this the root?
if (!node.parent) {
// ...then first create a parent, which is the new root
this.root = new Node(2);
this.root.basicInsert(0, node);
}
// Prepare for inserting the sibling node into the tree
index = node.index() + 1;
node = node.parent;
value = sibling;
}
node.basicInsert(index, value);
}
/* Below this point: these methods are optional */
* [Symbol.iterator]() { // Make tree iterable
let i = 0;
for (let node = this.first; node; node = node.next) {
for (let i = 0; i < node.childCount; i++) yield node.children[i];
}
}
print() {
console.log(this.root && this.root.toString());
}
verify() {
// Raise an error when the tree violates one of the required properties
if (!this.root) return; // An empty tree is fine.
if (this.root.parent) throw "root should not have a parent";
// Perform a breadth first traversal
let q = [this.root];
while (q.length) {
if (q[0].isLeaf() && this.first !== q[0]) throw "this.first is not pointing to first leaf";
let level = [];
let last = null;
for (let parent of q) {
if (!(parent instanceof Node)) throw "parent is not instance of Node";
if (parent.children.length > this.nodeCapacity) throw "node's children array is too large";
if (parent.childCount > 0 && parent.childCount * 2 <= parent.children.length) throw "node's fill ratio is too low";
for (let i = parent.childCount; i < parent.children.length; i++) {
if (parent.children[i] !== null) throw "child beyond childCount should be null but is not";
}
let treeSize = parent.treeSize;
if (parent.isLeaf()) {
for (let value of parent.children.slice(0, parent.childCount)) {
if (value === null) throw "leaf has a null as value";
if (value instanceof Node) throw "leaf has a Node as value";
}
if (parent.treeSize !== parent.childCount) throw "leaf has mismatch in treeSize and childCount";
} else {
for (let node of parent.children.slice(0, parent.childCount)) {
if (node === null) throw "internal node has a null as value";
if (!(node instanceof Node)) throw "internal node has a non-Node as value";
if (node.parent !== parent) throw "wrong parent";
if (node.prev !== last) throw "prev link incorrect";
if (last && last.next !== node) throw "next link incorrect";
if (last && last.children.length + node.children.length <= this.nodeCapacity) {
throw "two consecutive siblings have a total number of children that is too small";
}
if (node.childCount * 2 < this.nodeCapacity) {
throw "internal node is too small: " + node;
}
level.push(node);
last = node;
treeSize -= node.treeSize;
}
if (treeSize) throw "internal node treeSize sum mismatches";
}
}
if (last && last.next) throw "last node in level has a next reference";
q = level;
}
}
test(count=100, option=3) {
// option:
// 0 = always insert & delete at left side (offset 0)
// 1 = always insert & delete at right side
// 2 = always insert & delete at middle
// 3 = insert & delete at random offsets
// Create array to perform the same operations on it as on the tree
let arr = [];
// Perform a series of insertions
for (let i = 0; i < count; i++) {
// Choose random insertion index
let index = Array.isArray(option) ? option[i] : [0, i, i >> 1, Math.floor(Math.random() * (i+1))][option];
// Perform same insertion in array and tree
arr.splice(index, 0, i);
this.insertItemAt(index, i);
// Verify tree consistency and properties
this.verify();
// Verify the order of values in the array is the same as in the tree
if (arr+"" !== [...this]+"") throw i + ": tree not same as array";
}
// Perform a series of updates
for (let i = 0; i < count; i++) {
// Choose random update index
let index = Math.floor(Math.random() * count);
// Perform same insertion in array and tree
arr[index] += count;
this.setItemAt(index, this.getItemAt(index) + count);
// Verify tree consistency and properties
this.verify();
// Verify the order of values in the array is the same as in the tree
if (arr+"" !== [...this]+"") throw "tree not same as array";
}
// Perform a series of deletions
for (let i = arr.length - 1; i >= 0; i--) {
// Choose random deletion index
let index = [0, i, i >> 1, Math.floor(Math.random() * (i+1))][option];
// Perform same deletion in array and tree
arr.splice(index, 1);
this.removeItemAt(index);
// Verify tree consistency and properties
this.verify();
// Verify the order of values in the array is the same as in the tree
if (arr+"" !== [...this]+"") throw "tree not same as array";
}
}
}
// Perform 1000 insertions, 1000 updates, and 1000 deletions on a tree with node capacity of 8
new Tree(8).test(1000);
console.log("all tests completed");
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With