I've a set of elements (potentially big) with an order relation:
[a,b,c,d,e,f]
and a set of frequent patterns (potentially big) with ids:
[a]:1,[b]:2,[c]:3,[a,b]:4,[b,c]:5,[a,b,c]:6
I have a sequence of ordered sets:
[a,b], [e], [c], [e,f], [a,b,c]
I want to match every set in the sequence with the ids of the corresponding patterns:
[a,b]:{1,2,4}, [e]:{}, [c]:{3}, [a,b,c]:{1,2,3,4,5,6}
My goal is to limit the number of passes over the sequence so I want to build a data structure I can use during the scan. I'm thinking of a prefix tree:
──null
├──a : 1
| |
| └──b : 4
| |
| └──c : { 5, 6 }
|
├──b : 2
| |
| └──c : 5
|
└──c : 3
I scan a set in the sequence and pass it trough the tree multiple times recursively (set, set.tail, set.tail.tail...), every time I reach a node I add the corresponding ids to an array.
Do I miss any peculiar case in my reasoning (just realized I have to put multiple ids for nodes of depth>2 if I don't want to miss [a,c] if [a,b,c] exist in the set) ?
Is there any more sophisticated data structure I can use to improve the processing time ?
Edit : In fact at depth n, I need 2^(n-2) ids with my method (considering my tree is dense). I'm not sure it's a valid way to do it...
Edit2 : another approach merging bitmaps of each single element in the sequence to build each pattern (as used in SPADE algorithm).
a : [1,0,0,0,1]
b : [0,1,0,0,1]
ab : [0,0,0,0,1]
with some array manipulations, I should be able to match this with the elements of my initial array.
If you are building a prefix tree (aka trie), all nodes are unique, so the prefix tree for the set {a,b,c} in alphabetic order with continuity look like this:
──null
├──a : 1
| |
| └──b : 4
| |
| └──c : 6
|
├──b : 2
| |
| └──c : 5
|
└──c : 3
And it maps to the prefix set { a, b, c, ab, bc, abc }.
The tree space complexity is SUM k for k = 1..N ~ O(N^2).
Node.java
class Node
{
public String str;
public ArrayList<String> child;
public Node (String str)
{
this.str = str;
this.child = new ArrayList<String>();
}
}
MyTree.java
class MyTree
{
Node head;
..
public void build_tree(String [] symbol)
{
this.head = new Node("");
build(symbol,head,0,symbol.length-1);
}
// build the prefix tree through DFS
private void build(String [] symbol, Node parent, int start, int end)
{
Node ch = null;
for (int i=start; i<=end; i++)
{
ch = new Node(symbol[i]);
parent.child.add(ch);
if (end - start > 0)
{
build(symbol,ch,i,end);
}
}
}
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With