Consider you have several patterns of dates P1 - Pn.
Some of them are simple like P1 - all Mondays, P2 - all Tuesdays; others are more complex like P4 - all working days etc.
For custom array of dates (V1, V2) I have to create the shortest result string, as it is shown on the picture:
For any array we have to create string which will represent dates in array. The simplest method is to create string like 1.5.2013, 2.5.2013, 3.5.2013 ... But the result string will be very long.
Using several predefined patterns we can create shorter result string.
For result string I use following rules:
Single date format: DD.MM.YYYY (10 characters)
Enumeration (dates and patterns): comma and space (2 characters)
Interval of dates: DD.MM.YYYY-DD.MM.YYYY (21 characters)
Interval of pattern names: Px-Py (5 characters)
Special words: except (6 characters)
Examples of result strings:
V1 using P4 pattern:
P4 except 01.05.2013-03.05.2013, 09.05.2013, 10.05.2013, 16.05.2013, 17.05.2013 (80 characters)
V1 using Pn pattern:
Pn 06.05.2013-08.05.2013, 13.05.2013-15.05.2013, 20.05.2013-24.05.2013, 27.05.2013-31.05.2013 (94 characters)
V1 using best patterns match:
P1-P3 01.05.2013-19.05.2013, P4 20.05.2013-31.05.2013 (54 characters)
The main goal is to create the shortest result string. As I understand we can achieve this by finding the best matching pattern/patterns.
Currently I'm trying to adapt knapsack problem and longest common subsequence problem, but I'm not sure if it is the right direction.
I would appreciate any ideas.
updated
Thanks to Jan Dvorak for his extra short description of my problem:
The goal is to describe V using a predefined dictionary (P1..Pn and all intervals and single dates) where intersection, union and subtraction are all allowed and each operation and atom have a predefined cost (number of characters in result string).
The so-called naive or brute force algorithm is the most intuitive approach to the string pattern-matching problem. This algorithm attempts simply to match the pattern in the target at successive positions from left to right.
Two of the most widely used pattern matching algorithms are the Naive Algorithm for pattern matching and the pattern matching algorithm using finite automata.
Pattern matching is the process of checking whether a specific sequence of characters/tokens/data exists among the given data. Regular programming languages make use of regular expressions (regex) for pattern matching.
Knuth Morris Pratt (KMP) is an algorithm, which checks the characters from left to right. When a pattern has a sub-pattern appears more than one in the sub-pattern, it uses that property to improve the time complexity, also for in the worst case. The time complexity of KMP is O(n).
After long time of searching we finally found solution which is very close to what we want.
http://www.fpz.unizg.hr/traffic/index.php/PROMTT/article/view/1287
Thanks for all how participated.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With