
How to automagically create a pattern based on real data?

I have many vendors in a database, and they all differ in some aspect of their data. I'd like to create a data validation rule that is based on previous data.

Example:

A: XZ-4, XZ-23, XZ-217
B: 1276, 1899, 22711
C: 12-4, 12-75, 12

Goal: if a user inputs the string 'XZ-217' for vendor B, the algorithm should compare it with previous data and say: this string is not similar to vendor B's previous data.

Is there a good way or tool to achieve such a comparison? The answer could be a generic algorithm or a Perl module.

Edit: "Similarity" is hard to define, I agree. But I'd like to find an algorithm that could analyze roughly the previous 100 samples and then compare the outcome of that analysis with new data. Similarity may be based on length, on the use of characters/numbers, on string creation patterns, on a similar beginning/end/middle, or on the presence of separators.
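To make the criteria above a bit more concrete, here is a minimal Python sketch of one possible approach: collapse each previous value into a coarse "shape" (runs of letters become `A`, runs of digits become `9`, separators are kept) and check whether a new value has a shape already seen for that vendor. The function names `shape` and `looks_familiar` and the `min_share` threshold are hypothetical, chosen only for illustration; this is not an established library API.

```python
import re
from collections import Counter


def shape(value: str) -> str:
    """Collapse a string to a coarse shape.

    Runs of ASCII letters become 'A', runs of digits become '9';
    every other character (e.g. a '-' separator) is kept as-is.
    """
    value = re.sub(r"[A-Za-z]+", "A", value)
    value = re.sub(r"[0-9]+", "9", value)
    return value


def looks_familiar(candidate: str, samples: list[str], min_share: float = 0.05) -> bool:
    """Return True if the candidate's shape accounts for at least
    min_share of the shapes seen in the previous samples.

    min_share is an assumed, arbitrary threshold.
    """
    shapes = Counter(shape(s) for s in samples)
    total = sum(shapes.values())
    if total == 0:
        return False  # nothing to compare against
    return shapes[shape(candidate)] / total >= min_share


# Sample data taken from the question.
vendors = {
    "A": ["XZ-4", "XZ-23", "XZ-217"],
    "B": ["1276", "1899", "22711"],
    "C": ["12-4", "12-75", "12"],
}

print(looks_familiar("XZ-217", vendors["B"]))  # shape 'A-9' vs. only '9' for B
print(looks_familiar("XZ-217", vendors["A"]))
```

This sketch deliberately ignores length and literal prefixes such as `XZ` or `12`; those could be added as further features (for example, keeping run lengths in the shape) if the coarse shape turns out to be too permissive.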

I feel it is not an easy task, but on the other hand I think it has very wide use, so I hoped there would already be some hints.

w.k asked Oct 10 '22 00:10


1 Answer

You may want to peruse http://en.wikipedia.org/wiki/String_metric and, for instance, http://search.cpan.org/dist/Text-Levenshtein/Levenshtein.pm
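As a rough illustration of how a string metric could be applied here, the following Python sketch computes the Levenshtein (edit) distance with the standard dynamic-programming recurrence and reports how far a new value is from its closest previous sample. It is a plain reimplementation for illustration, not the API of the Text::Levenshtein Perl module linked above; `nearest_distance` is a hypothetical helper name.

```python
def levenshtein(a: str, b: str) -> int:
    """Number of single-character insertions, deletions and
    substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def nearest_distance(candidate: str, samples: list[str]) -> int:
    """Smallest edit distance from candidate to any previous sample.

    Assumes samples is non-empty.
    """
    return min(levenshtein(candidate, s) for s in samples)


# Sample data taken from the question.
vendor_a = ["XZ-4", "XZ-23", "XZ-217"]
vendor_b = ["1276", "1899", "22711"]

print(nearest_distance("XZ-217", vendor_a))
print(nearest_distance("XZ-217", vendor_b))
```

A larger nearest distance suggests the value is unlike what the vendor has used before, but any cut-off would have to be tuned on real data; raw edit distance also treats `1276` and `9999` as quite different even though both are plain numbers, so it may work best combined with a coarser pattern check.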

Alien Life Form answered Oct 13 '22 11:10