
How to find string template from formatted strings?

Suppose I have a string template, e.g.,

string="This is a {object}"

Now I create two (or more) strings by formatting this template, i.e.,

string.format(object="car")
=>"This is a car"

string.format(object="2020-06-05 16:06:30")
=>"This is a 2020-06-05 16:06:30"

Now I have lost the original string somehow. Is there a way to find out the original string using the 2 new strings that I have now?

Note: I have a data set of such strings that were created from a template, but the original template was lost during editing. New strings were then created from the new template and added to the same data set. I have tried an ML-based approach, but it does not seem to work in the general case. I am looking for an algorithm that gives me back the original template; the result could be one string or a group of strings if the template has been changed multiple times.

Akhil Garg asked Nov 06 '22 07:11



1 Answer

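One possibility is to split each input string into tokens (plain words, plus formatted values such as timestamps) and then compare the tokens position by position. This assumes every string splits into the same number of tokens: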

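import re

def get_vals(s):
    # Tokenize: a "date time" pair counts as one token, anything else is split into words.
    return re.findall(r'[\d\-]+\s[\d:]+|\w+', s)

vals = ["This is a car", "This is a 2020-06-05 16:06:30"]
# Align tokens by position; a position whose tokens differ becomes the placeholder.
r = ' '.join('{object}' if len(set(i)) > 1 else i[0] for i in zip(*map(get_vals, vals)))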

Output:

'This is a {object}'
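The positional comparison above relies on `zip`, which silently truncates when the strings produce different numbers of tokens, and on a regex tuned to one timestamp format. As a rough sketch of a more general variant (not part of the original answer), the strings can be split on whitespace and aligned with `difflib.SequenceMatcher` from the standard library, so a value may span any number of tokens. The names `PLACEHOLDER`, `merge_pair`, and `infer_template` are hypothetical, and the original field name (`object`) cannot be recovered from the data, so an anonymous `{}` is used instead.

```python
from difflib import SequenceMatcher
from functools import reduce

# Hypothetical marker: the real field name is not recoverable from the strings.
PLACEHOLDER = "{}"

def merge_pair(a, b):
    """Merge two token lists: keep shared runs, mark differing runs."""
    merged = []
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            merged.extend(a[i1:i2])
        elif not merged or merged[-1] != PLACEHOLDER:
            # replace/insert/delete: collapse the whole run into one placeholder
            merged.append(PLACEHOLDER)
    return merged

def infer_template(strings):
    """Fold merge_pair over all strings (assumes at least one string)."""
    token_lists = [s.split() for s in strings]
    return " ".join(reduce(merge_pair, token_lists))

print(infer_template(["This is a car", "This is a 2020-06-05 16:06:30"]))
# This is a {}
```

This sketch assumes all input strings come from a single template. Strings produced by different templates would first have to be grouped, for example by similarity, before each group is folded this way.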
Ajax1234 answered Nov 14 '22 10:11
