
Automatically generate tags from strings with JavaScript

I need to -automatically- generate tags for a text string. In this case, I'll use this string:

var text = 'This text talks about loyalty in the Royal Family with Prince Charles';

My current implementation generates tags for words that are 6+ characters long, and it works fine.

var words = text.replace(/[^a-zA-Z\s]/g, ''); // strip everything except ASCII letters and whitespace
words = words.match(/\w{6,}/g);               // keep words of 6 or more characters
console.log(words);

This will return:

["loyalty","Family","Prince","Charles"]

The problem is that sometimes, a tag should be a specific set of words. I need the result to be:

["loyalty","Royal Family","Prince Charles"]

That means the replace/match code should test for:

  1. words that are 6 or more characters long; and/or
  2. sets of consecutive words that each start with an uppercase letter; those words should be joined together in the same array element. It doesn't matter if some of the words are less than 6 characters long, but at least one of them has to be 6+, e.g.: "Stop at The UK Guardian in London" should return ["The UK Guardian", "London"]

I'm obviously having trouble with the second requirement. Any ideas? Thanks!

Andres SK asked Apr 19 '26 20:04


1 Answer

var text = 'This text talks about loyalty in the Royal Family with Prince Charles. Stop at The UK Guardian in London';

text.match(/(([A-Z]\w*\s*){2,})|(\w{6,})/g)

will return

["loyalty", "Royal Family ", "Prince Charles", "The UK Guardian ", "London"]
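Note that some of the matches above end in a space ("Royal Family "), because each capitalized word is matched together with the whitespace that follows it. One possible way around this, sketched here as an assumption rather than as part of the original answer, is a variant pattern that only allows whitespace between capitalized words:

```javascript
// Variant pattern (an assumption, not the pattern used in the answer above):
// capitalized words must be separated by whitespace, so a match never ends in a space.
//   [A-Z]\w*(?:\s+[A-Z]\w*)+   two or more capitalized words in a row
//   \w{6,}                     otherwise, any single word of 6+ characters
var text = 'This text talks about loyalty in the Royal Family with Prince Charles. Stop at The UK Guardian in London';
var pattern = /[A-Z]\w*(?:\s+[A-Z]\w*)+|\w{6,}/g;

// match() returns null when nothing matches, so fall back to an empty array
var tags = text.match(pattern) || [];
console.log(tags);
// ["loyalty", "Royal Family", "Prince Charles", "The UK Guardian", "London"]
```

Keep in mind that `\w` and `[A-Z]` only cover ASCII letters, so accented words would need a different character class.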

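To enforce the rest of the second requirement (at least one word in a group must be 6+ characters long), it's better to run another regexp over the matches found:

var text = 'This is a Short Set Of Words about the Royal Family';

var matches = text.match(/(([A-Z]\w*\s*){2,})|(\w{6,})/g);
// keep only matches that contain at least one word of 6+ characters
var tags = matches.filter(function(m) {
    return /\w{6,}/.test(m);
});
console.log(tags); // ["Royal Family"]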
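The two steps can also be wrapped in a single helper. This is a minimal sketch: the function name `generateTags` and the `minLength` parameter are hypothetical, and the added `trim()` call is there only to drop the trailing space the pattern can capture.

```javascript
// Hypothetical helper: the name generateTags and the minLength parameter are assumptions.
function generateTags(text, minLength) {
    minLength = minLength || 6;
    // A single word of at least minLength characters (no "g" flag, so test() is stateless)
    var longWord = new RegExp('\\w{' + minLength + ',}');
    // Same pattern as above: runs of capitalized words, or one long word
    var candidates = new RegExp('(([A-Z]\\w*\\s*){2,})|(\\w{' + minLength + ',})', 'g');

    return (text.match(candidates) || [])              // match() returns null on no match
        .map(function (m) { return m.trim(); })          // drop trailing whitespace
        .filter(function (m) { return longWord.test(m); }); // require one long word per tag
}

console.log(generateTags('Stop at The UK Guardian in London'));
// ["The UK Guardian", "London"]
console.log(generateTags('This is a Short Set Of Words about the Royal Family'));
// ["Royal Family"]
```

Building the patterns with `new RegExp` keeps the length threshold in one place; with a fixed threshold of 6, regex literals would work just as well.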
georg answered Apr 21 '26 10:04


