I have a keyword list and a blacklist. I want to delete all keywords that contain any blacklist item. At the moment I'm doing it this way:
my @keywords = ( 'some good keyword', 'some other good keyword', 'some bad keyword');
my @blacklist = ( 'bad' );
A: for my $keyword ( @keywords ) {
    B: for my $bl ( @blacklist ) {
        next A if $keyword =~ /$bl/i; # omitting $keyword
    }
    # some keyword cleaning (for instance: erasing non a-zA-Z0-9 characters, etc)
}
I was wondering whether there is a faster way to do this, because at the moment I have about 25 million keywords and a couple of hundred words in the blacklist.
The most straightforward option is to join
the blacklist entries into a single regular expression, then grep
the keyword list for those which don't match that regex:
#!/usr/bin/env perl
use strict;
use warnings;
use 5.010;
my @keywords =
    ('some good keyword', 'some other good keyword', 'some bad keyword');
my @blacklist = ('bad');
my $re = join '|', @blacklist;
$re = qr/$re/i;    # compile once; /i keeps the case-insensitive match from the question
my @good = grep { $_ !~ $re } @keywords;
say join "\n", @good;
Output:
some good keyword
some other good keyword
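If the blacklist entries are meant as literal words rather than regex fragments, it may be worth escaping them before joining, and the cleaning step from the question can run on the surviving keywords in the same pass. The following is only a sketch under those assumptions: build_blacklist_re and filter_keywords are hypothetical helper names, and the character class in the cleaning step is just one reading of "erasing non a-zA-Z0-9 characters" that also keeps spaces.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: build one case-insensitive regex from literal words.
# Returns undef for an empty blacklist so the caller can skip filtering;
# an empty alternation would otherwise match every keyword.
sub build_blacklist_re {
    my @words = @_;
    return undef unless @words;
    # quotemeta escapes regex metacharacters such as '+' or '.'
    my $alternation = join '|', map { quotemeta($_) } @words;
    return qr/$alternation/i;
}

# Hypothetical helper: drop blacklisted keywords, then clean the survivors.
sub filter_keywords {
    my ($re, @keywords) = @_;
    my @good = @keywords;
    @good = grep { $_ !~ $re } @good if defined $re;
    # Assumed cleaning rule: keep only letters, digits and spaces.
    s/[^a-zA-Z0-9 ]//g for @good;
    return @good;
}

my $re   = build_blacklist_re('bad', 'c++');
my @good = filter_keywords(
    $re,
    'some good keyword',
    'some BAD keyword',
    'learn C++ fast',
    'other good keyword!',
);
print "$_\n" for @good;
```

With 25 million keywords it might also help to read them from a file handle line by line and apply the same regex to each line, rather than holding the whole list in memory, though whether that matters depends on how the keywords are stored.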