I fetch a file and parse its contents into an array. I already filter out words shorter than 4 characters with if(strlen($value) < 4): unset($content[$key]); endif;
My question is this: I want to remove common words from the array, but there are quite a few of them. Instead of running these checks over and over on each array value, is there a more efficient way to do this?
Here's a sample of the code I am currently using. This list could get huge, and I think there has to be a better (more efficient) way.
foreach ($content as $key=>$value) {
if(strlen($value) < 4): unset($content[$key]); endif;
if($value == 'that'): unset($content[$key]); endif;
if($value == 'have'): unset($content[$key]); endif;
if($value == 'with'): unset($content[$key]); endif;
if($value == 'this'): unset($content[$key]); endif;
if($value == 'your'): unset($content[$key]); endif;
if($value == 'will'): unset($content[$key]); endif;
if($value == 'they'): unset($content[$key]); endif;
if($value == 'from'): unset($content[$key]); endif;
if($value == 'when'): unset($content[$key]); endif;
if($value == 'then'): unset($content[$key]); endif;
if($value == 'than'): unset($content[$key]); endif;
if($value == 'into'): unset($content[$key]); endif;
}
Here's how I'd do it:
$excluded_words = array('that', 'have', 'with', 'this', 'your', 'will', 'they', 'from', 'when', 'then', 'than', 'into');
$replace = array_fill_keys($excluded_words, '');
echo str_replace(array_keys($replace), $replace, 'some words that have to be with this your will they have from when then that into replaced');
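The way it works: make an array full of empty strings, where the keys are the substrings you want to remove/replace. Then just use str_replace, passing the keys as the first argument and the array itself as the second. The result in this case is: some words to be replaced (the spaces around the removed words stay in the string, although a browser collapses them when rendering). This code has been tested and works just fine. Keep in mind that str_replace matches substrings, not whole words, so a word like within would be cut down to in.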
When dealing with an array, just implode it with some wacky delimiter (like %@%@% or something), str_replace the lot, explode it again, and Bob's your uncle.
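A minimal sketch of that implode / str_replace / explode round trip, in case it helps. The sample words and the variable names ($content, $excluded, $delimiter) are made up for illustration, and it assumes the delimiter never occurs inside a word. Removed words leave empty strings behind after the explode, so the sketch filters those out at the end.

```php
<?php
// Hypothetical sample data standing in for the parsed file contents.
$content = array('some', 'words', 'that', 'have', 'meaning', 'with', 'this', 'text');
$excluded = array('that', 'have', 'with', 'this');

// Assumption: this delimiter never appears inside any word.
$delimiter = '%@%@%';

// Join the array, strip every excluded string in one call, then split again.
$joined = implode($delimiter, $content);
$stripped = str_replace($excluded, '', $joined);
$parts = explode($delimiter, $stripped);

// Each removed word leaves an empty string behind; drop those and reindex.
$filtered = array_values(array_filter($parts, function ($word) {
    return $word !== '';
}));

print_r($filtered);
```

Because str_replace works on substrings, this only behaves as intended when no kept word contains an excluded word.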
When it comes to removing all words with fewer than 4 characters (which I forgot about in my original answer), that's something a regex is good at... I'd say something like preg_replace('/(\b|[^a-z])[a-z]{1,3}(\b|[^a-z])/i','$1$2',implode(',',$targetArray)); or something like that.
You might want to test this one out, because this is just off the top of my head and untested. But it should be enough to get you started.
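Here is a rough sketch of the same length test done with a regex, but applied per element with preg_grep instead of on an imploded string. That is a swapped-in variant, not the exact call above. The sample array and the names $words and $longWords are assumptions for illustration.

```php
<?php
// Hypothetical sample data standing in for the parsed file contents.
$words = array('a', 'to', 'the', 'word', 'list', 'is', 'filtered');

// The pattern matches entries made of 1 to 3 letters only.
// PREG_GREP_INVERT returns the entries that do NOT match, i.e. the longer words.
$longWords = preg_grep('/^[a-z]{1,3}$/i', $words, PREG_GREP_INVERT);

// preg_grep preserves the original keys; reindex if sequential keys are needed.
$longWords = array_values($longWords);

print_r($longWords);
```

Working per element avoids having to pick a delimiter and split the string again afterwards.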
Maybe this will be better:
$filter = array("that", "have", "with" /* ... */);
foreach ($content as $key => $value) {
    // Remove the word if it appears in the list of common words.
    if (in_array($value, $filter)) {
        unset($content[$key]);
    }
}
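If the list of common words gets large, one possible refinement is to flip the filter array once and test with isset, since in_array scans the list on every call while isset is a key lookup. This is only a sketch: the sample data and the names $lookup and $result are made up, and it folds the length check from the question into the same loop.

```php
<?php
// Hypothetical sample data standing in for the parsed file contents.
$content = array('some', 'of', 'that', 'text', 'with', 'words', 'the', 'into');
$filter = array('that', 'have', 'with', 'this', 'into');

// Flip once so the excluded words become array keys.
$lookup = array_flip($filter);

$result = array();
foreach ($content as $key => $value) {
    // Keep words of 4 or more characters that are not in the exclusion list.
    if (strlen($value) >= 4 && !isset($lookup[$value])) {
        $result[$key] = $value;
    }
}

print_r($result);
```

The original keys are preserved here, the same as with unset; wrap the result in array_values if you need them reindexed.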