
Count how often each word occurs in a text in PHP

Tags:

php

In PHP, I need to load a file, get all of its words, and echo each word along with the number of times it appears in the text. The words should also be listed in descending order, with the most used words on top.

Klanestro asked Jan 23 '10 13:01



3 Answers

Here's an example:

$text = "A very nice únÌcÕdë text. Something nice to think about if you're into Unicode.";

// $words = str_word_count($text, 1); // use this function if you only want ASCII
$words = utf8_str_word_count($text, 1); // use this function if you care about i18n

$frequency = array_count_values($words);

arsort($frequency);

echo '<pre>';
print_r($frequency);
echo '</pre>';

The output (the relative order of words with the same count may vary between PHP versions):

Array
(
    [nice] => 2
    [if] => 1
    [about] => 1
    [you're] => 1
    [into] => 1
    [Unicode] => 1
    [think] => 1
    [to] => 1
    [very] => 1
    [únÌcÕdë] => 1
    [text] => 1
    [Something] => 1
    [A] => 1
)

And the utf8_str_word_count() function, if you need it:

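function utf8_str_word_count($string, $format = 0, $charlist = null)
{
    $result = array();

    // A word is a run of letters, combining marks, dashes and apostrophes (' or U+2019),
    // plus any extra characters passed in $charlist. The cast avoids handing null to
    // preg_quote(), which is deprecated as of PHP 8.1.
    $pattern = '~[\p{L}\p{Mn}\p{Pd}\'\x{2019}' . preg_quote((string) $charlist, '~') . ']+~u';

    if (preg_match_all($pattern, $string, $matches) > 0)
    {
        $result = $matches[0]; // full matches, one entry per word
    }

    // Format 0 returns the number of words; any other value returns the list of words.
    if ($format == 0)
    {
        $result = count($result);
    }

    return $result;
}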
Alix Axel answered Sep 23 '22 01:09



$words = str_word_count($text, 1);
$word_frequencies = array_count_values($words);
arsort($word_frequencies);
print_r($word_frequencies);
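
Since the question also asks to load the text from a file and echo each word with its count, here is a minimal sketch of how the lines above might be wired together. The temporary file and its sample contents are assumptions made only so the example runs on its own; in practice $path would point at your own file.

```php
<?php
// Hypothetical input file, created here only so the sketch is self-contained.
$path = tempnam(sys_get_temp_dir(), 'words');
file_put_contents($path, "the cat and the dog and the bird");

// Load the whole file into a string.
$text = file_get_contents($path);
unlink($path);

if ($text === false)
{
    exit("Could not read the input file\n");
}

$words = str_word_count($text, 1);               // list of words (ASCII only)
$word_frequencies = array_count_values($words);  // word => number of occurrences
arsort($word_frequencies);                       // highest counts first

// Echo one "word: count" line per word, most used words on top.
foreach ($word_frequencies as $word => $count)
{
    echo $word . ': ' . $count . "\n";
}
```

For very large files, reading line by line and merging the counts would use less memory than file_get_contents(), but the counting and sorting steps stay the same.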
goat answered Sep 21 '22 01:09



This function uses a regex to find words (you may want to change the pattern, depending on how you define a word):

function count_words($text)
{
    $output = $words = array();
    preg_match_all("/[A-Za-z'-]+/", $text, $words); // Find words in the text

    foreach ($words[0] as $word)
    {
        if (!array_key_exists($word, $output))
            $output[$word] = 0;

        $output[$word]++; // Every time we find this word, we add 1 to the count
    }

    return $output;
}

This iterates over each word, constructing an associative array (with the word as the key) whose value is the number of occurrences of that word (e.g. $output['hello'] = 3 means that hello occurred 3 times in the text).

You might want to change the function to be case-insensitive: as written, it treats 'hello' and 'Hello' as two different words.
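
One possible way to handle the case issue is to lower-case every match before counting. The sketch below is an assumption about how that could look, not part of the original answer: count_words_ci() is a hypothetical name, it reuses the same ASCII word pattern, and it also sorts the result so the most used words come first.

```php
<?php
// Hypothetical case-insensitive variant of count_words(); the name is illustrative.
function count_words_ci($text)
{
    $words = array();
    preg_match_all("/[A-Za-z'-]+/", $text, $words); // same word pattern as count_words()

    // Lower-case every word so 'Hello' and 'hello' share one array key.
    $lowercased = array_map('strtolower', $words[0]);

    $output = array_count_values($lowercased); // word => number of occurrences
    arsort($output);                           // most frequent words first

    return $output;
}

$counts = count_words_ci("Hello world, hello PHP. HELLO again!");
print_r($counts);
```

strtolower() is enough here because the pattern only matches ASCII letters; with a Unicode-aware pattern, mb_strtolower() would be the safer choice.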

robbo answered Sep 21 '22 01:09
