Flesch-Kincaid Readability: Improve PHP function

I wrote this PHP code to implement the Flesch-Kincaid Readability Score as a function:

function readability($text) {
    $total_sentences = 1; // one full stop = two sentences => start with 1
    $punctuation_marks = array('.', '?', '!', ':');
    foreach ($punctuation_marks as $punctuation_mark) {
        $total_sentences += substr_count($text, $punctuation_mark);
    }
    $total_words = str_word_count($text);
    $total_syllables = 3; // assuming this value since I don't know how to count them
    $score = 206.835-(1.015*$total_words/$total_sentences)-(84.6*$total_syllables/$total_words);
    return $score;
}

Do you have any suggestions on how to improve the code? Is it correct? Will it work?

I hope you can help me. Thanks in advance!

446

asked Jul 02 '09 21:07

caw

1 Answer

The code looks fine as far as a heuristic goes. Here are some points to consider, each of which makes the quantities you need to calculate considerably harder for a machine:

What is a sentence?

Seriously, what is a sentence? We have periods, but they also appear in Ph.D., e.g., i.e., Y.M.C.A., and other places that don't end a sentence. When you consider exclamation points, question marks, and ellipses, you're really doing yourself a disservice by assuming a period will do the trick. I've looked at this problem before, and if you really want a more reliable count of sentences in real text, you'll need to parse the text. Parsing can be computationally intensive and time-consuming, and free resources for it are hard to find. In the end, you still have to worry about the error rate of the particular parser implementation. However, only full parsing will tell you what's a sentence boundary and what's just one of the period's many other uses. Furthermore, if you're using text 'in the wild' -- such as, say, HTML -- you'll also have to worry about sentences that end not with punctuation but with a closing tag. For instance, many sites don't add punctuation to h1 and h2 tags, but their contents are clearly separate sentences or phrases.
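To make the abbreviation problem concrete, here is a minimal sketch of a sentence counter that masks a few known abbreviations before splitting. The function name `count_sentences` and the abbreviation list are assumptions made up for illustration; this is nowhere near a parser, and unlike the code in the question it does not treat a colon as a sentence boundary.

```php
<?php
// Hypothetical helper: counts sentences while skipping a few known abbreviations.
// The abbreviation list is an illustrative assumption, not a complete resource.
function count_sentences(string $text): int {
    $abbreviations = array('Ph.D.', 'e.g.', 'i.e.', 'Y.M.C.A.', 'Dr.', 'Mr.', 'Mrs.');
    // Mask the periods inside known abbreviations so they are not counted.
    foreach ($abbreviations as $abbreviation) {
        $masked = str_replace('.', '', $abbreviation);
        $text = str_ireplace($abbreviation, $masked, $text);
    }
    // A run of terminal punctuation ("...", "?!") counts as one boundary.
    $parts = preg_split('/[.?!]+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    // Ignore fragments that contain no letters or digits (e.g. trailing whitespace).
    $sentences = array_filter($parts, function ($part) {
        return preg_match('/[\p{L}\p{N}]/u', $part) === 1;
    });
    return count($sentences);
}
```

Even this small sketch shows the trade-off: an abbreviation that really does end a sentence ("She has a Ph.D. He does not.") now hides a boundary, so the count errs in the other direction.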

Syllables aren't something we should be approximating

This is a major hallmark of this readability heuristic, and it's the one that makes it most difficult to implement. Computational analysis of syllable count in a work requires the assumption that the intended reader speaks the same dialect as whatever your syllable count generator was trained on. How sounds fall around a syllable is actually a major part of what makes accents accents. If you don't believe me, try visiting Jamaica sometime. What this means is that even if a human were to do the calculations by hand, it would still be a dialect-specific score.
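For reference, the kind of approximation people usually reach for looks like the sketch below: count runs of vowels in the spelling. The function names `estimate_syllables` and `count_syllables` are hypothetical, and the rules (silent trailing "e", "-le" endings) are assumptions about English spelling, not a model of how anyone actually pronounces the words.

```php
<?php
// Hypothetical helper: estimates syllables by counting vowel groups.
// This is an English-spelling heuristic and will be wrong for many words.
function estimate_syllables(string $word): int {
    $word = strtolower(preg_replace('/[^a-z]/i', '', $word));
    if ($word === '') {
        return 0;
    }
    // Drop a trailing silent "e" ("make"), but keep "-le" endings ("table").
    if (strlen($word) > 2 && substr($word, -1) === 'e' && substr($word, -2) !== 'le') {
        $word = substr($word, 0, -1);
    }
    // Each run of consecutive vowels (including "y") is treated as one syllable.
    $groups = preg_match_all('/[aeiouy]+/', $word);
    return max(1, $groups);
}

// Sums the estimate over every word that str_word_count() finds in the text.
function count_syllables(string $text): int {
    $total = 0;
    foreach (str_word_count($text, 1) as $word) {
        $total += estimate_syllables($word);
    }
    return $total;
}
```

The sketch gives one syllable for a word like "fire", which some speakers pronounce with two, so the dialect problem described above shows up even in a spelling-based count.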

What is a word?

Not to wax psycholinguistic in the slightest, but you will find that space-separated words and what a speaker conceptualizes as words are quite different. This makes the concept of a computable readability score somewhat questionable.
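A small illustration of how much the definition matters: the sketch below counts the words of one sentence three ways. The sample sentence and the variable names are made up for this example, and none of the three counts is claimed to be the right one.

```php
<?php
// Three simple word counts applied to the same (made-up) sentence.
$text = "The state-of-the-art e-mail client isn't 100% done.";

// str_word_count() treats runs of letters (plus inner ' and -) as words.
$by_letters = str_word_count($text);

// Splitting on whitespace counts every space-separated token.
$by_spaces = count(preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY));

// Splitting hyphenated compounds as well gives yet another answer.
$by_parts = count(preg_split('/[\s-]+/', trim($text), -1, PREG_SPLIT_NO_EMPTY));

printf("letters: %d, spaces: %d, parts: %d\n", $by_letters, $by_spaces, $by_parts);
```

Since the word count feeds both ratios in the score, whichever convention you pick shifts the result before syllables even enter the picture.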

So in the end, I can answer your question of 'will it work'. If you're looking to take a piece of text and display this readability score among other metrics to offer some kind of conceivable added value, the discerning user will not bring up all of these questions. If you are trying to do something scientific, or even something pedagogical (as this score and those like it were ultimately intended), I wouldn't really bother. In fact, if you're going to use this to make any kind of suggestions to a user about content that they have generated, I would be extremely hesitant.

A better way to measure reading difficulty of a text would more likely be something having to do with the ratio of low-frequency words to high-frequency words along with the number of hapax legomena in the text. But I wouldn't pursue actually coming up with a heuristic like this, because it would be very difficult to empirically test anything like it.
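As a rough sketch of what such a measure might compute, the code below counts hapax legomena and the ratio of low-frequency to high-frequency tokens. The function name `lexical_profile` is hypothetical, and `$common_words` is an assumed caller-supplied list; a real high-frequency list would have to come from a corpus frequency table, which this example does not provide.

```php
<?php
// Hypothetical helper: simple lexical statistics for a text.
// $common_words is an assumed, caller-supplied list of high-frequency words.
function lexical_profile(string $text, array $common_words): array {
    $tokens = str_word_count(strtolower($text), 1);
    $counts = array_count_values($tokens);

    // Hapax legomena: word types that occur exactly once in the text.
    $hapaxes = array_keys(array_filter($counts, function ($n) {
        return $n === 1;
    }));

    // Tokens found in the list count as high-frequency, the rest as low-frequency.
    $common = array_flip(array_map('strtolower', $common_words));
    $high = 0;
    foreach ($tokens as $token) {
        if (isset($common[$token])) {
            $high++;
        }
    }
    $low = count($tokens) - $high;

    return array(
        'hapax_count' => count($hapaxes),
        // null when no high-frequency token occurs, to avoid dividing by zero.
        'low_to_high_ratio' => $high > 0 ? $low / $high : null,
    );
}
```

Calling `lexical_profile($text, array('the', 'a'))` returns both numbers for a text; how to turn them into a difficulty score is exactly the part that would be hard to test empirically.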

109

answered Sep 23 '22 08:09

Robert Elwell

Tags:

php

formula

readability

flesch-kincaid
