WCAG 2.0 is at release candidate stage, and if you're bored already you should probably skip this post! One of the big things about WCAG 2.0 is that it attempts to make everything automatically testable. The idea is that you should be able to apply a tool and get an accurate idea of your conformance level. WCAG 1 had too many guidelines that you had to check by hand. Quite a chore when your site has over 30 000 pages.

One of the biggest challenges of WCAG 2.0 is readability. All your content should be written for a lower secondary level (years 7-9). There are a number of tools available that you can use to determine your reading level. For instance, if I run the tools over this page, I get the following (having stripped out the HTML tags, as they don't count as words):

Dale-Chall reading grade - Primary (5.6)
Flesch-Kincaid grade level - 9.8
Automated readability index - 9.9
Coleman-Liau index - 12.8
Gunning fog score - 12.6
SMOG index - 9.6

You may know the Flesch-Kincaid grade level, as that is the one available in Word. As you can see, our reading level is too high (but only slightly) in most of the reading indexes (except for the Dale-Chall reading grade).

The Flesch-Kincaid grade level is defined as 0.39 * average_words_per_sentence + 11.8 * average_syllables_per_word - 15.59. (The 84.6 coefficient belongs to the related Flesch reading ease score, not the grade level.) Research has shown that sentence length is one of the biggest predictors of readability. But here we find a problem: since we are passing in HTML, there are a number of elements that should be treated as sentences but don't actually end in a full stop. For example, a heading 1 tag will rarely end in a full stop, but for the purposes of this tool it should have one. Ditto for lists, as the style at the ACCC is not to put a full stop at the end of a list item unless it is actually a full sentence.
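To make the formula concrete, here is a minimal Python sketch of the Flesch-Kincaid grade level. It is not the code from php-text-statistics: the function names are made up for illustration, and the syllable counter is a crude vowel-group heuristic, so expect its numbers to differ a little from other tools.

```python
import re

def count_syllables(word):
    """Rough syllable estimate (an assumption): count runs of vowels."""
    lower = word.lower()
    count = len(re.findall(r"[aeiouy]+", lower))
    # Crude rule: a trailing "e" is usually silent, except in "-le" endings.
    if lower.endswith("e") and not lower.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)

def flesch_kincaid_grade(text):
    """0.39 * words per sentence + 11.8 * syllables per word - 15.59."""
    # Sentences are whatever sits between full stops, "!" or "?".
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59

# The same words score higher when a full stop is missing,
# because they are counted as one longer sentence.
with_stops = flesch_kincaid_grade("The cat sat. The dog ran.")
without_stop = flesch_kincaid_grade("The cat sat The dog ran.")
```

The last two lines show the heading and list-item problem in miniature: drop one full stop and the grade goes up, even though the words have not changed.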

So if we add full stops to the HTML where needed, we get better results:

Dale-Chall reading grade - Primary (5.4)
Flesch-Kincaid grade level - 7.4
Automated readability index - 6.8
Coleman-Liau index - 12.8
Gunning fog score - 10.1

Now our Flesch-Kincaid grade level is much better and well within the boundaries of readability.
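For anyone who wants to try the full-stop trick, here is a rough Python sketch of one way to do it while stripping the tags. It uses the standard library's html.parser and is only an illustration, not my actual PHP changes. The set of tags treated as sentences and the names SentenceText and html_to_sentences are assumptions of this sketch.

```python
from html.parser import HTMLParser

# Assumption: elements whose text should count as a sentence of its own.
SENTENCE_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6",
                 "p", "li", "dt", "dd", "td", "th"}

class SentenceText(HTMLParser):
    """Collects text and adds a full stop where a block element ends without one."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

    def handle_endtag(self, tag):
        if tag not in SENTENCE_TAGS:
            return
        text = "".join(self.parts).rstrip()
        if not text:
            self.parts = []
            return
        # Only add a full stop if the text does not already end a sentence.
        if text[-1] not in ".!?":
            text += "."
        self.parts = [text + " "]

def html_to_sentences(html):
    """Strip the tags and return plain text with sentence endings added."""
    parser = SentenceText()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace left behind by the markup.
    return " ".join("".join(parser.parts).split())

sample = "<h1>Reading levels</h1><ul><li>Short item</li><li>A full sentence.</li></ul>"
cleaned = html_to_sentences(sample)
# cleaned == "Reading levels. Short item. A full sentence."
```

The heading and the first list item gain a full stop, while the item that was already a full sentence is left alone, which matches the ACCC list style described above.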

You'll notice that the Dale-Chall reading grade hasn't changed much. It is considered to be the most accurate of all the reading indexes. It uses a list of 3000 common words (words that will be in the vocabulary of most of the population). If a word is not in that list, it is considered complex, and the more complex words you have in a body of text, the harder it is to read.
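As a rough illustration of how that works, here is a small Python sketch of the commonly published Dale-Chall raw score: 0.1579 * percentage of difficult words + 0.0496 * average words per sentence, plus 3.6365 if more than 5% of the words are difficult. The FAMILIAR_WORDS set below is a tiny made-up stand-in for the real 3000-word list, and the sketch skips the rules for plurals and other inflected forms, so treat it as a toy rather than a faithful implementation.

```python
import re

# Hypothetical stand-in for the 3000-word familiar list; a real run needs the full list.
FAMILIAR_WORDS = {"the", "cat", "sat", "on", "a", "mat", "dog", "ran", "to", "and"}

def dale_chall_score(text, familiar=FAMILIAR_WORDS):
    """Raw Dale-Chall score from the share of unfamiliar words and sentence length."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    if not sentences or not words:
        return 0.0
    # Any word missing from the familiar list counts as difficult.
    difficult = sum(1 for w in words if w not in familiar)
    pct_difficult = 100.0 * difficult / len(words)
    score = 0.1579 * pct_difficult + 0.0496 * len(words) / len(sentences)
    # The published formula adds a constant once difficult words pass 5%.
    if pct_difficult > 5.0:
        score += 3.6365
    return score

easy = dale_chall_score("The cat sat on the mat.")
harder = dale_chall_score("The cat sat on the ottoman.")
```

Sentence length has a small coefficient here compared with the word-list term, which fits with the score barely moving when the full stops were added.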

I've modified the code for the php-text-statistics project (as it doesn't take the HTML issue into consideration and doesn't do the Dale-Chall reading grade). I'll hopefully submit this for addition to that project and then everyone can share!

And hopefully that wasn't completely boring to everyone!


Submitted by nemesis on Wed 03/09/2008 - 12:55

Having a reading level in a guideline does raise a philosophical question though. If the majority of websites target their language at a 7th-grade level, then it provides (effectively) no encouragement to the less literate members of society to improve their literacy levels; and doesn't help to improve the literacy of people reading at higher levels.

This could lead to a further dumbing down of society. Idiocracy is awfully prophetic.