25 November 2008

Last year, Ryan McLaughlin at DaoByDesign came up with a plugin called Censortive, which replaces sensitive keywords in WordPress blog posts with image equivalents, thereby avoiding keyword blocks like those mentioned in the last post. At the time, though, Chinese-language support was problematic. Since then, some good open source Chinese font packages have been developed. Two that work are Wen Quan Yi (文泉驿) and Fireflysung (螢火飛點陣字型). There are a couple of other fonts here that might work as well. Instructions for installing Censortive are here.

The next step, of course, is making the list of keywords. Censortive works by assigning codewords to the words you want to replace, so that the actual words are not present in the HTML either (otherwise it wouldn’t really work). Unfortunately, that means a common set of codewords spread between blogs would probably work for a while, but if it were really successful the censors would begin scanning for the codewords themselves. Let ’em.
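To make the codeword idea concrete, here is a minimal sketch in Python. It is not Censortive's actual code (the plugin is a WordPress plugin, and I haven't reproduced its internals); the `[[codeword]]` marker syntax, the `CODEWORDS` table, the `render_post` function and the image path are all assumptions made up for illustration. The point is only that the post and the generated HTML carry the codeword, never the sensitive word.

```python
import html
import re

# Hypothetical table: codeword -> sensitive word. Only the server-side
# image renderer would ever need the right-hand side.
CODEWORDS = {
    "kw1": "敏感词",
    "kw2": "banned phrase",
}

# Assumed marker syntax for this sketch: [[codeword]]
MARKER = re.compile(r"\[\[(\w+)\]\]")


def render_post(text, image_base="/censortive-img/"):
    """Replace each known [[codeword]] marker with an <img> tag.

    The sensitive word never appears in the output: the tag points at an
    image named after the codeword, and alt is left empty on purpose.
    """
    def to_image(match):
        code = match.group(1)
        if code not in CODEWORDS:
            # Unknown codeword: leave the marker untouched.
            return match.group(0)
        src = html.escape(image_base + code + ".png", quote=True)
        return '<img src="{}" alt="">'.format(src)

    return MARKER.sub(to_image, text)
```

Calling `render_post("About [[kw1]] today")` yields markup that mentions `kw1.png` but not the word itself, which is exactly why a widely shared codeword list would eventually become a scanning target.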

So we need to make a list. Here are some places to start gathering words:

  1. The keyword list from a JavaScript snippet, hosted on several sites, that claims to count how many “banned words” appear in a webpage. The list seems a bit suspect, but it’s worth searching.

  2. Here’s a Google Doc of the banned words used in the Tom Online version of Skype.

  3. The ChinaSMACK Internet slang glossary may be useful.

  4. And here’s the Wikipedia list of censored words.

  5. This list from Roland Soong seems to still have some oomph after four years.

I would note that only ESWN’s page (Roland Soong’s list) is blocked (for me); the rest aren’t, though maybe the Google Docs one would be if it weren’t served over SSL.
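Once you have pulled words from sources like these, one way to avoid the shared-codeword problem is to merge the lists and give every word a random codeword that is private to your own blog. The sketch below assumes the words have already been copied into plain Python lists; `merge_word_lists` and `assign_codewords` are hypothetical helper names, and nothing here is part of Censortive itself.

```python
import secrets


def merge_word_lists(*lists):
    """Combine several word lists, dropping blanks and duplicates.

    The order of first appearance is preserved.
    """
    seen = set()
    merged = []
    for words in lists:
        for word in words:
            cleaned = word.strip()
            if cleaned and cleaned not in seen:
                seen.add(cleaned)
                merged.append(cleaned)
    return merged


def assign_codewords(words, nbytes=4):
    """Map a fresh random hex codeword to each word.

    Because the codewords are random per blog, there is no common set
    for anyone to scan for.
    """
    mapping = {}
    for word in words:
        code = secrets.token_hex(nbytes)
        while code in mapping:  # extremely unlikely, but retry on collision
            code = secrets.token_hex(nbytes)
        mapping[code] = word
    return mapping
```

The resulting mapping would still have to be entered into the plugin by hand; this only shows how a per-blog list could be generated.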

So tell all your Chinese blogger friends they can now replace bad words with pictures thanks to this [nb] plugin.

