- Name: Jukka K. Korpela
Wednesday, October 24, 2012
Saturday, September 02, 2006
The paradox of Unicode adoption
Unicode works in casual memos but not in books
Anyone can use Unicode these days, given a reasonably new computer system and software like a common word processor (that’s the PC phrase for MS Word) or even Notepad. Everyone and his dog can also create web pages using Unicode without trying hard. You can compose E-mail in Unicode, though the odds are that many recipients will see it distorted by the software they use; webmail systems are particularly primitive as regards using anything beyond Ascii characters. Among Internet discussion forums, there are already many alternatives that let you write and read Unicode easily, as long as you’ve learned how to type the characters.
But if you try to write a book or an article for a printed publication, you will typically be in deep trouble if you use anything beyond the Latin 1 repertoire. Everything works fine in your text processor, but as soon as the text reaches the publisher’s system, characters will get munged in imaginative ways. Widely used publishing software like FrameMaker or InDesign just doesn’t grok Unicode yet. Trouble also lies ahead if you try to enter characters beyond Latin 1 into a database in the naïve expectation that databases are generally Unicode-enabled.
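To make the Latin 1 boundary concrete, here is a minimal Python sketch, not any publisher’s actual tooling. The function names `beyond_latin1` and `misread_utf8_as_latin1` are hypothetical, invented for this illustration. The first lists the characters in a text that fall outside Latin 1 (code points above U+00FF), which are the ones likely to need special attention. The second shows one common way of munging, assuming the receiving system reads UTF-8 bytes as if they were Latin 1.

```python
import unicodedata

LATIN1_MAX = 0xFF  # last code point of ISO-8859-1 (Latin 1)


def beyond_latin1(text):
    """Return (char, 'U+XXXX', name) for each distinct character outside Latin 1."""
    found = []
    seen = set()
    for ch in text:
        if ord(ch) > LATIN1_MAX and ch not in seen:
            seen.add(ch)
            name = unicodedata.name(ch, "<unnamed>")
            found.append((ch, f"U+{ord(ch):04X}", name))
    return found


def misread_utf8_as_latin1(text):
    """Simulate a system that decodes UTF-8 bytes as Latin 1 (an assumed failure mode)."""
    return text.encode("utf-8").decode("latin-1")


if __name__ == "__main__":
    sample = "naïve Ω €"
    for ch, code, name in beyond_latin1(sample):
        print(ch, code, name)
    print(misread_utf8_as_latin1("naïve"))
```

Note that “ï” is not reported, since it belongs to Latin 1, yet it still gets distorted in the simulated misreading: staying within Latin 1 helps only if both ends agree on the encoding.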
In practice, you should probably accept that in a printed publication, anything beyond Latin 1 needs to be expressed using images. This is fairly stupid, especially if you write about extended character repertoires, as I often do: you cannot show examples of special characters in running text.
I guess there is a possible solution in many cases, but it’s not acceptable to many publishing houses and typographers: the author prepares the entire material in MS Word and converts it to PDF format. That way he can check the result easily and fix it as needed. You may need to create the PDF file using font embedding techniques, so that the file contains the fonts it needs. There are probably pitfalls, and many authors wouldn’t know how to handle the process, but I think the main objection is that such approaches are “primitive.” The real primitiveness, however, is in the limitations of current publishing software. Software that cannot handle characters beyond an 8-bit set in any reasonable way is comparable to a system that cannot handle the letters “x” and “y,” since to many languages and cultures, some “special” or “extra” characters are just as essential as “x” and “y” are in English.
Thursday, August 17, 2006
Publishing made too easy?
More or less everyone who knows how to write, in the purely technical sense of basic literacy, can have a blog.
However, the question arises whether publishing has been made too easy. When everyone can publish texts about everything, everyone will. It becomes more and more difficult to find anything interesting on the Web. Who would like to read everyone’s texts? Most people have nothing to say that might be of interest to any wide audience, or they lack the basic skills for saying it.
Then again, I might be an elitist. Actually, I am. Most people can speak, and most people can learn to write, too. There is an ongoing change that is making written communication far more important than it used to be. Almost everyone sends text messages and E-mail, and blogs just add an interesting ingredient. The publication threshold is vanishing, and we will just need to develop better filters for what we even consider reading.