Fonts & Encodings
Fonts & Encodings: From Advanced Typography to Unicode and Everything in Between, by Yannis Haralambous
Homo sapiens is a species that writes. And among the large number of tools used for writing, the most recent and the most complex is the computer: a tool for reading and writing, a medium for storage, and a means of exchanging data, all rolled into one. It has become a veritable space in which the text resides, a space that, as McLuhan and others correctly predicted, has come to transcend geographic barriers and encompass the entire planet.
Within this digital space for writing, fonts and encodings serve fundamentally different needs. Yet they form an inseparable duo, like yin and yang, Heaven and Earth, theory and practice. An encoding emerges from the tendency to conceptualize information; it is the result of an abstraction, a construction of the mind. A font is a means of visually representing writing, the result of concrete expression, a graphical construct.
An encoding is a table of characters, a character being an abstract, intangible entity. A font is a container for glyphs, which are images, drawings, physical marks of black ink on a white background. When the reader enters the digital space for writing, he participates in the unending ballet between characters and glyphs: the keys on the keyboard are marked with glyphs; when a key is pressed, a character is transmitted to the system, which, unless the user is entering a password, in turn displays glyphs on the screen. To send an email message is to send characters, but these are displayed to the recipient in the form of glyphs. When we run a search on a text file, we search for a string of characters, but the results are shown to us as a sequence of glyphs. And so on.
For the Western reader, this perpetual metamorphosis between characters and glyphs remains on the philosophical level. That is hardly surprising, as European writing systems have divided their fundamental constituents (graphemes) so that there is a one-to-one correspondence between character and glyph. Typophiles have given us some exceptions that prove the rule: in the word “film” there are four letters (and therefore four characters) but only three glyphs (because the letters ‘f’ and ‘i’ combine to form only one glyph). This phenomenon, which is called a ligature, can be orthographically significant (as is the case for the ligature ‘oe’, in French) or purely aesthetic (as with the f-ligatures ‘fi’, ‘ff’, ‘ffi’, etc.).
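The “film” example can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: the ligature table, the glyph names (`f_i`, `f_f`, `f_f_i`), and the function `shape` are hypothetical and do not correspond to any real font format or rendering library. It maps a string of characters to a list of glyph names by greedy longest match, which is one simple way to picture how four characters can yield three glyphs.

```python
# Hypothetical ligature table: character sequences -> glyph names.
# The names are invented for this illustration.
LIGATURES = {"ffi": "f_f_i", "ff": "f_f", "fi": "f_i"}


def shape(text, ligatures=LIGATURES):
    """Map a character string to glyph names, preferring the longest ligature."""
    longest = max(len(key) for key in ligatures)
    glyphs = []
    i = 0
    while i < len(text):
        # Try the longest candidate sequence first, down to two characters.
        for n in range(longest, 1, -1):
            chunk = text[i:i + n]
            if chunk in ligatures:
                glyphs.append(ligatures[chunk])
                i += len(chunk)
                break
        else:
            # No ligature starts here: one character, one glyph.
            glyphs.append(text[i])
            i += 1
    return glyphs


word = "film"
glyphs = shape(word)
print(len(word), "characters:", list(word))
print(len(glyphs), "glyphs:", glyphs)
```

A search for the string "film" still operates on the four characters; only the displayed form changes. The same toy table turns the six characters of "office" into four glyphs, since ‘ffi’ is matched as a single unit.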
In any case, these phenomena are marginal in our very cut-and-dried Western world. In the writing systems of the East, however, the conflict between characters and glyphs becomes an integral part of daily life. In Arabic, the letters are connected and assume different forms according to their position in the word. In the languages of India and Southeast Asia, they combine to form more and more complex graphical amalgamations. In the Far East, the ideographs live in a sort of parallel universe, where they are born and die, change language and country, clone themselves, mutate genetically, and carry a multitude of meanings.
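The Arabic case can be glimpsed from within Unicode itself. As a hedged illustration, the sketch below uses Python's standard `unicodedata` module to inspect the four compatibility code points that Unicode provides for the positional forms of the letter beh (in the Arabic Presentation Forms-B block). These code points exist for compatibility with older encodings; ordinary Arabic text stores only the abstract letter U+0628 and leaves the choice of form to the rendering system. The variable names are ours.

```python
import unicodedata

BEH = "\u0628"  # the abstract character ARABIC LETTER BEH

# Compatibility code points for the four positional shapes of beh.
POSITIONAL_FORMS = ["\uFE8F", "\uFE90", "\uFE91", "\uFE92"]

for form in POSITIONAL_FORMS:
    # The decomposition field has the shape "<tag> 0628".
    tag, base = unicodedata.decomposition(form).split()
    print(f"U+{ord(form):04X}", unicodedata.name(form), tag, "->", f"U+{base}")

# Compatibility normalization folds every shape back to the one character.
normalized = {unicodedata.normalize("NFKC", form) for form in POSITIONAL_FORMS}
print(normalized == {BEH})
```

One character, four shapes: the decomposition tags (`<isolated>`, `<final>`, `<initial>`, `<medial>`) name the position in the word that selects each form.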
Despite the trend towards globalization, the charm of the East has in no way died out; its writing systems still fire our dreams. But every dream is a potential nightmare. Eastern writing systems present a challenge to computer science, one that goes beyond mere technical problems. Since writing, just like images, speech, and music, is one of the fundamental concerns of humanity, computer science cannot approach it haphazardly: Eastern writing systems must be handled just as efficiently as the script that is part of our Latin cultural heritage. Otherwise, some of those writing systems may not survive computerization.
But more is at stake than the imperatives of cultural ecology. The French say that “travel educates the young”. The same goes for writing: through thinking about the writing systems of other cultures and getting to know their problems and concerns, we come to know more about our own.
Then there is also the historical perspective: in the digital space for writing that we are exploring in this book, the concepts and techniques of many centuries dwell together. Terminology, or rather the confusion that reigns in this field, clearly shows that computer science, despite its newness, lies on a historical continuum of techniques and practices. For example, when we set type in Times Ten at 8 points, we say that we are using a “body size of 8 points” and an “optical size of 10 points”. Can the same characters have two different sizes? To understand the meaning of these terms, it is necessary to trace the development of the concept of “type size” from the fifteenth century to the PostScript and TrueType fonts of our modern machines.
So far we have briefly surveyed the three axes on which this book is based: the systemic approach (abstraction/concrete expression, encoding/font, character/glyph), geographicity (East/West), historicity (ancient/modern, mechanical/computerized processes). These three aspects make up the complexity and the scope of our subject, namely the exploration of the digital space for writing.
Finally, there is a fourth axis, less important than the previous three but still well grounded in our day-to-day reality: industrial competition, a phenomenon that leads to an explosion in technologies, to gratuitous technicality, to a deliberate lack of clarity in documentation, and to all sorts of other foolish things that give the world of business its supposed charm. If we didn’t have PostScript fonts and TrueType fonts and OpenType fonts and Apple Advanced Typography (AAT) fonts, the world might be a slightly better place and this book would be several hundred pages shorter.
In this regard, the reader should be aware of the fact that everything pertaining to encodings, and to fonts in particular, is considered to be industrial knowledge and therefore cannot be disseminated, at least not completely. It is hard to imagine how badly the “specifications” of certain technologies are written, whether because of negligence or out of a conscious desire to prevent the full use of the technologies. Some of the appendices of this book were written for the very purpose of describing certain technologies with a reputation for inaccessibility, such as AAT tables and TrueType instructions, as clearly and exhaustively as possible.
In the remainder of this introduction, we shall outline, first of all, the jargon used in the rest of the book, so as to clarify the historical development of certain terms. This will also enable us to give an overview of the transition from mechanical to computerized processes.
Next, we will give the reader a synthetic view of the book by outlining several possible ways to approach it. Each profile of a typical reader that we present is focused on a specific area of interest, a particular way to use this book. We hope that this part of the introduction will allow the reader to find her own path through the forest of 2.5 million letters that she is holding in her hands.