3 Using Characters and Strings

3.1 Characters

Characters in Common Lisp are data objects that represent printed symbols or operations for formatting text. In this release of Liquid Common Lisp, the format of character data objects has changed to support international character sets. In previous releases, a character object contained only a single byte of character data. Single-byte characters kept character operations uniform and efficient, but a single byte cannot encode character sets with many characters, such as the Kanji character set.

A character object can now contain up to two bytes of character data. The larger size lets you use characters from multiple coded character sets, including sets that contain more elements than a single byte can represent.

As in previous releases, Liquid Common Lisp retains the bits and font attributes of characters as an extension to Common Lisp. Each character object has three attributes: the code attribute is the numerical encoding of the character, the bits attribute associates extra flags with the character, and the font attribute specifies the style of the character's glyph.
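As a rough illustration of the code attribute, the following sketch uses only portable ANSI Common Lisp functions (char-code, code-char, char-name, graphic-char-p, and the constant char-code-limit). The helper name describe-char-code is hypothetical and is not part of Liquid Common Lisp. The bits and font attributes are left out, because the accessor names this release provides for them are not stated in this section.

```lisp
;; Minimal sketch, portable ANSI Common Lisp only.
;; DESCRIBE-CHAR-CODE is a hypothetical helper, not a Liquid Common Lisp function.
(defun describe-char-code (ch)
  "Return a list of CH's code, its name (or NIL), and whether CH is a graphic character."
  (list (char-code ch)          ; numerical encoding (the code attribute)
        (char-name ch)          ; a string such as \"Space\", or NIL
        (graphic-char-p ch)))   ; true for printing characters

(defun code-in-range-p (ch)
  "Return true if CH's code lies below the implementation's CHAR-CODE-LIMIT."
  (< (char-code ch) char-code-limit))

;; Example calls; the numeric codes depend on the implementation's encoding.
(describe-char-code #\A)
(describe-char-code #\Space)
```

The numeric code returned for a given character depends on the encoding the implementation uses, so code that compares raw codes across implementations should be treated with care.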

To accommodate the change in character formats, the character data type structure has changed as follows:

By default, character and string operations now accept both base characters and extended characters. This gives your code the most flexibility in handling characters, but string operations might be slower as a result. Section 3.2 explains how to optimize string operations.
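The distinction between base and extended characters can be sketched with the ANSI Common Lisp type names base-char and base-string. This assumes the release uses those same type names, which this section does not confirm. The helpers all-base-chars-p and to-base-string are hypothetical, and the sketch is not the optimization technique that Section 3.2 describes.

```lisp
;; Minimal sketch, assuming the ANSI type names BASE-CHAR and BASE-STRING.
;; Both functions are hypothetical helpers, not Liquid Common Lisp functions.
(defun all-base-chars-p (string)
  "Return true if every character of STRING is of type BASE-CHAR."
  (every (lambda (ch) (typep ch 'base-char)) string))

(defun to-base-string (string)
  "Return a fresh string specialized for BASE-CHAR with the contents of STRING,
or NIL if STRING contains a character that is not a base character."
  (when (all-base-chars-p string)
    ;; REPLACE copies the characters into the new string and returns it.
    (replace (make-string (length string) :element-type 'base-char)
             string)))

;; Example call: standard characters are always base characters.
(to-base-string "hello")
```

Which characters beyond the standard characters count as base characters is implementation-dependent, so the NIL branch may never be taken in an implementation that has no extended characters.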

The following sections provide more information about the new character types, how to use them, and their effect on character operations.

3.1.1 - Standard and printing characters
3.1.2 - Character attributes
3.1.3 - New character operations
3.1.4 - Changes to Common Lisp character operations

International Character Sets - 9 SEP 1996
