26.3 Character and String types
26.3.1 Character types
LispWorks supports all the characters in the Unicode range [0, #x10ffff], excluding the surrogate range [#xd800, #xdfff]. Note that character objects corresponding to surrogate code points may be produced by some APIs in LispWorks, but not by the interfaces that you should normally use to generate characters and strings in Common Lisp (that is, cl:code-char, reading from a stream, converting from a foreign string, and loading and storing from or to strings).
The following subtypes of character are defined:
cl:char-code less than
bmp-char: cl:char-code less than #x10000 (BMP stands for Basic Multilingual Plane in Unicode).
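The code ranges described above can be written as plain integer checks. The following is a minimal portable sketch, not LispWorks API: valid-character-code-p and bmp-code-p are hypothetical helper names introduced only for illustration.

```lisp
;; Hypothetical helpers (not part of LispWorks or ANSI Common Lisp).

(defun valid-character-code-p (code)
  "True if CODE lies in [0, #x10ffff] and outside the surrogate
range [#xd800, #xdfff]."
  (and (integerp code)
       (<= 0 code #x10ffff)
       (not (<= #xd800 code #xdfff))))

(defun bmp-code-p (code)
  "True if CODE is a valid character code below #x10000, that is,
within the Basic Multilingual Plane."
  (and (valid-character-code-p code)
       (< code #x10000)))
```

For example, (valid-character-code-p #xd835) is false because #xd835 is a surrogate code point, and (bmp-code-p #x10000) is false because that code lies just outside the BMP.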
26.3.2 Compatibility notes
In LispWorks 6.1 and earlier versions, characters with codes less than #x10000 are supported, and surrogate code points are allowed.
bmp-char was new in LispWorks 7.0, and matches the range of characters in LispWorks 6.1 and earlier versions, except that surrogate code points are no longer valid.
In LispWorks 6.1 and earlier versions there is simple-char, which is now a synonym for cl:character. cl:character is preferable and portable.
In LispWorks 6.1 and earlier versions character bits attributes are supported, and also some characters represent keyboard gestures. These are no longer supported.
26.3.3 Character Syntax
All simple characters have names that consist of
U+ followed by the code of the character in hexadecimal, for example
The hexadecimal number must be 4-6 characters long, so for example #\U+a0 is illegal. Use #\U+00a0 instead.
Additionally, Latin-1 characters have names derived from the ISO10646 name, for example:
(char-name (code-char 190))
Names are also provided for space characters:
Note that surrogate characters, that is, those in the inclusive range [#xd800, #xdfff], are not acceptable, and trying to read such a character, for example #\U+d835, produces an error.
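The naming rule above (U+ followed by 4-6 hexadecimal digits, with surrogates rejected) can be sketched with standard format directives. This is an illustrative helper only: u+-character-name is a hypothetical name, and the sketch builds the name string rather than calling any LispWorks naming function.

```lisp
;; Hypothetical helper (not part of LispWorks): builds a U+ style
;; name string from a character code.

(defun u+-character-name (code)
  "Return a U+ style name for CODE, padded with zeros to at least
four hexadecimal digits. Signals an error for codes outside
[0, #x10ffff] or inside the surrogate range [#xd800, #xdfff]."
  (check-type code (integer 0 #x10ffff))
  (when (<= #xd800 code #xdfff)
    (error "~X is a surrogate code point." code))
  ;; ~4,'0X prints in hexadecimal with a minimum width of 4,
  ;; using 0 as the padding character.
  (format nil "U+~4,'0X" code))
```

Calling (u+-character-name #xa0) pads the code to four digits, which corresponds to the legal four-digit form rather than the illegal #\U+a0.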
26.3.4 Compatibility notes
In LispWorks 6.1 and earlier versions you can specify bits in character names. This is illegal in LispWorks 7.0 and later.
In LispWorks 6.1 and earlier versions character codes are limited to less than
#x10000, and surrogate code points are allowed.
26.3.5 String types
String types are supplied which are capable of holding each of the character types mentioned above. The following string types are defined:
bmp-string holds any bmp-char.
text-string holds any cl:character (see Character types).
bmp-string was new in LispWorks 7.0. In LispWorks 6.1 and earlier versions there is augmented-string, which is now a synonym for text-string and is deprecated.
In LispWorks 6.1 and earlier versions, text-string could hold characters with codes less than #x10000.
The types above include non-simple strings - those which are displaced, adjustable or with a fill-pointer.
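As an illustration of a non-simple string, the sketch below builds an adjustable string with a fill pointer using only standard Common Lisp. The function names make-growable-string and append-characters are hypothetical, and the element type cl:character is chosen here only for portability.

```lisp
;; Hypothetical helpers using only ANSI Common Lisp operators.

(defun make-growable-string (&optional (capacity 8))
  "Return an empty, adjustable string with a fill pointer."
  (make-array capacity
              :element-type 'character
              :adjustable t
              :fill-pointer 0))

(defun append-characters (buffer characters)
  "Push each character in the list CHARACTERS onto BUFFER, growing
it as needed, and return BUFFER."
  (dolist (c characters buffer)
    (vector-push-extend c buffer)))
```

A string built this way satisfies (typep s 'string). Because it is adjustable and has a fill pointer, it is typically not of type simple-string.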
The Common Lisp type string itself is dependent on the value of
*default-character-element-type* according to the rules for string construction described in String Construction. For example:
CL-USER 1 > (set-default-character-element-type 'base-char)
CL-USER 2 > (coerce (list #\Ideographic-Space) 'string)
Error: #\Ideographic-Space is not of type BASE-CHAR.
1 (abort) Return to level 0.
2 Return to top loop level 0.
Type :b for backtrace or :c <option number> to proceed.
Type :bug-form "<subject>" for a bug report template or :? for other options.
CL-USER 3 : 1 > :a
CL-USER 4 > (set-default-character-element-type 'character)
CL-USER 5 > (coerce (list #\Ideographic-Space) 'string)
The following types are subtypes of
cl:simple-string. Note that in the names of the string types, 'simple' refers to the string object and does not mean that the string's elements are
holds any bmp-char.
The Common Lisp type
simple-string itself is dependent on the value of *default-character-element-type* according to the rules for string construction described in String Construction.
26.3.5.1 String types at run time
The type string (and hence
simple-string) is defined by ANSI Common Lisp to be a union of all the character array types. This makes a call like
(coerce s 'simple-string)
ambiguous because it needs to select a concrete type (such as
When LispWorks is running with *default-character-element-type* set to base-char, it expects that you will want strings with element type base-char, so functions like coerce treat references to simple-string as if they were (simple-array base-char (*)).
If you call set-default-character-element-type with a larger character type, then
simple-string becomes a union of the array types that are subtypes of that character type.
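One way to observe which concrete type was selected is to inspect the result of coerce. The sketch below uses only standard Common Lisp operators. The helper names chars-to-simple-string and describe-string-type are hypothetical, and the element type reported depends on the implementation and, in LispWorks, on the current default character element type.

```lisp
;; Hypothetical helpers using only ANSI Common Lisp operators.

(defun chars-to-simple-string (characters)
  "Coerce the sequence CHARACTERS to a simple-string. The concrete
array type of the result is chosen by the implementation."
  (coerce characters 'simple-string))

(defun describe-string-type (object)
  "Return a property list of string-related facts about OBJECT."
  (list :string-p (stringp object)
        :simple-string-p (simple-string-p object)
        :element-type (and (stringp object)
                           (array-element-type object))))
```

For instance, (describe-string-type (chars-to-simple-string (list #\a #\b))) reports the element type that coerce actually chose, which can be compared before and after a call to set-default-character-element-type.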
LispWorks User Guide and Reference Manual - 20 Sep 2017