

22.6.3 Guessing the external format

If open or with-open-file is given an :external-format argument ef-spec that is not complete, the system decides which external format to use by calling the function guess-external-format.

The default behavior of guess-external-format is as follows:

  1. When the name of ef-spec is :default, it finds a match based on the filename; if that fails, it looks in the Emacs-style (-*-) attribute line for an option called ENCODING or EXTERNAL-FORMAT; if that fails, it chooses from amongst likely encodings by analysing the bytes near the start of the file; and if that fails, it uses a default encoding. Otherwise the name of ef-spec is assumed to name an encoding, and that encoding is used.
  2. When ef-spec does not include the :eol-style parameter, it then also analyses the start of the file for byte patterns indicating the end-of-line style, and uses a default end-of-line style if no such pattern is found.
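
The byte-analysis parts of the two steps above can be sketched in portable Common Lisp. This is only an illustration of the idea, not the LispWorks implementation: every function name here is hypothetical, the filename and attribute-line checks are omitted, only byte order marks are examined, and the fallback values :latin-1 and :lf are assumptions rather than the defaults LispWorks actually uses.

```lisp
;;; Illustrative sketch only. None of these functions are part of LispWorks.

(defun read-leading-bytes (pathname &optional (limit 1024))
  "Return up to LIMIT octets from the start of PATHNAME as a vector."
  (with-open-file (in pathname :element-type '(unsigned-byte 8))
    (let* ((buffer (make-array limit :element-type '(unsigned-byte 8)))
           (end (read-sequence buffer in)))
      (subseq buffer 0 end))))

(defun starts-with-p (bytes &rest prefix)
  "True if the vector BYTES begins with the octets in PREFIX."
  (and (>= (length bytes) (length prefix))
       (every #'= bytes prefix)))

(defun sketch-guess-encoding (bytes &optional (default :latin-1))
  "Guess an encoding from a byte order mark, else fall back to DEFAULT.
DEFAULT is an assumed fallback, not the LispWorks default."
  (cond ((starts-with-p bytes #xEF #xBB #xBF) (list :utf-8))
        ((starts-with-p bytes #xFF #xFE) (list :unicode :little-endian t))
        ((starts-with-p bytes #xFE #xFF) (list :unicode :little-endian nil))
        (t (list default))))

(defun sketch-guess-eol-style (bytes &optional (default :lf))
  "Guess :CRLF, :CR or :LF from the first line terminator in BYTES.
Crude simplification: zero octets are dropped first, so that 16-bit
text whose line terminators are in the ASCII range is also handled."
  (let* ((octets (remove 0 bytes))
         (pos (position-if (lambda (b) (member b '(10 13))) octets)))
    (cond ((null pos) default)                    ; no terminator seen
          ((= (aref octets pos) 10) :lf)          ; bare LF
          ((and (< (1+ pos) (length octets))
                (= (aref octets (1+ pos)) 10))
           :crlf)                                 ; CR followed by LF
          (t :cr))))                              ; bare CR

(defun sketch-guess-from-bytes (bytes)
  "Combine the encoding guess and the end-of-line guess into one list."
  (append (sketch-guess-encoding bytes)
          (list :eol-style (sketch-guess-eol-style bytes))))

(defun sketch-guess-external-format (pathname)
  "Apply SKETCH-GUESS-FROM-BYTES to the first octets of PATHNAME."
  (sketch-guess-from-bytes (read-leading-bytes pathname)))
```

For octets that begin with the little-endian byte order mark FF FE and contain a CR LF pair, sketch-guess-from-bytes returns (:UNICODE :LITTLE-ENDIAN T :EOL-STYLE :CRLF). When neither a byte order mark nor a line terminator is present, both assumed fallbacks are used.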

The file in this example was written by a Windows program which writes the Byte Order Mark at the start of the file, indicating that it is Unicode (UCS-2) encoded. The routine in step 1 above detects this:

(set-default-character-element-type 'simple-char)
=>
SIMPLE-CHAR
 
(with-open-file (ss "C:/temp/unicode-notepad.txt") 
  (stream-external-format ss))
=>
(:UNICODE :LITTLE-ENDIAN T :EOL-STYLE :CRLF)

The behavior of guess-external-format is configurable via the variables *file-encoding-detection-algorithm* and *file-eol-style-detection-algorithm*. See the manual pages for details.

22.6.3.1 Example of using UTF-8 by default


LispWorks User Guide and Reference Manual - 21 Dec 2011
