12.9.5 Foreign stream encoding
IOSTREAM has a field encoding that is set at initialization from the SIO_TEXT flag. The available encodings are defined by the C enum below.
typedef enum
{ ENC_UNKNOWN = 0,        /* invalid/unknown */
  ENC_OCTET,              /* raw 8 bit input */
  ENC_ASCII,              /* US-ASCII (0..127) */
  ENC_ISO_LATIN_1,        /* ISO Latin-1 (0..255) */
  ENC_ANSI,               /* default (multibyte) codepage */
  ENC_UTF8,
  ENC_UNICODE_BE,         /* big endian unicode file */
  ENC_UNICODE_LE,         /* little endian unicode file */
  ENC_WCHAR               /* wchar_t */
} IOENC;
Binary streams always have the encoding ENC_OCTET.
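The range comments in the enum suggest one way to think about which code points an encoding can hold. The following is a minimal sketch under stated assumptions, not the library's code: the `demo_` names are hypothetical local stand-ins for a few IOENC values (real code would include SWI-Stream.h), and the helper models only the fixed-range encodings and UTF-8. It borrows the 0/-1 return convention of Scanrepresent().

```c
#include <assert.h>

/* Hypothetical local copy of a few IOENC names, for illustration only. */
typedef enum
{ DEMO_ENC_UNKNOWN = 0,
  DEMO_ENC_OCTET,
  DEMO_ENC_ASCII,
  DEMO_ENC_ISO_LATIN_1,
  DEMO_ENC_UTF8
} demo_ioenc;

/* Hypothetical helper: return 0 if code point c fits in enc, -1 otherwise. */
int
demo_fits(demo_ioenc enc, int c)
{ if ( c < 0 )
    return -1;

  switch(enc)
  { case DEMO_ENC_ASCII:
      return c <= 0x7f ? 0 : -1;
    case DEMO_ENC_OCTET:
    case DEMO_ENC_ISO_LATIN_1:
      return c <= 0xff ? 0 : -1;
    case DEMO_ENC_UTF8:
      /* Unicode scalar values: up to U+10FFFF, excluding surrogates */
      if ( c > 0x10ffff )
        return -1;
      if ( c >= 0xd800 && c <= 0xdfff )
        return -1;
      return 0;
    default:
      return -1;                /* unknown, or not modelled in this sketch */
  }
}

/* Usage example: a few spot checks. */
void
demo_fits_examples(void)
{ assert(demo_fits(DEMO_ENC_ASCII, 'A') == 0);
  assert(demo_fits(DEMO_ENC_ASCII, 0xe9) == -1);          /* U+00E9 */
  assert(demo_fits(DEMO_ENC_ISO_LATIN_1, 0xe9) == 0);
  assert(demo_fits(DEMO_ENC_ISO_LATIN_1, 0x20ac) == -1);  /* U+20AC */
  assert(demo_fits(DEMO_ENC_UTF8, 0x20ac) == 0);
}
```

The locale-dependent and wide-character encodings (ENC_ANSI, ENC_WCHAR) are deliberately left out, because their range depends on the platform rather than on a fixed limit.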
The default encoding of a text stream depends on the Prolog flag encoding. The encoding is used by all functions that perform text I/O on a stream. The encoding can be changed at any moment using Ssetenc(), which is available from Prolog using the set_stream/2 encoding(Encoding) property. Functions that explicitly manage the encoding are:
- int Ssetenc(IOSTREAM *s, IOENC new_enc, IOENC *old_enc)
- Set the encoding for s to new_enc and, if old_enc is not NULL, return the old encoding in *old_enc. This function may fail, returning -1, if the Scontrol_function() of the stream returns -1 on the SIO_SETENCODING request. On success it returns 0. If new_enc is ENC_OCTET the stream is switched to binary mode. Otherwise text mode is enabled.
- int ScheckBOM(IOSTREAM *s)
- This function may be called on a buffered input stream immediately after opening the stream. If the stream starts with a known Byte Order Mark (BOM) the encoding is set accordingly and the flag SIO_BOM is set on the stream. Possible resulting encodings are ENC_UTF8, ENC_UNICODE_BE and ENC_UNICODE_LE.
- int SwriteBOM(IOSTREAM *s)
- This function writes a Byte Order Mark (BOM) to s and should be called immediately after opening a stream for writing. If the encoding is one of ENC_UTF8, ENC_UNICODE_BE or ENC_UNICODE_LE it writes the code point \ufeff (a zero-width no-break space) to the stream in the current encoding and sets the SIO_BOM flag on the stream.
- int Scanrepresent(int c, IOSTREAM *s)
- Returns 0 if the encoding of s can represent the code point c and -1 otherwise.
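The BOM handling described for ScheckBOM() and SwriteBOM() comes down to a small table: the code point \ufeff serialized in each of the three encodings. The sketch below illustrates that idea on a plain byte buffer. It is not the library's implementation, and it assumes that ENC_UNICODE_BE and ENC_UNICODE_LE use two-byte units; the `demo_` types and functions are hypothetical and do not exist in SWI-Stream.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the three BOM-capable IOENC values. */
typedef enum
{ DEMO_ENC_UNKNOWN = 0,
  DEMO_ENC_UTF8,
  DEMO_ENC_UNICODE_BE,
  DEMO_ENC_UNICODE_LE
} demo_ioenc;

typedef struct
{ demo_ioenc    enc;            /* encoding implied by the BOM */
  size_t        len;            /* number of BOM bytes */
  unsigned char bytes[3];       /* U+FEFF serialized in that encoding */
} demo_bom;

static const demo_bom demo_boms[] =
{ { DEMO_ENC_UTF8,       3, { 0xEF, 0xBB, 0xBF } },
  { DEMO_ENC_UNICODE_BE, 2, { 0xFE, 0xFF } },
  { DEMO_ENC_UNICODE_LE, 2, { 0xFF, 0xFE } }
};

#define DEMO_NBOMS (sizeof(demo_boms)/sizeof(demo_boms[0]))

/* Reading side: compare the start of buf with each known BOM.
   Returns the implied encoding and stores the number of bytes to
   skip in *skip, or returns DEMO_ENC_UNKNOWN with *skip set to 0. */
demo_ioenc
demo_detect_bom(const unsigned char *buf, size_t len, size_t *skip)
{ size_t i;

  assert(skip != NULL);
  *skip = 0;

  for(i = 0; i < DEMO_NBOMS; i++)
  { const demo_bom *b = &demo_boms[i];

    if ( len >= b->len && memcmp(buf, b->bytes, b->len) == 0 )
    { *skip = b->len;
      return b->enc;
    }
  }

  return DEMO_ENC_UNKNOWN;
}

/* Writing side: copy the BOM bytes for enc into out (room for 3 bytes).
   Returns the number of bytes copied, or 0 if enc has no BOM here. */
size_t
demo_bom_for(demo_ioenc enc, unsigned char out[3])
{ size_t i;

  for(i = 0; i < DEMO_NBOMS; i++)
  { if ( demo_boms[i].enc == enc )
    { memcpy(out, demo_boms[i].bytes, demo_boms[i].len);
      return demo_boms[i].len;
    }
  }

  return 0;
}
```

A caller would pass the first few bytes of a freshly opened file to `demo_detect_bom()` and skip the reported number of bytes before decoding. Detection has to happen before any text is consumed, which matches the advice to call ScheckBOM() immediately after opening the stream.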