4 Stream encoding issues
The parser can deal with ISO Latin-1 and UTF-8 encoded files, decoding
them based on the encoding argument provided to set_sgml_parser/2 or,
for XML, based on the encoding attribute of the XML header. The parser
reads from SWI-Prolog streams, which also provide encoding handling.
Therefore, there are two modes for parsing. If the SWI-Prolog stream has
encoding octet (which is the default for binary streams), the decoder of
the SGML parser is used and positions reported by the parser are octet
offsets in the stream. In all other cases, the Prolog stream decoder is
used and offsets are counted in character codes.
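
The two modes can be selected by how the stream is opened. The following is a minimal sketch, not a definitive recipe: the predicate names parse_octet/2 and parse_text/2 are hypothetical helpers introduced here for illustration, and the input is assumed to be a UTF-8 encoded XML file.

```prolog
:- use_module(library(sgml)).

% Mode 1: open the file as a binary stream. Its encoding is octet,
% so the SGML parser does the decoding itself and reported positions
% are octet offsets. (parse_octet/2 is a hypothetical helper name.)
parse_octet(File, DOM) :-
    setup_call_cleanup(
        open(File, read, In, [type(binary)]),
        ( stream_property(In, encoding(octet)),
          load_structure(stream(In), DOM, [dialect(xml)])
        ),
        close(In)).

% Mode 2: open the file as a text stream with an explicit encoding.
% The Prolog stream decoder is used and reported offsets count
% character codes. (parse_text/2 is a hypothetical helper name.)
parse_text(File, DOM) :-
    setup_call_cleanup(
        open(File, read, In, [encoding(utf8)]),
        load_structure(stream(In), DOM, [dialect(xml)]),
        close(In)).
```

For a correctly encoded file both predicates should yield the same document structure. The difference only shows in positions: a multi-byte UTF-8 character advances an octet offset by more than one but a character-code offset by exactly one.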