SWI-Prolog -- unicode

library(unicode): Unicode string handling

Availability: :- use_module(library(unicode)). (can be autoloaded)

[det] unicode_map(+In, -Out, +Options)

Perform Unicode normalization operations. Options is a list of operations. Defined operations are:

stable: Unicode Versioning Stability has to be respected.
compat: Compatibility decomposition (i.e. formatting information is lost).
compose: Return a result with composed characters.
decompose: Return a result with decomposed characters.
ignore: Strip "default ignorable characters".
rejectna: Return an error if the input contains unassigned code points.
nlf2ls: Indicates that NLF-sequences (LF, CRLF, CR, NEL) represent a line break and should be converted to the Unicode line separator character (LS).
nlf2ps: Indicates that NLF-sequences represent a paragraph break and should be converted to the Unicode paragraph separator character (PS).
nlf2lf: Indicates that the meaning of NLF-sequences is unknown.
stripcc: Strips and/or converts control characters. NLF-sequences are transformed into a space, unless one of the nlf2ls, nlf2ps or nlf2lf options is given. HorizontalTab (HT) and FormFeed (FF) are treated as an NLF-sequence in this case. All other control characters are simply removed.
casefold: Performs Unicode case folding, which allows case-insensitive string comparison.
charbound: Inserts 0xFF bytes at the beginning of each sequence that represents a single grapheme cluster (see UAX#29).
lump: Lumps certain characters together (e.g. HYPHEN U+2010 and MINUS U+2212 to ASCII "-"). (See module header for details.) If nlf2lf is set, this includes a transformation of paragraph and line separators to ASCII line-feed (LF).
stripmark: Strips all character markings (non-spacing, spacing and enclosing), i.e. accents. NOTE: this option works only in combination with compose or decompose.
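
As a rough illustration of how the operations listed above can be combined, the following sketch defines two small helpers. Both fold_key/2 and same_nfc/2 are hypothetical names introduced here; they are not part of library(unicode). The sketch assumes that operations are passed as plain atoms in the Options list and that the result can be inspected with atom_codes/2.

```prolog
:- use_module(library(unicode)).

% fold_key(+Text, -Key)
% Hypothetical helper: Key is Text with accents removed and case folded,
% intended as a key for accent- and case-insensitive matching.
% stripmark is combined with decompose, as the option's note requires.
fold_key(Text, Key) :-
    unicode_map(Text, Key, [stable, decompose, stripmark, casefold]).

% same_nfc(+A, +B)
% Hypothetical helper: true if A and B consist of the same code points
% after canonical composition (stable + compose).
same_nfc(A, B) :-
    unicode_map(A, NA, [stable, compose]),
    unicode_map(B, NB, [stable, compose]),
    atom_codes(NA, Codes),      % compare by code points, so the check
    atom_codes(NB, Codes).      % does not depend on the type of Out
```

Under these assumptions, fold_key('CAF\u00C9', Key) would be expected to produce text equal to cafe, and same_nfc('E\u0301', '\u00C9') would be expected to succeed, because both arguments spell the same accented letter, once as a base letter plus combining mark and once as a single composed character.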