
sgml.pl -- SGML, XML and HTML parser
This library allows you to parse SGML, XML and HTML data into a Prolog data structure. The library defines several families of predicates:
- High-level predicates
- Most users will only use load_html/3, load_xml/3 or load_sgml/3 to parse arbitrary input into a DOM structure. These predicates all call load_structure/3, which provides more options and may be used for processing non-standard documents.
The DOM structure can be used by library(xpath) to extract information from the document.
- The low-level parser
- The actual parser is written in C and consists of two parts: one for processing DTDs (Document Type Definitions) and one for parsing data. The data can either be parsed into a Prolog (DOM) term, or the parser can perform callbacks for the DOM events.
- Utility predicates
- Finally, this library provides primitives for classifying characters and strings according to the XML specification, such as xml_name/1, which verifies whether an atom is a valid XML name (identifier). It also provides primitives to quote attributes and CDATA elements.
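As a minimal sketch of the high-level and utility predicates described above, the following parses a small XML document from a string with load_xml/3, queries the resulting DOM with library(xpath), and calls xml_name/1 and xml_quote_cdata/2. The sample document and the predicate names item_texts/1 and demo_utilities/0 are made up for illustration and are not part of the library.

```prolog
:- use_module(library(sgml)).
:- use_module(library(xpath)).

% item_texts(-Texts): hypothetical helper. Parses a made-up XML
% document from a string and collects the text of each <item>.
item_texts(Texts) :-
    XML = "<doc><item id=\"a\">one</item><item id=\"b\">two</item></doc>",
    setup_call_cleanup(
        open_string(XML, In),
        load_xml(stream(In), DOM, []),
        close(In)),
    % DOM is a list of element(Name, Attributes, Content) terms.
    findall(Text, xpath(DOM, //item(text), Text), Texts).

% demo_utilities: hypothetical helper. Checks an XML name and
% prints a string quoted for use as element content.
demo_utilities :-
    (   xml_name(item)
    ->  format("item is a valid XML name~n")
    ;   format("item is not a valid XML name~n")
    ),
    xml_quote_cdata('a < b', Quoted),
    format("~w~n", [Quoted]).
```

Reading from a stream opened with open_string/2 keeps the example self-contained; for a file on disk, the file name can be passed to load_xml/3 directly instead.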
Undocumented predicates
The following predicates are exported, but are either undocumented or incorrectly documented.
get_sgml_parser(Arg1, Arg2)
open_dtd(Arg1, Arg2, Arg3)
dtd(Arg1, Arg2)
load_html_file(Arg1, Arg2)
iri_xml_namespace(Arg1, Arg2)
load_html(Arg1, Arg2, Arg3)
xml_ideographic(Arg1)
xml_name(Arg1, Arg2)
xml_quote_cdata(Arg1, Arg2, Arg3)
sgml_parse(Arg1, Arg2)
new_sgml_parser(Arg1, Arg2)
dtd_property(Arg1, Arg2)
load_structure(Arg1, Arg2, Arg3)
iri_xml_namespace(Arg1, Arg2, Arg3)
xml_combining_char(Arg1)
xsd_number_string(Arg1, Arg2)
xml_quote_attribute(Arg1, Arg2)
free_sgml_parser(Arg1)
new_dtd(Arg1, Arg2)
load_dtd(Arg1, Arg2)
load_sgml_file(Arg1, Arg2)
xml_digit(Arg1)
load_sgml(Arg1, Arg2, Arg3)
xsd_time_string(Arg1, Arg2, Arg3)
xml_quote_cdata(Arg1, Arg2)
sgml_register_catalog_file(Arg1, Arg2)
set_sgml_parser(Arg1, Arg2)
free_dtd(Arg1)
load_dtd(Arg1, Arg2, Arg3)
xml_is_dom(Arg1)
load_xml_file(Arg1, Arg2)
load_xml(Arg1, Arg2, Arg3)
xml_extender(Arg1)
xml_basechar(Arg1)
xml_name(Arg1)
xml_quote_attribute(Arg1, Arg2, Arg3)
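Several of the predicates listed above (new_sgml_parser/2, set_sgml_parser/2, sgml_parse/2 and free_sgml_parser/1) form the low-level, callback-based interface to the parser. The sketch below shows one way they might be combined to collect element names without building a DOM. The helper element_names/2, the callback on_begin/3 and the dynamic predicate seen_tag/1 are hypothetical names chosen for this example; the callback is assumed to be invoked with the tag name, the attribute list and the parser handle.

```prolog
:- use_module(library(sgml)).

:- dynamic seen_tag/1.

% Callback for the `begin` event. Assumed calling convention:
% on_begin(+Tag, +Attributes, +Parser).
on_begin(Tag, _Attributes, _Parser) :-
    assertz(seen_tag(Tag)).

% element_names(+XML, -Tags): hypothetical helper. Parses the XML
% text and returns the names of all opened elements in document order.
element_names(XML, Tags) :-
    retractall(seen_tag(_)),
    setup_call_cleanup(
        new_sgml_parser(Parser, []),
        parse_text(Parser, XML),
        free_sgml_parser(Parser)),
    findall(Tag, seen_tag(Tag), Tags).

% parse_text(+Parser, +XML): run the parser over a string stream,
% reporting each opening tag through on_begin/3.
parse_text(Parser, XML) :-
    set_sgml_parser(Parser, dialect(xml)),
    setup_call_cleanup(
        open_string(XML, In),
        sgml_parse(Parser, [source(In), call(begin, on_begin)]),
        close(In)).
```

Wrapping the parser in setup_call_cleanup/3 ensures free_sgml_parser/1 runs even if parsing raises an exception.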