1 Motivation
The core facility for string matching in Prolog is provided by DCG (Definite Clause Grammars). Using DCGs is typically more verbose but gives reuse, modularity, readability and mixing with arbitrary Prolog code in return. Supporting regular expressions has some advantages: (1) in simple cases, the terse specification of a regular expression is more comfortable; (2) many programmers are familar with them; and (3) regular expressions are part of domain specific languages one may wish to implement in Prolog, e.g., SPARQL.
There are roughly three options for adding regular expressions to
Prolog. One is to simply interpret them in Prolog. Given Prolog's
unification and backtracking facilities this is remarkable simple and
performs quite reasonably. Still, implementing all facilities of modern
regular expression engines requires significant effort. Alternatively,
we can compile them into DCGs. This brings terse expressions to
DCGs while staying in the same framework. The disadvantage is that
regular expressions become programs that are hard to work with, making
this approach less attractive for applications that potentially execute
many different regular expressions. The final option is to wrap an
existing regular expression engine. This provides access to a robust
implementation for which we only have to document the Prolog binding.
That is the option taken by library library(pcre)
.