2.5.8.1 Strings
SWI-Prolog string handling has evolved over time. The functions that create atoms or strings from char* or wchar_t* are “old school”, as are the functions that return a string as char* or wchar_t*. The PL_{get,unify,put}_[nw]chars() family is friendlier when it comes to different input, output, encoding and exception handling.
Roughly, the modern API is PL_get_nchars(), PL_unify_chars() and PL_put_chars() on terms. For atoms, only half of this API exists: PL_new_atom_mbchars() and PL_atom_mbchars(), which take an encoding, a length and a char*.
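One reason an explicit length is useful alongside the char* is that a NUL-terminated interface cannot represent text containing an embedded zero byte. The sketch below illustrates this with plain std::string. The helpers from_c_string() and from_counted() are hypothetical names made up for this example and are not part of the SWI-Prolog API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: builds a string the "old school" way, relying on
// the terminating NUL to find the end of the text.
std::string from_c_string(const char *s)
{ return std::string(s);
}

// Hypothetical helper: builds a string from an explicit length and a
// pointer, so embedded NUL bytes are kept.
std::string from_counted(std::size_t len, const char *s)
{ return std::string(s, len);
}
```

With the five bytes 'a', 'b', 0, 'c', 'd' as input, from_c_string() stops at the zero byte, whereas from_counted() keeps all five bytes.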
For return values, char* is dangerous because it can point to local or stack memory. For this reason, wherever possible, the C++ API returns a std::string, which contains a copy of the string. This can be slightly less efficient than returning a char*, but it avoids some subtle and pervasive bugs that even address sanitizers can't detect. (To minimize the overhead of passing strings, one can pass in a pointer to a string rather than return a string value; this is more cumbersome, and modern compilers can often optimize the code to avoid copying the return value.)
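The hazard can be sketched without any Prolog code. In the example below, name_in_shared_buffer() is a hypothetical stand-in for a C function that returns a pointer into a buffer it owns; it is not an SWI-Prolog function. The two wrappers show the alternatives described above: returning a std::string copy, and filling in a caller-supplied string.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-in for a C function that returns a pointer into a
// buffer it owns. Every call overwrites the previous result.
const char *name_in_shared_buffer(int id)
{ static char buf[32];
  std::snprintf(buf, sizeof buf, "item_%d", id);
  return buf;
}

// Returning std::string copies the bytes while they are still valid, so
// the result is unaffected by later calls.
std::string name_as_string(int id)
{ return std::string(name_in_shared_buffer(id));
}

// Alternative: the caller passes a pointer to the string to fill in.
// The caller's string can reuse its storage across calls, at the cost
// of a more cumbersome interface.
void name_into(int id, std::string *out)
{ out->assign(name_in_shared_buffer(id));
}
```

Two pointers obtained from name_in_shared_buffer() refer to the same storage, so the first silently changes when the second call is made; two values obtained from name_as_string() remain independent.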
Some functions require allocating string space using PL_STRINGS_MARK(). The PlStringBuffers class provides a RAII wrapper that ensures the matching PL_STRINGS_RELEASE() is done. The PlAtom or PlTerm member functions that need the string buffer use PlStringBuffers, and then copy the resulting string to a std::string value.
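The mark/release pattern and the copy-before-release step can be sketched with a simplified model. Everything in this example is hypothetical: buffer_stack, ScopedBuffers, temp_chars() and copy_out() only imitate the idea and do not reflect how PlStringBuffers or the Prolog string stack are implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of a stack of temporary string buffers.
static std::vector<std::string> buffer_stack;

// Hypothetical RAII guard: the constructor records a mark and the
// destructor releases every buffer pushed after that mark.
class ScopedBuffers
{
public:
  ScopedBuffers() : mark_(buffer_stack.size()) {}
  ~ScopedBuffers() { buffer_stack.resize(mark_); }
  ScopedBuffers(const ScopedBuffers &) = delete;
  ScopedBuffers &operator=(const ScopedBuffers &) = delete;

private:
  std::size_t mark_;
};

// Hypothetical function that hands out a pointer into the buffer stack.
// The pointer is only meant to be used until the next push or release.
const char *temp_chars(const std::string &text)
{ buffer_stack.push_back(text);
  return buffer_stack.back().c_str();
}

// Copies the temporary text into a std::string. The return value is
// constructed before the guard's destructor releases the buffer.
std::string copy_out(const std::string &text)
{ ScopedBuffers guard;
  const char *p = temp_chars(text);
  return std::string(p);
}
```

Because the guard releases in its destructor, the buffer is freed on every exit path, including an exception thrown between the mark and the release.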
The C++ API has functions such as PlTerm::get_nchars() that use PlStringBuffers and then copy the result to a std::string result, so the programmer often doesn't need to use PlStringBuffers.
- PlStringBuffers
- A RAII wrapper for allocating a string that is created using BUF_STACK. This isn't needed if you use a method such as PlTerm::as_string(), but is needed for calling certain PL_*() or Plx_*() wrapped functions. The constructor calls PL_STRINGS_MARK() and the destructor calls PL_STRINGS_RELEASE(). Here is an example of its use, for writing an atom to a stream, using Plx_atom_wchars(), which must be called while a PlStringBuffers object is in scope:
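PREDICATE(w_atom_cpp, 2)
{ auto stream(A1), term(A2);
  PlStream strm(stream, STIO_OUTPUT);
  // Marks the string buffers here; released when the predicate returns.
  PlStringBuffers _string_buffers;
  // sa points into the string buffers, so use it only inside this scope.
  const pl_wchar_t *sa = Plx_atom_wchars(term.as_atom().unwrap(), nullptr);
  strm.printfX("/%Ws/", sa);
  return true;
}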