1.4.4 Alternation, Aggregation, Encapsulation, and Enumeration
Alternation
The protobuf grammar provides a reserved word, optional, that indicates
that the production rule that it refers to may appear once or not at all
in a protobuf message. Since Prolog has its own means of alternation,
this reserved word is not supported on the Prolog side. It is
anticipated that customary Prolog mechanisms for nondeterminism (e.g.
backtracking) will be used to generate and test alternatives.
Note that required and optional have been removed from the proto3
specification, making all fields optional. This has been partially
revised in releases 3.12 and later. In general, you should not expect
any field to exist, nor can you expect a repeated field to have at least
one item.
Also note that the handling of missing fields differs slightly between proto2 and proto3: proto2 allows specifying a default value, whereas proto3 uses 0 and "" as the defaults for numbers and strings and omits encoding any field that has one of those default values.
TODO: determine correct behavior for oneof with a default field value.
Aggregation
It is possible to specify homogeneous vectors of things (e.g. lists
of numbers) using the repeated attribute. You specify a repeated field
as follows:
    repeated(22, float([1,2,3,4])),
    repeated(23, enum(tank_state([empty, half_full, full]))).
The first clause above will cause all four items in the list to be encoded in the wire-stream as IEEE-754 32-bit floating point numbers, all with tag 22. The decoder will aggregate all items in the wire-stream with tag 22 into a list as above. Likewise, all the items listed in the second clause will be encoded in the wire-stream according to the mapping defined in an enumeration (described below) tank_state/2, each with tag 23.
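As a minimal sketch of the encode/aggregate behavior just described, the following round-trips a list through a repeated field with tag 22. It assumes library(protobufs) is installed and uses only protobuf_message/2 with the repeated(Tag, float(List)) form shown above; repeated_floats_roundtrip/2 is a hypothetical name chosen for illustration.

```prolog
:- use_module(library(protobufs)).

% repeated_floats_roundtrip(+In, -Out)
% Hypothetical helper: encode In as a repeated float field with tag 22,
% then decode the resulting wire codes back into the list Out.
repeated_floats_roundtrip(In, Out) :-
    protobuf_message(protobuf([repeated(22, float(In))]), Codes),
    protobuf_message(protobuf([repeated(22, float(Out))]), Codes).
```

Calling `repeated_floats_roundtrip([1,2,3,4], Xs)` should collect the four decoded items into Xs. Because the items travel as 32-bit floats, values that are not exactly representable in that width may come back slightly changed.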
You can also encode vectors of embedded messages using
repeated_embedded. This uses a "template" for the individual messages
and a list of messages in the wire stream. For example:
    repeated_embedded(Tag, protobuf([string(1,_Key),string(2,_Value)]), Fields)
where Fields gets a list (possibly empty), with each item of the form
protobuf([string(1,_Key),string(2,_Value)]).
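One way to build the Fields list is sketched here, assuming the data starts out as Key-Value pairs of strings. The helpers pairs_fields/2 and encode_pairs/3 are hypothetical names, not part of library(protobufs); only protobuf_message/2 and the repeated_embedded form shown above come from the library.

```prolog
:- use_module(library(protobufs)).

% pairs_fields(?Pairs, ?Fields)
% Hypothetical helper: relate Key-Value pairs to the per-item
% protobuf(...) terms that make up the Fields list.
pairs_fields([], []).
pairs_fields([K-V|Pairs], [protobuf([string(1,K),string(2,V)])|Fields]) :-
    pairs_fields(Pairs, Fields).

% encode_pairs(+Tag, +Pairs, -Codes)
% Hypothetical helper: encode the pairs as a vector of embedded
% messages under Tag, using the template form shown above.
encode_pairs(Tag, Pairs, Codes) :-
    pairs_fields(Pairs, Fields),
    protobuf_message(
        protobuf([repeated_embedded(Tag,
                      protobuf([string(1,_Key),string(2,_Value)]),
                      Fields)]),
        Codes).
```

For example, `encode_pairs(5, ["a"-"1", "b"-"2"], Codes)` would be expected to emit one embedded message per pair, each with tag 5. Because pairs_fields/2 is a pure relation, it can also turn a decoded Fields list back into pairs.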
Notes:
Beware that there is no explicit means to encode an empty set. The
protobuf specification provides that a repeated field may match a tag
zero or more times. The empty set, while legal, produces no output on
encode. While decoding a repeated term, failure to match the specified
tag will yield an empty set of the specified host type.
An omitted optional field is handled the same way as a repeated field
with an empty set.
The protobuf grammar provides a variant of the repeated field known as
"packed." This is represented similarly to repeated, e.g.:
    packed(22, float([1,2,3,4])),
    packed(23, enum(tank_state([empty, half_full, full]))).
Handling missing fields
For input, you can wrap fields in repeated, so that if a field is there,
it gets a length-1 list and if it's missing, an empty list:
?- Codes = [82,9,105,110,112,117,116,84,121,112,101],
   protobuf_message(protobuf([embedded(10, protobuf([repeated(13, integer64(I))]))]), Codes),
   protobuf_message(protobuf([embedded(10, protobuf([repeated(13, double(D))]))]), Codes),
   protobuf_message(protobuf([repeated(10, string(S))]), Codes).
I = [7309475598860382318],
D = [4.272430685433854e+180],
S = ["inputType"].

?- Codes = [82,9,105,110,112,117,116,84,121,112,101],
   protobuf_message(protobuf([repeated(10, string(S)), repeated(11, integer64(I))]), Codes).
S = ["inputType"],
I = [].
This technique also works for output, where a missing field simply produces nothing in the wire format:
?- protobuf_message(protobuf([repeated(10, string([]))]), Codes).
Codes = [].

?- protobuf_message(protobuf([repeated(10, string(S))]), []).
S = [].
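After decoding, the zero-or-one-element list can be collapsed into a plain value with a default. This is a minimal sketch in plain Prolog; optional_value/3 is a hypothetical helper, not part of library(protobufs). It assumes that, if the tag appears more than once, the last occurrence should win, which is how protobuf parsers treat repeated occurrences of a non-repeated scalar field.

```prolog
:- use_module(library(lists)).

% optional_value(+Wrapped, +Default, -Value)
% Wrapped is the list obtained by decoding a field wrapped in repeated:
% [] when the field was absent, [V] when it was present once.
% If the field occurred several times, the last occurrence is used.
optional_value([], Default, Default).
optional_value([V|Vs], _Default, Value) :-
    last([V|Vs], Value).
```

With the decoded results above, `optional_value([], 0, I)` binds I to 0, the proto3 default for a number, while `optional_value(["inputType"], "", S)` binds S to "inputType".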
Encapsulation and Enumeration
It is possible to embed one protocol buffer specification inside another
using the embedded term. The following example shows a vector of numbers
being placed in an envelope that contains a command enumeration.
Enumerations are a compact method of sending tokens from one system to another. Most occupy only two bytes in the wire-stream. An enumeration requires that you specify a callable predicate like commands/2, below. The first argument is an atom specifying the name of the token, and the second is an integer that specifies the token's value. These must, of course, match a corresponding enumeration in the .proto file.
Note: You must expose this predicate to the protobufs module by assigning it explicitly.
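protobufs:commands(Key, Value) :-
    commands(Key, Value).

% Enumeration mapping: token name to integer value.
commands(square, 1).
commands(decimate, 2).
commands(transform, 3).
commands(inverse_transform, 4).

% vector_type/2 is called by basic_vector/2 but is not defined in this
% excerpt. This clause is an assumption inferred from the sample output,
% where each double is preceded by key byte 17 (tag 2, 64-bit wire type).
vector_type(double(_List), 2).

basic_vector(Type, Proto) :-
    vector_type(Type, Tag),
    Proto = protobuf([ repeated(Tag, Type) ]).

% Wrap the vector in an envelope: the command as an enumeration with
% tag 1, the vector as an embedded message with tag 2.
send_command(Command, Vector, WireCodes) :-
    basic_vector(Vector, Proto1),
    Proto = protobuf([enum(1, commands(Command)),
                      embedded(2, Proto1)]),
    protobuf_message(Proto, WireCodes).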
Use it as follows:
?- send_command(square, double([1,22,3,4]), WireCodes).
WireCodes = [8, 1, 18, 36, 17, 0, 0, 0, 0, 0, 0, 240, 63, 17, 0, 0, 0, 0, 0, 0, 54, 64, 17, 0, 0, 0, 0, 0, 0, 8, 64, 17, 0, 0, 0, 0, 0, 0, 16, 64].

?- send_command(Cmd, V, $WireCodes).
Cmd = square,
V = double([1.0, 22.0, 3.0, 4.0]).
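To read the WireCodes list by hand, it helps to split each field key into its tag and wire type. This is a small sketch in plain Prolog that relies only on the standard protobuf key layout (tag shifted left by three bits, combined with a three-bit wire type). The helper key_parts/3 is hypothetical and handles single-byte keys only.

```prolog
% key_parts(+KeyByte, -Tag, -WireType)
% Hypothetical helper: split a single-byte field key into its tag and
% wire type. Valid only for keys below 128 (tags 1..15); larger tags
% are encoded as a multi-byte varint, which this sketch does not handle.
key_parts(KeyByte, Tag, WireType) :-
    integer(KeyByte),
    KeyByte >= 0,
    KeyByte < 128,
    Tag is KeyByte >> 3,
    WireType is KeyByte /\ 7.
```

Applied to the output above, `key_parts(8, T, W)` gives T = 1, W = 0 (the enumeration, sent as a varint), `key_parts(18, T, W)` gives T = 2, W = 2 (the length-delimited embedded message, followed by its length 36), and `key_parts(17, T, W)` gives T = 2, W = 1 (each 64-bit double inside the embedded message).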
Compatibility Note: The protobuf grammar (protobuf-2.1.0) permits enumerations to assume negative values. This requires them to be encoded as integers. Google's own Golden Message unit-test framework has enumerations encoded as regular integers, without the "zigzag" encoding. Therefore, negative values are space-inefficient, but they are allowed.
An earlier version of protobuf_message/2 assumed that enumeration values could not be zero, and there might still be incorrect assumptions in the code, resulting in either exceptions or silent failure.
Heterogeneous Collections
Using Protocol Buffers, it is easy to specify fixed data structures and homogeneous vectors like one might find in languages like C++ and Java. It is, however, quite another matter to interwork with these languages when requirements call for working with compound structures, arrays of compound structures, or unstructured collections (e.g. bags) of data.
At bottom, a wire-stream is nothing more than a concatenated stream of primitive wire type strings. As long as you can associate a tag with its host type in advance, you will have no difficulty in decoding the message. You do this by supplying the structure. Tell the parser what is possible and let the parser figure it out on its own, one production at a time. An example may be found in the appendix.