cl-protobufs is an implementation of Google protocol buffers for Common Lisp.
- Install protoc

  Common Lisp code for a given .proto file is generated by a plug-in for protoc, the protocol buffer compiler. The plug-in is written in C++ and requires the full version of Google's protocol buffer code to be installed in order to build, not just the precompiled protoc binaries. Google's ABSL C++ library must also be installed.

  Depending on your system, you may be able to install these libraries through apt (or another package manager). If you need to install from source, see the example in our continuous integration tests.

  Make sure the protoc binary is on your PATH.

- Build the Lisp protoc plugin

  We use CMake to build and install the Lisp protoc plugin.

  $ cd cl-protobufs/protoc
  $ cmake . -DCMAKE_CXX_STANDARD=17
  $ cmake --build . --target install --parallel 16

  Make sure the installation directory is on your PATH.
There are two ways to generate Lisp code from a .proto file: using protoc directly or using ASDF.
If you add :defsystem-depends-on (:cl-protobufs.asdf) to your defsystem,
ASDF can generate Lisp code directly from your .proto files. For each
.proto file add a component of type :protobuf-source-file with a
:proto-pathname. You may also need to specify :proto-search-path to
help the protoc compiler find protos imported by your .proto file.
The pathnames can be relative with respect to the pathname of the system
you are building.
Several examples can be found in cl-protobufs.asd.
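As a rough illustration of the ASDF route, a system definition might look like the sketch below. The system name my-system, the component name, and the file paths are hypothetical placeholders, and writing :proto-search-path as a list of directories is an assumption; cl-protobufs.asd remains the authoritative reference for the exact option forms.

```lisp
;; Hypothetical system definition; all names and paths are placeholders.
(defsystem :my-system
  :defsystem-depends-on (:cl-protobufs.asdf)
  :depends-on (:cl-protobufs)
  :components
  ((:protobuf-source-file "my-messages"
    ;; Location of the .proto file, relative to the system's pathname.
    :proto-pathname "protos/my-messages.proto"
    ;; Assumed form: directories searched for imported .proto files.
    :proto-search-path ("protos/"))))
```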
To test your build, try generating Lisp code from the
cl-protobufs/tests/case-preservation.proto file with the following command.
Note that the command may differ slightly depending on what directory you're in
and where you installed protoc-gen-cl-pb. In this case we assume you're in the
directory containing the cl-protobufs directory. The reason will become
clear in a moment.
$ protoc --plugin=protoc-gen-cl-pb=/usr/local/bin/protoc-gen-cl-pb \
    --cl-pb_out=output-file=case-preservation.lisp:/tmp \
    cl-protobufs/tests/case-preservation.proto

This command should generate a file named case-preservation.lisp in the
/tmp/ directory.
When a .proto file imports another .proto file, protoc needs to know how
to find the imported file. It does this by looking for the file relative to the
values passed to it with the --proto_path option (or the -I short option).
To see an example of this, you can try generating Lisp code for
cl-protobufs/tests/extend.proto. Still in the same directory, run the
following command:
$ protoc --plugin=protoc-gen-cl-pb=/usr/local/bin/protoc-gen-cl-pb \
    --cl-pb_out=output-file=extend.lisp:/tmp \
    --proto_path=cl-protobufs/tests \
    cl-protobufs/tests/extend.proto

The file /tmp/extend.lisp should be generated. Note that the .lisp file for
each imported file also needs to be generated separately.
Build and run the tests with ASDF:
- Install Quicklisp and make sure to add it to your Lisp implementation's init file.
- Install ASDF if it isn't part of your Lisp implementation.
- Create a link to cl-protobufs so that Quicklisp will use the local version:

  $ cd ~/quicklisp/local-projects
  $ ln -s .../path/to/cl-protobufs

- Start Lisp and evaluate (ql:quickload :cl-protobufs).
- Load and run the tests:

  cl-user> (asdf:test-system :cl-protobufs)

- Create a pull request like usual through GitHub.
- Sign the Google CLA. This needs to be done only once for all Google projects and is required before your pull request can be approved.
- Add someone in the Googlers team as a reviewer.
- When the reviewer is satisfied they will add the Ready for Google label.
- The pull request will later be merged.
The files example/math.lisp and example/math-test.lisp give a simple example
of creating a proto structure, populating its fields, serializing, and then
deserializing. Looking over these files is a good way to get a quick feel for
the protobuf API, which is described in detail below.
The file math.proto has two messages: AddNumbersRequest and
AddNumbersResponse.
The prefix cl-protobufs. is automatically added to the package name specified
by package math;, resulting in cl-protobufs.math as the full package name
for the generated code. This is done to avoid conflicts with existing packages.
The full name of the Lisp type for the AddNumbersRequest message is
cl-protobufs.math:add-numbers-request.
This section explains the code generated from a .proto file by
protoc-gen-cl-pb, the Common Lisp plugin for protoc. See the "protoc"
directory in this distribution for the plugin code.
Note that protoc-gen-cl-pb transforms protobuf names like MyMessage or
my_field to names that are more Lisp-like, such as my-message and
my-field.
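To make the naming rule concrete, here is a small stand-alone sketch of such a conversion. The function lispify-name is hypothetical and covers only the two simple patterns mentioned above; the plugin's actual rules (for acronyms, digits, and so on) may differ.

```lisp
(defun lispify-name (name)
  "Hypothetical helper: turn \"MyMessage\" or \"my_field\" into a lower-case,
hyphen-separated string. Not the plugin's actual algorithm."
  (with-output-to-string (out)
    (loop for i from 0 below (length name)
          for ch = (char name i)
          do (cond
               ;; Underscores become hyphens.
               ((char= ch #\_) (write-char #\- out))
               ;; An upper-case letter that follows a lower-case one starts a new word.
               ((and (upper-case-p ch)
                     (> i 0)
                     (lower-case-p (char name (1- i))))
                (write-char #\- out)
                (write-char (char-downcase ch) out))
               ;; Everything else is copied in lower case.
               (t (write-char (char-downcase ch) out))))))
```

With this sketch, (lispify-name "AddNumbersRequest") returns "add-numbers-request", which matches the message name used in the math example.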
The code generated by protoc-gen-cl-pb uses macros to define the generated API.
Protocol buffer messages should be defined in .proto files instead of invoking
these macros directly. Internal details that are not in the API documented below
may change incompatibly in the future.
The generated code for each .proto file lives in a package derived from the
package statement.

package abc;

The generated Lisp package for the above is cl-protobufs.abc. The prefix
"cl-protobufs." is added in order to avoid conflicts with another Lisp package
named "abc". If you prefer to use a shorter package name we recommend using
:local-nicknames as we do in many files in this library. Example:

(defpackage #:my.project
  (:use #:common-lisp)
  (:local-nicknames (#:abc #:cl-protobufs.abc))) ; Referenced as abc:

You may have multiple .proto files use the same package if desired. The
package exports the symbols described in the sections below.
Groups are a deprecated way of defining a nested message and a field in a single declaration:

syntax = "proto2";
package abc;
message Foo {
  optional group Bar = 1 {
    optional string a = 1;
    optional int32 b = 2;
  }
}

This is treated exactly the same way as defining a nested message named Bar
and a field named bar:

syntax = "proto2";
package abc;
message Foo {
  message Bar {
    optional string a = 1;
    optional int32 b = 2;
  }
  optional Bar bar = 1;
}

See the following sections for details on how to access nested messages and fields from Lisp.
This section uses the following protocol buffer messages as an example:

syntax = "proto2";
package abc;
message DateRange {
  optional string min_date = 1;
  optional string max_date = 2;
}

Construct a date-range message:

(make-date-range :min-date "2020-05-27" :max-date "2020-05-28")

Set the value of the max-date field on an already-constructed range
message:

(setf (date-range.max-date range) "2022-07-29")

Get the value of the min-date field from the range message:

(date-range.min-date range)

If the field was explicitly set, that value is returned. Otherwise, a default
value is returned: the default value specified for this field in the .proto
file, if any, or a type-specific default value. Type-specific default values are
as follows:
| protobuf type | default value |
|---|---|
| numerics | zero of the appropriate type |
| strings | the empty string |
| messages | nil |
| groups | nil |
| enums | the first value listed in the .proto file |
| booleans | nil |
| repeated fields | the empty list |
| symbols | nil |
Note that with nested messages and long message names, field accessor names can
get pretty long. If speed is not an issue it is also possible to access fields
via the cl-protobufs:field generic function, which is an alternative (slower,
but often more concise) way to read a protobuf field's value:

(cl-protobufs:field range 'min-date)

Check whether the min-date field has been set on range:

(date-range.has-min-date range)

(Returns t if the min-date field has been set, otherwise nil.)

Clear the value of the min-date field on range:

(date-range.clear-min-date range)

(After the above call, (date-range.has-min-date range) returns nil and
(date-range.min-date range) returns the default value.)
This section uses the following protocol buffer message as an example:

syntax = "proto3";
message Event {
  int32 day = 1;
  int32 month = 2;
  int32 year = 3;
  repeated string invitees = 4;
}

The generated code for proto3 messages is similar to proto2 messages. The only
difference is the introduction of fields with no specified label, which are
known as "singular" fields. For singular fields, the state of being unset and
the state of being set to the default value for the type are indistinguishable.
So, has-* functions, such as (event.has-day msg) are not defined.
The has-* functions for repeated fields are defined. They return true if and
only if the field has been manually set and has not been cleared since.
This library supports optional fields in proto3 messages. These fields have the same semantics and generated code as proto2 optional fields.
This section uses the following protocol buffer message as an example:

message Dictionary {
  map<int32,string> map_field = 1;
}

This creates an associative map with keys of type int32 and values of type
string. In general, the key type can be any scalar type except float and
double. The value type can be any protobuf type.

For a message dict of type Dictionary, the following functions are created to
access the map:

*-gethash returns the value associated with 2 in the map-field field in dict.
If there is no value explicitly set, this function returns the default value of
the value type. In this case, the empty string.

(dictionary.map-field-gethash 2 dict)

*-gethash can be used with setf to set fields as well.
This associates 1 with the value "one" in the map-field field in dict:

(setf (dictionary.map-field-gethash 1 dict) "one")

*-remhash removes any entry with key 1 in the map-field field in dict:

(dictionary.map-field-remhash 1 dict)

Like the other fields, these functions are aliased by methods which are slower
but more concise. Examples of the methods are: (map-field-gethash 2 dict),
(setf (map-field-gethash 1 dict) "one"), and (map-field-remhash 1 dict).
These have the same functionality as the above 3 functions respectively.
These functions are type checked, and interfacing with the map with these
functions alone will guarantee that (de)serialization functions as well as the
(dictionary.has-map-field dict) function will work properly. The underlying
hash table may be accessed directly via (dictionary.map-field dict), but doing
so may result in undefined behavior.
enum DayOfWeek {
  DAY_UNDEFINED = 0;
  MON = 1;
  TUE = 2;
  WED = 3;
  ...
}

The above enum defines the Lisp type day-of-week, like this:

(deftype day-of-week () '(member :day-undefined :mon :tue :wed ...))

Each enum value is represented by a keyword symbol which is mapped to/from its numeric equivalent during serialization and deserialization.

Convert a keyword symbol to its numeric value:

(defun day-of-week-to-int (name) ...)

(Example: (day-of-week-to-int :mon) => 1)

Convert a number to its symbolic name:

(defun int-to-day-of-week (num) ...)

(Example: (int-to-day-of-week 1) => :MON)

Each numeric enum value is also bound to a constant by the same name but with "+" on each side:

(defconstant +mon+ 1)

Note that most enums should have an "undefined" or "unset" field with value 0
so that message fields using this enum type have a reasonable default value that
is distinguishable from valid values. (It probably wouldn't make sense for
Monday to be the default day.)
Name conflicts with other enum constants can easily happen if they all have a
field named "undefined", so in this case we named the "undefined" field with a
DAY_ prefix. For this reason it is also common to nest an enum inside the
message that uses it.
When an enum is defined inside of a message instead of at top level in the
.proto file, the message name is prepended to the name. For example, if
DayOfWeek had been defined inside of a Schedule message it would result in
these definitions:
(deftype schedule.day-of-week () '(member :day-undefined :mon :tue :wed ...))
(schedule.day-of-week-to-int :mon) => 1
(int-to-schedule.day-of-week 1) => :MON
(defconstant +schedule.day-undefined+ 0) ; may not need the DAY_ prefix now.
(defconstant +schedule.mon+ 1)
...
For backward compatibility, unrecognized enum values are retained during deserialization and are output again when serialized. This allows a client that acts as a pass-through for the enum data to function correctly even if it uses a different version of the proto than the systems it is communicating with.
Message Schema V1:

enum DayOfWeek {
  DAY_UNDEFINED = 0;
  MON = 1;
  TUE = 2;
  WED = 3;
}
message DayIWillWork {
  optional DayOfWeek workday = 1;
}

Message Schema V2:

enum DayOfWeek {
  DAY_UNDEFINED = 0;
  MON = 1;
  TUE = 2;
  WED = 3;
  THUR = 4;
}
message DayIWillWork {
  optional DayOfWeek workday = 1;
}

If we send a V2 message:

DayIWillWork {
  workday: THUR
}

to a V1 system it will save the fact that the enum it
received is 4. Calling (day-i-will-work.workday v2-proto)
will return :%undefined-4. Reserialization will add the
workday enum value to the serialized protobuf message, and
deserialization on a V2 system will properly add the
new :thur enum value to the new protocol buffer message.
Trying to call (setf (day-i-will-work.workday v2-proto) :%undefined-4)
will signal an error on a V1 or V2 system since :%undefined-4 isn't a
known enum value.
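The pass-through behaviour can be pictured with a toy model in plain Common Lisp. This is only a sketch of the idea, not the library's implementation: the table *v1-day-of-week* and the functions toy-int-to-keyword and toy-keyword-to-int are hypothetical names invented for this illustration.

```lisp
(defparameter *v1-day-of-week*
  '((0 . :day-undefined) (1 . :mon) (2 . :tue) (3 . :wed))
  "Hypothetical table of the enum values known to the V1 schema.")

(defun toy-int-to-keyword (num)
  "Return the known keyword for NUM, or a :%UNDEFINED-<num> placeholder."
  (or (cdr (assoc num *v1-day-of-week*))
      (intern (format nil "%UNDEFINED-~D" num) :keyword)))

(defun toy-keyword-to-int (keyword)
  "Inverse mapping; also recovers the number hidden in a placeholder keyword."
  (or (car (rassoc keyword *v1-day-of-week*))
      (let ((name (symbol-name keyword))
            (prefix "%UNDEFINED-"))
        (and (> (length name) (length prefix))
             (string= prefix name :end2 (length prefix))
             (parse-integer name :start (length prefix))))))
```

In this model a V1 reader maps the unknown number 4 to :%undefined-4, and mapping that keyword back yields 4 again, so the value survives a round trip even though V1 has no name for it.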
This section uses the following protobuf message as an example:

message Person {
  optional string name = 1;
  oneof AgeOneof {
    int32 age = 2;
    string birthdate = 3;
  }
}

To access fields inside a oneof, use the standard accessors outlined above.
These fields have the semantics of proto2 optional fields, so has-* functions
are created. For example:

(setf (person.age bob) 5)

...will set the age field of a Person object bob to 5.

Defining a oneof also creates two special functions:

*-oneof-case will return the Lisp symbol corresponding to the field which is currently
set. So, if we set age to 5, then this will return the symbol AGE. If no
field is set, this function will return nil.

(person.age-oneof-case bob)

If we set the age field on our bob object, then:

(person.has-age bob) => t
(person.has-birthdate bob) => nil

To clear all fields inside of the oneof age-oneof:

(person.clear-age-oneof bob)

We use the following protocol buffer message as an example in this section:

message RepeatedProto {
  repeated int32 my_int_list = 1;
  repeated int32 my_int_vector = 2 [(lisp_container) = VECTOR];
}

This creates a message with two fields.
The field my_int_list stores a list of integers.
The default value is the empty list, i.e. nil.
The field my_int_vector stores a vector of integers.
The default value is an empty vector which is extendable with a fill pointer.
The APIs for the list and vector repeated fields are the same. There is a minor difference when pushing onto the different types of repeated field.
push-* pushes a value onto the corresponding list or vector field.
This pushes the integer 1 onto the my_int_list field in the RepeatedProto:

(repeated-proto.push-my-int-list 1 my-message)

(Since we push onto a list, this will push onto the front of the list.)

This pushes the integer 1 onto the my_int_vector field in the RepeatedProto:

(repeated-proto.push-my-int-vector 1 my-message)

(Since we push onto a vector, this will push onto the back of the vector.)
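
The difference in ordering is the same one seen with the standard Common Lisp operators push and vector-push-extend. The sketch below uses only those standard operators on plain Lisp data; the function push-demo is a hypothetical name, and it is an assumption that the generated push-* functions behave analogously.

```lisp
(defun push-demo ()
  "Push 1, 2, 3 onto a list and onto an adjustable vector; return both as lists."
  (let ((as-list '())
        (as-vector (make-array 0 :adjustable t :fill-pointer 0)))
    (dolist (n '(1 2 3))
      (push n as-list)                   ; newest element ends up first
      (vector-push-extend n as-vector))  ; newest element ends up last
    (list as-list (coerce as-vector 'list))))
```

Calling (push-demo) returns ((3 2 1) (1 2 3)): the list ends up in reverse insertion order, while the vector keeps insertion order.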

The has-* functions on a repeated field return true if there
is at least one element in the sequence:

(repeated-proto.has-my-int-list my-message)
(repeated-proto.has-my-int-vector my-message)

The length-of-* function returns the number of elements in the repeated field:

(repeated-proto.length-of-my-int-list my-message)
(repeated-proto.length-of-my-int-vector my-message)

The nth-* function returns the element at position n in the repeated field:

(repeated-proto.nth-my-int-list n my-message)
(repeated-proto.nth-my-int-vector n my-message)

(If the repeated field has length less than n, we signal an error.)

The clear-* function clears the repeated field of all elements:

(repeated-proto.clear-my-int-list my-message)
(repeated-proto.clear-my-int-vector my-message)

A string field may be annotated as a symbol field, which will cause it to be represented in Lisp as an interned symbol rather than a string. Example:

import "third_party/lisp/cl_protobufs/proto2-descriptor-extensions.proto";
message Foo {
  optional string symbol = 1 [(lisp_type) = "CL:SYMBOL"];
}

When converting from text mode, we uppercase the string, and if it does not contain a colon we intern it as a keyword symbol, except that we special case "T" and "NIL" to refer to the corresponding Lisp symbols. If the string contains a colon at the beginning, then we also intern it as a keyword symbol, but if it contains a colon elsewhere in the string, the portion preceding the colon is interpreted as a package name. Thus, the following lines are equivalent
symbol: "foo"
symbol: "FOO"
symbol: "keyword:foo"
as are
symbol: "t"
symbol: "common-lisp:t"
but note that these are different:
symbol: "t"
symbol: ":t"
Multiple colons are not allowed, nor are the single-quote, double-quote, and backslash characters.
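The rules above can be restated as a short stand-alone function. This is a toy model written from the description, not the library's parser: toy-parse-symbol-name is a hypothetical name, and the sketch does not reject the disallowed characters or multiple colons.

```lisp
(defun toy-parse-symbol-name (string)
  "Toy model of the text-format symbol rules described above."
  (let* ((name (string-upcase string))
         (colon (position #\: name)))
    (cond ((null colon)
           ;; No colon: a keyword, except for the special cases T and NIL.
           (cond ((string= name "T") t)
                 ((string= name "NIL") nil)
                 (t (intern name :keyword))))
          ((zerop colon)
           ;; Leading colon: always a keyword.
           (intern (subseq name 1) :keyword))
          (t
           ;; Colon elsewhere: the text before it names an existing package.
           (intern (subseq name (1+ colon))
                   (find-package (subseq name 0 colon)))))))
```

Under this model "foo", "FOO", and "keyword:foo" all yield :FOO, "t" and "common-lisp:t" yield the Lisp symbol T, and ":t" yields the keyword :T.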
TODO
This section describes the generated code API for a protobuf service in a proto file.
You must have a corresponding RPC library as well; cl-protobufs just generates the
methods.
The gRPC library, or any library containing the following form:

(setq cl-protobufs:*rpc-call-function* 'start-call)

can be used as the underlying RPC mechanism. We will show examples with the expectation that you are using gRPC.
The following example service definition is used throughout this section.

package math;
message AddNumbersRequest {
  optional int32 number1 = 1;
  optional int32 number2 = 2;
}
message AddNumbersResponse {
  optional int32 sum = 1;
}
service MyService {
  rpc AddNumbers(AddNumbersRequest) returns (AddNumbersResponse) {}
}

The cl-protobufs protoc plugin generates two packages:

- cl-protobufs.math
- cl-protobufs.math-rpc

The package cl-protobufs.math contains the add-numbers-request and add-numbers-response
protocol buffer messages.
The package cl-protobufs.math-rpc contains a stub for call-add-numbers. A message can be
sent to a server implementing the MyService service with:

(grpc:with-insecure-channel
    (channel (concatenate 'string hostname ":" (write-to-string port-number)))
  (let* ((request (cl-protobufs.math:make-add-numbers-request
                   :number1 1 :number2 2))
         (response (cl-protobufs.math-rpc:call-add-numbers channel request)))
    ...))

There is currently no known supported open framework for implementing the server portion of Protocol Buffer services in Lisp.

(defgeneric add-numbers-impl (channel request rpc))

A generic function is generated for each RPC in the service definition. The name is the concatenation of the protobuf method name (in its Lisp form) and the string "-impl". Here request is an add-numbers-request message.
To implement the service define a method for each generic function. The method
must return the type declared in the .proto file. Example:

(defmethod add-numbers-impl (channel (request add-numbers-request) rpc)
  (make-add-numbers-response :sum (+ (add-numbers-request.number1 request)
                                     (add-numbers-request.number2 request))))

The channel argument is supplied by the underlying RPC code and differs
depending on which transport mechanism (HTTP, TCP, IPC, etc) is being used. The
channel and rpc arguments can usually be ignored.
This section documents the symbols exported from the cl-protobufs package.
message is the base type from which every generated protobuf message inherits:

(defstruct message ...)

print-text-format prints a protocol buffer message to a stream. object is the protocol buffer
message, group, or extension to print. stream is the stream to print to.
pretty-print-p may be set to nil to minimize textual output by omitting
most whitespace.

(defun print-text-format (object &key
                                 (indent -2)
                                 (stream *standard-output*)
                                 (pretty-print-p t)))

parse-text-format parses a protocol buffer message written in text-format.
type is the type of message to parse. stream is the stream to read from.

(defun parse-text-format (type &key (stream *standard-input*)))

is-initialized checks if object has all required fields set, and recursively all of its
sub-objects have all of their required fields set. An error may be signaled if
an attempt is made to serialize a protobuf object that is not initialized.
Signals an error if object is not a protobuf message.

(defun is-initialized (object))

proto-equal checks if two protobuf messages are equal. By default, two messages are equal if
calling the getter on each field would retrieve the same value. This means that
a message with a field explicitly set to the default value is considered equal
to a message with that field not set.
If exact is true, consider the messages to be equal only if the same fields
have been explicitly set.
message-1 and message-2 must both be protobuf messages.

(defun proto-equal (message-1 message-2 &key (exact nil)))

clear resets the protobuf message to its initial state:

(defgeneric clear (object))  ; object is a protobuf message

has-field returns whether field has been explicitly set in object. field is the
symbol naming the field in the proto message.

(defun has-field (object field))

byte-vector: a vector of unsigned-bytes. In serialization functions, this is often referred to
as 'buffer'.

(deftype byte-vector)

make-byte-vector: constructor to make a byte vector. size is the size of the underlying vector.
adjustable is a boolean value determining whether the byte-vector can change
size.

(defun make-byte-vector (size &key adjustable))

serialize-to-bytes creates a byte-vector and serializes a protobuf message to that byte-vector. The
object is the protobuf message instance to serialize. Optionally use type to
specify the type of object to serialize.

(defun serialize-to-bytes (object &optional (type (type-of object))))

serialize-to-stream: serialize object, a protobuf message, to stream. Optionally use type to
specify the type of object to serialize.

(defun serialize-to-stream (object stream &optional (type (type-of object))))

deserialize-from-bytes: deserialize a protobuf message returning the newly created structure.

- type is the symbol naming the protobuf message to deserialize.
- buffer is the byte-vector containing the data to deserialize.
- start (inclusive) and end (exclusive) delimit the range of bytes to deserialize.

(defun deserialize-from-bytes (type buffer &optional (start 0) (end (length buffer))))

deserialize-from-stream: deserialize an object of type type by reading bytes from stream.
type is the symbol naming the protobuf message to deserialize.

(defun deserialize-from-stream (type stream))

Several functions are exported from the cl-protobufs.well-known-types package.
A list of all well known types can be found in the
official Protocol Buffers documentation.
unpack-any: takes an Any protobuf message any-message and turns it into the stored
protobuf message, as long as the qualified-name given in the type-url corresponds
to a loaded message type. The type-url must be of the form
base-url/qualified-name.

(defun unpack-any (any-message))

pack-any: creates an Any protobuf message given a protobuf message and a base-url.

(defun pack-any (message &key (base-url "type.googleapis.com")))

TODO: examples
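
To illustrate the base-url/qualified-name layout of a type-url, the following stand-alone sketch splits one at its last slash. The helper toy-split-type-url is hypothetical and is not part of cl-protobufs; how unpack-any actually parses the URL is not shown here.

```lisp
(defun toy-split-type-url (type-url)
  "Hypothetical helper: split \"base-url/qualified-name\" at the last slash.
Returns the base URL and the qualified name as two values."
  (let ((slash (position #\/ type-url :from-end t)))
    (if slash
        (values (subseq type-url 0 slash)
                (subseq type-url (1+ slash)))
        (error "Malformed type-url: ~S" type-url))))
```

For example, splitting "type.googleapis.com/abc.DateRange" gives the default base-url "type.googleapis.com" and the qualified name "abc.DateRange" (the DateRange message from the package abc example earlier in this document).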
The cl-protobufs.json package exports functions to convert between protobuf
objects and the canonical JSON encoding.

print-json: takes any protobuf message message and prints it as JSON. The parameters are:

- pretty-print-p: Indent the output by indent spaces and print newlines.
- stream: The Lisp stream to output to.
- camel-case-p: Print field names in camelCase. If nil, then print field names as they appear in the .proto file.
- numeric-enums-p: If true, print enum values by their number rather than their name.

(defun print-json (message &key (pretty-print-p t) (stream *standard-output*)
                                (camel-case-p t) numeric-enums-p))

parse-json: parses a JSON encoding and returns the parsed protobuf object. The parameters are:

- type: Either the Lisp type or the message-descriptor of the object to parse.
- stream: The stream to read from. By default, this is *standard-input*.
- ignore-unknown-fields-p: If true, silently ignore any unrecognized fields encountered when parsing. If nil, the parser will signal an error.

(defun parse-json (type &key (stream *standard-input*) ignore-unknown-fields-p))

This is a non-exhaustive list of ways in which cl-protobufs doesn't currently meet the Protocol Buffers spec.

- Groups are not supported within oneof fields.
- The [deprecated=true] field option is not supported.