Tutorial: CXX blobstore client
This example walks through a Rust application that calls into a C++ client of a blobstore service. In fact we'll see calls going in both directions: Rust to C++ as well as C++ to Rust. For your own use case it may be that you need just one of these directions.
All of the code involved in the example is shown on this page, but it's also
provided in runnable form in the demo directory of
https://github.com/dtolnay/cxx. To try it out directly, run cargo run from
that directory.
This tutorial assumes you've read briefly about shared structs, opaque types, and functions in the Core concepts page.
Creating the project
We'll use Cargo, which is the build system commonly used by open source Rust projects. (CXX works with other build systems too; refer to chapter 5.)
Create a blank Cargo project: mkdir cxx-demo; cd cxx-demo; cargo init.
Edit the Cargo.toml to add a dependency on the cxx crate:
We'll revisit this Cargo.toml later when we get to compiling some C++ code.
Defining the language boundary
CXX relies on a description of the function signatures that will be exposed from
each language to the other. You provide this description using extern blocks
in a Rust module annotated with the #[cxx::bridge] attribute macro.
We'll open with just the following at the top of src/main.rs and walk through each item in detail.
The contents of this module will be everything that needs to be agreed upon by both sides of the FFI boundary.
Calling a C++ function from Rust
Let's obtain an instance of the C++ blobstore client, a class BlobstoreClient
defined in C++.
We'll treat BlobstoreClient
as an opaque type in CXX's classification so
that Rust does not need to assume anything about its implementation, not even
its size or alignment. In general, a C++ type might have a move-constructor
which is incompatible with Rust's move semantics, or may hold internal
references which cannot be modeled by Rust's borrowing system. Though there are
alternatives, the easiest way to not care about any such thing on an FFI
boundary is to require no knowledge about a type by treating it as opaque.
Opaque types may only be manipulated behind an indirection such as a reference
&, a Rust Box, or a UniquePtr (Rust binding of std::unique_ptr). We'll add a
function through which C++ can return a std::unique_ptr<BlobstoreClient> to
Rust.
The nature of unsafe
extern blocks is clarified in more detail in the
extern "C++" chapter. In brief: the programmer is not
promising that the signatures they have typed in are accurate; that would be
unreasonable. CXX performs static assertions that the signatures exactly match
what is declared in C++. Rather, the programmer is only on the hook for things
that C++'s semantics are not precise enough to capture, i.e. things that would
only be represented at most by comments in the C++ code. In this case, it's
whether new_blobstore_client is safe or unsafe to call. If that function said
something like "must be called at most once or we'll stomp yer memery", Rust
would instead want to expose it as unsafe fn new_blobstore_client, this time
inside a safe extern "C++" block because the programmer is no longer on the
hook for any safety claim about the signature.
If you build this file right now with cargo build, it won't build because we
haven't written a C++ implementation of new_blobstore_client nor instructed
Cargo about how to link it into the resulting binary. You'll see an undefined
symbol error from the linker.
Adding in the C++ code
In CXX's integration with Cargo, all #include paths begin with a crate name by
default (when not explicitly selected otherwise by a crate; see
CFG.include_prefix in chapter 5). That's why we see
include!("cxx-demo/include/blobstore.h") above; we'll be putting the C++ header
at relative path include/blobstore.h within the Rust crate. If your crate is
named something other than cxx-demo according to the name field in Cargo.toml,
you will need to use that name everywhere in place of cxx-demo throughout this
tutorial.
Using std::make_unique would work too, as long as you pass std("c++14") to
the C++ compiler as described later on.
The placement in include/ and src/ is not significant; you can place C++ code anywhere else in the crate as long as you use the right paths throughout the tutorial.
Be aware that CXX does not look at any of these files. You're free to put arbitrary C++ code in here, #include your own libraries, etc. All we do is emit static assertions against what you provide in the headers.
Compiling the C++ code with Cargo
Cargo has a build scripts feature suitable for compiling non-Rust code.
We need to introduce a new build-time dependency on CXX's C++ code generator in Cargo.toml:
Then add a build.rs build script adjacent to Cargo.toml to run the cxx-build code generator and C++ compiler. The relevant arguments are the path to the Rust source file containing the cxx::bridge language boundary definition, and the paths to any additional C++ source files to be compiled during the Rust crate's build.
This build.rs would also be where you set up C++ compiler flags, for example if
you'd like to have access to std::make_unique from C++14. See the page on
Cargo-based builds for more details about CXX's Cargo integration.
The project should now build and run successfully, though not do anything useful yet.
Calling a Rust function from C++
Our C++ blobstore supports a put operation for a discontiguous buffer upload.
For example we might be uploading snapshots of a circular buffer which would
tend to consist of 2 pieces, or fragments of a file spread across memory for
some other reason (like a rope data structure).
We'll express this by handing off an iterator over contiguous borrowed chunks.
This loosely resembles the API of the widely used bytes crate's Buf trait.
During a put, we'll make C++ call back into Rust to obtain contiguous chunks
of the upload (all with no copying or allocation on the language boundary). In
reality the C++ client might contain some sophisticated batching of chunks
and/or parallel uploading that all of this ties into.
Any signature having a self parameter (the Rust name for C++'s this) is
considered a method / non-static member function. If there is only one type in
the surrounding extern block, it'll be a method of that type. If there is more
than one type, you can disambiguate which one a method belongs to by writing
self: &BlobstoreClient in the argument list.
As usual, now we need to provide Rust definitions of everything declared by the
extern "Rust" block and a C++ definition of the new signature declared by the
extern "C++" block.
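As a minimal sketch of the Rust side, the definitions could look like the following. It assumes that MultiBuf is simply a list of owned chunks plus a cursor, and that an empty slice is the signal that the upload is exhausted. The constructor new and the function drain_len are hypothetical helpers added for illustration only; drain_len plays the role that the C++ put implementation would play in the real demo.

```rust
/// A cursor over contiguous byte chunks. The layout (a Vec of chunks plus a
/// position) is an assumption of this sketch.
pub struct MultiBuf {
    chunks: Vec<Vec<u8>>,
    pos: usize,
}

impl MultiBuf {
    /// Hypothetical constructor, added so the sketch is easy to exercise.
    pub fn new(chunks: Vec<Vec<u8>>) -> Self {
        MultiBuf { chunks, pos: 0 }
    }
}

/// Returns the next contiguous chunk, borrowed from `buf` without copying.
/// An empty slice means there is nothing left to upload.
pub fn next_chunk(buf: &mut MultiBuf) -> &[u8] {
    let next = buf.chunks.get(buf.pos);
    buf.pos += 1;
    next.map_or(&[], Vec::as_slice)
}

/// Hypothetical stand-in for the C++ side of `put`: keeps asking for chunks
/// until it receives an empty one and returns the total number of bytes seen.
pub fn drain_len(buf: &mut MultiBuf) -> usize {
    let mut total = 0;
    loop {
        let chunk = next_chunk(buf);
        if chunk.is_empty() {
            return total;
        }
        total += chunk.len();
    }
}
```

Note that with this convention an empty chunk in the middle of the list would end the upload early, so a caller following this sketch should avoid pushing empty chunks.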
In blobstore.cc we're able to call the Rust next_chunk function, exposed to C++
by a header main.rs.h generated by the CXX code generator. In CXX's Cargo
integration this generated header has a path containing the crate name, the
relative path of the Rust source file within the crate, and a .rs.h extension.
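To make the naming rule concrete, here is a small sketch that spells it out as string formatting. The function generated_header_path is a hypothetical helper and not part of CXX; it assumes the default include prefix, which is the crate name.

```rust
/// Hypothetical helper (not a CXX API): builds the include path of the
/// generated header from the crate name and the path of the Rust source file
/// relative to the crate root, assuming the default include prefix.
pub fn generated_header_path(crate_name: &str, rust_source: &str) -> String {
    // "src/main.rs" already ends in ".rs", so appending ".h" yields ".rs.h".
    format!("{}/{}.h", crate_name, rust_source)
}
```

For the crate in this tutorial, generated_header_path("cxx-demo", "src/main.rs") gives cxx-demo/src/main.rs.h, which is the path blobstore.cc would include.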
This is now ready to use. :)
Interlude: What gets generated?
For the curious, it's easy to look behind the scenes at what CXX has done to make these function calls work. You shouldn't need to do this during normal usage of CXX, but for the purpose of this tutorial it can be educative.
CXX comprises two code generators: a Rust one (which is the cxx::bridge attribute procedural macro) and a C++ one.
Rust generated code
It's easiest to view the output of the procedural macro by installing
cargo-expand. Then run cargo expand ::ffi to macro-expand the mod ffi module.
You'll see some deeply unpleasant code involving #[repr(C)], #[link_name], and
#[export_name].
C++ generated code
For debugging convenience, cxx_build
links all generated C++ code into Cargo's
target directory under target/cxxbridge/.
In those files you'll see declarations or templates of any CXX Rust types
present in your language boundary (like rust::Slice<T> for &[T]) and extern "C"
signatures corresponding to your extern functions.
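To give a rough feel for the idea, the sketch below hand-writes a C-ABI shim around next_chunk. This is not the code CXX generates; it is a simplified illustration assuming a slice can be passed as a pointer plus a length. RawSlice, next_chunk_shim, and as_bytes are hypothetical names, and MultiBuf is repeated here only so the example stands alone.

```rust
use std::slice;

/// Same assumed layout as earlier in the tutorial: chunks plus a cursor.
pub struct MultiBuf {
    chunks: Vec<Vec<u8>>,
    pos: usize,
}

impl MultiBuf {
    pub fn new(chunks: Vec<Vec<u8>>) -> Self {
        MultiBuf { chunks, pos: 0 }
    }
}

pub fn next_chunk(buf: &mut MultiBuf) -> &[u8] {
    let next = buf.chunks.get(buf.pos);
    buf.pos += 1;
    next.map_or(&[], Vec::as_slice)
}

/// Hypothetical C-compatible view of a borrowed byte slice.
#[repr(C)]
pub struct RawSlice {
    pub ptr: *const u8,
    pub len: usize,
}

/// Hypothetical shim with a C ABI signature wrapping the safe function.
///
/// # Safety
/// `buf` must be a valid, exclusive pointer to a live `MultiBuf`.
// MultiBuf is only ever handled by pointer on the other side of the boundary.
#[allow(improper_ctypes_definitions)]
pub unsafe extern "C" fn next_chunk_shim(buf: *mut MultiBuf) -> RawSlice {
    let chunk = next_chunk(unsafe { &mut *buf });
    RawSlice { ptr: chunk.as_ptr(), len: chunk.len() }
}

/// Hypothetical caller-side helper that rebuilds a slice from the raw view.
///
/// # Safety
/// The `MultiBuf` that produced `raw` must still be alive and unmodified.
pub unsafe fn as_bytes<'a>(raw: &RawSlice) -> &'a [u8] {
    unsafe { slice::from_raw_parts(raw.ptr, raw.len) }
}
```

The point of CXX is that you never write or audit this kind of glue yourself; both sides of the shim are generated from the bridge module and checked against each other.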
If it fits your workflow better, the CXX C++ code generator is also available as a standalone executable which outputs generated code to stdout.
Shared data structures
So far the calls in both directions above only used opaque types, not shared structs.
Shared structs are data structures whose complete definition is visible to both languages, making it possible to pass them by value across the language boundary. Shared structs translate to a C++ aggregate-initialization compatible struct exactly matching the layout of the Rust one.
As the last step of this demo, we'll use a shared struct BlobMetadata to pass
metadata about blobs between our Rust application and C++ blobstore client.
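The shape of such an API can be sketched in plain Rust. The example below assumes BlobMetadata carries a size and a list of tags, and models the C++ client with a hypothetical in-memory FakeBlobstore; the method names put, tag, and metadata and the sequential blob ids are assumptions of this sketch, not a description of the real client. The struct is returned by value, which is exactly what a shared struct allows across the language boundary.

```rust
use std::collections::{BTreeSet, HashMap};

/// Plain-data struct whose full definition both sides could see.
/// The field names are assumptions of this sketch.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct BlobMetadata {
    pub size: usize,
    pub tags: Vec<String>,
}

/// Hypothetical in-memory stand-in for the C++ blobstore client.
#[derive(Default)]
pub struct FakeBlobstore {
    blobs: HashMap<u64, (Vec<u8>, BTreeSet<String>)>,
    next_id: u64,
}

impl FakeBlobstore {
    /// Stores the concatenation of the chunks and returns a fresh blob id.
    pub fn put(&mut self, chunks: &[&[u8]]) -> u64 {
        let id = self.next_id;
        self.next_id += 1;
        self.blobs.insert(id, (chunks.concat(), BTreeSet::new()));
        id
    }

    /// Attaches a tag to an existing blob; unknown ids are ignored.
    pub fn tag(&mut self, blobid: u64, tag: &str) {
        if let Some((_, tags)) = self.blobs.get_mut(&blobid) {
            tags.insert(tag.to_owned());
        }
    }

    /// Returns the metadata by value; unknown ids yield empty metadata.
    pub fn metadata(&self, blobid: u64) -> BlobMetadata {
        match self.blobs.get(&blobid) {
            Some((data, tags)) => BlobMetadata {
                size: data.len(),
                tags: tags.iter().cloned().collect(),
            },
            None => BlobMetadata::default(),
        }
    }
}
```

Because every field of BlobMetadata is visible, the caller can read size and tags directly instead of going through accessor functions, in contrast to the opaque BlobstoreClient.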
You've now seen all the code involved in the tutorial. It's available all
together in runnable form in the demo directory of
https://github.com/dtolnay/cxx. You can run it directly without stepping
through the steps above by running cargo run from that directory.
Takeaways
The key contribution of CXX is that it gives you Rust–C++ interop in which the Rust side of the code you write looks like normal Rust, and the C++ side looks like normal C++.
You've seen in this tutorial that none of the code involved feels like C or like the usual perilous "FFI glue" prone to leaks or memory safety flaws.
An expressive system of opaque types, shared types, and key standard library type bindings enables API design on the language boundary that captures the proper ownership and borrowing contracts of the interface.
CXX plays to the strengths of the Rust type system and C++ type system and the programmer's intuitions. An individual working on the C++ side without a Rust background, or the Rust side without a C++ background, will be able to apply all their usual intuitions and best practices about development in their language to maintain a correct FFI.