Communicating between C++ and JavaScript
(with Echo and Cap'n Proto)

Recently I had to refactor some code that was starting to become a bit unmanageable. It was fairly boilerplate, and most of it just took data structures in C++ and serialised them to JSON. The state of the module in question can be attributed to three things: it was created quickly to support communication between our system (C++) and our JavaScript web interface, the JSON format was never designed in the first place, and our system was changing rapidly.

Our system has many network nodes that communicate with one another, with communication either being exclusively with a server node or set up via it.

When our system started to have a more concrete direction and became more stable, it was my job to review the technical debt that, unfortunately, I had created. The task was to come up with a solution so that we could move forward more rapidly without worrying about growing a monster. I now had time to consider what I needed to accomplish. Thankfully it was isolated to a single module that interacted with the system via our internal C++ interface.

We are using Cap'n Proto (cerealization protocol - infinitely faster) to define our wire format, so our C++ nodes were already communicating in a structured way. I had learned that versions of Cap'n Proto newer than the one on our distro included support for serialising C++ objects to JSON, pretty much for free: just pass one of the objects into the JSON encoder or decoder and you're kraken (pun intended).

Cap'n what?

Cap'n Proto... It is written right there ^^...

Cap'n Proto is like Google protobufs. In fact, the developer of Cap'n Proto, Kenton Varda, was the primary developer of Protocol Buffers version 2. In these types of serialisation libraries, you define structured data (like a C++ struct) and then generate code to access or manipulate the data in a buffer. The buffer is then sent over the network and interpreted at the other end as the appropriate structure.
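
To make the "generated accessors over a buffer" idea concrete, here is a hand-written TypeScript sketch of roughly what such generated code does. The Point layout, its field offsets and the PointView class are invented for illustration; this is not Cap'n Proto's actual encoding or API.

```typescript
// Hypothetical fixed layout, for illustration only (not Cap'n Proto's real encoding):
//   bytes 0..3  x      (int32,  little-endian)
//   bytes 4..7  y      (int32,  little-endian)
//   bytes 8..9  flags  (uint16, little-endian)
const POINT_SIZE = 10;

// Stand-in for a generated accessor class: it owns no data of its own,
// every getter and setter reads or writes the underlying buffer directly.
class PointView {
  private readonly view: DataView;

  constructor(buffer: ArrayBuffer, byteOffset = 0) {
    // DataView throws a RangeError if the buffer is too small for this range.
    this.view = new DataView(buffer, byteOffset, POINT_SIZE);
  }

  get x(): number { return this.view.getInt32(0, true); }
  set x(value: number) { this.view.setInt32(0, value, true); }

  get y(): number { return this.view.getInt32(4, true); }
  set y(value: number) { this.view.setInt32(4, value, true); }

  get flags(): number { return this.view.getUint16(8, true); }
  set flags(value: number) { this.view.setUint16(8, value, true); }
}

// "Sending" side: fill the buffer through the accessors.
const wire = new ArrayBuffer(POINT_SIZE);
const outgoing = new PointView(wire);
outgoing.x = -7;
outgoing.y = 1024;
outgoing.flags = 0x0003;

// "Receiving" side: wrap the same bytes and read the fields back.
const incoming = new PointView(wire);
console.log(incoming.x, incoming.y, incoming.flags);
```

The point of the pattern is that both ends agree on the layout through a shared definition, so neither side hand-writes offset arithmetic.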

It basically adds a layer of data checking and serialisation that you might otherwise forget to do, handles versioning, lets you avoid sending blocks of data that you want to be null, and includes some compression. Other advanced features of some of these libraries include things like RPC, which we are not using for a number of reasons.

The Cap'n Proto JSON serialisation support works well out of the box if you don't have structures that include AnyPointers. Unfortunately we do use AnyPointers in our data transport. That was going to be fine, it would just mean I'd have to write a JSON serialisation handler for those types, right?

I set off on this path for a little while and wrote a templated class that would take data from the network from any source (JSON or binary) and implicitly convert it to the required object or JSON depending on what you needed. This seemed like a good direction at first: provided you knew what your data contained, the conversion would be automatic. I even wrote some registration wrappers that handled the conversion using this serialisation helper class, so you could just pass the type you wanted. Unfortunately I didn't consider the full scope of what we needed and ended up with a class that would do one-way conversions easily. When you needed to send in the other format, the transport system didn't know how to convert it, since the types weren't available at the level that deals with the data in an abstracted way.

So, I continued to ponder how I would get that conversion function at the lower level without significantly overhauling the networking system... Do we even need two serialisation formats?

Years ago....

Years ago I wrote Echo networking support in Java so that a C++ desktop client could communicate with an Android Input Method, letting you use your desktop keyboard to enter text on your phone instead of the on-screen keyboard. It was fairly trivial since Echo's network format is a simple header that includes information about the data to come. It was already abstracted from the transport, so I just had to implement Echo's Connection and DataPacket in Java and all worked well.

I wondered whether I could deal with binary data directly in JavaScript. Years ago I wouldn't have even considered it, but I decided to look into it again. Maybe I could implement the equivalent of the Java solution in JavaScript...

I had only considered this because we were using Cap'n Proto, which has contributor-provided support for other languages, JavaScript included. The JavaScript libraries were, unfortunately, no longer maintained, but I managed to find a TypeScript implementation.

TypeScript

I had read about TypeScript once before, and it sounded like it had the potential to pull me towards web development. "Typed JavaScript that compiles to JavaScript? What? Sure, I'll look at it some day." Realistically, the times I needed to write JavaScript were times I needed to write JavaScript - to get a job done, not because I was having a bunch of fun. Because of this I didn't look into TypeScript until now.

As it turned out, I actually like TypeScript. It lets me write (or technically generate) JavaScript and be confident my code is correct, rather than harbouring a typo that would otherwise only be detected in an edge-case code branch and bring a critical program down (or worse, make it seem unresponsive to the user). I'm still learning about some of the issues with using non-TypeScript projects, but for the most part it has been great to work with.

But what about the binary?

Ah yes, about 01110100011010000110000101110100... JavaScript has ArrayBuffers, which can hold raw data. You can set up JavaScript WebSockets to deliver received binary data as ArrayBuffers rather than Blobs. You can then use a DataView to access the binary data at specific offsets and widths. For example:

// Shamelessly taken from the DataView MDN docs
var buffer = new ArrayBuffer(2);
new DataView(buffer).setInt16(0, 256); // write 256 as a 16-bit integer at byte offset 0
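
A small follow-on sketch of the reading side. One detail worth knowing when parsing by hand: DataView reads and writes big-endian unless you pass true for the littleEndian argument, while Cap'n Proto's encoding is little-endian. The handle function mentioned in the comment is a hypothetical name.

```typescript
// In a browser, binary frames arrive as ArrayBuffers once the socket is told to:
//   socket.binaryType = "arraybuffer";
//   socket.onmessage = (event) => handle(event.data as ArrayBuffer); // handle() is hypothetical

const buffer = new ArrayBuffer(2);
const view = new DataView(buffer);

// No littleEndian argument, so this is written big-endian: bytes 0x01 0x00.
view.setInt16(0, 256);

const bytes = Array.from(new Uint8Array(buffer)); // [1, 0]
const bigEndian = view.getInt16(0);               // 256: same byte order as the write
const littleEndian = view.getInt16(0, true);      // 1: same bytes, opposite byte order

console.log(bytes, bigEndian, littleEndian);
```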

And, you guessed it, Cap'n Proto TypeScript consumes ArrayBuffers (or similar) to read the structured data.

And now

I've implemented a minimal Echo Connection and DataPacket in TypeScript that utilises WebSockets as the transport. To add WebSocket support to Echo, I implemented a WebSocketNetworkSystem which can accept WebSocket connections. This is already available in Echo. The TypeScript code will communicate directly with the C++ code which allows our structured data to be written and read correctly at both ends. The TypeScript implementation will be available sometime in the near future once finalised and approved for release.
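
As a rough illustration of the "simple header followed by data" pattern, here is a minimal TypeScript sketch of framing and unframing a packet. The header fields (typeId, payloadLength), their sizes and byte order, and the function names are all assumptions made for this example; they are not Echo's actual wire format or API.

```typescript
// Hypothetical header: uint32 typeId + uint32 payloadLength, both little-endian.
const HEADER_SIZE = 8;

interface Packet {
  typeId: number;
  payload: Uint8Array;
}

// Build one frame: header first, then the raw payload bytes.
function encodePacket(typeId: number, payload: Uint8Array): ArrayBuffer {
  const buffer = new ArrayBuffer(HEADER_SIZE + payload.byteLength);
  const view = new DataView(buffer);
  view.setUint32(0, typeId, true);
  view.setUint32(4, payload.byteLength, true);
  new Uint8Array(buffer, HEADER_SIZE).set(payload);
  return buffer;
}

// Parse one frame. Returns null if the buffer is shorter than the header
// or shorter than the length the header promises.
function decodePacket(buffer: ArrayBuffer): Packet | null {
  if (buffer.byteLength < HEADER_SIZE) return null;
  const view = new DataView(buffer);
  const typeId = view.getUint32(0, true);
  const payloadLength = view.getUint32(4, true);
  if (buffer.byteLength < HEADER_SIZE + payloadLength) return null;
  // A view onto the same memory: no copy is made.
  return { typeId, payload: new Uint8Array(buffer, HEADER_SIZE, payloadLength) };
}

const sent = encodePacket(42, new Uint8Array([1, 2, 3]));
const received = decodePacket(sent);
const truncated = decodePacket(sent.slice(0, 9)); // header intact, payload cut short

console.log(received, truncated);
```

In a setup like the one described here, the payload bytes would then be handed to the Cap'n Proto TypeScript reader rather than interpreted by hand.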

Rather than reducing the technical debt, I eliminated it entirely. Now our "JavaScript" clients can communicate as though they were regular C++ nodes in our system!

Written by 0xseantasker on Apr 3 2019, 10:00 PM.