ToxFloxl

The tox4j core library interface is written in Java using JNI to talk to a C++ wrapper around toxcore. Most of the HLAPI does not directly communicate with this bottom layer, but is a purely declarative behavioural specification written entirely in ToxFloxl, the Tox flow control and execution language, which translates to Scala, which is then compiled to Java bytecode.

Background

Scala is an impure functional programming language. While this is useful for pragmatic use cases where one wants to have (well-contained) mutation to write efficient real-world applications, it makes reasoning and code analysis hard.

There is only one well-tested purely functional language for the JVM (CAL), but it requires a large runtime library to work, and thus does not fit our use case, which is targetting mobile devices. Other impure languages exist, but they suffer from the same difficulties as Scala.

Therefore, we have decided to create our own DSL to solve the problem.

Syntax

As you will notice, some of the syntax was borrowed from revised OCaml. Since the HLAPI design is a GSoC task, these examples focus on the lower level toxcore library. A number of types, such as List, Char, String (which is the same as List Char) are provided by a standard library.

(* Name inherits from String. *)
module Name <: String = struct
  let max_length = 128
 
  (* Add constraints to the type. 'this' is always an implicit parameter of the type it was declared in *)
  constraint [
    (* short for \forall this \in Name: this.length <= max_length *)
    this.length <= max_length;
  ]
end
 
(* Friend *)
module Friend = struct
  (* Friends always have these fields. *)
  type = {
    id: Nat; (* Natural number: >= 0 *)
    name: Name;
    key: PublicKey;
  }
 
  (* If offline, their status is Offline. *)
  type offline = {
    status: Status = Offline;
    (* ... *)
  }
 
  (* If online, their status is not Offline, and they have a number of other fields. *)
  type online = {
    status: Status <> Offline; (* short for constraints += online.status <> Offline *)
    ip: InetAddress;
    port: Port; (* Nat constrained with > 0 and <= 65535. *)
  }
end
 
module Tox = struct
  type = {
    name : Name;
    friends: List Friend;
  }
 
  constraint [
    (* Uniqueness of IDs and keys. The <=> operator is the mathematical "if and only if". *)
    \forall a, b \in this.friends: a = b <=> a.id = b.id /\ a.key = b.key
  ]
 
  let set_name name =
    ensures [this.name = name]
 
  let add_friend_norequest key =
    requires [\not \exists friend_by_key key]
    ensures  [     \exists friend_by_key key]
 
  let friend_by_key key = { key } \in this.friends
 
end

Semantics

The language is designed in such a way that no instructions are given to perform an action. Functions are specified in terms of their pre- and post-conditions and the code to perform that is inferred automatically.

Types are specified as records, sum types or type aliases. Constraints on types can reference functions. Internally, everything is represented as types. Function specifications are turned into types and the whole program is then run through a type checker. Abstract interpretation is used to ensure validity of every constraint and function specification. If a function is over-specified, the compiler issues a warning. If it is under-specified and the function is used in a context where more specification is required, this causes an error.

Code generation is done after completely constraining the typed model. Type-preserving transformations ensure that the generated code observes the semantic constraints of the model. Several optimisation passes ensure the runtime efficiency of the code.

The compiler performs a large number of very small transformations, so that each transformation is by itself easy to reason about. Every transformation is a simple recursive fixpoint computation implemented in Coq. A semantic model of a subset of the Scala language is part of the compiler to ensure that the generated Scala code is semantically equivalent to the input specification.

This practical work builds on several theoretical results, such as:

Document extraction

This executable specification is at the same time documentation. Our compiler automatically generates a human-readable document containing the full specification with generated explanations and PDF hyperlinks. Since the documentation and the executable code are both generated from the same constraint-based specification, they will never diverge. Our documentation is always up-to-date.

C backend

The Scala backend is used for tox4j, but adding a C backend would be trivial. We can reuse earlier work by Xavier Leroy on compcert for the formal semantics of the C language. We would then automatically get verified correct machine code, as well. This can be used to gradually rewrite toxcore and at the same time create a full specification. The language will be extended to support expressing packet formats and communication states.

Code

Work on this has been underway for several weeks, and we will release the first version soon. We expect the theoretical work to be finished within the next fifteen years. After that, the implementation will be relatively straightforward.

Print/export