Skip to content

Specification

Georg Schuppe edited this page May 30, 2016 · 7 revisions

Preliminaries

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

Buggy Specification Draft. Last updated: 2016/05/29

Buggy Structure

Buggy is a data flow transformation system based on a CSP [1,2] semantic. The basic input type is a graph that:

  • is directed,
  • has a hierarchical structure (i.e. it is a compound graph),
  • has input and output ports for every node.

This graph has normal edges (simply called edges) that carry the data. For those edges the following must hold:

  • They must connect an output port to an input port. The direction must point from output ports to input ports
  • They must not induce cycles in any way in the graph.
  • Edges must not cross hierarchy borders. The only way to pass data into another hierarchy level is through ports.

Furthermore there are special edges that go from nodes to other nodes (not from ports). They will be explained later in this document.

Each node can have input and output ports. An input port receives data whereas an output port sends data. In any node each port name must be unique. That also means that no input port can share the same name with an output port.

Data format

Information in the Buggy system is always encoded as JSON documents.

Graph format

The graph format is based on graphlib.

Nodes

Nodes are classified as atomic nodes or compound nodes. Both have an id as well as input and output ports. Nodes represent the implementation of some concept with the following distinction:

  • Atomic: Those nodes are the basic building blocks of every program.
  • Compound: These are nodes that have a graph as their implementation. This graph must follow the same conditions as discussed above. But all input nodes of the compound node are output nodes in the sub graph and vice versa.

The hierarchical structure is induced by the compound nodes and their implementation. The distinction bewteen an atomic and compound node is made by the atomic flag. A compound node is required to have an implementation field. Whereas an atomic node should not have an implementation.

{
  "id": "atomicNode",
  "inputPorts": {},
  "outputPorts": {},
  "atomic": true
}
{
  "id": "compoundNode",
  "inputPorts": {},
  "outputPorts": {},
  "atomic": false,
  "implementation": {
    "nodes": [],
    "edges": []
  }
}

It is possible to define a SemVer version for each node as version.

Implementations

The implementation of a compound node is stored in a short notation for graphs. It has exactly two fields: A nodes array and an array of edges.

Each node in the nodes array must either be:

  • a valid node as discussed above,
  • a reference to a node. References are introduced here.

There is no restriction on how often a node with a specific id is used inside this sub graph, but the names must be unique.

The array of edges contains all edges inside this compound node. It connects the ports of the compound node to its inner nodes as well as nodes inside the compound node. Each edge has two fields: from and to. Both identify ports inside the compound node. As the ports themselve may not be unique. An identifier is:

  • A port name of the node, e.g. "compoundPort"
  • A string of the node name and the port name separated by a colon ":", e.g. "atomicNode:outPort"

A complete implementation could look like this:

{
  "nodes": [
    { "meta": "compoundNode", "name": "node1"},
    {
      "id": "atomicNode",
      "inputPorts": {},
      "outputPorts": {},
      "atomic": true,
      "name": "node2"
    }
  ],
  "edges": [
    { "from": "compoundPort1", "to": "node1:inPort" },
    { "from": "node1:outPort", "to": "node2:inPort" },
    { "from": "node2:outPort", "to": "compoundPort2" },
  ]
}

Where compoundPort1 is the input port of the parent compound node and compoundPort2 is the out port of the parent compound node.

Special Nodes

Aside from normal atomics there are a few special atomic nodes that have a special semantic or are treated differently by Buggy.

Node References

Every implementation can contain references to nodes. These are simply short aliases to those nodes. Each reference:

  • Must have a meta field that identifies the referenced node.
  • Can have a name that will be used as a unique identifier in the graph.
  • Can have a version specifying more precisely which node is referenced. If no version is given, the latest version is taken automatically.

Those references will be replaced by their corresponding in the toolchain. A reference might look like this:

{ "meta": "atomicNode", "version": "1.0.0", "name": "thisAtomicNode"}

Lambda Nodes

Lambda nodes pass functions as data. Those functions are encoded as sub graphs inside of the lambda node. A lambda node is, in this sense, similar to a compound node. But it is in fact an atomic node and always has exactly one child and this child is never connected to the lambda function itself. It is contained inside of the lambda node and thus its parent node is the lambda node.

The lambda function, the contained graph inside the lambda node, can itself contain compound or atomic nodes. It is a child of the lambda node.

Buggy Semantics

A Buggy graph can be understood as a connected sequetial processes, similar to CSP [1,2]. The nodes describe operations and the edges denote a data transfer between nodes.

The structure can also be understood as a control flow graph in which every incoming edge describes the preconditions for every operation (e.g. node). And every outgoing edge is a precondition for the succesors.

Atomics

Each atomic node must describe a sequential program that can read and send data via channels. This also means, that there must be a clear causality between receiving and sending data. An atomic must send data on all output ports after it received data on all input ports. It is not possible to send data after reading only one of many input ports. The operation that the node performs can ignore input data, but the atomic node must read all input from its ports. It also must send on all output ports before receiving data again. There are two special cases:

  • Sources: If a node has no input ports, it will send data indefinitely.
  • Sinks: If a node has no output ports, it can receive data indefinitely

Special Forms

Not all necessary operations can be described with the above definition of atomics.

References

[1] Hoare, Charles Antony Richard. Communicating sequential processes. Springer New York, 1978.
[2] https://en.wikipedia.org/wiki/Communicating_sequential_processes