In principle, ZebraStream creates one-way data channels, where information flows from the producer to the consumer. But that's not entirely true: because a channel combines several HTTP requests, different types of information are passed in different directions between producer, consumer, and relay. This article explains the fundamentals of data flow and its current limitations.

Diverse Flow of Information

Main Transfer (Data API)

A basic ZebraStream one-way channel is built from two connections: a PUT request to the relay, initiated by the producer, and a GET request to the relay, initiated by the consumer. In both requests, the initiating party sends information to the relay, including connection control and metadata. The obvious flow is the payload: the producer sends the data to the relay, and the relay forwards it to the consumer. Less obviously, the producer can also transmit metadata to the consumer in the form of custom HTTP headers. This metadata still follows the main direction.

Easily overlooked, there is also a reverse direction. First, the consumer may send headers and other information to the relay, mostly when it initiates the request. The relay can use this data itself or forward it to the producer as part of the streaming response, for example as status messages. Second, the relay detects whether the TCP connection terminates correctly, and also when the consumer slows down or pauses consumption (this is called back pressure). Such information may also be forwarded to the producer in various ways. While the full details can be quite confusing, the net result is simple: ZebraStream transfers payload data from the producer to the consumer, but metadata, status data, and control data can flow in both directions.
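The two requests that make up a channel can be sketched as follows. This is only an illustration under stated assumptions: the relay URL, the header names, and the helper functions are hypothetical, and the producer's upload is assumed to be an HTTP PUT. The requests are built but never sent.

```python
from urllib.request import Request

# Hypothetical relay base URL, used for illustration only.
RELAY = "https://relay.example.invalid"


def producer_request(path: str, payload: bytes, metadata: dict[str, str]) -> Request:
    """Build (but do not send) the producer's upload request."""
    headers = {"Content-Type": "application/octet-stream"}
    # Custom headers follow the main direction, toward the consumer.
    headers.update(metadata)
    # Assumption: the upload is an HTTP PUT.
    return Request(RELAY + path, data=payload, headers=headers, method="PUT")


def consumer_request(path: str, hints: dict[str, str]) -> Request:
    """Build (but do not send) the consumer's download request."""
    # Headers sent here go in the reverse direction, toward the relay.
    return Request(RELAY + path, headers=hints, method="GET")


upload = producer_request("/red", b"payload", {"X-Sample-Format": "csv"})
download = consumer_request("/red", {"Accept": "text/csv"})
```

Both requests target the same stream address; the relay pairs them to form the channel.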

Pre-Transfer (Connect API)

So far, we have only considered the actual data transfer using the Data API, but there is another flow of information that happens before the transfer: the notification channel using the Connect API. Here, producer and consumer both use a GET request for matching. Currently, the only relevant piece of information is the timing, and it can go in both directions: either the producer waits for the consumer, or vice versa. In the future, however, the Connect API might be extended to accept metadata, similar to request parameters for web servers. This would allow the producer to customize the data, and the consumer to decide how to process it, even before the actual transfer starts.

Implicit Path Information

Currently, there are only a few ways to parametrize the data, because there is no real communication before data is transferred. One simple way is to enumerate the options by providing different stream addresses. For instance, `/red` and `/green` would tell the producer whether to deliver red or green data, or tell the consumer which kind of data is coming. Encoding information in the path may seem ugly, but it can be simple and powerful at the same time. For instance, consider giving recursive read access to a stream folder. One can effectively add another level of access control by adding a shared secret to the path, as only a reader who knows the secret can access the stream.
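A minimal sketch of path-based parametrization is shown below. The function name, the folder name, and the secret are hypothetical; the only idea taken from the text above is that the option and an optional shared secret become path segments.

```python
from typing import Optional
from urllib.parse import quote


def stream_address(folder: str, option: str, secret: Optional[str] = None) -> str:
    """Compose a stream path from a folder, an optional secret, and an option."""
    # Split the folder into its segments and drop empty ones.
    segments = [part for part in folder.split("/") if part]
    if secret is not None:
        # Only readers who know the secret can guess this sub-path.
        segments.append(secret)
    segments.append(option)
    # Percent-encode each segment so it stays a single path element.
    return "/" + "/".join(quote(part, safe="") for part in segments)


plain = stream_address("colors", "red")
guarded = stream_address("colors", "green", secret="s3cr3t")
```

With these inputs, `plain` is `/colors/red` and `guarded` is `/colors/s3cr3t/green`.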

Protocol Limitations

The HTTP protocol and its version restrict what kind of additional data can be transferred, and at which connection state. The most limiting property of the protocol is that there is no simple way for the consumer to tell the producer how much data has arrived if a transfer is stopped or interrupted. This all-or-nothing principle means that unsuccessful transfers must either be retried in full, or the producer must be informed by other means at which point to resume (for instance, using a reverse ZebraStream address). This limitation is the price of using such a simple protocol; the payoff is reduced integration work and complexity. In many applications, the possibility of failed transfers can be neglected.
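The two recovery options can be sketched as follows. The helper names are hypothetical, and `send` stands for any callable that performs one upload attempt and reports whether it succeeded; how the confirmed byte count reaches the producer is left open, as in the text above.

```python
from typing import Callable


def transfer_with_retry(
    send: Callable[[bytes], bool], payload: bytes, max_attempts: int = 3
) -> int:
    """Resend the complete payload until one attempt succeeds.

    Returns the number of attempts used; raises RuntimeError if all fail.
    """
    for attempt in range(1, max_attempts + 1):
        # All-or-nothing: every attempt starts again from the first byte.
        if send(payload):
            return attempt
    raise RuntimeError(f"transfer failed after {max_attempts} attempts")


def remaining_payload(payload: bytes, confirmed_bytes: int) -> bytes:
    """Return the part still to be sent after an out-of-band confirmation."""
    if not 0 <= confirmed_bytes <= len(payload):
        raise ValueError("confirmed_bytes out of range")
    return payload[confirmed_bytes:]
```

The first function corresponds to a plain retry; the second only applies when the consumer reports its progress through a separate channel.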

Status Codes and Messages

A transfer can be either successful or unsuccessful. On the producer side, every successful transfer has the HTTP response code `200` and a final status message `[STATE:TRANSFER_SUCCESSFUL]`. Other status messages should be considered purely informational at the moment. The consumer just needs to wait for the download with status code `200` to finish. If the download does not finish, the transfer should be considered a failure. This crude mechanism will be expanded into a more formal and structured approach in the future.
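The success rules above could be checked roughly as in the following sketch. It assumes, hypothetically, that status messages arrive as separate lines in the producer's response body; the function names are illustrative and not part of any ZebraStream client.

```python
SUCCESS_MARKER = "[STATE:TRANSFER_SUCCESSFUL]"


def producer_succeeded(status_code: int, response_text: str) -> bool:
    """Check the producer-side rule: code 200 and a final success message."""
    # Assumption: one status message per line; blank lines are ignored.
    lines = [line.strip() for line in response_text.splitlines() if line.strip()]
    return status_code == 200 and bool(lines) and lines[-1] == SUCCESS_MARKER


def consumer_succeeded(status_code: int, download_completed: bool) -> bool:
    """Check the consumer-side rule: code 200 and a download that finished."""
    return status_code == 200 and download_completed
```

Any status message other than the final one is ignored here, in line with treating those messages as purely informational.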