Stream Specification
The stream is the foundation that all other Stroom services are built on. It is used to store data, state and configuration in a simple, predictable manner.
A stream is a list of JSON documents with a very simple HTTP-based interface to POST and GET documents. It adheres to the following rules.
- Every document stored in a stream is immutable - once added, it cannot be changed.
- As documents are added, each is assigned a location in the stream: an immutable, continuously incrementing integer. That is, the first document is given location 0, the second document location 1, and so forth.
- A stream can be truncated. That is, all documents from a specific location onward can be removed. After a truncation, newly added documents must reuse the freed locations, ensuring that the stream still has a continuously incrementing set of locations with no gaps.
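The three rules above can be modelled in a few lines. The following is a minimal in-memory sketch, not a real Stroom implementation; the class name `InMemoryStream` and its methods are made up for illustration.

```python
import json


class InMemoryStream:
    """Toy model of a single stream (hypothetical, for illustration only)."""

    def __init__(self):
        self._docs = []  # index in this list == location in the stream

    def append(self, document):
        # Store a serialized copy so later changes to the caller's dict
        # cannot alter what is already in the stream (immutability).
        self._docs.append(json.dumps(document))
        return len(self._docs) - 1

    def get(self, location):
        if 0 <= location < len(self._docs):
            return json.loads(self._docs[location])
        return None  # stands in for the empty response

    def truncate(self, location):
        # Remove the document at `location` and everything after it.
        if location < 0:
            raise ValueError("location must be non-negative")
        del self._docs[location:]


stream = InMemoryStream()
first_locations = [stream.append({"n": n}) for n in range(3)]
stream.truncate(1)                     # drops locations 1 and 2
reused = stream.append({"n": "new"})   # location 1 is handed out again
```

Because locations are simply positions in the list, there can never be a gap: truncating shortens the list and the next append takes the first free position.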
A stream service can host multiple streams, each with a unique topic. The HTTP based interface for a stream service is as follows.
The following endpoint should return all streams, but may omit streams with a "." in the topic name. These are considered metadata streams for the primary topic. One example is the map service, which uses a .state stream to store the index up to which it has mapped data.
GET /stream/
[{"topic":"metrics","size":123833247,"count":861062},
{"topic":"log_data","size":30986110875,"count":1002901386}]
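Since an implementation may or may not include metadata streams in this listing, a client that only cares about primary topics might filter them out itself. A small sketch, assuming the response has already been decoded into a list of dicts; the `log_data.state` topic is a made-up example.

```python
def primary_streams(listing):
    """Drop entries whose topic contains a ".", i.e. metadata streams."""
    return [entry for entry in listing if "." not in entry["topic"]]


listing = [
    {"topic": "metrics", "size": 123833247, "count": 861062},
    {"topic": "log_data", "size": 30986110875, "count": 1002901386},
    # Hypothetical metadata stream, included only to show the filter.
    {"topic": "log_data.state", "size": 64, "count": 1},
]
topics = [entry["topic"] for entry in primary_streams(listing)]
```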
This call should respond without error to any topic - even ones that have not yet had any data added to them. In Stroom, a stream is created simply by posting to it; no prior configuration is needed. The count and size fields should not take any metadata streams into account - those can be queried separately if needed.
GET /stream/:topic
[{"topic":"log_data","size":30986110875,"count":1002901386}]
Posting to a topic implicitly creates the stream if it has not yet been used. You may post any valid JSON object to a stream. Invalid JSON should result in an error. JSON array data should be treated as a request to add multiple objects as a batch (see the next section). If you need to store an array, you must wrap it in a JSON object.
POST /stream/:topic
{"location":1002901386}
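Because a top-level array is read as a batch, a client that wants to store a list as one document has to wrap it first. A minimal sketch of such a helper; the function name and the default wrapper key `items` are arbitrary choices, not part of the specification.

```python
import json


def encode_single_document(value, wrapper_key="items"):
    """Serialize `value` as one document, wrapping lists in an object."""
    if isinstance(value, list):
        # A bare array would be treated as a batch, so wrap it.
        value = {wrapper_key: value}
    if not isinstance(value, dict):
        raise TypeError("a stream document must be a JSON object")
    return json.dumps(value)


body = encode_single_document([1, 2, 3])
unchanged = encode_single_document({"foo": "bar"})
```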
If you post a JSON array to a stream, it will be interpreted as a request to add multiple documents at once. Each element in the array must be a valid JSON object and will be added separately. The response will be a list of locations instead of a single location.
POST /stream/:topic
{"location_list":[1002901387,1002901388,1002901389,1002901390]}
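On the receiving side, the single and batch cases can be told apart by the type of the decoded body. The sketch below shows one way a stream service might do this. Rejecting top-level scalars such as `5` or `"text"` is an assumption based on the wording "any valid JSON object", and the function name is hypothetical.

```python
import json


def classify_post_body(raw):
    """Return ("single", doc) or ("batch", docs); raise ValueError otherwise."""
    try:
        value = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid JSON: {exc}") from exc
    if isinstance(value, dict):
        return "single", value
    if isinstance(value, list):
        # Each array element becomes its own document, so each must be an object.
        if not all(isinstance(item, dict) for item in value):
            raise ValueError("every batch element must be a JSON object")
        return "batch", value
    raise ValueError("body must be a JSON object or an array of objects")


single = classify_post_body('{"foo":"bar"}')
batch = classify_post_body('[{"a":1},{"a":2}]')
```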
To retrieve a document, send a GET request specifying the topic and location. If the document exists, it will be returned; if not, an empty response will be given.
GET /stream/:topic/:location
{"foo":"bar"}
Using _ as the location returns the latest document, if any documents have been added to the stream. Otherwise, an empty response will be given.
GET /stream/:topic/_
{"foo":"bar"}
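Both the single-document and the latest-document requests can come back empty, so a client should check for that before decoding. A minimal sketch, assuming "empty response" means a body with no content (the specification does not say more than that); `parse_document` is a hypothetical helper name.

```python
import json


def parse_document(body):
    """Return the decoded document, or None for an empty response body."""
    # Assumption: an empty response is a zero-length (or whitespace-only) body.
    if not body.strip():
        return None
    return json.loads(body)


found = parse_document(b'{"foo":"bar"}')
missing = parse_document(b"")
```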
Multiple documents can only be requested at once as a continuous range. To do this, provide the start and end locations. Both will be included in the result. If the range extends beyond the end of the documents currently stored in the stream, you will simply get the available data. Be aware that Stroom implementations are free to place limits on how many documents (or how much data) they will return for one request, so you may also get fewer documents than requested even when more documents are available. However, a request must always return at least one document if documents are available in the given range.
GET /stream/:topic/:start_location-:end_location
[{"foo":"bar"},{"foo":"baz"}]
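Since a range request may return fewer documents than asked for, a client that needs the whole range has to keep asking from where the previous reply stopped. The sketch below shows such a loop. The `fetch` callable stands in for the HTTP request and is an assumption, as is the expectation that each reply starts at the requested start location and is contiguous.

```python
def iter_range(fetch, start, end):
    """Yield documents in [start, end], re-requesting after partial replies."""
    position = start
    while position <= end:
        batch = fetch(position, end)
        if not batch:
            # Nothing available from `position` on: end of the stream.
            break
        yield from batch
        position += len(batch)


# Stand-in for a service that returns at most 3 documents per request.
documents = [{"n": n} for n in range(7)]
calls = []


def fake_fetch(first, last):
    calls.append((first, last))
    return documents[first:last + 1][:3]


inside = list(iter_range(fake_fetch, 1, 5))
calls_for_inside = list(calls)
past_end = list(iter_range(fake_fetch, 5, 20))
```

The guarantee that a request returns at least one document whenever documents are available is what lets the loop treat an empty reply as the end of the stream.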
You may request all documents from a given location onward by omitting the end location. Be aware that this call has the same limits as the regular range request.
GET /stream/:topic/:start_location-
[{"foo":"bar"},{"foo":"baz"}]
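The closed and open-ended forms differ only in whether anything follows the hyphen. A small sketch of a path builder covering both; the function name is hypothetical, and percent-encoding the topic is a precaution rather than something the specification asks for.

```python
from urllib.parse import quote


def range_path(topic, start, end=None):
    """Build the request path for a closed or open-ended range."""
    if start < 0 or (end is not None and end < start):
        raise ValueError("invalid range")
    suffix = "" if end is None else str(end)
    return f"/stream/{quote(topic, safe='')}/{start}-{suffix}"


closed = range_path("log_data", 10, 20)
open_ended = range_path("log_data", 10)
```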
Documents can never be modified in a stream, but the stream can be truncated and new documents can fill the old locations. Be aware of the complications this can cause for simple aggregate jobs running over streams! To truncate a stream, send a DELETE request with a topic and location. The specified location will be included in the truncated part of the stream.
DELETE /stream/:topic/:start_location
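To illustrate the complication for aggregate jobs, here is a sketch of a job that sums a field and remembers how far it has read. The `value` field, the list standing in for a stream, and the checkpoint format are all made up for the example.

```python
def running_total(documents, state=None):
    """Sum the hypothetical "value" field, resuming from a saved state.

    `state` is (next_location, total) as returned by a previous call.
    """
    next_location, total = state if state is not None else (0, 0)
    for document in documents[next_location:]:
        total += document["value"]
    return len(documents), total


stream = [{"value": 1}, {"value": 2}, {"value": 3}]
checkpoint = running_total(stream)

# A truncation at location 1 removes locations 1 and 2 ...
del stream[1:]
# ... and two new documents then reuse those locations.
stream += [{"value": 10}, {"value": 20}]

stale = running_total(stream, checkpoint)  # resumes at location 3, sees nothing new
fresh = running_total(stream)              # recomputed from location 0
```

The resumed job still reports the old total because the locations it already processed now hold different documents. A job that keeps a checkpoint therefore needs some way to learn about truncations, or has to recompute from the truncation point.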