This chapter describes a set of guidelines that must be applied when writing and publishing event-driven APIs. These APIs are usually published via topics or queues.
APIs are contracts between service providers and service consumers that cannot be broken via unilateral decisions. For this reason APIs may only be changed if backward compatibility is guaranteed. If this cannot be guaranteed, a new API major version must be provided and the old one has to be supported in parallel. For deprecation, follow the principles described in the chapter about deprecation.
API designers should apply the following rules to evolve APIs for services in a backward-compatible way:
- Add only optional, never mandatory fields.
- Never change the semantics of fields (e.g. changing the meaning from customer-number to customer-id, as both are different unique customer keys).
- Input fields may have (complex) constraints that are validated by server-side business logic. Never make the validation logic more restrictive, and make sure that all constraints are clearly defined in the description.
- Enum ranges used as input parameters can be reduced only if the server still accepts and handles the old range values. Enum ranges used as output parameters can be reduced.
- Enum ranges used as output parameters cannot be extended, because clients may not be prepared to handle new values. Enum ranges used as input parameters can be extended.
- Use x-extensible-enum if an enum range is used for output parameters and is likely to be extended with growing functionality. It defines an open list of explicit values, and clients must be agnostic to new values.
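The requirement that clients be agnostic to new values of an extensible enum can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the `OrderStatus` type, its values, and the `UNKNOWN` fallback are hypothetical names chosen for the example.

```python
from enum import Enum


class OrderStatus(Enum):
    # Hypothetical values known to this client at build time.
    CREATED = "CREATED"
    SHIPPED = "SHIPPED"
    # Fallback for values the provider adds later.
    UNKNOWN = "UNKNOWN"


def parse_status(raw: str) -> OrderStatus:
    """Map a raw string to a known status, or to UNKNOWN for new values."""
    try:
        return OrderStatus(raw)
    except ValueError:
        # A value outside the known range must not break the consumer.
        return OrderStatus.UNKNOWN
```

With this approach a value such as `"RETURNED"`, introduced by the provider later, is mapped to `OrderStatus.UNKNOWN` instead of raising an error in the consumer.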
On the other hand, API consumers must follow the tolerant reader rules (see below).
Clients of an API must follow the rules described in the chapter about tolerant dependencies.
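A tolerant reader can be sketched as a consumer that extracts only the fields it needs and ignores everything else. The event shape (`customerId`, `email`, `loyaltyTier`) and the `CustomerEvent` type below are assumptions made for illustration only.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class CustomerEvent:
    customer_id: str
    email: Optional[str] = None  # optional field, may be absent


def read_customer_event(payload: str) -> CustomerEvent:
    """Extract only the fields this consumer needs and ignore all others."""
    data = json.loads(payload)
    return CustomerEvent(
        customer_id=data["customerId"],  # mandatory for this consumer
        email=data.get("email"),         # optional, defaults to None
    )
```

A payload such as `{"customerId": "42", "loyaltyTier": "gold"}` is read without error: the unknown `loyaltyTier` field is skipped and the missing optional `email` becomes `None`.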
Every API endpoint (topic/queue) needs to be secured by a state-of-the-art authentication mechanism supported by the platform you use.
Distributed tracing across multiple applications, teams, and even large solutions is very important for root cause analysis. It helps detect how latencies stack up and where incidents are located, and can thus significantly shorten the mean time to repair (MTTR).
To identify a specific request through the entire chain and beyond team boundaries, every team (and API) MUST use OpenTelemetry as its way to trace calls and business transactions.
Teams MUST use the standard W3C Trace Context headers, as they are the common standard for distributed tracing and are supported by most cloud platforms and monitoring tools. We explicitly use the W3C standards for eventing too and do not distinguish between synchronous and asynchronous requests, because we want to be able to see traces across the boundaries of these two architectural patterns.
The traceparent header field identifies the incoming request in a tracing system. The trace-id defines the trace through the whole forest of synchronous and asynchronous requests. The parent-id defines a specific span within a trace.
The tracestate header field carries application-specific and/or APM-tool-specific key/value pairs.
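To make the structure of the traceparent header concrete, the following sketch parses a value in the W3C Trace Context version 00 layout (`version-trace_id-parent_id-flags`, all lowercase hex). The names `TraceParent` and `parse_traceparent` are hypothetical; in practice the propagators shipped with the OpenTelemetry SDKs are expected to handle this, so the code is only meant to illustrate the format.

```python
import re
from typing import NamedTuple, Optional

# version (2 hex) - trace-id (32 hex) - parent-id (16 hex) - trace-flags (2 hex)
_TRACEPARENT = re.compile(
    r"^(?P<version>[0-9a-f]{2})-(?P<trace_id>[0-9a-f]{32})-"
    r"(?P<parent_id>[0-9a-f]{16})-(?P<flags>[0-9a-f]{2})$"
)


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    parent_id: str
    sampled: bool


def parse_traceparent(value: str) -> Optional[TraceParent]:
    """Return the parsed header, or None if the value is malformed."""
    match = _TRACEPARENT.match(value.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.group(
        "version", "trace_id", "parent_id", "flags"
    )
    # Version ff and all-zero ids are invalid according to the specification.
    if version == "ff" or set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    # Bit 0 of the trace-flags is the "sampled" flag.
    return TraceParent(version, trace_id, parent_id, bool(int(flags, 16) & 0x01))
```

For the header value `00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01`, the trace-id is `4bf92f3577b34da6a3ce929d0e0e4736`, the parent-id is `00f067aa0ba902b7`, and the sampled flag is set. A producer would forward the same trace-id with a new parent-id when emitting an event.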
We use the AsyncAPI specification as the standard for defining event-driven API specification files.
The API specification files should be kept under version control in a source code management system, ideally together with the implementing sources.
Teams MUST publish the component API specification with the deployment of the implementing service and make it discoverable, following our publication principles. As a starting point, use our ESTA Blueprints (internal link).
The preferred data formats for asynchronous APIs at SBB are JSON and Apache Avro. If you have to choose between them, base the decision on what your customers/consumers are comfortable with. Additionally, please check out the Confluent blog about the differences between the two formats.
Teams SHOULD NOT use legacy data formats such as XML or the Java Object Serialization Stream Protocol. With these technologies it is almost impossible to fulfill the principles laid out in this document, because of numerous versioning, compatibility, and security issues.
Both JSON and Avro are supported by the Kafka schema registry and can be linked as resources from the developer portal.
Versions in the specification must follow the principles described by SemVer. Versions in queue/topic names are always major versions. More generally, we avoid exposing minor or patch versions anywhere they could break consumer compatibility. We explicitly avoid resource-based versioning, because it adds complexity that an API should not pass on to its consumers.
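The relationship between the SemVer version of a specification and the major version exposed in a queue/topic name can be sketched as follows. The helper names are hypothetical, and the sketch deliberately does not assume any particular topic naming pattern; it only shows that minor and patch releases leave the consumer-facing version unchanged.

```python
def major_version(version: str) -> int:
    """Return the major component of a SemVer string such as '2.3.1'."""
    # Drop build metadata ('+...') and pre-release ('-...') parts first.
    core = version.split("+", 1)[0].split("-", 1)[0]
    major, _minor, _patch = core.split(".")
    return int(major)


def needs_new_major_endpoint(old: str, new: str) -> bool:
    """True if the change requires a new major version of the topic/queue."""
    return major_version(new) != major_version(old)
```

Going from `1.4.0` to `1.5.2` keeps the same major version, so consumers stay on the existing topic/queue. Going from `1.5.2` to `2.0.0` requires a new one, with the old major version supported in parallel.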
Good example for a topic/queue name:
Bad example for a topic/queue name: