Handling Time

Last Significant Update:

2024-06-25

Status:

Draft

Comments to:

The Issue

In a multi-machine deployment there is no guarantee that every machine agrees on the “current time”; typically each machine’s clock differs slightly from the others’.

The same can happen within a single machine: on some multi-processor system designs the in-CPU timestamp counters are not kept in step, so they can read different values at the same instant.

These differences cause problems when events from different event streams have to be merged into a single time-ordered sequence.

Possible Fixes

There are a couple of ways that we could cope with this:

Use application-specific knowledge to compensate for the skew.

For example, the response to an RPC with a specific tag can only happen after that RPC was issued. Causal constraints like this could be used to adjust for skewed timestamps in traces.

Implement a periodic request/response round trip that exchanges timestamps between the system being measured and the system doing the measurement.

These timestamps could be used to drive a phase-locked loop that tracks the timestamp skew for each measurement source.
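
As a rough illustration of the first option, the sketch below turns the RPC causality rule into bounds on the clock offset between two machines. It assumes a hypothetical trace record, `RpcTrace`, that carries four timestamps per tagged RPC (two from the client's clock, two from the server's); the record layout and the function names are made up for this example and are not taken from any existing tracing format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RpcTrace:
    """Hypothetical trace record for one tagged RPC (an assumption)."""
    tag: str
    client_send: float  # client clock: request issued
    server_recv: float  # server clock: request seen
    server_send: float  # server clock: response issued
    client_recv: float  # client clock: response seen


def offset_bounds(traces):
    """Bound offset = server_clock - client_clock using causality only.

    A request cannot be received before it was sent:
        server_recv - offset >= client_send  =>  offset <= server_recv - client_send
    A response cannot be received before it was sent:
        client_recv >= server_send - offset  =>  offset >= server_send - client_recv
    Every RPC narrows the interval, so the bounds are intersected.
    """
    lo = max(t.server_send - t.client_recv for t in traces)
    hi = min(t.server_recv - t.client_send for t in traces)
    if lo > hi:
        raise ValueError("traces are mutually inconsistent (drift or bad tags?)")
    return lo, hi


def to_client_time(server_ts, bounds):
    """Map a server timestamp onto the client's clock using the midpoint."""
    lo, hi = bounds
    return server_ts - (lo + hi) / 2.0


if __name__ == "__main__":
    traces = [
        RpcTrace("a", 100.0, 106.0, 107.0, 104.0),
        RpcTrace("b", 200.0, 207.0, 208.0, 203.5),
    ]
    bounds = offset_bounds(traces)
    print("offset bounds:", bounds)
    print("server 106.0 on client clock:", to_client_time(106.0, bounds))
```

The midpoint is only one possible choice; any offset inside the bounds keeps every response after its request. The bounds also assume the offset is constant over the traces considered, which will not hold over long periods if the clocks drift.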
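
The second option might look something like the following sketch. Each round trip yields four timestamps, from which an offset can be estimated with the usual NTP-style formula (which assumes the network delay is roughly symmetric). A small proportional-integral tracking loop then follows the offset and its drift; it stands in for the phase-locked loop mentioned above and is not a tuned design. The names `round_trip_sample` and `SkewTracker`, and the gains `kp` and `ki`, are assumptions made for this example.

```python
def round_trip_sample(t0, t1, t2, t3):
    """Estimate (offset, delay) from one round trip.

    t0: request sent      (measuring system's clock)
    t1: request received  (measured system's clock)
    t2: response sent     (measured system's clock)
    t3: response received (measuring system's clock)
    offset is measured-system clock minus measuring-system clock.
    """
    offset = ((t1 - t0) + (t2 - t3)) / 2.0
    delay = (t3 - t0) - (t2 - t1)
    return offset, delay


class SkewTracker:
    """Tracks offset and drift for one measurement source."""

    def __init__(self, kp=0.5, ki=0.1):
        self.kp = kp            # proportional gain (illustrative value)
        self.ki = ki            # integral gain (illustrative value)
        self.offset = 0.0       # estimated offset, seconds
        self.drift = 0.0        # estimated offset change per second
        self._last_time = None  # local time of the previous sample

    def update(self, t0, t1, t2, t3):
        measured, _delay = round_trip_sample(t0, t1, t2, t3)
        now = (t0 + t3) / 2.0   # local midpoint of the round trip
        if self._last_time is None:
            self.offset = measured  # first sample: take it as-is
        else:
            dt = now - self._last_time
            predicted = self.offset + self.drift * dt
            error = measured - predicted
            self.offset = predicted + self.kp * error
            if dt > 0:
                self.drift += self.ki * error / dt
        self._last_time = now
        return self.offset

    def to_local(self, remote_ts):
        """Map a measured-system timestamp onto the measuring system's clock."""
        return remote_ts - self.offset
```

One tracker instance would be kept per measurement source and fed a sample on every periodic round trip; timestamps in that source's traces could then be passed through `to_local` before the streams are merged.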

TODO

Study how monitoring frameworks (like OpenTelemetry) cope with this issue.