Oximeter is currently dropping a subsample of metrics on the colo rack:
Investigating further, these failures all come from two producer ports, 8001 and 4677:
These metrics come from ddmd and mgd, and sure enough, there are gaps (note that these metrics should be sampled every 1s):
I think this means that collecting from these producers often takes longer than 1s. I think we should either figure out why and fix it, figure out how oximeter can better tolerate collection latencies greater than the sampling interval, or raise the sampling interval.
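To quantify the gaps, something like the following could be run over the sample timestamps of one affected timeseries. This is only a rough sketch: `find_gaps`, its `tolerance` parameter, and the input format (a sorted list of `datetime` values exported from the metrics database) are all assumptions for illustration, not part of oximeter.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, interval=timedelta(seconds=1), tolerance=0.5):
    """Return (prev, curr, missed) for consecutive samples spaced too far apart.

    `timestamps` is assumed to be sorted ascending. A pair counts as a gap
    when its spacing exceeds interval * (1 + tolerance); `missed` estimates
    how many samples should have landed in between.
    """
    threshold = interval * (1 + tolerance)
    gaps = []
    for prev, curr in zip(timestamps, timestamps[1:]):
        delta = curr - prev
        if delta > threshold:
            # e.g. a 3s spacing at a 1s interval means 2 samples are missing
            missed = round(delta / interval) - 1
            gaps.append((prev, curr, missed))
    return gaps


if __name__ == "__main__":
    # Hypothetical data: samples at 0s, 1s, 2s, 5s, 6s (two samples dropped).
    base = datetime(2024, 1, 1)
    samples = [base + timedelta(seconds=s) for s in (0, 1, 2, 5, 6)]
    for prev, curr, missed in find_gaps(samples):
        print(f"gap from {prev} to {curr}: ~{missed} missed sample(s)")
```

Comparing the gap counts for the ddmd and mgd timeseries against a producer that is not failing would help confirm whether the drops line up with slow collections.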