Time inconsistencies heuristic for faulty measurements detection #147

Open
LDiazN wants to merge 22 commits into main from time-inconsistencies

@LDiazN LDiazN commented Feb 11, 2026

Implements an Airflow DAG that flags measurements with timestamp anomalies.

closes #146

@LDiazN LDiazN requested a review from hellais February 11, 2026 14:57
@LDiazN LDiazN self-assigned this Feb 11, 2026
@LDiazN LDiazN changed the title Time inconsistencies Time inconsistencies heuristic for faulty measurements detection Feb 11, 2026

codecov bot commented Feb 12, 2026

Codecov Report

❌ Patch coverage is 96.98492% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.52%. Comparing base (e70e168) to head (b8b2e18).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...ine/src/oonipipeline/tasks/time_inconsistencies.py | 86.66% | 2 Missing ⚠️ |
| oonipipeline/src/oonipipeline/tasks/volume.py | 86.66% | 2 Missing ⚠️ |
| oonipipeline/tests/conftest.py | 92.00% | 2 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #147      +/-   ##
==========================================
+ Coverage   82.77%   83.52%   +0.74%     
==========================================
  Files          78       84       +6     
  Lines        4871     5068     +197     
==========================================
+ Hits         4032     4233     +201     
+ Misses        839      835       -4     
| Flag | Coverage Δ |
|---|---|
| oonidata | 77.86% <ø> (ø) |
| oonipipeline | 87.02% <96.98%> (+1.00%) ⬆️ |

@hellais hellais left a comment

This PR looks good. I left a comment on how to improve the threshold checks for measurements from time travelers.

WHERE
measurement_start_time >= %(start_time)s AND
measurement_start_time < %(end_time)s AND
abs(dateDiff('second', parseDateTimeBestEffort(substring(measurement_uid, 1, 15)), measurement_start_time)) >= %(treshold)s

I think we should separate the checks for measurements from the future from the checks for measurements from the past. Measurements from the future should never happen and are a sign of a probe with a faulty clock. Those from the past may be normal, since probes might be re-uploading measurements later.
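
The split suggested here could look roughly like the sketch below. It is only an illustration, not the PR's code: the function name `classify_time_inconsistency`, both threshold defaults, and the assumption that `measurement_uid` begins with a `YYYYMMDDhhmmss` timestamp are hypothetical. The sign convention follows the query above, where the difference is `measurement_start_time` minus the time parsed from the UID.

```python
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical thresholds, not taken from the PR: a strict one for clocks
# running ahead, a lenient one for measurements uploaded late.
FUTURE_THRESHOLD = timedelta(hours=1)
PAST_THRESHOLD = timedelta(days=365)


def classify_time_inconsistency(
    measurement_uid: str,
    measurement_start_time: datetime,
    future_threshold: timedelta = FUTURE_THRESHOLD,
    past_threshold: timedelta = PAST_THRESHOLD,
) -> Optional[str]:
    """Return "future", "past", or None when the timestamps are consistent."""
    # Assumption: the UID begins with a YYYYMMDDhhmmss timestamp.
    uid_time = datetime.strptime(measurement_uid[:14], "%Y%m%d%H%M%S")
    # Positive delta: the probe claims a start time later than the UID time.
    delta = measurement_start_time - uid_time
    if delta >= future_threshold:
        return "future"
    if -delta >= past_threshold:
        return "past"
    return None
```

Keeping the two thresholds separate lets the future check stay tight while the past check tolerates delayed uploads.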

"threshold": threshold,
}

values.append(("time_inconsistency", probe_cc, probe_asn, orjson.dumps(details).decode()))

To facilitate querying and analysis, we should maybe have two keys here, one for future and one for past measurements, so we can look at them separately.
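
One possible reading of this suggestion is sketched below: the details payload carries separate `future` and `past` entries so each direction can be queried on its own. The helper `build_details`, its key names, and the count arguments are hypothetical, and the standard-library `json` module stands in for `orjson` to keep the sketch dependency-free.

```python
import json


def build_details(future_count: int, past_count: int,
                  future_threshold: int, past_threshold: int) -> str:
    """Serialize per-direction anomaly info (all key names are hypothetical)."""
    details = {
        "future": {"count": future_count, "threshold": future_threshold},
        "past": {"count": past_count, "threshold": past_threshold},
    }
    return json.dumps(details)


# Row shaped like the tuple appended in the diff above; probe values are made up.
row = ("time_inconsistency", "IT", 12345, build_details(3, 40, 3600, 86400))
```

With a layout like this, an analysis query can filter on one direction without parsing the other.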

)
)
res = db.execute("SELECT COUNT() FROM faulty_measurements WHERE type = 'volume'")
assert res == [(1,)], "There should be at the least one event"

Shouldn't the assert message say "exactly one measurement"?

Development

Successfully merging this pull request may close these issues.

Implement time inconsistency heuristic task

2 participants