feat: make Telethon flood-waits visible in the log #124

Open
tondeaf wants to merge 1 commit into GeiserX:main from tondeaf:feat/flood-wait-visibility

Conversation

@tondeaf

@tondeaf tondeaf commented Apr 18, 2026

The problem

Telethon's TelegramClient silently sleeps through any FloodWaitError whose wait is ≤ flood_sleep_threshold (default 60 s). On a large iter_messages pass — e.g. initial backup of a 100 k-message chat — Telegram often hands back much longer waits, and the operator can't tell "crashed" from "rate-limited":

  • No log lines
  • No DB writes
  • Realtime listener is still firing
  • systemctl status still shows the service as active

The fix

  1. Set flood_sleep_threshold=0 on the shared TelegramClient so every FloodWaitError bubbles up as an exception rather than a silent sleep.
  2. Add iter_messages_with_flood_retry(client, entity, **kwargs) — a thin async generator that wraps client.iter_messages, logs FloodWait: sleeping Ns before resuming at message_id=M, sleeps, then resumes iteration from the last-seen message id.
  3. Use that wrapper at the two bulk-iteration call-sites in src/telegram_backup.py (full chat backup and gap-fill).

New config knob FLOOD_WAIT_LOG_THRESHOLD (default 10 s) controls the minimum wait to log, so trivial 1-2 s floods don't spam the log.
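In sketch form, the two knobs combine like this (illustrative only; `build_client_kwargs` and `should_log_flood_wait` are placeholder names, not the real src/config.py API):

```python
import os

# Read the log threshold from the environment, defaulting to 10 s as the PR
# description states. The helper names below are assumptions for illustration.
FLOOD_WAIT_LOG_THRESHOLD = int(os.getenv("FLOOD_WAIT_LOG_THRESHOLD", "10"))


def build_client_kwargs(proxy=None):
    # flood_sleep_threshold=0 makes Telethon raise FloodWaitError for every
    # flood-wait instead of silently sleeping through waits under 60 s.
    kwargs = {"flood_sleep_threshold": 0}
    if proxy is not None:
        kwargs["proxy"] = proxy
    return kwargs


def should_log_flood_wait(wait_seconds):
    # Gate logging on the threshold so trivial 1-2 s floods stay quiet.
    return wait_seconds >= FLOOD_WAIT_LOG_THRESHOLD
```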

Before/after is qualitative:

  • Before: multi-minute flood-wait looks like a hang.
  • After: FloodWait: sleeping 47s before resuming at message_id=87342 — operator immediately knows what's happening.

Conflict notes

Rebased on top of the recently-merged SKIP_TOPIC_IDS feature (#118). Both conflict sites in src/telegram_backup.py are trivial — our flood-retry wrapper wraps the iter_messages call, and the topic-skip check stays inside the loop body where upstream added it.

Test plan

  • pytest tests/test_flood_wait_visibility.py tests/test_config.py — 101 pass
  • Manually triggered in production on a large chat; log now shows the flood-wait line and resumes cleanly from the correct message id.

Summary by CodeRabbit

  • Configuration

    • Updated Telegram client configuration with new settings for rate-limiting condition detection.
  • Bug Fixes

    • Implemented automatic retry and resume functionality for message retrieval operations affected by Telegram rate-limiting, ensuring backup synchronization and gap recovery continue seamlessly after throttling conditions are encountered.

Telethon's ``TelegramClient`` silently sleeps through any FloodWaitError
whose wait is ≤ ``flood_sleep_threshold`` (default 60s). On a large
``iter_messages`` pass (e.g. a 100k-message chat) Telegram often hands
back much longer waits, and the default behaviour makes the run look
like a hang: no log lines, no DB writes, realtime listener still firing.
The operator can't tell "crashed" from "rate-limited".

This change makes flood-waits explicit:

1. Set ``flood_sleep_threshold=0`` on the shared ``TelegramClient`` so
   every FloodWaitError bubbles up as an exception rather than a silent
   sleep.
2. Introduce ``iter_messages_with_flood_retry(client, entity, **kwargs)``
   — a thin async generator that wraps ``client.iter_messages``,
   logs ``FloodWait: sleeping Ns before resuming at message_id=M``,
   sleeps, then resumes iteration from the last-seen message id.
3. Use that wrapper at the two bulk-iteration sites in
   ``src/telegram_backup.py`` (full chat backup and gap-fill).

Config: ``FLOOD_WAIT_LOG_THRESHOLD`` (default 10s) controls the minimum
wait to log — trivial 1-2s floods stay quiet.

Before: multi-minute flood-wait looks like a crash.
After: operator sees ``FloodWait: sleeping 47s before resuming`` and
knows exactly what's happening.

Regression tests in ``tests/test_flood_wait_visibility.py``:
- Wrapper resumes iteration from the last-seen message id after a flood.
- Short floods below the log threshold are suppressed.
- Long floods emit a structured log line with wait duration + cursor.
@tondeaf tondeaf requested a review from GeiserX as a code owner April 18, 2026 21:52
@coderabbitai

coderabbitai Bot commented Apr 18, 2026

📝 Walkthrough

Walkthrough

Changes introduce flood-wait handling for Telegram message iteration by always setting flood_sleep_threshold: 0 in client kwargs to surface errors, adding a retry mechanism with resume-from-last-id semantics for message iteration, and expanding test coverage for flood-wait scenarios.

Changes

• Configuration Updates (src/config.py): Modified build_telegram_client_kwargs() and Config.get_telegram_client_kwargs() to always return {"flood_sleep_threshold": 0}, with proxy configuration conditionally added when available.
• Flood Retry Logic (src/telegram_backup.py): Added iter_messages_with_flood_retry() async generator to catch FloodWaitError, log duration, sleep, and resume iteration from the last message id. Integrated into _backup_dialog() and _fill_gap_range() message loops.
• Test Coverage (tests/test_config.py, tests/test_flood_wait_visibility.py): Updated existing config tests to expect flood_sleep_threshold: 0 in client kwargs. Added a new test file with three tests verifying flood-sleep-threshold propagation, client initialization, and flood-retry resumption behavior with mocked asyncio sleep.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • PR #104: Directly related—these changes modify the same Telegram client kwargs helper functions introduced in that PR by adding the mandatory flood_sleep_threshold: 0 baseline and conditional proxy attachment.
🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

• Description check (⚠️ Warning): The description is incomplete: the PR claims a new FLOOD_WAIT_LOG_THRESHOLD configuration knob (default 10s) but this is not implemented in the code changes shown. Resolution: either implement FLOOD_WAIT_LOG_THRESHOLD in the code or remove this claim from the description to match the actual implementation.
• Docstring Coverage (⚠️ Warning): Docstring coverage is 58.82%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (1 passed)

• Title check (✅ Passed): The title directly reflects the main change, making Telethon flood-waits visible via logging, and accurately summarizes the primary objective of the PR.


@GeiserX
Owner

GeiserX commented Apr 18, 2026

Hey @tondeaf, excellent feature! Making flood-waits visible is a huge UX improvement for operators — "is it crashed or rate-limited?" is a really common pain point with Telethon. The min_id extraction via keyword-only parameter (*, min_id=0, **kwargs) is a smart design choice that prevents double-passing.
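The double-passing hazard the keyword-only parameter avoids can be shown with a toy stand-in for client.iter_messages (all names here are illustrative, not the PR's actual code):

```python
def fake_iter_messages(entity, min_id=0, **kwargs):
    # Toy stand-in for client.iter_messages: just reports what it received.
    return entity, min_id


def naive_wrapper(client_iter, entity, **kwargs):
    # min_id hides inside **kwargs, so passing the retry cursor explicitly
    # collides: "got multiple values for keyword argument 'min_id'".
    resume_from = 99  # pretend we already saw message id 99
    return client_iter(entity, min_id=resume_from, **kwargs)


def safe_wrapper(client_iter, entity, *, min_id=0, **kwargs):
    # Keyword-only min_id is extracted from kwargs up front, so the wrapper
    # can substitute its own resume cursor without double-passing.
    resume_from = max(min_id, 99)
    return client_iter(entity, min_id=resume_from, **kwargs)
```

Calling `naive_wrapper(fake_iter_messages, "chat", min_id=5)` raises TypeError, while the keyword-only version resumes cleanly from the larger cursor.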

Here's the thorough review:

Code Quality (1 CRITICAL, 2 MEDIUM, 2 LOW)

CRITICAL — while True loop has no retry cap
If Telegram repeatedly returns FloodWaitError (e.g., account temporarily restricted), the loop retries forever. The backup process hangs indefinitely on a single chat, blocking all subsequent chats. Please add a maximum retry count:

MAX_FLOOD_RETRIES = 5
retries = 0
while True:
    try:
        async for msg in client.iter_messages(entity, min_id=resume_from, **kwargs):
            yield msg
            if getattr(msg, "id", None) is not None:
                resume_from = max(resume_from, msg.id)
            retries = 0  # reset on successful progress
        return
    except FloodWaitError as e:
        retries += 1
        if retries > MAX_FLOOD_RETRIES:
            logger.error("FloodWait: exceeded %d retries, giving up (last_msg_id=%s)", MAX_FLOOD_RETRIES, resume_from)
            raise
        logger.warning(...)
        await asyncio.sleep(e.seconds + 1)

MEDIUM — PR body advertises FLOOD_WAIT_LOG_THRESHOLD but it's not implemented
The description says "New config knob FLOOD_WAIT_LOG_THRESHOLD (default 10s) controls the minimum wait to log" — but this doesn't exist in the diff. All flood-waits are logged unconditionally. Either implement it or remove the claim from the PR body.

MEDIUM — e.seconds is trusted without an upper bound
A malicious/buggy server could send an extremely large FloodWaitError.seconds value, hanging the process for hours/days. Cap the sleep:

MAX_FLOOD_WAIT_SECONDS = 600  # 10 minutes
wait_seconds = min(e.seconds + 1, MAX_FLOOD_WAIT_SECONDS)
await asyncio.sleep(wait_seconds)

LOW — Two extra blank lines introduced
At telegram_backup.py:698 and :790, there are two consecutive blank lines after the topic-skip blocks. PEP 8 recommends at most one within a function body.

LOW — (yrru-mix3) in test docstring
Same as PR #123 — opaque internal ID, replace with PR/issue number.

Security (APPROVE with above changes)

  • No rate-limit amplification — the wrapper sleeps the full requested duration before retrying.
  • Log line only exposes e.seconds (int) and resume_from (message ID int) — no PII, no entity names.
  • flood_sleep_threshold=0 is the right call for a backup tool where visibility matters.

Test Adequacy (2 HIGH gaps, 2 MEDIUM)

HIGH — No test for FloodWait before any message is yielded
The seeded_iter always yields id=1 before the error. But what if the very first API call raises FloodWaitError? resume_from stays at original min_id. The initial fake_iter was defined for this but is actually dead code (overwritten by seeded_iter before use). Please: remove dead fake_iter, add a separate test for the no-yield-before-flood case.

HIGH — No test for multiple consecutive FloodWaitErrors
Production commonly sees 3-4 consecutive floods before success. Add a test that raises FloodWaitError 2-3 times, then succeeds.

MEDIUM — No test for max_id preservation across retries
The gap-fill call-site passes max_id=gap_end via **kwargs. It's forwarded correctly, but an explicit test would catch regressions.

MEDIUM — _fake_db fixture is overly broad
The autouse=True fixture reloads src.connection and src.telegram_backup for every test, including test_config_kwargs_include_flood_sleep_threshold_zero which doesn't need it. Consider narrowing the scope.

CodeRabbit

CodeRabbit was rate-limited on this PR. I'll retrigger it once the cooldown expires.

Thanks for the contribution! 🙏

@GeiserX
Owner

GeiserX commented Apr 18, 2026

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented Apr 18, 2026

✅ Actions performed

Full review triggered.

@GeiserX
Owner

GeiserX commented Apr 19, 2026

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented Apr 19, 2026

✅ Actions performed

Full review triggered.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

🧹 Nitpick comments (1)
tests/test_flood_wait_visibility.py (1)

26-44: autouse=True is overkill and module-reload is a footgun.

This fixture reloads src.connection and src.telegram_backup for every test in the file — including test_config_kwargs_include_flood_sleep_threshold_zero which only touches src.config and doesn't need the DB stub at all. Reloading modules mid-test-suite can also leak state into unrelated tests via sys.modules mutation if this file is imported alongside others.

Narrow the scope: drop autouse=True and apply the fixture explicitly to the two tests that actually exercise src.telegram_backup / src.connection. The teardown branch at lines 42-44 is also effectively a no-op (the if is always true since the fixture just inserted the module) — either restore the original sys.modules["src.db"] or remove the branch.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_flood_wait_visibility.py` around lines 26 - 44, The _fake_db
fixture is too broad and the module reload teardown is unsafe; remove
autouse=True from the _fake_db fixture and instead add it as an explicit
argument to only the tests that need DB stubbing (the tests that import/use
src.telegram_backup and src.connection), and replace the teardown no-op with
restoring the prior sys.modules["src.db"] value (capture original =
sys.modules.get("src.db") at fixture start and on teardown set
sys.modules["src.db"] = original or delete it if original was None) so the
fixture (_fake_db), monkeypatch.setitem, and the importlib.reload calls no
longer leak state into unrelated tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/telegram_backup.py`:
- Around line 697-699: There are two instances of double blank lines inside
function bodies (one immediately after the statement "msg_data = await
self._process_message(message, chat_id)" and another in the later block around
similar message-processing logic); remove the extra blank line in each place so
there is at most one blank line between logical blocks per PEP 8 — update the
function(s) containing "_process_message" calls to collapse the consecutive
blank lines to a single blank line and re-run ruff format/check to ensure the
change passes linting.
- Around line 41-62: iter_messages_with_flood_retry currently retries forever,
sleeps for unbounded server-supplied seconds, doesn't honour
FLOOD_WAIT_LOG_THRESHOLD, and lacks type hints; fix by adding type hints for
parameters and return (e.g., AsyncIterator[Message] or appropriate Telethon
types), introduce a bounded retry counter (e.g., retry_count and MAX_RETRIES)
that increments on each FloodWaitError and is reset to 0 whenever resume_from
advances (i.e., after yielding a message), clamp the sleep to
MAX_FLOOD_WAIT_SECONDS (use min(e.seconds, MAX_FLOOD_WAIT_SECONDS) + 1) and only
log the FloodWait if e.seconds >= FLOOD_WAIT_LOG_THRESHOLD, and raise or abort
after exceeding MAX_RETRIES to avoid infinite loops (refer to
iter_messages_with_flood_retry, resume_from, e.seconds,
FLOOD_WAIT_LOG_THRESHOLD, MAX_FLOOD_WAIT_SECONDS, and retry_count/MAX_RETRIES).

In `@tests/test_flood_wait_visibility.py`:
- Around line 96-142: Remove the dead fake_iter definition (it’s overwritten by
seeded_iter) and add focused tests for iter_messages_with_flood_retry: (1) a
test where the first call to the fake iter_messages raises FloodWaitError before
yielding any message (ensure min_id stays 0 and subsequent call yields
messages), (2) a test that simulates consecutive FloodWaitError occurrences
before success to exercise retry-count logic, and (3) a test that captures the
kwargs passed into the fake iter_messages to assert that max_id (and other
kwargs) are preserved across retries (use SimpleNamespace client.iter_messages,
patch asyncio.sleep, and assert behavior against iter_messages_with_flood_retry
and the _fill_gap_range path).
- Around line 1-11: The module docstring uses a cryptic internal ID
"(yrru-mix3)"; replace that token in the top-of-file docstring with the actual
PR or issue reference (e.g. "PR #1234" or "Issue #5678") so git-blame/searchers
can find the context—edit the string in tests/test_flood_wait_visibility.py (the
file-level docstring surrounding "Flood-wait visibility") to include the real
PR/issue number instead of "(yrru-mix3)".

---

Nitpick comments:
In `@tests/test_flood_wait_visibility.py`:
- Around line 26-44: The _fake_db fixture is too broad and the module reload
teardown is unsafe; remove autouse=True from the _fake_db fixture and instead
add it as an explicit argument to only the tests that need DB stubbing (the
tests that import/use src.telegram_backup and src.connection), and replace the
teardown no-op with restoring the prior sys.modules["src.db"] value (capture
original = sys.modules.get("src.db") at fixture start and on teardown set
sys.modules["src.db"] = original or delete it if original was None) so the
fixture (_fake_db), monkeypatch.setitem, and the importlib.reload calls no
longer leak state into unrelated tests.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 9fb4dc56-f287-4704-ad25-fd93d6825d37

📥 Commits

Reviewing files that changed from the base of the PR and between d76873a and 76f7b4e.

📒 Files selected for processing (4)
  • src/config.py
  • src/telegram_backup.py
  • tests/test_config.py
  • tests/test_flood_wait_visibility.py

Comment thread src/telegram_backup.py
Comment on lines +41 to +62
async def iter_messages_with_flood_retry(client, entity, *, min_id=0, **kwargs):
    """Wrap ``client.iter_messages`` so FloodWaitError is logged and retried.

    With ``flood_sleep_threshold=0`` on the client, every flood-wait bubbles up
    as an exception. We log the wait and resume iteration from the last yielded
    message id so progress isn't lost.
    """
    resume_from = min_id
    while True:
        try:
            async for msg in client.iter_messages(entity, min_id=resume_from, **kwargs):
                yield msg
                if getattr(msg, "id", None) is not None:
                    resume_from = max(resume_from, msg.id)
            return
        except FloodWaitError as e:
            logger.warning(
                "FloodWait: sleeping %ss before resuming (last_msg_id=%s)",
                e.seconds,
                resume_from,
            )
            await asyncio.sleep(e.seconds + 1)

⚠️ Potential issue | 🔴 Critical

Retry loop needs a leash and a muzzle.

Two production risks and one missing feature in this wrapper:

  1. Infinite retry. while True with no retry cap: a restricted/limited account can loop forever on a single chat, stalling all subsequent chats in the run. Add a bounded retry count that resets on progress.
  2. Unbounded sleep. asyncio.sleep(e.seconds + 1) trusts whatever the server returns. A pathological e.seconds (hours/days) silently wedges the backup. Cap via MAX_FLOOD_WAIT_SECONDS.
  3. FLOOD_WAIT_LOG_THRESHOLD advertised in the PR description but not implemented — every flood-wait logs at WARNING, including trivial sub-second ones. Either wire it up or drop the claim from the PR.
  4. Missing type hints on the signature (client, entity, return type). As per coding guidelines: "Use type hints for function signatures".
🔧 Proposed fix
-async def iter_messages_with_flood_retry(client, entity, *, min_id=0, **kwargs):
-    """Wrap ``client.iter_messages`` so FloodWaitError is logged and retried.
-
-    With ``flood_sleep_threshold=0`` on the client, every flood-wait bubbles up
-    as an exception. We log the wait and resume iteration from the last yielded
-    message id so progress isn't lost.
-    """
-    resume_from = min_id
-    while True:
-        try:
-            async for msg in client.iter_messages(entity, min_id=resume_from, **kwargs):
-                yield msg
-                if getattr(msg, "id", None) is not None:
-                    resume_from = max(resume_from, msg.id)
-            return
-        except FloodWaitError as e:
-            logger.warning(
-                "FloodWait: sleeping %ss before resuming (last_msg_id=%s)",
-                e.seconds,
-                resume_from,
-            )
-            await asyncio.sleep(e.seconds + 1)
+MAX_FLOOD_RETRIES = 5
+MAX_FLOOD_WAIT_SECONDS = 600
+FLOOD_WAIT_LOG_THRESHOLD = int(os.getenv("FLOOD_WAIT_LOG_THRESHOLD", "10"))
+
+
+async def iter_messages_with_flood_retry(
+    client: TelegramClient,
+    entity,
+    *,
+    min_id: int = 0,
+    **kwargs,
+):
+    """Wrap ``client.iter_messages`` so FloodWaitError is logged and retried.
+
+    With ``flood_sleep_threshold=0`` on the client, every flood-wait bubbles up
+    as an exception. We log the wait and resume iteration from the last yielded
+    message id so progress isn't lost.
+    """
+    resume_from = min_id
+    retries = 0
+    while True:
+        made_progress = False
+        try:
+            async for msg in client.iter_messages(entity, min_id=resume_from, **kwargs):
+                yield msg
+                made_progress = True
+                if getattr(msg, "id", None) is not None:
+                    resume_from = max(resume_from, msg.id)
+            return
+        except FloodWaitError as e:
+            if made_progress:
+                retries = 0  # reset on progress
+            retries += 1
+            if retries > MAX_FLOOD_RETRIES:
+                logger.error(
+                    "FloodWait: exceeded %d retries, aborting (last_msg_id=%s)",
+                    MAX_FLOOD_RETRIES,
+                    resume_from,
+                )
+                raise
+            wait_seconds = min(e.seconds + 1, MAX_FLOOD_WAIT_SECONDS)
+            if e.seconds >= FLOOD_WAIT_LOG_THRESHOLD:
+                logger.warning(
+                    "FloodWait: sleeping %ss (requested=%ss) before resuming "
+                    "(last_msg_id=%s, retry=%d/%d)",
+                    wait_seconds,
+                    e.seconds,
+                    resume_from,
+                    retries,
+                    MAX_FLOOD_RETRIES,
+                )
+            await asyncio.sleep(wait_seconds)

As per coding guidelines: "Use type hints for function signatures".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/telegram_backup.py` around lines 41 - 62, iter_messages_with_flood_retry
currently retries forever, sleeps for unbounded server-supplied seconds, doesn't
honour FLOOD_WAIT_LOG_THRESHOLD, and lacks type hints; fix by adding type hints
for parameters and return (e.g., AsyncIterator[Message] or appropriate Telethon
types), introduce a bounded retry counter (e.g., retry_count and MAX_RETRIES)
that increments on each FloodWaitError and is reset to 0 whenever resume_from
advances (i.e., after yielding a message), clamp the sleep to
MAX_FLOOD_WAIT_SECONDS (use min(e.seconds, MAX_FLOOD_WAIT_SECONDS) + 1) and only
log the FloodWait if e.seconds >= FLOOD_WAIT_LOG_THRESHOLD, and raise or abort
after exceeding MAX_RETRIES to avoid infinite loops (refer to
iter_messages_with_flood_retry, resume_from, e.seconds,
FLOOD_WAIT_LOG_THRESHOLD, MAX_FLOOD_WAIT_SECONDS, and retry_count/MAX_RETRIES).

Comment thread src/telegram_backup.py
Comment on lines 697 to 699


msg_data = await self._process_message(message, chat_id)

⚠️ Potential issue | 🟡 Minor

Double blank lines, PEP 8 disapproves.

Lines 697-698 and 787-788 introduce two consecutive blank lines inside function bodies. PEP 8 allows at most one blank line between logical blocks within a function. Ruff format should catch this on pre-push.

✂️ Proposed fix
             if self.config.should_skip_topic(chat_id, extract_topic_id(message)):
                 continue
-

             msg_data = await self._process_message(message, chat_id)

Apply the same trim at line 788.

As per coding guidelines: "Run python3 -m ruff check . && python3 -m ruff format --check . before pushing to ensure CI lint workflow passes".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/telegram_backup.py` around lines 697 - 699, There are two instances of
double blank lines inside function bodies (one immediately after the statement
"msg_data = await self._process_message(message, chat_id)" and another in the
later block around similar message-processing logic); remove the extra blank
line in each place so there is at most one blank line between logical blocks per
PEP 8 — update the function(s) containing "_process_message" calls to collapse
the consecutive blank lines to a single blank line and re-run ruff format/check
to ensure the change passes linting.

Comment on lines +1 to +11
"""Flood-wait visibility (yrru-mix3).

Goal: make Telethon flood-waits visible in the scheduler log so a long silent
pause during backfill can be diagnosed instead of mistaken for a hang.

Two things under test:
1. Config exposes ``flood_sleep_threshold=0`` in the shared client kwargs so
   Telethon always raises ``FloodWaitError`` instead of sleeping silently.
2. A thin retry wrapper around ``client.iter_messages`` catches the error,
   logs the wait, and resumes iteration from the last yielded message id.
"""

⚠️ Potential issue | 🟡 Minor

Replace cryptic internal ID with a real reference.

(yrru-mix3) tells future readers nothing. Use the PR/issue number so anyone spelunking through git blame can find the context.

📝 Proposed fix
-"""Flood-wait visibility (yrru-mix3).
+"""Flood-wait visibility (PR #124).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_flood_wait_visibility.py` around lines 1 - 11, The module
docstring uses a cryptic internal ID "(yrru-mix3)"; replace that token in the
top-of-file docstring with the actual PR or issue reference (e.g. "PR #1234" or
"Issue #5678") so git-blame/searchers can find the context; edit the string in
tests/test_flood_wait_visibility.py (the file-level docstring surrounding
"Flood-wait visibility") to include the real PR/issue number instead of
"(yrru-mix3)".

Comment on lines +96 to +142
calls = {"n": 0}

async def fake_iter(entity, min_id=0, reverse=True, **_):
    calls["n"] += 1
    if calls["n"] == 1:
        assert min_id == 0
        raise FloodWaitError(request=None, capture=7)
    # Second call: resume from last yielded id (1) then yield 2, 3
    assert min_id == 1
    for i in (2, 3):
        yield SimpleNamespace(id=i)

fake_client = SimpleNamespace(iter_messages=fake_iter)

collected: list[int] = []

async def fast_sleep(_):
    return None

with (
    caplog.at_level(logging.WARNING, logger="src.telegram_backup"),
    patch.object(telegram_backup.asyncio, "sleep", fast_sleep),
):
    # Simulate: first fetch yields id=1, then FloodWait, then retry yields 2,3.
    # We need an additional pre-yielded message to seed last-id tracking.
    async def seeded_iter(entity, min_id=0, reverse=True, **_):
        calls["n"] += 1
        if calls["n"] == 1:
            yield SimpleNamespace(id=1)
            raise FloodWaitError(request=None, capture=7)
        assert min_id == 1
        for i in (2, 3):
            yield SimpleNamespace(id=i)

    fake_client.iter_messages = seeded_iter
    calls["n"] = 0
    async for msg in telegram_backup.iter_messages_with_flood_retry(
        fake_client, "chat", min_id=0, reverse=True
    ):
        collected.append(msg.id)

assert collected == [1, 2, 3]
assert calls["n"] == 2
assert any(
    "FloodWait" in r.getMessage() and "7" in r.getMessage()
    for r in caplog.records
)

⚠️ Potential issue | 🟡 Minor

Dead code and thin coverage.

Two problems here:

  1. Dead fake_iter (lines 98-106): defined, never called. seeded_iter at line 121 immediately overwrites fake_client.iter_messages, making the outer definition pure noise that confuses readers about what the test actually exercises.

  2. Coverage gaps the reviewer flagged remain unaddressed:

    • FloodWait on the initial call (before any message yields) — resume_from stays at min_id=0; your current test always seeds with id=1 first, so the "no progress yet" branch is untested.
    • Consecutive FloodWaitErrors (2–3 in a row before success) — exercises the retry-count logic (see the cap suggested on the production wrapper).
    • max_id preservation across retries for the _fill_gap_range path — trivial to add by asserting kwargs.get("max_id") inside the fake iter.
🧪 Sketch for additional tests
@pytest.mark.asyncio
async def test_iter_with_flood_retry_before_first_yield(caplog):
    from src import telegram_backup

    calls = {"n": 0}

    async def iter_fn(entity, min_id=0, **_):
        calls["n"] += 1
        if calls["n"] == 1:
            assert min_id == 0
            raise FloodWaitError(request=None, capture=3)
            yield  # make this an async generator
        assert min_id == 0  # no progress made -> resume from original min_id
        for i in (1, 2):
            yield SimpleNamespace(id=i)

    client = SimpleNamespace(iter_messages=iter_fn)
    with patch.object(telegram_backup.asyncio, "sleep", AsyncMock()):
        out = [m.id async for m in telegram_backup.iter_messages_with_flood_retry(client, "chat")]
    assert out == [1, 2]


@pytest.mark.asyncio
async def test_iter_with_flood_retry_preserves_max_id():
    from src import telegram_backup

    seen_kwargs = []

    async def iter_fn(entity, min_id=0, **kwargs):
        seen_kwargs.append({"min_id": min_id, **kwargs})
        if len(seen_kwargs) == 1:
            raise FloodWaitError(request=None, capture=1)
            yield
        for i in (10, 11):
            yield SimpleNamespace(id=i)

    client = SimpleNamespace(iter_messages=iter_fn)
    with patch.object(telegram_backup.asyncio, "sleep", AsyncMock()):
        async for _ in telegram_backup.iter_messages_with_flood_retry(
            client, "chat", min_id=5, max_id=20, reverse=True
        ):
            pass

    assert all(k.get("max_id") == 20 for k in seen_kwargs)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_flood_wait_visibility.py` around lines 96 - 142, Remove the dead
fake_iter definition (it’s overwritten by seeded_iter) and add focused tests for
iter_messages_with_flood_retry: (1) a test where the first call to the fake
iter_messages raises FloodWaitError before yielding any message (ensure min_id
stays 0 and subsequent call yields messages), (2) a test that simulates
consecutive FloodWaitError occurrences before success to exercise retry-count
logic, and (3) a test that captures the kwargs passed into the fake
iter_messages to assert that max_id (and other kwargs) are preserved across
retries (use SimpleNamespace client.iter_messages, patch asyncio.sleep, and
assert behavior against iter_messages_with_flood_retry and the _fill_gap_range
path).
