Skip to content

binding-and-user-name-mapping#20

Open
WhiteFeather127 wants to merge 4 commits into
Phobos-developers:masterfrom
WhiteFeather127:binding-and-user-name-mapping
Open

binding-and-user-name-mapping#20
WhiteFeather127 wants to merge 4 commits into
Phobos-developers:masterfrom
WhiteFeather127:binding-and-user-name-mapping

Conversation

@WhiteFeather127

Copy link
Copy Markdown
Contributor
命令 说明
/bind <QQ号> Discord 私信 发起,绑定到指定 QQ 号
/bind <Discord昵称> QQ 私信 发起,绑定到指定 Discord 用户
/unbind 解除当前账号的所有绑定关系

当机器人使用跨群转发功能转发一条消息时:

  • 如果原消息中 @ 了某个用户,且该用户已完成跨平台绑定,那么在目标群中转发时,机器人将自动 @ 该用户所绑定的对方平台账号
  • 同时,如果发送者本人也已绑定账号,则转发时会用其绑定的平台账号名替换原有的用户名(即显示为绑定后的名称)。
Command Description
/bind <QQ number> Initiate from a Discord DM to bind to the specified QQ number.
/bind <Discord nickname> Initiate from a QQ DM to bind to the specified Discord user.
/unbind Unbind all binding relationships for the current account.

When the bot forwards a message using the cross‑group forwarding feature:

  • If the original message @mentioned a user and that user has completed cross‑platform binding, the bot will automatically @ the corresponding bound account of that user on the other platform when forwarding to the target group.
  • Additionally, if the sender themselves have bound their account, the bot will replace the original username with the bound platform account name (i.e., display the bound name) during forwarding.

@TaranDahl

Copy link
Copy Markdown
Collaborator

PR #16 审查意见(去除已忽略/已修复项)

🟠 严重

1. 验证流程存在竞态条件

文件: orchestrator.py:614-670

async def _handle_verification_reply(self, event, code):
    result = self._verification_manager.verify(...)
    # ↓ 验证码校验通过到 bind() 执行之间有窗口
    self._bind_manager.bind(...)

验证码校验成功到 bind() 执行之间没有锁保护。考虑以下场景:

  1. Alice 发起 /bind, Bot 发验证码到 Bob
  2. Bob 回复验证码, verify() 通过
  3. 在执行 bind() 之前, Bob 又发起了一个 /bind 请求(指向另一个用户)
  4. 两个协程同时到达 bind(), 导致绑定状态不一致

虽然 BindManager.bind() 内部有冲突检查,但两个协程可能都通过了检查然后同时执行 _save(),后写入的会覆盖前一个。

建议: 用 asyncio.Lock 保护整个绑定操作,或者在 BindManager 内部实现 compare-and-swap,把检查+写入做成原子操作。


2. 直接访问私有成员 _cache

文件: orchestrator.py:713 / L764 / L834

多处通过 self.matcher._cache 直接访问 UserMatcher 的双下划线私有属性:

# _resolve_target_user
cache = self.matcher._cache.get(platform)

# _resolve_author_display_name
cache = self.matcher._cache.get("discord" if platform == "qq" else "qq", {})

# _resolve_text_mention_via_binding
source_cache = self.matcher._cache.get(source_platform, {})

Python 的 __cache(双下划线前缀)会被 name mangling 为 _UserMatcher__cache,但 _cache(单下划线)只是约定上的保护,外部仍可访问。这破坏了封装——如果以后 UserMatcher 改用了数据库或 Redis 做缓存,这些代码都会静默失效。

建议: 在 UserMatcher 上添加公共方法:

  • get_display_name(platform, user_id) -> str | None
  • search_users(name, platform) -> list[tuple[str, str]]
  • get_platform_cache(platform) -> dict[str, str](如果确实需要暴露)

🟡 中等

3. 缺少新功能单元测试

新增了 308 行核心逻辑(bind_manager.py 156 行 + verification.py 152 行),但完全没有测试覆盖。orchestrator.py 新增的 432 行也只修改了原有测试中的断言(QQUser:`QQUser`:),没有覆盖任何绑定/解绑/验证码/昵称替换的测试路径。

建议:至少补充:

  • BindManagerbind / unbind / get_counterpart / 双向映射一致性 / 冲突异常 / JSON 持久化与恢复
  • VerificationManagercreate / verify / code 过期 / 多次错误重试 / cancel
  • Orchestrator_handle_bind_command / _handle_unbind_command / _handle_verification_reply / _resolve_author_display_name

4. @提及绑定的名称匹配逻辑过于宽松

文件: orchestrator.py:836

if name == source_display or name in source_display or source_display in name:

三层匹配(精确匹配 + 双向子串包含)会引入误匹配。例如:

  • 用户 A 的显示名 = "test"
  • 用户 B 的显示名 = "test123"
  • 用户在消息中 @test123
  • 文本解析出 name = "test123"
  • "test" in "test123" 为 True → 错误地匹配到用户 A

建议: 先做精确匹配,只有精确匹配失败时才 fallback 到子串匹配,且对子串匹配结果加日志警告。或者在子串匹配时取最短显示名,而不是第一个匹配。


5. /bind 命令解析的边界情况

文件: orchestrator.py:679

rest = text[6:]  # "/bind " 硬编码长度为 6

两个问题:

  • 用户输入 /bind (多个空格)时 text[6:] 会包含前导空格,后续 strip() 能处理但逻辑脆弱
  • L693 rest.isdigit() 从 Discord 发起时,假设纯数字一定是 QQ 号。Discord 用户名可以是纯数字(虽然罕见),会误判

建议: 改用正则 /^\/bind\s+/ 做前缀匹配;给 _parse_bind_target 增加 from_platform 参数,从 Discord 发起时仍允许通过 Discord:用户名 显式声明目标平台。


🔵 轻微

6. 验证码重试无次数限制

文件: verification.py:121-122

验证码输错后被放回 pending 队列,允许无限重试。6 位纯数字验证码共 100 万种组合,攻击者在 5 分钟内以每轮一次网络请求的速度理论上可暴力破解。

建议: 在 VerificationCode 中添加 attempts: int = 0 字段,错误时 attempts += 1,超过 5 次直接删除验证码。


7. 测试中 Orchestrator 初始化可能产生副作用

文件: tests/test_orchestrator.py:116

orch = Orchestrator(bridge_config, mock_message_store)

BridgeConfig()data_dir 默认值为 "./data"Orchestrator.__init__ 会据此创建 BindManager(data_dir="./data")。如果测试路径意外触发了 _save(),会在工作目录下创建 ./data/bindings/... 目录。

建议: 在测试中显式传入 bind_manager,或使用 pytesttmp_path fixture 设置临时 data_dir。---

PR #16 Review Comments (Ignored/Fixed Items Removed)

🟠 Critical

1. Race Condition in Verification Workflow

File: orchestrator.py:614-670

async def _handle_verification_reply(self, event, code):
    result = self._verification_manager.verify(...)
    # ↓ Window exists between successful verification and bind() execution
    self._bind_manager.bind(...)

No lock protection covers the period from successful verification to the execution of bind(). Consider this scenario:

  1. Alice sends the /bind command, and the Bot delivers a verification code to Bob
  2. Bob replies with the code, and verify() passes validation
  3. Before bind() runs, Bob submits another /bind request targeting a different user
  4. Two coroutines reach bind() simultaneously, leading to inconsistent binding states

Although conflict checks exist inside BindManager.bind(), both coroutines may pass the checks and execute _save() concurrently, where the later write overwrites the earlier one.

Recommendation: Wrap the entire binding logic with an asyncio.Lock, or implement compare-and-swap logic within BindManager to atomically combine validation and write operations.


2. Direct Access to Private Member _cache

File: orchestrator.py:713 / L764 / L834

Multiple locations directly access the single-underscore private attribute of UserMatcher via self.matcher._cache:

# _resolve_target_user
cache = self.matcher._cache.get(platform)

# _resolve_author_display_name
cache = self.matcher._cache.get("discord" if platform == "qq" else "qq", {})

# _resolve_text_mention_via_binding
source_cache = self.matcher._cache.get(source_platform, {})

Python double-underscore variables such as __cache undergo name mangling and become _UserMatcher__cache, while single-underscore _cache is merely a convention for private access and remains externally accessible. This breaks encapsulation: all related code will silently malfunction if UserMatcher later switches its cache backend to a database or Redis.

Recommendation: Add public methods to UserMatcher:

  • get_display_name(platform, user_id) -> str | None
  • search_users(name, platform) -> list[tuple[str, str]]
  • get_platform_cache(platform) -> dict[str, str] (if cache exposure is truly required)

🟡 Medium

3. Missing Unit Tests for New Functionalities

308 lines of core logic have been added (156 lines in bind_manager.py + 152 lines in verification.py), yet there is zero test coverage. The additional 432 lines in orchestrator.py only modified assertions in existing tests (QQUser:`QQUser`:) without covering any test paths for binding, unbinding, verification codes, or nickname replacement.

Recommendation: At minimum, implement tests covering:

  • BindManager: bind / unbind / get_counterpart / bidirectional mapping consistency / conflict exceptions / JSON persistence and restoration
  • VerificationManager: create / verify / code expiration / repeated failed retries / cancel
  • Orchestrator: _handle_bind_command / _handle_unbind_command / _handle_verification_reply / _resolve_author_display_name

4. Overly Permissive Name Matching Logic for @-Mention Bindings

File: orchestrator.py:836

if name == source_display or name in source_display or source_display in name:

The three-tier matching rule (exact match + bidirectional substring inclusion) introduces false positive matches. Example:

  • User A display name = "test"
  • User B display name = "test123"
  • User sends a message with @test123
  • Parsed name from text = "test123"
  • "test" in "test123" evaluates to True → incorrect match against User A

Recommendation: Prioritize exact matching, only fall back to substring matching if exact matching fails, and attach warning logs for substring matches. Alternatively, select the shortest display name among substring matches instead of returning the first hit.


5. Edge Cases in /bind Command Parsing

File: orchestrator.py:679

rest = text[6:]  # Hardcoded length of 6 for "/bind "

Two critical flaws exist:

  1. When users input /bind (trailing multiple spaces), text[6:] retains leading whitespace. While subsequent strip() resolves this, the underlying logic is fragile.
  2. Line L693 uses rest.isdigit() and assumes all numeric inputs originating from Discord are QQ IDs. Discord usernames may consist solely of digits (rare but valid), which triggers misclassification.

Recommendation: Use regex pattern /^\/bind\s+/ for prefix matching; add a from_platform parameter to _parse_bind_target to support explicit target platform declarations via Discord:Username for requests sent from Discord.


🔵 Minor

6. No Retry Limit for Verification Codes

File: verification.py:121-122
Incorrect verification code entries are returned to the pending queue, enabling unlimited retry attempts. A 6-digit numeric verification code has one million possible combinations; attackers could theoretically brute-force valid codes within 5 minutes with one network request per attempt.

Recommendation: Add an attempts: int = 0 field to the VerificationCode model. Increment attempts on each failed input, and delete the verification code entry once retries exceed 5 attempts.


7. Potential Side Effects from Orchestrator Initialization in Tests

File: tests/test_orchestrator.py:116

orch = Orchestrator(bridge_config, mock_message_store)

The default data_dir value for BridgeConfig() is "./data". Orchestrator.__init__ instantiates BindManager(data_dir="./data") based on this value. If test logic accidentally triggers _save(), directories like ./data/bindings/... will be created in the working directory.

Recommendation: Explicitly inject a custom bind_manager during test setup, or leverage pytest’s tmp_path fixture to configure a temporary data_dir.

…apsulation, test coverage, parsing, retry limit
@TaranDahl

Copy link
Copy Markdown
Collaborator
/bind | 从 QQ 私信 发起,绑定到指定 Discord 用户 -- | --

我觉得这个没必要吧?把这个去了之后上面那个匹配的问题也不用管了。
I don't think this is necessary. If we remove this, we won't have to deal with the matching issue above either.

@WhiteFeather127

Copy link
Copy Markdown
Contributor Author

All 7 review items have been addressed in commit 798345e:

🔴 Critical

  1. ✅ Race condition: Added _bind_lock = asyncio.Lock() to protect verify+bind flow in _handle_verification_reply and _handle_bind_command.
  2. _cache encapsulation: Added get_display_name(), has_user(), search_users_by_display() to UserMatcher; replaced all direct _cache access.

🟡 Medium
3. ✅ Test coverage: Added test_bind_manager.py (12 tests), test_verification.py (9 tests), and 15 new tests in test_orchestrator.py covering bind/unbind/verification/display name resolution.
4. ✅ @mention matching: search_users_by_display() does exact match first, substring fallback with warning log.
5. ✅ /bind parsing: Uses BIND_COMMAND_RE regex; _parse_bind_target accepts from_platform param.

🔵 Minor
6. ✅ Retry limit: VerificationCode.attempts field, code removed after 5 failed attempts.
7. ✅ Test side effects: bridge_config fixture uses tmp_path for data_dir.


7 项审查意见已全部修复:

🔴 严重

  1. ✅ 竞态条件:添加 _bind_lock 保护 verify+bind 整个流程
  2. _cache 封装:UserMatcher 新增公共方法,替换所有直接 _cache 访问

🟡 中等
3. ✅ 测试覆盖:新增 36 个单元测试
4. ✅ @提及匹配:先精确匹配,再模糊回退并打 warning
5. ✅ /bind 解析:改用正则,增加 from_platform 参数

🔵 轻微
6. ✅ 重试限制:超过 5 次错误删除验证码
7. ✅ 测试副作用:使用 tmp_path 设置临时 data_dir

@WhiteFeather127

Copy link
Copy Markdown
Contributor Author

/bind | 从 QQ 私信 发起,绑定到指定 Discord 用户 -- | --
我觉得这个没必要吧?把这个去了之后上面那个匹配的问题也不用管了。 I don't think this is necessary. If we remove this, we won't have to deal with the matching issue above either.

那就改单向
Then change the direction to one-way

WhiteFeather127 and others added 2 commits June 12, 2026 18:15
Comment out the _parse_bind_target path that allowed QQ users to bind
to a Discord user by display name. The original code is preserved as
comments for reference.

PR review (TaranDahl): this feature is unnecessary; removing it also
eliminates the related name-matching issue (Phobos-developers#4 in PR Phobos-developers#16 review).
Both branches added code after _on_discord_message:
- HEAD: _on_discord_dm method
- upstream/master: _last_processed_id persistence (PR Phobos-developers#18)

Kept both: persistence at end of _on_discord_message, _on_discord_dm after it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants