Bug Description
The AI chatbot gives users no way to rate the quality of an answer. Because feedback cannot be collected on individual responses, the team cannot tell which answers are helpful, hallucinated, or outdated. Issue #161 mentions a feedback loop but does not cover per-response granularity.
Competitor Benchmark
- ChatGPT: Thumbs up/down on every response + comment box
- Claude: 👍 / 👎 with optional detailed feedback
- Perplexity: Rate answer + flag incorrect information
- Intercom: Thumbs up/down with follow-up question
- Zendesk AI: Satisfaction rating per interaction
Affected Users
- Users receiving incorrect or unhelpful answers
- Users who want to flag outdated information
- Documentation team trying to improve RAG quality
Blast Radius
- Primary: Chatbot users receiving poor answers
- Secondary: Documentation team maintaining RAG pipeline
- Impact of a fix: Data-driven improvements, fewer hallucinations, better user satisfaction
Root Cause Analysis
No feedback mechanism exists at the response level. Issue #161 (Feedback Loop) describes the high-level architecture but does not specify a per-response UI. Without granular feedback, the team cannot identify which documents or retrievals are problematic.
Proposed Solution
Option A: Thumbs Up/Down + Comment (Recommended)
Add to each chatbot response:
[Response text]
Was this helpful? 👍 👎
[Optional: Tell us more... _______]
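A rating from this widget could be stored as one small record per response. The following is a minimal sketch rather than an existing API: the `ResponseFeedback` class, its field names, the `helpful_rate` helper, and the 1,000-character comment cap are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

MAX_COMMENT_LENGTH = 1000  # assumed cap; this issue does not specify one


@dataclass(frozen=True)
class ResponseFeedback:
    """One rating attached to a single chatbot response (hypothetical schema)."""

    response_id: str                # identifies the rated response
    helpful: bool                   # True for thumbs up, False for thumbs down
    comment: Optional[str] = None   # optional "Tell us more" text
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def __post_init__(self) -> None:
        # Reject records that cannot be tied back to a response.
        if not self.response_id:
            raise ValueError("response_id must not be empty")
        if self.comment is not None and len(self.comment) > MAX_COMMENT_LENGTH:
            raise ValueError("comment exceeds MAX_COMMENT_LENGTH")


def helpful_rate(feedback: list[ResponseFeedback]) -> float:
    """Return the share of thumbs-up ratings, or 0.0 when there is no feedback."""
    if not feedback:
        return 0.0
    return sum(1 for f in feedback if f.helpful) / len(feedback)
```

Keying each record on a response identifier is what provides the per-response granularity this issue asks for; how that identifier is generated is left open here.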
Option B: Flagging System
Allow users to flag specific issues:
- ❌ Incorrect information
- ❌ Outdated content
- ❌ Missing context
- ❌ Hallucination
- ✅ Helpful
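The flag categories above map naturally onto an enumeration, and counting negative flags per source document is one way the documentation team might spot problematic content. This is a rough sketch under stated assumptions: the `FeedbackFlag` values, the `count_problem_flags` helper, and the idea that each flagged response carries the ID of the document it was retrieved from are hypothetical, not part of the current system.

```python
from collections import Counter
from enum import Enum


class FeedbackFlag(Enum):
    """Flag categories taken from the Option B list (string values are assumed)."""

    INCORRECT = "incorrect_information"
    OUTDATED = "outdated_content"
    MISSING_CONTEXT = "missing_context"
    HALLUCINATION = "hallucination"
    HELPFUL = "helpful"


def count_problem_flags(flagged: list[tuple[str, FeedbackFlag]]) -> Counter:
    """Count negative flags per source document.

    Each item pairs a hypothetical source document ID (the document
    retrieved for the flagged response) with the flag the user chose.
    """
    counts: Counter = Counter()
    for document_id, flag in flagged:
        # HELPFUL is the only positive flag, so everything else counts as a problem.
        if flag is not FeedbackFlag.HELPFUL:
            counts[document_id] += 1
    return counts
```

Calling `count_problem_flags(...).most_common(5)` would then list the five documents with the most negative flags, which could serve as a starting point for review.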
Additional Fixes Required
Acceptance Criteria
Raw Context
Without per-response feedback, the team is flying blind on chatbot quality. One bad answer about staking could cost users money, but there's no way to know it's happening.