Conversation
Pull request overview
This PR adds operational safeguards to the Gin-based web API by introducing per-IP rate limiting, setting explicit HTTP server timeouts, and improving visibility into authentication failures via logging.
Changes:
- Add a per-IP token-bucket rate limiting middleware (+ tests) and document the new env configuration.
- Configure `http.Server` timeouts (read/write/idle) instead of using `router.Run`.
- Log JWT authorization rejections and JWT validation failures.
Reviewed changes
Copilot reviewed 7 out of 8 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| src/middleware/ratelimit.go | New per-IP rate limiter middleware with background cleanup. |
| src/middleware/ratelimit_test.go | Tests for allow/deny behavior and per-IP isolation. |
| src/middleware/jwt.go | Add logging for auth rejections and JWT validation failures. |
| src/main.go | Add rate limit env config; wrap Gin in http.Server with timeouts. |
| src/go.mod | Add golang.org/x/time dependency. |
| src/go.sum | Add checksums for golang.org/x/time. |
| README.md | Document new rate limit env vars and server timeout behavior. |
```go
if !limiter.Allow() {
	log.Printf("rate limit exceeded: ip=%s", ip)
	c.AbortWithStatusJSON(http.StatusTooManyRequests, gin.H{"error": "rate limit exceeded"})
	return
```
Logging every 429 (rate limit exceeded) can become a high-volume log source during attacks or client misbehavior, potentially causing log flooding and extra CPU/IO. Consider sampling/throttling these logs (or emitting them at a lower verbosity) while still returning the 429 response.
```go
// envOrDefaultInt returns the environment variable as an int or defaultValue when unset/invalid.
func envOrDefaultInt(key string, defaultValue int) int {
	if v := os.Getenv(key); v != "" {
		n, err := strconv.Atoi(v)
		if err != nil {
			log.Fatalf("environment variable %s must be an integer: %v", key, err)
		}
		return n
	}
	return defaultValue
}
```
The comment says this returns the default when the env var is "unset/invalid", but the implementation log.Fatalfs on invalid values. Either update the comment to match the behavior, or change the function to actually fall back to defaultValue on parse errors (and consider validating for non-negative/positive values since these are used for rate limiting).
```go
Addr:           ":" + port,
Handler:        router,
ReadTimeout:    10 * time.Second,
WriteTimeout:   65 * time.Second, // must exceed /wait max timeout (60s)
IdleTimeout:    120 * time.Second,
MaxHeaderBytes: 1 << 20, // 1 MB
```
ReadTimeout: 10s applies to reading the entire request, including the body. This can break legitimate slow/large uploads (e.g., the multipart attachment upload reads the whole file from the request body). Consider using ReadHeaderTimeout for header protection and either a larger/more configurable ReadTimeout, or leaving ReadTimeout unset and enforcing body limits separately (e.g., MaxBytesReader).
Suggested change:

```diff
 Addr:           ":" + port,
 Handler:        router,
-ReadTimeout:       10 * time.Second,
+ReadHeaderTimeout: 10 * time.Second,
 WriteTimeout:   65 * time.Second, // must exceed /wait max timeout (60s)
 IdleTimeout:    120 * time.Second,
 MaxHeaderBytes: 1 << 20, // 1 MB
```
```go
	IdleTimeout:    120 * time.Second,
	MaxHeaderBytes: 1 << 20, // 1 MB
}
if err = srv.ListenAndServe(); err != nil {
```
ListenAndServe() returns http.ErrServerClosed on a graceful shutdown. If/when shutdown handling is added, this code would Fatalf on a normal exit. Consider treating http.ErrServerClosed as a non-error (log it at Info/Debug or ignore it) and only Fatalf unexpected errors.
Suggested change:

```diff
-if err = srv.ListenAndServe(); err != nil {
+if err = srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
```
```
golang.org/x/sync v0.19.0 // indirect
golang.org/x/sys v0.41.0 // indirect
golang.org/x/text v0.34.0 // indirect
golang.org/x/time v0.15.0 // indirect
```
golang.org/x/time is imported directly by this module (via src/middleware/ratelimit.go), so it shouldn’t be marked // indirect. Running go mod tidy should update this entry to a direct requirement.
```go
func (rl *rateLimiter) getVisitor(ip string) *rate.Limiter {
	val, ok := rl.visitors.Load(ip)
	if ok {
		v := val.(*visitor)
		v.lastSeen = time.Now()
		return v.limiter
	}
	limiter := rate.NewLimiter(rl.rps, rl.burst)
	rl.visitors.Store(ip, &visitor{limiter: limiter, lastSeen: time.Now()})
	return limiter
```
visitor.lastSeen is read/written concurrently by request handlers (via getVisitor) and by the background cleanup() goroutine, but it’s a plain time.Time with no synchronization. This will cause data races under load (and can lead to incorrect eviction). Consider making lastSeen atomic (e.g., store UnixNano in an atomic.Int64) or protecting the visitor fields with a mutex, and update both getVisitor and cleanup to use the same synchronization strategy.
```go
val, ok := rl.visitors.Load(ip)
if ok {
	v := val.(*visitor)
	v.lastSeen = time.Now()
	return v.limiter
}
limiter := rate.NewLimiter(rl.rps, rl.burst)
rl.visitors.Store(ip, &visitor{limiter: limiter, lastSeen: time.Now()})
return limiter
```
getVisitor uses a Load then Store pattern which can create multiple limiters for the same IP under concurrent first access (last writer wins). That can allow extra requests through and increases memory churn. Using LoadOrStore (and only creating a limiter when needed) avoids this race and keeps per-IP state consistent.
Suggested change:

```diff
-val, ok := rl.visitors.Load(ip)
-if ok {
-	v := val.(*visitor)
-	v.lastSeen = time.Now()
-	return v.limiter
-}
-limiter := rate.NewLimiter(rl.rps, rl.burst)
-rl.visitors.Store(ip, &visitor{limiter: limiter, lastSeen: time.Now()})
-return limiter
+now := time.Now()
+val, ok := rl.visitors.Load(ip)
+if ok {
+	v := val.(*visitor)
+	v.lastSeen = now
+	return v.limiter
+}
+newVisitor := &visitor{
+	limiter:  rate.NewLimiter(rl.rps, rl.burst),
+	lastSeen: now,
+}
+actual, loaded := rl.visitors.LoadOrStore(ip, newVisitor)
+if loaded {
+	v := actual.(*visitor)
+	v.lastSeen = now
+	return v.limiter
+}
+return newVisitor.limiter
```
```go
func NewRateLimiter(rps float64, burst int) gin.HandlerFunc {
	rl := &rateLimiter{
		rps:   rate.Limit(rps),
		burst: burst,
	}
	go rl.cleanup()
	return rl.handler
```
Starting an unbounded background goroutine inside NewRateLimiter means each instantiation of the middleware leaks a goroutine for the lifetime of the process (tests create several, and future refactors might create multiple routers). Consider using a time.Ticker with a stop channel/context, or exposing a Close()/Stop() so the cleanup loop can be terminated when the server shuts down.
Pull request overview
Copilot reviewed 9 out of 10 changed files in this pull request and generated 8 comments.
Comments suppressed due to low confidence (1)
src/handlers/attachments.go:140
- This check (`currentTotal + written > MaxMsgSize`) is not atomic with the subsequent INSERT/UPDATE. Concurrent attachment uploads to the same message can both pass the check and exceed the limit. Consider performing the size check and attachment upsert in a DB transaction with appropriate locking (e.g., lock the msg row / serialize per msg_id) or maintaining/enforcing a total-size constraint in the database.
```go
if currentTotal+written > h.MaxMsgSize {
	_ = os.Remove(finalPath)
	c.JSON(http.StatusRequestEntityTooLarge, gin.H{"error": "total message size exceeds limit"})
	return
}
// Persist to DB.
_, err = h.DB.Pool.Exec(ctx,
	`INSERT INTO msg_attachment (msg_id, filename, filesize, filepath)
	 VALUES ($1, $2, $3, $4)
	 ON CONFLICT (msg_id, filename) DO UPDATE SET filesize=$3, filepath=$4`,
	msgID, intendedFilename, written, finalPath,
)
```
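One way to close the check-then-insert race is to take a row lock on the parent message inside a transaction, so concurrent uploads for the same msg_id serialize. A sketch under assumptions — the parent table name `msg` and the use of `FOR UPDATE` are illustrative, not taken from the PR:

```sql
BEGIN;

-- Serialize concurrent attachment writes for this message
-- (assumes the parent table is named "msg").
SELECT id FROM msg WHERE id = $1 FOR UPDATE;

-- Re-check the total under the lock; columns taken from the quoted query.
SELECT m.size + COALESCE((SELECT SUM(filesize) FROM msg_attachment WHERE msg_id = m.id), 0)
FROM msg m WHERE m.id = $1;

-- Only if the total is still within MaxMsgSize:
INSERT INTO msg_attachment (msg_id, filename, filesize, filepath)
VALUES ($1, $2, $3, $4)
ON CONFLICT (msg_id, filename)
DO UPDATE SET filesize = EXCLUDED.filesize, filepath = EXCLUDED.filepath;

COMMIT;
```

Any second upload for the same message then blocks on the `FOR UPDATE` until the first transaction commits, so its size check sees the updated total.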
```go
if int64(len(msg.Data)) > h.MaxDataSize {
	c.JSON(http.StatusRequestEntityTooLarge, gin.H{"error": "message data exceeds maximum size"})
	return
}
```
Create enforces MaxDataSize but not MaxMsgSize. If MaxMsgSize is configured lower than MaxDataSize, this allows creating messages whose data alone exceeds the total-message limit. Add a MaxMsgSize check here as well (and ideally ensure MaxDataSize <= MaxMsgSize at startup).
Suggested change:

```diff
 	}
 }
+if int64(len(msg.Data)) > h.MaxMsgSize {
+	c.JSON(http.StatusRequestEntityTooLarge, gin.H{"error": "message exceeds maximum size"})
+	return
+}
```
```go
Addr:              ":" + port,
Handler:           router,
ReadHeaderTimeout: 10 * time.Second,
WriteTimeout:      65 * time.Second, // must exceed /wait max timeout (60s)
```
WriteTimeout: 65s can prematurely terminate slow clients when downloading message data/attachments (up to ~20MB by default), since WriteTimeout applies to the entire response write. Consider increasing it substantially or disabling it and relying on per-handler timeouts/limits; alternatively, set different timeouts for long-poll vs file-download endpoints.
Suggested change:

```diff
-WriteTimeout: 65 * time.Second, // must exceed /wait max timeout (60s)
+WriteTimeout: 0, // disabled: a server-wide write timeout can prematurely terminate slow long-poll/download responses
```
```go
if err = srv.ListenAndServe(); err != nil {
	log.Fatalf("server error: %v", err)
}
```
ListenAndServe returns http.ErrServerClosed on a normal shutdown; treating all errors as fatal makes graceful shutdowns look like crashes. Consider handling http.ErrServerClosed explicitly (only fatal on other errors).
```go
Unauthorized: func(c *gin.Context, code int, message string) {
	log.Printf("auth failure: ip=%s code=%d message=%s", c.ClientIP(), code, message)
	c.JSON(code, gin.H{"error": message})
```
The auth failure logs include untrusted strings (addr/message) with %s, which can allow log forging via embedded newlines and can make parsing harder. Consider logging these values quoted/escaped (e.g., %q) and/or sanitizing before logging.
```go
func (rl *rateLimiter) getVisitor(ip string) *rate.Limiter {
	now := time.Now().UnixNano()
	v := &visitor{limiter: rate.NewLimiter(rl.rps, rl.burst)}
	v.lastSeen.Store(now)

	if actual, loaded := rl.visitors.LoadOrStore(ip, v); loaded {
		v = actual.(*visitor)
		v.lastSeen.Store(now)
	}
```
getVisitor allocates a new visitor (and rate.Limiter) on every request even when the IP already exists, because the allocation happens before LoadOrStore. This can be a significant overhead under load. Consider Load-first and only creating a new limiter when missing (or on the rare race where LoadOrStore loses).
```go
// Check total message size (data + all attachments including this one).
var currentTotal int64
if err = h.DB.Pool.QueryRow(ctx,
	`SELECT m.size + COALESCE((SELECT SUM(filesize) FROM msg_attachment WHERE msg_id = m.id), 0)
```
The total-size enforcement uses m.size from the msg row, but msg.size is client-controlled (set from msg.Size in Create/Update). This allows bypassing MaxMsgSize by sending a small size value. Compute and persist message size server-side (e.g., from len(data)/actual stored bytes) and base this query on that trusted value.
Suggested change:

```diff
-// Check total message size (data + all attachments including this one).
-var currentTotal int64
-if err = h.DB.Pool.QueryRow(ctx,
-	`SELECT m.size + COALESCE((SELECT SUM(filesize) FROM msg_attachment WHERE msg_id = m.id), 0)
+// Check total message size using trusted server-side sizes
+// (stored message data bytes + all attachment bytes including this one).
+var currentTotal int64
+if err = h.DB.Pool.QueryRow(ctx,
+	`SELECT COALESCE(octet_length(m.data), 0) +
+	       COALESCE((SELECT SUM(filesize) FROM msg_attachment WHERE msg_id = m.id), 0)
```
```go
if int64(len(msg.Data)) > h.MaxDataSize {
	c.JSON(http.StatusRequestEntityTooLarge, gin.H{"error": "message data exceeds maximum size"})
	return
}

// Check total message size (data + existing attachments).
var attachTotal int64
if err := h.DB.Pool.QueryRow(ctx,
	"SELECT COALESCE(SUM(filesize), 0) FROM msg_attachment WHERE msg_id = $1",
	msgID,
).Scan(&attachTotal); err != nil {
	log.Printf("update message %d: total size check: %v", msgID, err)
	c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to check message size"})
	return
}
if int64(len(msg.Data))+attachTotal > h.MaxMsgSize {
	c.JSON(http.StatusRequestEntityTooLarge, gin.H{"error": "total message size exceeds limit"})
	return
}
```
MaxMsgSize relies on attachment sizes from DB plus message data size, but the code still persists msg.Size from the request body into msg.size (client-controlled). To avoid inconsistencies (and to support attachment total-size enforcement), compute message size server-side from len(msg.Data) (or the saved file) and store that value, rather than trusting msg.Size.
```go
// Check total message size (data + existing attachments).
var attachTotal int64
if err := h.DB.Pool.QueryRow(ctx,
	"SELECT COALESCE(SUM(filesize), 0) FROM msg_attachment WHERE msg_id = $1",
	msgID,
).Scan(&attachTotal); err != nil {
	log.Printf("update message %d: total size check: %v", msgID, err)
	c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to check message size"})
	return
}
if int64(len(msg.Data))+attachTotal > h.MaxMsgSize {
	c.JSON(http.StatusRequestEntityTooLarge, gin.H{"error": "total message size exceeds limit"})
	return
}
```
The total-size check reads SUM(filesize) and then proceeds to update the message. Without a transaction/lock, concurrent attachment uploads can change the total between the check and the update, allowing the combined size to exceed MaxMsgSize. Consider doing the size check + update in a transaction that also prevents concurrent attachment changes for the same msg_id.
No description provided.