Bug: download-db-log-file-portion auto-pagination produces corrupted output due to overlapping page content #10319

@crnhrv

Disclaimer: This issue was drafted with assistance from Claude Code (Anthropic). The bug reproduction and data analysis were performed by humans; Claude helped structure the write-up.

Describe the bug

aws rds download-db-log-file-portion with auto-pagination produces corrupted output when downloading PostgreSQL csvlog files from Aurora. The RDS API returns overlapping content at page boundaries (3-4KB of duplicated bytes per boundary), and the CLI's auto-paginator does not correctly deduplicate this overlap, resulting in malformed CSV records.

Reproduction Steps

  1. Have an Aurora PostgreSQL instance with auto_explain enabled (log_min_duration = 1, log_analyze = on) and a CSV log file large enough to require multiple pages (~1MB+). Enabling pgaudit with log = all and log_catalog = on significantly increases the likelihood of hitting this bug, as it generates many multiline csvlog entries with embedded commas and escaped quotes.

  2. Download a csvlog file with auto-pagination:

aws rds download-db-log-file-portion \
  --db-instance-identifier <aurora-writer-instance> \
  --log-file-name error/postgresql.log.2026-05-18-1400.csv \
  --starting-token 0 \
  --output json > output.json
  3. Parse the CSV and flag records whose column count is not 26 (PG 14+ csvlog has exactly 26 columns):
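import csv
import io
import json

# Load the auto-paginated CLI output and pull out the raw log text.
with open('output.json') as f:
    data = json.load(f)['LogFileData']
reader = csv.reader(io.StringIO(data))
for i, row in enumerate(reader):
    # 23 and 24 are the csvlog column counts of older PostgreSQL versions and are tolerated;
    # rows with fewer than 23 columns are skipped rather than reported.
    if len(row) not in (23, 24, 26) and len(row) >= 23:
        print(f"Row {i}: {len(row)} columns (expected 26)")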

Observed behavior

The auto-paginated output contains corrupted CSV records with wrong column counts (e.g. 32, 38 columns instead of 26). In our testing: 5 corrupted records out of 32,820 in a single 26MB log file.

Root cause: Each raw API page overlaps with the previous page by 3,000-4,000 bytes. This can be verified by downloading pages individually with --no-paginate and --marker:

# Page 0
aws rds download-db-log-file-portion \
  --db-instance-identifier <instance> \
  --log-file-name <logfile> \
  --marker 0 --no-paginate --output json > page0.json

# Page 1 (use Marker from page0 response)
aws rds download-db-log-file-portion \
  --db-instance-identifier <instance> \
  --log-file-name <logfile> \
  --marker <marker-from-page0> --no-paginate --output json > page1.json
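
For files with many pages, the manual steps above can be scripted. The following is a minimal sketch, assuming boto3's RDS client and its `download_db_log_file_portion` call (the same API operation the CLI command wraps). The helper name `fetch_raw_pages` is hypothetical, and the loop assumes the response carries `LogFileData`, `Marker` and `AdditionalDataPending` as documented for that operation. It performs no deduplication, so each yielded page is exactly what the API returned.

```python
from typing import Any, Iterator


def fetch_raw_pages(client: Any, instance_id: str, log_file_name: str) -> Iterator[str]:
    """Yield the raw LogFileData of each page, following Marker by hand.

    `client` is expected to behave like boto3.client("rds"); any object with a
    compatible download_db_log_file_portion method works (useful for testing).
    """
    marker = "0"  # same starting point as `--marker 0` in the commands above
    while True:
        resp = client.download_db_log_file_portion(
            DBInstanceIdentifier=instance_id,
            LogFileName=log_file_name,
            Marker=marker,
        )
        # Keep the page untouched so overlaps can be inspected afterwards.
        yield resp.get("LogFileData") or ""
        if not resp.get("AdditionalDataPending"):
            break
        marker = resp["Marker"]
```

With real credentials this would be called as `list(fetch_raw_pages(boto3.client("rds"), "<instance>", "<logfile>"))`, which needs `import boto3`.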

Comparing the last N bytes of page 0 with the first N bytes of page 1 shows an exact byte-for-byte overlap of ~3,000-3,700 characters at every page boundary. In our test with a 26MB file (34 pages), all 33 page boundaries had overlaps.
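
The boundary comparison described above can be sketched as a small helper. This is an illustration only: `overlap_len` is a hypothetical name, the 8192-character search window is an assumption chosen to comfortably cover the observed overlap, and the two sample strings are synthetic rather than real log data.

```python
def overlap_len(prev: str, nxt: str, max_check: int = 8192) -> int:
    """Length of the longest suffix of `prev` that is also a prefix of `nxt`.

    Only overlaps up to `max_check` characters are considered.
    """
    limit = min(len(prev), len(nxt), max_check)
    # Try the longest candidate first so the first hit is the maximum overlap.
    for n in range(limit, 0, -1):
        if prev.endswith(nxt[:n]):
            return n
    return 0


# Synthetic pages: the second page repeats the tail of the first one.
shared = "20,row1-end\nrow2,partial"
page0 = "row1,594" + shared
page1 = shared + "-rest\nrow3\n"
print("overlap characters:", overlap_len(page0, page1))
```

For real data, `prev` and `nxt` would be the `LogFileData` strings loaded from `page0.json` and `page1.json`.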

The overlap starts mid-field (e.g. a port number 59420 is split as 594|20 at the page boundary, then the continuation page duplicates from 20 onwards for ~3KB). The CLI's auto-paginator does some deduplication (the auto-paginated output is 832KB smaller than raw concatenation of 34 pages), but it doesn't fully handle the overlap, leaving corrupted records in the output.
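
As a possible workaround while the paginator behavior is investigated, raw pages fetched with --no-paginate can be stitched by trimming the repeated prefix of each page. The sketch below is not the CLI's paginator logic; `stitch_pages` and `_overlap_len` are hypothetical names, and the approach assumes that the longest suffix/prefix match at a boundary is the API-induced overlap. If a log genuinely repeats content right at a page boundary, this would trim too much.

```python
from typing import Iterable, List


def _overlap_len(prev: str, nxt: str, max_check: int) -> int:
    """Longest suffix of `prev` that is also a prefix of `nxt` (capped)."""
    limit = min(len(prev), len(nxt), max_check)
    for n in range(limit, 0, -1):
        if prev.endswith(nxt[:n]):
            return n
    return 0


def stitch_pages(pages: Iterable[str], max_check: int = 8192) -> str:
    """Concatenate pages, dropping the part of each page already emitted."""
    parts: List[str] = []
    tail = ""  # last `max_check` characters emitted so far
    for page in pages:
        n = _overlap_len(tail, page, max_check)
        fresh = page[n:]  # content not yet present in the output
        parts.append(fresh)
        tail = (tail + fresh)[-max_check:]
    return "".join(parts)
```

The result can then be run through the same column-count check from the reproduction steps to see whether the malformed records disappear.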

Expected behavior

Auto-paginated output should be identical to the original log file content with no duplicated or corrupted records.

AWS CLI version

aws-cli/2.34.30 Python/3.13.3 Darwin/25.4.0 source/arm64

Environment details

  • Aurora PostgreSQL 16.4
  • csvlog format with auto_explain (log_analyze=on, log_format=json) producing multiline records
  • Also reproducible with pgaudit enabled (log = all, log_catalog = on), which makes the bug much more likely by generating high volumes of multiline AUDIT log entries with embedded commas and escaped quotes

Labels

bug, investigating, p3, rds