Summary
Add a new serve-ltml output mode that stores each rendered PDF in an Amazon S3-compatible object store and returns the object location instead of streaming the PDF bytes directly in the HTTP response.
The new mode should work with AWS S3 and S3-compatible providers such as MinIO, Cloudflare R2, Backblaze B2 S3, etc., without disturbing the existing direct-PDF response mode.
Goals
- Preserve the current default behavior: `POST /render` returns `application/pdf` unless object-storage mode is explicitly enabled.
- Add a configurable object-storage mode that uploads the finished PDF to an S3-compatible bucket.
- Return a stable machine-readable response describing where the PDF was stored.
- Cover configuration, credentials, object naming, visibility, expiration/lifetime, cleanup, and failure behavior.
- Keep request-scoped upload assets and the base-path overlay behavior unchanged.
Non-Goals
- Replacing the current direct streaming mode.
- Adding a provider-specific implementation that only works with AWS.
- Building a background job queue in the first iteration.
- Managing bucket lifecycle rules automatically via cloud APIs in the first iteration.
Proposed API Shape
Keep POST /render as the main endpoint and introduce an alternate response mode:
- Default mode: unchanged; response is the rendered PDF stream.
- Object-storage mode: response is JSON, for example:

```json
{
  "storage": "s3",
  "bucket": "example-bucket",
  "key": "renders/2026/03/28/4f3d.../output.pdf",
  "url": "https://storage.example.com/example-bucket/renders/.../output.pdf",
  "expires_at": "2026-03-29T12:34:56Z"
}
```

Questions to settle in implementation/design review:
- Should the mode be enabled globally at process startup, or selectable per request?
- If per request, should selection use `Accept: application/json`, a query parameter, or an explicit request field/header?
- Should the response include a direct object URL, a presigned URL, or only bucket/key metadata?
- Should the server still support inline PDF streaming when storage upload is configured but a request opts out?
My recommendation: start with a server-level config switch for simplicity, but keep the response schema general enough that per-request selection can be added later without breaking clients.
Configuration Plan
Add object-storage configuration as flags and environment variables, following the existing namsral/flag pattern.
Suggested settings:
- `-output-mode` / `OUTPUT_MODE`: values `inline` (default) and `s3`.
- `-s3-endpoint` / `S3_ENDPOINT`: required for most S3-compatible providers; optional for AWS if region-based endpoint resolution is used.
- `-s3-region` / `S3_REGION`: required for AWS-style signing.
- `-s3-bucket` / `S3_BUCKET`: required when `output-mode=s3`.
- `-s3-prefix` / `S3_PREFIX`: optional object key prefix such as `renders/`.
- `-s3-path-style` / `S3_PATH_STYLE`: boolean for providers that require path-style addressing.
- `-s3-public-base-url` / `S3_PUBLIC_BASE_URL`: optional externally reachable base URL to use in the returned location instead of deriving it from the SDK endpoint.
- `-s3-presign-ttl` / `S3_PRESIGN_TTL`: optional duration; when set, return a presigned GET URL that expires after the given TTL.
- `-s3-server-side-encryption` / `S3_SERVER_SIDE_ENCRYPTION`: optional value such as `AES256` or a provider-specific mode.
- `-s3-storage-class` / `S3_STORAGE_CLASS`: optional storage tier.
- `-s3-metadata-*`: optional future extension; not required in the first pass.
Validation rules:
- `output-mode` must be validated at startup.
- When `output-mode=s3`, require bucket, region/signing configuration, and whatever endpoint settings are necessary for the chosen provider.
- Reject invalid combinations, such as setting both `s3-public-base-url` and `s3-presign-ttl`, if the implementation cannot safely honor both.
- Log the chosen mode and non-secret storage settings at startup.
- Never log secrets.
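The startup rules above could look roughly like this. The `Config` field names here are illustrative stand-ins, not the project's actual `Config`, and the mutual-exclusion rule assumes the stricter interpretation (reject the combination outright):

```go
package main

import (
	"errors"
	"fmt"
)

// Config holds the subset of settings relevant to output mode.
// Field names are illustrative, not the project's real Config.
type Config struct {
	OutputMode      string // "inline" or "s3"
	S3Bucket        string
	S3Region        string
	S3Endpoint      string
	S3PublicBaseURL string
	S3PresignTTL    string // duration string; empty means disabled
}

// validateOutputMode applies the startup validation rules listed above.
func validateOutputMode(c Config) error {
	switch c.OutputMode {
	case "inline":
		return nil
	case "s3":
		if c.S3Bucket == "" {
			return errors.New("output-mode=s3 requires -s3-bucket")
		}
		if c.S3Region == "" && c.S3Endpoint == "" {
			return errors.New("output-mode=s3 requires -s3-region or -s3-endpoint")
		}
		if c.S3PublicBaseURL != "" && c.S3PresignTTL != "" {
			return errors.New("-s3-public-base-url and -s3-presign-ttl cannot both be set")
		}
		return nil
	default:
		return fmt.Errorf("invalid output-mode %q (want inline or s3)", c.OutputMode)
	}
}
```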
Authorization And Credential Strategy
Use the AWS SDK for Go v2 and rely on its standard credential/provider chain where possible.
Credential sources to support:
- Environment variables (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_SESSION_TOKEN`).
- Shared AWS config/credentials files.
- IAM role / instance profile / ECS/EKS task role when running on AWS.
- Static credentials for non-AWS S3-compatible providers via the same environment variables.
Implementation notes:
- Prefer the default credential chain before introducing project-specific secret flags.
- Avoid adding custom `-s3-access-key`/`-s3-secret-key` flags unless there is a strong operational need; flags are easier to leak via process listings and shell history.
- If explicit credential flags are ever added, document their risks and keep env/config-chain auth as the preferred path.
- Ensure the issue covers signature compatibility for custom endpoints and path-style addressing.
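Client construction under this strategy might look like the sketch below, relying on the SDK's default credential chain. `newS3Client` is a hypothetical helper; `BaseEndpoint` and `UsePathStyle` are the SDK v2 client options for custom endpoints and path-style addressing (requires the `github.com/aws/aws-sdk-go-v2` modules, so this is not a standalone program):

```go
package storage

import (
	"context"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

// newS3Client builds an S3 client via the standard credential chain:
// env vars, shared config/credentials files, then IAM/instance/task
// roles. An empty endpoint keeps AWS's own endpoint resolution.
func newS3Client(ctx context.Context, region, endpoint string, pathStyle bool) (*s3.Client, error) {
	cfg, err := config.LoadDefaultConfig(ctx, config.WithRegion(region))
	if err != nil {
		return nil, err
	}
	return s3.NewFromConfig(cfg, func(o *s3.Options) {
		if endpoint != "" {
			o.BaseEndpoint = aws.String(endpoint) // MinIO, R2, B2, etc.
		}
		o.UsePathStyle = pathStyle // required by some S3-compatible providers
	}), nil
}
```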
Object Naming And Response Contract
Define a predictable but collision-resistant key layout. Example:
`<prefix>/<yyyy>/<mm>/<dd>/<request-id>/output.pdf`
Requirements:
- Keys must be unique across concurrent requests.
- Returned metadata must include at least bucket and key.
- If a URL is returned, specify whether it is:
- a durable public URL,
- a derived internal endpoint URL, or
- a time-limited presigned URL.
- Set object `Content-Type` to `application/pdf`.
- Consider setting `Content-Disposition` metadata if the object will be downloaded by browsers.
- Include `expires_at` only when the returned URL or object retention policy actually has a meaningful expiration.
Lifetime Management And Cleanup
This needs explicit design because the current server model is request-scoped and ephemeral, while object storage is durable by default.
Plan:
- Treat upload to S3 as the final output handoff after the PDF file has been fully rendered locally.
- Keep the existing request temp-directory cleanup exactly as it is today.
- Add documentation for expected bucket lifecycle management.
- Support one of these initial lifetime approaches:
- No automatic deletion by the server; rely on bucket lifecycle rules configured out of band.
- Optional presigned URL expiration only; object may outlive the URL.
- Optional prefix dedicated to ephemeral renders so operators can attach lifecycle expiration rules.
My recommendation for v1:
- Do not have `serve-ltml` delete objects itself.
- Document that operators should attach lifecycle rules to the configured prefix/bucket.
- Optionally return `expires_at` only for presigned URLs, not as a promise that the object itself will be deleted at that time.
This keeps the server stateless and avoids hidden cleanup jobs or partially reliable delete-on-timer behavior.
Failure Handling
Define behavior for each phase:
- LTML parse/render failure: same `400`/`500` behavior as today; no object should be created.
- Upload failure after successful render: return `500 Internal Server Error`; do not return partial location metadata.
- Response serialization failure after successful upload: the object may already exist; log enough context to find it.
- If multipart upload is used for large files in the future, abort failed multipart uploads cleanly.
Implementation Plan
- Add a small storage abstraction in `cmd/serve-ltml` for render outputs.
  - Example: `type renderSink interface { Store(ctx context.Context, pdfPath string) (RenderLocation, error) }`
  - Provide an inline sink for the current behavior and an S3 sink for object storage.
- Extend `Config` with validated output-mode and S3 settings.
- Update startup/config docs in `cmd/serve-ltml/README.md`.
- Refactor the handler/render pipeline so rendering produces a finished temp PDF file before the final delivery step.
  - This mostly matches current behavior already.
- Implement an S3 sink using the AWS SDK for Go v2.
  - Configure custom endpoint resolution for S3-compatible providers.
  - Support path-style mode.
  - Set `Content-Type: application/pdf`.
- Define the JSON response schema for storage mode.
  - Decide whether to return raw bucket/key only, bucket/key plus URL, or bucket/key plus an optional presigned URL.
- Add tests for config validation, key generation, JSON responses, and upload failure paths.
  - Add an integration-style test seam using a fake uploader rather than requiring live cloud credentials.
- Update `cmd/render-ltml` documentation if remote clients need to understand JSON location responses.
Testing Checklist
- Config tests for valid and invalid `output-mode=s3` combinations.
- Unit tests for object key generation and prefix handling.
- Handler test proving default mode still returns `application/pdf`.
- Handler test proving S3 mode returns JSON with the expected fields.
- Handler test proving render failures do not attempt upload.
- Handler test proving upload failures return `500`.
- Tests verifying request temp directories are still cleaned up in both modes.
- Tests verifying `Content-Type` and optional metadata on uploaded objects.
- Tests for path-style/custom-endpoint configuration using a fake or stubbed uploader.
Documentation Checklist
- Update `cmd/serve-ltml/README.md` with the new mode, flags/env vars, and response examples.
- Document credential sourcing and recommend the AWS default credential chain.
- Document the distinction between object expiration and presigned URL expiration.
- Document operator expectations around bucket lifecycle rules and retention.
- Document any compatibility impact for `render-ltml -submit` or other clients.
Open Questions
- Should object-storage mode be process-wide or request-selectable?
- Should the server return bucket/key only, or also a usable URL?
- If a usable URL is returned, should it be public or presigned?
- Should there be a configurable object naming template, or is prefix + generated request ID enough?
- Do we want to expose extra upload headers such as cache control or content disposition in v1?
- Should `render-ltml -submit` eventually grow a mode that prints the returned JSON location instead of expecting PDF bytes?
Acceptance Criteria
- `serve-ltml` can be started in an explicit S3 output mode without breaking the existing inline-PDF mode.
- A successful render in S3 mode uploads exactly one PDF object with the correct content type and a unique key.
- The HTTP response returns machine-readable location metadata.
- Credentials are sourced without introducing insecure defaults.
- The temp-file lifecycle remains request-scoped and cleaned up locally.
- The retention/lifetime story is clearly documented for operators.
- `go build ./...` and `go test ./...` remain green.