Add AllowNetwork and LTML error policy for URL-backed src assets #36

@rowland

Problem

Today src-backed LTML assets are effectively local/asset-FS only. There is no way to explicitly allow remote http:// or https:// asset loading, even in environments where that is desirable and safe.

Separately, LTML does not yet have a clear document-level error policy for asset-loading failures. Missing local files and future remote fetch failures need a consistent way to decide whether rendering should fail immediately or continue in a best-effort mode.

We already have fluent document configuration in pdf.DocWriter for features like CompressPages(bool), and LTML document attributes already map into those writer options. This new behavior fits that same pattern.

Goal

Add:

  • a fluent AllowNetwork(bool) *DocWriter setter
  • a corresponding LTML document attribute such as allow-network="true"
  • URL-backed src support for http and https when network access is explicitly enabled
  • download-to-temp-file behavior inside the sandbox writable area
  • a document-level errors attribute that controls whether asset-loading failures are strict or best-effort

The intended defaults are conservative:

  • network access is disabled unless explicitly enabled
  • error mode defaults to best-effort rather than hard-fail

Proposed plan

1. Add writer-level configuration

Follow the existing fluent setter pattern used by CompressPages.

Add to pdf.DocWriter:

  • AllowNetwork(bool) *DocWriter
  • internal state recording whether remote asset fetches are allowed

If it fits better architecturally, the actual remote-loading implementation may live above pdf in LTML, but the writer-facing configuration should still exist so the document pipeline has a consistent place to carry this setting.
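
A minimal sketch of the setter, assuming a simplified stand-in for pdf.DocWriter. The allowNetwork and compressPages field names and the NetworkAllowed accessor are hypothetical; only the AllowNetwork(bool) *DocWriter and CompressPages(bool) signatures come from this issue.

```go
package main

// DocWriter is a simplified stand-in for pdf.DocWriter. The real type has
// many more fields; only what this sketch needs is shown.
type DocWriter struct {
	compressPages bool // assumed field name
	allowNetwork  bool // assumed field name; zero value keeps network off
}

// CompressPages mirrors the existing fluent setter pattern.
func (dw *DocWriter) CompressPages(value bool) *DocWriter {
	dw.compressPages = value
	return dw
}

// AllowNetwork records whether remote asset fetches are permitted and
// returns the writer so calls can be chained.
func (dw *DocWriter) AllowNetwork(value bool) *DocWriter {
	dw.allowNetwork = value
	return dw
}

// NetworkAllowed is a hypothetical accessor that the document pipeline
// could consult before fetching a URL-backed src.
func (dw *DocWriter) NetworkAllowed() bool {
	return dw.allowNetwork
}
```

Because the zero value of the flag is false, a writer that never calls AllowNetwork stays local-only, which matches the opt-in default proposed above.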

2. Add LTML document attributes

Extend StdDocument so it can parse and apply:

  • allow-network="true|false"
  • errors="strict|best-effort" (or equivalent names)

Suggested semantics:

  • allow-network defaults to false
  • errors defaults to best-effort
  • best-effort means: log/record the problem, omit the failed asset, and continue rendering when possible
  • strict means: return an error and fail rendering for fetch/load failures like missing files, unreadable URLs, HTTP failures, or temp-file write failures
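
The attribute semantics above could be parsed roughly as follows. This is a sketch only: ErrorMode, ParseErrorMode, and ParseAllowNetwork are hypothetical names, and how StdDocument actually receives attribute values is not shown here.

```go
package main

import "fmt"

// ErrorMode is a hypothetical enum for the document-level errors attribute.
type ErrorMode int

const (
	BestEffort ErrorMode = iota // zero value, so it is the default
	Strict
)

// ParseErrorMode maps the errors attribute to a mode. An absent (empty)
// attribute falls back to best-effort.
func ParseErrorMode(value string) (ErrorMode, error) {
	switch value {
	case "", "best-effort":
		return BestEffort, nil
	case "strict":
		return Strict, nil
	}
	return BestEffort, fmt.Errorf("errors: unknown mode %q", value)
}

// ParseAllowNetwork maps the allow-network attribute to a bool. An absent
// (empty) attribute falls back to false.
func ParseAllowNetwork(value string) (bool, error) {
	switch value {
	case "", "false":
		return false, nil
	case "true":
		return true, nil
	}
	return false, fmt.Errorf("allow-network: expected true or false, got %q", value)
}
```

Rejecting unknown values rather than silently defaulting is a design choice in this sketch, not something the issue decides.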

3. Define which src attributes may use URLs

Support http:// and https:// URLs anywhere src is already meaningful or becomes meaningful.

Immediate targets:

If allow-network is false and a URL is encountered:

  • in strict mode: return a clear error
  • in best-effort mode: skip the asset and continue, while surfacing a diagnostic
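
The disabled-network rule could be expressed as a single gate that every src consumer calls before loading. The sketch below assumes hypothetical names (GateSrc, ErrNetworkDisallowed) and uses a plain string slice to stand in for whatever diagnostic surface this issue settles on.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ErrNetworkDisallowed is a hypothetical sentinel error for this sketch.
var ErrNetworkDisallowed = errors.New("network access is not enabled")

// isRemoteSrc reports whether src looks like an http or https URL.
func isRemoteSrc(src string) bool {
	s := strings.ToLower(src)
	return strings.HasPrefix(s, "http://") || strings.HasPrefix(s, "https://")
}

// GateSrc decides whether loading src may proceed.
//   - (true, nil): load the asset.
//   - (false, nil): best-effort skip; a diagnostic was appended to warnings.
//   - (false, err): strict mode; the caller should fail rendering.
func GateSrc(src string, allowNetwork, strict bool, warnings *[]string) (bool, error) {
	if !isRemoteSrc(src) || allowNetwork {
		return true, nil
	}
	err := fmt.Errorf("src %q: %w", src, ErrNetworkDisallowed)
	if strict {
		return false, err
	}
	*warnings = append(*warnings, err.Error())
	return false, nil
}
```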

4. Add a remote fetch helper with sandbox-safe temp storage

Implement a small shared loader responsible for:

  • recognizing local paths vs http / https URLs
  • downloading remote content only when network access is allowed
  • writing fetched bytes to a temp directory inside the sandbox writable area
  • returning a stable local path or bytes to the consuming code

Implementation notes:

  • prefer Go's standard HTTP client with timeouts
  • validate status codes and reject non-200 responses cleanly
  • derive temp file names safely from URL/path information without trusting raw input
  • clean up temp files when reasonable, or use a document-scoped temp directory that can be removed at the end of render
  • avoid duplicate downloads of the same URL within one document render if easy to cache safely
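
One possible shape for the shared loader, using only the Go standard library. Fetcher, NewFetcher, and safeExt are hypothetical names, the 15-second timeout is an arbitrary placeholder, and the caller is assumed to pass a TempDir that already lives inside the sandbox writable area.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
	"path"
	"path/filepath"
	"time"
)

// Fetcher is a hypothetical document-scoped loader for remote assets.
type Fetcher struct {
	Client  *http.Client
	TempDir string            // assumed to be inside the sandbox writable area
	cache   map[string]string // URL -> local path, for one render only
}

// NewFetcher builds a Fetcher whose HTTP client has a timeout.
func NewFetcher(tempDir string) *Fetcher {
	return &Fetcher{
		Client:  &http.Client{Timeout: 15 * time.Second},
		TempDir: tempDir,
		cache:   make(map[string]string),
	}
}

// safeExt keeps a short alphanumeric extension from the URL path, if any.
func safeExt(p string) string {
	ext := path.Ext(p)
	if len(ext) < 2 || len(ext) > 6 {
		return ""
	}
	for _, r := range ext[1:] {
		isAlnum := (r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z') || (r >= '0' && r <= '9')
		if !isAlnum {
			return ""
		}
	}
	return ext
}

// Fetch downloads rawURL into TempDir and returns the local file path.
func (f *Fetcher) Fetch(rawURL string) (string, error) {
	if p, ok := f.cache[rawURL]; ok {
		return p, nil // already downloaded during this render
	}
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", fmt.Errorf("malformed URL %q: %w", rawURL, err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return "", fmt.Errorf("unsupported URL scheme %q", u.Scheme)
	}
	resp, err := f.Client.Get(u.String())
	if err != nil {
		return "", fmt.Errorf("fetch %s: %w", rawURL, err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("fetch %s: unexpected status %s", rawURL, resp.Status)
	}
	// The file name is a hash of the URL, so raw input never reaches the path.
	sum := sha256.Sum256([]byte(rawURL))
	dst := filepath.Join(f.TempDir, hex.EncodeToString(sum[:8])+safeExt(u.Path))
	out, err := os.Create(dst)
	if err != nil {
		return "", fmt.Errorf("create temp file: %w", err)
	}
	if _, err := io.Copy(out, resp.Body); err != nil {
		out.Close()
		os.Remove(dst)
		return "", fmt.Errorf("write temp file: %w", err)
	}
	if err := out.Close(); err != nil {
		os.Remove(dst)
		return "", fmt.Errorf("close temp file: %w", err)
	}
	f.cache[rawURL] = dst
	return dst, nil
}
```

For the document-scoped cleanup option, TempDir could be created with os.MkdirTemp at the start of a render and removed with os.RemoveAll at the end, which also discards the cache.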

5. Define best-effort behavior clearly

We need one consistent policy rather than ad hoc handling.

Candidate best-effort behavior:

  • missing local image: leave the image area blank and continue
  • failed remote image fetch: leave the image area blank and continue
  • failed external rules fetch: continue without applying those rules
  • emit diagnostics with enough context to understand what was skipped

This issue should decide where those diagnostics go. Options include:

  • returned aggregated warnings
  • stderr logging, similar to current SVG warnings
  • both, depending on the caller surface

6. Document error categories

The errors mode should explicitly cover at least:

  • missing local asset files
  • unreadable local asset files
  • network disallowed for URL sources
  • DNS/connectivity failures
  • HTTP non-success responses
  • temp directory / temp file write failures
  • malformed URL strings

That gives us a stable rule for when strict mode aborts and when best-effort mode continues.
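
To make the categories concrete, here is a sketch that models each one as a sentinel error and routes every asset failure through one policy object. All names here (the Err* sentinels, AssetError, ErrorPolicy) are hypothetical, and collecting warnings in a slice is just one of the diagnostic options listed earlier.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinels, one per category listed in this issue.
var (
	ErrAssetMissing      = errors.New("local asset file is missing")
	ErrAssetUnreadable   = errors.New("local asset file is unreadable")
	ErrNetworkDisallowed = errors.New("network access is disallowed for URL sources")
	ErrConnectivity      = errors.New("DNS or connectivity failure")
	ErrHTTPStatus        = errors.New("HTTP non-success response")
	ErrTempWrite         = errors.New("temp directory or temp file write failure")
	ErrMalformedURL      = errors.New("malformed URL")
)

// AssetError ties a category to the src value that triggered it.
type AssetError struct {
	Src  string
	Kind error // one of the sentinels above
}

func (e *AssetError) Error() string { return fmt.Sprintf("asset %q: %v", e.Src, e.Kind) }

// Unwrap lets callers test the category with errors.Is.
func (e *AssetError) Unwrap() error { return e.Kind }

// ErrorPolicy applies the document-level errors mode.
type ErrorPolicy struct {
	Strict   bool
	Warnings []error // diagnostics collected in best-effort mode
}

// Handle returns err when rendering should abort and nil when it should
// continue. In best-effort mode, asset errors are recorded as warnings;
// errors that are not asset errors are always returned.
func (p *ErrorPolicy) Handle(err error) error {
	if err == nil {
		return nil
	}
	var assetErr *AssetError
	if p.Strict || !errors.As(err, &assetErr) {
		return err
	}
	p.Warnings = append(p.Warnings, err)
	return nil
}
```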

7. Add tests

Add focused tests for:

  • DocWriter.AllowNetwork(true|false) state propagation
  • LTML document attribute mapping to writer settings
  • remote URL rejection when network is disabled
  • best-effort vs strict behavior for missing local files
  • best-effort vs strict behavior for remote fetch failures
  • successful remote fetch to temp storage
  • cleanup/caching behavior if implemented

For HTTP tests, use httptest where the environment allows it, with suitable skips if sandbox restrictions apply.
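
A sketch of what such a test might look like, placed in a _test.go file. The handler, the paths, and the function names are made up for illustration; the loopback probe is one way to skip cleanly, since httptest.NewServer panics when it cannot open a listener.

```go
package main

import (
	"net"
	"net/http"
	"net/http/httptest"
	"testing"
)

// assetHandler is a hypothetical stub that serves one fake image and
// returns 404 for everything else.
func assetHandler() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/logo.png", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "image/png")
		w.Write([]byte("fake image bytes"))
	})
	return mux
}

// statusFor exercises a handler in-process, without opening a socket, so it
// also works where the sandbox forbids listeners.
func statusFor(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func TestRemoteAssetStatuses(t *testing.T) {
	// Probe for a loopback listener first and skip if the sandbox denies it.
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		t.Skipf("loopback listener unavailable: %v", err)
	}
	ln.Close()

	srv := httptest.NewServer(assetHandler())
	defer srv.Close()

	cases := map[string]int{
		"/logo.png":    http.StatusOK,
		"/missing.png": http.StatusNotFound,
	}
	for path, want := range cases {
		resp, err := srv.Client().Get(srv.URL + path)
		if err != nil {
			t.Fatalf("GET %s: %v", path, err)
		}
		resp.Body.Close()
		if resp.StatusCode != want {
			t.Errorf("GET %s: status %d, want %d", path, resp.StatusCode, want)
		}
	}
}
```

A real test would point the remote fetch helper at srv.URL rather than calling the client directly, then assert on the temp file and on strict vs best-effort outcomes.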

8. Update docs

Document in LTML syntax and developer docs:

  • allow-network
  • errors
  • which tags honor URL-based src
  • default behavior and failure behavior
  • the security posture: remote loading is opt-in

Open design questions

  • Should AllowNetwork live only on DocWriter, or do we want a higher-level LTML/environment config object too?
  • Should best-effort mode suppress visible placeholders entirely, or should some widgets show a fallback marker?
  • Should downloads be cached only for the duration of one render, or across renders?
  • Do we want to restrict allowed URL schemes to exactly http and https in v1? My recommendation is yes.
  • Do we want an enum name of best-effort, lenient, or continue for the non-strict mode? best-effort is the clearest to me.

Workarounds today

  • Download remote assets ahead of time and reference them as local files.
  • Keep rules inline rather than external if they need to travel with the document.
  • Treat missing assets as fatal in the calling code if strictness is needed today.

Acceptance criteria

  • DocWriter exposes AllowNetwork(bool) *DocWriter.
  • LTML document attributes can enable network loading and choose strict vs best-effort error handling.
  • URL-based src values for supported elements can fetch through http and https when enabled.
  • Fetched files are written into a sandbox-safe temp directory.
  • Strict mode fails rendering on asset-loading errors; best-effort mode continues when possible.
  • Tests and docs cover the new behavior.
