Add import_format to S3/GCS import data sources #21
Merged
S3 and GCS import data sources accept an optional import_format (csv/ndjson/parquet), emitted as IMPORT_FORMAT in the generated .datasource. This lets you ingest files whose extension does not imply the format (for example NDJSON delivered as .log), which otherwise fail with "Format not supported".
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Mirror from_timestamp end-to-end: parse IMPORT_FORMAT, carry it on the S3/GCS migrate models, emit it from emit_ts, and add parse + round-trip tests. Keeps generator/parser/emitter parity for the new field.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
juliavallina
approved these changes
Jun 29, 2026
juliavallina
left a comment
Contributor
This is great. Thanks for contributing. I'll merge it and release a new version (0.4.0).
What
S3 and GCS import data sources accept an optional import_format (csv/ndjson/parquet), emitted as IMPORT_FORMAT in the generated .datasource, and round-tripped by the datafile parser and the TypeScript-migration emitter.

Why
The S3/GCS connector infers file format from the extension. Files whose extension doesn't imply the format, e.g. NDJSON delivered as .log (Fastly's S3 logging hard-codes the .log suffix), fail import with "Format not supported". IMPORT_FORMAT is already supported by the platform (tb datasource create --s3-format; the datafile import_format param); this exposes it in the SDK so generated datasources can set it.

Changes
Mirrors the existing from_timestamp field end-to-end, for both S3 and GCS:
- import_format on S3Config/GCSConfig (schema), emitted by _generate_import_config.
- parse_datasource (IMPORT_FORMAT), carried on the S3/GCS migrate models, emitted by emit_ts.
- make check passes locally (ruff, format, mypy, pytest 133 passed, gitleaks).
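
For readers who want a feel for the generate/parse parity described under Changes, here is a minimal, self-contained sketch. It is not the SDK's code: S3ImportSketch, render_import_lines and parse_import_lines are hypothetical names made up for illustration, and the IMPORT_CONNECTION_NAME / IMPORT_BUCKET_URI keys and the "KEY value" line layout are assumptions about the .datasource format. Only the IMPORT_FORMAT key and its csv/ndjson/parquet values come from this PR.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Values named in the PR description.
ALLOWED_FORMATS = ("csv", "ndjson", "parquet")


@dataclass(frozen=True)
class S3ImportSketch:
    """Hypothetical stand-in for an S3 import config; not the SDK's S3Config."""

    connection_name: str
    bucket_uri: str
    import_format: Optional[str] = None  # optional, like the new field

    def __post_init__(self) -> None:
        if self.import_format is not None and self.import_format not in ALLOWED_FORMATS:
            raise ValueError(
                f"import_format must be one of {ALLOWED_FORMATS}, got {self.import_format!r}"
            )


def render_import_lines(cfg: S3ImportSketch) -> List[str]:
    """Emit import settings; IMPORT_FORMAT appears only when it is set."""
    lines = [
        f"IMPORT_CONNECTION_NAME {cfg.connection_name}",  # assumed key name
        f"IMPORT_BUCKET_URI {cfg.bucket_uri}",  # assumed key name
    ]
    if cfg.import_format is not None:
        lines.append(f"IMPORT_FORMAT {cfg.import_format}")
    return lines


def parse_import_lines(lines: List[str]) -> S3ImportSketch:
    """Read the same settings back, so render -> parse round-trips."""
    values: Dict[str, str] = {}
    for line in lines:
        key, _, value = line.strip().partition(" ")
        values[key] = value.strip()
    return S3ImportSketch(
        connection_name=values["IMPORT_CONNECTION_NAME"],
        bucket_uri=values["IMPORT_BUCKET_URI"],
        import_format=values.get("IMPORT_FORMAT"),
    )


if __name__ == "__main__":
    # NDJSON delivered with a .log suffix, as in the Fastly case above.
    cfg = S3ImportSketch("fastly_logs", "s3://my-bucket/*.log", import_format="ndjson")
    rendered = render_import_lines(cfg)
    print("\n".join(rendered))
    assert parse_import_lines(rendered) == cfg
```

Leaving import_format unset emits no IMPORT_FORMAT line at all, which is how this sketch keeps existing datasources unchanged; whether the SDK does the same is an assumption here.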