feat: writable external tables (INSERT/LOAD into stage files)#24889
feat: writable external tables (INSERT/LOAD into stage files)#24889fengttt wants to merge 10 commits into
Conversation
Make external tables writable when created with a WRITE_FILE_PATTERN option (a strftime template with MO extensions %nN = n random digits and %U = UUID, resolving to a stage:// path). Such tables accept INSERT ... SELECT and LOAD, writing CSV or JSONLine files into the stage; each parallel pipeline writes one file. Reads, UPDATE/DELETE, and read-only external tables are unchanged. - pkg/sql/colexec/externalwrite: strftime expander, ExternalWriter with a fileservice streaming sink, CSV/JSONLine encoders (const-vector aware, emit only the declared columns). - insert operator: third write mode ToExternal/insert_external alongside ToWriteS3/insert_table; write config carried on the Go InsertCtx (no proto change), built in compile from TableDef.Createsql. - compile: compileInsert routes external-write nodes to one writer op per source scope (no S3 merge/shuffle). - planner: minimal external insert plan (build_insert/build_load); op-aware checkTableType allows writable-external for insert; initInsertStmt Pkey guard; modern DML binder defers external targets to the legacy path. - DDL: validate WRITE_FILE_PATTERN (stage:// + csv/jsonline + parseable) and accept it in the read-side option validators. - docs/design + BVT case test/distributed/cases/stage/writable_external_table. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Qodo reviews are paused for this user.Troubleshooting steps vary by plan Learn more → On a Teams plan? Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center? |
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… SHOW CREATE Address iamlinjunhong's review on writable external tables: 1. Remote execution no longer loses external-write mode. pipeline.Insert gains to_external + external_stmt_unix_nano; the receiving CN rebuilds the writer config from the serialized TableDef's stored ExternParam via the new shared buildExternalInsertArg, so WRITE_FILE_PATTERN time directives expand against the same statement-start instant on every CN. The writer's session time zone is now resolved in insert.Prepare so a remotely rebuilt operator renders TIMESTAMPs in the session zone too. Testing on a 2-CN cluster exposed a second instance of the same bug: dupOperator dropped ToExternal when a scope is parallelized, so the duplicated inserts silently wrote rows into the disttae workspace and the transaction panicked at commit (UnionWindow index out of range in dumpInsertBatchLocked). Both paths are fixed and covered by unit tests (encode/decode round-trip, dup), plus a 100k-row parallel-insert BVT case that fails without the dup fix. 2. format='jsonline' with jsondata='array' is rejected at DDL time for writable external tables: the writer emits one object per line, which an 'array' table could not read back. 3. The CSV writer now defaults ENCLOSED BY to '"', matching the external reader (which always parses with '"' when the clause is absent), so strings containing the field terminator, quotes, newlines or backslashes round-trip. New BVT case covers all four. 4. SHOW CREATE TABLE now emits WRITE_FILE_PATTERN for external tables (both INFILE and s3option forms), so the output recreates a writable table instead of silently degrading to read-only. All verified end to end on the local multi-CN docker cluster, including a 500k-row parallel insert producing 5 writer files that read back exactly. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
|
@iamlinjunhong Thanks for the thorough review — all four issues (plus the three you re-raised from Copilot, which were fixed in 1bb3a22) are now addressed in d15911b. 1. Remote execution loses external-write mode — Fixed via proto plumbing, as you suggested. While verifying this end to end on a 2-CN docker cluster with a 500k-row insert, I found a second instance of the same class of bug: 2. JSONLine ignores 3. CSV default enclosure mismatch — The writer now defaults 4. Verification: |
- TestShowCreateExternalWriteFilePattern: WRITE_FILE_PATTERN survives SHOW CREATE formatting in both the INFILE and URL s3option forms (the s3option form had no coverage), and read-only tables emit no such key. - TestConstructExternalInsertStmtTime: the writer config takes its timestamp from defines.StartTS on the process context (falling back to the wall clock), which is what keeps WRITE_FILE_PATTERN time directives consistent across parallel pipelines. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
A 4.4M-row flushed source crosses the multi-CN stats thresholds (>512
blocks), so the INSERT into the writable external table compiles to a
MULTICN plan; on a multi-CN cluster the source-scan scopes carry the
external-write insert to remote CNs through the pipeline protocol,
exercising the to_external encode/decode and remote writer rebuild end to
end. On a single CN the same case degenerates to the parallel-pipeline
insert and asserts identical results.
Verified on the local 2-CN docker cluster: during the run cn2's
mo_rpc_message_total{name="pipeline-server",type="receive"} advanced by 13
while the session ran on cn1, and the 4.4M rows read back exactly.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
|
|
||
| // validateWriteFilePattern validates the WRITE_FILE_PATTERN option that makes an | ||
| // external table writable. No-op for read-only external tables (option absent). | ||
| func validateWriteFilePattern(ctx context.Context, param *tree.ExternParam) error { |
There was a problem hiding this comment.
[P1] Parallel writers silently collide when WRITE_FILE_PATTERN has no uniqueness directive
validateWriteFilePattern checks only the stage:// prefix, the format, and a dry-run expansion — it never requires %U/%nN. But execution creates one writer per parallel pipeline (one per scope in compileInsert, more via dupOperator), all expanding the pattern against the same statement timestamp. ExpandFilePattern's own doc comment says "It is the caller's responsibility to make the pattern produce unique names across parallel writers" — and no caller enforces it.
Failure scenario: 'write_file_pattern'='stage://s/out-%Y%m%d.csv' (valid today) + a multi-pipeline INSERT/LOAD → every pipeline opens the same path. Both LocalFS.Write and S3FS.Write do a non-atomic exists-check-then-write, so the statement either fails with FileAlreadyExists after some files were already written, or in the race window last-writer-wins and the other pipelines' rows are silently lost while affected-rows reports success. Two sequential same-day INSERTs collide the same way.
Suggested fix: reject patterns without %U/%nN here at DDL time (or auto-append a uniqueness suffix at expansion when absent).
There was a problem hiding this comment.
Fixed in 6c0c49b — validateWriteFilePattern now rejects patterns without %U/%nN (PatternHasUniqueDirective, %%-escape aware). Unit tests + BVT error case added; all existing patterns already used %U.
| // appendExternalInsertPlan appends a minimal INSERT node for a writable external | ||
| // table. The source (lastNodeId) has already been bound, cast to the table | ||
| // column types and projected by initInsertStmt, so we only attach the INSERT. | ||
| func appendExternalInsertPlan(builder *QueryBuilder, bindCtx *BindContext, objRef *ObjectRef, tableDef *TableDef, lastNodeId int32) error { |
There was a problem hiding this comment.
[P1] Minimal external-insert plan skips PreInsert — NOT NULL is unenforced and AUTO_INCREMENT columns write NULL
NOT NULL enforcement for INSERT lives only in the PreInsert operator (colexec.BatchDataNotNullCheck, preinsert.go:263 — its single non-test caller), and AUTO_INCREMENT filling (genAutoIncrCol) likewise only runs there. appendExternalInsertPlan attaches the INSERT node directly with no PreInsert, and the compile side constructs none, so:
CREATE EXTERNAL TABLE t (a int NOT NULL, ...) ... write_file_pattern=...; INSERT INTO t VALUES (NULL, ...)succeeds and writes\Ninstead of raising "Column 'a' cannot be null". Explicit NULLs and NULLs from a SELECT/LOAD source all pass through (omitted columns still fail at plan time viagetDefaultExpr).- CREATE EXTERNAL TABLE accepts AUTO_INCREMENT (shared
buildTableDefs, no external gate), andgetDefaultExprplans the column as a NULL literal expecting PreInsert to fill it — soINSERT INTO t(v) ...writes\Nfor the id column instead of generated values.
The LOAD path (build_load.go external branch) has the same gap. Suggested fix: reject AUTO_INCREMENT columns for writable external tables at DDL time, and run BatchDataNotNullCheck either via a PreInsert node here or inside insert_external.
There was a problem hiding this comment.
Fixed in 6c0c49b — NOT NULL is enforced in insert_external via attrs-aligned ColDefs (covers INSERT and LOAD; const-null vectors handled), and AUTO_INCREMENT columns are rejected at DDL time for writable external tables (hidden fake-PK columns exempt — they're never written; found live when the check first rejected every PK-less table). BVT asserts both.
| insert.ctr.partitionS3Writers = nil | ||
| } | ||
| if insert.ctr.extWriter != nil { | ||
| insert.ctr.extWriter.Close(proc.Ctx) |
There was a problem hiding this comment.
[P1] Pipeline failure finalizes the partial file — orphan rows become visible, and a retried statement duplicates them
Reset and Free call extWriter.Close(proc.Ctx) unconditionally, ignoring the pipelineFailed/err parameters and discarding Close's error. externalWriter.Close → FileServiceWriter.Close does a clean pipe close (not CloseWithError), so the async fs.Write treats EOF as normal end-of-stream and durably persists whatever was buffered: LocalFS renames the temp file into place, S3 completes the PUT.
Failure scenario: INSERT INTO wext SELECT ... writes 2 of 5 batches, then the pipeline fails (expression error, lock conflict). Cleanup finalizes the partial file into the stage → every later SELECT on the table includes those orphan rows. Worse, for a retryable error (ErrTxnNeedRetry) the retry loop reruns the statement and writes a complete second file under a new %U expansion → the table now returns the 2 partial batches plus all 5 batches.
External writes can't be transactional, but the failure path should abort rather than finalize: plumb pipelineFailed/err into an abort that cancels the underlying write (e.g. CloseWithError / remove-on-error), and surface Close errors instead of dropping them.
There was a problem hiding this comment.
Fixed in 6c0c49b — added FileServiceWriter.Abort (closes the pipe with an error so the in-flight fileservice write discards its partial output) and an Abort on the ExternalWriter interface; Reset/Free now abort instead of Close — a non-nil writer there always means the stream didn't end cleanly, since insert_external nils it after a successful Close. Verified live: a 200k-row insert failing mid-stream on division-by-zero leaves zero files in the stage.
| return []byte("true"), false, nil | ||
| } | ||
| return []byte("false"), false, nil | ||
| case types.T_bit: |
There was a problem hiding this comment.
[P2] CSV: bit values are written as raw unquoted bytes — terminator bytes corrupt the record structure
csvValue returns the bit value's raw big-endian bytes with quote=false, so the bytes are written without enclosure or escaping. Any bit value whose byte representation contains the field terminator, line terminator, enclosure char, or backslash breaks the CSV framing.
Failure scenario: bit(8) value 44 (0x2C = ,) or 10 (\n) → the unescaped byte splits the row mid-record; reading the table back fails with a column-count mismatch or returns shifted values — the writer's own round-trip guarantee breaks. (String types are enclosed+escaped; bit is not, yet its payload is equally arbitrary bytes.)
Suggested fix: return quote=true for T_bit so the bytes get enclosed and escaped like other binary-ish values (the reader parses bit from the enclosed field's raw bytes, so enclosure is round-trip-safe).
There was a problem hiding this comment.
Fixed in 6c0c49b — bit values are now enclosed+escaped like binary (quote=true). BVT round-trips 44 (','), 34 ('"'), 92 ('\'), 128 (high byte). One caveat discovered while testing: bytes that are pure whitespace (e.g. 10='\n') are written correctly but read back NULL because the external READER TrimSpaces non-string fields — a pre-existing read-side limitation noted in the test.
| return json.RawMessage(val.String()), nil | ||
| case types.T_bool: | ||
| return vector.GetFixedAtNoTypeCheck[bool](vec, i), nil | ||
| case types.T_bit: |
There was a problem hiding this comment.
[P2] JSONLine: bit bytes ≥ 0x80 are invalid UTF-8 and get silently replaced with U+FFFD by json.Marshal
jsonValue returns the bit value's raw big-endian bytes as a Go string. encoding/json coerces invalid UTF-8 to � during marshaling, so any bit value with a byte ≥ 0x80 is corrupted.
Failure scenario: bit(8) value 128 (single byte 0x80) → marshaled as "�" (3 bytes); on read-back the external reader's getColData T_bit sees a 3-byte field and reconstructs 0xEFBFBD (or errors "data too long") instead of 128. Values ≤ 127 happen to survive, which makes the bug easy to miss in tests.
Suggested fix: emit bit as a JSON number is what the reader can't parse (that was the original bug) — so either base64-encode like T_binary does and teach the reader, or reject bit columns for jsonline writable tables at DDL time. Round-tripping arbitrary bytes through a JSON string is not salvageable.
There was a problem hiding this comment.
Fixed in 6c0c49b — went with rejection: bit columns are refused at DDL time for writable jsonline tables (CSV keeps them, enclosed+escaped), with an encoder error guarding the unreachable path. Round-tripping arbitrary bytes through a JSON string isn't salvageable without changing the reader.
| // start time). It is the caller's responsibility to make the pattern produce | ||
| // unique names across parallel writers (use %U or %nN); the standard directives | ||
| // alone are not unique. | ||
| func ExpandFilePattern(pattern string, t time.Time) (string, error) { |
There was a problem hiding this comment.
[P2] Time directives expand in each CN's OS-local zone — one statement can scatter files across date partitions
Only the instant travels to remote CNs (ExternalStmtUnixNano); time.Unix(0, nanos) on the receiving CN carries its OS-local zone, and the sender's defines.StartTS is local time.Now() in its zone. strftimeDirective extracts t.Year()/Month()/Day()/Hour() from the time's own location with no .UTC() normalization anywhere in this file — so the proto comment's promise that "every CN expands WRITE_FILE_PATTERN time directives identically" only holds when all hosts share a time zone.
Failure scenario: CN1 at TZ=UTC+8, CN2 at TZ=UTC, pattern stage://s/dt=%Y-%m-%d/part-%U.csv, statement at 2026-06-12 00:10 CN1-local → local pipelines write under dt=2026-06-12, remote pipelines under dt=2026-06-11.
Suggested fix: normalize once — e.g. t = t.UTC() at the top of ExpandFilePattern (documenting that time directives are UTC), or render in a fixed zone carried in the config. UTC also makes the dry-run validation in validateWriteFilePattern consistent with runtime.
There was a problem hiding this comment.
Fixed in 6c0c49b — ExpandFilePattern normalizes to UTC at the top (documented in the doc comment and design docs), so the same instant expands identically on every CN; unit test compares a UTC and UTC+8 rendering of the same instant.
| return nil, moerr.NewInternalErrorf(context.Background(), "external write (jsonline): %v", err) | ||
| } | ||
| buf.Write(jb) | ||
| buf.WriteByte('\n') |
There was a problem hiding this comment.
[P2] JSONLine writer hardcodes \n but the reader honors the table's LINES TERMINATED BY — self-incompatible round-trip
encodeJSONLine always emits \n as the record separator, ignoring cfg.LineTerminator (which buildExternalInsertArg populates from the table's LINES TERMINATED BY clause). The external reader's newCSVParserFromReader keeps the custom linesTerminatedBy for jsonline (the jsonline override only touches the field terminator/escape), so a table declared with LINES TERMINATED BY '\r\n' parses its own written file as one malformed line.
Failure scenario: format='jsonline' ... lines terminated by '\r\n' + INSERT 3 rows → file contains \n-separated objects; SELECT parses with \r\n and fails or returns the wrong row count.
Suggested fix: buf.Write(w.cfg.LineTerminator) here (CSV already honors it), or reject a non-\n LINES TERMINATED BY for jsonline writable tables at DDL time.
There was a problem hiding this comment.
Fixed in 6c0c49b — the JSONLine encoder now writes cfg.LineTerminator between records (exercised by a \r\n unit test). This came along with the map-free encoder rewrite from the perf finding.
| // parallel pipelines / multiple CNs all resolve the same path even when the | ||
| // pattern contains time directives (e.g. %Y/%m/%d/%H). Fall back to time.Now() | ||
| // when the statement start is not carried on the context. | ||
| stmtAt := time.Now() |
There was a problem hiding this comment.
[P3] defines.StartTS is only set by the MySQL frontend — internal-executor statements get a different time.Now() per scope
The only setter of defines.StartTS{} is the frontend's doComQuery; statements run through the internal SQL executor (sql_executor.go, used by ~20 components) never carry it, so this fallback runs once per source scope (compileInsert calls constructExternalInsert in the per-scope loop) and each writer gets a slightly different timestamp. A pattern with %H%M%S (or a midnight-straddling %d) can then expand to different paths within one statement — exactly what the statement-start timestamp was introduced to prevent.
Compile.startAt already carries the statement start on every construction path (frontend, internal executor receiveAt, retry) — threading c.startAt from compileInsert into constructExternalInsert removes the per-scope time.Now() entirely and fixes the internal-executor gap at the same time.
There was a problem hiding this comment.
Fixed in 6c0c49b — compileInsert resolves the timestamp once per statement via externalInsertStmtTime, preferring defines.StartTS, then Compile.startAt (populated on every construction path including the internal SQL executor), then the wall clock; all scopes share the single value.
| // legacy planner (buildInsert/buildLoad), which supports INSERT/LOAD into | ||
| // writable external tables and rejects the rest with a precise error. | ||
| if tableDef.TableType == catalog.SystemExternalRel { | ||
| return moerr.NewUnsupportedDML(ctx.GetContext(), "external table") |
There was a problem hiding this comment.
[P3] REPLACE INTO an external table now leaks the internal ErrUnsupportedDML sentinel instead of a user-facing error
This early return fires for every external table, and bindAndOptimizeReplaceQuery is the one DML entry point with no legacy fallback for ErrUnsupportedDML (insert/load/delete/update all catch it in build.go and replan). So REPLACE INTO <external table> now surfaces "unsupported DML: external table" — an internal fallback signal whose constructor even wraps the context with ContextWithNoReport — instead of the previous "invalid input: cannot insert/update/delete from external table". UPDATE/DELETE are fine (their fallbacks reach checkTableType, which still produces the old message).
Suggested fix: handle ErrUnsupportedDML in bindAndOptimizeReplaceQuery (mapping to the same InvalidInput message), or special-case REPLACE before this resolver.
There was a problem hiding this comment.
Fixed in 6c0c49b — two parts: read-only external tables now produce the user-facing "cannot insert/update/delete from external table" directly in ResolveSingleTable (only writable ones emit the legacy-fallback sentinel, which insert/load/update/delete consume), and bindAndOptimizeReplaceQuery maps the remaining external-table sentinel to the same user-facing error instead of leaking it. BVT asserts the REPLACE message.
|
|
||
| // addEscape escapes backslashes and (doubled) the enclosure character, matching | ||
| // pkg/frontend/export.go addEscapeToString. | ||
| func addEscape(s []byte, escape byte) []byte { |
There was a problem hiding this comment.
[Perf] Encoder hot path: avoidable per-cell/per-row allocations
This encoder runs per cell on the bulk INSERT/LOAD path; three cheap wins:
addEscapealways does twobytes.ReplaceAllpasses — each allocates a full copy even with zero matches (the common case), defeating the zero-copyvec.GetBytesAt. Fast-path it:if bytes.IndexByte(s, '\\') < 0 && (escape == 0 || bytes.IndexByte(s, escape) < 0) { return s }.encodeCSV/encodeJSONLineallocate a fresh unsizedbytes.Bufferper batch and discard it afterfw.Write(which fully consumes the data before returning). One reused buffer onexternalWriterwithReset()per batch right-sizes itself after the first batch — an ~800KB batch currently pays ~14 grow/realloc/copy steps from 64B.encodeJSONLinebuilds amap[string]interface{}per row and reflect-json.Marshals it, re-sorting the same fixed key set every row and boxing every numeric cell. Precomputing the encoded key prefixes (,"name":) once and appending values via the existing per-type switch removes the map, the boxing, and the per-row key sort. (1M rows × 10 cols ≈ 13M avoidable allocations.)
Also micro: []byte("true")/[]byte("false")/[]byte("\\N") per cell can be package-level vars.
There was a problem hiding this comment.
Fixed in 6c0c49b — all three: addEscape fast-paths the no-escape case (returns the input slice, asserted by a pointer-identity test), one reused buffer per writer replaces the per-batch allocation, and JSONLine now encodes directly (pre-encoded key prefixes, strconv into a scratch slice, json-compatible float formatting, json.Compact for JSON columns) with byte-identical output — existing exact-match tests pass unchanged. The constant \N/true/false slices are package vars now too.
P1 — silent data loss / constraint holes:
- DDL now requires a %U/%nN uniqueness directive in WRITE_FILE_PATTERN
(PatternHasUniqueDirective): every parallel pipeline expands the pattern
against the same statement timestamp, so a directive-free pattern made all
writers open the identical path (FileAlreadyExists or last-writer-wins).
- NOT NULL is enforced in insert_external (the minimal plan runs no
PreInsert); AUTO_INCREMENT columns are rejected at DDL time for writable
external tables (hidden fake-PK columns exempt — they are never written).
- Pipeline failure now ABORTS the in-flight file (FileServiceWriter.Abort
closes the pipe with an error so fileservice discards the partial object)
instead of finalizing it; verified live: a failing 200k-row insert leaves
zero files.
P2 — round-trip correctness:
- CSV writes bit values enclosed+escaped (raw bytes can contain the
terminator/quote); jsonline writable tables reject bit columns at DDL time
(bytes >= 0x80 cannot survive a JSON string) with an encoder guard behind.
- WRITE_FILE_PATTERN time directives render in UTC so local and remote CNs
expand the same instant identically regardless of host OS time zones.
- encodeJSONLine honors the table's LINES TERMINATED BY (the reader parses
jsonline with it; the writer hardcoded '\n').
P3 — error fidelity / timestamp plumbing:
- The statement timestamp is resolved once per statement in compileInsert,
preferring defines.StartTS then Compile.startAt (set on every construction
path, including the internal SQL executor) over per-scope time.Now().
- REPLACE INTO an external table reports the user-facing "cannot
insert/update/delete from external table" again: read-only external tables
error directly in ResolveSingleTable (only writable ones emit the
legacy-fallback sentinel) and bindAndOptimizeReplaceQuery maps the
remaining sentinel instead of leaking it.
Perf/reuse:
- Encoder hot path: addEscape fast-paths the no-escape case (no copy),
one reused buffer per writer replaces a fresh bytes.Buffer per batch, and
JSONLine encodes values directly (pre-encoded key prefixes, strconv into
scratch, json-compat float formatting) instead of building a
map[string]interface{} and reflect-marshaling per row — output stays
byte-identical. cellIsNull -> vector.IsNull; UrlToStageDef+ToPath ->
stageutil.UrlToPath.
BVT: bit tricky-byte round-trip (comma/quote/backslash/high-byte), NOT NULL
violation, REPLACE error, and DDL error cases for missing uniqueness
directive, AUTO_INCREMENT, and jsonline+bit; the jsonline wide table drops
its bit column. Note: bit bytes that are pure whitespace (e.g. 10) are
written correctly but read back NULL — the external reader TrimSpaces
non-string fields, a pre-existing read-side limitation.
All 105 BVT assertions pass against a freshly built 2-CN docker cluster;
make + static-check clean.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
aunjgr
left a comment
There was a problem hiding this comment.
All concerns from iamlinjunhong and the self-review addressed: remote ToExternal serialization, SHOW CREATE TABLE, CSV enclosure defaults, JSONLine jsondata, plus P1 fixes (uniqueness enforcement, NOT NULL, pipeline abort, bit encoding). Clean.
What
Make external tables writable. An external table created with a new
write_file_patternoption acceptsINSERT ... SELECTandLOAD, writingCSV or JSONLine files into a
stage://destination. The pattern is astrftime(3)template with two MO extensions:%nN→nrandom decimal digits%U→ a UUIDEach parallel write pipeline produces exactly one file. Reads,
UPDATE/DELETE,and read-only external tables are unchanged.
How
pkg/sql/colexec/externalwrite(new): strftime expander,ExternalWriterwith a fileservice streaming sink (io.Pipe), CSV/JSONLine encoders. Encoders
are const-vector aware and emit only the table's declared columns.
ToExternal/insert_externalalongsideToWriteS3/insert_table. Write config is carried on the GoInsertCtx(no proto change), built at compile time from
TableDef.Createsql.compileInsertroutes external-write nodes to one writer op persource scope (no S3 merge/shuffle) → one file per pipeline, parallel across CNs.
build_insert/build_load);op-aware
checkTableTypeallows writable-external forinsert;initInsertStmtPkey nil-guard (external tables have no PK); the modern DMLbinder defers external targets to the legacy planner.
write_file_pattern(must bestage://, csv/jsonline only,pattern must parse) and accepts it in the read-side option validators.
Semantics / limitations
stage://.failed statement may leave partial files. Documented as a v1 limitation.
UPDATE/DELETEon external tables remain unsupported.Tests
pkg/sql/colexec/externalwrite(expander, encoders),build_ddlvalidator.test/distributed/cases/stage/writable_external_table.{sql,result}—CSV/JSONLine insert + readback, multi-file accumulation, LOAD into external,
and all error cases. Passes 37/37 locally. Existing
stage.sql(external readsDesign + as-built notes:
docs/design/writable_external_table_impl.md.🤖 Generated with Claude Code