Entlein/adaptive write #37
base: main
Changes from all commits
```diff
@@ -35,6 +35,9 @@ namespace px {
 namespace carnot {
 namespace exec {

+// TODO(ddelnano): Defend against columns that don't exist. These should be
+// ignored by the Node.
+
 using table_store::schema::RowBatch;
 using table_store::schema::RowDescriptor;
```

Review comment on lines +38 to +39: Implement the missing-column guard before dereferencing mapped indices. The new TODO states that missing columns should be ignored, but line 107 still dereferences the mapped column indices unconditionally.
```diff
@@ -148,12 +151,12 @@ Status ClickHouseExportSinkNode::ConsumeNextImpl(ExecState* /*exec_state*/, cons
         break;
       }
       case types::UINT128: {
-        // UINT128 is exported as STRING (UUID format)
+        // UINT128 is exported as STRING in "high:low" format to match
+        // the ClickHouseSourceNode's parsing in clickhouse_source_node.cc
         auto col = std::make_shared<clickhouse::ColumnString>();
         for (int64_t i = 0; i < num_rows; ++i) {
           auto val = types::GetValueFromArrowArray<types::UINT128>(arrow_col.get(), i);
-          std::string uuid_str = sole::rebuild(absl::Uint128High64(val), absl::Uint128Low64(val)).str();
-          col->Append(uuid_str);
+          col->Append(absl::Substitute("$0:$1", absl::Uint128High64(val), absl::Uint128Low64(val)));
         }
         block.AppendColumn(mapping.clickhouse_column_name(), col);
         break;
```
```diff
@@ -164,6 +167,34 @@ Status ClickHouseExportSinkNode::ConsumeNextImpl(ExecState* /*exec_state*/, cons
     }
   }

+  // Auto-derive event_time from time_ if time_ is present but event_time is not.
+  // The ClickHouse table schema uses event_time (DateTime64(3), milliseconds) for
+  // partitioning and ordering, but the Pixie table has time_ (TIME64NS, nanoseconds).
+  bool has_time_ = false;
+  bool has_event_time = false;
+  int time_col_index = -1;
+  for (const auto& mapping : plan_node_->column_mappings()) {
+    if (mapping.clickhouse_column_name() == "time_") {
+      has_time_ = true;
+      time_col_index = mapping.input_column_index();
+    }
+    if (mapping.clickhouse_column_name() == "event_time") {
+      has_event_time = true;
+    }
+  }
+
+  if (has_time_ && !has_event_time && time_col_index >= 0) {
+    auto arrow_col = rb.ColumnAt(time_col_index);
+    int64_t num_rows = arrow_col->length();
+    auto event_time_col = std::make_shared<clickhouse::ColumnDateTime64>(3);
+    for (int64_t i = 0; i < num_rows; ++i) {
+      int64_t ns_val = types::GetValueFromArrowArray<types::TIME64NS>(arrow_col.get(), i);
+      // Convert nanoseconds to milliseconds for DateTime64(3)
+      event_time_col->Append(ns_val / 1000000LL);
+    }
+    block.AppendColumn("event_time", event_time_col);
+  }
+
   // Insert the block into ClickHouse
   clickhouse_client_->Insert(plan_node_->table_name(), block);
```
Remove the forced protobuf downgrade to a vulnerable version.

Line 324 pins `google.golang.org/protobuf` to `v1.29.1`, which contains the high-severity infinite-loop vulnerability in `protojson.Unmarshal` (GHSA-8r3f-844c-mc37 / GO-2024-2611 / CVE-2024-24786, first patched in v1.33.0). This `replace` directive downgrades from the safe `v1.34.2` already declared in the `require` section. Removing the replace lets the patched version be used.

Proposed fix:

```diff
-replace google.golang.org/protobuf => google.golang.org/protobuf v1.29.1
```