Adding node layers to tests and loaders#2597

Draft
arienandalibi wants to merge 96 commits into db_v4 from db_v4_node_layers
@arienandalibi
Collaborator

What changes were proposed in this pull request?

Node layers were previously not tested rigorously. They are now being added to tests, proptests, the loaders, and the parquet encoders. The loaders and parquet encoders are also used by materialize.

Why are the changes needed?

To find and fix the node-layer bugs that the new tests uncover.

Does this PR introduce any user-facing change? If yes is this documented?

It shouldn't

How was this patch tested?

proptests

Are there any further changes required?

There shouldn't be

…re-compute new IDs and turn them into RecordBatches
…ock the graph to get parallel iterators over edges. We filter to respect GraphView filtering behaviour.
…ill use ArrowWriter<File> for now, but we will add support for loading into a graph
# Conflicts:
#	raphtory/src/serialise/parquet/mod.rs
… function can now be passed to these functions to determine how the sinks will be created. This will allow us to pass a sink which is a crossbeam_channel to send RecordBatches elsewhere.
# Conflicts:
#	raphtory/src/serialise/parquet/mod.rs
…f encoding everything and then ingesting everything (which would keep everything in memory at once).
…anning each segment for each row. Now using this path in the new materialize_using_recordbatches function.
…separate out running materialize and parquet decoding. Test using SF10 for now.
…ic ordering for events at the same timestamp for Prop::List (Vec and Array should be the same) and Prop::Map (ordering of elements should be stable, previously depended on HashMap iteration order which is undefined).
…alization of ParquetTEdge. Cleaned up materialize tests so that they don't try to call an "old" materialize anymore
…odes_from_df call. We can actually pass a column of layer names to the "layer_id_col" parameter, the name is misleading
… was in persistent_semantics.rs, in fn node_updates_window. Proptests still fail.
…p_dst_id". GIDs are now "rap_src_id" and "rap_dst_id". This is inconsistent with the other columns' naming scheme, but it is backwards compatible with already encoded parquet files.
# Conflicts:
#	raphtory/src/arrow_loader/df_loaders/nodes.rs
#	raphtory/src/db/api/view/graph.rs
#	raphtory/src/io/parquet_loaders.rs
#	raphtory/src/parquet_encoder/edges.rs
#	raphtory/src/parquet_encoder/mod.rs
#	raphtory/src/parquet_encoder/model.rs
#	raphtory/src/parquet_encoder/nodes.rs
#	raphtory/src/python/graph/io/arrow_loaders.rs
#	raphtory/src/serialise/parquet.rs
#	raphtory/tests/df_loaders.rs
#	raphtory/tests/test_materialize_sf10.rs
…'re now back to ingesting using VIDs instead of resolving GIDs.
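
The commit above about deterministic ordering for `Prop::Map` notes that the previous behaviour depended on `HashMap` iteration order, which is unspecified. As a minimal sketch of that idea only (not the PR's actual code), entries can be sorted by key before they are encoded. The helper name `stable_entries` and the use of `String` keys are assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not part of raphtory): returns the entries of a map
/// sorted by key, so the result does not depend on HashMap iteration order.
fn stable_entries<V: Clone>(map: &HashMap<String, V>) -> Vec<(String, V)> {
    // Copy the entries out of the map in whatever order the HashMap yields them.
    let mut entries: Vec<(String, V)> = map
        .iter()
        .map(|(k, v)| (k.clone(), v.clone()))
        .collect();
    // Sort by key. Keys in a map are unique, so this fixes a single total order.
    entries.sort_by(|a, b| a.0.cmp(&b.0));
    entries
}
```

Two maps with the same contents then produce identical entry sequences regardless of insertion order, which is the property a round-trip comparison in a proptest would need.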
Contributor

@github-actions github-actions Bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Rust Benchmark'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 2.

| Benchmark suite | Current: acdab5b | Previous: 9823ef7 | Ratio |
|-|-|-|-|
| `lotr_graph/num_edges` | 5 ns/iter (± 0) | 0 ns/iter (± 0) | +∞ |
| `lotr_graph/num_nodes` | 5 ns/iter (± 0) | 1 ns/iter (± 0) | 5 |
| `lotr_graph/graph_latest` | 3 ns/iter (± 0) | 0 ns/iter (± 0) | +∞ |
| `lotr_graph_materialise/materialize` | 7899412 ns/iter (± 51328) | 1564816 ns/iter (± 35303) | 5.05 |
| `lotr_graph_window_100/num_nodes` | 13 ns/iter (± 0) | 5 ns/iter (± 0) | 2.60 |
| `lotr_graph_window_100/iterate_exploded_edges` | 791993 ns/iter (± 4342) | 325242 ns/iter (± 847) | 2.44 |
| `lotr_graph_window_100_materialise/materialize` | 8410279 ns/iter (± 38549) | 1669150 ns/iter (± 10700) | 5.04 |
| `lotr_graph_window_10/has_node_existing` | 144 ns/iter (± 8) | 62 ns/iter (± 11) | 2.32 |
| `lotr_graph_window_10/iterate nodes` | 31610 ns/iter (± 128) | 11339 ns/iter (± 40) | 2.79 |
| `lotr_graph_window_10/iterate edges` | 100182 ns/iter (± 380) | 48684 ns/iter (± 211) | 2.06 |
| `lotr_graph_window_10/iterate_exploded_edges` | 389238 ns/iter (± 3251) | 155788 ns/iter (± 1001) | 2.50 |
| `lotr_graph_window_10_materialise/materialize` | 3486386 ns/iter (± 14256) | 971980 ns/iter (± 4278) | 3.59 |
| `lotr_graph_subgraph_10pc_materialise/materialize` | 1682941 ns/iter (± 9159) | 334634 ns/iter (± 1287) | 5.03 |
| `lotr_graph_subgraph_10pc_windowed/has_node_existing` | 147 ns/iter (± 8) | 62 ns/iter (± 14) | 2.37 |
| `lotr_graph_subgraph_10pc_windowed/iterate nodes` | 5360 ns/iter (± 95) | 1365 ns/iter (± 3) | 3.93 |
| `lotr_graph_subgraph_10pc_windowed_materialise/materialize` | 990344 ns/iter (± 32643) | 230399 ns/iter (± 2617) | 4.30 |
| `lotr_graph_window_50_layered/num_edges_temporal` | 152746 ns/iter (± 8162) | 70121 ns/iter (± 7586) | 2.18 |
| `lotr_graph_window_50_layered/has_node_existing` | 418 ns/iter (± 20) | 129 ns/iter (± 12) | 3.24 |
| `lotr_graph_window_50_layered/iterate nodes` | 73147 ns/iter (± 1234) | 19308 ns/iter (± 47) | 3.79 |
| `lotr_graph_window_50_layered/iterate edges` | 197171 ns/iter (± 1664) | 83616 ns/iter (± 1318) | 2.36 |
| `lotr_graph_window_50_layered/graph_latest` | 78056 ns/iter (± 1718) | 36649 ns/iter (± 916) | 2.13 |
| `lotr_graph_window_50_layered_materialise/materialize` | 26895485 ns/iter (± 276669) | 3488825 ns/iter (± 24948) | 7.71 |
| `lotr_graph_persistent_window_50_layered/num_edges_temporal` | 600392 ns/iter (± 5483) | 192686 ns/iter (± 1569) | 3.12 |
| `lotr_graph_persistent_window_50_layered/has_node_existing` | 457 ns/iter (± 288) | 174 ns/iter (± 83) | 2.63 |
| `lotr_graph_persistent_window_50_layered/iterate nodes` | 97762 ns/iter (± 533) | 35886 ns/iter (± 191) | 2.72 |
| `lotr_graph_persistent_window_50_layered/iterate edges` | 171473 ns/iter (± 705) | 84161 ns/iter (± 596) | 2.04 |
| `lotr_graph_persistent_window_50_layered/iterate_exploded_edges` | 4348068 ns/iter (± 13607) | 1659940 ns/iter (± 19402) | 2.62 |
| `lotr_graph_persistent_window_50_layered_materialise/materialize` | 48794835 ns/iter (± 160949) | 5298035 ns/iter (± 147912) | 9.21 |
| `lotr_graph/proto_encode` | 9790791 ns/iter (± 142363) | 1157897 ns/iter (± 73709) | 8.46 |

This comment was automatically generated by workflow using github-action-benchmark.
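
For reading the alert above: the Ratio column is consistent with current time divided by previous time (for example 7899412 / 1564816 ≈ 5.05), and a row is reported when that ratio exceeds the threshold of 2. The sketch below reproduces that arithmetic only. It is not the code of github-action-benchmark, and the function names and the handling of a previous value of 0 are assumptions chosen to match the `+∞` rows.

```rust
/// Hypothetical helper: ratio of the current timing to the previous one.
/// Assumption: a previous timing that rounds to 0 ns with a non-zero current
/// timing is reported as +∞, and two zero timings count as "no change" (1.0).
fn regression_ratio(current_ns: f64, previous_ns: f64) -> f64 {
    if previous_ns == 0.0 {
        if current_ns == 0.0 {
            1.0
        } else {
            f64::INFINITY
        }
    } else {
        current_ns / previous_ns
    }
}

/// Hypothetical helper: true when the slowdown ratio is above the alert threshold.
fn exceeds_threshold(current_ns: f64, previous_ns: f64, threshold: f64) -> bool {
    regression_ratio(current_ns, previous_ns) > threshold
}
```

With a threshold of 2.0, `exceeds_threshold(100182.0, 48684.0, 2.0)` is true (ratio about 2.06), which matches the `lotr_graph_window_10/iterate edges` row. Rows with single-digit nanosecond timings are dominated by rounding, so their ratios say little about real cost.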

# Conflicts:
#	raphtory/src/db/api/view/graph.rs
…c. resolve_layer fast path when layer ids are present is gone temporarily while debugging, will bring it back. Fix node_updates_window in persistent_semantics.rs to account for the entire timestamp at the window's beginning so that properties persist properly.
…ders, and bringing back the fast path that uses these when resolving layers.
… back to Option, if it's not there then we imply STATIC_GRAPH_LAYER
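
The last commit message describes an optional layer that falls back to `STATIC_GRAPH_LAYER` when absent. A minimal sketch of that fallback pattern follows. The message is truncated, so the value being wrapped in `Option` is not known. The `usize` id type, the placeholder constant value, and the function name `layer_or_static` are all assumptions for illustration and may not match raphtory's definitions.

```rust
/// Assumption: placeholder stand-in for raphtory's STATIC_GRAPH_LAYER;
/// the real constant's type and value may differ.
const STATIC_GRAPH_LAYER: usize = 0;

/// Hypothetical helper: use the given layer id if present,
/// otherwise imply the static graph layer.
fn layer_or_static(layer_id: Option<usize>) -> usize {
    layer_id.unwrap_or(STATIC_GRAPH_LAYER)
}
```

Keeping the parameter as an `Option` lets callers that have no layer information pass `None` instead of having to know the sentinel value themselves.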