Implements the MDL (minimum description length) metric for community detection. MDL assigns a partition a score in bits; lower is better. A good grouping, one that is dense inside communities and sparse between them, compresses the network well.
Motivation
The mdl metric had a placeholder value of 0.0 and was not implemented.
Changes
In multilayer_quality_metrics.py:
Added mdl_score(). It scores each layer on its own and sums the results.
In multilayer_quality_metrics.py:
Added _mdl_single_layer(). It computes the bits for one layer: a model cost (bits to write down which community each node is in) plus a data cost (bits to describe the edges given those communities).
Scores each layer separately.
Only counts edges that stay inside a layer. Edges crossing between layers are left out.
Skips self-loops (e.g., A to A).
Handles directed graphs. A directed graph allows an edge in each direction between a pair of nodes, so the "maximum possible edges" count is adjusted to match.
In autocommunity_executor.py:
Imports mdl_score.
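For reviewers who want the model cost / data cost split in concrete form, here is a minimal sketch of a two-part code length for one layer. It is not the code in this PR: the function names, the input shapes (a node-to-community dict and a list of edge pairs), and the specific formulas (log2(k) bits per node, capacity times binary entropy per community block) are assumptions chosen for illustration.

```python
import math
from collections import Counter


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) variable; 0 at p = 0 or p = 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def mdl_single_layer_sketch(partition, edges, directed=False):
    """Illustrative code length in bits for one layer (hypothetical helper).

    partition: dict mapping node -> community label
    edges: iterable of (u, v) pairs that lie inside the layer
    """
    n = len(partition)
    if n == 0:
        return 0.0

    sizes = Counter(partition.values())
    labels = list(sizes)
    index = {label: i for i, label in enumerate(labels)}
    k = len(labels)

    # Model cost: log2(k) bits per node to name its community.
    model_bits = n * math.log2(k)

    # Count edges per community block; skip self-loops and unassigned nodes.
    block_edges = Counter()
    for u, v in edges:
        if u == v or u not in partition or v not in partition:
            continue
        r, s = index[partition[u]], index[partition[v]]
        if not directed and r > s:
            r, s = s, r
        block_edges[(r, s)] += 1

    # Data cost: each block is coded as `capacity` Bernoulli trials.
    data_bits = 0.0
    for r in range(k):
        for s in range(0 if directed else r, k):
            n_r, n_s = sizes[labels[r]], sizes[labels[s]]
            if r == s:
                capacity = n_r * (n_r - 1)
                if not directed:
                    capacity //= 2
            else:
                capacity = n_r * n_s
            if capacity == 0:
                continue
            # Clamp so parallel edges cannot push the density above 1.
            p = min(block_edges[(r, s)] / capacity, 1.0)
            data_bits += capacity * binary_entropy(p)

    return model_bits + data_bits
```

On two triangles joined by a single edge, this sketch gives the two-triangle partition fewer bits than an interleaved partition, which is the behaviour the description asks of the metric.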
Testing
make test
Breaking Changes
None.
Known Limitations
Cross-layer edges are not handled.
A node that appears in several layers is handled once per layer.
The model cost is a simplified version. It does not add a separate cost for encoding the community structure itself, or any other more complex terms.
Needs tests for mdl_score before merge.
The PR adds ~151 lines for MDL scoring but no tests in the changed files list. At minimum, add tests for empty partitions, single-layer graphs, multi-layer graphs, directed graphs, singleton communities, missing partition nodes, and inter-layer edges.
Partial partitions can be unfairly rewarded.
mdl_score builds layer partitions only from partition.items(), and _mdl_single_layer sets n = len(layer_partition). Edges whose endpoints are missing from the partition are skipped. This means an algorithm returning only a subset of nodes may get a smaller MDL simply because unassigned nodes and their edges are ignored.
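One possible guard is to detect edge endpoints that have no community before scoring, and then either reject the partition or penalize it. The sketch below assumes nodes are hashable values such as (node_id, layer) tuples and that the partition is a dict; both helper names are hypothetical and not part of this PR.

```python
def unassigned_endpoints(edges, partition):
    """Return the set of nodes that appear in an edge but have no community."""
    missing = set()
    for u, v in edges:
        if u not in partition:
            missing.add(u)
        if v not in partition:
            missing.add(v)
    return missing


def require_full_partition(edges, partition):
    """Raise ValueError if any edge endpoint is missing from the partition."""
    missing = unassigned_endpoints(edges, partition)
    if missing:
        sample = sorted(missing, key=repr)[:5]
        raise ValueError(
            f"{len(missing)} node(s) with edges have no community, e.g. {sample}"
        )
```

Rejecting is the strictest option; assigning each unassigned node to its own singleton community before scoring would be a softer alternative.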
Inter-layer edges are ignored.
The code only appends edges when u_layer == v_layer, so cross-layer coupling edges do not affect MDL. That may be acceptable for an intra-layer-only score, but it is surprising for a multilayer MDL metric used by AutoCommunity.
Parallel/weighted edges are not properly modeled.
The implementation clamps p to 1.0 when edge counts exceed simple-graph capacity. That prevents nan, but it silently turns multiedge density overflow into a perfect-density block. If py3plex can hold multigraph or weighted edges, MDL should either reject those inputs, collapse them explicitly, or use a multigraph/weighted likelihood.
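Of the three options, collapsing is the simplest to make explicit. A minimal sketch, assuming edges arrive as (u, v) pairs with hashable nodes; the helper name is hypothetical:

```python
def collapse_parallel_edges(edges, directed=False):
    """Return the edges with duplicates removed, keeping first occurrences.

    For undirected input, (u, v) and (v, u) count as the same edge.
    """
    seen = set()
    simple = []
    for u, v in edges:
        key = (u, v) if directed else frozenset((u, v))
        if key in seen:
            continue
        seen.add(key)
        simple.append((u, v))
    return simple
```

Calling this before scoring makes the simple-graph assumption visible at the call site instead of hiding it behind the clamp, and comparing the input and output lengths shows how many edges were dropped.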
Performance can degrade for highly fragmented partitions.
_mdl_single_layer loops over all community pairs per layer. If a partition has many singleton communities, this becomes roughly quadratic in the number of communities per layer. Probably okay for small graphs, but worth noting or testing on expected AutoCommunity graph sizes.
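If the data cost of a block with zero edges is zero bits (true for an entropy-style cost, but an assumption about this PR's formula), only blocks that actually contain edges need to be visited, and there are at most as many of those as there are edges. A minimal sketch of that counting step, with a hypothetical helper name:

```python
from collections import Counter


def nonempty_block_counts(partition, edges, directed=False):
    """Count edges per community block, storing only blocks that have edges.

    partition: dict mapping node -> community label
    edges: iterable of (u, v) pairs inside one layer
    """
    counts = Counter()
    for u, v in edges:
        # Skip self-loops and edges with an unassigned endpoint.
        if u == v or u not in partition or v not in partition:
            continue
        cu, cv = partition[u], partition[v]
        key = (cu, cv) if directed else frozenset((cu, cv))
        counts[key] += 1
    return counts
```

Iterating over the returned counter costs time proportional to the number of edges rather than the square of the number of communities.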
Once these are addressed, we will have a solid version.