evoc.label_propagation.label_propagation_init
- evoc.label_propagation.label_propagation_init(graph, n_label_prop_iter=20, n_embedding_epochs=50, approx_n_parts=512, n_components=2, scaling=0.1, random_scale=1.0, noise_level=0.5, random_state=None, data=None, recursive_init=True, base_init='pca', base_init_threshold=64, upscaling='partition_expander')
Initialize a node embedding using label propagation on a sparse graph.
This function provides a high-quality initialization for node embeddings by combining graph-based label propagation with hierarchical partitioning. For large graphs, it recursively partitions the data and upscales the results. For small graphs, it applies the method selected by base_init directly (PCA, random, spectral embedding, or MDS).
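To make the general idea of propagating values over a sparse graph concrete, here is a minimal sketch of label propagation by neighbor averaging. It is an illustration only, not EVoC's internal routine: the function name `propagate_coordinates`, its arguments, and the clamping scheme are assumptions introduced for this example.

```python
import numpy as np
import scipy.sparse as sp


def propagate_coordinates(graph, seed_coords, seed_mask, n_iter=20):
    """Spread seed coordinates over a sparse graph by repeated neighbor averaging.

    Hypothetical helper for illustration; not part of EVoC.
    """
    graph = sp.csr_matrix(graph, dtype=np.float64)
    # Row-normalize so each step replaces a node's position with the
    # weighted mean of its neighbors' positions.
    degree = np.asarray(graph.sum(axis=1)).ravel()
    degree[degree == 0] = 1.0  # avoid division by zero for isolated nodes
    transition = sp.diags(1.0 / degree) @ graph

    coords = np.zeros((graph.shape[0], seed_coords.shape[1]))
    coords[seed_mask] = seed_coords
    for _ in range(n_iter):
        coords = transition @ coords
        coords[seed_mask] = seed_coords  # clamp the seeds after every step
    return coords


# A path graph 0-1-2-3-4 with the two endpoints pinned at x=0 and x=1.
rows = np.array([0, 1, 1, 2, 2, 3, 3, 4])
cols = np.array([1, 0, 2, 1, 3, 2, 4, 3])
path = sp.csr_matrix((np.ones(8), (rows, cols)), shape=(5, 5))
seed_mask = np.array([True, False, False, False, True])
seed_coords = np.array([[0.0], [1.0]])
coords = propagate_coordinates(path, seed_coords, seed_mask, n_iter=200)
```

On this toy path graph the interior nodes settle at evenly spaced positions between the two pinned endpoints, which is the behavior that makes propagated coordinates a reasonable starting layout.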
- Parameters:
graph (scipy.sparse matrix) – A sparse adjacency or weighted graph matrix representing connectivity.
n_label_prop_iter (int, default=20) – Number of label propagation iterations to perform on the graph.
n_embedding_epochs (int, default=50) – Number of epochs when using node embedding for upscaling.
approx_n_parts (int, default=512) – Approximate number of partitions to create for recursive partitioning of large graphs. Useful for controlling memory and computation.
n_components (int, default=2) – The number of dimensions in the output embedding.
scaling (float, default=0.1) – Scaling factor applied to label propagation distances.
random_scale (float, default=1.0) – Scaling factor for random noise in the initialization.
noise_level (float, default=0.5) – The noise level parameter passed to node embedding algorithms.
random_state (RandomState instance or None, default=None) – Controls the randomness of the algorithm. If None, uses system randomness.
data (array-like of shape (n_samples, n_features) or None, default=None) – The original data array. Required if base_init='pca'. Used for direct initialization methods on small graphs.
recursive_init (bool, default=True) – If True, uses recursive partitioning for large graphs. If False, applies the base initialization method directly.
base_init ({'pca', 'random', 'spectral', 'mds'}, default='pca') – The initialization method to use for small graphs (when graph size is below base_init_threshold). 'pca' requires the data parameter.
base_init_threshold (int, default=64) – The size threshold below which the base_init method is used directly. Larger graphs use recursive partitioning when recursive_init is True.
upscaling ({'partition_expander', 'node_embedding'}, default='partition_expander') – The method to use when upscaling partitions back to the full graph. 'partition_expander' uses a fast expansion method; 'node_embedding' uses full node embedding (slower but potentially better quality).
- Returns:
embedding – The initialized node embedding based on label propagation and graph structure.
- Return type:
array-like of shape (n_vertices, n_components)
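A hedged usage sketch follows. It assumes EVoC is installed and that the signature matches the one documented above. The helpers `build_knn_graph` and `initialize_embedding` are hypothetical names introduced for this example, and the brute-force neighbor search is only suitable for small inputs. The documentation above does not state which sparse format or dtype the function expects, so the CSR/float32 choice here is an assumption.

```python
import numpy as np
import scipy.sparse as sp


def build_knn_graph(data, n_neighbors=5):
    """Brute-force symmetric k-nearest-neighbor graph (hypothetical helper)."""
    n_samples = data.shape[0]
    sq_dists = ((data[:, None, :] - data[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(sq_dists, np.inf)  # exclude self-edges
    neighbors = np.argsort(sq_dists, axis=1)[:, :n_neighbors]
    rows = np.repeat(np.arange(n_samples), n_neighbors)
    graph = sp.csr_matrix(
        (np.ones(rows.size, dtype=np.float32), (rows, neighbors.ravel())),
        shape=(n_samples, n_samples),
    )
    # Keep an edge if either endpoint lists the other as a neighbor.
    return graph.maximum(graph.T).tocsr()


def initialize_embedding(graph, data, seed=42):
    """Call label_propagation_init with documented keyword arguments."""
    # Imported lazily so the graph helper above works without EVoC installed.
    from evoc.label_propagation import label_propagation_init

    return label_propagation_init(
        graph,
        n_components=2,
        data=data,  # required because base_init='pca'
        base_init="pca",
        random_state=np.random.RandomState(seed),
    )


rng = np.random.RandomState(0)
data = rng.normal(size=(200, 8)).astype(np.float32)
graph = build_knn_graph(data, n_neighbors=5)
```

With EVoC available, `initialize_embedding(graph, data)` should return an array of shape (n_vertices, n_components), here (200, 2), according to the return type documented above.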