evoc.node_embedding.node_embedding

evoc.node_embedding.node_embedding(graph, n_components, n_epochs, initial_embedding=None, initial_alpha=0.5, negative_sample_rate=1.0, noise_level=0.5, random_state=None, reproducible_flag=True, verbose=False, tqdm_kwds={})

Learn a low-dimensional embedding of a graph using a UMAP-like algorithm.

This function uses stochastic gradient descent to learn a low-dimensional embedding of the graph structure. Positive samples (connected edges) and negative samples (randomly drawn) both guide the optimization.

Parameters:
  • graph (scipy.sparse matrix, typically csr_matrix or csc_matrix) – A sparse adjacency matrix representing the graph. The weights in the matrix represent connection strengths between nodes.

  • n_components (int) – The number of dimensions in the output embedding.

  • n_epochs (int) – The number of epochs to train the embedding.

  • initial_embedding (array-like of shape (n_vertices, n_components) or None, default=None) – An initial embedding to use as a starting point. If None, a random embedding is generated from a normal distribution with scale 0.25.

  • initial_alpha (float, default=0.5) – The initial learning rate. The learning rate decays linearly over epochs.

  • negative_sample_rate (float, default=1.0) – The rate at which negative samples are drawn relative to positive samples, which sets the ratio of negative to positive updates per epoch.

  • noise_level (float, default=0.5) – Controls the strength of noise in the gradient computation. Higher values tolerate larger distances in the embedding space before a penalty is applied.

  • random_state (RandomState instance or None, default=None) – Random state for reproducibility. If None, uses system randomness.

  • reproducible_flag (bool, default=True) – If True, uses a deterministic (but slower) update strategy that processes nodes in blocks for reproducibility. If False, uses a faster stochastic approach.

  • verbose (bool, default=False) – If True, display a progress bar during training.

  • tqdm_kwds (dict, default={}) – Additional keyword arguments to pass to tqdm for progress bar customization.
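
To make the interaction between initial_alpha, n_epochs and negative_sample_rate more concrete, here is a minimal sketch. It assumes the learning rate decays linearly to zero by the final epoch, in the style of UMAP; the exact schedule used internally is not stated above, and both helper names are hypothetical and not part of evoc.

```python
def linear_alpha(initial_alpha, epoch, n_epochs):
    """Learning rate at a given epoch, assuming a linear decay towards zero."""
    return initial_alpha * (1.0 - epoch / n_epochs)


def expected_negative_updates(n_positive_updates, negative_sample_rate):
    """Approximate number of negative updates for a number of positive updates."""
    return n_positive_updates * negative_sample_rate


# Learning rate for each of four epochs, starting from the default of 0.5.
schedule = [linear_alpha(0.5, epoch, 4) for epoch in range(4)]

# With the default rate of 1.0, negative and positive updates are balanced.
balanced = expected_negative_updates(100, 1.0)
```

Under this assumed schedule, a larger n_epochs lowers the learning rate more gradually, while negative_sample_rate scales how much repulsion is applied for each attractive update.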

Returns:

embedding – The learned low-dimensional embedding of the graph vertices.

Return type:

array-like of shape (n_vertices, n_components)
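
Example

The following is a hedged usage sketch rather than an official example. It builds a small ring graph with SciPy and passes it to node_embedding using the signature documented above. The helpers ring_graph and embed_ring are hypothetical names introduced for illustration, and the float32 edge weights are an assumption rather than a documented requirement.

```python
import numpy as np
import scipy.sparse as sp


def ring_graph(n_vertices):
    """Build a symmetric, unit-weight adjacency matrix for a ring (n_vertices >= 3)."""
    rows = np.arange(n_vertices)
    cols = (rows + 1) % n_vertices
    # float32 weights are an assumption, not a documented requirement.
    weights = np.ones(n_vertices, dtype=np.float32)
    half = sp.coo_matrix((weights, (rows, cols)), shape=(n_vertices, n_vertices))
    # Add the transpose so every edge is stored in both directions.
    return (half + half.T).tocsr()


def embed_ring(n_vertices=100, n_components=2, n_epochs=50, seed=42):
    """Embed the ring graph; requires evoc to be installed."""
    # Imported lazily so ring_graph can be used without evoc.
    from evoc.node_embedding import node_embedding

    graph = ring_graph(n_vertices)
    return node_embedding(
        graph,
        n_components,
        n_epochs,
        random_state=np.random.RandomState(seed),
    )
```

Calling embed_ring() should return an array of shape (n_vertices, n_components), as described under Returns. Passing a seeded RandomState and leaving reproducible_flag at its default of True is intended to make repeated runs comparable.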