Benchmarking Robustness

Robustness is also a critical aspect of evaluating network alignment (NA) algorithms, especially on noisy real-world data. This tutorial shows how to benchmark the robustness of an NA algorithm in PlanetAlign against several types of noise that occur in practice, such as:

  • Edge noise (randomly added or removed edges)

  • Attribute noise (corrupted node features)

  • Supervision noise (incorrect anchor links)

Each type of noise is injected in a controlled way into the corresponding part of the input:

  1. Graph structure (edge noise)

  2. Node features (attribute noise)

  3. Supervision (supervision noise)

PlanetAlign.utils provides utility functions for these perturbations. They are compatible with both the built-in datasets and custom datasets derived from BaseData, so you can evaluate model performance under varying levels of noise.

Adding Edge Noise

To simulate structural perturbation, you can add random edges to one or both graphs using:

add_edge_noises(dataset: Dataset, noise_ratio: float, gids: int | List[int] | Tuple[int, ...] | None = None, seed: int | None = None, inplace: bool = False) -> Dataset

Add structural noise to graphs in a PlanetAlign dataset by perturbing edges.

Parameters:
  • dataset (PyG dataset) – The input dataset containing graphs.

  • noise_ratio (float) – The ratio of edges to perturb in each graph.

  • gids (int, list of int, or tuple of int) – The graph IDs to perturb. If None, all graphs will be perturbed.

  • seed (int, optional) – Random seed for reproducibility.

  • inplace (bool, optional) – If True, modify the dataset in place. Otherwise, return a new dataset.

Returns:

The dataset with perturbed edges.

Return type:

PyG dataset

Here is an example that perturbs G1 and G2 with 10% edge noise:

from PlanetAlign.datasets import Douban
from PlanetAlign.utils import add_edge_noises

data = Douban()

# Add edge noise to both graphs with noise_ratio = 0.1
data = add_edge_noises(data, noise_ratio=0.1, gids=[0, 1], inplace=False)

# Check the new number of edges
print("G1 edges (after noise):", data.pyg_graphs[0].num_edges)
print("G2 edges (after noise):", data.pyg_graphs[1].num_edges)

Note

  • noise_ratio ∈ [0, 1] determines the fraction of new random edges added.

  • gids specifies which graphs (by index) to perturb (e.g., [0], [1], or [0, 1]).

  • inplace defaults to False, which returns a new dataset object with noise applied. Set inplace=True to modify the original dataset directly.
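
To make the meaning of the noise ratio concrete, here is a minimal sketch of additive edge noise on a plain edge list. It is not PlanetAlign's implementation: the helper add_random_edges and its arguments are hypothetical, it assumes an undirected graph without self-loops, and it assumes the ratio is measured against the number of existing edges. The library may use a different convention.

```python
import random


def add_random_edges(edges, num_nodes, noise_ratio, seed=None):
    """Return a new undirected edge list with extra random edges.

    Hypothetical helper: adds round(noise_ratio * len(edges)) edges that
    are not already present, capped at the size of the complete graph.
    """
    rng = random.Random(seed)
    existing = {(min(u, v), max(u, v)) for u, v in edges}
    max_edges = num_nodes * (num_nodes - 1) // 2
    target = min(len(existing) + round(noise_ratio * len(existing)), max_edges)
    noisy = set(existing)
    while len(noisy) < target:
        u, v = rng.sample(range(num_nodes), 2)  # two distinct nodes, so no self-loops
        noisy.add((min(u, v), max(u, v)))
    return sorted(noisy)


# A 10-node ring with 10 edges; 20% noise adds 2 new edges.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5),
         (5, 6), (6, 7), (7, 8), (8, 9), (0, 9)]
noisy = add_random_edges(edges, num_nodes=10, noise_ratio=0.2, seed=0)
print(len(edges), "->", len(noisy))
```

Fixing the seed makes the perturbation repeatable, which matters when the same noisy graph must be shared across several algorithms.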

Adding Attribute Noise

To test the model’s robustness to noisy node features, you can use:

add_attr_noises(dataset: Dataset, mode: str, noise_ratio: float, gids: int | List[int] | Tuple[int, ...] | None = None, seed: int | None = None, inplace: bool = False) -> Dataset

Add attribute noise to graphs in a PlanetAlign dataset by perturbing node attributes.

Parameters:
  • dataset (PyG dataset) – The input dataset containing graphs.

  • mode (str) – The mode of noise to add. Options are 'flip' or 'gaussian'.

  • noise_ratio (float) – The ratio of attributes to perturb in each graph.

  • gids (int, list of int, or tuple of int) – The graph IDs to perturb. If None, all graphs will be perturbed.

  • seed (int, optional) – Random seed for reproducibility.

  • inplace (bool, optional) – If True, modify the dataset in place. Otherwise, return a new dataset.

Returns:

The dataset with perturbed attributes.

Return type:

PyG dataset

This randomly corrupts a proportion of the node features, either by flipping binary values or by adding Gaussian noise, depending on mode. For example:

from PlanetAlign.datasets import Douban
from PlanetAlign.utils import add_attr_noises

data = Douban()

# Add 20% attribute noise to both graphs by flipping binary attributes
data = add_attr_noises(data, mode='flip', noise_ratio=0.2, gids=[0, 1], inplace=False)

# Check that node features are still the same shape
print("X1 shape:", data.X1.shape)
print("X2 shape:", data.X2.shape)

Note

  • Attribute noise only applies if node features (X1 and/or X2) are present.

  • For binary features, use mode='flip' to randomly flip bits.

  • For continuous features, use mode='gaussian' to add Gaussian noise.
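
The difference between the two modes can be sketched on a small feature matrix. This is an illustration only, not the library's code: perturb_attributes and its sigma argument are hypothetical, and it assumes that noise_ratio selects a fraction of individual entries in both modes, which PlanetAlign may define differently.

```python
import random


def perturb_attributes(features, mode, noise_ratio, sigma=1.0, seed=None):
    """Return a perturbed copy of a feature matrix (list of rows).

    Hypothetical helper: picks round(noise_ratio * num_entries) entries;
    'flip' inverts a 0/1 value, 'gaussian' adds N(0, sigma^2) noise.
    """
    if mode not in ("flip", "gaussian"):
        raise ValueError(f"unknown mode: {mode!r}")
    rng = random.Random(seed)
    noisy = [list(row) for row in features]  # copy, so the input is untouched
    positions = [(i, j) for i, row in enumerate(noisy) for j in range(len(row))]
    for i, j in rng.sample(positions, round(noise_ratio * len(positions))):
        if mode == "flip":
            noisy[i][j] = 1 - noisy[i][j]
        else:
            noisy[i][j] += rng.gauss(0.0, sigma)
    return noisy


# 5 nodes with 4 binary features each: 20 entries, so 20% noise touches 4 of them.
X = [[0, 1, 0, 1], [1, 1, 0, 0], [0, 0, 1, 1], [1, 0, 1, 0], [0, 1, 1, 0]]
flipped = perturb_attributes(X, mode="flip", noise_ratio=0.2, seed=0)
jittered = perturb_attributes(X, mode="gaussian", noise_ratio=0.2, sigma=0.1, seed=0)

changed = sum(a != b for r0, r1 in zip(X, flipped) for a, b in zip(r0, r1))
print("flipped entries:", changed)
```

Flipping keeps binary features binary, whereas Gaussian noise on binary features would produce values outside {0, 1}, which is why the mode should match the feature type.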

Adding Supervision Noise

To simulate incorrect alignment supervision (e.g., noisy anchors), use:

# Add supervision noise to graphs in a PlanetAlign dataset by injecting noisy anchors.
add_sup_noises(dataset, noise_ratio, src_gid=0, dst_gid=1, seed=None, inplace=False)

Parameters

  • dataset (Dataset) – The input dataset containing the graphs and ground-truth alignment.

  • noise_ratio (float) – The ratio of supervision to perturb (value between 0 and 1).

  • src_gid (int, optional) – The graph ID of the source graph. Default is 0.

  • dst_gid (int, optional) – The graph ID of the destination graph. Default is 1.

  • seed (int, optional) – Random seed for reproducibility.

  • inplace (bool, optional) – If True, modify the dataset in place. Otherwise, return a new dataset with supervision noise applied.

Returns

  • Dataset – A PyG dataset with perturbed supervision (modified anchors).

Return type

  • Dataset

This randomly replaces a proportion of training anchor pairs with mismatched ones:

from PlanetAlign.datasets import Douban
from PlanetAlign.utils import add_sup_noises

data = Douban()

# Add 30% supervision noise to the training anchor links
data = add_sup_noises(data, noise_ratio=0.3, inplace=False)

Note

  • noise_ratio controls the fraction of anchor pairs corrupted.

  • This helps test how models perform when supervision is imperfect or partially mislabeled.
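
As a rough picture of what a corrupted anchor set looks like, the sketch below replaces the target of a fraction of anchor pairs with a different node. It is a stand-in under stated assumptions, not PlanetAlign's implementation: corrupt_anchors is a hypothetical helper, anchors are (source, target) index pairs, and a corrupted pair keeps its source node.

```python
import random


def corrupt_anchors(anchors, num_dst_nodes, noise_ratio, seed=None):
    """Return a copy of the anchors with a fraction of targets replaced.

    Hypothetical helper: round(noise_ratio * len(anchors)) pairs get a
    target drawn uniformly from all destination nodes except the true one.
    Requires num_dst_nodes >= 2.
    """
    rng = random.Random(seed)
    noisy = list(anchors)
    num_corrupt = round(noise_ratio * len(anchors))
    for idx in rng.sample(range(len(anchors)), num_corrupt):
        src, dst = noisy[idx]
        wrong = rng.randrange(num_dst_nodes - 1)
        if wrong >= dst:
            wrong += 1  # skip the true target so the pair is guaranteed to be wrong
        noisy[idx] = (src, wrong)
    return noisy


# 10 correct anchors (node i matches node i); 30% noise corrupts 3 of them.
anchors = [(i, i) for i in range(10)]
noisy_anchors = corrupt_anchors(anchors, num_dst_nodes=20, noise_ratio=0.3, seed=0)
num_wrong = sum(a != b for a, b in zip(anchors, noisy_anchors))
print("corrupted anchors:", num_wrong)
```

Only the training anchors should be corrupted in an experiment like this; the test alignment stays clean so that the reported metric still measures true accuracy.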

Best Practices

  • Run multiple trials per noise level to reduce variance in evaluation.

  • Plot metric degradation (e.g., MRR, Hits@K) versus noise rate to analyze robustness curves.

  • Vary only one noise source at a time (e.g., edge vs. attribute) to isolate effects.
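
These practices can be combined into a small sweep harness, sketched below. The function run_trial is a placeholder: in a real experiment it would perturb the dataset with the given ratio and seed, train the aligner, and return a metric such as MRR. Here it returns a synthetic number purely so the loop runs, and those numbers are not measurements of any algorithm.

```python
import random
import statistics


def run_trial(noise_ratio, seed):
    """Placeholder for one experiment at a given noise level.

    Replace the body with: perturb the dataset (one noise source only),
    train the aligner, and return the evaluation metric.
    The value below is synthetic and only keeps the sketch runnable.
    """
    rng = random.Random(seed)
    return max(0.0, 0.9 - 0.5 * noise_ratio + rng.gauss(0.0, 0.01))


def robustness_curve(noise_ratios, num_trials=5):
    """Run several seeded trials per noise level and summarize each level."""
    curve = {}
    for ratio in noise_ratios:
        scores = [run_trial(ratio, seed) for seed in range(num_trials)]
        curve[ratio] = (statistics.mean(scores), statistics.stdev(scores))
    return curve


curve = robustness_curve([0.0, 0.1, 0.2, 0.3])
for ratio, (mean, std) in curve.items():
    print(f"noise_ratio={ratio:.1f}  score={mean:.3f} +/- {std:.3f}")
```

Reusing the same seeds at every noise level keeps the comparison paired, so differences between levels reflect the noise rather than the random initialization.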

Summary

In this tutorial, we showed how to:

  • Inject edge noise using add_edge_noises()

  • Add attribute noise using add_attr_noises()

  • Simulate supervision noise using add_sup_noises()

These utilities allow you to evaluate how sensitive NA algorithms are to real-world imperfections. Next, consider benchmarking robustness across multiple datasets or noise regimes for more comprehensive analysis.