This is the official implementation of our ICML'25 paper Beyond Message Passing: Neural Graph Pattern Machine. GPM is a step towards a next-generation graph learning backbone that moves beyond traditional message passing.
- 🔍 Direct learning from graph substructures instead of message passing
- 🚀 Enhanced ability to capture long-range dependencies
- 💡 Efficient extraction and encoding of task-relevant graph patterns
- 🎯 Superior expressivity in handling complex graph structures
GPM's workflow consists of three main steps:
- Pattern extraction using random walk tokenizer
- Pattern encoding via sequential modeling
- Pattern processing through transformer encoder for downstream tasks
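To make these three steps concrete, below is a minimal, self-contained sketch of the pipeline in plain PyTorch. It is illustrative only: the repository's actual tokenizer and encoders live under GPM/data and GPM/model, and names such as walk_patterns and GPMSketch are hypothetical.

```python
# Minimal GPM-style sketch (illustrative, not the repository's implementation).
import random
import torch
import torch.nn as nn

def walk_patterns(adj, start, num_patterns, walk_len):
    """Step 1: extract patterns as fixed-length random walks rooted at `start`."""
    patterns = []
    for _ in range(num_patterns):
        walk, node = [start], start
        for _ in range(walk_len - 1):
            node = random.choice(adj[node])  # assumes every node has a neighbor
            walk.append(node)
        patterns.append(walk)
    return patterns

class GPMSketch(nn.Module):
    """Steps 2-3: encode each walk sequentially, then mix pattern embeddings with a transformer."""
    def __init__(self, feat_dim, hidden_dim=64, heads=4, num_layers=2, num_classes=2):
        super().__init__()
        self.proj = nn.Linear(feat_dim, hidden_dim)
        # Step 2: sequential pattern encoder (GRU variant; mean/transformer are alternatives)
        self.pattern_encoder = nn.GRU(hidden_dim, hidden_dim, batch_first=True)
        # Step 3: transformer encoder over the bag of pattern embeddings
        layer = nn.TransformerEncoderLayer(hidden_dim, heads, batch_first=True)
        self.pattern_mixer = nn.TransformerEncoder(layer, num_layers)
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, x, patterns):
        # x: [num_nodes, feat_dim]; patterns: list of equal-length walks (lists of node ids)
        walks = torch.tensor(patterns)                # [num_patterns, walk_len]
        seq = self.proj(x)[walks]                     # [num_patterns, walk_len, hidden]
        _, h_n = self.pattern_encoder(seq)            # h_n: [1, num_patterns, hidden]
        z = self.pattern_mixer(h_n[-1].unsqueeze(0))  # [1, num_patterns, hidden]
        return self.head(z.mean(dim=1))               # one prediction for the target instance

# Toy usage on a 4-node graph
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
x = torch.randn(4, 8)
pats = walk_patterns(adj, start=0, num_patterns=4, walk_len=3)
model = GPMSketch(feat_dim=8)
print(model(x, pats).shape)  # torch.Size([1, 2])
```

Under the --multiscale option, walks of several lengths would be mixed; the sketch keeps a single fixed walk length for brevity.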
- CUDA-compatible GPU (24GB memory minimum, 48GB recommended)
- CUDA 12.1
- Python 3.9+
# Create and activate conda environment
conda env create -f environment.yml
conda activate GPM
# Install DGL
pip install dgl -f https://data.dgl.ai/wheels/torch-2.4/cu121/repo.html
# Install PyG dependencies
pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-2.4.0+cu121.html

The code of GPM lives in the GPM/ folder. You can run main.py with any supported dataset to launch experiments. To ensure reproducibility, we provide tuned hyper-parameters in config/main.yaml; add --use_params to apply them.
# Run with default parameters
python GPM/main.py --dataset computers --use_params

Supported datasets:
- Node Classification: cora_full, computers, arxiv, products, wikics, deezer, blog, flickr, flickr_small
- Link Prediction: link-cora, link-pubmed, link-collab
- Graph Classification: imdb-b, collab, reddit-m5k, reddit-m12k
- Graph Regression: zinc, zinc_full
We also provide interfaces for other widely used datasets such as photo, physics, and reddit. Please check GPM/data/pyg_data_loader.py for details.
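For reference, photo, physics, and reddit correspond to standard PyTorch Geometric benchmarks (Amazon-Photo, Coauthor-Physics, and Reddit). A minimal way to fetch them directly with torch_geometric, independent of the repository's own loader, looks like this:

```python
# Illustrative only: download the underlying PyG benchmarks directly.
# The repository's loader (GPM/data/pyg_data_loader.py) may apply its own preprocessing and splits.
from torch_geometric.datasets import Amazon, Coauthor, Reddit

photo = Amazon(root="data/Amazon", name="Photo")          # "photo"
physics = Coauthor(root="data/Coauthor", name="Physics")  # "physics"
reddit = Reddit(root="data/Reddit")                       # "reddit" (large download)

print(photo[0])  # a torch_geometric.data.Data object with x, edge_index, y
```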
- --use_params: Use tuned hyperparameters
- --dataset: Target dataset name
- --epochs: Number of training epochs
- --batch_size: Batch size
- --lr: Learning rate
- --split: Data split strategy (public, low, median, high)
- --hidden_dim: Hidden layer dimension
- --heads: Number of attention heads
- --num_layers: Number of Transformer layers
- --dropout: Dropout rate
- --num_patterns: Number of patterns per instance
- --pattern_size: Pattern size (random walk length)
- --multiscale: Range of walk lengths
- --pattern_encoder: Pattern encoder type (transformer, mean, gru)
For complete configuration options, please refer to our code documentation.
Run domain adaptation experiments using:
python GPM/da.py --source acm --target dblp --use_params

Supported domain pairs:
- acm -> dblp, dblp -> acm
- DE -> {EN, ES, FR, PT, RU}
GPM/
├── GPM/                  # Main package directory
│   ├── data/             # Data loading and preprocessing
│   ├── model/            # Model architectures
│   ├── task/             # Task implementations
│   ├── utils/            # Utility functions
│   ├── main.py           # Main training script
│   └── da.py             # Domain adaptation script
├── config/               # Configuration files
├── assets/               # Images and assets
├── data/                 # Dataset storage
├── patterns/             # Extracted graph patterns
└── environment.yml       # Conda environment spec
If you find this work useful, please cite our paper:
@inproceedings{wang2025gpm,
title={Beyond Message Passing: Neural Graph Pattern Machine},
author={Wang, Zehong and Zhang, Zheyuan and Ma, Tianyi and Chawla, Nitesh V and Zhang, Chuxu and Ye, Yanfang},
booktitle={Forty-Second International Conference on Machine Learning},
year={2025},
}
@article{wang2025neural,
title={Neural Graph Pattern Machine},
author={Wang, Zehong and Zhang, Zheyuan and Ma, Tianyi and Chawla, Nitesh V and Zhang, Chuxu and Ye, Yanfang},
journal={arXiv preprint arXiv:2501.18739},
year={2025}
}

For questions, please contact zwang43@nd.edu or open an issue.
This repository builds upon the excellent work from:
We thank these projects for their valuable contributions to the field.


