CogDL: A Comprehensive Library for Graph Deep Learning.
CogDL is a graph deep learning toolkit that allows researchers and developers to easily train and compare baseline or customized models for node classification, graph classification, and other important tasks in the graph domain.
We summarize the contributions of CogDL as follows:
- Efficiency: CogDL uses well-optimized operators to speed up training and reduce the GPU memory usage of GNN models.
- Ease of Use: CogDL provides easy-to-use APIs for running experiments on the given models and datasets, with support for hyper-parameter search.
- Extensibility: The design of CogDL makes it easy to apply GNN models to new scenarios based on our framework.
❗ News
- The CogDL paper was accepted by WWW 2023. Find us at WWW 2023! The new v0.6 release adds more examples of graph self-supervised learning, including GraphMAE, GraphMAE2, and BGRL.
- A free GNN course provided by the CogDL team is available. We also provide a discussion forum for Chinese users.
- The v0.5.3 release supports mixed-precision training by setting fp16=True and provides a basic example written in Jittor. It also updates the tutorial in the documentation, fixes the download links of some datasets, and fixes potential bugs in operators.
Getting Started
Requirements and Installation
- Python version >= 3.7
- PyTorch version >= 1.7.1
Please follow the instructions here to install PyTorch (https://github.com/pytorch/pytorch#installation).
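A quick way to confirm that an environment meets the two requirements above is sketched below. The helper names (parse_version, meets_minimum) are made up for this example and are not part of CogDL; the script only reports what it finds and installs nothing.

```python
import importlib.util
import re
import sys

MIN_PYTHON = (3, 7)
MIN_TORCH = (1, 7, 1)


def parse_version(text):
    """Turn '1.13.1+cu117' into (1, 13, 1), ignoring any local build suffix."""
    numbers = re.findall(r"\d+", text.split("+")[0])
    return tuple(int(n) for n in numbers[:3])


def meets_minimum(version_text, minimum):
    """Return True if the dotted version string is at least `minimum`."""
    return parse_version(version_text) >= minimum


if __name__ == "__main__":
    python_ok = sys.version_info[:2] >= MIN_PYTHON
    print(f"Python {sys.version.split()[0]}: {'ok' if python_ok else 'too old'}")

    # Only import torch if it is actually installed.
    if importlib.util.find_spec("torch") is None:
        print("PyTorch is not installed")
    else:
        import torch

        torch_ok = meets_minimum(torch.__version__, MIN_TORCH)
        print(f"PyTorch {torch.__version__}: {'ok' if torch_ok else 'too old'}")
```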
When PyTorch has been installed, cogdl can be installed using pip as follows:
pip install cogdl
Install from source via:
pip install git+https://github.com/thudm/cogdl.git
Or clone the repository and install with the following commands:
git clone git@github.com:THUDM/cogdl.git
cd cogdl
pip install -e .
Usage
Command-Line Usage
Run python scripts/train.py --dataset example_dataset --model example_model to train example_model on example_dataset.
- --dataset: name of the dataset to run. Several datasets can be given, separated by spaces, e.g. cora citeseer. Supported datasets include 'cora', 'citeseer', 'pubmed', 'ppi', 'wikipedia', 'blogcatalog', 'flickr'. More datasets can be found in cogdl/datasets.
- --model: name of the model to run. Several models can be given, separated by spaces, e.g. gcn gat. Supported models include 'gcn', 'gat', 'graphsage', 'deepwalk', 'node2vec', 'hope', 'grarep', 'netmf', 'netsmf', 'prone'. More models can be found in cogdl/models.
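Because both flags accept space-separated lists, a single invocation can cover several datasets and models. The sketch below builds such a command as an argument list and enumerates the combinations it implies. build_train_command and variants are hypothetical helper names, not CogDL functions; only the script path and the --dataset and --model flags come from the text above, and pairing every dataset with every model is an assumption based on the (dataset, model) rows in the result tables.

```python
import itertools
import sys


def build_train_command(datasets, models, script="scripts/train.py"):
    """Return an argv list for one run over all given datasets and models."""
    if not datasets or not models:
        raise ValueError("need at least one dataset and one model")
    return [sys.executable, script, "--dataset", *datasets, "--model", *models]


def variants(datasets, models):
    """Enumerate the (dataset, model) pairs a multi-valued run is assumed to cover."""
    return list(itertools.product(datasets, models))


cmd = build_train_command(["cora", "citeseer"], ["gcn", "gat"])
print(" ".join(cmd[1:]))  # the part after the interpreter path
for pair in variants(["cora", "citeseer"], ["gcn", "gat"]):
    print(pair)
```

From the repository root, passing the list to subprocess.run(cmd, check=True) would launch the run.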
For example, if you want to run GCN and GAT on the Cora dataset, with 5 different seeds:
python scripts/train.py --dataset cora --model gcn gat --seed 0 1 2 3 4
Expected output:
Variant | test_acc | val_acc |
---|---|---|
('cora', 'gcn') | 0.8050±0.0047 | 0.7940±0.0063 |
('cora', 'gat') | 0.8234±0.0042 | 0.8088±0.0016 |
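
Each cell in the table summarizes the runs over all seeds. Assuming the ± notation means mean ± standard deviation across seeds (a common convention; the text does not state which deviation estimator CogDL uses), such a cell can be reproduced as follows. The accuracy values are made-up placeholders, not CogDL results.

```python
import statistics


def summarize(values, digits=4):
    """Format per-seed scores as 'mean±std' using the population standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # assumption: population, not sample, deviation
    return f"{mean:.{digits}f}±{std:.{digits}f}"


# Hypothetical test accuracies for five seeds (illustrative only).
per_seed_acc = [0.80, 0.81, 0.80, 0.81, 0.80]
print(summarize(per_seed_acc))
```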
If you have any difficulty getting the above steps to work, feel free to open an issue. You can expect a reply within 24 hours.
How to enable fast GNN training?
CogDL provides a fast sparse matrix-matrix multiplication operator called [GE-SpMM](https://arxiv.org/abs/2007.03179) to speed up training of GNN models on the GPU. The feature will be automatically used if it is available. Note that this feature is still in testing and may not work under some versions of CUDA.
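For context, sparse-dense matrix multiplication (SpMM) is the neighbor-aggregation step of many GNN layers: the sparse adjacency matrix is multiplied by the dense node-feature matrix. The pure-Python reference below shows what the operation computes on a matrix in CSR form. It is not GE-SpMM and says nothing about how that kernel is implemented; the function name csr_spmm is chosen for this example.

```python
def csr_spmm(indptr, indices, data, dense):
    """Multiply a CSR sparse matrix (n_rows x n_cols) by a dense matrix (n_cols x k)."""
    n_rows = len(indptr) - 1
    k = len(dense[0]) if dense else 0
    out = [[0.0] * k for _ in range(n_rows)]
    for row in range(n_rows):
        # Non-zeros of this row are stored at positions indptr[row]..indptr[row + 1] - 1.
        for pos in range(indptr[row], indptr[row + 1]):
            col, weight = indices[pos], data[pos]
            for j in range(k):
                out[row][j] += weight * dense[col][j]
    return out


# Adjacency matrix of the 3-node path graph 0-1-2 in CSR form.
indptr = [0, 1, 3, 4]
indices = [1, 0, 2, 1]
data = [1.0, 1.0, 1.0, 1.0]
features = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]

# Each output row is the sum of the feature rows of that node's neighbors.
print(csr_spmm(indptr, indices, data, features))
```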
How to run parallel experiments with GPUs on several models?
To run experiments in parallel on a server with multiple GPUs, pass the GPU ids with --devices. For example, to run GCN and GAT on the Cora dataset with five seeds on GPUs 0 and 1:
python scripts/train.py --dataset cora --model gcn gat --hidden-size 64 --devices 0 1 --seed 0 1 2 3 4
Expected output:
Variant | Acc |
---|---|
('cora', 'gcn') | 0.8236±0.0033 |
('cora', 'gat') | 0.8262±0.0032 |
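
This section does not say how runs are distributed over the listed devices. One simple scheme, shown here purely as an illustration and not as CogDL's actual scheduler, assigns each (dataset, model, seed) run to a device in round-robin order. assign_runs is a hypothetical helper.

```python
import itertools


def assign_runs(datasets, models, seeds, devices):
    """Map every (dataset, model, seed) run to a device id, cycling through devices."""
    if not devices:
        raise ValueError("need at least one device")
    runs = itertools.product(datasets, models, seeds)
    return {run: device for run, device in zip(runs, itertools.cycle(devices))}


plan = assign_runs(["cora"], ["gcn", "gat"], seeds=range(5), devices=[0, 1])
for run, device in plan.items():
    print(f"GPU {device}: {run}")
```

With two models, five seeds, and two devices, this yields ten runs split evenly between the GPUs.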
How to use models from other libraries?
If you are familiar with other popular graph libraries, you can implement your own model in CogDL using modules from PyTorch Geometric (PyG). To install PyG, follow its installation instructions (https://github.com/rusty1s/pytorch_geometric/#installation). For quick-start examples of using PyG layers, see [examples/pyg](https://github.com/THUDM/cogdl/tree/master/examples/pyg/).
How to make a successful pull request with a unit test?
A successful pull request needs at least (1) your model implementation and (2) a unit test.
CogDL Team
CogDL is developed and maintained by Tsinghua, ZJU, DAMO Academy, and ZHIPU.AI.
The core development team can be reached at cogdlteam@gmail.com.
Citing CogDL
Please cite our paper if you find our code or results useful for your research:
@inproceedings{cen2023cogdl,
title={CogDL: A Comprehensive Library for Graph Deep Learning},
author={Yukuo Cen and Zhenyu Hou and Yan Wang and Qibin Chen and Yizhen Luo and Zhongming Yu and Hengrui Zhang and Xingcheng Yao and Aohan Zeng and Shiguang Guo and Yuxiao Dong and Yang Yang and Peng Zhang and Guohao Dai and Yu Wang and Chang Zhou and Hongxia Yang and Jie Tang},
booktitle={Proceedings of the ACM Web Conference 2023 (WWW'23)},
year={2023}
}
from https://github.com/THUDM/CogDL