Add files using upload-large-folder tool
- LICENSE +21 -0
- adversary_examples/cifar_advexample_orig.png +0 -0
- deeprobust/graph/README.md +76 -0
- deeprobust/graph/__init__.py +1 -0
- deeprobust/graph/data/__init__.py +16 -0
- deeprobust/graph/data/dataset.py +333 -0
- deeprobust/graph/data/pyg_dataset.py +308 -0
- deeprobust/graph/data/utils.py +10 -0
- deeprobust/graph/defense/__init__.py +23 -0
- deeprobust/graph/defense/pgd.py +207 -0
- deeprobust/graph/defense/simpgcn.py +474 -0
- deeprobust/graph/defense_pyg/gat.py +100 -0
- deeprobust/graph/defense_pyg/gcn.py +110 -0
- deeprobust/graph/global_attack/base_attack.py +130 -0
- deeprobust/graph/global_attack/node_embedding_attack.py +522 -0
- deeprobust/graph/global_attack/prbcd.py +440 -0
- deeprobust/graph/rl/nipa_env.py +169 -0
- deeprobust/graph/rl/rl_s2v_config.py +57 -0
- deeprobust/graph/targeted_attack/__init__.py +9 -0
- deeprobust/graph/targeted_attack/base_attack.py +126 -0
- deeprobust/graph/targeted_attack/fga.py +124 -0
- deeprobust/graph/targeted_attack/ig_attack.py +224 -0
- deeprobust/graph/targeted_attack/nettack.py +624 -0
- deeprobust/graph/targeted_attack/rnd.py +139 -0
- deeprobust/graph/targeted_attack/sga.py +323 -0
- deeprobust/graph/targeted_attack/ugba.py +913 -0
- deeprobust/graph/utils.py +778 -0
- deeprobust/image/README.md +45 -0
- deeprobust/image/__init__.py +11 -0
- deeprobust/image/attack/Nattack.py +181 -0
- deeprobust/image/attack/fgsm.py +121 -0
- deeprobust/image/attack/onepixel.py +186 -0
- deeprobust/image/defense/AWP.py +301 -0
- deeprobust/image/defense/TherEncoding.py +203 -0
- deeprobust/image/defense/YOPO.py +410 -0
- deeprobust/image/defense/__init__.py +6 -0
- deeprobust/image/defense/base_defense.py +100 -0
- deeprobust/image/defense/fast.py +169 -0
- deeprobust/image/defense/fgsmtraining.py +227 -0
- deeprobust/image/defense/pgdtraining.py +229 -0
- deeprobust/image/defense/trades.py +241 -0
- deeprobust/image/optimizer.py +914 -0
- deeprobust/image/preprocessing/APE-GAN.py +127 -0
- deeprobust/image/preprocessing/prepare_advdata.py +62 -0
- deeprobust/image/utils.py +211 -0
- docs/graph/defense.rst +109 -0
- docs/graph/node_embedding.rst +110 -0
LICENSE
ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2020 Yaxin Li, Wei Jin, Han Xu and Jiliang Tang.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
adversary_examples/cifar_advexample_orig.png
ADDED
(binary image file)
deeprobust/graph/README.md
ADDED
@@ -0,0 +1,76 @@
# Setup
```
git clone https://github.com/DSE-MSU/DeepRobust.git
cd DeepRobust
python setup.py install
```
# Test Examples
Test GCN on a perturbed graph (5% metattack):
```
python examples/graph/test_gcn.py --dataset cora
```
Test GCN-Jaccard on a perturbed graph (5% metattack):
```
python examples/graph/test_gcn_jaccard.py --dataset cora
```
Generate an attack yourself:
```
python examples/graph/test_mettack.py --dataset cora --ptb_rate 0.05
```
For a practical example of using the deeprobust graph package, you can also refer to https://github.com/ChandlerBang/Pro-GNN.

# Full README
[click here](https://github.com/DSE-MSU/DeepRobust)

# Supported Datasets
* Cora
* Cora-ML
* Citeseer
* Pubmed
* Polblogs
* ACM: [link1](https://github.com/zhumeiqiBUPT/AM-GCN) [link2](https://github.com/Jhy1993/HAN)
* BlogCatalog: [link](https://github.com/mengzaiqiao/CAN)
* Flickr: [link](https://github.com/mengzaiqiao/CAN)
* UAI: A Unified Weakly Supervised Framework for Community Detection and Semantic Matching.
* PyTorch Geometric datasets: Amazon-Computers, Amazon-Photo, Coauthor-CS, Coauthor-Physics...

For more details, please take a look at [dataset.py](https://github.com/DSE-MSU/DeepRobust/blob/master/deeprobust/graph/data/dataset.py)

# Attack Methods
| Attack Methods | Type | Perturbation | Evasion/<br>Poisoning | Apply Domain | Paper | Code |
|--------------------|------|--------------------|-------------|-------|----|----|
| Nettack | Targeted Attack | Structure<br>Features | Both | Node Classification | [Adversarial Attacks on Neural Networks for Graph Data](https://arxiv.org/pdf/1805.07984.pdf)| [test_nettack.py](https://github.com/DSE-MSU/DeepRobust/blob/master/examples/graph/test_nettack.py) |
| FGA | Targeted Attack | Structure | Both | Node Classification | [Fast Gradient Attack on Network Embedding](https://arxiv.org/pdf/1809.02797.pdf)| [test_fga.py](https://github.com/DSE-MSU/DeepRobust/blob/master/examples/graph/test_fga.py) |
| Metattack | Global Attack | Structure<br>Features | Poisoning | Node Classification | [Adversarial Attacks on Graph Neural Networks via Meta Learning](https://openreview.net/pdf?id=Bylnx209YX) | [test_mettack.py](https://github.com/DSE-MSU/DeepRobust/blob/master/examples/graph/test_mettack.py) |
| RL-S2V | Targeted Attack | Structure | Evasion | Node Classification | [Adversarial Attack on Graph Structured Data](https://arxiv.org/pdf/1806.02371.pdf) |[test_rl_s2v.py](https://github.com/DSE-MSU/DeepRobust/blob/master/examples/graph/test_rl_s2v.py) |
| Node Embedding Attack | Global Attack | Structure | Poisoning | Node Embedding | [Adversarial Attacks on Node Embeddings via Graph Poisoning](https://arxiv.org/abs/1809.01093) | [test_node_embedding_attack.py](https://github.com/DSE-MSU/DeepRobust/blob/master/examples/graph/test_node_embedding_attack.py) |
| Baselines for Node Embedding Attack <br> Degree, eigencentrality and random | Global Attack | Structure | Poisoning | Node Embedding | [Adversarial Attacks on Node Embeddings via Graph Poisoning](https://arxiv.org/abs/1809.01093) | [test_node_embedding_attack.py](https://github.com/DSE-MSU/DeepRobust/blob/master/examples/graph/test_node_embedding_attack.py) |
| PGD, Min-max | Global Attack | Structure | Both | Node Classification | [Topology Attack and Defense for Graph Neural Networks: An Optimization Perspective](https://arxiv.org/pdf/1906.04214.pdf)|[test_pgd.py](https://github.com/DSE-MSU/DeepRobust/blob/master/examples/graph/test_pgd.py) [test_min_max.py](https://github.com/DSE-MSU/DeepRobust/blob/master/examples/graph/test_min_max.py) |
| DICE | Global Attack | Structure | Both | Node Classification | [Hiding individuals and communities in a social network](https://arxiv.org/abs/1608.00375)|[test_dice.py](https://github.com/DSE-MSU/DeepRobust/blob/master/examples/graph/test_dice.py) |
| IG-Attack | Targeted Attack | Structure<br>Features| Both | Node Classification | [Adversarial Examples on Graph Data: Deep Insights into Attack and Defense](https://arxiv.org/pdf/1903.01610.pdf)|[test_ig.py](https://github.com/DSE-MSU/DeepRobust/blob/master/examples/graph/test_ig.py) |
| NIPA | Global Attack | Structure | Poisoning | Node Classification | [Non-target-specific Node Injection Attacks on Graph Neural Networks: A Hierarchical Reinforcement Learning Approach](https://faculty.ist.psu.edu/vhonavar/Papers/www20.pdf) | [test_nipa.py](https://github.com/DSE-MSU/DeepRobust/blob/master/examples/graph/test_nipa.py) |
| RND | Targeted Attack<br>Global Attack | Structure<br>Features<br>Adding Nodes | Both | Node Classification | |[test_rnd.py](https://github.com/DSE-MSU/DeepRobust/blob/master/examples/graph/test_rnd.py) |
| SGAttack | Targeted Attack | Structure | Poisoning | Node Classification | [Adversarial Attack on Large Scale Graph](https://arxiv.org/abs/2009.03488)| [test_sga.py](https://github.com/DSE-MSU/DeepRobust/blob/master/examples/graph/test_sga.py) |

# Defense Methods
| Defense Methods | Defense Type | Apply Domain | Paper | Code |
|---------------------|--------------|--------------|------| ------|
| GCN | Victim Model | Node Classification | [Semi-Supervised Classification with Graph Convolutional Networks](https://arxiv.org/abs/1609.02907) | [test_gcn.py](https://github.com/DSE-MSU/DeepRobust/blob/master/examples/graph/test_gcn.py) |
| ChebNet | Victim Model | Node Classification | [Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering](https://arxiv.org/abs/1606.09375) | [test_chebnet.py](https://github.com/DSE-MSU/DeepRobust/blob/master/examples/graph/test_chebnet.py) |
| SGC | Victim Model | Node Classification | [Simplifying Graph Convolutional Networks](https://arxiv.org/abs/1902.07153) | [test_sgc.py](https://github.com/DSE-MSU/DeepRobust/blob/master/examples/graph/test_sgc.py) |
| GAT | Adaptive Aggregation | Node Classification | [Graph Attention Networks](https://arxiv.org/abs/1710.10903) | [test_gat.py](https://github.com/DSE-MSU/DeepRobust/blob/master/examples/graph/test_gat.py) |
| DeepWalk | Victim Model | Node Embedding | [DeepWalk: Online Learning of Social Representations](https://arxiv.org/abs/1403.6652) | [test_deepwalk.py](https://github.com/DSE-MSU/DeepRobust/blob/master/examples/graph/test_deepwalk.py) |
| Node2Vec | Victim Model | Node Embedding | [node2vec: Scalable Feature Learning for Networks](https://arxiv.org/abs/1607.00653) | [test_deepwalk.py](https://github.com/DSE-MSU/DeepRobust/blob/master/examples/graph/test_deepwalk.py) |
| RGCN | Adaptive Aggregation | Node Classification | [Robust Graph Convolutional Networks Against Adversarial Attacks](http://pengcui.thumedialab.com/papers/RGCN.pdf) | [test_rgcn.py](https://github.com/DSE-MSU/DeepRobust/blob/master/examples/graph/test_rgcn.py) |
| GCN-Jaccard | Graph Purifying | Node Classification | [Adversarial Examples on Graph Data: Deep Insights into Attack and Defense](https://arxiv.org/pdf/1903.01610.pdf)| [test_gcn_jaccard.py](https://github.com/DSE-MSU/DeepRobust/blob/master/examples/graph/test_gcn_jaccard.py) |
| GCN-SVD | Graph Purifying | Node Classification | [All You Need is Low (Rank): Defending Against Adversarial Attacks on Graphs](https://dl.acm.org/doi/pdf/10.1145/3336191.3371789?download=true) | [test_gcn_svd.py](https://github.com/DSE-MSU/DeepRobust/blob/master/examples/graph/test_gcn_svd.py) |
| Adv-training | Adversarial Training | Node Classification | |[test_adv_train_poisoning.py](https://github.com/DSE-MSU/DeepRobust/blob/master/examples/graph/test_adv_train_poisoning.py) |
| Pro-GNN | Graph Purifying | Node Classification | [Graph Structure Learning for Robust Graph Neural Network](https://arxiv.org/abs/2005.10203)|[test_prognn.py](https://github.com/DSE-MSU/DeepRobust/blob/master/examples/graph/test_prognn.py) |
| SimP-GCN | Adaptive Aggregation | Node Classification | [Node Similarity Preserving Graph Convolutional Networks](https://arxiv.org/abs/2011.09643)|[test_simpgcn.py](https://github.com/DSE-MSU/DeepRobust/blob/master/examples/graph/test_simpgcn.py) |
| MedianGCN | Adaptive Aggregation | Node Classification | [Understanding Structural Vulnerability in Graph Convolutional Networks](https://arxiv.org/abs/2108.06280)|[test_median_gcn.py](https://github.com/DSE-MSU/DeepRobust/blob/master/examples/graph/test_median_gcn.py) |
<!--| Adv-training | Adversarial Training | Node Classification | [Topology Attack and Defense for Graph Neural Networks: An Optimization Perspective](https://arxiv.org/pdf/1906.04214.pdf)|
-->
<!--| Hidden-Adv-training | Adversarial Training | Node Classification<br>Graph Classification |[To be added]|
-->
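The tables above map each method to a runnable test script. For a quick end-to-end check from Python instead of the scripts, here is a minimal sketch (assuming the install above succeeded and `/tmp/` is writable; the hyper-parameters are illustrative, not tuned):

```python
from deeprobust.graph.data import Dataset
from deeprobust.graph.defense import GCN

# load clean Cora and its train/val/test split
data = Dataset(root='/tmp/', name='cora', seed=15)
adj, features, labels = data.adj, data.features, data.labels
idx_train, idx_val, idx_test = data.idx_train, data.idx_val, data.idx_test

# train a GCN victim model and report test accuracy
model = GCN(nfeat=features.shape[1], nhid=16,
            nclass=labels.max().item() + 1, device='cpu')
model.fit(features, adj, labels, idx_train, idx_val, train_iters=200)
model.test(idx_test)
```

To evaluate a defense, swap `GCN` for one of the models in the table above (e.g., `GCNJaccard`), or feed a perturbed adjacency matrix in place of `adj`.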
deeprobust/graph/__init__.py
ADDED
@@ -0,0 +1 @@
deeprobust/graph/data/__init__.py
ADDED
@@ -0,0 +1,16 @@
from .dataset import Dataset
from .attacked_data import PtbDataset
from .attacked_data import PrePtbDataset
import warnings
try:
    from .pyg_dataset import Pyg2Dpr, Dpr2Pyg, AmazonPyg, CoauthorPyg
except ImportError as e:
    print(e)
    warnings.warn("Please install pytorch geometric if you " +
                  "would like to use the datasets from pytorch " +
                  "geometric. See details in https://pytorch-geom" +
                  "etric.readthedocs.io/en/latest/notes/installation.html")

__all__ = ['Dataset', 'PtbDataset', 'PrePtbDataset',
           'Pyg2Dpr', 'Dpr2Pyg', 'AmazonPyg', 'CoauthorPyg']
deeprobust/graph/data/dataset.py
ADDED
@@ -0,0 +1,333 @@
import numpy as np
import scipy.sparse as sp
import os.path as osp
import os
import urllib.request
import sys
import pickle as pkl
import networkx as nx
from deeprobust.graph.utils import get_train_val_test, get_train_val_test_gcn
import zipfile
import json
import platform

class Dataset():
    """Dataset class contains four citation network datasets "cora", "cora-ml", "citeseer" and "pubmed",
    and one blog dataset "polblogs". Datasets "acm", "blogcatalog", "flickr" and "uai" are also
    available. See more details in https://github.com/DSE-MSU/DeepRobust/tree/master/deeprobust/graph#supported-datasets.
    The 'cora', 'cora-ml', 'polblogs' and 'citeseer' are downloaded from https://github.com/danielzuegner/gnn-meta-attack/tree/master/data, and 'pubmed' is from https://github.com/tkipf/gcn/tree/master/gcn/data.

    Parameters
    ----------
    root : string
        root directory where the dataset should be saved.
    name : string
        dataset name, it can be chosen from ['cora', 'citeseer', 'cora_ml', 'polblogs',
        'pubmed', 'acm', 'blogcatalog', 'uai', 'flickr']
    setting : string
        there are three data split settings; it can be chosen from ['nettack', 'gcn', 'prognn'].
        The 'nettack' setting follows the nettack paper, which selects the largest connected
        component of the graph and uses 10%/10%/80% of the nodes for training/validation/test.
        The 'gcn' setting follows the gcn paper, which uses the full graph and 20 samples
        per class for training, 500 nodes for validation, and 1000
        nodes for test. (Note that the 'nettack' and 'gcn' settings do not provide fixed splits,
        i.e., different random seeds return different data splits.)
    seed : int
        random seed for splitting training/validation/test.
    require_mask : bool
        set require_mask to True to get the training, validation and test masks
        (self.train_mask, self.val_mask, self.test_mask)

    Examples
    --------
    We can first create an instance of the Dataset class and then take out its attributes.

    >>> from deeprobust.graph.data import Dataset
    >>> data = Dataset(root='/tmp/', name='cora', seed=15)
    >>> adj, features, labels = data.adj, data.features, data.labels
    >>> idx_train, idx_val, idx_test = data.idx_train, data.idx_val, data.idx_test
    """

    def __init__(self, root, name, setting='nettack', seed=None, require_mask=False):
        self.name = name.lower()
        self.setting = setting.lower()

        assert self.name in ['cora', 'citeseer', 'cora_ml', 'polblogs',
                             'pubmed', 'acm', 'blogcatalog', 'uai', 'flickr'], \
            'Currently only support cora, citeseer, cora_ml, ' + \
            'polblogs, pubmed, acm, blogcatalog, uai, flickr'
        assert self.setting in ['gcn', 'nettack', 'prognn'], "Settings should be " + \
            "chosen from ['gcn', 'nettack', 'prognn']"

        self.seed = seed
        # self.url = 'https://raw.githubusercontent.com/danielzuegner/nettack/master/data/%s.npz' % self.name
        self.url = 'https://raw.githubusercontent.com/danielzuegner/gnn-meta-attack/master/data/%s.npz' % self.name

        if platform.system() == 'Windows':
            self.root = root
        else:
            self.root = osp.expanduser(osp.normpath(root))

        self.data_folder = osp.join(root, self.name)
        self.data_filename = self.data_folder + '.npz'
        self.require_mask = require_mask

        self.require_lcc = False if setting == 'gcn' else True
        self.adj, self.features, self.labels = self.load_data()

        if setting == 'prognn':
            assert name in ['cora', 'citeseer', 'pubmed', 'cora_ml', 'polblogs', 'Flickr'], \
                "ProGNN splits only support cora, citeseer, pubmed, cora_ml, polblogs, Flickr"
            self.idx_train, self.idx_val, self.idx_test = self.get_prognn_splits()
        else:
            self.idx_train, self.idx_val, self.idx_test = self.get_train_val_test()
        if self.require_mask:
            self.get_mask()

    def get_train_val_test(self):
        """Get training, validation, test splits according to self.setting (either 'nettack' or 'gcn')."""
        if self.setting == 'nettack':
            return get_train_val_test(nnodes=self.adj.shape[0], val_size=0.1, test_size=0.8, stratify=self.labels, seed=self.seed)
        if self.setting == 'gcn':
            return get_train_val_test_gcn(self.labels, seed=self.seed)

    def get_prognn_splits(self):
        """Get the fixed training, validation and test splits used in the Pro-GNN paper."""
        url = 'https://raw.githubusercontent.com/ChandlerBang/Pro-GNN/' + \
              'master/splits/{}_prognn_splits.json'.format(self.name)
        json_file = osp.join(self.root,
                             '{}_prognn_splits.json'.format(self.name))

        if not osp.exists(json_file):
            self.download_file(url, json_file)
        with open(json_file, 'r') as f:
            idx = json.loads(f.read())
        return np.array(idx['idx_train']), \
               np.array(idx['idx_val']), np.array(idx['idx_test'])

    def load_data(self):
        print('Loading {} dataset...'.format(self.name))
        if self.name == 'pubmed':
            return self.load_pubmed()

        if self.name in ['acm', 'blogcatalog', 'uai', 'flickr']:
            return self.load_zip()

        if not osp.exists(self.data_filename):
            self.download_npz()

        adj, features, labels = self.get_adj()
        return adj, features, labels

    def download_file(self, url, file):
        print('Downloading from {} to {}'.format(url, file))
        try:
            urllib.request.urlretrieve(url, file)
        except Exception:
            raise Exception("Download failed! Make sure you have a "
                            "stable Internet connection and enter the right name")

    def download_npz(self):
        """Download the adjacency matrix npz file from self.url."""
        print('Downloading from {} to {}'.format(self.url, self.data_filename))
        try:
            urllib.request.urlretrieve(self.url, self.data_filename)
            print('Done!')
        except Exception:
            raise Exception('Download failed! Make sure you have a stable Internet connection and enter the right name')

    def download_pubmed(self, name):
        url = 'https://raw.githubusercontent.com/tkipf/gcn/master/gcn/data/'
        try:
            print('Downloading', url)
            urllib.request.urlretrieve(url + name, osp.join(self.root, name))
            print('Done!')
        except Exception:
            raise Exception('Download failed! Make sure you have a stable Internet connection and enter the right name')

    def download_zip(self, name):
        url = 'https://raw.githubusercontent.com/ChandlerBang/Pro-GNN/master/other_datasets/{}.zip'.\
                format(name)
        try:
            print('Downloading', url)
            urllib.request.urlretrieve(url, osp.join(self.root, name + '.zip'))
            print('Done!')
        except Exception:
            raise Exception('Download failed! Make sure you have a stable Internet connection and enter the right name')

    def load_zip(self):
        data_filename = self.data_folder + '.zip'
        name = self.name
        if not osp.exists(data_filename):
            self.download_zip(name)
        with zipfile.ZipFile(data_filename, 'r') as zip_ref:
            zip_ref.extractall(self.root)

        feature_path = osp.join(self.data_folder, '{0}.feature'.format(name))
        label_path = osp.join(self.data_folder, '{0}.label'.format(name))
        graph_path = osp.join(self.data_folder, '{0}.edge'.format(name))

        f = np.loadtxt(feature_path, dtype=float)
        l = np.loadtxt(label_path, dtype=int)
        features = sp.csr_matrix(f, dtype=np.float32)
        # features = torch.FloatTensor(np.array(features.todense()))
        struct_edges = np.genfromtxt(graph_path, dtype=np.int32)
        sedges = np.array(list(struct_edges), dtype=np.int32).reshape(struct_edges.shape)
        n = features.shape[0]
        sadj = sp.coo_matrix((np.ones(sedges.shape[0]), (sedges[:, 0], sedges[:, 1])), shape=(n, n), dtype=np.float32)
        # symmetrize the adjacency matrix
        sadj = sadj + sadj.T.multiply(sadj.T > sadj) - sadj.multiply(sadj.T > sadj)
        label = np.array(l)

        return sadj, features, label

    def load_pubmed(self):
        dataset = 'pubmed'
        names = ['x', 'y', 'tx', 'ty', 'allx', 'ally', 'graph']
        objects = []
        for i in range(len(names)):
            name = "ind.{}.{}".format(dataset, names[i])
            data_filename = osp.join(self.root, name)

            if not osp.exists(data_filename):
                self.download_pubmed(name)

            with open(data_filename, 'rb') as f:
                if sys.version_info > (3, 0):
                    objects.append(pkl.load(f, encoding='latin1'))
                else:
                    objects.append(pkl.load(f))

        x, y, tx, ty, allx, ally, graph = tuple(objects)

        test_idx_file = "ind.{}.test.index".format(dataset)
        if not osp.exists(osp.join(self.root, test_idx_file)):
            self.download_pubmed(test_idx_file)

        test_idx_reorder = parse_index_file(osp.join(self.root, test_idx_file))
        test_idx_range = np.sort(test_idx_reorder)

        features = sp.vstack((allx, tx)).tolil()
        features[test_idx_reorder, :] = features[test_idx_range, :]
        adj = nx.adjacency_matrix(nx.from_dict_of_lists(graph))
        labels = np.vstack((ally, ty))
        labels[test_idx_reorder, :] = labels[test_idx_range, :]
        labels = np.where(labels)[1]
        return adj, features, labels

    def get_adj(self):
        adj, features, labels = self.load_npz(self.data_filename)
        adj = adj + adj.T
        adj = adj.tolil()
        adj[adj > 1] = 1

        if self.require_lcc:
            lcc = self.largest_connected_components(adj)
            adj = adj[lcc][:, lcc]
            features = features[lcc]
            labels = labels[lcc]
            assert adj.sum(0).A1.min() > 0, "Graph contains singleton nodes"

        # whether to set diag=0?
        adj.setdiag(0)
        adj = adj.astype("float32").tocsr()
        adj.eliminate_zeros()

        assert np.abs(adj - adj.T).sum() == 0, "Input graph is not symmetric"
        assert adj.max() == 1 and len(np.unique(adj[adj.nonzero()].A1)) == 1, "Graph must be unweighted"

        return adj, features, labels

    def load_npz(self, file_name, is_sparse=True):
        with np.load(file_name) as loader:
            # loader = dict(loader)
            if is_sparse:
                adj = sp.csr_matrix((loader['adj_data'], loader['adj_indices'],
                                     loader['adj_indptr']), shape=loader['adj_shape'])
                if 'attr_data' in loader:
                    features = sp.csr_matrix((loader['attr_data'], loader['attr_indices'],
                                              loader['attr_indptr']), shape=loader['attr_shape'])
                else:
                    features = None
                labels = loader.get('labels')
            else:
                adj = loader['adj_data']
                if 'attr_data' in loader:
                    features = loader['attr_data']
                else:
                    features = None
                labels = loader.get('labels')
        if features is None:
            features = np.eye(adj.shape[0])
        features = sp.csr_matrix(features, dtype=np.float32)
        return adj, features, labels

    def largest_connected_components(self, adj, n_components=1):
        """Select the k largest connected components.

        Parameters
        ----------
        adj : scipy.sparse.csr_matrix
            input adjacency matrix
        n_components : int
            number of largest connected components to keep
        """
        _, component_indices = sp.csgraph.connected_components(adj)
        component_sizes = np.bincount(component_indices)
        components_to_keep = np.argsort(component_sizes)[::-1][:n_components]  # reverse order to sort descending
        nodes_to_keep = [
            idx for (idx, component) in enumerate(component_indices) if component in components_to_keep]
        print("Selecting {0} largest connected components".format(n_components))
        return nodes_to_keep

    def __repr__(self):
        return '{0}(adj_shape={1}, feature_shape={2})'.format(self.name, self.adj.shape, self.features.shape)

    def get_mask(self):
        idx_train, idx_val, idx_test = self.idx_train, self.idx_val, self.idx_test
        labels = self.onehot(self.labels)

        def get_mask(idx):
            mask = np.zeros(labels.shape[0], dtype=bool)
            mask[idx] = 1
            return mask

        def get_y(idx):
            mx = np.zeros(labels.shape)
            mx[idx] = labels[idx]
            return mx

        self.train_mask = get_mask(self.idx_train)
        self.val_mask = get_mask(self.idx_val)
        self.test_mask = get_mask(self.idx_test)
        self.y_train, self.y_val, self.y_test = get_y(idx_train), get_y(idx_val), get_y(idx_test)

    def onehot(self, labels):
        eye = np.identity(labels.max() + 1)
        onehot_mx = eye[labels]
        return onehot_mx


def parse_index_file(filename):
    index = []
    for line in open(filename):
        index.append(int(line.strip()))
    return index


if __name__ == '__main__':
    from deeprobust.graph.data import Dataset
    for name in ['cora', 'citeseer', 'pubmed', 'cora_ml']:
        data = Dataset(root='/tmp/', name=name, setting="prognn")
        idx_train = data.idx_train
        data2 = Dataset(root='/tmp/', name=name, setting="nettack", seed=15)
        idx_train2 = data2.idx_train
        assert (idx_train != idx_train2).sum() == 0

    data = Dataset(root='/tmp/', name='flickr')
    adj, features, labels = data.adj, data.features, data.labels
    idx_train, idx_val, idx_test = data.idx_train, data.idx_val, data.idx_test
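A minimal sketch of the split settings implemented above (assuming network access for the first-time downloads): the 'prognn' setting loads a fixed split from the Pro-GNN repository, while 'nettack' and 'gcn' re-split according to the seed.

```python
from deeprobust.graph.data import Dataset

# fixed split downloaded from the Pro-GNN repository; identical across runs
data = Dataset(root='/tmp/', name='cora', setting='prognn', require_mask=True)
print(data)                   # cora(adj_shape=..., feature_shape=...)
print(len(data.idx_train))    # fixed training indices
print(data.train_mask.sum())  # boolean mask over all nodes

# seeded random split; changing the seed changes the split
data2 = Dataset(root='/tmp/', name='cora', setting='nettack', seed=15)
```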
deeprobust/graph/data/pyg_dataset.py
ADDED
@@ -0,0 +1,308 @@
import numpy as np
import torch
from .dataset import Dataset
import scipy.sparse as sp
from itertools import repeat
import os.path as osp
import warnings
import sys
from torch_geometric.data import InMemoryDataset, Data
from torch_geometric.datasets import Coauthor, Amazon


class Dpr2Pyg(InMemoryDataset):
    """Convert deeprobust data (sparse matrix) to pytorch geometric data (tensor, edge_index).

    Parameters
    ----------
    dpr_data :
        data instance of a class from deeprobust.graph.data, e.g., deeprobust.graph.data.Dataset,
        deeprobust.graph.data.PtbDataset, deeprobust.graph.data.PrePtbDataset
    transform :
        A function/transform that takes in an object and returns a transformed version.
        The data object will be transformed before every access. For example, you can
        use torch_geometric.transforms.NormalizeFeatures()

    Examples
    --------
    We can first create an instance of the Dataset class and convert it to
    pytorch geometric data format.

    >>> from deeprobust.graph.data import Dataset, Dpr2Pyg
    >>> data = Dataset(root='/tmp/', name='cora')
    >>> pyg_data = Dpr2Pyg(data)
    >>> print(pyg_data)
    >>> print(pyg_data[0])
    """

    def __init__(self, dpr_data, transform=None, **kwargs):
        root = 'data/'  # dummy root; does not mean anything
        self.dpr_data = dpr_data
        super(Dpr2Pyg, self).__init__(root, transform)
        pyg_data = self.process()
        self.data, self.slices = self.collate([pyg_data])
        self.transform = transform

    def process(self):
        dpr_data = self.dpr_data
        edge_index = torch.LongTensor(dpr_data.adj.nonzero())
        # by default, the features in pyg data are dense
        if sp.issparse(dpr_data.features):
            x = torch.FloatTensor(dpr_data.features.todense()).float()
        else:
            x = torch.FloatTensor(dpr_data.features).float()
        y = torch.LongTensor(dpr_data.labels)
        idx_train, idx_val, idx_test = dpr_data.idx_train, dpr_data.idx_val, dpr_data.idx_test
        data = Data(x=x, edge_index=edge_index, y=y)
        train_mask = index_to_mask(idx_train, size=y.size(0))
        val_mask = index_to_mask(idx_val, size=y.size(0))
        test_mask = index_to_mask(idx_test, size=y.size(0))
        data.train_mask = train_mask
        data.val_mask = val_mask
        data.test_mask = test_mask
        return data

    def update_edge_index(self, adj):
        """This is an inplace operation that substitutes the original edge_index
        with adj.nonzero().

        Parameters
        ----------
        adj : sp.csr_matrix
            update the original adjacency into adj (by changing edge_index)
        """
        self.data.edge_index = torch.LongTensor(adj.nonzero())
        self.data, self.slices = self.collate([self.data])

    def get(self, idx):
        if self.slices is None:
            return self.data
        data = self.data.__class__()

        if hasattr(self.data, '__num_nodes__'):
            data.num_nodes = self.data.__num_nodes__[idx]

        for key in self.data.keys:
            item, slices = self.data[key], self.slices[key]
            s = list(repeat(slice(None), item.dim()))
            s[self.data.__cat_dim__(key, item)] = slice(slices[idx],
                                                        slices[idx + 1])
            data[key] = item[s]
        return data

    @property
    def raw_file_names(self):
        return ['some_file_1', 'some_file_2', ...]

    @property
    def processed_file_names(self):
        return ['data.pt']

    def _download(self):
        pass


class Pyg2Dpr(Dataset):
    """Convert pytorch geometric data (tensor, edge_index) to deeprobust
    data (sparse matrix).

    Parameters
    ----------
    pyg_data :
        data instance of a class from a pytorch geometric dataset

    Examples
    --------
    We can first create an instance of the Dataset class, convert it to
    pytorch geometric data format and then convert it back to the Dataset class.

    >>> from deeprobust.graph.data import Dataset, Dpr2Pyg, Pyg2Dpr
    >>> data = Dataset(root='/tmp/', name='cora')
    >>> pyg_data = Dpr2Pyg(data)
    >>> print(pyg_data)
    >>> print(pyg_data[0])
    >>> dpr_data = Pyg2Dpr(pyg_data)
    >>> print(dpr_data.adj)
    """

    def __init__(self, pyg_data, **kwargs):
        is_ogb = hasattr(pyg_data, 'get_idx_split')
        if is_ogb:  # get splits for ogb datasets
            splits = pyg_data.get_idx_split()
        pyg_data = pyg_data[0]
        n = pyg_data.num_nodes
        self.adj = sp.csr_matrix((np.ones(pyg_data.edge_index.shape[1]),
            (pyg_data.edge_index[0], pyg_data.edge_index[1])), shape=(n, n))
        self.features = pyg_data.x.numpy()
        self.labels = pyg_data.y.numpy()
        if len(self.labels.shape) == 2 and self.labels.shape[1] == 1:
            self.labels = self.labels.reshape(-1)  # ogb-arxiv needs to reshape
        if is_ogb:  # set splits for ogb datasets
            self.idx_train = splits['train'].numpy()
            self.idx_val = splits['valid'].numpy()
            self.idx_test = splits['test'].numpy()
        else:
            try:
                self.idx_train = mask_to_index(pyg_data.train_mask, n)
                self.idx_val = mask_to_index(pyg_data.val_mask, n)
                self.idx_test = mask_to_index(pyg_data.test_mask, n)
            except AttributeError:
                print('Warning: This pyg dataset is not associated with any data splits...')
        self.name = 'Pyg2Dpr'


class AmazonPyg(Amazon):
    """Amazon-Computers and Amazon-Photo datasets loaded from pytorch geometric;
    the way we split the dataset follows Towards Deeper Graph Neural Networks
    (https://github.com/mengliu1998/DeeperGNN/blob/master/DeeperGNN/train_eval.py).
    Specifically, 20 * num_classes labels for training, 30 * num_classes labels
    for validation, and the rest of the labels for testing.

    Parameters
    ----------
    root : string
        root directory where the dataset should be saved.
    name : string
        dataset name, it can be chosen from ['computers', 'photo']
    transform :
        A function/transform that takes in a torch_geometric.data.Data object
        and returns a transformed version. The data object will be transformed
        before every access. (default: None)
    pre_transform :
        A function/transform that takes in a torch_geometric.data.Data object
        and returns a transformed version. The data object will be transformed
        before being saved to disk.

    Examples
    --------
    We can directly load the Amazon dataset from deeprobust in the format of pyg.

    >>> from deeprobust.graph.data import AmazonPyg
    >>> computers = AmazonPyg(root='/tmp', name='computers')
    >>> print(computers)
    >>> print(computers[0])
    >>> photo = AmazonPyg(root='/tmp', name='photo')
    >>> print(photo)
    >>> print(photo[0])
    """

    def __init__(self, root, name, transform=None, pre_transform=None, **kwargs):
        path = osp.join(root, 'pygdata', name)
        super(AmazonPyg, self).__init__(path, name, transform, pre_transform)

        random_coauthor_amazon_splits(self, self.num_classes, lcc_mask=None)
        self.data, self.slices = self.collate([self.data])


class CoauthorPyg(Coauthor):
    """Coauthor-CS and Coauthor-Physics datasets loaded from pytorch geometric;
    the way we split the dataset follows Towards Deeper Graph Neural Networks
    (https://github.com/mengliu1998/DeeperGNN/blob/master/DeeperGNN/train_eval.py).
    Specifically, 20 * num_classes labels for training, 30 * num_classes labels
    for validation, and the rest of the labels for testing.

    Parameters
    ----------
    root : string
        root directory where the dataset should be saved.
    name : string
        dataset name, it can be chosen from ['cs', 'physics']
    transform :
        A function/transform that takes in a torch_geometric.data.Data object
        and returns a transformed version. The data object will be transformed
        before every access. (default: None)
    pre_transform :
        A function/transform that takes in a torch_geometric.data.Data object
        and returns a transformed version. The data object will be transformed
        before being saved to disk.

    Examples
    --------
    We can directly load the Coauthor dataset from deeprobust in the format of pyg.

    >>> from deeprobust.graph.data import CoauthorPyg
    >>> cs = CoauthorPyg(root='/tmp', name='cs')
    >>> print(cs)
    >>> print(cs[0])
    >>> physics = CoauthorPyg(root='/tmp', name='physics')
    >>> print(physics)
    >>> print(physics[0])
    """

    def __init__(self, root, name, transform=None, pre_transform=None, **kwargs):
        path = osp.join(root, 'pygdata', name)
        super(CoauthorPyg, self).__init__(path, name, transform, pre_transform)
        random_coauthor_amazon_splits(self, self.num_classes, lcc_mask=None)
        self.data, self.slices = self.collate([self.data])


def random_coauthor_amazon_splits(dataset, num_classes, lcc_mask):
    """https://github.com/mengliu1998/DeeperGNN/blob/master/DeeperGNN/train_eval.py
    Set random coauthor/co-purchase splits:
    * 20 * num_classes labels for training
    * 30 * num_classes labels for validation
    * the rest of the labels for testing
    """
    data = dataset.data
    indices = []
    if lcc_mask is not None:
        for i in range(num_classes):
            index = (data.y[lcc_mask] == i).nonzero().view(-1)
            index = index[torch.randperm(index.size(0))]
            indices.append(index)
    else:
        for i in range(num_classes):
            index = (data.y == i).nonzero().view(-1)
            index = index[torch.randperm(index.size(0))]
            indices.append(index)

    train_index = torch.cat([i[:20] for i in indices], dim=0)
    val_index = torch.cat([i[20:50] for i in indices], dim=0)

    rest_index = torch.cat([i[50:] for i in indices], dim=0)
    rest_index = rest_index[torch.randperm(rest_index.size(0))]

    data.train_mask = index_to_mask(train_index, size=data.num_nodes)
    data.val_mask = index_to_mask(val_index, size=data.num_nodes)
    data.test_mask = index_to_mask(rest_index, size=data.num_nodes)


def mask_to_index(index, size):
    all_idx = np.arange(size)
    return all_idx[index]


def index_to_mask(index, size):
    mask = torch.zeros((size, ), dtype=torch.bool)
    mask[index] = 1
    return mask


if __name__ == "__main__":
    from deeprobust.graph.data import PrePtbDataset, Dataset
    # load clean graph data
    dataset_str = 'cora'
    data = Dataset(root='/tmp/', name=dataset_str, seed=15)
    pyg_data = Dpr2Pyg(data)
    print(pyg_data)
    print(pyg_data[0])
    dpr_data = Pyg2Dpr(pyg_data)
    print(dpr_data)

    computers = AmazonPyg(root='/tmp', name='computers')
    print(computers)
    print(computers[0])
    photo = AmazonPyg(root='/tmp', name='photo')
    print(photo)
    print(photo[0])
    cs = CoauthorPyg(root='/tmp', name='cs')
    print(cs)
    print(cs[0])
    physics = CoauthorPyg(root='/tmp', name='physics')
    print(physics)
    print(physics[0])

    # from ogb.nodeproppred import PygNodePropPredDataset
    # dataset = PygNodePropPredDataset(name = 'ogbn-arxiv')
    # ogb_data = Pyg2Dpr(dataset)
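One method above worth calling out is `Dpr2Pyg.update_edge_index`, which pushes a perturbed scipy adjacency matrix back into an existing pyg dataset in place. A minimal sketch (the "perturbed" matrix here is just the clean adjacency, standing in for real attack output):

```python
from deeprobust.graph.data import Dataset, Dpr2Pyg

data = Dataset(root='/tmp/', name='cora', seed=15)
pyg_data = Dpr2Pyg(data)

perturbed_adj = data.adj  # stand-in for an attacker's modified adjacency
pyg_data.update_edge_index(perturbed_adj)  # rebuilds edge_index in place
print(pyg_data[0].edge_index.shape)        # (2, number of nonzero entries)
```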
deeprobust/graph/data/utils.py
ADDED
@@ -0,0 +1,10 @@
"""
This file provides functions for converting deeprobust data
to pytorch geometric data.
"""
deeprobust/graph/defense/__init__.py
ADDED
@@ -0,0 +1,23 @@
from .gcn import GCN, GraphConvolution
from .gcn_preprocess import GCNSVD, GCNJaccard
from .gcn_cgscore import GCNScore
from .r_gcn import RGCN, GGCL_F, GGCL_D
from .prognn import ProGNN
from .simpgcn import SimPGCN
from .node_embedding import Node2Vec, DeepWalk
import warnings
try:
    from .gat import GAT
    from .chebnet import ChebNet
    from .sgc import SGC
    from .median_gcn import MedianGCN
except ImportError as e:
    print(e)
    warnings.warn("Please install pytorch geometric if you " +
                  "would like to use the datasets from pytorch " +
                  "geometric. See details in https://pytorch-geom" +
                  "etric.readthedocs.io/en/latest/notes/installation.html")

__all__ = ['GCN', 'GCNSVD', 'GCNJaccard', 'RGCN', 'ProGNN',
           'GraphConvolution', 'GGCL_F', 'GGCL_D', 'GAT', 'MedianGCN',
           'ChebNet', 'SGC', 'SimPGCN', 'Node2Vec', 'DeepWalk']
deeprobust/graph/defense/pgd.py
ADDED
@@ -0,0 +1,207 @@
from torch.optim.optimizer import required
from torch.optim import Optimizer
import torch
import numpy as np
import scipy.sparse as sp

class PGD(Optimizer):
    """Proximal gradient descent.

    Parameters
    ----------
    params : iterable
        iterable of parameters to optimize or dicts defining parameter groups
    proxs : iterable
        iterable of proximal operators
    alphas : iterable
        iterable of coefficients for proximal gradient descent
    lr : float
        learning rate
    momentum : float
        momentum factor (default: 0)
    weight_decay : float
        weight decay (L2 penalty) (default: 0)
    dampening : float
        dampening for momentum (default: 0)
    """

    def __init__(self, params, proxs, alphas, lr=required, momentum=0, dampening=0, weight_decay=0):
        defaults = dict(lr=lr, momentum=0, dampening=0,
                        weight_decay=0, nesterov=False)

        super(PGD, self).__init__(params, defaults)

        for group in self.param_groups:
            group.setdefault('proxs', proxs)
            group.setdefault('alphas', alphas)

    def __setstate__(self, state):
        super(PGD, self).__setstate__(state)
        for group in self.param_groups:
            group.setdefault('nesterov', False)
            # proxs and alphas live in each parameter group; fall back to
            # empty lists if they were not restored from the pickled state
            group.setdefault('proxs', [])
            group.setdefault('alphas', [])

    def step(self, delta=0, closure=None):
        for group in self.param_groups:
            lr = group['lr']
            weight_decay = group['weight_decay']
            momentum = group['momentum']
            dampening = group['dampening']
            nesterov = group['nesterov']
            proxs = group['proxs']
            alphas = group['alphas']

            # apply the proximal operator to each parameter in a group
            for param in group['params']:
                for prox_operator, alpha in zip(proxs, alphas):
                    # param.data.add_(lr, -param.grad.data)
                    # param.data.add_(delta)
                    param.data = prox_operator(param.data, alpha=alpha*lr)


class ProxOperators():
    """Proximal operators."""

    def __init__(self):
        self.nuclear_norm = None

    def prox_l1(self, data, alpha):
        """Proximal operator for the l1 norm (soft-thresholding)."""
        data = torch.mul(torch.sign(data), torch.clamp(torch.abs(data)-alpha, min=0))
        return data

    def prox_nuclear(self, data, alpha):
        """Proximal operator for the nuclear norm (trace norm)."""
        device = data.device
        U, S, V = np.linalg.svd(data.cpu())
        U, S, V = torch.FloatTensor(U).to(device), torch.FloatTensor(S).to(device), torch.FloatTensor(V).to(device)
        self.nuclear_norm = S.sum()
        # print("nuclear norm: %.4f" % self.nuclear_norm)

        diag_S = torch.diag(torch.clamp(S-alpha, min=0))
        return torch.matmul(torch.matmul(U, diag_S), V)

    def prox_nuclear_truncated_2(self, data, alpha, k=50):
        device = data.device
        import tensorly as tl
        tl.set_backend('pytorch')
        U, S, V = tl.truncated_svd(data.cpu(), n_eigenvecs=k)
        U, S, V = torch.FloatTensor(U).to(device), torch.FloatTensor(S).to(device), torch.FloatTensor(V).to(device)
        self.nuclear_norm = S.sum()
        # print("nuclear norm: %.4f" % self.nuclear_norm)

        S = torch.clamp(S-alpha, min=0)

        # diag_S = torch.diag(torch.clamp(S-alpha, min=0))
        # U = torch.spmm(U, diag_S)
        # V = torch.matmul(U, V)

        # make diag_S a sparse matrix
        indices = torch.tensor((range(0, len(S)), range(0, len(S)))).to(device)
        values = S
        diag_S = torch.sparse.FloatTensor(indices, values, torch.Size((len(S), len(S))))
        V = torch.spmm(diag_S, V)
        V = torch.matmul(U, V)
        return V

    def prox_nuclear_truncated(self, data, alpha, k=50):
        device = data.device
        indices = torch.nonzero(data).t()
        values = data[indices[0], indices[1]]  # modify this based on dimensionality
        data_sparse = sp.csr_matrix((values.cpu().numpy(), indices.cpu().numpy()))
        U, S, V = sp.linalg.svds(data_sparse, k=k)
        U, S, V = torch.FloatTensor(U).to(device), torch.FloatTensor(S).to(device), torch.FloatTensor(V).to(device)
        self.nuclear_norm = S.sum()
        diag_S = torch.diag(torch.clamp(S-alpha, min=0))
        return torch.matmul(torch.matmul(U, diag_S), V)

    def prox_nuclear_cuda(self, data, alpha):
        device = data.device
        U, S, V = torch.svd(data)
        # print(f"rank = {len(S.nonzero())}")
        self.nuclear_norm = S.sum()
        S = torch.clamp(S-alpha, min=0)
        indices = torch.tensor([range(0, U.shape[0]), range(0, U.shape[0])]).to(device)
        values = S
        diag_S = torch.sparse.FloatTensor(indices, values, torch.Size(U.shape))
        # diag_S = torch.diag(torch.clamp(S-alpha, min=0))
        # print(f"rank_after = {len(diag_S.nonzero())}")
        V = torch.spmm(diag_S, V.t_())
        V = torch.matmul(U, V)
        return V


class SGD(Optimizer):

    def __init__(self, params, lr=required, momentum=0, dampening=0,
                 weight_decay=0, nesterov=False):
        if lr is not required and lr < 0.0:
            raise ValueError("Invalid learning rate: {}".format(lr))
        if momentum < 0.0:
            raise ValueError("Invalid momentum value: {}".format(momentum))
        if weight_decay < 0.0:
            raise ValueError("Invalid weight_decay value: {}".format(weight_decay))

        defaults = dict(lr=lr, momentum=momentum, dampening=dampening,
                        weight_decay=weight_decay, nesterov=nesterov)
        if nesterov and (momentum <= 0 or dampening != 0):
            raise ValueError("Nesterov momentum requires a momentum and zero dampening")
        super(SGD, self).__init__(params, defaults)

    def __setstate__(self, state):
        super(SGD, self).__setstate__(state)
        for group in self.param_groups:
            group.setdefault('nesterov', False)

    def step(self, closure=None):
        """Performs a single optimization step.

        Arguments:
            closure (callable, optional): A closure that reevaluates the model
                and returns the loss.
        """
        loss = None
        if closure is not None:
            loss = closure()

        for group in self.param_groups:
            weight_decay = group['weight_decay']
            momentum = group['momentum']
            dampening = group['dampening']
            nesterov = group['nesterov']

            for p in group['params']:
                if p.grad is None:
                    continue
                d_p = p.grad.data
                if weight_decay != 0:
                    d_p.add_(p.data, alpha=weight_decay)
                if momentum != 0:
                    param_state = self.state[p]
                    if 'momentum_buffer' not in param_state:
                        buf = param_state['momentum_buffer'] = torch.clone(d_p).detach()
                    else:
                        buf = param_state['momentum_buffer']
                        buf.mul_(momentum).add_(d_p, alpha=1 - dampening)
                    if nesterov:
                        d_p = d_p.add(buf, alpha=momentum)
                    else:
                        d_p = buf

                p.data.add_(d_p, alpha=-group['lr'])

        return loss


prox_operators = ProxOperators()
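Note that `PGD.step` applies only the proximal operators; in Pro-GNN the gradient step on the graph estimate is taken by a paired plain SGD optimizer first, and `PGD.step` then enforces sparsity. A minimal sketch of the l1 operator soft-thresholding a learnable matrix (the threshold is exaggerated for visibility):

```python
import torch
from deeprobust.graph.defense.pgd import PGD, prox_operators

S = torch.nn.Parameter(torch.rand(10, 10))  # learnable graph estimate
prox_opt = PGD([S], proxs=[prox_operators.prox_l1],
               alphas=[0.3], lr=1.0)

# in Pro-GNN a gradient step on S would come first; here we only
# apply the proximal step: S <- sign(S) * max(|S| - alpha*lr, 0)
prox_opt.step()
print((S.data == 0).float().mean())  # fraction of entries zeroed out
```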
deeprobust/graph/defense/simpgcn.py
ADDED
@@ -0,0 +1,474 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
import math
import os
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.parameter import Parameter
import scipy.sparse as sp
from deeprobust.graph.defense import GraphConvolution
import deeprobust.graph.utils as utils
import torch.optim as optim
from sklearn.metrics.pairwise import cosine_similarity
from copy import deepcopy
from itertools import product
from tqdm import tqdm  # used in AttrSim.get_label's MemoryError fallback


class SimPGCN(nn.Module):
    """SimP-GCN: Node similarity preserving graph convolutional networks.
    https://arxiv.org/abs/2011.09643

    Parameters
    ----------
    nnodes : int
        number of nodes in the input graph
    nfeat : int
        size of input feature dimension
    nhid : int
        number of hidden units
    nclass : int
        size of output dimension
    lambda_ : float
        coefficient for the SSL loss in SimP-GCN
    gamma : float
        coefficient for the adaptive learnable self-loops
    bias_init : float
        bias init for the score
    dropout : float
        dropout rate for GCN
    lr : float
        learning rate for GCN
    weight_decay : float
        weight decay coefficient (l2 normalization) for GCN.
    with_bias: bool
        whether to include bias term in GCN weights.
    device: str
        'cpu' or 'cuda'.

    Examples
    --------
    We can first load the dataset and then train SimPGCN.
    See the detailed hyper-parameter setting in https://github.com/ChandlerBang/SimP-GCN.

    >>> from deeprobust.graph.data import PrePtbDataset, Dataset
    >>> from deeprobust.graph.defense import SimPGCN
    >>> # load clean graph data
    >>> data = Dataset(root='/tmp/', name='cora', seed=15)
    >>> adj, features, labels = data.adj, data.features, data.labels
    >>> idx_train, idx_val, idx_test = data.idx_train, data.idx_val, data.idx_test
    >>> # load perturbed graph data
    >>> perturbed_data = PrePtbDataset(root='/tmp/', name='cora')
    >>> perturbed_adj = perturbed_data.adj
    >>> model = SimPGCN(nnodes=features.shape[0], nfeat=features.shape[1],
                nhid=16, nclass=labels.max()+1, device='cuda')
    >>> model = model.to('cuda')
    >>> model.fit(features, perturbed_adj, labels, idx_train, idx_val, train_iters=200, verbose=True)
    >>> model.test(idx_test)
    """

    def __init__(self, nnodes, nfeat, nhid, nclass, dropout=0.5, lr=0.01,
                 weight_decay=5e-4, lambda_=5, gamma=0.1, bias_init=0,
                 with_bias=True, device=None):
        super(SimPGCN, self).__init__()

        assert device is not None, "Please specify 'device'!"

        self.device = device
        self.nfeat = nfeat
        self.hidden_sizes = [nhid]
        self.nclass = nclass
        self.dropout = dropout
        self.lr = lr
        self.weight_decay = weight_decay
        self.bias_init = bias_init
        self.gamma = gamma
        self.lambda_ = lambda_
        self.output = None
        self.best_model = None
        self.best_output = None
        self.adj_norm = None
        self.features = None

        self.gc1 = GraphConvolution(nfeat, nhid, with_bias=with_bias)
        self.gc2 = GraphConvolution(nhid, nclass, with_bias=with_bias)

        # self.reset_parameters()
        self.scores = nn.ParameterList()
        self.scores.append(Parameter(torch.FloatTensor(nfeat, 1)))
        for i in range(1):
            self.scores.append(Parameter(torch.FloatTensor(nhid, 1)))

        self.bias = nn.ParameterList()
        self.bias.append(Parameter(torch.FloatTensor(1)))
        for i in range(1):
            self.bias.append(Parameter(torch.FloatTensor(1)))

        self.D_k = nn.ParameterList()
        self.D_k.append(Parameter(torch.FloatTensor(nfeat, 1)))
        for i in range(1):
            self.D_k.append(Parameter(torch.FloatTensor(nhid, 1)))

        self.identity = utils.sparse_mx_to_torch_sparse_tensor(
            sp.eye(nnodes)).to(device)

        self.D_bias = nn.ParameterList()
        self.D_bias.append(Parameter(torch.FloatTensor(1)))
        for i in range(1):
            self.D_bias.append(Parameter(torch.FloatTensor(1)))

        # discriminator for ssl
        self.linear = nn.Linear(nhid, 1).to(device)

        self.adj_knn = None
        self.pseudo_labels = None

    def get_knn_graph(self, features, k=20):
        if not os.path.exists('saved_knn/'):
            os.mkdir('saved_knn')
        if not os.path.exists('saved_knn/knn_graph_{}.npz'.format(features.shape)):
            features[features != 0] = 1
            sims = cosine_similarity(features)
            np.save('saved_knn/cosine_sims_{}.npy'.format(features.shape), sims)

            sims[(np.arange(len(sims)), np.arange(len(sims)))] = 0
            for i in range(len(sims)):
                indices_argsort = np.argsort(sims[i])
                sims[i, indices_argsort[: -k]] = 0

            adj_knn = sp.csr_matrix(sims)
            sp.save_npz('saved_knn/knn_graph_{}.npz'.format(features.shape), adj_knn)
        else:
            print('loading saved_knn/knn_graph_{}.npz...'.format(features.shape))
            adj_knn = sp.load_npz('saved_knn/knn_graph_{}.npz'.format(features.shape))
        return preprocess_adj_noloop(adj_knn, self.device)

    def initialize(self):
        """Initialize parameters of SimPGCN.
        """
        self.gc1.reset_parameters()
        self.gc2.reset_parameters()

        for s in self.scores:
            stdv = 1. / math.sqrt(s.size(1))
            s.data.uniform_(-stdv, stdv)
        for b in self.bias:
            # fill b with a positive value to make
            # the score s closer to 1 at the beginning
            b.data.fill_(self.bias_init)

        for Dk in self.D_k:
            stdv = 1. / math.sqrt(Dk.size(1))
            Dk.data.uniform_(-stdv, stdv)

        for b in self.D_bias:
            b.data.fill_(0)

    def fit(self, features, adj, labels, idx_train, idx_val=None, train_iters=200, initialize=True, verbose=False, normalize=True, patience=500, **kwargs):
        if initialize:
            self.initialize()

        if type(adj) is not torch.Tensor:
            features, adj, labels = utils.to_tensor(features, adj, labels, device=self.device)
        else:
            features = features.to(self.device)
            adj = adj.to(self.device)
            labels = labels.to(self.device)

        if normalize:
            if utils.is_sparse_tensor(adj):
                adj_norm = utils.normalize_adj_tensor(adj, sparse=True)
            else:
                adj_norm = utils.normalize_adj_tensor(adj)
        else:
            adj_norm = adj

        self.adj_norm = adj_norm
        self.features = features
        self.labels = labels

        if idx_val is None:
            self._train_without_val(labels, idx_train, train_iters, verbose)
        else:
            if patience < train_iters:
                self._train_with_early_stopping(labels, idx_train, idx_val, train_iters, patience, verbose)
            else:
                self._train_with_val(labels, idx_train, idx_val, train_iters, verbose)

    def forward(self, fea, adj):
        x, _ = self.myforward(fea, adj)
        return x

    def myforward(self, fea, adj):
        '''output embedding and log_softmax'''
        if self.adj_knn is None:
            self.adj_knn = self.get_knn_graph(fea.to_dense().cpu().numpy())

        adj_knn = self.adj_knn
        gamma = self.gamma

        s_i = torch.sigmoid(fea @ self.scores[0] + self.bias[0])

        Dk_i = (fea @ self.D_k[0] + self.D_bias[0])
        x = (s_i * self.gc1(fea, adj) + (1 - s_i) * self.gc1(fea, adj_knn)) + (gamma) * Dk_i * self.gc1(fea, self.identity)

        x = F.dropout(x, self.dropout, training=self.training)
        embedding = x.clone()

        # output layer: no relu and dropout here.
        s_o = torch.sigmoid(x @ self.scores[-1] + self.bias[-1])
        Dk_o = (x @ self.D_k[-1] + self.D_bias[-1])
        x = (s_o * self.gc2(x, adj) + (1 - s_o) * self.gc2(x, adj_knn)) + (gamma) * Dk_o * self.gc2(x, self.identity)

        x = F.log_softmax(x, dim=1)

        self.ss = torch.cat((s_i.view(1, -1), s_o.view(1, -1), gamma * Dk_i.view(1, -1), gamma * Dk_o.view(1, -1)), dim=0)
        return x, embedding

    def regression_loss(self, embeddings):
        if self.pseudo_labels is None:
            agent = AttrSim(self.features.to_dense())
            self.pseudo_labels = agent.get_label().to(self.device)
            node_pairs = agent.node_pairs
            self.node_pairs = node_pairs

        k = 10000
        node_pairs = self.node_pairs
        if len(self.node_pairs[0]) > k:
            sampled = np.random.choice(len(self.node_pairs[0]), k, replace=False)

            embeddings0 = embeddings[node_pairs[0][sampled]]
            embeddings1 = embeddings[node_pairs[1][sampled]]
            embeddings = self.linear(torch.abs(embeddings0 - embeddings1))
            loss = F.mse_loss(embeddings, self.pseudo_labels[sampled], reduction='mean')
        else:
            embeddings0 = embeddings[node_pairs[0]]
            embeddings1 = embeddings[node_pairs[1]]
            embeddings = self.linear(torch.abs(embeddings0 - embeddings1))
            loss = F.mse_loss(embeddings, self.pseudo_labels, reduction='mean')
        # print(loss)
        return loss

    def _train_without_val(self, labels, idx_train, train_iters, verbose):
        self.train()
        optimizer = optim.Adam(self.parameters(), lr=self.lr, weight_decay=self.weight_decay)
        for i in range(train_iters):
            self.train()
            optimizer.zero_grad()
            output, embeddings = self.myforward(self.features, self.adj_norm)
            loss_train = F.nll_loss(output[idx_train], labels[idx_train])
            loss_ssl = self.lambda_ * self.regression_loss(embeddings)
            loss_total = loss_train + loss_ssl
            loss_total.backward()
            optimizer.step()
            if verbose and i % 10 == 0:
                print('Epoch {}, training loss: {}'.format(i, loss_train.item()))

        self.eval()
        output = self.forward(self.features, self.adj_norm)
        self.output = output

    def _train_with_val(self, labels, idx_train, idx_val, train_iters, verbose):
        if verbose:
            print('=== training gcn model ===')
        optimizer = optim.Adam(self.parameters(), lr=self.lr, weight_decay=self.weight_decay)

        best_loss_val = 100
        best_acc_val = 0

        for i in range(train_iters):

            self.train()
            optimizer.zero_grad()
            output, embeddings = self.myforward(self.features, self.adj_norm)
            loss_train = F.nll_loss(output[idx_train], labels[idx_train])
            # acc_train = accuracy(output[idx_train], labels[idx_train])
            loss_ssl = self.lambda_ * self.regression_loss(embeddings)
            loss_total = loss_train + loss_ssl
            loss_total.backward()
            optimizer.step()

            if verbose and i % 10 == 0:
                print('Epoch {}, training loss: {}'.format(i, loss_train.item()))

            self.eval()
            output = self.forward(self.features, self.adj_norm)
            loss_val = F.nll_loss(output[idx_val], labels[idx_val])
            acc_val = utils.accuracy(output[idx_val], labels[idx_val])

            if best_loss_val > loss_val:
                best_loss_val = loss_val
                self.output = output
                weights = deepcopy(self.state_dict())

            if acc_val > best_acc_val:
                best_acc_val = acc_val
                self.output = output
                weights = deepcopy(self.state_dict())

        if verbose:
            print('=== picking the best model according to the performance on validation ===')
        self.load_state_dict(weights)

    def _train_with_early_stopping(self, labels, idx_train, idx_val, train_iters, patience, verbose):
        if verbose:
            print('=== training gcn model ===')
        optimizer = optim.Adam(self.parameters(), lr=self.lr, weight_decay=self.weight_decay)

        early_stopping = patience
        best_loss_val = 100

        for i in range(train_iters):
            self.train()
            optimizer.zero_grad()
            output, embeddings = self.myforward(self.features, self.adj_norm)
            loss_train = F.nll_loss(output[idx_train], labels[idx_train])
            loss_ssl = self.lambda_ * self.regression_loss(embeddings)
            loss_total = loss_train + loss_ssl
            loss_total.backward()
            optimizer.step()

            if verbose and i % 10 == 0:
                print('Epoch {}, training loss: {}'.format(i, loss_train.item()))

            self.eval()
            output = self.forward(self.features, self.adj_norm)
            loss_val = F.nll_loss(output[idx_val], labels[idx_val])

            if best_loss_val > loss_val:
                best_loss_val = loss_val
                self.output = output
                weights = deepcopy(self.state_dict())
                patience = early_stopping
            else:
                patience -= 1
            if i > early_stopping and patience <= 0:
                break

        if verbose:
            print('=== early stopping at {0}, loss_val = {1} ==='.format(i, best_loss_val))
        self.load_state_dict(weights)

    def test(self, idx_test):
        """Evaluate GCN performance on test set.

        Parameters
        ----------
        idx_test :
            node testing indices
        """
        self.eval()
        output = self.predict()
        # output = self.output
        loss_test = F.nll_loss(output[idx_test], self.labels[idx_test])
        acc_test = utils.accuracy(output[idx_test], self.labels[idx_test])
        print("Test set results:",
              "loss= {:.4f}".format(loss_test.item()),
              "accuracy= {:.4f}".format(acc_test.item()))
        return acc_test.item()

    def predict(self, features=None, adj=None):
        """By default, the inputs should be unnormalized data

        Parameters
        ----------
        features :
            node features. If `features` and `adj` are not given, this function will use previous stored `features` and `adj` from training to make predictions.
        adj :
            adjacency matrix. If `features` and `adj` are not given, this function will use previous stored `features` and `adj` from training to make predictions.


        Returns
        -------
        torch.FloatTensor
            output (log probabilities) of GCN
        """

        self.eval()
        if features is None and adj is None:
            return self.forward(self.features, self.adj_norm)
        else:
            if type(adj) is not torch.Tensor:
                features, adj = utils.to_tensor(features, adj, device=self.device)

            self.features = features
            if utils.is_sparse_tensor(adj):
                self.adj_norm = utils.normalize_adj_tensor(adj, sparse=True)
            else:
                self.adj_norm = utils.normalize_adj_tensor(adj)
            return self.forward(self.features, self.adj_norm)


class AttrSim:

    def __init__(self, features):
        self.features = features.cpu().numpy()
        self.features[self.features != 0] = 1

    def get_label(self, k=5):
        features = self.features
        if not os.path.exists('saved_knn/cosine_sims_{}.npy'.format(features.shape)):
            sims = cosine_similarity(features)
            np.save('saved_knn/cosine_sims_{}.npy'.format(features.shape), sims)
        else:
            print('loading saved_knn/cosine_sims_{}.npy'.format(features.shape))
            sims = np.load('saved_knn/cosine_sims_{}.npy'.format(features.shape))

        if not os.path.exists('saved_knn/attrsim_sampled_idx_{}.npy'.format(features.shape)):
            try:
                indices_sorted = sims.argsort(1)
                idx = np.arange(k, sims.shape[0] - k)
                selected = np.hstack((indices_sorted[:, :k],
                                      indices_sorted[:, -k-1:]))

                selected_set = set()
                for i in range(len(sims)):
                    for pair in product([i], selected[i]):
                        if pair[0] > pair[1]:
                            pair = (pair[1], pair[0])
                        if pair[0] == pair[1]:
                            continue
                        selected_set.add(pair)

            except MemoryError:
                selected_set = set()
                for ii, row in tqdm(enumerate(sims)):
                    row = row.argsort()
                    idx = np.arange(k, sims.shape[0] - k)
                    sampled = np.random.choice(idx, k, replace=False)
                    for node in np.hstack((row[:k], row[-k-1:], row[sampled])):
                        if ii > node:
                            pair = (node, ii)
                        else:
                            pair = (ii, node)
                        selected_set.add(pair)

            sampled = np.array(list(selected_set)).transpose()
            np.save('saved_knn/attrsim_sampled_idx_{}.npy'.format(features.shape), sampled)
        else:
            print('loading saved_knn/attrsim_sampled_idx_{}.npy'.format(features.shape))
            sampled = np.load('saved_knn/attrsim_sampled_idx_{}.npy'.format(features.shape))
        print('number of sampled:', len(sampled[0]))
        self.node_pairs = (sampled[0], sampled[1])
        self.sims = sims
        return torch.FloatTensor(sims[self.node_pairs]).reshape(-1, 1)


def preprocess_adj_noloop(adj, device):
    adj_normalizer = noaug_normalized_adjacency
    r_adj = adj_normalizer(adj)
    r_adj = utils.sparse_mx_to_torch_sparse_tensor(r_adj).float()
    r_adj = r_adj.to(device)
    return r_adj


def noaug_normalized_adjacency(adj):
    adj = sp.coo_matrix(adj)
    row_sum = np.array(adj.sum(1))
    d_inv_sqrt = np.power(row_sum, -0.5).flatten()
    d_inv_sqrt[np.isinf(d_inv_sqrt)] = 0.
    d_mat_inv_sqrt = sp.diags(d_inv_sqrt)
    return d_mat_inv_sqrt.dot(adj).dot(d_mat_inv_sqrt).tocoo()
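To make the adaptive combination in `myforward` concrete: each layer mixes propagation over the original graph and over the feature-kNN graph with a per-node gate s in (0, 1), plus a learnable self-loop term scaled by gamma. A minimal self-contained sketch of that gating arithmetic on toy tensors (`h_adj`, `h_knn`, `h_self` are hypothetical stand-ins for the three propagated signals gc1(fea, adj), gc1(fea, adj_knn), gc1(fea, identity)):

import torch

torch.manual_seed(0)
n, f = 4, 8                                   # toy graph: 4 nodes, 8 features
fea = torch.randn(n, f)
score_w, score_b = torch.randn(f, 1), torch.zeros(1)

# Stand-ins for the three propagated signals.
h_adj, h_knn, h_self = torch.randn(n, 3), torch.randn(n, 3), torch.randn(n, 3)

s = torch.sigmoid(fea @ score_w + score_b)    # per-node gate, shape [n, 1]
Dk = fea @ torch.randn(f, 1)                  # per-node self-loop coefficient
gamma = 0.1

x = s * h_adj + (1 - s) * h_knn + gamma * Dk * h_self
print(x.shape)  # torch.Size([4, 3])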
deeprobust/graph/defense_pyg/gat.py
ADDED
@@ -0,0 +1,100 @@
import torch.nn as nn
import torch.nn.functional as F
import math
import torch
from torch.nn.parameter import Parameter
from torch.nn.modules.module import Module
# from torch_geometric.nn import GATConv
from .mygat_conv import GATConv
from .base_model import BaseModel


class GAT(BaseModel):

    def __init__(self, nfeat, nhid, nclass, heads=8, output_heads=1, dropout=0.5, lr=0.01,
                 nlayers=2, with_bn=False, weight_decay=5e-4, with_bias=True, device=None):

        super(GAT, self).__init__()

        assert device is not None, "Please specify 'device'!"
        self.device = device

        self.convs = nn.ModuleList([])
        if with_bn:
            self.bns = nn.ModuleList([])
            self.bns.append(nn.BatchNorm1d(nhid*heads))

        self.convs.append(GATConv(
            nfeat,
            nhid,
            heads=heads,
            dropout=dropout,
            bias=with_bias))

        for i in range(nlayers-2):
            self.convs.append(GATConv(nhid*heads,
                nhid, heads=heads, dropout=dropout, bias=with_bias))
            if with_bn:
                self.bns.append(nn.BatchNorm1d(nhid*heads))

        self.convs.append(GATConv(
            nhid * heads,
            nclass,
            heads=output_heads,
            concat=False,
            dropout=dropout,
            bias=with_bias))

        self.dropout = dropout
        self.weight_decay = weight_decay
        self.lr = lr
        self.output = None
        self.best_model = None
        self.best_output = None
        self.name = 'GAT'
        self.with_bn = with_bn

    def forward(self, x, edge_index, edge_weight=None):
        for ii, conv in enumerate(self.convs[:-1]):
            x = F.dropout(x, p=self.dropout, training=self.training)
            x = conv(x, edge_index, edge_weight)
            if self.with_bn:
                x = self.bns[ii](x)
            x = F.elu(x)
        x = F.dropout(x, p=self.dropout, training=self.training)
        x = self.convs[-1](x, edge_index, edge_weight)
        return F.log_softmax(x, dim=1)

    def get_embed(self, x, edge_index, edge_weight=None):
        for ii, conv in enumerate(self.convs[:-1]):
            x = F.dropout(x, p=self.dropout, training=self.training)
            x = conv(x, edge_index, edge_weight)
            if self.with_bn:
                x = self.bns[ii](x)
            x = F.elu(x)
        return x

    def initialize(self):
        for conv in self.convs:
            conv.reset_parameters()
        if self.with_bn:
            for bn in self.bns:
                bn.reset_parameters()


if __name__ == "__main__":
    from deeprobust.graph.data import Dataset, Dpr2Pyg
    # from deeprobust.graph.defense import GAT
    data = Dataset(root='/tmp/', name='cora')
    adj, features, labels = data.adj, data.features, data.labels
    idx_train, idx_val, idx_test = data.idx_train, data.idx_val, data.idx_test
    gat = GAT(nfeat=features.shape[1],
              nhid=8, heads=8,
              nclass=labels.max().item() + 1,
              dropout=0.5, device='cpu')
    gat = gat.to('cpu')
    pyg_data = Dpr2Pyg(data)
    gat.fit(pyg_data, verbose=True)  # train with earlystopping
    gat.test()
    print(gat.predict())
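For context on the layer this file wraps, a minimal sketch of a single `GATConv` from `torch_geometric` (from which `mygat_conv` is derived, per the commented import) on a hypothetical toy graph; requires torch_geometric to be installed:

import torch
from torch_geometric.nn import GATConv

# Toy graph: 3 nodes with 4 features; two undirected edges stored as directed pairs.
x = torch.randn(3, 4)
edge_index = torch.tensor([[0, 1, 1, 2],
                           [1, 0, 2, 1]])

conv = GATConv(in_channels=4, out_channels=8, heads=2, dropout=0.5)
out = conv(x, edge_index)
print(out.shape)  # torch.Size([3, 16]): head outputs are concatenated by default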
deeprobust/graph/defense_pyg/gcn.py
ADDED
@@ -0,0 +1,110 @@
import torch.nn as nn
import torch.nn.functional as F
import math
import torch
from torch.nn.parameter import Parameter
from torch.nn.modules.module import Module
from torch_geometric.nn import GCNConv
from .base_model import BaseModel
from torch_sparse import coalesce, SparseTensor, matmul


class GCN(BaseModel):

    def __init__(self, nfeat, nhid, nclass, nlayers=2, dropout=0.5, lr=0.01,
                 with_bn=False, weight_decay=5e-4, with_bias=True, device=None):

        super(GCN, self).__init__()

        assert device is not None, "Please specify 'device'!"
        self.device = device

        self.layers = nn.ModuleList([])
        if with_bn:
            self.bns = nn.ModuleList()

        if nlayers == 1:
            self.layers.append(GCNConv(nfeat, nclass, bias=with_bias))
        else:
            self.layers.append(GCNConv(nfeat, nhid, bias=with_bias))
            if with_bn:
                self.bns.append(nn.BatchNorm1d(nhid))
            for i in range(nlayers-2):
                self.layers.append(GCNConv(nhid, nhid, bias=with_bias))
                if with_bn:
                    self.bns.append(nn.BatchNorm1d(nhid))
            self.layers.append(GCNConv(nhid, nclass, bias=with_bias))

        self.dropout = dropout
        self.weight_decay = weight_decay
        self.lr = lr
        self.output = None
        self.best_model = None
        self.best_output = None
        self.with_bn = with_bn
        self.name = 'GCN'

    def forward(self, x, edge_index, edge_weight=None):
        x, edge_index, edge_weight = self._ensure_contiguousness(x, edge_index, edge_weight)
        for ii, layer in enumerate(self.layers):
            if edge_weight is not None:
                adj = SparseTensor.from_edge_index(edge_index, edge_weight, sparse_sizes=2 * x.shape[:1]).t()
                x = layer(x, adj)
            else:
                x = layer(x, edge_index)
            if ii != len(self.layers) - 1:
                if self.with_bn:
                    x = self.bns[ii](x)
                x = F.relu(x)
                x = F.dropout(x, p=self.dropout, training=self.training)
        return F.log_softmax(x, dim=1)

    def get_embed(self, x, edge_index, edge_weight=None):
        x, edge_index, edge_weight = self._ensure_contiguousness(x, edge_index, edge_weight)
        for ii, layer in enumerate(self.layers):
            if ii == len(self.layers) - 1:
                return x
            if edge_weight is not None:
                adj = SparseTensor.from_edge_index(edge_index, edge_weight, sparse_sizes=2 * x.shape[:1]).t()
                x = layer(x, adj)
            else:
                x = layer(x, edge_index)
            if ii != len(self.layers) - 1:
                if self.with_bn:
                    x = self.bns[ii](x)
                x = F.relu(x)
        return x

    def initialize(self):
        for m in self.layers:
            m.reset_parameters()
        if self.with_bn:
            for bn in self.bns:
                bn.reset_parameters()


if __name__ == "__main__":
    from deeprobust.graph.data import Dataset, Dpr2Pyg
    # from deeprobust.graph.defense import GCN
    data = Dataset(root='/tmp/', name='citeseer', setting='prognn')
    adj, features, labels = data.adj, data.features, data.labels
    idx_train, idx_val, idx_test = data.idx_train, data.idx_val, data.idx_test
    model = GCN(nfeat=features.shape[1],
                nhid=16,
                nclass=labels.max().item() + 1,
                dropout=0.5, device='cuda')
    model = model.to('cuda')
    pyg_data = Dpr2Pyg(data)[0]

    # model.fit(features, adj, labels, idx_train, train_iters=200, verbose=True)
    # model.test(idx_test)

    from utils import get_dataset
    pyg_data = get_dataset('citeseer', True, if_dpr=False)[0]

    import ipdb
    ipdb.set_trace()

    model.fit(pyg_data, verbose=True)  # train with earlystopping
    model.test()
    print(model.predict())
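A quick sketch of the `SparseTensor` conversion used in `forward` when explicit edge weights are given (assumes torch_sparse is installed; the toy graph is hypothetical):

import torch
from torch_sparse import SparseTensor

x = torch.randn(3, 4)                      # 3 nodes
edge_index = torch.tensor([[0, 1, 2],
                           [1, 2, 0]])
edge_weight = torch.tensor([0.5, 1.0, 0.25])

# Same construction as in GCN.forward: an N x N weighted sparse adjacency,
# transposed so aggregation runs over incoming edges; 2 * x.shape[:1] yields
# the square size (N, N).
adj = SparseTensor.from_edge_index(edge_index, edge_weight,
                                   sparse_sizes=2 * x.shape[:1]).t()
print(adj.sizes())  # [3, 3]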
deeprobust/graph/global_attack/base_attack.py
ADDED
@@ -0,0 +1,130 @@
import os.path as osp

import numpy as np
import scipy.sparse as sp
import torch
from torch.nn.modules.module import Module

from deeprobust.graph import utils


class BaseAttack(Module):
    """Abstract base class for global attack classes.

    Parameters
    ----------
    model :
        model to attack
    nnodes : int
        number of nodes in the input graph
    attack_structure : bool
        whether to attack graph structure
    attack_features : bool
        whether to attack node features
    device: str
        'cpu' or 'cuda'

    """

    def __init__(self, model, nnodes, attack_structure=True, attack_features=False, device='cpu'):
        super(BaseAttack, self).__init__()

        self.surrogate = model
        self.nnodes = nnodes
        self.attack_structure = attack_structure
        self.attack_features = attack_features
        self.device = device
        self.modified_adj = None
        self.modified_features = None
        if model is not None:
            self.nclass = model.nclass
            self.nfeat = model.nfeat
            self.hidden_sizes = model.hidden_sizes

    def attack(self, ori_adj, n_perturbations, **kwargs):
        """Generate attacks on the input graph.

        Parameters
        ----------
        ori_adj : scipy.sparse.csr_matrix
            Original (unperturbed) adjacency matrix.
        n_perturbations : int
            Number of edge removals/additions.

        Returns
        -------
        None.

        """
        pass

    def check_adj(self, adj):
        """Check if the modified adjacency is symmetric and unweighted.
        """
        assert np.abs(adj - adj.T).sum() == 0, "Input graph is not symmetric"
        assert adj.tocsr().max() == 1, "Max value should be 1!"
        assert adj.tocsr().min() == 0, "Min value should be 0!"

    def check_adj_tensor(self, adj):
        """Check if the modified adjacency is symmetric, unweighted, and has an all-zero diagonal.
        """
        assert torch.abs(adj - adj.t()).sum() == 0, "Input graph is not symmetric"
        assert adj.max() == 1, "Max value should be 1!"
        assert adj.min() == 0, "Min value should be 0!"
        diag = adj.diag()
        assert diag.max() == 0, "Diagonal should be 0!"
        assert diag.min() == 0, "Diagonal should be 0!"

    def save_adj(self, root=r'/tmp/', name='mod_adj'):
        """Save attacked adjacency matrix.

        Parameters
        ----------
        root :
            root directory where the variable should be saved
        name : str
            saved file name

        Returns
        -------
        None.

        """
        assert self.modified_adj is not None, \
            'modified_adj is None! Please perturb the graph first.'
        name = name + '.npz'
        modified_adj = self.modified_adj

        if type(modified_adj) is torch.Tensor:
            sparse_adj = utils.to_scipy(modified_adj)
            sp.save_npz(osp.join(root, name), sparse_adj)
        else:
            sp.save_npz(osp.join(root, name), modified_adj)

    def save_features(self, root=r'/tmp/', name='mod_features'):
        """Save attacked node feature matrix.

        Parameters
        ----------
        root :
            root directory where the variable should be saved
        name : str
            saved file name

        Returns
        -------
        None.

        """

        assert self.modified_features is not None, \
            'modified_features is None! Please perturb the graph first.'
        name = name + '.npz'
        modified_features = self.modified_features

        if type(modified_features) is torch.Tensor:
            sparse_features = utils.to_scipy(modified_features)
            sp.save_npz(osp.join(root, name), sparse_features)
        else:
            sp.save_npz(osp.join(root, name), modified_features)
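A quick illustration of what `check_adj` enforces, on a hypothetical 3-node undirected, unweighted graph (same calls as the method above):

import numpy as np
import scipy.sparse as sp

adj = sp.csr_matrix(np.array([[0, 1, 1],
                              [1, 0, 0],
                              [1, 0, 0]]))

assert np.abs(adj - adj.T).sum() == 0  # symmetric
assert adj.tocsr().max() == 1          # unweighted: max entry is 1
assert adj.tocsr().min() == 0          # no negative or fractional weights
print('adjacency passes check_adj-style validation')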
deeprobust/graph/global_attack/node_embedding_attack.py
ADDED
@@ -0,0 +1,522 @@
"""
Code in this file is modified from https://github.com/abojchevski/node_embedding_attack

'Adversarial Attacks on Node Embeddings via Graph Poisoning'
Aleksandar Bojchevski and Stephan Günnemann, ICML 2019
http://proceedings.mlr.press/v97/bojchevski19a.html
Copyright (C) owned by the authors, 2019
"""

import numba
import numpy as np
import scipy.sparse as sp
import scipy.linalg as spl
import torch
import networkx as nx
from deeprobust.graph.global_attack import BaseAttack


class NodeEmbeddingAttack(BaseAttack):
    """Node embedding attack. Adversarial Attacks on Node Embeddings via Graph
    Poisoning. Aleksandar Bojchevski and Stephan Günnemann, ICML 2019
    http://proceedings.mlr.press/v97/bojchevski19a.html

    Examples
    -----
    >>> from deeprobust.graph.data import Dataset
    >>> from deeprobust.graph.global_attack import NodeEmbeddingAttack
    >>> data = Dataset(root='/tmp/', name='cora_ml', seed=15)
    >>> adj, features, labels = data.adj, data.features, data.labels
    >>> model = NodeEmbeddingAttack()
    >>> model.attack(adj, attack_type="remove")
    >>> modified_adj = model.modified_adj
    >>> model.attack(adj, attack_type="remove", min_span_tree=True)
    >>> modified_adj = model.modified_adj
    >>> model.attack(adj, attack_type="add", n_candidates=10000)
    >>> modified_adj = model.modified_adj
    >>> model.attack(adj, attack_type="add_by_remove", n_candidates=10000)
    >>> modified_adj = model.modified_adj
    """

    def __init__(self):
        pass

    def attack(self, adj, n_perturbations=1000, dim=32, window_size=5,
               attack_type="remove", min_span_tree=False, n_candidates=None, seed=None, **kwargs):
        """Selects the top (n_perturbations) number of flips using our perturbation attack.

        :param adj: sp.spmatrix
            The graph represented as a sparse scipy matrix
        :param n_perturbations: int
            Number of flips to select
        :param dim: int
            Dimensionality of the embeddings.
        :param window_size: int
            Co-occurrence window size.
        :param attack_type: str
            can be chosen from ["remove", "add", "add_by_remove"]
        :param min_span_tree: bool
            Whether to disallow edges that lie on the minimum spanning tree;
            only valid when `attack_type` is "remove"
        :param n_candidates: int
            Number of candidates for addition; only valid when `attack_type` is "add" or "add_by_remove";
        :param seed: int
            Random seed
        """
        assert attack_type in ["remove", "add", "add_by_remove"], \
            "attack_type can only be `remove`, `add`, or `add_by_remove`"

        if attack_type == "remove":
            if min_span_tree:
                candidates = self.generate_candidates_removal_minimum_spanning_tree(adj)
            else:
                candidates = self.generate_candidates_removal(adj, seed)

        elif attack_type == "add" or attack_type == "add_by_remove":

            assert n_candidates, "please specify the value of `n_candidates`, " \
                + "i.e. how many candidates you want to generate for addition"
            candidates = self.generate_candidates_addition(adj, n_candidates, seed)

        n_nodes = adj.shape[0]

        if attack_type == "add_by_remove":
            candidates_add = candidates
            adj_add = self.flip_candidates(adj, candidates_add)
            vals_org_add, vecs_org_add = spl.eigh(adj_add.toarray(), np.diag(adj_add.sum(1).A1))
            flip_indicator = 1 - 2 * adj_add[candidates[:, 0], candidates[:, 1]].A1

            loss_est = estimate_loss_with_delta_eigenvals(candidates_add, flip_indicator,
                                                          vals_org_add, vecs_org_add, n_nodes, dim, window_size)

            loss_argsort = loss_est.argsort()
            top_flips = candidates_add[loss_argsort[:n_perturbations]]

        else:
            # vector indicating whether we are adding an edge (+1) or removing an edge (-1)
            delta_w = 1 - 2 * adj[candidates[:, 0], candidates[:, 1]].A1

            # generalized eigenvalues/eigenvectors
            deg_matrix = np.diag(adj.sum(1).A1)
            vals_org, vecs_org = spl.eigh(adj.toarray(), deg_matrix)

            loss_for_candidates = estimate_loss_with_delta_eigenvals(candidates, delta_w, vals_org, vecs_org, n_nodes, dim, window_size)
            top_flips = candidates[loss_for_candidates.argsort()[-n_perturbations:]]

        assert len(top_flips) == n_perturbations

        modified_adj = self.flip_candidates(adj, top_flips)
        self.check_adj(modified_adj)
        self.modified_adj = modified_adj

    def generate_candidates_removal(self, adj, seed=None):
        """Generates candidate edge flips for removal (edge -> non-edge),
        disallowing one random edge per node to prevent singleton nodes.

        :param adj: sp.csr_matrix, shape [n_nodes, n_nodes]
            Adjacency matrix of the graph
        :param seed: int
            Random seed
        :return: np.ndarray, shape [?, 2]
            Candidate set of edge flips
        """
        n_nodes = adj.shape[0]
        if seed is not None:
            np.random.seed(seed)
        deg = np.where(adj.sum(1).A1 == 1)[0]
        hidden = np.column_stack(
            (np.arange(n_nodes), np.fromiter(map(np.random.choice, adj.tolil().rows), dtype=np.int32)))

        adj_hidden = edges_to_sparse(hidden, adj.shape[0])
        adj_hidden = adj_hidden.maximum(adj_hidden.T)

        adj_keep = adj - adj_hidden

        candidates = np.column_stack((sp.triu(adj_keep).nonzero()))

        candidates = candidates[np.logical_not(np.in1d(candidates[:, 0], deg) | np.in1d(candidates[:, 1], deg))]

        return candidates

    def generate_candidates_removal_minimum_spanning_tree(self, adj):
        """Generates candidate edge flips for removal (edge -> non-edge),
        disallowing edges that lie on the minimum spanning tree.

        :param adj: sp.csr_matrix, shape [n_nodes, n_nodes]
            Adjacency matrix of the graph
        :return: np.ndarray, shape [?, 2]
            Candidate set of edge flips
        """
        mst = sp.csgraph.minimum_spanning_tree(adj)
        mst = mst.maximum(mst.T)
        adj_sample = adj - mst
        candidates = np.column_stack(sp.triu(adj_sample, 1).nonzero())

        return candidates

    def generate_candidates_addition(self, adj, n_candidates, seed=None):
        """Generates candidate edge flips for addition (non-edge -> edge).

        :param adj: sp.csr_matrix, shape [n_nodes, n_nodes]
            Adjacency matrix of the graph
        :param n_candidates: int
            Number of candidates to generate.
        :param seed: int
            Random seed
        :return: np.ndarray, shape [?, 2]
            Candidate set of edge flips
        """
        if seed is not None:
            np.random.seed(seed)

        num_nodes = adj.shape[0]

        candidates = np.random.randint(0, num_nodes, [n_candidates * 5, 2])
        candidates = candidates[candidates[:, 0] < candidates[:, 1]]
        candidates = candidates[adj[candidates[:, 0], candidates[:, 1]].A1 == 0]
        candidates = np.array(list(set(map(tuple, candidates))))
        candidates = candidates[:n_candidates]

        assert len(candidates) == n_candidates

        return candidates

    def flip_candidates(self, adj, candidates):
        """Flip the edges in the candidate set to non-edges and vice versa.

        :param adj: sp.csr_matrix, shape [n_nodes, n_nodes]
            Adjacency matrix of the graph
        :param candidates: np.ndarray, shape [?, 2]
            Candidate set of edge flips
        :return: sp.csr_matrix, shape [n_nodes, n_nodes]
            Adjacency matrix of the graph with the flipped edges/non-edges.
        """
        adj_flipped = adj.copy().tolil()
        adj_flipped[candidates[:, 0], candidates[:, 1]] = 1 - adj[candidates[:, 0], candidates[:, 1]]
        adj_flipped[candidates[:, 1], candidates[:, 0]] = 1 - adj[candidates[:, 1], candidates[:, 0]]
        adj_flipped = adj_flipped.tocsr()
        adj_flipped.eliminate_zeros()

        return adj_flipped


@numba.jit(nopython=True)
def estimate_loss_with_delta_eigenvals(candidates, flip_indicator, vals_org, vecs_org, n_nodes, dim, window_size):
    """Computes the estimated loss using the change in the eigenvalues for every candidate edge flip.

    :param candidates: np.ndarray, shape [?, 2]
        Candidate set of edge flips
    :param flip_indicator: np.ndarray, shape [?]
        Vector indicating whether we are adding an edge (+1) or removing an edge (-1)
    :param vals_org: np.ndarray, shape [n]
        The generalized eigenvalues of the clean graph
    :param vecs_org: np.ndarray, shape [n, n]
        The generalized eigenvectors of the clean graph
    :param n_nodes: int
        Number of nodes
    :param dim: int
        Embedding dimension
    :param window_size: int
        Size of the window
    :return: np.ndarray, shape [?]
        Estimated loss for each candidate flip
    """

    loss_est = np.zeros(len(candidates))
    for x in range(len(candidates)):
        i, j = candidates[x]
        vals_est = vals_org + flip_indicator[x] * (
            2 * vecs_org[i] * vecs_org[j] - vals_org * (vecs_org[i] ** 2 + vecs_org[j] ** 2))

        vals_sum_powers = sum_of_powers(vals_est, window_size)

        loss_ij = np.sqrt(np.sum(np.sort(vals_sum_powers ** 2)[:n_nodes - dim]))
        loss_est[x] = loss_ij

    return loss_est


@numba.jit(nopython=True)
def estimate_delta_eigenvecs(candidates, flip_indicator, degrees, vals_org, vecs_org, delta_eigvals, pinvs):
    """Computes the estimated change in the eigenvectors for every candidate edge flip.

    :param candidates: np.ndarray, shape [?, 2]
        Candidate set of edge flips
    :param flip_indicator: np.ndarray, shape [?]
        Vector indicating whether we are adding an edge (+1) or removing an edge (-1)
    :param degrees: np.ndarray, shape [n]
        Vector of node degrees.
    :param vals_org: np.ndarray, shape [n]
        The generalized eigenvalues of the clean graph
    :param vecs_org: np.ndarray, shape [n, n]
        The generalized eigenvectors of the clean graph
    :param delta_eigvals: np.ndarray, shape [?, n]
        Estimated change in the eigenvalues for all candidate edge flips
    :param pinvs: np.ndarray, shape [k, n, n]
        Precomputed pseudo-inverse matrices for every dimension
    :return: np.ndarray, shape [?, n, k]
        Estimated change in the eigenvectors for all candidate edge flips
    """
    n_nodes, dim = vecs_org.shape
    n_candidates = len(candidates)
    delta_eigvecs = np.zeros((n_candidates, dim, n_nodes))

    for k in range(dim):
        cur_eigvecs = vecs_org[:, k]
        cur_eigvals = vals_org[k]
        for c in range(n_candidates):
            degree_eigvec = (-delta_eigvals[c, k] * degrees) * cur_eigvecs
            i, j = candidates[c]

            degree_eigvec[i] += cur_eigvecs[j] - cur_eigvals * cur_eigvecs[i]
            degree_eigvec[j] += cur_eigvecs[i] - cur_eigvals * cur_eigvecs[j]

            delta_eigvecs[c, k] = np.dot(pinvs[k], flip_indicator[c] * degree_eigvec)

    return delta_eigvecs


def estimate_delta_eigvals(candidates, adj, vals_org, vecs_org):
    """Computes the estimated change in the eigenvalues for every candidate edge flip.

    :param candidates: np.ndarray, shape [?, 2]
        Candidate set of edge flips
    :param adj: sp.spmatrix
        The graph represented as a sparse scipy matrix
    :param vals_org: np.ndarray, shape [n]
        The generalized eigenvalues of the clean graph
    :param vecs_org: np.ndarray, shape [n, n]
        The generalized eigenvectors of the clean graph
    :return: np.ndarray, shape [?, n]
        Estimated change in the eigenvalues for all candidate edge flips
    """
    # vector indicating whether we are adding an edge (+1) or removing an edge (-1)
    delta_w = 1 - 2 * adj[candidates[:, 0], candidates[:, 1]].A1

    delta_eigvals = delta_w[:, None] * (2 * vecs_org[candidates[:, 0]] * vecs_org[candidates[:, 1]]
                                        - vals_org * (
                                            vecs_org[candidates[:, 0]] ** 2 + vecs_org[candidates[:, 1]] ** 2))

    return delta_eigvals


class OtherNodeEmbeddingAttack(NodeEmbeddingAttack):
    """Baseline methods from the paper Adversarial Attacks on Node Embeddings
    via Graph Poisoning. Aleksandar Bojchevski and Stephan Günnemann, ICML 2019.
    http://proceedings.mlr.press/v97/bojchevski19a.html

    Examples
    -----
    >>> from deeprobust.graph.data import Dataset
    >>> from deeprobust.graph.global_attack import OtherNodeEmbeddingAttack
    >>> data = Dataset(root='/tmp/', name='cora_ml', seed=15)
    >>> adj, features, labels = data.adj, data.features, data.labels
    >>> model = OtherNodeEmbeddingAttack(type='degree')
    >>> model.attack(adj, attack_type="remove")
    >>> modified_adj = model.modified_adj
    >>> #
    >>> model = OtherNodeEmbeddingAttack(type='eigencentrality')
    >>> model.attack(adj, attack_type="remove")
    >>> modified_adj = model.modified_adj
    >>> #
    >>> model = OtherNodeEmbeddingAttack(type='random')
    >>> model.attack(adj, attack_type="add", n_candidates=10000)
    >>> modified_adj = model.modified_adj
    """

    def __init__(self, type):
        assert type in ["degree", "eigencentrality", "random"]
        self.type = type

    def attack(self, adj, n_perturbations=1000, attack_type="remove",
               min_span_tree=False, n_candidates=None, seed=None, **kwargs):
        """Selects the top (n_perturbations) number of flips using one of the baseline methods.

        :param adj: sp.spmatrix
            The graph represented as a sparse scipy matrix
        :param n_perturbations: int
            Number of flips to select
        :param attack_type: str
            can be chosen from ["remove", "add"]
        :param min_span_tree: bool
            Whether to disallow edges that lie on the minimum spanning tree;
            only valid when `attack_type` is "remove"
        :param n_candidates: int
            Number of candidates for addition; only valid when `attack_type` is "add";
        :param seed: int
            Random seed;
        :return: np.ndarray, shape [?, 2]
            The top edge flips from the candidate set
        """
        assert attack_type in ["remove", "add"], \
            "attack_type can only be `remove` or `add`"

        if attack_type == "remove":
            if min_span_tree:
                candidates = self.generate_candidates_removal_minimum_spanning_tree(adj)
            else:
                candidates = self.generate_candidates_removal(adj, seed)
        elif attack_type == "add":
            assert n_candidates, "please specify the value of `n_candidates`, " \
                + "i.e. how many candidates you want to generate for addition"
            candidates = self.generate_candidates_addition(adj, n_candidates, seed)
        else:
            raise NotImplementedError

        if self.type == "random":
            top_flips = self.random_top_flips(candidates, n_perturbations, seed)
        elif self.type == "eigencentrality":
            top_flips = self.eigencentrality_top_flips(adj, candidates, n_perturbations)
        elif self.type == "degree":
            top_flips = self.degree_top_flips(adj, candidates, n_perturbations, complement=False)
        else:
            raise NotImplementedError

        assert len(top_flips) == n_perturbations
        modified_adj = self.flip_candidates(adj, top_flips)
        self.check_adj(modified_adj)
        self.modified_adj = modified_adj

    def random_top_flips(self, candidates, n_perturbations, seed=None):
        """Selects (n_perturbations) number of flips at random.

        :param candidates: np.ndarray, shape [?, 2]
            Candidate set of edge flips
        :param n_perturbations: int
            Number of flips to select
        :param seed: int
            Random seed
        :return: np.ndarray, shape [?, 2]
            The top edge flips from the candidate set
        """
        if seed is not None:
            np.random.seed(seed)
        return candidates[np.random.permutation(len(candidates))[:n_perturbations]]

    def eigencentrality_top_flips(self, adj, candidates, n_perturbations):
        """Selects the top (n_perturbations) number of flips using eigencentrality score of the edges.
        Applicable only when removing edges.

        :param adj: sp.spmatrix
            The graph represented as a sparse scipy matrix
        :param candidates: np.ndarray, shape [?, 2]
            Candidate set of edge flips
        :param n_perturbations: int
            Number of flips to select
        :return: np.ndarray, shape [?, 2]
            The top edge flips from the candidate set
        """
        edges = np.column_stack(sp.triu(adj, 1).nonzero())
        line_graph = construct_line_graph(adj)
        eigcentrality_scores = nx.eigenvector_centrality_numpy(nx.Graph(line_graph))
        eigcentrality_scores = {tuple(edges[k]): eigcentrality_scores[k] for k, v in eigcentrality_scores.items()}
        eigcentrality_scores = np.array([eigcentrality_scores[tuple(cnd)] for cnd in candidates])
        scores_argsrt = eigcentrality_scores.argsort()
        return candidates[scores_argsrt[-n_perturbations:]]

    def degree_top_flips(self, adj, candidates, n_perturbations, complement):
        """Selects the top (n_perturbations) number of flips using degree centrality score of the edges.

        :param adj: sp.spmatrix
            The graph represented as a sparse scipy matrix
        :param candidates: np.ndarray, shape [?, 2]
            Candidate set of edge flips
        :param n_perturbations: int
            Number of flips to select
        :param complement: bool
            Whether to look at the complement graph
        :return: np.ndarray, shape [?, 2]
            The top edge flips from the candidate set
        """
        if complement:
            adj = sp.csr_matrix(1 - adj.toarray())
        deg = adj.sum(1).A1
        deg_argsort = (deg[candidates[:, 0]] + deg[candidates[:, 1]]).argsort()

        return candidates[deg_argsort[-n_perturbations:]]


@numba.jit(nopython=True)
def sum_of_powers(x, power):
    """For each x_i, computes \sum_{r=1}^{power} x_i^r (elementwise sum of powers).

    :param x: shape [?]
        Any vector
    :param power: int
        The largest power to consider
    :return: shape [?]
        Vector where each element is the sum of powers from 1 to power.
    """
    n = x.shape[0]
    sum_powers = np.zeros((power, n))

    for i, i_power in enumerate(range(1, power + 1)):
        sum_powers[i] = np.power(x, i_power)

    return sum_powers.sum(0)


def edges_to_sparse(edges, num_nodes, weights=None):
    if weights is None:
        weights = np.ones(edges.shape[0])

    return sp.coo_matrix((weights, (edges[:, 0], edges[:, 1])), shape=(num_nodes, num_nodes)).tocsr()


def construct_line_graph(adj):
    """Construct a line graph from an undirected original graph.

    Parameters
    ----------
    adj : sp.spmatrix [n_samples, n_samples]
        Symmetric binary adjacency matrix.

    Returns
    -------
    L : sp.spmatrix, shape [A.nnz/2, A.nnz/2]
        Symmetric binary adjacency matrix of the line graph.
    """
    N = adj.shape[0]
    edges = np.column_stack(sp.triu(adj, 1).nonzero())
    e1, e2 = edges[:, 0], edges[:, 1]

    I = sp.eye(N).tocsr()
    E1 = I[e1]
    E2 = I[e2]

    L = E1.dot(E1.T) + E1.dot(E2.T) + E2.dot(E1.T) + E2.dot(E2.T)

    return L - 2 * sp.eye(L.shape[0])


if __name__ == "__main__":
    from deeprobust.graph.data import Dataset
    from deeprobust.graph.defense import DeepWalk
    import itertools
    # load clean graph data
    dataset_str = 'cora_ml'
    data = Dataset(root='/tmp/', name=dataset_str, seed=15)
    adj, features, labels = data.adj, data.features, data.labels
    idx_train, idx_val, idx_test = data.idx_train, data.idx_val, data.idx_test

    comb = itertools.product(["random", "degree", "eigencentrality"], ["remove", "add"])
    for type, attack_type in comb:
        model = OtherNodeEmbeddingAttack(type=type)
        print(model.type, attack_type)
        try:
            model.attack(adj, attack_type=attack_type, n_candidates=10000)
            defender = DeepWalk()
            defender.fit(adj)
            defender.evaluate_node_classification(labels, idx_train, idx_test)
        except KeyError:
            print('eigencentrality only supports removing edges')

    model = NodeEmbeddingAttack()
    model.attack(adj, attack_type="remove")
    model.attack(adj, attack_type="remove", min_span_tree=True)
    modified_adj = model.modified_adj
    model.attack(adj, attack_type="add", n_candidates=10000)
    model.attack(adj, attack_type="add_by_remove", n_candidates=10000)
    # model.attack(adj, attack_type="add")
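To illustrate the first-order eigenvalue update at the heart of `estimate_loss_with_delta_eigenvals`: flipping edge (i, j) shifts each generalized eigenvalue by flip * (2 u_i u_j - lambda (u_i^2 + u_j^2)). A minimal numpy sketch with hypothetical toy eigenpairs:

import numpy as np

# Hypothetical generalized eigenpairs of a 4-node graph.
vals_org = np.array([0.1, 0.4, 0.7, 1.0])
vecs_org = np.linalg.qr(np.random.RandomState(0).randn(4, 4))[0]

i, j = 0, 2   # candidate edge flip
flip = +1.0   # +1 = adding the edge, -1 = removing it

# Same elementwise update as in the numba-compiled loop above.
vals_est = vals_org + flip * (
    2 * vecs_org[i] * vecs_org[j] - vals_org * (vecs_org[i] ** 2 + vecs_org[j] ** 2))
print(vals_est)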
deeprobust/graph/global_attack/prbcd.py
ADDED
@@ -0,0 +1,440 @@
"""
Robustness of Graph Neural Networks at Scale. NeurIPS 2021.

Modified from https://github.com/sigeisler/robustness_of_gnns_at_scale/blob/main/rgnn_at_scale/attacks/prbcd.py
"""
import numpy as np
from deeprobust.graph.defense_pyg import GCN
import torch.nn.functional as F
import torch
import deeprobust.graph.utils as utils
from torch.nn.parameter import Parameter
from tqdm import tqdm
import torch_sparse
from torch_sparse import coalesce
import math
from torch_geometric.utils import to_scipy_sparse_matrix, from_scipy_sparse_matrix


class PRBCD:

    def __init__(self, data, model=None,
                 make_undirected=True,
                 eps=1e-7, search_space_size=10_000_000,
                 max_final_samples=20,
                 fine_tune_epochs=100,
                 epochs=400, lr_adj=0.1,
                 with_early_stopping=True,
                 do_synchronize=True,
                 device='cuda',
                 **kwargs
                 ):
        """
        Parameters
        ----------
        data : pyg format data
        model : the model to be attacked, should be models in deeprobust.graph.defense_pyg
        """
        self.device = device
        self.data = data

        if model is None:
            model = self.pretrain_model()

        self.model = model
        nnodes = data.x.shape[0]
        d = data.x.shape[1]

        self.n, self.d = nnodes, d
        self.make_undirected = make_undirected
        self.max_final_samples = max_final_samples
        self.search_space_size = search_space_size
        self.eps = eps
        self.lr_adj = lr_adj

        self.modified_edge_index: torch.Tensor = None
        self.perturbed_edge_weight: torch.Tensor = None
        if self.make_undirected:
            self.n_possible_edges = self.n * (self.n - 1) // 2
        else:
            self.n_possible_edges = self.n ** 2  # We filter self-loops later

        # lr_factor = 0.1
        # self.lr_factor = lr_factor * max(math.log2(self.n_possible_edges / self.search_space_size), 1.)
        self.epochs = epochs
        self.epochs_resampling = epochs - fine_tune_epochs  # TODO

        self.with_early_stopping = with_early_stopping
        self.do_synchronize = do_synchronize

    def pretrain_model(self, model=None):
        data = self.data
        device = self.device
        feat, labels = data.x, data.y
        nclass = max(labels).item() + 1

        if model is None:
            model = GCN(nfeat=feat.shape[1], nhid=256, dropout=0,
                        nlayers=3, with_bn=True, weight_decay=5e-4, nclass=nclass,
                        device=device).to(device)
            print(model)

        model.fit(data, train_iters=1000, patience=200, verbose=True)
        model.eval()
        model.data = data.to(self.device)
        output = model.predict()
        labels = labels.to(device)
        print(f"{model.name} Test set results:", self.get_perf(output, labels, data.test_mask, verbose=0)[1])
        self.clean_node_mask = (output.argmax(1) == labels)
        return model

    def sample_random_block(self, n_perturbations):
        for _ in range(self.max_final_samples):
            self.current_search_space = torch.randint(
                self.n_possible_edges, (self.search_space_size,), device=self.device)
            self.current_search_space = torch.unique(self.current_search_space, sorted=True)
            if self.make_undirected:
                self.modified_edge_index = linear_to_triu_idx(self.n, self.current_search_space)
            else:
                self.modified_edge_index = linear_to_full_idx(self.n, self.current_search_space)
                is_not_self_loop = self.modified_edge_index[0] != self.modified_edge_index[1]
                self.current_search_space = self.current_search_space[is_not_self_loop]
                self.modified_edge_index = self.modified_edge_index[:, is_not_self_loop]

            self.perturbed_edge_weight = torch.full_like(
                self.current_search_space, self.eps, dtype=torch.float32, requires_grad=True
            )
            if self.current_search_space.size(0) >= n_perturbations:
                return
        raise RuntimeError('Sampling random block was not successful. Please decrease `n_perturbations`.')

    @torch.no_grad()
    def sample_final_edges(self, n_perturbations):
        best_loss = -float('Inf')
        perturbed_edge_weight = self.perturbed_edge_weight.detach()
        perturbed_edge_weight[perturbed_edge_weight <= self.eps] = 0

        _, feat, labels = self.edge_index, self.data.x, self.data.y
        for i in range(self.max_final_samples):
            if best_loss == float('Inf') or best_loss == -float('Inf'):
                # In first iteration employ top k heuristic instead of sampling
                sampled_edges = torch.zeros_like(perturbed_edge_weight)
                sampled_edges[torch.topk(perturbed_edge_weight, n_perturbations).indices] = 1
            else:
                sampled_edges = torch.bernoulli(perturbed_edge_weight).float()

            if sampled_edges.sum() > n_perturbations:
                n_samples = sampled_edges.sum()
                print(f'{i}-th sampling: too many samples {n_samples}')
                continue
            self.perturbed_edge_weight = sampled_edges

            edge_index, edge_weight = self.get_modified_adj()
            with torch.no_grad():
                output = self.model.forward(feat, edge_index, edge_weight)
                loss = F.nll_loss(output[self.data.val_mask], labels[self.data.val_mask]).item()

            if best_loss < loss:
                best_loss = loss
                print('best_loss:', best_loss)
                best_edges = self.perturbed_edge_weight.clone().cpu()

        # Recover best sample
        self.perturbed_edge_weight.data.copy_(best_edges.to(self.device))

        edge_index, edge_weight = self.get_modified_adj()
        edge_mask = edge_weight == 1

        allowed_perturbations = 2 * n_perturbations if self.make_undirected else n_perturbations
        edges_after_attack = edge_mask.sum()
        clean_edges = self.edge_index.shape[1]
        assert (edges_after_attack >= clean_edges - allowed_perturbations
                and edges_after_attack <= clean_edges + allowed_perturbations), \
            f'{edges_after_attack} out of range with {clean_edges} clean edges and {n_perturbations} perturbations'
        return edge_index[:, edge_mask], edge_weight[edge_mask]

    def resample_random_block(self, n_perturbations: int):
        self.keep_heuristic = 'WeightOnly'
        if self.keep_heuristic == 'WeightOnly':
            sorted_idx = torch.argsort(self.perturbed_edge_weight)
            idx_keep = (self.perturbed_edge_weight <= self.eps).sum().long()
            # Keep at most half of the block (i.e. resample low weights)
            if idx_keep < sorted_idx.size(0) // 2:
                idx_keep = sorted_idx.size(0) // 2
        else:
            raise NotImplementedError('Only keep_heuristic=`WeightOnly` supported')

        sorted_idx = sorted_idx[idx_keep:]
        self.current_search_space = self.current_search_space[sorted_idx]
        self.modified_edge_index = self.modified_edge_index[:, sorted_idx]
        self.perturbed_edge_weight = self.perturbed_edge_weight[sorted_idx]

        # Sample until enough edges were drawn
        for i in range(self.max_final_samples):
            n_edges_resample = self.search_space_size - self.current_search_space.size(0)
            lin_index = torch.randint(self.n_possible_edges, (n_edges_resample,), device=self.device)

            self.current_search_space, unique_idx = torch.unique(
                torch.cat((self.current_search_space, lin_index)),
                sorted=True,
                return_inverse=True
            )

            if self.make_undirected:
                self.modified_edge_index = linear_to_triu_idx(self.n, self.current_search_space)
            else:
                self.modified_edge_index = linear_to_full_idx(self.n, self.current_search_space)

            # Merge existing weights with new edge weights
            perturbed_edge_weight_old = self.perturbed_edge_weight.clone()
            self.perturbed_edge_weight = torch.full_like(self.current_search_space, self.eps, dtype=torch.float32)
            self.perturbed_edge_weight[
                unique_idx[:perturbed_edge_weight_old.size(0)]
            ] = perturbed_edge_weight_old  # unique_idx: the indices for the old edges

            if not self.make_undirected:
                is_not_self_loop = self.modified_edge_index[0] != self.modified_edge_index[1]
                self.current_search_space = self.current_search_space[is_not_self_loop]
                self.modified_edge_index = self.modified_edge_index[:, is_not_self_loop]
                self.perturbed_edge_weight = self.perturbed_edge_weight[is_not_self_loop]

            if self.current_search_space.size(0) > n_perturbations:
                return
        raise RuntimeError('Sampling random block was not successful. Please decrease `n_perturbations`.')

    def project(self, n_perturbations, values, eps, inplace=False):
        if not inplace:
            values = values.clone()

        if torch.clamp(values, 0, 1).sum() > n_perturbations:
            left = (values - 1).min()
            right = values.max()
            miu = bisection(values, left, right, n_perturbations)
            values.data.copy_(torch.clamp(
                values - miu, min=eps, max=1 - eps
            ))
        else:
            values.data.copy_(torch.clamp(
                values, min=eps, max=1 - eps
            ))
        return values

    def get_modified_adj(self):
        if self.make_undirected:
            modified_edge_index, modified_edge_weight = to_symmetric(
                self.modified_edge_index, self.perturbed_edge_weight, self.n
            )
        else:
            modified_edge_index, modified_edge_weight = self.modified_edge_index, self.perturbed_edge_weight
        edge_index = torch.cat((self.edge_index.to(self.device), modified_edge_index), dim=-1)
        edge_weight = torch.cat((self.edge_weight.to(self.device), modified_edge_weight))

        edge_index, edge_weight = torch_sparse.coalesce(edge_index, edge_weight, m=self.n, n=self.n, op='sum')

        # Allow removal of edges
        edge_weight[edge_weight > 1] = 2 - edge_weight[edge_weight > 1]
        return edge_index, edge_weight

    def update_edge_weights(self, n_perturbations, epoch, gradient):
        self.optimizer_adj.zero_grad()
        self.perturbed_edge_weight.grad = -gradient
        self.optimizer_adj.step()
        self.perturbed_edge_weight.data[self.perturbed_edge_weight < self.eps] = self.eps

    def _update_edge_weights(self, n_perturbations, epoch, gradient):
        lr_factor = n_perturbations / self.n / 2 * self.lr_factor
        lr = lr_factor / np.sqrt(max(0, epoch - self.epochs_resampling) + 1)
        self.perturbed_edge_weight.data.add_(lr * gradient)
        self.perturbed_edge_weight.data[self.perturbed_edge_weight < self.eps] = self.eps
        return None

    def attack(self, edge_index=None, edge_weight=None, ptb_rate=0.1):
        data = self.data
        epochs, lr_adj = self.epochs, self.lr_adj
        model = self.model
        model.eval()  # should set to eval

        self.edge_index, feat, labels = data.edge_index, data.x, data.y
        with torch.no_grad():
            output = model.forward(feat, self.edge_index)
            pred = output.argmax(1)
        gt_labels = labels
        labels = labels.clone()  # to avoid shallow copy
        labels[~data.train_mask] = pred[~data.train_mask]

        if edge_index is not None:
            self.edge_index = edge_index

        self.edge_weight = torch.ones(self.edge_index.shape[1]).to(self.device)

        n_perturbations = int(ptb_rate * self.edge_index.shape[1] // 2)
        print('n_perturbations:', n_perturbations)
        self.sample_random_block(n_perturbations)

        self.perturbed_edge_weight.requires_grad = True
        self.optimizer_adj = torch.optim.Adam([self.perturbed_edge_weight], lr=lr_adj)
        best_loss_val = -float('Inf')
        for it in tqdm(range(epochs)):
            self.perturbed_edge_weight.requires_grad = True
            edge_index, edge_weight = self.get_modified_adj()
            if torch.cuda.is_available() and self.do_synchronize:
                torch.cuda.empty_cache()
                torch.cuda.synchronize()
            output = model.forward(feat, edge_index, edge_weight)
            loss = self.loss_attack(output, labels, type='tanhMargin')
            gradient = grad_with_checkpoint(loss, self.perturbed_edge_weight)[0]

            if torch.cuda.is_available() and self.do_synchronize:
                torch.cuda.empty_cache()
                torch.cuda.synchronize()
            if it % 10 == 0:
                print(f'Epoch {it}: {loss}')

            with torch.no_grad():
                self.update_edge_weights(n_perturbations, it, gradient)
                self.perturbed_edge_weight = self.project(
                    n_perturbations, self.perturbed_edge_weight, self.eps)

                del edge_index, edge_weight  # , logits

                if it < self.epochs_resampling - 1:
                    self.resample_random_block(n_perturbations)

                edge_index, edge_weight = self.get_modified_adj()
                output = model.predict(feat, edge_index, edge_weight)
                loss_val = F.nll_loss(output[data.val_mask], labels[data.val_mask])

            self.perturbed_edge_weight.requires_grad = True
            self.optimizer_adj = torch.optim.Adam([self.perturbed_edge_weight], lr=lr_adj)

        # Sample final discrete graph
        edge_index, edge_weight = self.sample_final_edges(n_perturbations)
        output = model.predict(feat, edge_index, edge_weight)
        print('Test:')
        self.get_perf(output, gt_labels, data.test_mask)
        print('Validation:')
        self.get_perf(output, gt_labels, data.val_mask)
        return edge_index, edge_weight

    def loss_attack(self, logits, labels, type='CE'):
        self.loss_type = type
        if self.loss_type == 'tanhMargin':
            sorted = logits.argsort(-1)
            best_non_target_class = sorted[sorted != labels[:, None]].reshape(logits.size(0), -1)[:, -1]
            margin = (
                logits[np.arange(logits.size(0)), labels]
                - logits[np.arange(logits.size(0)), best_non_target_class]
            )
            loss = torch.tanh(-margin).mean()
        elif self.loss_type == 'MCE':
            not_flipped = logits.argmax(-1) == labels
            loss = F.cross_entropy(logits[not_flipped], labels[not_flipped])
        elif self.loss_type == 'NCE':
            sorted = logits.argsort(-1)
            best_non_target_class = sorted[sorted != labels[:, None]].reshape(logits.size(0), -1)[:, -1]
            loss = -F.cross_entropy(logits, best_non_target_class)
        else:
            loss = F.cross_entropy(logits, labels)
        return loss

    def get_perf(self, output, labels, mask, verbose=True):
        loss = F.nll_loss(output[mask], labels[mask])
        acc = utils.accuracy(output[mask], labels[mask])
        if verbose:
            print("loss= {:.4f}".format(loss.item()),
                  "accuracy= {:.4f}".format(acc.item()))
        return loss.item(), acc.item()

@torch.jit.script
def softmax_entropy(x: torch.Tensor) -> torch.Tensor:
    """Entropy of softmax distribution from **logits**."""
    return -(x.softmax(1) * x.log_softmax(1)).sum(1)

@torch.jit.script
def entropy(x: torch.Tensor) -> torch.Tensor:
    """Entropy of softmax distribution from **log_softmax**."""
    return -(torch.exp(x) * x).sum(1)

def to_symmetric(edge_index, edge_weight, n, op='mean'):
    symmetric_edge_index = torch.cat(
        (edge_index, edge_index.flip(0)), dim=-1
    )

    symmetric_edge_weight = edge_weight.repeat(2)

    symmetric_edge_index, symmetric_edge_weight = coalesce(
        symmetric_edge_index,
        symmetric_edge_weight,
        m=n,
        n=n,
        op=op
    )
    return symmetric_edge_index, symmetric_edge_weight

def linear_to_full_idx(n: int, lin_idx: torch.Tensor) -> torch.Tensor:
    row_idx = lin_idx // n
    col_idx = lin_idx % n
    return torch.stack((row_idx, col_idx))

def linear_to_triu_idx(n: int, lin_idx: torch.Tensor) -> torch.Tensor:
    row_idx = (
        n
        - 2
        - torch.floor(torch.sqrt(-8 * lin_idx.double() + 4 * n * (n - 1) - 7) / 2.0 - 0.5)
    ).long()
    col_idx = (
        lin_idx
        + row_idx
        + 1 - n * (n - 1) // 2
        + (n - row_idx) * ((n - row_idx) - 1) // 2
    )
    return torch.stack((row_idx, col_idx))

def grad_with_checkpoint(outputs, inputs):
    inputs = (inputs,) if isinstance(inputs, torch.Tensor) else tuple(inputs)
    for input in inputs:
        if not input.is_leaf:
            input.retain_grad()
    torch.autograd.backward(outputs)

    grad_outputs = []
    for input in inputs:
        grad_outputs.append(input.grad.clone())
        input.grad.zero_()
    return grad_outputs

def bisection(edge_weights, a, b, n_perturbations, epsilon=1e-5, iter_max=1e5):
    def func(x):
        return torch.clamp(edge_weights - x, 0, 1).sum() - n_perturbations

    miu = a
    for i in range(int(iter_max)):
        miu = (a + b) / 2
        # Check if middle point is root
        if (func(miu) == 0.0):
            break
        # Decide the side to repeat the steps
        if (func(miu) * func(a) < 0):
            b = miu
        else:
            a = miu
        if ((b - a) <= epsilon):
            break
    return miu


if __name__ == "__main__":
    from ogb.nodeproppred import PygNodePropPredDataset
    from torch_geometric.utils import to_undirected
    import torch_geometric.transforms as T
    dataset = PygNodePropPredDataset(name='ogbn-arxiv')
    dataset.transform = T.NormalizeFeatures()
    data = dataset[0]
    if not hasattr(data, 'train_mask'):
        utils.add_mask(data, dataset)
    data.edge_index = to_undirected(data.edge_index, data.num_nodes)
    agent = PRBCD(data)
    edge_index, edge_weight = agent.attack()
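A small sanity check for the closed-form index mapping above (illustrative only, not part of the uploaded file): `linear_to_triu_idx` should enumerate exactly the strict upper-triangular pairs, in the same row-major order as `torch.triu_indices`.

import torch

n = 5
lin = torch.arange(n * (n - 1) // 2)
pairs = linear_to_triu_idx(n, lin)             # shape [2, 10]
expected = torch.triu_indices(n, n, offset=1)  # strict upper triangle, row-major
assert torch.equal(pairs, expected)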
deeprobust/graph/rl/nipa_env.py
ADDED
@@ -0,0 +1,169 @@
"""
This part of code is adopted from https://github.com/Hanjun-Dai/graph_adversarial_attack (Copyright (c) 2018 Dai, Hanjun and Li, Hui and Tian, Tian and Huang, Xin and Wang, Lin and Zhu, Jun and Song, Le)
but modified to be integrated into the repository.
"""

import os
import sys
import numpy as np
import torch
import networkx as nx
import random
from torch.nn.parameter import Parameter
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from tqdm import tqdm
from copy import deepcopy
import pickle as cp
from deeprobust.graph.utils import *
import scipy.sparse as sp
from scipy.sparse.linalg.eigen.arpack import eigsh
from deeprobust.graph import utils
from deeprobust.graph.rl.env import *

class NodeInjectionEnv(NodeAttackEnv):
    """Node attack environment. It executes an action and then changes the
    environment status (modifies the graph).
    """

    def __init__(self, features, labels, idx_train, idx_val, dict_of_lists, classifier, ratio=0.01, parallel_size=1, reward_type='binary'):
        """number of injected nodes: ratio*|V|
        number of modifications: ratio*|V|*|D_avg|
        """
        # super(NodeInjectionEnv, self).__init__(features, labels, all_targets, list_action_space, classifier, num_mod, reward_type)
        super(NodeInjectionEnv, self).__init__(features, labels, idx_val, dict_of_lists, classifier)
        self.parallel_size = parallel_size

        degrees = np.array([len(d) for n, d in dict_of_lists.items()])
        N = len(degrees[degrees > 0])
        avg_degree = degrees.sum() / N
        self.n_injected = len(degrees) - N
        assert self.n_injected == int(ratio * N)

        self.ori_adj_size = N
        self.n_perturbations = int(self.n_injected * avg_degree)
        print("number of perturbations: {}".format(self.n_perturbations))
        self.all_nodes = np.arange(N)
        self.injected_nodes = self.all_nodes[-self.n_injected:]
        self.previous_acc = [1] * parallel_size

        self.idx_train = np.hstack((idx_train, self.injected_nodes))
        self.idx_val = idx_val

        self.modified_label_list = []
        for i in range(self.parallel_size):
            self.modified_label_list.append(labels[-self.n_injected:].clone())


    def init_overall_steps(self):
        self.overall_steps = 0
        self.modified_list = []
        for i in range(self.parallel_size):
            self.modified_list.append(ModifiedGraph())

    def setup(self):
        self.n_steps = 0
        self.first_nodes = None
        self.second_nodes = None
        self.rewards = None
        self.binary_rewards = None
        self.list_acc_of_all = []

    def step(self, actions, inference=False):
        '''
        run actions and get reward
        '''
        if self.first_nodes is None:  # pick the first node of the edge
            assert (self.n_steps + 1) % 3 == 1
            self.first_nodes = actions[:]

        if (self.n_steps + 1) % 3 == 2:
            self.second_nodes = actions[:]
            for i in range(self.parallel_size):
                # add an edge to the graph
                self.modified_list[i].add_edge(self.first_nodes[i], actions[i], 1.0)

        if (self.n_steps + 1) % 3 == 0:
            for i in range(self.parallel_size):
                # change label
                self.modified_label_list[i][self.first_nodes[i] - self.ori_adj_size] = actions[i]

            self.first_nodes = None
            self.second_nodes = None

        self.n_steps += 1
        self.overall_steps += 1

        if not inference:
            if self.isActionFinished():
                rewards = []
                for i in range(self.parallel_size):
                    device = self.labels.device
                    extra_adj = self.modified_list[i].get_extra_adj(device=device)
                    adj = self.classifier.norm_tool.norm_extra(extra_adj)
                    labels = torch.cat((self.labels, self.modified_label_list[i]))
                    # self.classifier.fit(self.features, adj, labels, self.idx_train, self.idx_val, normalize=False)
                    self.classifier.fit(self.features, adj, labels, self.idx_train, self.idx_val, normalize=False, patience=30)
                    output = self.classifier(self.features, adj)
                    loss, correct = loss_acc(output, self.labels, self.idx_val, avg_loss=False)
                    acc = correct.sum()
                    # r = 1 if self.previous_acc[i] - acc > 0.01 else -1
                    r = 1 if self.previous_acc[i] - acc > 0 else -1
                    self.previous_acc[i] = acc
                    rewards.append(r)
                self.rewards = np.array(rewards).astype(np.float32)


    def sample_pos_rewards(self, num_samples):
        assert self.list_acc_of_all is not None
        cands = []

        for i in range(len(self.list_acc_of_all)):
            succ = np.where(self.list_acc_of_all[i] < 0.9)[0]

            for j in range(len(succ)):
                cands.append((i, self.all_targets[succ[j]]))

        if num_samples > len(cands):
            return cands
        random.shuffle(cands)
        return cands[0:num_samples]

    def uniformRandActions(self):
        act_list = []
        for i in range(self.parallel_size):
            if self.first_nodes is None:
                # a1: choose a node from injected nodes
                cur_action = np.random.choice(self.injected_nodes)

            if self.first_nodes is not None and self.second_nodes is None:
                # a2: choose a node from all nodes
                cur_action = np.random.randint(len(self.list_action_space))
                while (self.first_nodes[i], cur_action) in self.modified_list[i].edge_set:
                    cur_action = np.random.randint(len(self.list_action_space))

            if self.first_nodes is not None and self.second_nodes is not None:
                # a3: choose label
                cur_action = np.random.randint(self.labels.cpu().max() + 1)

            act_list.append(cur_action)
        return act_list

    def isActionFinished(self):
        if (self.n_steps) % 3 == 0 and self.n_steps != 0:
            return True
        return False

    def isTerminal(self):
        if self.overall_steps == 3 * self.n_perturbations:
            return True
        return False

    def getStateRef(self):
        return list(zip(self.modified_list, self.modified_label_list))

    def cloneState(self):
        return list(zip(deepcopy(self.modified_list), deepcopy(self.modified_label_list)))
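A minimal driver sketch for `NodeInjectionEnv` (hypothetical, not part of the uploaded file; assumes `env` is an already-constructed instance): actions are consumed in cycles of three, pick an injected node, pick a node to wire it to, then assign the injected node a label, and a reward is computed only when `isActionFinished()` turns true at the end of each cycle.

env.init_overall_steps()
env.setup()
while not env.isTerminal():
    # one call per phase; the env tracks the phase via n_steps % 3
    env.step(env.uniformRandActions())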
deeprobust/graph/rl/rl_s2v_config.py
ADDED
@@ -0,0 +1,57 @@
"""Copyright (c) 2018 Dai, Hanjun and Li, Hui and Tian, Tian and Huang, Xin and Wang, Lin and Zhu, Jun and Song, Le
"""
import argparse
import pickle as cp

cmd_opt = argparse.ArgumentParser(description='Argparser for RL-S2V attack')

cmd_opt.add_argument('-saved_model', type=str, default=None, help='saved model')
cmd_opt.add_argument('-save_dir', type=str, default=None, help='save folder')
cmd_opt.add_argument('-ctx', type=str, default='gpu', help='cpu/gpu')

cmd_opt.add_argument('-phase', type=str, default='train', help='train/test')
cmd_opt.add_argument('-batch_size', type=int, default=10, help='minibatch size')
cmd_opt.add_argument('-seed', type=int, default=1, help='seed')

cmd_opt.add_argument('-gm', default='mean_field', help='mean_field/loopy_bp/gcn')
cmd_opt.add_argument('-latent_dim', type=int, default=64, help='dimension of latent layers')
cmd_opt.add_argument('-hidden', type=int, default=0, help='dimension of classification')
cmd_opt.add_argument('-max_lv', type=int, default=1, help='max rounds of message passing')

# target model
cmd_opt.add_argument('-num_epochs', type=int, default=200, help='number of epochs')
cmd_opt.add_argument('-learning_rate', type=float, default=0.01, help='init learning_rate')
cmd_opt.add_argument('-weight_decay', type=float, default=5e-4, help='weight_decay')
cmd_opt.add_argument('-dropout', type=float, default=0.5, help='dropout rate')

# for node classification
cmd_opt.add_argument('-dataset', type=str, default='cora', help='citeseer/cora/pubmed')

# for attack
cmd_opt.add_argument('-num_steps', type=int, default=500000, help='rl training steps')
# cmd_opt.add_argument('-frac_meta', type=float, default=0, help='fraction for meta rl learning')

cmd_opt.add_argument('-meta_test', type=int, default=0, help='for meta rl learning')
cmd_opt.add_argument('-reward_type', type=str, default='binary', help='binary/nll')
cmd_opt.add_argument('-num_mod', type=int, default=1, help='number of modifications allowed')

# for node attack
cmd_opt.add_argument('-bilin_q', type=int, default=1, help='bilinear q or not')
cmd_opt.add_argument('-mlp_hidden', type=int, default=64, help='mlp hidden layer size')
# cmd_opt.add_argument('-n_hops', type=int, default=2, help='attack range')


args, _ = cmd_opt.parse_known_args()
args.save_dir = './results/rl_s2v/{}-gcn'.format(args.dataset)
args.saved_model = 'results/node_classification/{}'.format(args.dataset)
print(args)

def build_kwargs(keys, arg_dict):
    st = ''
    for key in keys:
        st += '%s-%s' % (key, str(arg_dict[key]))
    return st

def save_args(fout, args):
    with open(fout, 'wb') as f:
        cp.dump(args, f, cp.HIGHEST_PROTOCOL)
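Illustrative use of `build_kwargs` (not part of the uploaded file): it concatenates key-value pairs into a single tag string, with no separator between consecutive pairs.

tag = build_kwargs(['dataset', 'seed'], vars(args))
print(tag)  # 'dataset-coraseed-1' with the defaults above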
deeprobust/graph/targeted_attack/__init__.py
ADDED
@@ -0,0 +1,9 @@
from .base_attack import BaseAttack
from .fga import FGA
from .rnd import RND
from .nettack import Nettack
from .ig_attack import IGAttack
from .rl_s2v import RLS2V
from .sga import SGAttack

__all__ = ['BaseAttack', 'FGA', 'RND', 'Nettack', 'IGAttack', 'RLS2V', 'SGAttack']
deeprobust/graph/targeted_attack/base_attack.py
ADDED
@@ -0,0 +1,126 @@
from torch.nn.modules.module import Module
import numpy as np
import torch
import scipy.sparse as sp
import os.path as osp
from deeprobust.graph import utils  # needed by save_adj/save_features (utils.to_scipy)

class BaseAttack(Module):
    """Abstract base class for targeted attack classes.

    Parameters
    ----------
    model :
        model to attack
    nnodes : int
        number of nodes in the input graph
    attack_structure : bool
        whether to attack graph structure
    attack_features : bool
        whether to attack node features
    device: str
        'cpu' or 'cuda'

    """

    def __init__(self, model, nnodes, attack_structure=True, attack_features=False, device='cpu'):
        super(BaseAttack, self).__init__()

        self.surrogate = model
        self.nnodes = nnodes
        self.attack_structure = attack_structure
        self.attack_features = attack_features
        self.device = device

        if model is not None:
            self.nclass = model.nclass
            self.nfeat = model.nfeat
            self.hidden_sizes = model.hidden_sizes

        self.modified_adj = None
        self.modified_features = None

    def attack(self, ori_adj, n_perturbations, **kwargs):
        """Generate perturbations on the input graph.

        Parameters
        ----------
        ori_adj : scipy.sparse.csr_matrix
            Original (unperturbed) adjacency matrix.
        n_perturbations : int
            Number of perturbations on the input graph. Perturbations could
            be edge removals/additions or feature removals/additions.

        Returns
        -------
        None.

        """
        pass

    def check_adj(self, adj):
        """Check if the modified adjacency is symmetric and unweighted.
        """

        if type(adj) is torch.Tensor:
            adj = adj.cpu().numpy()
        assert np.abs(adj - adj.T).sum() == 0, "Input graph is not symmetric"
        if sp.issparse(adj):
            assert adj.tocsr().max() == 1, "Max value should be 1!"
            assert adj.tocsr().min() == 0, "Min value should be 0!"
        else:
            assert adj.max() == 1, "Max value should be 1!"
            assert adj.min() == 0, "Min value should be 0!"

    def save_adj(self, root=r'/tmp/', name='mod_adj'):
        """Save attacked adjacency matrix.

        Parameters
        ----------
        root :
            root directory where the variable should be saved
        name : str
            saved file name

        Returns
        -------
        None.

        """
        assert self.modified_adj is not None, \
            'modified_adj is None! Please perturb the graph first.'
        name = name + '.npz'
        modified_adj = self.modified_adj

        if type(modified_adj) is torch.Tensor:
            modified_adj = utils.to_scipy(modified_adj)
        if sp.issparse(modified_adj):
            modified_adj = modified_adj.tocsr()
        sp.save_npz(osp.join(root, name), modified_adj)

    def save_features(self, root=r'/tmp/', name='mod_features'):
        """Save attacked node feature matrix.

        Parameters
        ----------
        root :
            root directory where the variable should be saved
        name : str
            saved file name

        Returns
        -------
        None.

        """

        assert self.modified_features is not None, \
            'modified_features is None! Please perturb the graph first.'
        name = name + '.npz'
        modified_features = self.modified_features

        if type(modified_features) is torch.Tensor:
            modified_features = utils.to_scipy(modified_features)
        if sp.issparse(modified_features):
            modified_features = modified_features.tocsr()
        sp.save_npz(osp.join(root, name), modified_features)
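A minimal sketch (hypothetical, not part of the library) of how a concrete attack plugs into `BaseAttack`: override `attack`, write the result into `self.modified_adj`, and reuse `check_adj`/`save_adj` from the base class.

import numpy as np
import scipy.sparse as sp

class RandomFlip(BaseAttack):
    """Toy illustration: flip n_perturbations random off-diagonal entries."""

    def attack(self, ori_adj, n_perturbations, **kwargs):
        adj = ori_adj.tolil(copy=True)
        rng = np.random.default_rng(0)
        flipped = 0
        while flipped < n_perturbations:
            u, v = rng.integers(0, self.nnodes, size=2)
            if u == v:
                continue
            val = 1 - adj[u, v]
            adj[u, v] = adj[v, u] = val  # keep the graph symmetric
            flipped += 1
        self.modified_adj = adj.tocsr()
        self.check_adj(self.modified_adj)

# attacker = RandomFlip(model=None, nnodes=adj.shape[0])
# attacker.attack(adj, n_perturbations=5); attacker.save_adj('/tmp/')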
deeprobust/graph/targeted_attack/fga.py
ADDED
@@ -0,0 +1,124 @@
"""
FGA: Fast Gradient Attack on Network Embedding (https://arxiv.org/pdf/1809.02797.pdf)
Another very similar algorithm to mention here is FGSM (for graph data).
It is mentioned in Zügner's paper,
Adversarial Attacks on Neural Networks for Graph Data, KDD'18
"""

import torch
from deeprobust.graph.targeted_attack import BaseAttack
from torch.nn.parameter import Parameter
from copy import deepcopy
from deeprobust.graph import utils
import torch.nn.functional as F
import scipy.sparse as sp

class FGA(BaseAttack):
    """FGA/FGSM.

    Parameters
    ----------
    model :
        model to attack
    nnodes : int
        number of nodes in the input graph
    feature_shape : tuple
        shape of the input node features
    attack_structure : bool
        whether to attack graph structure
    attack_features : bool
        whether to attack node features
    device: str
        'cpu' or 'cuda'

    Examples
    --------

    >>> from deeprobust.graph.data import Dataset
    >>> from deeprobust.graph.defense import GCN
    >>> from deeprobust.graph.targeted_attack import FGA
    >>> data = Dataset(root='/tmp/', name='cora')
    >>> adj, features, labels = data.adj, data.features, data.labels
    >>> idx_train, idx_val, idx_test = data.idx_train, data.idx_val, data.idx_test
    >>> # Setup Surrogate model
    >>> surrogate = GCN(nfeat=features.shape[1], nclass=labels.max().item()+1,
                    nhid=16, dropout=0, with_relu=False, with_bias=False, device='cpu').to('cpu')
    >>> surrogate.fit(features, adj, labels, idx_train, idx_val, patience=30)
    >>> # Setup Attack Model
    >>> target_node = 0
    >>> model = FGA(surrogate, nnodes=adj.shape[0], attack_structure=True, attack_features=False, device='cpu').to('cpu')
    >>> # Attack
    >>> model.attack(features, adj, labels, idx_train, target_node, n_perturbations=5)
    >>> modified_adj = model.modified_adj

    """

    def __init__(self, model, nnodes, feature_shape=None, attack_structure=True, attack_features=False, device='cpu'):

        super(FGA, self).__init__(model, nnodes, attack_structure=attack_structure, attack_features=attack_features, device=device)

        assert not self.attack_features, "not support attacking features"

        if self.attack_features:
            self.feature_changes = Parameter(torch.FloatTensor(feature_shape))
            self.feature_changes.data.fill_(0)

    def attack(self, ori_features, ori_adj, labels, idx_train, target_node, n_perturbations, verbose=False, **kwargs):
        """Generate perturbations on the input graph.

        Parameters
        ----------
        ori_features : scipy.sparse.csr_matrix
            Original (unperturbed) node feature matrix
        ori_adj : scipy.sparse.csr_matrix
            Original (unperturbed) adjacency matrix
        labels :
            node labels
        idx_train:
            training node indices
        target_node : int
            target node index to be attacked
        n_perturbations : int
            Number of perturbations on the input graph. Perturbations could
            be edge removals/additions or feature removals/additions.
        """

        modified_adj = ori_adj.todense()
        modified_features = ori_features.todense()
        modified_adj, modified_features, labels = utils.to_tensor(modified_adj, modified_features, labels, device=self.device)

        self.surrogate.eval()
        if verbose:
            print('number of perturbations: %s' % n_perturbations)

        pseudo_labels = self.surrogate.predict().detach().argmax(1)
        pseudo_labels[idx_train] = labels[idx_train]

        modified_adj.requires_grad = True
        for i in range(n_perturbations):
            adj_norm = utils.normalize_adj_tensor(modified_adj)

            if self.attack_structure:
                output = self.surrogate(modified_features, adj_norm)
                loss = F.nll_loss(output[[target_node]], pseudo_labels[[target_node]])
                grad = torch.autograd.grad(loss, modified_adj)[0]
                # bidirectional: combine the gradients of both symmetric entries
                grad = (grad[target_node] + grad[:, target_node]) * (-2*modified_adj[target_node] + 1)
                grad[target_node] = -10
                grad_argmax = torch.argmax(grad)

            value = -2*modified_adj[target_node][grad_argmax] + 1
            modified_adj.data[target_node][grad_argmax] += value
            modified_adj.data[grad_argmax][target_node] += value

            if self.attack_features:
                pass

        modified_adj = modified_adj.detach().cpu().numpy()
        modified_adj = sp.csr_matrix(modified_adj)
        self.check_adj(modified_adj)
        self.modified_adj = modified_adj
        # self.modified_features = modified_features
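A toy illustration of the score used in the loop above (illustrative only, not from the file): multiplying the symmetrized gradient by `(-2*A[target] + 1)` keeps the sign for non-edges (candidate additions) and flips it for existing edges (candidate removals), so a single argmax ranks both kinds of flip.

import torch

A_row = torch.tensor([0., 1., 0., 1.])    # adjacency row of the target node
g = torch.tensor([0.5, 0.5, -0.3, -0.9])  # symmetrized gradient w.r.t. that row
score = g * (-2 * A_row + 1)
print(score)                # tensor([ 0.5000, -0.5000, -0.3000,  0.9000])
print(torch.argmax(score))  # tensor(3): remove the edge with strongly negative gradient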
deeprobust/graph/targeted_attack/ig_attack.py
ADDED
@@ -0,0 +1,224 @@
"""
Adversarial Examples on Graph Data: Deep Insights into Attack and Defense
https://arxiv.org/pdf/1903.01610.pdf
"""

import torch
import torch.multiprocessing as mp
from deeprobust.graph.targeted_attack import BaseAttack
from torch.nn.parameter import Parameter
from torch.nn.modules.module import Module
from torch import optim
from deeprobust.graph import utils
import torch.nn.functional as F
import numpy as np
import scipy.sparse as sp
from tqdm import tqdm
import math

class IGAttack(BaseAttack):
    """IGAttack: IG-FGSM. Adversarial Examples on Graph Data: Deep Insights into Attack and Defense, https://arxiv.org/pdf/1903.01610.pdf.

    Parameters
    ----------
    model :
        model to attack
    nnodes : int
        number of nodes in the input graph
    feature_shape : tuple
        shape of the input node features
    attack_structure : bool
        whether to attack graph structure
    attack_features : bool
        whether to attack node features
    device: str
        'cpu' or 'cuda'

    Examples
    --------

    >>> from deeprobust.graph.data import Dataset
    >>> from deeprobust.graph.defense import GCN
    >>> from deeprobust.graph.targeted_attack import IGAttack
    >>> data = Dataset(root='/tmp/', name='cora')
    >>> adj, features, labels = data.adj, data.features, data.labels
    >>> idx_train, idx_val, idx_test = data.idx_train, data.idx_val, data.idx_test
    >>> # Setup Surrogate model
    >>> surrogate = GCN(nfeat=features.shape[1], nclass=labels.max().item()+1,
                    nhid=16, dropout=0, with_relu=False, with_bias=False, device='cpu').to('cpu')
    >>> surrogate.fit(features, adj, labels, idx_train, idx_val, patience=30)
    >>> # Setup Attack Model
    >>> target_node = 0
    >>> model = IGAttack(surrogate, nnodes=adj.shape[0], attack_structure=True, attack_features=True, device='cpu').to('cpu')
    >>> # Attack
    >>> model.attack(features, adj, labels, idx_train, target_node, n_perturbations=5, steps=10)
    >>> modified_adj = model.modified_adj
    >>> modified_features = model.modified_features

    """

    def __init__(self, model, nnodes=None, feature_shape=None, attack_structure=True, attack_features=True, device='cpu'):

        super(IGAttack, self).__init__(model, nnodes, attack_structure, attack_features, device)

        assert attack_features or attack_structure, 'attack_features or attack_structure cannot be both False'

        self.modified_adj = None
        self.modified_features = None
        self.target_node = None

    def attack(self, ori_features, ori_adj, labels, idx_train, target_node, n_perturbations, steps=10, **kwargs):
        """Generate perturbations on the input graph.

        Parameters
        ----------
        ori_features :
            Original (unperturbed) node feature matrix
        ori_adj :
            Original (unperturbed) adjacency matrix
        labels :
            node labels
        idx_train:
            training node indices
        target_node : int
            target node index to be attacked
        n_perturbations : int
            Number of perturbations on the input graph. Perturbations could
            be edge removals/additions or feature removals/additions.
        steps : int
            steps for computing integrated gradients
        """

        self.surrogate.eval()
        self.target_node = target_node

        modified_adj = ori_adj.todense()
        modified_features = ori_features.todense()
        adj, features, labels = utils.to_tensor(modified_adj, modified_features, labels, device=self.device)
        adj_norm = utils.normalize_adj_tensor(adj)

        pseudo_labels = self.surrogate.predict().detach().argmax(1)
        pseudo_labels[idx_train] = labels[idx_train]
        self.pseudo_labels = pseudo_labels

        s_e = np.zeros(adj.shape[1])
        s_f = np.zeros(features.shape[1])
        if self.attack_structure:
            s_e = self.calc_importance_edge(features, adj_norm, labels, steps)
        if self.attack_features:
            s_f = self.calc_importance_feature(features, adj_norm, labels, steps)

        for t in range(n_perturbations):
            s_e_max = np.argmax(s_e)
            s_f_max = np.argmax(s_f)

            if s_e[s_e_max] >= s_f[s_f_max]:
                # edge perturbation score is larger
                if self.attack_structure:
                    value = np.abs(1 - modified_adj[target_node, s_e_max])
                    modified_adj[target_node, s_e_max] = value
                    modified_adj[s_e_max, target_node] = value
                    s_e[s_e_max] = 0
                else:
                    raise Exception("""No possible perturbation on the structure can be made!
                            See https://github.com/DSE-MSU/DeepRobust/issues/42 for more details.""")
            else:
                # feature perturbation score is larger
                if self.attack_features:
                    modified_features[target_node, s_f_max] = np.abs(1 - modified_features[target_node, s_f_max])
                    s_f[s_f_max] = 0
                else:
                    raise Exception("""No possible perturbation on the features can be made!
                            See https://github.com/DSE-MSU/DeepRobust/issues/42 for more details.""")

        self.modified_adj = sp.csr_matrix(modified_adj)
        self.modified_features = sp.csr_matrix(modified_features)
        self.check_adj(modified_adj)

    def calc_importance_edge(self, features, adj_norm, labels, steps):
        """Calculate integrated gradient for edges. The gradient should arguably
        be taken with respect to adj rather than adj_norm, but that calculation
        is too time-consuming, so the gradient of the loss is computed with
        respect to adj_norm instead.
        """
        baseline_add = adj_norm.clone()
        baseline_remove = adj_norm.clone()
        baseline_add.data[self.target_node] = 1
        baseline_remove.data[self.target_node] = 0
        adj_norm.requires_grad = True
        integrated_grad_list = []

        i = self.target_node
        for j in tqdm(range(adj_norm.shape[1])):
            if adj_norm[i][j]:
                scaled_inputs = [baseline_remove + (float(k) / steps) * (adj_norm - baseline_remove) for k in range(0, steps + 1)]
            else:
                scaled_inputs = [baseline_add - (float(k) / steps) * (baseline_add - adj_norm) for k in range(0, steps + 1)]
            _sum = 0

            for new_adj in scaled_inputs:
                output = self.surrogate(features, new_adj)
                loss = F.nll_loss(output[[self.target_node]],
                                  self.pseudo_labels[[self.target_node]])
                adj_grad = torch.autograd.grad(loss, adj_norm)[0]
                adj_grad = adj_grad[i][j]
                _sum += adj_grad

            if adj_norm[i][j]:
                avg_grad = (adj_norm[i][j] - 0) * _sum.mean()
            else:
                avg_grad = (1 - adj_norm[i][j]) * _sum.mean()

            integrated_grad_list.append(avg_grad.detach().item())

        integrated_grad_list[i] = 0
        # make impossible perturbations negative
        integrated_grad_list = np.array(integrated_grad_list)
        adj = (adj_norm > 0).cpu().numpy()
        integrated_grad_list = (-2 * adj[self.target_node] + 1) * integrated_grad_list
        integrated_grad_list[self.target_node] = -10
        return integrated_grad_list

    def calc_importance_feature(self, features, adj_norm, labels, steps):
        """Calculate integrated gradient for features
        """
        baseline_add = features.clone()
        baseline_remove = features.clone()
        baseline_add.data[self.target_node] = 1
        baseline_remove.data[self.target_node] = 0

        features.requires_grad = True
        integrated_grad_list = []
        i = self.target_node
        for j in tqdm(range(features.shape[1])):
            if features[i][j]:
                scaled_inputs = [baseline_add + (float(k) / steps) * (features - baseline_add) for k in range(0, steps + 1)]
            else:
                scaled_inputs = [baseline_remove - (float(k) / steps) * (baseline_remove - features) for k in range(0, steps + 1)]
            _sum = 0

            for new_features in scaled_inputs:
                output = self.surrogate(new_features, adj_norm)
                loss = F.nll_loss(output[[self.target_node]],
                                  self.pseudo_labels[[self.target_node]])

                feature_grad = torch.autograd.grad(loss, features)[0]
                feature_grad = feature_grad[i][j]
                _sum += feature_grad

            if features[i][j]:
                avg_grad = (features[i][j] - 0) * _sum.mean()
            else:
                avg_grad = (1 - features[i][j]) * _sum.mean()
            integrated_grad_list.append(avg_grad.detach().item())
        # make impossible perturbations negative
        features = (features > 0).cpu().numpy()
        integrated_grad_list = np.array(integrated_grad_list)
        integrated_grad_list = (-2 * features[self.target_node] + 1) * integrated_grad_list
        return integrated_grad_list
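A self-contained toy version of the integrated-gradients approximation used in the two loops above (illustrative only, not from the file): average the gradient along a straight path from a baseline and scale by the input difference; for f(x) = x**2 from baseline 0 at x = 3 this recovers f(x) - f(baseline) = 9.

import torch

def ig_scalar(f, x, baseline, steps=10):
    grads = []
    for k in range(steps + 1):
        # evaluate the gradient at evenly spaced points on the baseline-to-x path
        xk = (baseline + float(k) / steps * (x - baseline)).clone().requires_grad_(True)
        f(xk).backward()
        grads.append(xk.grad.item())
    # (x - baseline) times the average gradient along the path
    return (x - baseline).item() * sum(grads) / len(grads)

print(ig_scalar(lambda t: t ** 2, torch.tensor(3.0), torch.tensor(0.0)))  # ~9.0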
deeprobust/graph/targeted_attack/nettack.py
ADDED
@@ -0,0 +1,624 @@
+"""
+Adversarial Attacks on Neural Networks for Graph Data. KDD 2018.
+https://arxiv.org/pdf/1805.07984.pdf
+Author's Implementation
+https://github.com/danielzuegner/nettack
+
+Since PyTorch does not provide good enough support for operations
+on sparse tensors, this part of the code is heavily based on the author's implementation.
+"""
+"""
+Implementation of the method proposed in the paper:
+'Adversarial Attacks on Neural Networks for Graph Data'
+by Daniel Zügner, Amir Akbarnejad and Stephan Günnemann,
+published at SIGKDD'18, August 2018, London, UK
+Copyright (C) 2018
+Daniel Zügner
+Technical University of Munich
+"""
+
+import torch
+from deeprobust.graph.targeted_attack import BaseAttack
+from torch.nn.parameter import Parameter
+from deeprobust.graph import utils
+import torch.nn.functional as F
+from torch import optim
+from torch.nn.modules.module import Module
+import numpy as np
+import scipy.sparse as sp
+from copy import deepcopy
+from numba import jit
+from torch import spmm
+
+class Nettack(BaseAttack):
+    """Nettack.
+
+    Parameters
+    ----------
+    model :
+        model to attack
+    nnodes : int
+        number of nodes in the input graph
+    attack_structure : bool
+        whether to attack graph structure
+    attack_features : bool
+        whether to attack node features
+    device: str
+        'cpu' or 'cuda'
+
+    Examples
+    --------
+
+    >>> from deeprobust.graph.data import Dataset
+    >>> from deeprobust.graph.defense import GCN
+    >>> from deeprobust.graph.targeted_attack import Nettack
+    >>> data = Dataset(root='/tmp/', name='cora')
+    >>> adj, features, labels = data.adj, data.features, data.labels
+    >>> idx_train, idx_val, idx_test = data.idx_train, data.idx_val, data.idx_test
+    >>> # Setup Surrogate model
+    >>> surrogate = GCN(nfeat=features.shape[1], nclass=labels.max().item()+1,
+                    nhid=16, dropout=0, with_relu=False, with_bias=False, device='cpu').to('cpu')
+    >>> surrogate.fit(features, adj, labels, idx_train, idx_val, patience=30)
+    >>> # Setup Attack Model
+    >>> target_node = 0
+    >>> model = Nettack(surrogate, nnodes=adj.shape[0], attack_structure=True, attack_features=True, device='cpu').to('cpu')
+    >>> # Attack
+    >>> model.attack(features, adj, labels, target_node, n_perturbations=5)
+    >>> modified_adj = model.modified_adj
+    >>> modified_features = model.modified_features
+
+    """
+
+    def __init__(self, model, nnodes=None, attack_structure=True, attack_features=False, device='cpu'):
+
+        super(Nettack, self).__init__(model, nnodes, attack_structure=attack_structure, attack_features=attack_features, device=device)
+
+        self.structure_perturbations = []
+        self.feature_perturbations = []
+        self.influencer_nodes = []
+        self.potential_edges = []
+
+        self.cooc_constraint = None
+
+    def filter_potential_singletons(self, modified_adj):
+        """Computes a mask for entries potentially leading to singleton nodes, i.e.
+        one of the two nodes corresponding to the entry has degree 1 and there
+        is an edge between the two nodes.
+        """
+
+        degrees = modified_adj.sum(0)
+        degree_one = (degrees == 1)
+        resh = degree_one.repeat(self.nnodes, 1).float()
+        l_and = resh * modified_adj
+        logical_and_symmetric = l_and + l_and.t()
+        flat_mask = 1 - logical_and_symmetric
+        return flat_mask
+
+    def get_linearized_weight(self):
+        surrogate = self.surrogate
+        W = surrogate.gc1.weight @ surrogate.gc2.weight
+        return W.detach().cpu().numpy()
+
+    def attack(self, features, adj, labels, target_node, n_perturbations, direct=True, n_influencers=0, ll_cutoff=0.004, verbose=True, **kwargs):
+        """Generate perturbations on the input graph.
+
+        Parameters
+        ----------
+        features : torch.Tensor or scipy.sparse.csr_matrix
+            Original (unperturbed) node feature matrix. Note that
+            torch.Tensor will be automatically transformed into
+            scipy.sparse.csr_matrix
+        adj : torch.Tensor or scipy.sparse.csr_matrix
+            Original (unperturbed) adjacency matrix. Note that
+            torch.Tensor will be automatically transformed into
+            scipy.sparse.csr_matrix
+        labels :
+            node labels
+        target_node : int
+            target node index to be attacked
+        n_perturbations : int
+            Number of perturbations on the input graph. Perturbations could
+            be edge removals/additions or feature removals/additions.
+        direct: bool
+            whether to conduct direct attack
+        n_influencers:
+            number of influencer nodes when performing indirect attack
+            (setting `direct` to False). When `direct` is True, it is ignored.
+        ll_cutoff : float
+            The critical value for the likelihood ratio test of the power law distributions.
+            See the Chi square distribution with one degree of freedom. Default value 0.004
+            corresponds to a p-value of roughly 0.95.
+        verbose : bool
+            whether to show verbose logs
+        """
+
+        if self.nnodes is None:
+            self.nnodes = adj.shape[0]
+
+        self.target_node = target_node
+
+        if type(adj) is torch.Tensor:
+            self.ori_adj = utils.to_scipy(adj).tolil()
+            self.modified_adj = utils.to_scipy(adj).tolil()
+            self.ori_features = utils.to_scipy(features).tolil()
+            self.modified_features = utils.to_scipy(features).tolil()
+        else:
+            self.ori_adj = adj.tolil()
+            self.modified_adj = adj.tolil()
+            self.ori_features = features.tolil()
+            self.modified_features = features.tolil()
+
+        self.cooc_matrix = self.modified_features.T.dot(self.modified_features).tolil()
+
+        attack_features = self.attack_features
+        attack_structure = self.attack_structure
+        assert not (direct == False and n_influencers == 0), "indirect mode requires at least one influencer node"
+        assert n_perturbations > 0, "need at least one perturbation"
+        assert attack_features or attack_structure, "either attack_features or attack_structure must be true"
+
+        # adj_norm = utils.normalize_adj_tensor(modified_adj, sparse=True)
+        self.adj_norm = utils.normalize_adj(self.modified_adj)
+        self.W = self.get_linearized_weight()
+
+        logits = (self.adj_norm @ self.adj_norm @ self.modified_features @ self.W)[target_node]
+
+        self.label_u = labels[target_node]
+        label_target_onehot = np.eye(int(self.nclass))[labels[target_node]]
+        best_wrong_class = (logits - 1000*label_target_onehot).argmax()
+        surrogate_losses = [logits[labels[target_node]] - logits[best_wrong_class]]
+
+        if verbose:
+            print("##### Starting attack #####")
+            if attack_structure and attack_features:
+                print("##### Attack node with ID {} using structure and feature perturbations #####".format(target_node))
+            elif attack_features:
+                print("##### Attack only using feature perturbations #####")
+            elif attack_structure:
+                print("##### Attack only using structure perturbations #####")
+            if direct:
+                print("##### Attacking the node directly #####")
+            else:
+                print("##### Attacking the node indirectly via {} influencer nodes #####".format(n_influencers))
+            print("##### Performing {} perturbations #####".format(n_perturbations))
+
+        if attack_structure:
+            # Setup starting values of the likelihood ratio test.
+            degree_sequence_start = self.ori_adj.sum(0).A1
+            current_degree_sequence = self.modified_adj.sum(0).A1
+            d_min = 2
+
+            S_d_start = np.sum(np.log(degree_sequence_start[degree_sequence_start >= d_min]))
+            current_S_d = np.sum(np.log(current_degree_sequence[current_degree_sequence >= d_min]))
+            n_start = np.sum(degree_sequence_start >= d_min)
+            current_n = np.sum(current_degree_sequence >= d_min)
+            alpha_start = compute_alpha(n_start, S_d_start, d_min)
+
+            log_likelihood_orig = compute_log_likelihood(n_start, alpha_start, S_d_start, d_min)
+
+        if len(self.influencer_nodes) == 0:
+            if not direct:
+                # Choose influencer nodes
+                infls, add_infls = self.get_attacker_nodes(n_influencers, add_additional_nodes=True)
+                self.influencer_nodes = np.concatenate((infls, add_infls)).astype("int")
+                # Potential edges are all edges from any attacker to any other node, except the respective
+                # attacker itself or the node being attacked.
+                self.potential_edges = np.row_stack([np.column_stack((np.tile(infl, self.nnodes - 2),
+                                                                      np.setdiff1d(np.arange(self.nnodes),
+                                                                                   np.array([target_node, infl])))) for infl in
+                                                     self.influencer_nodes])
+                if verbose:
+                    print("Influencer nodes: {}".format(self.influencer_nodes))
+            else:
+                # direct attack
+                influencers = [target_node]
+                self.potential_edges = np.column_stack((np.tile(target_node, self.nnodes-1), np.setdiff1d(np.arange(self.nnodes), target_node)))
+                self.influencer_nodes = np.array(influencers)
+
+        self.potential_edges = self.potential_edges.astype("int32")
+
+        for _ in range(n_perturbations):
+            if verbose:
+                print("##### ...{}/{} perturbations ... #####".format(_+1, n_perturbations))
+            if attack_structure:
+
+                # Do not consider edges that, if removed, result in singleton nodes in the graph.
+                singleton_filter = filter_singletons(self.potential_edges, self.modified_adj)
+                filtered_edges = self.potential_edges[singleton_filter]
+
+                # Update the values for the power law likelihood ratio test.
+                deltas = 2 * (1 - self.modified_adj[tuple(filtered_edges.T)].toarray()[0]) - 1
+                d_edges_old = current_degree_sequence[filtered_edges]
+                d_edges_new = current_degree_sequence[filtered_edges] + deltas[:, None]
+                new_S_d, new_n = update_Sx(current_S_d, current_n, d_edges_old, d_edges_new, d_min)
+                new_alphas = compute_alpha(new_n, new_S_d, d_min)
+                new_ll = compute_log_likelihood(new_n, new_alphas, new_S_d, d_min)
+                alphas_combined = compute_alpha(new_n + n_start, new_S_d + S_d_start, d_min)
+                new_ll_combined = compute_log_likelihood(new_n + n_start, alphas_combined, new_S_d + S_d_start, d_min)
+                new_ratios = -2 * new_ll_combined + 2 * (new_ll + log_likelihood_orig)
+
+                # Do not consider edges that, if added/removed, would lead to a violation of the
+                # likelihood ratio Chi_square cutoff value.
+                powerlaw_filter = filter_chisquare(new_ratios, ll_cutoff)
+                filtered_edges_final = filtered_edges[powerlaw_filter]
+
+                # Compute new entries in A_hat_square_uv
+                a_hat_uv_new = self.compute_new_a_hat_uv(filtered_edges_final, target_node)
+                # Compute the struct scores for each potential edge
+                struct_scores = self.struct_score(a_hat_uv_new, self.modified_features @ self.W)
+                best_edge_ix = struct_scores.argmin()
+                best_edge_score = struct_scores.min()
+                best_edge = filtered_edges_final[best_edge_ix]
+
+            if attack_features:
+                # Compute the feature scores for each potential feature perturbation
+                feature_ixs, feature_scores = self.feature_scores()
+                best_feature_ix = feature_ixs[0]
+                best_feature_score = feature_scores[0]
+
+            if attack_structure and attack_features:
+                # decide whether to choose an edge or feature to change
+                if best_edge_score < best_feature_score:
+                    if verbose:
+                        print("Edge perturbation: {}".format(best_edge))
+                    change_structure = True
+                else:
+                    if verbose:
+                        print("Feature perturbation: {}".format(best_feature_ix))
+                    change_structure = False
+
+            elif attack_structure:
+                change_structure = True
+            elif attack_features:
+                change_structure = False
+
+            if change_structure:
+                # perform edge perturbation
+                self.modified_adj[tuple(best_edge)] = self.modified_adj[tuple(best_edge[::-1])] = 1 - self.modified_adj[tuple(best_edge)]
+                self.adj_norm = utils.normalize_adj(self.modified_adj)
+
+                self.structure_perturbations.append(tuple(best_edge))
+                self.feature_perturbations.append(())
+                surrogate_losses.append(best_edge_score)
+
+                # Update likelihood ratio test values
+                current_S_d = new_S_d[powerlaw_filter][best_edge_ix]
+                current_n = new_n[powerlaw_filter][best_edge_ix]
+                current_degree_sequence[best_edge] += deltas[powerlaw_filter][best_edge_ix]
+
+            else:
+                self.modified_features[tuple(best_feature_ix)] = 1 - self.modified_features[tuple(best_feature_ix)]
+                self.feature_perturbations.append(tuple(best_feature_ix))
+                self.structure_perturbations.append(())
+                surrogate_losses.append(best_feature_score)
+
+        # return self.modified_adj, self.modified_features
+
+    def get_attacker_nodes(self, n=5, add_additional_nodes=False):
+        """Determine the influencer nodes to attack node i based on
+        the weights W and the attributes X.
+        """
+        assert n < self.nnodes-1, "number of influencers cannot be >= number of nodes in the graph!"
+        neighbors = self.ori_adj[self.target_node].nonzero()[1]
+        assert self.target_node not in neighbors
+
+        potential_edges = np.column_stack((np.tile(self.target_node, len(neighbors)), neighbors)).astype("int32")
+
+        # The new A_hat_square_uv values that we would get if we removed the edge from u to each of the neighbors, respectively
+        a_hat_uv = self.compute_new_a_hat_uv(potential_edges, self.target_node)
+
+        # XW = self.compute_XW()
+        XW = self.modified_features @ self.W
+
+        # compute the struct scores for all neighbors
+        struct_scores = self.struct_score(a_hat_uv, XW)
+        if len(neighbors) >= n:  # do we have enough neighbors for the number of desired influencers?
+            influencer_nodes = neighbors[np.argsort(struct_scores)[:n]]
+            if add_additional_nodes:
+                return influencer_nodes, np.array([])
+            return influencer_nodes
+        else:
+            influencer_nodes = neighbors
+            if add_additional_nodes:  # Add additional influencers by connecting them to u first.
+                # Compute the set of possible additional influencers, i.e. all nodes except the ones
+                # that are already connected to u.
+                poss_add_infl = np.setdiff1d(np.setdiff1d(np.arange(self.nnodes), neighbors), self.target_node)
+                n_possible_additional = len(poss_add_infl)
+                n_additional_attackers = n - len(neighbors)
+                possible_edges = np.column_stack((np.tile(self.target_node, n_possible_additional), poss_add_infl))
+
+                # Compute the struct_scores for all possible additional influencers, and choose the one
+                # with the best struct score.
+                a_hat_uv_additional = self.compute_new_a_hat_uv(possible_edges, self.target_node)
+                additional_struct_scores = self.struct_score(a_hat_uv_additional, XW)
+                additional_influencers = poss_add_infl[np.argsort(additional_struct_scores)[-n_additional_attackers::]]
+
+                return influencer_nodes, additional_influencers
+            else:
+                return influencer_nodes
+
+    def compute_logits(self):
+        return (self.adj_norm @ self.adj_norm @ self.modified_features @ self.W)[self.target_node]
+
+    def strongest_wrong_class(self, logits):
+        label_u_onehot = np.eye(self.nclass)[self.label_u]
+        return (logits - 1000*label_u_onehot).argmax()
+
+    def feature_scores(self):
+        """Compute feature scores for all possible feature changes.
+        """
+
+        if self.cooc_constraint is None:
+            self.compute_cooccurrence_constraint(self.influencer_nodes)
+        logits = self.compute_logits()
+        best_wrong_class = self.strongest_wrong_class(logits)
+        surrogate_loss = logits[self.label_u] - logits[best_wrong_class]
+
+        gradient = self.gradient_wrt_x(self.label_u) - self.gradient_wrt_x(best_wrong_class)
+        # gradients_flipped = (gradient * -1).tolil()
+        gradients_flipped = sp.lil_matrix(gradient * -1)
+        gradients_flipped[self.modified_features.nonzero()] *= -1
+
+        X_influencers = sp.lil_matrix(self.modified_features.shape)
+        X_influencers[self.influencer_nodes] = self.modified_features[self.influencer_nodes]
+        gradients_flipped = gradients_flipped.multiply((self.cooc_constraint + X_influencers) > 0)
+        nnz_ixs = np.array(gradients_flipped.nonzero()).T
+
+        sorting = np.argsort(gradients_flipped[tuple(nnz_ixs.T)]).A1
+        sorted_ixs = nnz_ixs[sorting]
+        grads = gradients_flipped[tuple(nnz_ixs[sorting].T)]
+
+        scores = surrogate_loss - grads
+        return sorted_ixs[::-1], scores.A1[::-1]
+
+    def compute_cooccurrence_constraint(self, nodes):
+        """
+        Co-occurrence constraint as described in the paper.
+
+        Parameters
+        ----------
+        nodes: np.array
+            Nodes whose features are considered for change
+
+        Returns
+        -------
+        np.array [len(nodes), D], dtype bool
+            Binary matrix of dimension len(nodes) x D. A 1 in entry n,d indicates that
+            we are allowed to add feature d to the features of node n.
+
+        """
+
+        words_graph = self.cooc_matrix.copy()
+        D = self.modified_features.shape[1]
+        words_graph.setdiag(0)
+        words_graph = (words_graph > 0)
+        word_degrees = np.sum(words_graph, axis=0).A1
+
+        inv_word_degrees = np.reciprocal(word_degrees.astype(float) + 1e-8)
+
+        sd = np.zeros([self.nnodes])
+        for n in range(self.nnodes):
+            n_idx = self.modified_features[n, :].nonzero()[1]
+            sd[n] = np.sum(inv_word_degrees[n_idx.tolist()])
+
+        scores_matrix = sp.lil_matrix((self.nnodes, D))
+
+        for n in nodes:
+            common_words = words_graph.multiply(self.modified_features[n])
+            idegs = inv_word_degrees[common_words.nonzero()[1]]
+            nnz = common_words.nonzero()[0]
+            scores = np.array([idegs[nnz == ix].sum() for ix in range(D)])
+            scores_matrix[n] = scores
+        self.cooc_constraint = sp.csr_matrix(scores_matrix - 0.5 * sd[:, None] > 0)
+
+    def gradient_wrt_x(self, label):
+        # return self.adj_norm.dot(self.adj_norm)[self.target_node].T.dot(self.W[:, label].T)
+        return self.adj_norm.dot(self.adj_norm)[self.target_node].T.dot(self.W[:, label].reshape(1, -1))
+
+    def reset(self):
+        """Reset Nettack
+        """
+        self.modified_adj = self.ori_adj.copy()
+        self.modified_features = self.ori_features.copy()
+        self.structure_perturbations = []
+        self.feature_perturbations = []
+        self.influencer_nodes = []
+        self.potential_edges = []
+        self.cooc_constraint = None
+
+    def struct_score(self, a_hat_uv, XW):
+        """
+        Compute structure scores, cf. Eq. 15 in the paper
+
+        Parameters
+        ----------
+        a_hat_uv: sp.sparse_matrix, shape [P, N]
+            Entries of matrix A_hat^2_u for each potential edge (see paper for explanation)
+
+        XW: sp.sparse_matrix, shape [N, K], dtype float
+            The class logits for each node.
+
+        Returns
+        -------
+        np.array [P,]
+            The struct score for every row in a_hat_uv
+        """
+
+        logits = a_hat_uv.dot(XW)
+        label_onehot = np.eye(XW.shape[1])[self.label_u]
+        best_wrong_class_logits = (logits - 1000 * label_onehot).max(1)
+        logits_for_correct_class = logits[:, self.label_u]
+        struct_scores = logits_for_correct_class - best_wrong_class_logits
+
+        return struct_scores
+
+    def compute_new_a_hat_uv(self, potential_edges, target_node):
+        """
+        Compute the updated A_hat_square_uv entries that would result from inserting/deleting the input edges,
+        for every edge.
+
+        Parameters
+        ----------
+        potential_edges: np.array, shape [P,2], dtype int
+            The edges to check.
+
+        Returns
+        -------
+        sp.sparse_matrix: updated A_hat_square_u entries, a sparse PxN matrix, where P is len(potential_edges).
+        """
+
+        edges = np.array(self.modified_adj.nonzero()).T
+        edges_set = {tuple(x) for x in edges}
+        A_hat_sq = self.adj_norm @ self.adj_norm
+        values_before = A_hat_sq[target_node].toarray()[0]
+        node_ixs = np.unique(edges[:, 0], return_index=True)[1]
+        twohop_ixs = np.array(A_hat_sq.nonzero()).T
+        degrees = self.modified_adj.sum(0).A1 + 1
+
+        ixs, vals = compute_new_a_hat_uv(edges, node_ixs, edges_set, twohop_ixs, values_before, degrees,
+                                         potential_edges.astype(np.int32), target_node)
+        ixs_arr = np.array(ixs)
+        a_hat_uv = sp.coo_matrix((vals, (ixs_arr[:, 0], ixs_arr[:, 1])), shape=[len(potential_edges), self.nnodes])
+
+        return a_hat_uv
+
+@jit(nopython=True)
+def connected_after(u, v, connected_before, delta):
+    if u == v:
+        if delta == -1:
+            return False
+        else:
+            return True
+    else:
+        return connected_before
+
+
+@jit(nopython=True)
+def compute_new_a_hat_uv(edge_ixs, node_nb_ixs, edges_set, twohop_ixs, values_before, degs, potential_edges, u):
+    """
+    Compute the new values [A_hat_square]_u for every potential edge, where u is the target node. C.f. Theorem 5.1
+    equation 17.
+
+    """
+    N = degs.shape[0]
+
+    twohop_u = twohop_ixs[twohop_ixs[:, 0] == u, 1]
+    nbs_u = edge_ixs[edge_ixs[:, 0] == u, 1]
+    nbs_u_set = set(nbs_u)
+
+    return_ixs = []
+    return_values = []
+
+    for ix in range(len(potential_edges)):
+        edge = potential_edges[ix]
+        edge_set = set(edge)
+        degs_new = degs.copy()
+        delta = -2 * ((edge[0], edge[1]) in edges_set) + 1
+        degs_new[edge] += delta
+
+        nbs_edge0 = edge_ixs[edge_ixs[:, 0] == edge[0], 1]
+        nbs_edge1 = edge_ixs[edge_ixs[:, 0] == edge[1], 1]
+
+        affected_nodes = set(np.concatenate((twohop_u, nbs_edge0, nbs_edge1)))
+        affected_nodes = affected_nodes.union(edge_set)
+        a_um = edge[0] in nbs_u_set
+        a_un = edge[1] in nbs_u_set
+
+        a_un_after = connected_after(u, edge[0], a_un, delta)
+        a_um_after = connected_after(u, edge[1], a_um, delta)
+
+        for v in affected_nodes:
+            a_uv_before = v in nbs_u_set
+            a_uv_before_sl = a_uv_before or v == u
+
+            if v in edge_set and u in edge_set and u != v:
+                if delta == -1:
+                    a_uv_after = False
+                else:
+                    a_uv_after = True
+            else:
+                a_uv_after = a_uv_before
+            a_uv_after_sl = a_uv_after or v == u
+
+            from_ix = node_nb_ixs[v]
+            to_ix = node_nb_ixs[v + 1] if v < N - 1 else len(edge_ixs)
+            node_nbs = edge_ixs[from_ix:to_ix, 1]
+            node_nbs_set = set(node_nbs)
+            a_vm_before = edge[0] in node_nbs_set
+
+            a_vn_before = edge[1] in node_nbs_set
+            a_vn_after = connected_after(v, edge[0], a_vn_before, delta)
+            a_vm_after = connected_after(v, edge[1], a_vm_before, delta)
+
+            mult_term = 1 / np.sqrt(degs_new[u] * degs_new[v])
+
+            sum_term1 = np.sqrt(degs[u] * degs[v]) * values_before[v] - a_uv_before_sl / degs[u] - a_uv_before / \
+                        degs[v]
+            sum_term2 = a_uv_after / degs_new[v] + a_uv_after_sl / degs_new[u]
+            sum_term3 = -((a_um and a_vm_before) / degs[edge[0]]) + (a_um_after and a_vm_after) / degs_new[edge[0]]
+            sum_term4 = -((a_un and a_vn_before) / degs[edge[1]]) + (a_un_after and a_vn_after) / degs_new[edge[1]]
+            new_val = mult_term * (sum_term1 + sum_term2 + sum_term3 + sum_term4)
+
+            return_ixs.append((ix, v))
+            return_values.append(new_val)
+
+    return return_ixs, return_values
+
+def filter_singletons(edges, adj):
+    """
+    Filter edges that, if removed, would turn one or more nodes into singleton nodes.
+    """
+
+    degs = np.squeeze(np.array(np.sum(adj, 0)))
+    existing_edges = np.squeeze(np.array(adj.tocsr()[tuple(edges.T)]))
+    if existing_edges.size > 0:
+        edge_degrees = degs[np.array(edges)] + 2*(1-existing_edges[:, None]) - 1
+    else:
+        edge_degrees = degs[np.array(edges)] + 1
+
+    zeros = edge_degrees == 0
+    zeros_sum = zeros.sum(1)
+    return zeros_sum == 0
+
+def compute_alpha(n, S_d, d_min):
+    """
+    Approximate the alpha of a power law distribution.
+
+    """
+
+    return n / (S_d - n * np.log(d_min - 0.5)) + 1
+
+
+def update_Sx(S_old, n_old, d_old, d_new, d_min):
+    """
+    Update the sum of log degrees S_d and n based on the degree distribution resulting from inserting or deleting
+    a single edge.
+    """
+
+    old_in_range = d_old >= d_min
+    new_in_range = d_new >= d_min
+
+    d_old_in_range = np.multiply(d_old, old_in_range)
+    d_new_in_range = np.multiply(d_new, new_in_range)
+
+    new_S_d = S_old - np.log(np.maximum(d_old_in_range, 1)).sum(1) + np.log(np.maximum(d_new_in_range, 1)).sum(1)
+    new_n = n_old - np.sum(old_in_range, 1) + np.sum(new_in_range, 1)
+
+    return new_S_d, new_n
+
+
+def compute_log_likelihood(n, alpha, S_d, d_min):
+    """
+    Compute log likelihood of the powerlaw fit.
+
+    """
+
+    return n * np.log(alpha) + n * alpha * np.log(d_min) - (alpha + 1) * S_d
+
+def filter_chisquare(ll_ratios, cutoff):
+    return ll_ratios < cutoff
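
Two short sketches may help when reading the file above; neither is part of the diff, and all concrete values in them are made up for illustration. First, Nettack scores candidate perturbations against a linearized two-layer GCN: the surrogate's nonlinearity is dropped, so the logits reduce to A_hat^2 X W with W = W1 W2 (cf. get_linearized_weight and the logits line in attack). A minimal dense sketch with toy matrices:

import numpy as np
import scipy.sparse as sp

# Toy symmetric-normalized adjacency A_hat, binary features X, stacked weights W1 @ W2.
A_hat = sp.csr_matrix(np.array([[0.5, 0.5, 0.0],
                                [0.5, 0.0, 0.5],
                                [0.0, 0.5, 0.5]]))
X = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])
W1 = np.random.randn(2, 4)
W2 = np.random.randn(4, 3)

logits = A_hat @ A_hat @ X @ (W1 @ W2)  # two propagation steps, one linear map
print(logits[0])                        # logits row of a target node, cf. attack()

Second, the module-level helpers compute_alpha, update_Sx, compute_log_likelihood and filter_chisquare implement the power-law degree-distribution unnoticeability test used inside the attack loop. A toy walk-through for a single candidate edge (the degree sequence is invented; only the call pattern mirrors attack()):

import numpy as np
from deeprobust.graph.targeted_attack.nettack import (
    compute_alpha, compute_log_likelihood, filter_chisquare, update_Sx)

degrees = np.array([3., 2., 2., 4., 1., 2.])    # toy degree sequence
d_min = 2                                       # same d_min as in attack()

n = np.sum(degrees >= d_min)                    # nodes in the power-law tail
S_d = np.sum(np.log(degrees[degrees >= d_min])) # sum of log-degrees in the tail
alpha = compute_alpha(n, S_d, d_min)
ll_orig = compute_log_likelihood(n, alpha, S_d, d_min)

# Candidate edge (0, 1): adding it raises both endpoint degrees by one.
d_old = degrees[np.array([[0, 1]])]             # shape [1, 2]
d_new = d_old + 1
new_S_d, new_n = update_Sx(S_d, n, d_old, d_new, d_min)
new_alpha = compute_alpha(new_n, new_S_d, d_min)
ll_new = compute_log_likelihood(new_n, new_alpha, new_S_d, d_min)

# Likelihood-ratio statistic of the combined vs. separate power-law fits.
alpha_comb = compute_alpha(new_n + n, new_S_d + S_d, d_min)
ll_comb = compute_log_likelihood(new_n + n, alpha_comb, new_S_d + S_d, d_min)
ratio = -2 * ll_comb + 2 * (ll_new + ll_orig)
print(ratio, filter_chisquare(ratio, 0.004))    # True -> the edge stays unnoticeable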
deeprobust/graph/targeted_attack/rnd.py
ADDED
@@ -0,0 +1,139 @@
+import torch
+from deeprobust.graph.targeted_attack import BaseAttack
+from torch.nn.parameter import Parameter
+from copy import deepcopy
+from deeprobust.graph import utils
+import torch.nn.functional as F
+import numpy as np
+import scipy.sparse as sp
+
+class RND(BaseAttack):
+    """As described in Adversarial Attacks on Neural Networks for Graph Data (KDD'18),
+    'Rnd is an attack in which we modify the structure of the graph. Given our target node v,
+    in each step we randomly sample nodes u whose label is different from v and
+    add the edge u,v to the graph structure.'
+
+    Parameters
+    ----------
+    model :
+        model to attack
+    nnodes : int
+        number of nodes in the input graph
+    attack_structure : bool
+        whether to attack graph structure
+    attack_features : bool
+        whether to attack node features
+    device: str
+        'cpu' or 'cuda'
+
+    Examples
+    --------
+
+    >>> from deeprobust.graph.data import Dataset
+    >>> from deeprobust.graph.targeted_attack import RND
+    >>> data = Dataset(root='/tmp/', name='cora')
+    >>> adj, features, labels = data.adj, data.features, data.labels
+    >>> idx_train, idx_val, idx_test = data.idx_train, data.idx_val, data.idx_test
+    >>> # Setup Attack Model
+    >>> target_node = 0
+    >>> model = RND()
+    >>> # Attack
+    >>> model.attack(adj, labels, idx_train, target_node, n_perturbations=5)
+    >>> modified_adj = model.modified_adj
+    >>> # # You can also inject nodes
+    >>> # model.add_nodes(features, adj, labels, idx_train, target_node, n_added=10, n_perturbations=100)
+    >>> # modified_adj = model.modified_adj
+
+    """
+
+    def __init__(self, model=None, nnodes=None, attack_structure=True, attack_features=False, device='cpu'):
+        super(RND, self).__init__(model, nnodes, attack_structure=attack_structure, attack_features=attack_features, device=device)
+
+        assert not self.attack_features, 'RND does NOT support attacking features except adding nodes'
+
+    def attack(self, ori_adj, labels, idx_train, target_node, n_perturbations, **kwargs):
+        """
+        Randomly sample nodes u whose label is different from v and
+        add the edge u,v to the graph structure. This baseline only
+        has access to true class labels in the training set.
+
+        Parameters
+        ----------
+        ori_adj : scipy.sparse.csr_matrix
+            Original (unperturbed) adjacency matrix
+        labels :
+            node labels
+        idx_train :
+            node training indices
+        target_node : int
+            target node index to be attacked
+        n_perturbations : int
+            Number of perturbations on the input graph. Perturbations could
+            be edge removals/additions or feature removals/additions.
+        """
+        # ori_adj: sp.csr_matrix
+
+        print('number of perturbations: %s' % n_perturbations)
+        modified_adj = ori_adj.tolil()
+
+        row = ori_adj[target_node].todense().A1
+        diff_label_nodes = [x for x in idx_train if labels[x] != labels[target_node]
+                            and row[x] == 0]
+        diff_label_nodes = np.random.permutation(diff_label_nodes)
+
+        if len(diff_label_nodes) >= n_perturbations:
+            changed_nodes = diff_label_nodes[: n_perturbations]
+            modified_adj[target_node, changed_nodes] = 1
+            modified_adj[changed_nodes, target_node] = 1
+        else:
+            changed_nodes = diff_label_nodes
+            unlabeled_nodes = [x for x in range(ori_adj.shape[0]) if x not in idx_train and row[x] == 0]
+            unlabeled_nodes = np.random.permutation(unlabeled_nodes)
+            changed_nodes = np.concatenate([changed_nodes,
+                                            unlabeled_nodes[: n_perturbations-len(diff_label_nodes)]])
+            modified_adj[target_node, changed_nodes] = 1
+            modified_adj[changed_nodes, target_node] = 1
+
+        self.check_adj(modified_adj)
+        self.modified_adj = modified_adj
+        # self.modified_features = modified_features
+
+    def add_nodes(self, features, ori_adj, labels, idx_train, target_node, n_added=1, n_perturbations=10, **kwargs):
+        """
+        For each added node, first connect the target node with the added fake nodes.
+        Then randomly connect the fake nodes with other nodes whose label is
+        different from the target node's. As for the node features, simply copy an arbitrary node's.
+        """
+        # ori_adj: sp.csr_matrix
+        print('number of perturbations: %s' % n_perturbations)
+        N = ori_adj.shape[0]
+        D = features.shape[1]
+        modified_adj = self.reshape_mx(ori_adj, shape=(N+n_added, N+n_added))
+        modified_features = self.reshape_mx(features, shape=(N+n_added, D))
+
+        diff_labels = [l for l in range(labels.max()+1) if l != labels[target_node]]
+        diff_labels = np.random.permutation(diff_labels)
+        possible_nodes = [x for x in idx_train if labels[x] == diff_labels[0]]
+
+        for fake_node in range(N, N+n_added):
+            sampled_nodes = np.random.permutation(possible_nodes)[: n_perturbations]
+            # connect the fake node with the target node
+            modified_adj[fake_node, target_node] = 1
+            modified_adj[target_node, fake_node] = 1
+            # connect the fake node with other nodes
+            for node in sampled_nodes:
+                modified_adj[fake_node, node] = 1
+                modified_adj[node, fake_node] = 1
+                modified_features[fake_node] = features[node]
+
+        self.check_adj(modified_adj)
+
+        self.modified_adj = modified_adj
+        self.modified_features = modified_features
+        # return modified_adj, modified_features
+
+    def reshape_mx(self, mx, shape):
+        indices = mx.nonzero()
+        return sp.csr_matrix((mx.data, (indices[0], indices[1])), shape=shape).tolil()
deeprobust/graph/targeted_attack/sga.py
ADDED
@@ -0,0 +1,323 @@
+import torch
+import torch.nn.functional as F
+import numpy as np
+import scipy.sparse as sp
+from collections import namedtuple
+from functools import lru_cache
+
+from torch_scatter import scatter_add
+from torch_geometric.utils import k_hop_subgraph
+from deeprobust.graph.targeted_attack import BaseAttack
+from deeprobust.graph import utils
+
+SubGraph = namedtuple('SubGraph', ['edge_index', 'non_edge_index',
+                                   'self_loop', 'self_loop_weight',
+                                   'edge_weight', 'non_edge_weight',
+                                   'edges_all'])
+
+
+class SGAttack(BaseAttack):
+    """SGAttack proposed in `Adversarial Attack on Large Scale Graph` TKDE 2021
+    <https://arxiv.org/abs/2009.03488>
+
+    SGAttack follows these steps::
+    + training a surrogate SGC model with hop K
+    + extracting a K-hop subgraph centered at the target node
+    + choosing top-N attacker nodes that belong to the best wrong class of the target node
+    + computing gradients w.r.t. the subgraph to add or remove edges iteratively
+
+    Parameters
+    ----------
+    model :
+        model to attack
+    nnodes : int
+        number of nodes in the input graph
+    attack_structure : bool
+        whether to attack graph structure
+    attack_features : bool
+        whether to attack node features
+    device: str
+        'cpu' or 'cuda'
+
+    Examples
+    --------
+
+    >>> from deeprobust.graph.data import Dataset, Dpr2Pyg
+    >>> from deeprobust.graph.defense import SGC
+    >>> data = Dataset(root='/tmp/', name='cora')
+    >>> adj, features, labels = data.adj, data.features, data.labels
+    >>> idx_train, idx_val, idx_test = data.idx_train, data.idx_val, data.idx_test
+    >>> surrogate = SGC(nfeat=features.shape[1], K=3, lr=0.1,
+            nclass=labels.max().item() + 1, device='cuda')
+    >>> surrogate = surrogate.to('cuda')
+    >>> pyg_data = Dpr2Pyg(data) # convert deeprobust dataset to pyg dataset
+    >>> surrogate.fit(pyg_data, train_iters=200, patience=200, verbose=True) # train with earlystopping
+    >>> from deeprobust.graph.targeted_attack import SGAttack
+    >>> # Setup Attack Model
+    >>> target_node = 0
+    >>> model = SGAttack(surrogate, attack_structure=True, device='cuda')
+    >>> # Attack
+    >>> model.attack(features, adj, labels, target_node, n_perturbations=5)
+    >>> modified_adj = model.modified_adj
+    >>> modified_features = model.modified_features
+    """
+
+    def __init__(self, model, nnodes=None, attack_structure=True, attack_features=False, device='cpu'):
+
+        super(SGAttack, self).__init__(model=None, nnodes=nnodes,
+                                       attack_structure=attack_structure, attack_features=attack_features, device=device)
+
+        self.target_node = None
+        self.logits = model.predict()
+        self.K = model.conv1.K
+        W = model.conv1.lin.weight.to(device)
+        b = model.conv1.lin.bias
+        if b is not None:
+            b = b.to(device)
+
+        self.weight, self.bias = W, b
+
+    @lru_cache(maxsize=1)
+    def compute_XW(self):
+        return F.linear(self.modified_features, self.weight)
+
+    def attack(self, features, adj, labels, target_node, n_perturbations, direct=True, n_influencers=3, **kwargs):
+        """Generate perturbations on the input graph.
+
+        Parameters
+        ----------
+        features :
+            Original (unperturbed) node feature matrix
+        adj :
+            Original (unperturbed) adjacency matrix
+        labels :
+            node labels
+        target_node : int
+            target node index to be attacked
+        n_perturbations : int
+            Number of perturbations on the input graph. Perturbations could
+            be edge removals/additions or feature removals/additions.
+        direct: bool
+            whether to conduct direct attack
+        n_influencers : int
+            number of the top influencers to choose. For direct attack, it will be set to `n_perturbations`.
+        """
+        if sp.issparse(features):
+            # to dense numpy matrix
+            features = features.A
+
+        if not torch.is_tensor(features):
+            features = torch.tensor(features, device=self.device)
+
+        if torch.is_tensor(adj):
+            adj = utils.to_scipy(adj).tocsr()
+
+        self.modified_features = features.requires_grad_(bool(self.attack_features))
+
+        target_label = torch.LongTensor([labels[target_node]])
+        best_wrong_label = torch.LongTensor([(self.logits[target_node].cpu() - 1000 * torch.eye(self.logits.size(1))[target_label]).argmax()])
+
+        self.selfloop_degree = torch.tensor(adj.sum(1).A1 + 1, device=self.device)
+        self.target_label = target_label.to(self.device)
+        self.best_wrong_label = best_wrong_label.to(self.device)
+        self.n_perturbations = n_perturbations
+        self.ori_adj = adj
+        self.target_node = target_node
+        self.direct = direct
+
+        attacker_nodes = torch.where(torch.as_tensor(labels) == best_wrong_label)[0]
+        subgraph = self.get_subgraph(attacker_nodes, n_influencers)
+
+        if not direct:
+            # for indirect attack, the edges adjacent to the targeted node should not be considered
+            mask = torch.logical_or(subgraph.edge_index[0] == target_node, subgraph.edge_index[1] == target_node).to(self.device)
+
+        structure_perturbations = []
+        feature_perturbations = []
+        num_features = features.shape[-1]
+        for _ in range(n_perturbations):
+            edge_grad, non_edge_grad, features_grad = self.compute_gradient(subgraph)
+            max_structure_score = max_feature_score = 0.
+
+            if self.attack_structure:
+                edge_grad *= (-2 * subgraph.edge_weight + 1)
+                non_edge_grad *= -2 * subgraph.non_edge_weight + 1
+                min_grad = min(edge_grad.min().item(), non_edge_grad.min().item())
+                edge_grad -= min_grad
+                non_edge_grad -= min_grad
+                if not direct:
+                    edge_grad[mask] = 0.
+                max_edge_grad, max_edge_idx = torch.max(edge_grad, dim=0)
+                max_non_edge_grad, max_non_edge_idx = torch.max(non_edge_grad, dim=0)
+                max_structure_score = max(max_edge_grad.item(), max_non_edge_grad.item())
+
+            if self.attack_features:
+                features_grad *= -2 * self.modified_features + 1
+                features_grad -= features_grad.min()
+                if not direct:
+                    features_grad[target_node] = 0.
+                max_feature_grad, max_feature_idx = torch.max(features_grad.view(-1), dim=0)
+                max_feature_score = max_feature_grad.item()
+
+            if max_structure_score >= max_feature_score:
+                if max_edge_grad > max_non_edge_grad:
+                    # remove one edge
+                    best_edge = subgraph.edge_index[:, max_edge_idx]
+                    subgraph.edge_weight.data[max_edge_idx] = 0.0
+                    self.selfloop_degree[best_edge] -= 1.0
+                else:
+                    # add one edge
+                    best_edge = subgraph.non_edge_index[:, max_non_edge_idx]
+                    subgraph.non_edge_weight.data[max_non_edge_idx] = 1.0
+                    self.selfloop_degree[best_edge] += 1.0
+
+                u, v = best_edge.tolist()
+                structure_perturbations.append((u, v))
+            else:
+                u, v = divmod(max_feature_idx.item(), num_features)
+                feature_perturbations.append((u, v))
+                self.modified_features[u, v].data.fill_(1. - self.modified_features[u, v].data)
+
+        if structure_perturbations:
+            modified_adj = adj.tolil(copy=True)
+            row, col = list(zip(*structure_perturbations))
+            modified_adj[row, col] = modified_adj[col, row] = 1 - modified_adj[row, col].A
+            modified_adj = modified_adj.tocsr(copy=False)
+            modified_adj.eliminate_zeros()
+        else:
+            modified_adj = adj.copy()
+
+        self.modified_adj = modified_adj
+        self.modified_features = self.modified_features.detach().cpu().numpy()
+        self.structure_perturbations = structure_perturbations
+        self.feature_perturbations = feature_perturbations
+
+    def get_subgraph(self, attacker_nodes, n_influencers=None):
+        target_node = self.target_node
+        neighbors = self.ori_adj[target_node].indices
+        sub_nodes, sub_edges = self.ego_subgraph()
+
+        if self.direct or n_influencers is not None:
+            influencers = [target_node]
+            attacker_nodes = np.setdiff1d(attacker_nodes, neighbors)
+        else:
+            influencers = neighbors
+
+        subgraph = self.subgraph_processing(influencers, attacker_nodes, sub_nodes, sub_edges)
+
+        if n_influencers is not None and self.attack_structure:
+            if self.direct:
+                influencers = [target_node]
+                attacker_nodes = self.get_topk_influencers(subgraph, k=self.n_perturbations + 1)
+            else:
+                influencers = neighbors
+                attacker_nodes = self.get_topk_influencers(subgraph, k=n_influencers)
+
+            subgraph = self.subgraph_processing(influencers, attacker_nodes, sub_nodes, sub_edges)
+        return subgraph
+
+    def get_topk_influencers(self, subgraph, k):
+        _, non_edge_grad, _ = self.compute_gradient(subgraph)
+        _, topk_nodes = torch.topk(non_edge_grad, k=k, sorted=False)
+
+        influencers = subgraph.non_edge_index[1][topk_nodes.cpu()]
+        return influencers.cpu().numpy()
+
+    def subgraph_processing(self, influencers, attacker_nodes, sub_nodes, sub_edges):
+        if not self.attack_structure:
+            self_loop = sub_nodes.repeat((2, 1))
+            edges_all = torch.cat([sub_edges, sub_edges[[1, 0]], self_loop], dim=1)
+            edge_weight = torch.ones(edges_all.size(1), device=self.device)
+
+            return SubGraph(edge_index=sub_edges, non_edge_index=None,
+                            self_loop=None, edges_all=edges_all,
+                            edge_weight=edge_weight, non_edge_weight=None,
+                            self_loop_weight=None)
+
+        row = np.repeat(influencers, len(attacker_nodes))
+        col = np.tile(attacker_nodes, len(influencers))
+        non_edges = np.row_stack([row, col])
+
+        if len(influencers) > 1:
+            mask = self.ori_adj[non_edges[0],
+                                non_edges[1]].A1 == 0
+            non_edges = non_edges[:, mask]
+
+        non_edges = torch.as_tensor(non_edges, device=self.device)
+        unique_nodes = np.union1d(sub_nodes.tolist(), attacker_nodes)
+        unique_nodes = torch.as_tensor(unique_nodes, device=self.device)
+        self_loop = unique_nodes.repeat((2, 1))
+        edges_all = torch.cat([sub_edges, sub_edges[[1, 0]],
+                               non_edges, non_edges[[1, 0]], self_loop], dim=1)
+
+        edge_weight = torch.ones(sub_edges.size(1), device=self.device).requires_grad_(bool(self.attack_structure))
+        non_edge_weight = torch.zeros(non_edges.size(1), device=self.device).requires_grad_(bool(self.attack_structure))
+        self_loop_weight = torch.ones(self_loop.size(1), device=self.device)
+
+        subgraph = SubGraph(edge_index=sub_edges, non_edge_index=non_edges,
+                            self_loop=self_loop, edges_all=edges_all,
+                            edge_weight=edge_weight, non_edge_weight=non_edge_weight,
+                            self_loop_weight=self_loop_weight)
+        return subgraph
+
+    def SGCCov(self, x, edge_index, edge_weight):
+        row, col = edge_index
+        for _ in range(self.K):
+            src = x[row] * edge_weight.view(-1, 1)
+            x = scatter_add(src, col, dim=-2, dim_size=x.size(0))
+        return x
+
+    def compute_gradient(self, subgraph, eps=5.0):
+        if self.attack_structure:
+            edge_weight = subgraph.edge_weight
+            non_edge_weight = subgraph.non_edge_weight
+            self_loop_weight = subgraph.self_loop_weight
+            weights = torch.cat([edge_weight, edge_weight,
+                                 non_edge_weight, non_edge_weight,
+                                 self_loop_weight], dim=0)
+        else:
+            weights = subgraph.edge_weight
+
+        weights = self.gcn_norm(subgraph.edges_all, weights, self.selfloop_degree)
+        logit = self.SGCCov(self.compute_XW(), subgraph.edges_all, weights)
+        logit = logit[self.target_node]
+        if self.bias is not None:
+            logit += self.bias
+
+        # model calibration
+        logit = F.log_softmax(logit.view(1, -1) / eps, dim=1)
+        loss = F.nll_loss(logit, self.target_label) - F.nll_loss(logit, self.best_wrong_label)
+
+        edge_grad = non_edge_grad = features_grad = None
+
+        if self.attack_structure and self.attack_features:
+            edge_grad, non_edge_grad, features_grad = torch.autograd.grad(loss, [edge_weight, non_edge_weight, self.modified_features], create_graph=False)
+        elif self.attack_structure:
+            edge_grad, non_edge_grad = torch.autograd.grad(loss, [edge_weight, non_edge_weight], create_graph=False)
+        else:
+            features_grad = torch.autograd.grad(loss, self.modified_features, create_graph=False)[0]
+
+        if self.attack_features:
+            self.compute_XW.cache_clear()
+        return edge_grad, non_edge_grad, features_grad
+
+    def ego_subgraph(self):
+        edge_index = np.asarray(self.ori_adj.nonzero())
+        edge_index = torch.as_tensor(edge_index, dtype=torch.long, device=self.device)
+        sub_nodes, sub_edges, *_ = k_hop_subgraph(int(self.target_node), self.K, edge_index)
+        sub_edges = sub_edges[:, sub_edges[0] < sub_edges[1]]
+
+        return sub_nodes, sub_edges
+
+    @staticmethod
+    def gcn_norm(edge_index, weights, degree):
+        row, col = edge_index
+        inv_degree = torch.pow(degree, -0.5)
+        normed_weights = weights * inv_degree[row] * inv_degree[col]
+        return normed_weights
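
The static gcn_norm above is the symmetric GCN normalization w_uv / sqrt(d_u * d_v) over self-loop-augmented degrees; it is what lets compute_gradient re-normalize edge weights after every add/remove without rebuilding the adjacency matrix. A small numeric sketch of those same three lines (toy path graph; all values are illustrative):

import torch

# 3-node path graph 0-1-2: both edge directions plus self-loops.
edge_index = torch.tensor([[0, 1, 1, 2, 0, 1, 2],
                           [1, 0, 2, 1, 0, 1, 2]])
weights = torch.ones(edge_index.size(1))
degree = torch.tensor([2., 3., 2.])    # degrees counting the self-loop

row, col = edge_index
inv_degree = torch.pow(degree, -0.5)
normed = weights * inv_degree[row] * inv_degree[col]
print(normed)  # the (0, 1) entry is 1/sqrt(2 * 3); each self-loop becomes 1/d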
deeprobust/graph/targeted_attack/ugba.py
ADDED
@@ -0,0 +1,913 @@
+import numpy as np
+import scipy.sparse as sp
+import time
+import copy
+
+import torch
+import torch.nn.functional as F
+from torch.nn.parameter import Parameter
+from torch_geometric.utils import degree
+from sklearn.cluster import KMeans
+from copy import deepcopy
+# from deeprobust.graph.defense_pyg import GCN, SAGE, GAT
+from deeprobust.graph.targeted_attack import BaseAttack
+from deeprobust.graph import utils
+
+class UGBA(BaseAttack):
+    """
+    Modified from Unnoticeable Backdoor Attacks on Graph Neural Networks (WWW 2023).
+
+    See example in examples/graph/test_ugba.py.
+
+    Parameters
+    ----------
+    vs_number: int
+        number of selected poisoned nodes for training the backdoor model
+
+    device: str
+        'cpu' or 'cuda'
+
+    target_class: int
+        the class that the attacker aims to misclassify into
+
+    trigger_size: int
+        the number of nodes in a trigger
+
+    target_loss_weight: float
+
+    homo_loss_weight: float
+        the weight of the homophily loss
+
+    homo_boost_thrd: float
+        the upper bound of similarity
+
+    train_epochs: int
+        the number of epochs when training the GCN encoder
+
+    trojan_epochs: int
+        the number of epochs when training the trigger generator
+
+
+    """
+    def __init__(self, data, vs_number,
+                 target_class=0, trigger_size=3, target_loss_weight=1,
+                 homo_loss_weight=100, homo_boost_thrd=0.8, train_epochs=200, trojan_epochs=800, dis_weight=1,
+                 inner=1, thrd=0.5, lr=0.01, hidden=32, weight_decay=5e-4,
+                 seed=10, debug=True, device='cpu'):
+        self.device = device
+        self.data = data
+        self.size = vs_number
+        # self.test_model = model
+        self.target_class = target_class
+        self.trigger_size = trigger_size
+        self.target_loss_weight = target_loss_weight
+        self.homo_loss_weight = homo_loss_weight
+        self.homo_boost_thrd = homo_boost_thrd
+        self.train_epochs = train_epochs
+        self.trojan_epochs = trojan_epochs
+        self.dis_weight = dis_weight
+        self.inner = inner
+        self.thrd = thrd
+        self.lr = lr
+        self.hidden = hidden
+        self.weight_decay = weight_decay
+        self.seed = seed
+        self.debug = debug
+
+        # filter out the unlabeled nodes (everything except training and testing nodes);
+        # nonzero() gets the indices, flatten() gives a 1-d tensor
+        self.unlabeled_idx = (torch.bitwise_not(data.test_mask) & torch.bitwise_not(data.train_mask)).nonzero().flatten()
+        self.idx_val = utils.index_to_mask(data.val_mask, size=data.x.shape[0])
+
+    def attack(self, target_node, x, y, edge_index, edge_weights=None):
+        '''
+        Inject the generated trigger to the target node (a single node).
+
+        Parameters
+        ----------
+        target_node: int
+            the index of the target node
+        x: tensor
+            features of nodes
+        y: tensor
+            node labels
+        edge_index: tensor
+            edge index of the graph
+        edge_weights: tensor
+            the weights of edges
+        '''
+        idx_target = torch.tensor([target_node])
+        print(idx_target)
+        if edge_weights is None:
+            edge_weights = torch.ones([edge_index.shape[1]]).to(self.device)
+        x, edge_index, edge_weights, y = self.inject_trigger(idx_target, x, y, edge_index, edge_weights)
+        return x, edge_index, edge_weights, y
+
+    def get_poisoned_graph(self):
+        '''
+        Obtain the poisoned training graph for training the backdoor GNN.
+        '''
+        assert self.trigger_generator, "please first use train_trigger_generator() to train the trigger generator and get poisoned nodes"
+        poison_x, poison_edge_index, poison_edge_weights, poison_labels = self.trigger_generator.get_poisoned()
+        # add poisoned nodes into training nodes
+        idx_bkd_tn = torch.cat([self.idx_train, self.idx_attach]).to(self.device)
+
+        poison_data = copy.deepcopy(self.data)
+        idx_val = poison_data.val_mask.nonzero().flatten()
+        idx_test = poison_data.test_mask.nonzero().flatten()
+
+        poison_data.x, poison_data.edge_index, poison_data.edge_weights, poison_data.y = poison_x, poison_edge_index, poison_edge_weights, poison_labels
+        poison_data.train_mask = utils.index_to_mask(idx_bkd_tn, poison_data.x.shape[0])
+        poison_data.val_mask = utils.index_to_mask(idx_val, poison_data.x.shape[0])
+        poison_data.test_mask = utils.index_to_mask(idx_test, poison_data.x.shape[0])
+        return poison_data
+
+    def train_trigger_generator(self, idx_train, edge_index, edge_weights=None, selection_method='cluster', **kwargs):
+        """
+        Train the adaptive trigger generator.
+
+        Parameters
+        ----------
+        idx_train: tensor
+            indices of training nodes
+        edge_index: tensor
+            edge index of the graph
+        edge_weights: tensor
+            the weights of edges
+        selection_method : ['none', 'cluster']
+            the method to select poisoned nodes
+        """
+        self.idx_train = idx_train
+        # self.data = data
+
+        idx_attach = self.select_idx_attach(selection_method, edge_index, edge_weights).to(self.device)
+        self.idx_attach = idx_attach
+        print("idx_attach: {}".format(idx_attach))
+        # train trigger generator
+        trigger_generator = Backdoor(self.target_class, self.trigger_size, self.target_loss_weight,
+                                     self.homo_loss_weight, self.homo_boost_thrd, self.trojan_epochs,
+                                     self.inner, self.thrd, self.lr, self.hidden, self.weight_decay,
+                                     self.seed, self.debug, self.device)
+        self.trigger_generator = trigger_generator
+
+        self.trigger_generator.fit(self.data.x, edge_index, edge_weights, self.data.y, idx_train, idx_attach, self.unlabeled_idx)
+        return self.trigger_generator, idx_attach
+
+    def inject_trigger(self, idx_attach, x, y, edge_index, edge_weights):
+        """
+        Attach the generated triggers to the attached nodes.
+
+        Parameters
+        ----------
+        idx_attach: tensor
+            indices of to-be-attached nodes
+        x: tensor
+            features of nodes
+        y: tensor
+            node labels
+        edge_index: tensor
+            edge index of the graph
+        edge_weights: tensor
+            the weights of edges
+        """
+        assert self.trigger_generator, "please first use train_trigger_generator() to train the trigger generator"
+
+        update_x, update_edge_index, update_edge_weights, update_y = self.trigger_generator.inject_trigger(idx_attach, x, edge_index, edge_weights, y, self.device)
+        return update_x, update_edge_index, update_edge_weights, update_y
+
+    def select_idx_attach(self, selection_method, edge_index, edge_weights=None):
+        if selection_method == 'none':
+            idx_attach = self.obtain_attach_nodes(self.unlabeled_idx, self.size)
|
179 |
+
elif(selection_method == 'cluster'):
|
180 |
+
idx_attach = self.cluster_selection(self.data,self.idx_train,self.idx_val,self.unlabeled_idx,self.size,edge_index,edge_weights)
|
181 |
+
idx_attach = torch.LongTensor(idx_attach).to(self.device)
|
182 |
+
return idx_attach
|
183 |
+
|
184 |
+
def obtain_attach_nodes(self,node_idxs, size):
|
185 |
+
### current random to implement
|
186 |
+
size = min(len(node_idxs),size)
|
187 |
+
rs = np.random.RandomState(self.seed)
|
188 |
+
choice = np.arange(len(node_idxs))
|
189 |
+
rs.shuffle(choice)
|
190 |
+
return node_idxs[choice[:size]]
|
191 |
+
|
192 |
+
def cluster_selection(self,data,idx_train,idx_val,unlabeled_idx,size,edge_index,edge_weights = None):
|
193 |
+
gcn_encoder = GCN_Encoder(nfeat=data.x.shape[1],
|
194 |
+
nhid=32,
|
195 |
+
nclass= int(data.y.max()+1),
|
196 |
+
dropout=0.5,
|
197 |
+
lr=0.01,
|
198 |
+
weight_decay=5e-4,
|
199 |
+
device=self.device,
|
200 |
+
use_ln=False,
|
201 |
+
layer_norm_first=False).to(self.device)
|
202 |
+
t_total = time.time()
|
203 |
+
# edge_weights = torch.ones([data.edge_index.shape[1]],device=device,dtype=torch.float)
|
204 |
+
print("Length of training set: {}".format(len(idx_train)))
|
205 |
+
gcn_encoder.fit(data.x, edge_index, edge_weights, data.y, idx_train, idx_val= idx_val,train_iters=self.train_epochs,verbose=True)
|
206 |
+
print("Training encoder Finished!")
|
207 |
+
print("Total time elapsed: {:.4f}s".format(time.time() - t_total))
|
208 |
+
|
209 |
+
seen_node_idx = torch.concat([idx_train,unlabeled_idx])
|
210 |
+
nclass = np.unique(data.y.cpu().numpy()).shape[0]
|
211 |
+
encoder_x = gcn_encoder.get_h(data.x, edge_index,edge_weights).clone().detach()
|
212 |
+
|
213 |
+
kmeans = KMeans(n_clusters=nclass,random_state=1)
|
214 |
+
kmeans.fit(encoder_x[seen_node_idx].detach().cpu().numpy())
|
215 |
+
cluster_centers = kmeans.cluster_centers_
|
216 |
+
y_pred = kmeans.predict(encoder_x.cpu().numpy())
|
217 |
+
# encoder_output = gcn_encoder(data.x,train_edge_index,None)
|
218 |
+
idx_attach = self.obtain_attach_nodes_by_cluster_degree_all(edge_index,y_pred,cluster_centers,unlabeled_idx.cpu().tolist(),encoder_x,size).astype(int)
|
219 |
+
idx_attach = idx_attach[:size]
|
220 |
+
return idx_attach
|
221 |
+
|
222 |
+
def obtain_attach_nodes_by_cluster_degree_all(self,edge_index,y_pred,cluster_centers,node_idxs,x,size):
|
223 |
+
dis_weight = self.dis_weight
|
224 |
+
degrees = (degree(edge_index[0]) + degree(edge_index[1])).cpu().numpy()
|
225 |
+
distances = []
|
226 |
+
for id in range(x.shape[0]):
|
227 |
+
tmp_center_label = y_pred[id]
|
228 |
+
tmp_center_x = cluster_centers[tmp_center_label]
|
229 |
+
|
230 |
+
dis = np.linalg.norm(tmp_center_x - x[id].detach().cpu().numpy())
|
231 |
+
distances.append(dis)
|
232 |
+
|
233 |
+
distances = np.array(distances)
|
234 |
+
print(y_pred)
|
235 |
+
|
236 |
+
nontarget_nodes = np.where(y_pred!=self.target_class)[0]
|
237 |
+
|
238 |
+
non_target_node_idxs = np.array(list(set(nontarget_nodes) & set(node_idxs)))
|
239 |
+
node_idxs = np.array(non_target_node_idxs)
|
240 |
+
candiadate_distances = distances[node_idxs]
|
241 |
+
candiadate_degrees = degrees[node_idxs]
|
242 |
+
candiadate_distances = self.max_norm(candiadate_distances)
|
243 |
+
candiadate_degrees = self.max_norm(candiadate_degrees)
|
244 |
+
|
245 |
+
dis_score = candiadate_distances + dis_weight * candiadate_degrees
|
246 |
+
candidate_nid_index = np.argsort(dis_score)
|
247 |
+
sorted_node_idex = np.array(node_idxs[candidate_nid_index])
|
248 |
+
selected_nodes = sorted_node_idex
|
249 |
+
return selected_nodes
|
250 |
+
|
251 |
+
def max_norm(self,data):
|
252 |
+
_range = np.max(data) - np.min(data)
|
253 |
+
return (data - np.min(data)) / _range
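
# --- Illustrative usage sketch (added for clarity; not part of the original file).
# Assumes a PyG-style `data` object with x, y, edge_index and train/val/test
# masks; `vs_number=40` is an arbitrary example value.
def _example_ugba_usage(data, idx_train, device='cpu'):
    attack = UGBA(data, vs_number=40, device=device)
    # select poisoned nodes and train the adaptive trigger generator
    trigger_generator, idx_attach = attack.train_trigger_generator(
        idx_train, data.edge_index, selection_method='cluster')
    # build the poisoned graph used to train the backdoored GNN
    poison_data = attack.get_poisoned_graph()
    return poison_data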

from copy import deepcopy
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

def accuracy(output, labels):
    """Return accuracy of output compared to labels.
    Parameters
    ----------
    output : torch.Tensor
        output from model
    labels : torch.Tensor or numpy.array
        node labels
    Returns
    -------
    float
        accuracy
    """
    if not hasattr(labels, '__len__'):
        labels = [labels]
    if type(labels) is not torch.Tensor:
        labels = torch.LongTensor(labels)
    preds = output.max(1)[1].type_as(labels)
    correct = preds.eq(labels).double()
    correct = correct.sum()
    return correct / len(labels)
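
# Illustrative sanity check for accuracy() (added; not in the original file):
# with these made-up logits, three of four predictions are correct, so the
# expected value is 0.75.
def _example_accuracy_check():
    logits = torch.log_softmax(torch.tensor(
        [[2., 0.], [0., 2.], [2., 0.], [2., 0.]]), dim=1)
    labels = torch.tensor([0, 1, 0, 1])
    assert abs(accuracy(logits, labels).item() - 0.75) < 1e-6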
#%%
class GradWhere(torch.autograd.Function):
    """
    We can implement our own custom autograd Functions by subclassing
    torch.autograd.Function and implementing the forward and backward passes
    which operate on Tensors.
    """

    @staticmethod
    def forward(ctx, input, thrd, device):
        """
        In the forward pass we receive a Tensor containing the input and return
        a Tensor containing the output. ctx is a context object that can be used
        to stash information for backward computation. You can cache arbitrary
        objects for use in the backward pass using the ctx.save_for_backward method.
        """
        ctx.save_for_backward(input)
        rst = torch.where(input > thrd, torch.tensor(1.0, device=device, requires_grad=True),
                          torch.tensor(0.0, device=device, requires_grad=True))
        return rst

    @staticmethod
    def backward(ctx, grad_output):
        """
        In the backward pass we receive a Tensor containing the gradient of the loss
        with respect to the output, and we need to compute the gradient of the loss
        with respect to the input.
        """
        input, = ctx.saved_tensors
        grad_input = grad_output.clone()

        """
        The number of returned results should correspond to the inputs of
        .forward (besides ctx): for each input, return a matching backward grad.
        """
        return grad_input, None, None
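
# Illustrative sketch (added; not in the original file): GradWhere acts as a
# straight-through estimator -- the forward pass hard-thresholds the input to
# {0, 1}, while the backward pass passes the incoming gradient through
# unchanged, so the otherwise non-differentiable thresholding stays trainable.
def _example_gradwhere():
    w = torch.tensor([0.2, 0.7, 0.9], requires_grad=True)
    out = GradWhere.apply(w, 0.5, 'cpu')   # -> tensor([0., 1., 1.])
    out.sum().backward()
    # gradient of the sum w.r.t. w is all ones, copied straight through
    assert torch.equal(w.grad, torch.ones(3))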

class GraphTrojanNet(nn.Module):
    # In the future, we may use a GNN model to generate the backdoor
    def __init__(self, device, nfeat, nout, layernum=1, dropout=0.00):
        super(GraphTrojanNet, self).__init__()

        layers = []
        if dropout > 0:
            layers.append(nn.Dropout(p=dropout))
        for l in range(layernum - 1):
            layers.append(nn.Linear(nfeat, nfeat))
            layers.append(nn.ReLU(inplace=True))
            if dropout > 0:
                layers.append(nn.Dropout(p=dropout))

        self.layers = nn.Sequential(*layers).to(device)

        self.feat = nn.Linear(nfeat, nout * nfeat)
        self.edge = nn.Linear(nfeat, int(nout * (nout - 1) / 2))
        self.device = device

    def forward(self, input, thrd):

        """
        "input", "mask" and "thrd" should already be on cuda before being sent
        to this function. If using the sparse format, the corresponding tensors
        should already be in sparse format before being sent in.
        """

        GW = GradWhere.apply
        h = self.layers(input)

        feat = self.feat(h)
        edge_weight = self.edge(h)
        # feat = GW(feat, thrd, self.device)
        edge_weight = GW(edge_weight, thrd, self.device)

        return feat, edge_weight
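
# Shape sketch (added; not in the original file): for each attached node the
# generator emits trigger_size * nfeat feature entries and one weight per
# trigger-internal edge pair, i.e. nout*(nout-1)/2 values.
def _example_trojan_shapes():
    nfeat, nout = 16, 3
    net = GraphTrojanNet('cpu', nfeat, nout, layernum=2)
    feats = torch.randn(5, nfeat)                 # 5 attached nodes
    trojan_feat, trojan_weights = net(feats, 0.5)
    assert trojan_feat.shape == (5, nout * nfeat)
    assert trojan_weights.shape == (5, nout * (nout - 1) // 2)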

class HomoLoss(nn.Module):
    def __init__(self, device):
        super(HomoLoss, self).__init__()
        self.device = device

    def forward(self, trigger_edge_index, trigger_edge_weights, x, thrd):

        trigger_edge_index = trigger_edge_index[:, trigger_edge_weights > 0.0]
        edge_sims = F.cosine_similarity(x[trigger_edge_index[0]], x[trigger_edge_index[1]])

        loss = torch.relu(thrd - edge_sims).mean()
        # print(edge_sims.min())
        return loss
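
# Illustrative sketch (added; not in the original file): HomoLoss penalizes
# trigger edges whose endpoint features are less similar than the threshold,
# pushing injected triggers to look homophilous (hence "unnoticeable").
def _example_homo_loss():
    x = torch.randn(4, 8)
    edge_index = torch.tensor([[0, 1], [2, 3]])  # two candidate trigger edges
    edge_weights = torch.ones(2)
    loss = HomoLoss('cpu')(edge_index, edge_weights, x, thrd=0.8)
    return loss  # zero only if both cosine similarities exceed 0.8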

#%%
import numpy as np
class Backdoor:
    def __init__(self, target_class, trigger_size, target_loss_weight, homo_loss_weight, homo_boost_thrd, trojan_epochs, inner, thrd, lr, hidden, weight_decay, seed, debug, device):
        self.device = device
        self.weights = None
        self.trigger_size = trigger_size
        self.thrd = thrd
        self.trigger_index = self.get_trigger_index(self.trigger_size)
        self.hidden = hidden
        self.target_class = target_class
        self.lr = lr
        self.weight_decay = weight_decay
        self.trojan_epochs = trojan_epochs
        self.inner = inner
        self.seed = seed
        self.target_loss_weight = target_loss_weight
        self.homo_boost_thrd = homo_boost_thrd
        self.homo_loss_weight = homo_loss_weight
        self.debug = debug

    def get_trigger_index(self, trigger_size):
        edge_list = []
        edge_list.append([0, 0])
        for j in range(trigger_size):
            for k in range(j):
                edge_list.append([j, k])
        edge_index = torch.tensor(edge_list, device=self.device).long().T
        return edge_index

    def get_trojan_edge(self, start, idx_attach, trigger_size):
        edge_list = []
        for idx in idx_attach:
            edges = self.trigger_index.clone()
            edges[0, 0] = idx
            edges[1, 0] = start
            edges[:, 1:] = edges[:, 1:] + start

            edge_list.append(edges)
            start += trigger_size
        edge_index = torch.cat(edge_list, dim=1)
        # to undirected
        # row, col = edge_index
        row = torch.cat([edge_index[0], edge_index[1]])
        col = torch.cat([edge_index[1], edge_index[0]])
        edge_index = torch.stack([row, col])

        return edge_index
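
    # Worked illustration (comment added; code unchanged): for trigger_size=3,
    # self.trigger_index is [[0,0], [1,0], [2,0], [2,1]].T -- a placeholder
    # attach edge plus the fully-connected trigger. get_trojan_edge() then, for
    # each attached node, rewrites [0,0] into (attached node -> first trigger
    # node) and offsets the remaining entries by `start`, the index where the
    # new trigger nodes are appended, before symmetrizing the edge list.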

    def inject_trigger(self, idx_attach, features, edge_index, edge_weight, y, device):
        self.trojan = self.trojan.to(device)
        idx_attach = idx_attach.to(device)
        features = features.to(device)
        edge_index = edge_index.to(device)
        edge_weight = edge_weight.to(device)
        self.trojan.eval()

        trojan_feat, trojan_weights = self.trojan(features[idx_attach], self.thrd)  # may revise the generation process
        trojan_weights = torch.cat([torch.ones([len(idx_attach), 1], dtype=torch.float, device=device), trojan_weights], dim=1)
        trojan_weights = trojan_weights.flatten()

        trojan_feat = trojan_feat.view([-1, features.shape[1]])

        trojan_edge = self.get_trojan_edge(len(features), idx_attach, self.trigger_size).to(device)

        update_edge_weights = torch.cat([edge_weight, trojan_weights, trojan_weights])
        update_feat = torch.cat([features, trojan_feat])
        update_edge_index = torch.cat([edge_index, trojan_edge], dim=1)

        # update label set: trigger nodes get the placeholder label -1
        update_y = torch.cat([y, -1 * torch.ones([len(idx_attach) * self.trigger_size], dtype=torch.long, device=device)])

        self.trojan = self.trojan.cpu()
        idx_attach = idx_attach.cpu()
        features = features.cpu()
        edge_index = edge_index.cpu()
        edge_weight = edge_weight.cpu()
        return update_feat, update_edge_index, update_edge_weights, update_y


    def fit(self, features, edge_index, edge_weight, labels, idx_train, idx_attach, idx_unlabeled):

        if edge_weight is None:
            edge_weight = torch.ones([edge_index.shape[1]], device=self.device, dtype=torch.float)
        self.idx_attach = idx_attach
        self.features = features
        self.edge_index = edge_index
        self.edge_weights = edge_weight

        # initialize a shadow model
        self.shadow_model = GCN(nfeat=features.shape[1],
                                nhid=self.hidden,
                                nclass=labels.max().item() + 1,
                                dropout=0.0, device=self.device).to(self.device)
        # initialize a trojan net to generate the trigger
        self.trojan = GraphTrojanNet(self.device, features.shape[1], self.trigger_size, layernum=2).to(self.device)
        self.homo_loss = HomoLoss(self.device)

        optimizer_shadow = optim.Adam(self.shadow_model.parameters(), lr=self.lr, weight_decay=self.weight_decay)
        optimizer_trigger = optim.Adam(self.trojan.parameters(), lr=self.lr, weight_decay=self.weight_decay)

        # change the labels of the poisoned nodes to the target class
        self.labels = labels.clone()
        self.labels[idx_attach] = self.target_class

        # get the trojan edges, which include the target-trigger edges and the edges among trigger nodes
        trojan_edge = self.get_trojan_edge(len(features), idx_attach, self.trigger_size).to(self.device)

        # update the poisoned graph's edge index
        poison_edge_index = torch.cat([edge_index, trojan_edge], dim=1)

        # future work: change this to bilevel optimization

        loss_best = 1e8
        for i in range(self.trojan_epochs):
            self.trojan.train()
            for j in range(self.inner):

                optimizer_shadow.zero_grad()
                trojan_feat, trojan_weights = self.trojan(features[idx_attach], self.thrd)  # may revise the generation process
                trojan_weights = torch.cat([torch.ones([len(trojan_feat), 1], dtype=torch.float, device=self.device), trojan_weights], dim=1)
                trojan_weights = trojan_weights.flatten()
                trojan_feat = trojan_feat.view([-1, features.shape[1]])
                poison_edge_weights = torch.cat([edge_weight, trojan_weights, trojan_weights]).detach()  # repeat trojan weights because the edges are undirected
                poison_x = torch.cat([features, trojan_feat]).detach()

                output = self.shadow_model(poison_x, poison_edge_index, poison_edge_weights)

                loss_inner = F.nll_loss(output[torch.cat([idx_train, idx_attach])], self.labels[torch.cat([idx_train, idx_attach])])  # add our adaptive loss

                loss_inner.backward()
                optimizer_shadow.step()

            acc_train_clean = accuracy(output[idx_train], self.labels[idx_train])
            acc_train_attach = accuracy(output[idx_attach], self.labels[idx_attach])

            # involve unlabeled nodes in the outer optimization
            self.trojan.eval()
            optimizer_trigger.zero_grad()

            rs = np.random.RandomState(self.seed)
            idx_outer = torch.cat([idx_attach, idx_unlabeled[rs.choice(len(idx_unlabeled), size=512, replace=False)]])

            trojan_feat, trojan_weights = self.trojan(features[idx_outer], self.thrd)  # may revise the generation process

            trojan_weights = torch.cat([torch.ones([len(idx_outer), 1], dtype=torch.float, device=self.device), trojan_weights], dim=1)
            trojan_weights = trojan_weights.flatten()

            trojan_feat = trojan_feat.view([-1, features.shape[1]])

            trojan_edge = self.get_trojan_edge(len(features), idx_outer, self.trigger_size).to(self.device)

            update_edge_weights = torch.cat([edge_weight, trojan_weights, trojan_weights])
            update_feat = torch.cat([features, trojan_feat])
            update_edge_index = torch.cat([edge_index, trojan_edge], dim=1)

            output = self.shadow_model(update_feat, update_edge_index, update_edge_weights)

            labels_outer = labels.clone()
            labels_outer[idx_outer] = self.target_class
            loss_target = self.target_loss_weight * F.nll_loss(output[torch.cat([idx_train, idx_outer])],
                                                               labels_outer[torch.cat([idx_train, idx_outer])])
            loss_homo = 0.0

            if self.homo_loss_weight > 0:
                loss_homo = self.homo_loss(trojan_edge[:, :int(trojan_edge.shape[1] / 2)],
                                           trojan_weights,
                                           update_feat,
                                           self.homo_boost_thrd)

            loss_outer = loss_target + self.homo_loss_weight * loss_homo

            loss_outer.backward()
            optimizer_trigger.step()
            acc_train_outer = (output[idx_outer].argmax(dim=1) == self.target_class).float().mean()

            if loss_outer < loss_best:
                self.weights = deepcopy(self.trojan.state_dict())
                loss_best = float(loss_outer)

            if self.debug and i % 10 == 0:
                print('Epoch {}, loss_inner: {:.5f}, loss_target: {:.5f}, homo loss: {:.5f} '
                      .format(i, loss_inner, loss_target, loss_homo))
                print("acc_train_clean: {:.4f}, ASR_train_attach: {:.4f}, ASR_train_outer: {:.4f}"
                      .format(acc_train_clean, acc_train_attach, acc_train_outer))
        if self.debug:
            print("load best weights based on the outer loss")
        self.trojan.load_state_dict(self.weights)
        self.trojan.eval()

        # torch.cuda.empty_cache()
    def get_poisoned(self):
        with torch.no_grad():
            poison_x, poison_edge_index, poison_edge_weights, poison_labels = self.inject_trigger(self.idx_attach, self.features, self.edge_index, self.edge_weights, self.labels, self.device)
        # poison_labels = self.labels
        poison_edge_index = poison_edge_index[:, poison_edge_weights > 0.0]
        poison_edge_weights = poison_edge_weights[poison_edge_weights > 0.0]
        return poison_x, poison_edge_index, poison_edge_weights, poison_labels

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from copy import deepcopy
from torch_geometric.nn import GCNConv
import numpy as np
import scipy.sparse as sp

class GCN_Encoder(nn.Module):

    def __init__(self, nfeat, nhid, nclass, dropout=0.5, lr=0.01, weight_decay=5e-4, layer=2, device=None, use_ln=False, layer_norm_first=False):

        super(GCN_Encoder, self).__init__()

        assert device is not None, "Please specify 'device'!"
        self.device = device
        self.nfeat = nfeat
        self.hidden_sizes = [nhid]
        self.nclass = nclass
        self.use_ln = use_ln
        self.layer_norm_first = layer_norm_first
        # self.convs = nn.ModuleList()
        # self.convs.append(GCNConv(nfeat, nhid))
        # for _ in range(layer-2):
        #     self.convs.append(GCNConv(nhid,nhid))
        # self.gc2 = GCNConv(nhid, nclass)
        self.body = GCN_body(nfeat, nhid, dropout, layer, device=None, use_ln=use_ln, layer_norm_first=layer_norm_first)
        self.fc = nn.Linear(nhid, nclass)

        self.dropout = dropout
        self.lr = lr
        self.output = None
        self.edge_index = None
        self.edge_weight = None
        self.features = None
        self.weight_decay = weight_decay

    def forward(self, x, edge_index, edge_weight=None):
        x = self.body(x, edge_index, edge_weight)
        x = self.fc(x)
        return F.log_softmax(x, dim=1)

    def get_h(self, x, edge_index, edge_weight):
        self.eval()
        x = self.body(x, edge_index, edge_weight)
        return x

    def fit(self, features, edge_index, edge_weight, labels, idx_train, idx_val=None, train_iters=200, verbose=False):
        """Train the gcn model; when idx_val is not None, pick the best model according to the validation performance.
        Parameters
        ----------
        features :
            node features
        adj :
            the adjacency matrix. The format could be torch.tensor or scipy matrix
        labels :
            node labels
        idx_train :
            node training indices
        idx_val :
            node validation indices. If not given (None), the GCN training process will not adopt early stopping
        train_iters : int
            number of training epochs
        initialize : bool
            whether to initialize parameters before training
        verbose : bool
            whether to show verbose logs
        """

        self.edge_index, self.edge_weight = edge_index, edge_weight
        self.features = features.to(self.device)
        self.labels = labels.to(self.device)

        if idx_val is None:
            self._train_without_val(self.labels, idx_train, train_iters, verbose)
        else:
            self._train_with_val(self.labels, idx_train, idx_val, train_iters, verbose)

    def _train_without_val(self, labels, idx_train, train_iters, verbose):
        self.train()
        optimizer = optim.Adam(self.parameters(), lr=self.lr, weight_decay=self.weight_decay)
        for i in range(train_iters):
            optimizer.zero_grad()
            output = self.forward(self.features, self.edge_index, self.edge_weight)
            loss_train = F.nll_loss(output[idx_train], labels[idx_train])
            loss_train.backward()
            optimizer.step()
            if verbose and i % 10 == 0:
                print('Epoch {}, training loss: {}'.format(i, loss_train.item()))

        self.eval()
        output = self.forward(self.features, self.edge_index, self.edge_weight)
        self.output = output

    def _train_with_val(self, labels, idx_train, idx_val, train_iters, verbose):
        if verbose:
            print('=== training gcn model ===')
        optimizer = optim.Adam(self.parameters(), lr=self.lr, weight_decay=self.weight_decay)

        best_loss_val = 100
        best_acc_val = 0

        for i in range(train_iters):
            self.train()
            optimizer.zero_grad()
            output = self.forward(self.features, self.edge_index, self.edge_weight)
            loss_train = F.nll_loss(output[idx_train], labels[idx_train])
            loss_train.backward()
            optimizer.step()

            self.eval()
            output = self.forward(self.features, self.edge_index, self.edge_weight)
            loss_val = F.nll_loss(output[idx_val], labels[idx_val])
            acc_val = accuracy(output[idx_val], labels[idx_val])

            if verbose and i % 10 == 0:
                print('Epoch {}, training loss: {}'.format(i, loss_train.item()))
                print("acc_val: {:.4f}".format(acc_val))
            if acc_val > best_acc_val:
                best_acc_val = acc_val
                self.output = output
                weights = deepcopy(self.state_dict())

        if verbose:
            print('=== picking the best model according to the performance on validation ===')
        self.load_state_dict(weights)


    def test(self, features, edge_index, edge_weight, labels, idx_test):
        """Evaluate GCN performance on the test set.
        Parameters
        ----------
        idx_test :
            node testing indices
        """
        self.eval()
        with torch.no_grad():
            output = self.forward(features, edge_index, edge_weight)
            acc_test = accuracy(output[idx_test], labels[idx_test])
        return float(acc_test)

    def test_with_correct_nodes(self, features, edge_index, edge_weight, labels, idx_test):
        self.eval()
        output = self.forward(features, edge_index, edge_weight)
        correct_nids = (output.argmax(dim=1)[idx_test] == labels[idx_test]).nonzero().flatten()  # return a tensor
        acc_test = accuracy(output[idx_test], labels[idx_test])
        return acc_test, correct_nids
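
# Illustrative usage sketch (added; not in the original file): the encoder is
# trained on the clean graph and its hidden representations feed the k-means
# clustering used by UGBA.cluster_selection(); the inputs are assumed to be
# PyG-style tensors.
def _example_encoder_usage(data, idx_train, idx_val, device='cpu'):
    encoder = GCN_Encoder(nfeat=data.x.shape[1], nhid=32,
                          nclass=int(data.y.max() + 1), device=device).to(device)
    encoder.fit(data.x, data.edge_index, None, data.y, idx_train, idx_val=idx_val)
    return encoder.get_h(data.x, data.edge_index, None)  # node embeddings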

class GCN_body(nn.Module):
    def __init__(self, nfeat, nhid, dropout=0.5, layer=2, device=None, layer_norm_first=False, use_ln=False):
        super(GCN_body, self).__init__()
        self.device = device
        self.nfeat = nfeat
        self.hidden_sizes = [nhid]
        self.dropout = dropout

        self.convs = nn.ModuleList()
        self.convs.append(GCNConv(nfeat, nhid))
        self.lns = nn.ModuleList()
        self.lns.append(torch.nn.LayerNorm(nfeat))
        for _ in range(layer - 1):
            self.convs.append(GCNConv(nhid, nhid))
            self.lns.append(nn.LayerNorm(nhid))
        self.lns.append(torch.nn.LayerNorm(nhid))
        self.layer_norm_first = layer_norm_first
        self.use_ln = use_ln

    def forward(self, x, edge_index, edge_weight=None):
        if self.layer_norm_first:
            x = self.lns[0](x)
        i = 0
        for conv in self.convs:
            x = F.relu(conv(x, edge_index, edge_weight))
            if self.use_ln:
                x = self.lns[i + 1](x)
            i += 1
            x = F.dropout(x, self.dropout, training=self.training)
        return x

class GCN(nn.Module):

    def __init__(self, nfeat, nhid, nclass, dropout=0.5, lr=0.01, weight_decay=5e-4, layer=2, device=None, layer_norm_first=False, use_ln=False):

        super(GCN, self).__init__()

        assert device is not None, "Please specify 'device'!"
        self.device = device
        self.nfeat = nfeat
        self.hidden_sizes = [nhid]
        self.nclass = nclass
        self.convs = nn.ModuleList()
        self.convs.append(GCNConv(nfeat, nhid))
        self.lns = nn.ModuleList()
        self.lns.append(torch.nn.LayerNorm(nfeat))
        for _ in range(layer - 2):
            self.convs.append(GCNConv(nhid, nhid))
            self.lns.append(nn.LayerNorm(nhid))
        self.lns.append(nn.LayerNorm(nhid))
        self.gc2 = GCNConv(nhid, nclass)
        self.dropout = dropout
        self.lr = lr
        self.output = None
        self.edge_index = None
        self.edge_weight = None
        self.features = None
        self.weight_decay = weight_decay

        self.layer_norm_first = layer_norm_first
        self.use_ln = use_ln

    def forward(self, x, edge_index, edge_weight=None):
        if self.layer_norm_first:
            x = self.lns[0](x)
        i = 0
        for conv in self.convs:
            x = F.relu(conv(x, edge_index, edge_weight))
            if self.use_ln:
                x = self.lns[i + 1](x)
            i += 1
            x = F.dropout(x, self.dropout, training=self.training)
        x = self.gc2(x, edge_index, edge_weight)
        return F.log_softmax(x, dim=1)

    def get_h(self, x, edge_index):

        for conv in self.convs:
            x = F.relu(conv(x, edge_index))

        return x

    def fit(self, features, edge_index, edge_weight, labels, idx_train, idx_val=None, train_iters=200, verbose=False):
        """Train the gcn model; when idx_val is not None, pick the best model according to the validation performance.
        Parameters
        ----------
        features :
            node features
        adj :
            the adjacency matrix. The format could be torch.tensor or scipy matrix
        labels :
            node labels
        idx_train :
            node training indices
        idx_val :
            node validation indices. If not given (None), the GCN training process will not adopt early stopping
        train_iters : int
            number of training epochs
        initialize : bool
            whether to initialize parameters before training
        verbose : bool
            whether to show verbose logs
        """

        self.edge_index, self.edge_weight = edge_index, edge_weight
        self.features = features.to(self.device)
        self.labels = labels.to(self.device)

        if idx_val is None:
            self._train_without_val(self.labels, idx_train, train_iters, verbose)
        else:
            self._train_with_val(self.labels, idx_train, idx_val, train_iters, verbose)
        # torch.cuda.empty_cache()

    def _train_without_val(self, labels, idx_train, train_iters, verbose):
        self.train()
        optimizer = optim.Adam(self.parameters(), lr=self.lr, weight_decay=self.weight_decay)
        for i in range(train_iters):
            optimizer.zero_grad()
            output = self.forward(self.features, self.edge_index, self.edge_weight)
            loss_train = F.nll_loss(output[idx_train], labels[idx_train])
            loss_train.backward()
            optimizer.step()
            if verbose and i % 10 == 0:
                print('Epoch {}, training loss: {}'.format(i, loss_train.item()))

        self.eval()
        output = self.forward(self.features, self.edge_index, self.edge_weight)
        self.output = output
        # torch.cuda.empty_cache()

    def _train_with_val(self, labels, idx_train, idx_val, train_iters, verbose):
        if verbose:
            print('=== training gcn model ===')
        optimizer = optim.Adam(self.parameters(), lr=self.lr, weight_decay=self.weight_decay)

        best_loss_val = 100
        best_acc_val = 0

        for i in range(train_iters):
            self.train()
            optimizer.zero_grad()
            output = self.forward(self.features, self.edge_index, self.edge_weight)
            loss_train = F.nll_loss(output[idx_train], labels[idx_train])
            loss_train.backward()
            optimizer.step()

            self.eval()
            output = self.forward(self.features, self.edge_index, self.edge_weight)
            loss_val = F.nll_loss(output[idx_val], labels[idx_val])
            acc_val = utils.accuracy(output[idx_val], labels[idx_val])

            if verbose and i % 10 == 0:
                print('Epoch {}, training loss: {}'.format(i, loss_train.item()))
                print("acc_val: {:.4f}".format(acc_val))
            if acc_val > best_acc_val:
                best_acc_val = acc_val
                self.output = output
                weights = deepcopy(self.state_dict())

        if verbose:
            print('=== picking the best model according to the performance on validation ===')
        self.load_state_dict(weights)
        # torch.cuda.empty_cache()


    def test(self, features, edge_index, edge_weight, labels, idx_test):
        """Evaluate GCN performance on the test set.
        Parameters
        ----------
        idx_test :
            node testing indices
        """
        self.eval()
        with torch.no_grad():
            output = self.forward(features, edge_index, edge_weight)
            acc_test = utils.accuracy(output[idx_test], labels[idx_test])
        # torch.cuda.empty_cache()
        # print("Test set results:",
        #       "loss= {:.4f}".format(loss_test.item()),
        #       "accuracy= {:.4f}".format(acc_test.item()))
        return float(acc_test)

    def test_with_correct_nodes(self, features, edge_index, edge_weight, labels, idx_test):
        self.eval()
        output = self.forward(features, edge_index, edge_weight)
        correct_nids = (output.argmax(dim=1)[idx_test] == labels[idx_test]).nonzero().flatten()  # return a tensor
        acc_test = utils.accuracy(output[idx_test], labels[idx_test])
        # torch.cuda.empty_cache()
        return acc_test, correct_nids
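
# Illustrative usage sketch (added; not in the original file): this GCN is the
# shadow model optimized inside Backdoor.fit(); trained standalone it looks like:
def _example_gcn_usage(data, idx_train, idx_val, device='cpu'):
    model = GCN(nfeat=data.x.shape[1], nhid=32,
                nclass=int(data.y.max() + 1), device=device).to(device)
    model.fit(data.x, data.edge_index, None, data.y, idx_train, idx_val=idx_val)
    idx_test = data.test_mask.nonzero().flatten()
    return model.test(data.x, data.edge_index, None, data.y, idx_test)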
deeprobust/graph/utils.py
ADDED
@@ -0,0 +1,778 @@
import numpy as np
import scipy.sparse as sp
import torch
from sklearn.model_selection import train_test_split
import torch.sparse as ts
import torch.nn.functional as F
import warnings

def encode_onehot(labels):
    """Convert label to onehot format.

    Parameters
    ----------
    labels : numpy.array
        node labels

    Returns
    -------
    numpy.array
        onehot labels
    """
    eye = np.eye(labels.max() + 1)
    onehot_mx = eye[labels]
    return onehot_mx
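
# Worked example (added; not in the original file): for labels [0, 2, 1] the
# function indexes rows of a 3x3 identity matrix.
def _example_encode_onehot():
    onehot = encode_onehot(np.array([0, 2, 1]))
    assert (onehot == np.array([[1, 0, 0], [0, 0, 1], [0, 1, 0]])).all()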

def tensor2onehot(labels):
    """Convert label tensor to label onehot tensor.

    Parameters
    ----------
    labels : torch.LongTensor
        node labels

    Returns
    -------
    torch.LongTensor
        onehot labels tensor

    """

    eye = torch.eye(labels.max() + 1)
    onehot_mx = eye[labels]
    return onehot_mx.to(labels.device)

def preprocess(adj, features, labels, preprocess_adj=False, preprocess_feature=False, sparse=False, device='cpu'):
    """Convert adj, features, labels from array or sparse matrix to
    torch Tensor, and normalize the input data.

    Parameters
    ----------
    adj : scipy.sparse.csr_matrix
        the adjacency matrix.
    features : scipy.sparse.csr_matrix
        node features
    labels : numpy.array
        node labels
    preprocess_adj : bool
        whether to normalize the adjacency matrix
    preprocess_feature : bool
        whether to normalize the feature matrix
    sparse : bool
        whether to return sparse tensor
    device : str
        'cpu' or 'cuda'
    """

    if preprocess_adj:
        adj = normalize_adj(adj)

    if preprocess_feature:
        features = normalize_feature(features)

    labels = torch.LongTensor(labels)
    if sparse:
        adj = sparse_mx_to_torch_sparse_tensor(adj)
        features = sparse_mx_to_torch_sparse_tensor(features)
    else:
        if sp.issparse(features):
            features = torch.FloatTensor(np.array(features.todense()))
        else:
            features = torch.FloatTensor(features)
        adj = torch.FloatTensor(adj.todense())
    return adj.to(device), features.to(device), labels.to(device)
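
# Minimal usage sketch (added; not in the original file), assuming a small
# scipy graph; with sparse=False both adj and features come back dense.
def _example_preprocess():
    adj = sp.csr_matrix(np.array([[0, 1], [1, 0]], dtype=float))
    features = sp.csr_matrix(np.eye(2))
    labels = np.array([0, 1])
    adj_t, feat_t, labels_t = preprocess(adj, features, labels,
                                         preprocess_adj=True, device='cpu')
    return adj_t, feat_t, labels_t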

def to_tensor(adj, features, labels=None, device='cpu'):
    """Convert adj, features, labels from array or sparse matrix to
    torch Tensor.

    Parameters
    ----------
    adj : scipy.sparse.csr_matrix
        the adjacency matrix.
    features : scipy.sparse.csr_matrix
        node features
    labels : numpy.array
        node labels
    device : str
        'cpu' or 'cuda'
    """
    if sp.issparse(adj):
        adj = sparse_mx_to_torch_sparse_tensor(adj)
    else:
        adj = torch.FloatTensor(adj)
    if sp.issparse(features):
        features = sparse_mx_to_torch_sparse_tensor(features)
    else:
        features = torch.FloatTensor(np.array(features))

    if labels is None:
        return adj.to(device), features.to(device)
    else:
        labels = torch.LongTensor(labels)
        return adj.to(device), features.to(device), labels.to(device)

def normalize_feature(mx):
    """Row-normalize a sparse or dense matrix.

    Parameters
    ----------
    mx : scipy.sparse.csr_matrix or numpy.array
        matrix to be normalized

    Returns
    -------
    scipy.sparse.lil_matrix
        normalized matrix
    """
    if type(mx) is not sp.lil.lil_matrix:
        try:
            mx = mx.tolil()
        except AttributeError:
            pass
    rowsum = np.array(mx.sum(1))
    r_inv = np.power(rowsum, -1).flatten()
    r_inv[np.isinf(r_inv)] = 0.
    r_mat_inv = sp.diags(r_inv)
    mx = r_mat_inv.dot(mx)
    return mx

def normalize_adj(mx):
    """Symmetrically normalize a sparse adjacency matrix:
    A' = (D + I)^-1/2 @ (A + I) @ (D + I)^-1/2

    Parameters
    ----------
    mx : scipy.sparse.csr_matrix
        matrix to be normalized

    Returns
    -------
    scipy.sparse.lil_matrix
        normalized matrix
    """

    # TODO: maybe using coo format would be better?
    if type(mx) is not sp.lil.lil_matrix:
        mx = mx.tolil()
    if mx[0, 0] == 0:
        mx = mx + sp.eye(mx.shape[0])
    rowsum = np.array(mx.sum(1))
    r_inv = np.power(rowsum, -1/2).flatten()
    r_inv[np.isinf(r_inv)] = 0.
    r_mat_inv = sp.diags(r_inv)
    mx = r_mat_inv.dot(mx)
    mx = mx.dot(r_mat_inv)
    return mx
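
# Worked example (added; not in the original file): for the 2-node path graph,
# A + I = [[1,1],[1,1]] and every degree is 2, so each entry of the normalized
# matrix is 1/sqrt(2) * 1 * 1/sqrt(2) = 0.5.
def _example_normalize_adj():
    adj = sp.csr_matrix(np.array([[0., 1.], [1., 0.]]))
    mx = normalize_adj(adj).todense()
    assert np.allclose(mx, 0.5 * np.ones((2, 2)))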

def normalize_sparse_tensor(adj, fill_value=1):
    """Normalize a sparse tensor. Needs torch_scatter to be importable.
    """
    edge_index = adj._indices()
    edge_weight = adj._values()
    num_nodes = adj.size(0)
    edge_index, edge_weight = add_self_loops(
        edge_index, edge_weight, fill_value, num_nodes)

    row, col = edge_index
    from torch_scatter import scatter_add
    deg = scatter_add(edge_weight, row, dim=0, dim_size=num_nodes)
    deg_inv_sqrt = deg.pow(-0.5)
    deg_inv_sqrt[deg_inv_sqrt == float('inf')] = 0

    values = deg_inv_sqrt[row] * edge_weight * deg_inv_sqrt[col]

    shape = adj.shape
    return torch.sparse.FloatTensor(edge_index, values, shape)

def add_self_loops(edge_index, edge_weight=None, fill_value=1, num_nodes=None):
    # num_nodes = maybe_num_nodes(edge_index, num_nodes)

    loop_index = torch.arange(0, num_nodes, dtype=torch.long,
                              device=edge_index.device)
    loop_index = loop_index.unsqueeze(0).repeat(2, 1)

    if edge_weight is not None:
        assert edge_weight.numel() == edge_index.size(1)
        loop_weight = edge_weight.new_full((num_nodes, ), fill_value)
        edge_weight = torch.cat([edge_weight, loop_weight], dim=0)

    edge_index = torch.cat([edge_index, loop_index], dim=1)

    return edge_index, edge_weight

def normalize_adj_tensor(adj, sparse=False):
    """Normalize adjacency tensor matrix.
    """
    device = adj.device
    if sparse:
        # warnings.warn('If you find the training process is too slow, you can uncomment line 207 in deeprobust/graph/utils.py. Note that you need to install torch_sparse')
        # TODO if this is too slow, uncomment the following code,
        # but you need to install torch_scatter
        # return normalize_sparse_tensor(adj)
        adj = to_scipy(adj)
        mx = normalize_adj(adj)
        return sparse_mx_to_torch_sparse_tensor(mx).to(device)
    else:
        mx = adj + torch.eye(adj.shape[0]).to(device)
        rowsum = mx.sum(1)
        r_inv = rowsum.pow(-1/2).flatten()
        r_inv[torch.isinf(r_inv)] = 0.
        r_mat_inv = torch.diag(r_inv)
        mx = r_mat_inv @ mx
        mx = mx @ r_mat_inv
        return mx

def degree_normalize_adj(mx):
    """Row-normalize sparse matrix"""
    mx = mx.tolil()
    if mx[0, 0] == 0:
        mx = mx + sp.eye(mx.shape[0])
    rowsum = np.array(mx.sum(1))
    r_inv = np.power(rowsum, -1).flatten()
    r_inv[np.isinf(r_inv)] = 0.
    r_mat_inv = sp.diags(r_inv)
    # mx = mx.dot(r_mat_inv)
    mx = r_mat_inv.dot(mx)
    return mx

def degree_normalize_sparse_tensor(adj, fill_value=1):
    """degree_normalize_sparse_tensor.
    """
    edge_index = adj._indices()
    edge_weight = adj._values()
    num_nodes = adj.size(0)

    edge_index, edge_weight = add_self_loops(
        edge_index, edge_weight, fill_value, num_nodes)

    row, col = edge_index
    from torch_scatter import scatter_add
    deg = scatter_add(edge_weight, row, dim=0, dim_size=num_nodes)
    deg_inv = deg.pow(-1)
    deg_inv[deg_inv == float('inf')] = 0

    values = deg_inv[row] * edge_weight
    shape = adj.shape
    return torch.sparse.FloatTensor(edge_index, values, shape)

def degree_normalize_adj_tensor(adj, sparse=True):
    """degree_normalize_adj_tensor.
    """

    device = adj.device
    if sparse:
        # return degree_normalize_sparse_tensor(adj)
        adj = to_scipy(adj)
        mx = degree_normalize_adj(adj)
        return sparse_mx_to_torch_sparse_tensor(mx).to(device)
    else:
        mx = adj + torch.eye(adj.shape[0]).to(device)
        rowsum = mx.sum(1)
        r_inv = rowsum.pow(-1).flatten()
        r_inv[torch.isinf(r_inv)] = 0.
        r_mat_inv = torch.diag(r_inv)
        mx = r_mat_inv @ mx
        return mx

def accuracy(output, labels):
    """Return accuracy of output compared to labels.

    Parameters
    ----------
    output : torch.Tensor
        output from model
    labels : torch.Tensor or numpy.array
        node labels

    Returns
    -------
    float
        accuracy
    """
    if not hasattr(labels, '__len__'):
        labels = [labels]
    if type(labels) is not torch.Tensor:
        labels = torch.LongTensor(labels)
    preds = output.max(1)[1].type_as(labels)
    correct = preds.eq(labels).double()
    correct = correct.sum()
    return correct / len(labels)

def loss_acc(output, labels, targets, avg_loss=True):
    if type(labels) is not torch.Tensor:
        labels = torch.LongTensor(labels)
    preds = output.max(1)[1].type_as(labels)
    correct = preds.eq(labels).double()[targets]
    loss = F.nll_loss(output[targets], labels[targets], reduction='mean' if avg_loss else 'none')

    if avg_loss:
        return loss, correct.sum() / len(targets)
    return loss, correct
    # correct = correct.sum()
    # return loss, correct / len(labels)

def get_perf(output, labels, mask, verbose=True):
    """Evaluate performance on the masked (test) data."""
    loss = F.nll_loss(output[mask], labels[mask])
    acc = accuracy(output[mask], labels[mask])
    if verbose:
        print("loss= {:.4f}".format(loss.item()),
              "accuracy= {:.4f}".format(acc.item()))
    return loss.item(), acc.item()


def classification_margin(output, true_label):
    """Calculate the classification margin for an output,
    `probs_true_label - probs_best_second_class`.

    Parameters
    ----------
    output: torch.Tensor
        output vector (1 dimension)
    true_label: int
        true label for this node

    Returns
    -------
    float
        classification margin for this node
    """

    probs = torch.exp(output)
    probs_true_label = probs[true_label].clone()
    probs[true_label] = 0
    probs_best_second_class = probs[probs.argmax()]
    return (probs_true_label - probs_best_second_class).item()
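
# Worked example (added; not in the original file): with log-probabilities for
# [0.7, 0.2, 0.1] and true label 0, the margin is 0.7 - 0.2 = 0.5.
def _example_classification_margin():
    log_probs = torch.log(torch.tensor([0.7, 0.2, 0.1]))
    assert abs(classification_margin(log_probs, 0) - 0.5) < 1e-6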

def sparse_mx_to_torch_sparse_tensor(sparse_mx):
    """Convert a scipy sparse matrix to a torch sparse tensor."""
    sparse_mx = sparse_mx.tocoo().astype(np.float32)
    sparserow = torch.LongTensor(sparse_mx.row).unsqueeze(1)
    sparsecol = torch.LongTensor(sparse_mx.col).unsqueeze(1)
    sparseconcat = torch.cat((sparserow, sparsecol), 1)
    sparsedata = torch.FloatTensor(sparse_mx.data)
    return torch.sparse.FloatTensor(sparseconcat.t(), sparsedata, torch.Size(sparse_mx.shape))

    # slower version....
    # sparse_mx = sparse_mx.tocoo().astype(np.float32)
    # indices = torch.from_numpy(
    #     np.vstack((sparse_mx.row, sparse_mx.col)).astype(np.int64))
    # values = torch.from_numpy(sparse_mx.data)
    # shape = torch.Size(sparse_mx.shape)
    # return torch.sparse.FloatTensor(indices, values, shape)


def to_scipy(tensor):
    """Convert a dense/sparse tensor to a scipy matrix."""
    if is_sparse_tensor(tensor):
        values = tensor._values()
        indices = tensor._indices()
        return sp.csr_matrix((values.cpu().numpy(), indices.cpu().numpy()), shape=tensor.shape)
    else:
        indices = tensor.nonzero().t()
        values = tensor[indices[0], indices[1]]
        return sp.csr_matrix((values.cpu().numpy(), indices.cpu().numpy()), shape=tensor.shape)

def is_sparse_tensor(tensor):
    """Check if a tensor is a sparse tensor.

    Parameters
    ----------
    tensor : torch.Tensor
        given tensor

    Returns
    -------
    bool
        whether the tensor is a sparse tensor
    """
    # if hasattr(tensor, 'nnz'):
    if tensor.layout == torch.sparse_coo:
        return True
    else:
        return False

def get_train_val_test(nnodes, val_size=0.1, test_size=0.8, stratify=None, seed=None):
    """This setting follows nettack/mettack, where we split the nodes
    into 10% training, 10% validation and 80% testing data.

    Parameters
    ----------
    nnodes : int
        number of nodes in total
    val_size : float
        size of validation set
    test_size : float
        size of test set
    stratify :
        data is expected to split in a stratified fashion, so stratify should be the labels.
    seed : int or None
        random seed

    Returns
    -------
    idx_train :
        node training indices
    idx_val :
        node validation indices
    idx_test :
        node test indices
    """

    assert stratify is not None, 'stratify cannot be None!'

    if seed is not None:
        np.random.seed(seed)

    idx = np.arange(nnodes)
    train_size = 1 - val_size - test_size
    idx_train_and_val, idx_test = train_test_split(idx,
                                                   random_state=None,
                                                   train_size=train_size + val_size,
                                                   test_size=test_size,
                                                   stratify=stratify)

    if stratify is not None:
        stratify = stratify[idx_train_and_val]

    idx_train, idx_val = train_test_split(idx_train_and_val,
                                          random_state=None,
                                          train_size=(train_size / (train_size + val_size)),
                                          test_size=(val_size / (train_size + val_size)),
                                          stratify=stratify)

    return idx_train, idx_val, idx_test
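
# Usage sketch (added; not in the original file): splitting a toy labeled graph
# into the default 10/10/80 train/val/test partition.
def _example_split():
    labels = np.repeat([0, 1], 10)           # 20 nodes, two balanced classes
    idx_train, idx_val, idx_test = get_train_val_test(
        len(labels), val_size=0.1, test_size=0.8, stratify=labels, seed=15)
    return idx_train, idx_val, idx_test      # 2 / 2 / 16 nodes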
|
448 |
+
|
449 |
+
def get_train_test(nnodes, test_size=0.8, stratify=None, seed=None):
|
450 |
+
"""This function returns training and test set without validation.
|
451 |
+
It can be used for settings of different label rates.
|
452 |
+
|
453 |
+
Parameters
|
454 |
+
----------
|
455 |
+
nnodes : int
|
456 |
+
number of nodes in total
|
457 |
+
test_size : float
|
458 |
+
size of test set
|
459 |
+
stratify :
|
460 |
+
data is expected to split in a stratified fashion. So stratify should be labels.
|
461 |
+
seed : int or None
|
462 |
+
random seed
|
463 |
+
|
464 |
+
Returns
|
465 |
+
-------
|
466 |
+
idx_train :
|
467 |
+
node training indices
|
468 |
+
idx_test :
|
469 |
+
node test indices
|
470 |
+
"""
|
471 |
+
assert stratify is not None, 'stratify cannot be None!'
|
472 |
+
|
473 |
+
if seed is not None:
|
474 |
+
np.random.seed(seed)
|
475 |
+
|
476 |
+
idx = np.arange(nnodes)
|
477 |
+
train_size = 1 - test_size
|
478 |
+
idx_train, idx_test = train_test_split(idx, random_state=None,
|
479 |
+
train_size=train_size,
|
480 |
+
test_size=test_size,
|
481 |
+
stratify=stratify)
|
482 |
+
|
483 |
+
return idx_train, idx_test
|
484 |
+
|
485 |
+
def get_train_val_test_gcn(labels, seed=None):
|
486 |
+
"""This setting follows gcn, where we randomly sample 20 instances for each class
|
487 |
+
as training data, 500 instances as validation data, 1000 instances as test data.
|
488 |
+
Note here we are not using fixed splits. When random seed changes, the splits
|
489 |
+
will also change.
|
490 |
+
|
491 |
+
Parameters
|
492 |
+
----------
|
493 |
+
labels : numpy.array
|
494 |
+
node labels
|
495 |
+
seed : int or None
|
496 |
+
random seed
|
497 |
+
|
498 |
+
Returns
|
499 |
+
-------
|
500 |
+
idx_train :
|
501 |
+
node training indices
|
502 |
+
idx_val :
|
503 |
+
node validation indices
|
504 |
+
idx_test :
|
505 |
+
node test indices
|
506 |
+
"""
|
507 |
+
if seed is not None:
|
508 |
+
np.random.seed(seed)
|
509 |
+
|
510 |
+
idx = np.arange(len(labels))
|
511 |
+
nclass = labels.max() + 1
|
512 |
+
idx_train = []
|
513 |
+
idx_unlabeled = []
|
514 |
+
for i in range(nclass):
|
515 |
+
labels_i = idx[labels==i]
|
516 |
+
labels_i = np.random.permutation(labels_i)
|
517 |
+
idx_train = np.hstack((idx_train, labels_i[: 20])).astype(np.int)
|
518 |
+
idx_unlabeled = np.hstack((idx_unlabeled, labels_i[20: ])).astype(np.int)
|
519 |
+
|
520 |
+
idx_unlabeled = np.random.permutation(idx_unlabeled)
|
521 |
+
idx_val = idx_unlabeled[: 500]
|
522 |
+
idx_test = idx_unlabeled[500: 1500]
|
523 |
+
return idx_train, idx_val, idx_test
|
524 |
+
|
525 |
+
def get_train_test_labelrate(labels, label_rate):
    """Get train/test indices according to the given label rate.
    Note that only the training and test indices are returned.
    """
    nclass = labels.max() + 1
    train_size = int(round(len(labels) * label_rate / nclass))
    print("=== train_size = %s ===" % train_size)
    idx_train, idx_val, idx_test = get_splits_each_class(labels, train_size=train_size)
    return idx_train, idx_test


def get_splits_each_class(labels, train_size):
    """We randomly sample n instances for each class, where n = train_size.
    """
    idx = np.arange(len(labels))
    nclass = labels.max() + 1
    idx_train = []
    idx_val = []
    idx_test = []
    for i in range(nclass):
        labels_i = idx[labels == i]
        labels_i = np.random.permutation(labels_i)
        idx_train = np.hstack((idx_train, labels_i[: train_size])).astype(int)
        idx_val = np.hstack((idx_val, labels_i[train_size: 2 * train_size])).astype(int)
        idx_test = np.hstack((idx_test, labels_i[2 * train_size:])).astype(int)

    return np.random.permutation(idx_train), np.random.permutation(idx_val), \
           np.random.permutation(idx_test)


def unravel_index(index, array_shape):
    rows = torch.div(index, array_shape[1], rounding_mode='trunc')
    cols = index % array_shape[1]
    return rows, cols
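# For example, unravel_index(torch.tensor(7), (3, 4)) gives (rows=1, cols=3):
# flat index 7 in a 3 x 4 matrix sits at row 7 // 4 and column 7 % 4.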
def get_degree_squence(adj):
    try:
        return adj.sum(0)
    except:
        return ts.sum(adj, dim=1).to_dense()


def likelihood_ratio_filter(node_pairs, modified_adjacency, original_adjacency, d_min, threshold=0.004, undirected=True):
    """
    Filter the input node pairs based on the likelihood ratio test proposed by Zügner et al. 2018, see
    https://dl.acm.org/citation.cfm?id=3220078. In essence, for each node pair return 1 if adding/removing the edge
    between the two nodes does not violate the unnoticeability constraint, and return 0 otherwise. Assumes an
    unweighted and undirected graph.
    """

    N = int(modified_adjacency.shape[0])
    # original_degree_sequence = get_degree_squence(original_adjacency)
    # current_degree_sequence = get_degree_squence(modified_adjacency)
    original_degree_sequence = original_adjacency.sum(0)
    current_degree_sequence = modified_adjacency.sum(0)

    concat_degree_sequence = torch.cat((current_degree_sequence, original_degree_sequence))

    # Compute the log likelihood values of the original, modified, and combined degree sequences.
    ll_orig, alpha_orig, n_orig, sum_log_degrees_original = degree_sequence_log_likelihood(original_degree_sequence, d_min)
    ll_current, alpha_current, n_current, sum_log_degrees_current = degree_sequence_log_likelihood(current_degree_sequence, d_min)

    ll_comb, alpha_comb, n_comb, sum_log_degrees_combined = degree_sequence_log_likelihood(concat_degree_sequence, d_min)

    # Compute the log likelihood ratio
    current_ratio = -2 * ll_comb + 2 * (ll_orig + ll_current)

    # Compute new log likelihood values that would arise if we add/remove the edges corresponding to each node pair.
    new_lls, new_alphas, new_ns, new_sum_log_degrees = updated_log_likelihood_for_edge_changes(node_pairs,
                                                                                               modified_adjacency, d_min)

    # Combination of the original degree distribution with the distributions corresponding to each node pair.
    n_combined = n_orig + new_ns
    new_sum_log_degrees_combined = sum_log_degrees_original + new_sum_log_degrees
    alpha_combined = compute_alpha(n_combined, new_sum_log_degrees_combined, d_min)
    new_ll_combined = compute_log_likelihood(n_combined, alpha_combined, new_sum_log_degrees_combined, d_min)
    new_ratios = -2 * new_ll_combined + 2 * (new_lls + ll_orig)

    # Allowed edges are only those for which the resulting likelihood ratio measure is below the threshold.
    allowed_edges = new_ratios < threshold

    # np.bool was removed from recent NumPy releases; use the builtin bool
    if allowed_edges.is_cuda:
        filtered_edges = node_pairs[allowed_edges.cpu().numpy().astype(bool)]
    else:
        filtered_edges = node_pairs[allowed_edges.numpy().astype(bool)]

    allowed_mask = torch.zeros(modified_adjacency.shape)
    allowed_mask[filtered_edges.T] = 1
    if undirected:
        allowed_mask += allowed_mask.t()
    return allowed_mask, current_ratio


def degree_sequence_log_likelihood(degree_sequence, d_min):
    """
    Compute the (maximum) log likelihood of the Powerlaw distribution fit on a degree distribution.
    """

    # Determine which degrees are to be considered, i.e. >= d_min.
    D_G = degree_sequence[(degree_sequence >= d_min.item())]
    try:
        sum_log_degrees = torch.log(D_G).sum()
    except:
        sum_log_degrees = np.log(D_G).sum()
    n = len(D_G)

    alpha = compute_alpha(n, sum_log_degrees, d_min)
    ll = compute_log_likelihood(n, alpha, sum_log_degrees, d_min)
    return ll, alpha, n, sum_log_degrees


def updated_log_likelihood_for_edge_changes(node_pairs, adjacency_matrix, d_min):
    """ Adopted from https://github.com/danielzuegner/nettack
    """
    # For each node pair, find out whether there is an edge in the input adjacency matrix.
    edge_entries_before = adjacency_matrix[node_pairs.T]
    degree_sequence = adjacency_matrix.sum(1)
    D_G = degree_sequence[degree_sequence >= d_min.item()]
    sum_log_degrees = torch.log(D_G).sum()
    n = len(D_G)
    deltas = -2 * edge_entries_before + 1
    d_edges_before = degree_sequence[node_pairs]

    d_edges_after = degree_sequence[node_pairs] + deltas[:, None]

    # Sum the log of the degrees after the potential changes which are >= d_min
    sum_log_degrees_after, new_n = update_sum_log_degrees(sum_log_degrees, n, d_edges_before, d_edges_after, d_min)
    # Updated estimates of the Powerlaw exponents
    new_alpha = compute_alpha(new_n, sum_log_degrees_after, d_min)
    # Updated log likelihood values for the Powerlaw distributions
    new_ll = compute_log_likelihood(new_n, new_alpha, sum_log_degrees_after, d_min)
    return new_ll, new_alpha, new_n, sum_log_degrees_after


def update_sum_log_degrees(sum_log_degrees_before, n_old, d_old, d_new, d_min):
    # Find out whether the degrees before and after the change are above the threshold d_min.
    old_in_range = d_old >= d_min
    new_in_range = d_new >= d_min
    d_old_in_range = d_old * old_in_range.float()
    d_new_in_range = d_new * new_in_range.float()

    # Update the sum by subtracting the old values and then adding the updated logs of the degrees.
    sum_log_degrees_after = sum_log_degrees_before - (torch.log(torch.clamp(d_old_in_range, min=1))).sum(1) \
                            + (torch.log(torch.clamp(d_new_in_range, min=1))).sum(1)

    # Update the number of degrees >= d_min
    new_n = n_old - (old_in_range != 0).sum(1) + (new_in_range != 0).sum(1)
    new_n = new_n.float()
    return sum_log_degrees_after, new_n


def compute_alpha(n, sum_log_degrees, d_min):
    try:
        alpha = 1 + n / (sum_log_degrees - n * torch.log(d_min - 0.5))
    except:
        alpha = 1 + n / (sum_log_degrees - n * np.log(d_min - 0.5))
    return alpha


def compute_log_likelihood(n, alpha, sum_log_degrees, d_min):
    # Log likelihood under alpha
    try:
        ll = n * torch.log(alpha) + n * alpha * torch.log(d_min) + (alpha + 1) * sum_log_degrees
    except:
        ll = n * np.log(alpha) + n * alpha * np.log(d_min) + (alpha + 1) * sum_log_degrees

    return ll
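# The two helpers above implement the maximum-likelihood power-law fit of
# Clauset et al. (2009), as used by Nettack: for degrees d_i >= d_min the
# exponent estimate is
#
#     alpha_hat = 1 + n / (sum_i log d_i - n * log(d_min - 1/2)),
#
# and likelihood_ratio_filter compares -2x log-likelihood ratios of the
# "combined" vs. "separate" fits against a small threshold to decide which
# edge flips remain statistically unnoticeable.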
def ravel_multiple_indices(ixs, shape, reverse=False):
    """
    "Flattens" multiple 2D input indices into indices on the flattened matrix,
    similar to np.ravel_multi_index. Does the same as ravel_index but for
    multiple indices at once.

    Parameters
    ----------
    ixs: array of ints shape (n, 2)
        The array of n indices that will be flattened.
    shape: list or tuple of ints of length 2
        The shape of the corresponding matrix.

    Returns
    -------
    array of n ints between 0 and shape[0]*shape[1]-1
        The indices on the flattened matrix corresponding to the 2D input indices.
    """
    if reverse:
        return ixs[:, 1] * shape[1] + ixs[:, 0]

    return ixs[:, 0] * shape[1] + ixs[:, 1]


def visualize(your_var):
    """Visualize the computation graph."""
    from graphviz import Digraph
    import torch
    from torch.autograd import Variable
    from torchviz import make_dot
    make_dot(your_var).view()


def reshape_mx(mx, shape):
    indices = mx.nonzero()
    return sp.csr_matrix((mx.data, (indices[0], indices[1])), shape=shape)


def add_mask(data, dataset):
    """data: ogb-arxiv pyg data format"""
    # for arxiv
    split_idx = dataset.get_idx_split()
    train_idx, valid_idx, test_idx = split_idx["train"], split_idx["valid"], split_idx["test"]
    n = data.x.shape[0]
    data.train_mask = index_to_mask(train_idx, n)
    data.val_mask = index_to_mask(valid_idx, n)
    data.test_mask = index_to_mask(test_idx, n)
    data.y = data.y.squeeze()
    # data.edge_index = to_undirected(data.edge_index, data.num_nodes)


def index_to_mask(index, size):
    mask = torch.zeros((size, ), dtype=torch.bool)
    mask[index] = 1
    return mask


def add_feature_noise(data, noise_ratio, seed):
    np.random.seed(seed)
    n, d = data.x.shape
    # noise = torch.normal(mean=torch.zeros(int(noise_ratio*n), d), std=1)
    noise = torch.FloatTensor(np.random.normal(0, 1, size=(int(noise_ratio * n), d))).to(data.x.device)
    indices = np.arange(n)
    indices = np.random.permutation(indices)[: int(noise_ratio * n)]
    delta_feat = torch.zeros_like(data.x)
    delta_feat[indices] = noise - data.x[indices]
    data.x[indices] = noise
    mask = np.zeros(n)
    mask[indices] = 1
    mask = torch.tensor(mask).bool().to(data.x.device)
    return delta_feat, mask


def add_feature_noise_test(data, noise_ratio, seed):
    np.random.seed(seed)
    n, d = data.x.shape
    indices = np.arange(n)
    test_nodes = indices[data.test_mask.cpu()]
    selected = np.random.permutation(test_nodes)[: int(noise_ratio * len(test_nodes))]
    noise = torch.FloatTensor(np.random.normal(0, 1, size=(int(noise_ratio * len(test_nodes)), d)))
    noise = noise.to(data.x.device)

    delta_feat = torch.zeros_like(data.x)
    delta_feat[selected] = noise - data.x[selected]
    data.x[selected] = noise
    # mask = np.zeros(len(test_nodes))
    mask = np.zeros(n)
    mask[selected] = 1
    mask = torch.tensor(mask).bool().to(data.x.device)
    return delta_feat, mask

# def check_path(file_path):
#     if not osp.exists(file_path):
#         os.system(f'mkdir -p {file_path}')
deeprobust/image/README.md
ADDED
@@ -0,0 +1,45 @@
# Setup
```
git clone https://github.com/DSE-MSU/DeepRobust.git
cd DeepRobust
python setup.py install
```

# Full README
[click here](https://github.com/DSE-MSU/DeepRobust)

# Attack Methods
| Attack Methods | Attack Type | Application Domain | Links |
|--------------------|-------------|--------------|------|
| LBFGS attack | White-Box | Image Classification | [Intriguing Properties of Neural Networks](https://arxiv.org/pdf/1312.6199.pdf?not-changed) |
| FGSM attack | White-Box | Image Classification | [Explaining and Harnessing Adversarial Examples](https://arxiv.org/pdf/1412.6572.pdf) |
| PGD attack | White-Box | Image Classification | [Towards Deep Learning Models Resistant to Adversarial Attacks](https://arxiv.org/pdf/1706.06083.pdf) |
| DeepFool attack | White-Box | Image Classification | [DeepFool: a simple and accurate method to fool deep neural networks](https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Moosavi-Dezfooli_DeepFool_A_Simple_CVPR_2016_paper.pdf) |
| CW attack | White-Box | Image Classification | [Towards Evaluating the Robustness of Neural Networks](https://arxiv.org/pdf/1608.04644.pdf) |
| One pixel attack | Black-Box | Image Classification | [One pixel attack for fooling deep neural networks](https://arxiv.org/pdf/1710.08864.pdf) |
| BPDA attack | White-Box | Image Classification | [Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples](https://arxiv.org/pdf/1802.00420.pdf) |
| Universal attack | White-Box | Image Classification | [Universal adversarial perturbations](https://arxiv.org/pdf/1610.08401.pdf) |
| Nattack | Black-Box | Image Classification | [NATTACK: Learning the Distributions of Adversarial Examples for an Improved Black-Box Attack on Deep Neural Networks](https://arxiv.org/pdf/1905.00441.pdf) |

# Defense Methods
| Defense Methods | Defense Type | Application Domain | Links |
|---------------------|--------------|--------------|------|
| FGSM training | Adversarial Training | Image Classification | [Explaining and Harnessing Adversarial Examples](https://arxiv.org/pdf/1412.6572.pdf) |
| Fast (an improved version of FGSM training) | Adversarial Training | Image Classification | [Fast is better than free: Revisiting adversarial training](https://openreview.net/attachment?id=BJx040EFvH&name=original_pdf) |
| PGD training | Adversarial Training | Image Classification | [Towards Deep Learning Models Resistant to Adversarial Attacks](https://arxiv.org/pdf/1706.06083.pdf) |
| YOPO (an improved version of PGD training) | Adversarial Training | Image Classification | [You Only Propagate Once: Accelerating Adversarial Training via Maximal Principle](https://arxiv.org/pdf/1905.00877.pdf) |
| TRADES | Adversarial Training | Image Classification | [Theoretically Principled Trade-off between Robustness and Accuracy](https://arxiv.org/pdf/1901.08573.pdf) |
| Thermometer Encoding | Gradient Masking | Image Classification | [Thermometer Encoding: One Hot Way To Resist Adversarial Examples](https://openreview.net/pdf?id=S18Su--CW) |
| LID-based adversarial classifier | Detection | Image Classification | [Characterizing Adversarial Subspaces Using Local Intrinsic Dimensionality](https://arxiv.org/pdf/1801.02613.pdf) |

# Supported Datasets
- MNIST
- CIFAR-10
- ImageNet

# Supported Networks
- CNN
- ResNet (ResNet18, ResNet34, ResNet50)
- VGG
- DenseNet
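# Quick Start (sketch)
A minimal example of attacking a trained classifier with the FGSM implementation in this package; `model`, `image`, and `label` are hypothetical placeholders for a trained `torch.nn.Module`, an `[N, C, H, W]` float tensor, and the corresponding labels:

```python
from deeprobust.image.attack.fgsm import FGSM

attack = FGSM(model, device='cuda')
adv_image = attack.generate(image, label, epsilon=0.2)
```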
deeprobust/image/__init__.py
ADDED
@@ -0,0 +1,11 @@
import logging

from deeprobust.image import attack
from deeprobust.image import defense
from deeprobust.image import netmodels

__all__ = ['attack', 'defense', 'netmodels']

logging.info("import attack from image")
logging.info("import defense from image")
logging.info("import netmodels from image")
deeprobust/image/attack/Nattack.py
ADDED
@@ -0,0 +1,181 @@
import torch
from torch import optim
import numpy as np
import logging

from deeprobust.image.attack.base_attack import BaseAttack
from deeprobust.image.utils import onehot_like, arctanh


class NATTACK(BaseAttack):
    """
    Nattack is a black-box attack algorithm.
    """

    def __init__(self, model, device='cuda'):
        super(NATTACK, self).__init__(model, device)
        self.model = model
        self.device = device

    def generate(self, **kwargs):
        """
        Call this function to generate adversarial examples.

        Parameters
        ----------
        kwargs :
            user defined parameters
        """
        assert self.parse_params(**kwargs)
        return attack(self.model, self.dataloader, self.classnum,
                      self.clip_max, self.clip_min, self.epsilon,
                      self.population, self.max_iterations,
                      self.learning_rate, self.sigma, self.target_or_not)

    def parse_params(self,
                     dataloader,
                     classnum,
                     target_or_not=False,
                     clip_max=1,
                     clip_min=0,
                     epsilon=0.2,
                     population=300,
                     max_iterations=400,
                     learning_rate=2,
                     sigma=0.1):
        """parse_params.

        Parameters
        ----------
        dataloader :
            dataloader
        classnum :
            number of classes
        target_or_not :
            whether the attack is targeted
        clip_max :
            maximum pixel value
        clip_min :
            minimum pixel value
        epsilon :
            perturbation constraint
        population :
            NES population size
        max_iterations :
            maximum number of iterations
        learning_rate :
            learning rate
        sigma :
            standard deviation of the sampling noise
        """
        self.dataloader = dataloader
        self.classnum = classnum
        self.target_or_not = target_or_not
        self.clip_max = clip_max
        self.clip_min = clip_min
        self.epsilon = epsilon
        self.population = population
        self.max_iterations = max_iterations
        self.learning_rate = learning_rate
        self.sigma = sigma
        return True


def attack(model, loader, classnum, clip_max, clip_min, epsilon, population, max_iterations, learning_rate, sigma, target_or_not):

    # initialization
    totalImages = 0
    succImages = 0
    faillist = []
    successlist = []
    printlist = []

    for i, (inputs, targets) in enumerate(loader):

        success = False
        print('attack picture No. ' + str(i))

        c = inputs.size(1)  # channel
        l = inputs.size(2)  # length
        w = inputs.size(3)  # width

        mu = arctanh((inputs * 2) - 1)
        # mu = torch.from_numpy(np.random.randn(1, c, l, w) * 0.001).float()  # random initialize mean
        predict = model.forward(inputs)

        # skip wrongly classified samples
        if predict.argmax(dim=1, keepdim=True) != targets:
            print('skip the wrong example ', i)
            continue
        totalImages += 1

        # search for the most promising mean
        for runstep in range(max_iterations):

            # sample points from a normal distribution around mu
            eps = torch.from_numpy(np.random.randn(population, c, l, w)).float()
            z = mu.repeat(population, 1, 1, 1) + sigma * eps

            # calculate g_z, mapping the samples back to pixel space
            g_z = torch.tanh(z) * 1 / 2 + 1 / 2

            # test whether a successful attack exists every 10 iterations
            if runstep % 10 == 0:

                realdist = g_z - inputs

                realclipdist = torch.clamp(realdist, -epsilon, epsilon).float()
                realclipinput = realclipdist + inputs

                info = 'inputs.shape__' + str(inputs.shape)
                logging.debug(info)

                predict = model.forward(realclipinput)

                # untargeted attack
                if target_or_not == False:

                    if sum(predict.argmax(dim=1, keepdim=True)[0] != targets) > 0 and (realclipdist.abs().max() <= epsilon):
                        succImages += 1
                        success = True
                        print('succeed attack Images: ' + str(succImages) + ' totalImages: ' + str(totalImages))
                        print('steps: ' + str(runstep))
                        successlist.append(i)
                        printlist.append(runstep)
                        break

            # calculate distance and project onto the epsilon ball
            dist = g_z - inputs
            clipdist = torch.clamp(dist, -epsilon, epsilon)
            proj_g_z = inputs + clipdist
            proj_g_z = proj_g_z.float()
            outputs = model.forward(proj_g_z)

            # get CW loss on the sampled images
            target_onehot = np.zeros((1, classnum))
            target_onehot[0][targets] = 1.
            real = (target_onehot * outputs.detach().numpy()).sum(1)
            other = ((1. - target_onehot) * outputs.detach().numpy() - target_onehot * 10000.).max(1)
            loss1 = np.clip(real - other, a_min=0, a_max=1e10)
            Reward = 0.5 * loss1

            # update the mean by NES with normalized rewards
            A = ((Reward - np.mean(Reward)) / (np.std(Reward) + 1e-7))
            A = np.array(A, dtype=np.float32)

            # reshape the gradient estimate back to image shape (the original
            # version hardcoded MNIST's 1 x 28 x 28 here)
            mu = mu - torch.from_numpy((learning_rate / (population * sigma)) *
                                       ((np.dot(eps.reshape(population, -1).T, A)).reshape(1, c, l, w)))

        if not success:
            faillist.append(i)
            print('failed:', faillist.__len__())
            print('....................................')
        else:
            # print('succeed:', successlist.__len__())
            print('....................................')
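# Usage sketch (names here are placeholders, not part of the API):
#
#   attack = NATTACK(model, device='cuda')
#   attack.generate(dataloader=test_loader, classnum=10)
#
# The mean update above is the NES gradient estimator: with samples
# z_j = mu + sigma * eps_j and normalized rewards A_j, the gradient of the
# expected loss w.r.t. mu is approximated by
# (1 / (population * sigma)) * sum_j A_j * eps_j, which is exactly what the
# np.dot(...) line computes.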
deeprobust/image/attack/fgsm.py
ADDED
@@ -0,0 +1,121 @@
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import numpy as np
from numpy import linalg as LA

from deeprobust.image.attack.base_attack import BaseAttack


class FGSM(BaseAttack):
    """
    FGSM attack is a one-step gradient method.
    """

    def __init__(self, model, device='cuda'):
        super(FGSM, self).__init__(model, device)

    def generate(self, image, label, **kwargs):
        """
        Call this function to generate FGSM adversarial examples.

        Parameters
        ----------
        image :
            original image
        label :
            target label
        kwargs :
            user defined parameters
        """
        label = label.type(torch.FloatTensor)

        ## check and parse parameters for attack
        assert self.check_type_device(image, label)
        assert self.parse_params(**kwargs)

        return fgm(self.model,
                   self.image,
                   self.label,
                   self.epsilon,
                   self.order,
                   self.clip_min,
                   self.clip_max,
                   self.device)

    def parse_params(self,
                     epsilon=0.2,
                     order=np.inf,
                     clip_max=None,
                     clip_min=None):
        """
        Parse the user defined parameters.

        :param model: victim model
        :param image: original attack images
        :param label: target labels
        :param epsilon: perturbation constraint
        :param order: constraint type
        :param clip_min: minimum pixel value
        :param clip_max: maximum pixel value
        :param device: device type, cpu or gpu

        :type image: [N*C*H*W], floatTensor
        :type label: int
        :type epsilon: float
        :type order: int
        :type clip_min: float
        :type clip_max: float
        :type device: string('cpu' or 'cuda')

        :return: perturbed images
        :rtype: [N*C*H*W], floatTensor
        """
        self.epsilon = epsilon
        self.order = order
        self.clip_max = clip_max
        self.clip_min = clip_min
        return True


def fgm(model, image, label, epsilon, order, clip_min, clip_max, device):
    imageArray = image.cpu().detach().numpy()
    X_fgsm = torch.tensor(imageArray).to(device)

    X_fgsm.requires_grad = True

    opt = optim.SGD([X_fgsm], lr=1e-3)
    opt.zero_grad()

    loss = nn.CrossEntropyLoss()(model(X_fgsm), label)
    loss.backward()

    if order == np.inf:
        # L_inf constraint: step in the direction of the gradient sign
        d = epsilon * X_fgsm.grad.data.sign()
    elif order == 2:
        # L_2 constraint: step along the per-sample normalized gradient
        gradient = X_fgsm.grad
        d = torch.zeros(gradient.shape, device=device)
        for i in range(gradient.shape[0]):
            norm_grad = gradient[i].data / LA.norm(gradient[i].data.cpu().numpy())
            d[i] = norm_grad * epsilon
    else:
        raise ValueError('Other p norms may need other algorithms')

    x_adv = X_fgsm + d

    if clip_max is None and clip_min is None:
        clip_max = np.inf
        clip_min = -np.inf

    x_adv = torch.clamp(x_adv, clip_min, clip_max)

    return x_adv
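# Usage sketch (placeholders, assuming a trained classifier `model`):
#
#   attack = FGSM(model, device='cuda')
#   adv = attack.generate(image, label, epsilon=0.2, clip_min=0, clip_max=1)
#
# For order=np.inf this is the classic x_adv = x + epsilon * sign(grad_x loss);
# for order=2 the gradient is L2-normalized per sample before scaling.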
deeprobust/image/attack/onepixel.py
ADDED
@@ -0,0 +1,186 @@
import numpy as np

import argparse

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import torch.backends.cudnn as cudnn

import torchvision
import torchvision.transforms as transforms
from torch.autograd import Variable

from deeprobust.image.optimizer import differential_evolution
from deeprobust.image.attack.base_attack import BaseAttack
from deeprobust.image.utils import progress_bar


class Onepixel(BaseAttack):
    """
    Onepixel attack is an algorithm that allows the attacker to manipulate only
    one (or a few) pixels to mislead the classifier.
    This is a re-implementation of the One pixel attack.
    Copyright (c) 2018 Debang Li

    References
    ----------
    Akhtar, N., & Mian, A. (2018). Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey. IEEE Access, 6, 14410-14430.

    Reference code: https://github.com/DebangLi/one-pixel-attack-pytorch
    """

    def __init__(self, model, device='cuda'):
        super(Onepixel, self).__init__(model, device)

    def generate(self, image, label, **kwargs):
        """
        Call this function to generate Onepixel adversarial examples.

        Parameters
        ----------
        image : 1*3*W*H
            original image
        label :
            target label
        kwargs :
            user defined parameters
        """
        label = label.type(torch.FloatTensor)

        ## check and parse parameters for attack
        assert self.check_type_device(image, label)
        assert self.parse_params(**kwargs)

        return self.one_pixel(self.image,
                              self.label,
                              targeted_attack=self.targeted_attack,
                              target=self.target,
                              pixels=self.pixels,
                              maxiter=self.maxiter,
                              popsize=self.popsize,
                              print_log=self.print_log)

    def get_pred(self):
        return self.adv_pred

    def parse_params(self,
                     pixels=1,
                     maxiter=100,
                     popsize=400,
                     samples=100,
                     targeted_attack=False,
                     print_log=True,
                     target=0):
        """
        Parse the user-defined params.

        Parameters
        ----------
        pixels :
            maximum number of manipulated pixels
        maxiter :
            maximum number of iterations
        popsize :
            population size
        samples :
            samples
        targeted_attack :
            targeted attack or not
        print_log :
            Set print_log = True to print out details of the searching algorithm
        target :
            target label (if targeted_attack is set to True)
        """
        self.pixels = pixels
        self.maxiter = maxiter
        self.popsize = popsize
        self.samples = samples
        self.targeted_attack = targeted_attack
        self.print_log = print_log
        self.target = target
        return True

    def one_pixel(self, img, label, targeted_attack=False, target=0, pixels=1, maxiter=75, popsize=400, print_log=False):
        # label: a number

        target_class = target if targeted_attack else label

        # each candidate pixel is encoded as (x, y, r, g, b)
        bounds = [(0, 32), (0, 32), (0, 255), (0, 255), (0, 255)] * pixels

        popmul = max(1, popsize // len(bounds))

        # DE minimizes the fitness: for an untargeted attack we minimize the
        # true-class probability, for a targeted attack we minimize
        # 1 - p(target); hence minimize = not targeted_attack.
        predict_fn = lambda xs: predict_classes(
            xs, img, target_class, self.model, not targeted_attack, self.device)
        callback_fn = lambda x, convergence: attack_success(
            x, img, target_class, self.model, targeted_attack, print_log, self.device)

        inits = np.zeros([popmul * len(bounds), len(bounds)])
        for init in inits:
            for i in range(pixels):
                init[i * 5 + 0] = np.random.random() * 32
                init[i * 5 + 1] = np.random.random() * 32
                init[i * 5 + 2] = np.random.normal(128, 127)
                init[i * 5 + 3] = np.random.normal(128, 127)
                init[i * 5 + 4] = np.random.normal(128, 127)

        attack_result = differential_evolution(predict_fn, bounds, maxiter=maxiter, popsize=popmul,
                                               recombination=1, atol=-1, callback=callback_fn, polish=False, init=inits)

        attack_image = perturb_image(attack_result.x, img)
        with torch.no_grad():
            attack_var = attack_image.to(self.device)
            predicted_probs = F.softmax(self.model(attack_var), dim=1).data.cpu().numpy()[0]

        predicted_class = np.argmax(predicted_probs)

        if (not targeted_attack and predicted_class != label) or (targeted_attack and predicted_class == target_class):
            self.adv_pred = predicted_class
            return attack_image
        return [None]


def perturb_image(xs, img):

    if xs.ndim < 2:
        xs = np.array([xs])
    batch = len(xs)
    imgs = img.repeat(batch, 1, 1, 1)
    xs = xs.astype(int)

    count = 0

    for x in xs:
        pixels = np.split(x, len(x) // 5)
        for pixel in pixels:
            x_pos, y_pos, r, g, b = pixel
            # write the (r, g, b) values with CIFAR-10 channel normalization
            imgs[count, 0, x_pos, y_pos] = (r / 255.0 - 0.4914) / 0.2023
            imgs[count, 1, x_pos, y_pos] = (g / 255.0 - 0.4822) / 0.1994
            imgs[count, 2, x_pos, y_pos] = (b / 255.0 - 0.4465) / 0.2010
        count += 1

    return imgs


def predict_classes(xs, img, target_class, net, minimize=True, device='cuda'):
    imgs_perturbed = perturb_image(xs, img.clone()).to(device)
    predictions = F.softmax(net(imgs_perturbed), dim=1).data.cpu().numpy()[:, target_class]

    return predictions if minimize else 1 - predictions


def attack_success(x, img, target_class, net, targeted_attack=False, print_log=False, device='cuda'):

    attack_image = perturb_image(x, img.clone()).to(device)
    confidence = F.softmax(net(attack_image), dim=1).data.cpu().numpy()[0]
    pred = np.argmax(confidence)

    if print_log:
        print("Confidence: %.4f" % confidence[target_class])
    if (targeted_attack and pred == target_class) or (not targeted_attack and pred != target_class):
        return True
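# Usage sketch (placeholders; expects a CIFAR-10-normalized 1x3x32x32 image):
#
#   attack = Onepixel(model, device='cuda')
#   adv = attack.generate(image, label, pixels=1, maxiter=100, popsize=400)
#
# Each differential-evolution candidate encodes one pixel as (x, y, r, g, b),
# so a k-pixel attack searches a 5k-dimensional box.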
deeprobust/image/defense/AWP.py
ADDED
@@ -0,0 +1,301 @@
"""
This is an implementation of PGD adversarial training with Adversarial Weight
Perturbation (AWP).
References
----------
.. [1] Madry, A., Makelov, A., Schmidt, L., Tsipras, D., & Vladu, A. (2017).
Towards Deep Learning Models Resistant to Adversarial Attacks. stat, 1050, 9.
.. [2] Wu, D., Xia, S.-T., & Wang, Y. (2020).
Adversarial Weight Perturbation Helps Robust Generalization. NeurIPS.
"""

import os
import copy
from collections import OrderedDict

import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
import torch.nn.functional as F

import numpy as np
from PIL import Image
from deeprobust.image.attack.pgd import PGD
from deeprobust.image.netmodels.CNN import Net
from deeprobust.image.defense.base_defense import BaseDefense

EPS = 1E-20


def diff_in_weights(model, proxy):
    # Layer-wise normalized weight difference between the model and its proxy.
    diff_dict = OrderedDict()
    model_state_dict = model.state_dict()
    proxy_state_dict = proxy.state_dict()
    for (old_k, old_w), (new_k, new_w) in zip(model_state_dict.items(), proxy_state_dict.items()):
        if len(old_w.size()) <= 1:
            continue
        if 'weight' in old_k:
            diff_w = new_w - old_w
            diff_dict[old_k] = old_w.norm() / (diff_w.norm() + EPS) * diff_w
    return diff_dict


def add_into_weights(model, diff, coeff=1.0):
    names_in_diff = diff.keys()
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name in names_in_diff:
                param.add_(coeff * diff[name])
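# AWP in one line: find a weight perturbation v (computed above as a
# layer-wise normalized difference, scaled by gamma) that locally maximizes
# the adversarial loss, train on the perturbed weights w + v, then remove v.
# The per-layer scaling ||w|| / ||delta_w|| keeps the perturbation size
# relative to each layer's own weight norm.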
class pgd_AWP(object):
    """
    Compute and apply the adversarial weight perturbation for PGD-AWP training.
    """
    def __init__(self, model, proxy, proxy_optim, gamma):
        super(pgd_AWP, self).__init__()
        self.model = model
        self.proxy = proxy
        self.proxy_optim = proxy_optim
        self.gamma = gamma

    def calc_awp(self, adv_samples, labels):
        self.proxy.load_state_dict(self.model.state_dict())
        self.proxy.train()

        # compute the adversarial loss on the proxy and ascend on it
        logits_adv = self.proxy(adv_samples)
        loss = -1 * F.cross_entropy(logits_adv, labels)

        self.proxy_optim.zero_grad()
        loss.backward()
        self.proxy_optim.step()

        # the adversarial weight perturbation
        diff = diff_in_weights(self.model, self.proxy)
        return diff

    def perturb(self, diff):
        add_into_weights(self.model, diff, coeff=1.0 * self.gamma)

    def restore(self, diff):
        add_into_weights(self.model, diff, coeff=-1.0 * self.gamma)


class AWP_AT(BaseDefense):
    """
    PGD adversarial training with adversarial weight perturbation.
    """

    def __init__(self, model, device):
        if not torch.cuda.is_available():
            print('CUDA not available, using cpu...')
            self.device = 'cpu'
        else:
            self.device = device

        self.model = model

    def generate(self, train_loader, test_loader, **kwargs):
        """Call this function to generate a robust model.

        Parameters
        ----------
        train_loader :
            training data loader
        test_loader :
            testing data loader
        kwargs :
            kwargs
        """
        self.parse_params(**kwargs)

        torch.manual_seed(100)
        device = torch.device(self.device)

        optimizer = optim.Adam(self.model.parameters(), self.lr)
        lr_scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[75, 100], gamma=0.1)
        save_model = True
        for epoch in range(1, self.epoch + 1):
            print('Training epoch: ', epoch, flush=True)

            self.train(self.device, train_loader, optimizer, epoch)
            self.test(self.model, self.device, test_loader)

            if (self.save_model and epoch % self.save_per_epoch == 0):
                if os.path.isdir(str(self.save_dir)):
                    torch.save(self.model.state_dict(), os.path.join(self.save_dir, self.save_name + '_epoch' + str(epoch) + '.pth'))
                    print("model saved in " + str(self.save_dir))
                else:
                    print("make new directory and save model in " + str(self.save_dir))
                    os.mkdir('./' + str(self.save_dir))
                    torch.save(self.model.state_dict(), os.path.join(self.save_dir, self.save_name + '_epoch' + str(epoch) + '.pth'))

            lr_scheduler.step()

        return self.model

    def parse_params(self,
                     epoch_num=100,
                     save_dir="./defense_models",
                     save_name="AWP_pgdtraining_0.3",
                     save_model=True,
                     epsilon=8.0 / 255.0,
                     num_steps=10,
                     perturb_step_size=0.01,
                     lr=0.1,
                     momentum=0.1,
                     awp_gamma=0.005,
                     save_per_epoch=10):
        """Parameter parser.

        Parameters
        ----------
        epoch_num : int
            number of training epochs
        save_dir : str
            model directory
        save_name : str
            model name
        save_model : bool
            whether to save the model
        epsilon : float
            attack constraint
        num_steps : int
            number of PGD attack iterations
        perturb_step_size : float
            perturb step size
        lr : float
            learning rate for the adversarial training process
        momentum : float
            momentum for the optimizer
        awp_gamma : float
            scale of the adversarial weight perturbation
        """
        self.epoch = epoch_num
        self.save_model = True
        self.save_dir = save_dir
        self.save_name = save_name
        self.epsilon = epsilon
        self.num_steps = num_steps
        self.perturb_step_size = perturb_step_size
        self.lr = lr
        self.momentum = momentum
        self.awp_gamma = awp_gamma
        self.save_per_epoch = save_per_epoch

    def train(self, device, train_loader, optimizer, epoch):
        """
        Training process.

        Parameters
        ----------
        device :
            device
        train_loader :
            training data loader
        optimizer :
            optimizer
        epoch :
            training epoch
        """
        self.model.train()
        correct = 0
        bs = train_loader.batch_size
        # scheduler = StepLR(optimizer, step_size=10, gamma=0.5)
        scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[70], gamma=0.1)
        # the proxy is a throwaway copy of the model used only to search for
        # the weight perturbation; the proxy learning rate is an assumed default
        proxy = copy.deepcopy(self.model)
        proxy_optim = optim.SGD(proxy.parameters(), lr=0.01)
        awp_adversary = pgd_AWP(model=self.model, proxy=proxy, proxy_optim=proxy_optim, gamma=self.awp_gamma)

        for batch_idx, (data, target) in enumerate(train_loader):

            optimizer.zero_grad()

            data, target = data.to(device), target.to(device)

            data_adv, output = self.adv_data(data, target, ep=self.epsilon, num_steps=self.num_steps, perturb_step_size=self.perturb_step_size)

            # perturb the weights adversarially, ...
            awp = awp_adversary.calc_awp(adv_samples=data_adv, labels=target)
            awp_adversary.perturb(awp)

            # ... train on the adversarial examples under the perturbed weights, ...
            output = self.model(data_adv)
            loss = self.calculate_loss(output, target)

            loss.backward()
            optimizer.step()

            # ... then remove the weight perturbation after the update
            awp_adversary.restore(awp)

            pred = output.argmax(dim=1, keepdim=True)
            correct += pred.eq(target.view_as(pred)).sum().item()

            # print every 20 batches
            if batch_idx % 20 == 0:
                print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}\tAccuracy:{:.2f}%'.format(
                    epoch, batch_idx * len(data), len(train_loader.dataset),
                    100. * batch_idx / len(train_loader), loss.item(), 100 * correct / bs))
                correct = 0

        scheduler.step()

    def test(self, model, device, test_loader):
        """
        Testing process.

        Parameters
        ----------
        model :
            model
        device :
            device
        test_loader :
            testing dataloader
        """
        model.eval()

        test_loss = 0
        correct = 0
        test_loss_adv = 0
        correct_adv = 0
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)

            # clean accuracy
            output = model(data)
            test_loss += F.cross_entropy(output, target, reduction='sum').item()  # sum up batch loss
            pred = output.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()

            # adversarial accuracy
            data_adv, output_adv = self.adv_data(data, target, ep=self.epsilon, num_steps=self.num_steps)

            test_loss_adv += self.calculate_loss(output_adv, target, redmode='sum').item()  # sum up batch loss
            pred_adv = output_adv.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
            correct_adv += pred_adv.eq(target.view_as(pred_adv)).sum().item()

        test_loss /= len(test_loader.dataset)
        test_loss_adv /= len(test_loader.dataset)

        print('\nTest set: Clean loss: {:.3f}, Clean Accuracy: {}/{} ({:.0f}%)\n'.format(
            test_loss, correct, len(test_loader.dataset),
            100. * correct / len(test_loader.dataset)))

        print('\nTest set: Adv loss: {:.3f}, Adv Accuracy: {}/{} ({:.0f}%)\n'.format(
            test_loss_adv, correct_adv, len(test_loader.dataset),
            100. * correct_adv / len(test_loader.dataset)))

    def adv_data(self, data, output, ep=0.3, num_steps=10, perturb_step_size=0.01):
        """
        Generate adversarial input data for training.
        """
        adversary = PGD(self.model)
        data_adv = adversary.generate(data, output.flatten(), epsilon=ep, num_steps=num_steps, step_size=perturb_step_size)
        output = self.model(data_adv)

        return data_adv, output

    def calculate_loss(self, output, target, redmode='mean'):
        """
        Calculate the loss for training.
        """
        loss = F.cross_entropy(output, target, reduction=redmode)
        return loss
deeprobust/image/defense/TherEncoding.py
ADDED
@@ -0,0 +1,203 @@
"""
This is an implementation of Thermometer Encoding.

References
----------
.. [1] Buckman, Jacob, Aurko Roy, Colin Raffel, and Ian Goodfellow. "Thermometer encoding: One hot way to resist adversarial examples." In International Conference on Learning Representations. 2018.
"""

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F

import numpy as np
from torchvision import datasets, transforms
from deeprobust.image.netmodels.CNN import Net

import logging

## TODO
# class ther_attack(pgd_attack):
#     """
#     PGD attacks in response to thermometer encoding models
#     """
## TODO
# def adv_train():
#     """
#     adversarial training for thermometer encoding
#     """


def train(model, device, train_loader, optimizer, epoch):
    """Training process.

    Parameters
    ----------
    model :
        model
    device :
        device
    train_loader :
        training data loader
    optimizer :
        optimizer
    epoch :
        epoch
    """
    logger.info('training')
    model.train()
    correct = 0
    bs = train_loader.batch_size

    for batch_idx, (data, target) in enumerate(train_loader):

        optimizer.zero_grad()
        data, target = data.to(device), target.to(device)

        # encode pixels, then fold the encoding levels into the channel dim
        encoding = Thermometer(data, LEVELS)
        encoding = encoding.permute(0, 2, 3, 1, 4)
        encoding = torch.flatten(encoding, start_dim=3)
        encoding = encoding.permute(0, 3, 1, 2)

        output = model(encoding)

        loss = F.nll_loss(output, target)
        loss.backward()

        optimizer.step()

        pred = output.argmax(dim=1, keepdim=True)
        correct += pred.eq(target.view_as(pred)).sum().item()

        # print every 10 batches
        if batch_idx % 10 == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}\tAccuracy:{:.2f}%'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item(), 100 * correct / (10 * bs)))
            correct = 0


def test(model, device, test_loader):
    model.eval()

    test_loss = 0
    correct = 0

    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)

            encoding = Thermometer(data, LEVELS)
            encoding = encoding.permute(0, 2, 3, 1, 4)
            encoding = torch.flatten(encoding, start_dim=3)
            encoding = encoding.permute(0, 3, 1, 2)

            # clean accuracy
            output = model(encoding)
            test_loss += F.nll_loss(output, target, reduction='sum').item()  # sum up batch loss
            pred = output.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()

    test_loss /= len(test_loader.dataset)

    print('\nTest set: Clean loss: {:.3f}, Clean Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))


def Thermometer(x, levels, flattened=False):
    """
    Output
    ------
    Thermometer Encoding of the input.
    """
    onehot = one_hot(x, levels)

    thermometer = one_hot_to_thermometer(onehot, levels)

    return thermometer


def one_hot(x, levels):
    """
    Output
    ------
    One hot Encoding of the input.
    """
    batch_size, channel, H, W = x.size()
    x = x.unsqueeze_(4)
    # quantize pixel values into `levels` buckets (the original used the
    # global LEVELS here instead of the `levels` parameter)
    x = torch.ceil(x * (levels - 1)).long()
    onehot = torch.zeros(batch_size, channel, H, W, levels).float().to(x.device).scatter_(4, x, 1)

    return onehot


def one_hot_to_thermometer(x, levels, flattened=False):
    """
    Convert One hot Encoding to Thermometer Encoding.
    """
    if flattened:
        pass
        # TODO: check how to flatten

    thermometer = torch.cumsum(x, dim=4)

    if flattened:
        pass
    return thermometer
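# Worked example: with levels=10, a pixel value of 0.25 maps to level
# ceil(0.25 * 9) = 3, whose one-hot vector is [0,0,0,1,0,0,0,0,0,0]; the
# cumulative sum along the level axis turns it into the thermometer code
# [0,0,0,1,1,1,1,1,1,1]. Discretizing inputs this way masks the gradients
# an attacker would otherwise follow.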
if __name__ == '__main__':

    logger = logging.getLogger('Thermometer Encoding')

    handler = logging.StreamHandler()  # handler for the logger
    handler.setFormatter(logging.Formatter('%(asctime)s'))
    logger.addHandler(handler)
    logger.setLevel(logging.DEBUG)

    logger.info('Start attack.')

    torch.manual_seed(100)
    device = torch.device("cuda")

    logger.info('Load trainset.')
    train_loader = torch.utils.data.DataLoader(
        datasets.MNIST('deeprobust/image/data', train=True, download=True,
                       transform=transforms.Compose([transforms.ToTensor()])),
        batch_size=100,
        shuffle=True)

    test_loader = torch.utils.data.DataLoader(
        datasets.MNIST('deeprobust/image/data', train=False,
                       transform=transforms.Compose([transforms.ToTensor()])),
        batch_size=1000,
        shuffle=True)

    # TODO: change the channel according to the dataset.
    LEVELS = 10
    channel = 1
    model = Net(in_channel1=channel * LEVELS, out_channel1=32 * LEVELS, out_channel2=64 * LEVELS).to(device)
    optimizer = optim.SGD(model.parameters(), lr=0.0001, momentum=0.2)
    logger.info('Load model.')

    save_model = True
    for epoch in range(1, 50 + 1):
        print('Running epoch ', epoch)

        train(model, device, train_loader, optimizer, epoch)
        test(model, device, test_loader)

        if save_model:
            torch.save(model.state_dict(), "deeprobust/image/save_models/thermometer_encoding.pt")
deeprobust/image/defense/YOPO.py
ADDED
@@ -0,0 +1,410 @@
"""
This is an implementation of the adversarial training variant YOPO.

References
----------
.. [1] Zhang, D., Zhang, T., Lu, Y., Zhu, Z., & Dong, B. (2019).
You only propagate once: Painless adversarial training using maximal principle.
arXiv preprint arXiv:1905.00877.

.. [2] Original code: https://github.com/a1600012888/YOPO-You-Only-Propagate-Once
"""
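# A rough intuition for YOPO (see [1]): instead of back-propagating through
# the whole network for every PGD step, the adversarial update is decoupled so
# that only the first layer (via a Hamiltonian term, cf. the Hamiltonian class
# below) is repeatedly differentiated, while full back-propagation happens
# once per outer iteration; this is what makes the inner attack loop cheap.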
import sys
import os
import time
import math
import json
import argparse
from collections import OrderedDict
from typing import Tuple, List, Dict

import numpy as np
import torch
import torch.nn as nn
from torch import optim
from torch.nn.modules.loss import _Loss
from tqdm import tqdm
from tensorboardX import SummaryWriter

from deeprobust.image.netmodels import YOPOCNN
from deeprobust.image import utils
from deeprobust.image.attack import YOPOpgd
from deeprobust.image.defense.base_defense import BaseDefense


class PieceWiseConstantLrSchedulerMaker(object):

    def __init__(self, milestones: List[int], gamma: float = 0.1):
        self.milestones = milestones
        self.gamma = gamma

    def __call__(self, optimizer):
        return torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=self.milestones, gamma=self.gamma)


class IPGDAttackMethodMaker(object):

    def __init__(self, eps, sigma, nb_iters, norm, mean, std):
        self.eps = eps
        self.sigma = sigma
        self.nb_iters = nb_iters
        self.norm = norm
        self.mean = mean
        self.std = std

    def __call__(self, DEVICE):
        return YOPOpgd.FASTPGD(self.eps, self.sigma, self.nb_iters, self.norm, DEVICE, self.mean, self.std)


def torch_accuracy(output, target, topk=(1,)) -> List[torch.Tensor]:
    '''
    param output, target: should be torch Variable
    '''
    topn = max(topk)
    batch_size = output.size(0)

    _, pred = output.topk(topn, 1, True, True)
    pred = pred.t()

    is_correct = pred.eq(target.view(1, -1).expand_as(pred))

    ans = []
    for i in topk:
        is_correct_i = is_correct[:i].view(-1).float().sum(0, keepdim=True)
        ans.append(is_correct_i.mul_(100.0 / batch_size))

    return ans


class AvgMeter(object):
    name = 'No name'

    def __init__(self, name='No name'):
        self.name = name
        self.reset()

    def reset(self):
        self.sum = 0
        self.mean = 0
        self.num = 0
        self.now = 0

    def update(self, mean_var, count=1):
        if math.isnan(mean_var):
            mean_var = 1e6
            print('Avgmeter getting Nan!')
        self.now = mean_var
        self.num += count

        self.sum += mean_var * count
        self.mean = float(self.sum) / self.num


def load_checkpoint(file_name, net=None, optimizer=None, lr_scheduler=None):
    if os.path.isfile(file_name):
        print("=> loading checkpoint '{}'".format(file_name))
        check_point = torch.load(file_name)
        if net is not None:
            print('Loading network state dict')
            net.load_state_dict(check_point['state_dict'])
        if optimizer is not None:
            print('Loading optimizer state dict')
            optimizer.load_state_dict(check_point['optimizer_state_dict'])
        if lr_scheduler is not None:
            print('Loading lr_scheduler state dict')
            lr_scheduler.load_state_dict(check_point['lr_scheduler_state_dict'])

        return check_point['epoch']
    else:
        print("=> no checkpoint found at '{}'".format(file_name))


def make_symlink(source, link_name):
    if os.path.exists(link_name):
        # print("Link name already exists! Removing '{}' and overwriting".format(link_name))
        os.remove(link_name)
    if os.path.exists(source):
        os.symlink(source, link_name)
        return
    else:
        print('Source path does not exist')


def add_path(path):
    if path not in sys.path:
        print('Adding {}'.format(path))
        sys.path.append(path)


class Hamiltonian(_Loss):

    def __init__(self, layer, reg_cof=1e-4):
|
143 |
+
super(Hamiltonian, self).__init__()
|
144 |
+
self.layer = layer
|
145 |
+
self.reg_cof = 0
|
146 |
+
|
147 |
+
def forward(self, x, p):
|
148 |
+
y = self.layer(x)
|
149 |
+
H = torch.sum(y * p)
|
150 |
+
return H
|
151 |
+
|
152 |
+
class CrossEntropyWithWeightPenlty(_Loss):
|
153 |
+
def __init__(self, module, DEVICE, reg_cof = 1e-4):
|
154 |
+
super(CrossEntropyWithWeightPenlty, self).__init__()
|
155 |
+
|
156 |
+
self.reg_cof = reg_cof
|
157 |
+
self.criterion = nn.CrossEntropyLoss().to(DEVICE)
|
158 |
+
self.module = module
|
159 |
+
|
160 |
+
def __call__(self, pred, label):
|
161 |
+
cross_loss = self.criterion(pred, label)
|
162 |
+
weight_loss = cal_l2_norm(self.module)
|
163 |
+
|
164 |
+
loss = cross_loss + self.reg_cof * weight_loss
|
165 |
+
return loss
|
166 |
+
|
167 |
+
def cal_l2_norm(layer: torch.nn.Module):
|
168 |
+
loss = 0.
|
169 |
+
for name, param in layer.named_parameters():
|
170 |
+
if name == 'weight':
|
171 |
+
loss = loss + 0.5 * torch.norm(param,) ** 2
|
172 |
+
|
173 |
+
return loss
|
174 |
+
|
175 |
+
class FastGradientLayerOneTrainer(object):
|
176 |
+
|
177 |
+
def __init__(self, Hamiltonian_func, param_optimizer,
|
178 |
+
inner_steps=2, sigma = 0.008, eps = 0.03):
|
179 |
+
self.inner_steps = inner_steps
|
180 |
+
self.sigma = sigma
|
181 |
+
self.eps = eps
|
182 |
+
self.Hamiltonian_func = Hamiltonian_func
|
183 |
+
self.param_optimizer = param_optimizer
|
184 |
+
|
185 |
+
def step(self, inp, p, eta):
|
186 |
+
p = p.detach()
|
187 |
+
|
188 |
+
for i in range(self.inner_steps):
|
189 |
+
tmp_inp = inp + eta
|
190 |
+
tmp_inp = torch.clamp(tmp_inp, 0, 1)
|
191 |
+
H = self.Hamiltonian_func(tmp_inp, p)
|
192 |
+
|
193 |
+
eta_grad_sign = torch.autograd.grad(H, eta, only_inputs=True, retain_graph=False)[0].sign()
|
194 |
+
|
195 |
+
eta = eta - eta_grad_sign * self.sigma
|
196 |
+
|
197 |
+
eta = torch.clamp(eta, -1.0 * self.eps, self.eps)
|
198 |
+
eta = torch.clamp(inp + eta, 0.0, 1.0) - inp
|
199 |
+
eta = eta.detach()
|
200 |
+
eta.requires_grad_()
|
201 |
+
eta.retain_grad()
|
202 |
+
|
203 |
+
#self.param_optimizer.zero_grad()
|
204 |
+
|
205 |
+
yofo_inp = eta + inp
|
206 |
+
yofo_inp = torch.clamp(yofo_inp, 0, 1)
|
207 |
+
|
208 |
+
loss = -1.0 * self.Hamiltonian_func(yofo_inp, p)
|
209 |
+
|
210 |
+
loss.backward()
|
211 |
+
#self.param_optimizer.step()
|
212 |
+
#self.param_optimizer.zero_grad()
|
213 |
+
|
214 |
+
return yofo_inp, eta
|
215 |
+
|
216 |
+
def eval_one_epoch(net, batch_generator, DEVICE=torch.device('cuda:0'), AttackMethod = None):
|
217 |
+
net.eval()
|
218 |
+
pbar = tqdm(batch_generator)
|
219 |
+
clean_accuracy = AvgMeter()
|
220 |
+
adv_accuracy = AvgMeter()
|
221 |
+
|
222 |
+
pbar.set_description('Evaluating')
|
223 |
+
for (data, label) in pbar:
|
224 |
+
data = data.to(DEVICE)
|
225 |
+
label = label.to(DEVICE)
|
226 |
+
|
227 |
+
with torch.no_grad():
|
228 |
+
pred = net(data)
|
229 |
+
acc = torch_accuracy(pred, label, (1,))
|
230 |
+
clean_accuracy.update(acc[0].item())
|
231 |
+
|
232 |
+
if AttackMethod is not None:
|
233 |
+
adv_inp = AttackMethod.attack(net, data, label)
|
234 |
+
|
235 |
+
with torch.no_grad():
|
236 |
+
pred = net(adv_inp)
|
237 |
+
acc = torch_accuracy(pred, label, (1,))
|
238 |
+
adv_accuracy.update(acc[0].item())
|
239 |
+
|
240 |
+
pbar_dic = OrderedDict()
|
241 |
+
pbar_dic['CleanAcc'] = '{:.2f}'.format(clean_accuracy.mean)
|
242 |
+
pbar_dic['AdvAcc'] = '{:.2f}'.format(adv_accuracy.mean)
|
243 |
+
|
244 |
+
pbar.set_postfix(pbar_dic)
|
245 |
+
|
246 |
+
adv_acc = adv_accuracy.mean if AttackMethod is not None else 0
|
247 |
+
return clean_accuracy.mean, adv_acc
|
248 |
+
|
249 |
+
|
250 |
+
class SGDOptimizerMaker(object):
|
251 |
+
|
252 |
+
def __init__(self, lr = 0.1, momentum = 0.9, weight_decay = 1e-4):
|
253 |
+
self.lr = lr
|
254 |
+
self.momentum = momentum
|
255 |
+
self.weight_decay = weight_decay
|
256 |
+
|
257 |
+
def __call__(self, params):
|
258 |
+
return torch.optim.SGD(params, lr=self.lr, momentum=self.momentum, weight_decay=self.weight_decay)
|
259 |
+
|
260 |
+
def main():
|
261 |
+
num_epochs = 40
|
262 |
+
val_interval = 1
|
263 |
+
weight_decay = 5e-4
|
264 |
+
|
265 |
+
inner_iters = 10
|
266 |
+
K = 5
|
267 |
+
sigma = 0.01
|
268 |
+
eps = 0.3
|
269 |
+
lr = 1e-2
|
270 |
+
momentum = 0.9
|
271 |
+
create_optimizer = SGDOptimizerMaker(lr =1e-2 / K, momentum = 0.9, weight_decay = weight_decay)
|
272 |
+
|
273 |
+
create_lr_scheduler = PieceWiseConstantLrSchedulerMaker(milestones = [30, 35, 39], gamma = 0.1)
|
274 |
+
|
275 |
+
create_loss_function = None
|
276 |
+
|
277 |
+
create_attack_method = None
|
278 |
+
|
279 |
+
create_evaluation_attack_method = IPGDAttackMethodMaker(eps = 0.3, sigma = 0.01, nb_iters = 40, norm = np.inf,
|
280 |
+
mean=torch.tensor(np.array([0]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]),
|
281 |
+
std=torch.tensor(np.array([1]).astype(np.float32)[np.newaxis, :, np.newaxis, np.newaxis]))
|
282 |
+
|
283 |
+
parser = argparse.ArgumentParser()
|
284 |
+
|
285 |
+
parser.add_argument('--model_dir',default = "./trained_models")
|
286 |
+
parser.add_argument('--resume', default=None, type=str, metavar='PATH',
|
287 |
+
help='path to latest checkpoint (default: none)')
|
288 |
+
parser.add_argument('-b', '--batch_size', default=256, type=int,
|
289 |
+
metavar='N', help='mini-batch size')
|
290 |
+
parser.add_argument('-d', type=int, default=0, help='Which gpu to use')
|
291 |
+
parser.add_argument('-adv_coef', default=1.0, type = float,
|
292 |
+
help = 'Specify the weight for adversarial loss')
|
293 |
+
parser.add_argument('--auto-continue', default=False, action = 'store_true',
|
294 |
+
help = 'Continue from the latest checkpoint')
|
295 |
+
args = parser.parse_args()
|
296 |
+
|
297 |
+
DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")
|
298 |
+
|
299 |
+
net = YOPOCNN.Net()
|
300 |
+
net.to(DEVICE)
|
301 |
+
criterion = CrossEntropyWithWeightPenlty(net.other_layers, DEVICE, weight_decay)#.to(DEVICE)
|
302 |
+
optimizer = create_optimizer(net.other_layers.parameters())
|
303 |
+
lr_scheduler = create_lr_scheduler(optimizer)
|
304 |
+
|
305 |
+
Hamiltonian_func = Hamiltonian(net.layer_one, weight_decay)
|
306 |
+
layer_one_optimizer = optim.SGD(net.layer_one.parameters(), lr = lr_scheduler.get_lr()[0], momentum=0.9, weight_decay=5e-4)
|
307 |
+
lyaer_one_optimizer_lr_scheduler = optim.lr_scheduler.MultiStepLR(layer_one_optimizer,
|
308 |
+
milestones = [15, 19], gamma = 0.1)
|
309 |
+
LayerOneTrainer = FastGradientLayerOneTrainer(Hamiltonian_func, layer_one_optimizer,
|
310 |
+
inner_iters, sigma, eps)
|
311 |
+
|
312 |
+
ds_train = utils.create_train_dataset(args.batch_size)
|
313 |
+
ds_val = utils.create_test_dataset(args.batch_size)
|
314 |
+
|
315 |
+
EvalAttack = create_evaluation_attack_method(DEVICE)
|
316 |
+
|
317 |
+
now_epoch = 0
|
318 |
+
|
319 |
+
if args.auto_continue:
|
320 |
+
args.resume = os.path.join(args.model_dir, 'last.checkpoint')
|
321 |
+
if args.resume is not None and os.path.isfile(args.resume):
|
322 |
+
now_epoch = load_checkpoint(args.resume, net, optimizer,lr_scheduler)
|
323 |
+
|
324 |
+
now_train_time = 0
|
325 |
+
while True:
|
326 |
+
if now_epoch > num_epochs:
|
327 |
+
break
|
328 |
+
now_epoch = now_epoch + 1
|
329 |
+
|
330 |
+
descrip_str = 'Training epoch:{}/{} -- lr:{}'.format(now_epoch, num_epochs,
|
331 |
+
lr_scheduler.get_lr()[0])
|
332 |
+
s_time = time.time()
|
333 |
+
|
334 |
+
#train
|
335 |
+
acc, yopoacc = train_one_epoch(net, ds_train, optimizer, eps, criterion, LayerOneTrainer, K,
|
336 |
+
DEVICE, descrip_str)
|
337 |
+
|
338 |
+
now_train_time = now_train_time + time.time() - s_time
|
339 |
+
tb_train_dic = {'Acc':acc, 'YoPoAcc':yopoacc}
|
340 |
+
print(tb_train_dic)
|
341 |
+
|
342 |
+
lr_scheduler.step()
|
343 |
+
lyaer_one_optimizer_lr_scheduler.step()
|
344 |
+
utils.save_checkpoint(now_epoch, net, optimizer, lr_scheduler,
|
345 |
+
file_name = os.path.join(args.model_dir, 'epoch-{}.checkpoint'.format(now_epoch)))
|
346 |
+
|
347 |
+
def train_one_epoch(net, batch_generator, optimizer, eps,
|
348 |
+
criterion, LayerOneTrainner, K,
|
349 |
+
DEVICE=torch.device('cuda:0'),descrip_str='Training'):
|
350 |
+
'''
|
351 |
+
:param attack_freq: Frequencies of training with adversarial examples. -1 indicates natural training
|
352 |
+
:param AttackMethod: the attack method, None represents natural training
|
353 |
+
:return: None #(clean_acc, adv_acc)
|
354 |
+
'''
|
355 |
+
net.train()
|
356 |
+
pbar = tqdm(batch_generator)
|
357 |
+
yofoacc = -1
|
358 |
+
cleanacc = -1
|
359 |
+
cleanloss = -1
|
360 |
+
pbar.set_description(descrip_str)
|
361 |
+
for i, (data, label) in enumerate(pbar):
|
362 |
+
data = data.to(DEVICE)
|
363 |
+
label = label.to(DEVICE)
|
364 |
+
|
365 |
+
eta = torch.FloatTensor(*data.shape).uniform_(-eps, eps)
|
366 |
+
eta = eta.to(label.device)
|
367 |
+
eta.requires_grad_()
|
368 |
+
|
369 |
+
optimizer.zero_grad()
|
370 |
+
LayerOneTrainner.param_optimizer.zero_grad()
|
371 |
+
|
372 |
+
for j in range(K):
|
373 |
+
pbar_dic = OrderedDict()
|
374 |
+
TotalLoss = 0
|
375 |
+
|
376 |
+
pred = net(data + eta.detach())
|
377 |
+
|
378 |
+
loss = criterion(pred, label)
|
379 |
+
TotalLoss = TotalLoss + loss
|
380 |
+
wgrad = net.conv1.weight.grad
|
381 |
+
TotalLoss.backward()
|
382 |
+
net.conv1.weight.grad = wgrad
|
383 |
+
|
384 |
+
|
385 |
+
p = -1.0 * net.layer_one_out.grad
|
386 |
+
yofo_inp, eta = LayerOneTrainner.step(data, p, eta)
|
387 |
+
|
388 |
+
with torch.no_grad():
|
389 |
+
if j == 0:
|
390 |
+
acc = torch_accuracy(pred, label, (1,))
|
391 |
+
cleanacc = acc[0].item()
|
392 |
+
cleanloss = loss.item()
|
393 |
+
|
394 |
+
if j == K - 1:
|
395 |
+
yofo_pred = net(yofo_inp)
|
396 |
+
yofoacc = torch_accuracy(yofo_pred, label, (1,))[0].item()
|
397 |
+
|
398 |
+
optimizer.step()
|
399 |
+
LayerOneTrainner.param_optimizer.step()
|
400 |
+
optimizer.zero_grad()
|
401 |
+
LayerOneTrainner.param_optimizer.zero_grad()
|
402 |
+
pbar_dic['Acc'] = '{:.2f}'.format(cleanacc)
|
403 |
+
pbar_dic['loss'] = '{:.2f}'.format(cleanloss)
|
404 |
+
pbar_dic['YoPoAcc'] = '{:.2f}'.format(yofoacc)
|
405 |
+
pbar.set_postfix(pbar_dic)
|
406 |
+
|
407 |
+
return cleanacc, yofoacc
|
408 |
+
|
409 |
+
if __name__ == "__main__":
|
410 |
+
main()
|
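The core trick in train_one_epoch and FastGradientLayerOneTrainer.step above is that only the first forward/backward pass propagates through the whole network; the inner loop then refines the perturbation using just the first layer and the saved co-state p = -grad of the loss w.r.t. the layer-one output. A minimal, self-contained sketch of that decoupling, assuming a toy conv-plus-linear model; the layer sizes, batch, and hyperparameters are illustrative placeholders, not the module's defaults:

import torch
import torch.nn as nn

layer_one = nn.Conv2d(1, 8, 3, padding=1)
rest = nn.Sequential(nn.Flatten(), nn.Linear(8 * 28 * 28, 10))

x = torch.rand(4, 1, 28, 28)              # a fake MNIST batch
y = torch.randint(0, 10, (4,))
eps, sigma = 0.3, 0.01

eta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_()

# One full forward/backward pass: the only time the whole network is propagated.
feat = layer_one(x + eta.detach())
feat.retain_grad()                        # corresponds to net.layer_one_out in YOPO.py
loss = nn.functional.cross_entropy(rest(feat), y)
loss.backward()
p = -feat.grad                            # co-state entering the Hamiltonian

# Inner steps touch only layer_one: H = <layer_one(x + eta), p>, as in Hamiltonian.forward.
for _ in range(2):
    H = (layer_one(torch.clamp(x + eta, 0, 1)) * p).sum()
    eta = (eta - torch.autograd.grad(H, eta)[0].sign() * sigma).clamp(-eps, eps)
    eta = (torch.clamp(x + eta, 0, 1) - x).detach().requires_grad_()

Because each inner step only evaluates layer_one, the K outer iterations each cost roughly one full backward pass, which is where YOPO's speedup over standard PGD training comes from.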
deeprobust/image/defense/__init__.py
ADDED
@@ -0,0 +1,6 @@
from deeprobust.image.defense import base_defense
from deeprobust.image.defense import pgdtraining
from deeprobust.image.defense import fgsmtraining
from deeprobust.image.defense import TherEncoding
from deeprobust.image.defense import trades
from deeprobust.image.defense import YOPO
deeprobust/image/defense/base_defense.py
ADDED
@@ -0,0 +1,100 @@
from abc import ABCMeta
import torch

class BaseDefense(object):
    """
    Defense base class.
    """

    __metaclass__ = ABCMeta

    def __init__(self, model, device):
        self.model = model
        self.device = device

    def parse_params(self, **kwargs):
        """
        Parse user-defined parameters.
        """
        return True

    def generate(self, train_loader, test_loader, **kwargs):
        """generate.

        Parameters
        ----------
        train_loader :
            training data
        test_loader :
            testing data
        kwargs :
            user-defined parameters
        """
        self.train_loader = train_loader
        self.test_loader = test_loader
        return

    def train(self, train_loader, optimizer, epoch):
        """train.

        Parameters
        ----------
        train_loader :
            training data
        optimizer :
            training optimizer
        epoch :
            training epoch
        """
        return True

    def test(self, test_loader):
        """test.

        Parameters
        ----------
        test_loader :
            testing data
        """
        return True

    def adv_data(self, model, data, target, **kwargs):
        """
        Generate adversarial examples for adversarial training.
        Override this function to generate customized adversarial examples.

        Parameters
        ----------
        model :
            victim model
        data :
            original data
        target :
            target labels
        kwargs :
            parameters
        """
        return True

    def loss(self, output, target):
        """
        Calculate training loss.
        Override this function to customize the loss.

        Parameters
        ----------
        output :
            model outputs
        target :
            true labels
        """
        return True

    def generate(self):
        # note: this zero-argument generate() shadows the generate() defined above
        return True

    def save_model(self):
        """
        Save model.
        """
        return True
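BaseDefense fixes the template every defense in this package follows: parse_params consumes keyword arguments, generate drives the epoch loop, train/test run one pass each, and adv_data crafts the training examples. A minimal sketch of a custom subclass, assuming only the class above; GaussianNoiseDefense and all of its parameters are hypothetical names used for illustration, not part of DeepRobust:

import torch
import torch.nn.functional as F
from deeprobust.image.defense.base_defense import BaseDefense

class GaussianNoiseDefense(BaseDefense):
    """Hypothetical defense: trains on Gaussian-noise-augmented inputs."""

    def __init__(self, model, device='cuda'):
        self.model = model
        self.device = device if torch.cuda.is_available() else 'cpu'

    def parse_params(self, epochs=5, lr=1e-3, std=0.1):
        self.epochs, self.lr, self.std = epochs, lr, std

    def adv_data(self, model, data, target, **kwargs):
        # "adversarial" data is just noisy data in this toy example
        return (data + self.std * torch.randn_like(data)).clamp(0, 1)

    def generate(self, train_loader, test_loader, **kwargs):
        self.parse_params(**kwargs)
        optimizer = torch.optim.Adam(self.model.parameters(), lr=self.lr)
        for epoch in range(1, self.epochs + 1):
            self.train(train_loader, optimizer, epoch)
        return self.model

    def train(self, train_loader, optimizer, epoch):
        self.model.train()
        for data, target in train_loader:
            data, target = data.to(self.device), target.to(self.device)
            optimizer.zero_grad()
            loss = F.cross_entropy(self.model(self.adv_data(self.model, data, target)), target)
            loss.backward()
            optimizer.step()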
deeprobust/image/defense/fast.py
ADDED
@@ -0,0 +1,169 @@
"""
This is an implementation of the adversarial training variant Fast.

References
----------
.. [1] Wong, Eric, Leslie Rice, and J. Zico Kolter. "Fast is better than free: Revisiting adversarial training." arXiv preprint arXiv:2001.03994 (2020).
"""

import os

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision
from torchvision import datasets, transforms
from torch.utils.data import DataLoader, Dataset
from torch import optim

from deeprobust.image.defense.base_defense import BaseDefense
from deeprobust.image.attack.fgsm import FGSM

class Fast(BaseDefense):
    def __init__(self, model, device):
        if not torch.cuda.is_available():
            print('CUDA not available, using cpu...')
            self.device = 'cpu'
        else:
            self.device = device

        self.model = model

    def generate(self, train_loader, test_loader, **kwargs):
        """
        Fast adversarial training process.
        """
        self.parse_params(**kwargs)
        torch.manual_seed(100)
        device = torch.device(self.device)
        optimizer = optim.Adam(self.model.parameters(), self.lr_train)

        for epoch in range(1, self.epoch_num + 1):

            print(epoch, flush = True)
            self.train(self.device, train_loader, optimizer, epoch)
            self.test(self.model, self.device, test_loader)

            if (self.save_model):
                if os.path.isdir('./' + self.save_dir):
                    torch.save(self.model.state_dict(), os.path.join(self.save_dir, self.save_name))
                    print("model saved in " + './' + self.save_dir)
                else:
                    print("make new directory and save model in " + './' + self.save_dir)
                    os.mkdir('./' + self.save_dir)
                    torch.save(self.model.state_dict(), os.path.join(self.save_dir, self.save_name))

        return self.model

    def parse_params(self,
                     save_dir = "defense_models",
                     save_model = True,
                     save_name = "fast_mnist_fgsmtraining_0.2.pt",
                     epsilon = 0.2,
                     epoch_num = 30,
                     lr_train = 0.005,
                     momentum = 0.1):
        """
        Set parameters for fast training.
        """
        self.save_model = True   # note: the save_model argument is ignored; saving is always enabled
        self.save_dir = save_dir
        self.save_name = save_name
        self.epsilon = epsilon
        self.epoch_num = epoch_num
        self.lr_train = lr_train
        self.momentum = momentum

    def train(self, device, train_loader, optimizer, epoch):
        """
        Training process.
        """
        self.model.train()
        correct = 0
        bs = train_loader.batch_size

        for batch_idx, (data, target) in enumerate(train_loader):

            optimizer.zero_grad()

            data, target = data.to(device), target.to(device)

            data_adv, output = self.adv_data(data, target, ep = self.epsilon)

            loss = self.calculate_loss(output, target)

            loss.backward()
            optimizer.step()

            pred = output.argmax(dim = 1, keepdim = True)
            correct += pred.eq(target.view_as(pred)).sum().item()

            # print every 10 batches
            if batch_idx % 10 == 0:
                print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}\tAccuracy:{:.2f}%'.format(
                      epoch, batch_idx * len(data), len(train_loader.dataset),
                      100. * batch_idx / len(train_loader), loss.item(), 100 * correct / (10 * bs)))
                correct = 0


    def test(self, model, device, test_loader):
        """
        Testing process.
        """
        model.eval()

        test_loss = 0
        correct = 0
        test_loss_adv = 0
        correct_adv = 0
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)

            # clean accuracy
            output = model(data)
            test_loss += F.nll_loss(output, target, reduction='sum').item()  # sum up batch loss
            pred = output.argmax(dim = 1, keepdim = True)  # get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()

            # adversarial accuracy
            data_adv, output_adv = self.adv_data(data, target, ep = self.epsilon)

            test_loss_adv += self.calculate_loss(output_adv, target, redmode = 'sum').item()  # sum up batch loss
            pred_adv = output_adv.argmax(dim = 1, keepdim = True)  # get the index of the max log-probability
            correct_adv += pred_adv.eq(target.view_as(pred_adv)).sum().item()

        test_loss /= len(test_loader.dataset)
        test_loss_adv /= len(test_loader.dataset)

        print('\nTest set: Clean loss: {:.3f}, Clean Accuracy: {}/{} ({:.0f}%)\n'.format(
              test_loss, correct, len(test_loader.dataset),
              100. * correct / len(test_loader.dataset)))

        print('\nTest set: Adv loss: {:.3f}, Adv Accuracy: {}/{} ({:.0f}%)\n'.format(
              test_loss_adv, correct_adv, len(test_loader.dataset),
              100. * correct_adv / len(test_loader.dataset)))

    def adv_data(self, data, output, ep = 0.3, num_steps = 40):
        """
        Generate (adversarial) input data for training: a random restart
        followed by a single FGSM step. The `output` argument holds the labels.
        """
        delta = torch.zeros_like(data).uniform_(-ep, ep).to(self.device)
        data = delta + data

        adversary = FGSM(self.model)
        data_adv = adversary.generate(data, output.flatten(), epsilon = ep)
        output = self.model(data_adv)

        return data_adv, output

    def calculate_loss(self, output, target, redmode = 'mean'):
        """
        Calculate loss for training.
        """
        loss = F.nll_loss(output, target, reduction = redmode)
        return loss
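Given that generate/parse_params interface, driving the defense end to end looks like the sketch below; the same pattern applies to FGSMtraining and PGDtraining further down. This assumes DeepRobust's MNIST CNN (`deeprobust.image.netmodels.CNN.Net`, taken here with its default constructor) and plain torchvision loaders; the epsilon and epoch_num keywords are simply forwarded to parse_params above:

import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from deeprobust.image.netmodels.CNN import Net
from deeprobust.image.defense.fast import Fast

model = Net()
train_loader = DataLoader(
    datasets.MNIST('./data', train=True, download=True, transform=transforms.ToTensor()),
    batch_size=128, shuffle=True)
test_loader = DataLoader(
    datasets.MNIST('./data', train=False, transform=transforms.ToTensor()),
    batch_size=1000)

defense = Fast(model, 'cuda')
# keyword arguments are forwarded to parse_params() above
robust_model = defense.generate(train_loader, test_loader, epsilon=0.2, epoch_num=30)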
deeprobust/image/defense/fgsmtraining.py
ADDED
@@ -0,0 +1,227 @@
"""
This is an implementation of FGSM adversarial training.

References
----------
.. [1] Szegedy, C., Zaremba, W., Sutskever, I., Estrach, J. B., Erhan, D., Goodfellow, I., & Fergus, R. (2014, January).
       Intriguing properties of neural networks.
"""

import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
import torch.nn.functional as F

import numpy as np
from PIL import Image
import os

from deeprobust.image.netmodels import CNN
from deeprobust.image.attack.fgsm import FGSM
from deeprobust.image.defense.base_defense import BaseDefense

class FGSMtraining(BaseDefense):
    """
    FGSM adversarial training.
    """

    def __init__(self, model, device):
        if not torch.cuda.is_available():
            print('CUDA not available, using cpu...')
            self.device = 'cpu'
        else:
            self.device = device

        self.model = model

    def generate(self, train_loader, test_loader, **kwargs):
        """FGSM adversarial training process.

        Parameters
        ----------
        train_loader :
            training data loader
        test_loader :
            testing data loader
        kwargs :
            kwargs
        """
        self.parse_params(**kwargs)
        torch.manual_seed(100)
        device = torch.device(self.device)
        optimizer = optim.Adam(self.model.parameters(), self.lr_train)

        for epoch in range(1, self.epoch_num + 1):

            print(epoch, flush = True)
            self.train(self.device, train_loader, optimizer, epoch)
            self.test(self.model, self.device, test_loader)

            if (self.save_model):
                if os.path.isdir('./' + self.save_dir):
                    torch.save(self.model.state_dict(), os.path.join(self.save_dir, self.save_name))
                    print("model saved in " + './' + self.save_dir)
                else:
                    print("make new directory and save model in " + './' + self.save_dir)
                    os.mkdir('./' + self.save_dir)
                    torch.save(self.model.state_dict(), os.path.join(self.save_dir, self.save_name))

        return self.model

    def parse_params(self,
                     save_dir = "defense_models",
                     save_model = True,
                     save_name = "mnist_fgsmtraining_0.2.pt",
                     epsilon = 0.2,
                     epoch_num = 50,
                     lr_train = 0.005,
                     momentum = 0.1):
        """parse_params.

        Parameters
        ----------
        save_dir :
            save directory
        save_model :
            whether to save the model
        save_name :
            model name
        epsilon :
            attack perturbation constraint
        epoch_num :
            number of training epochs
        lr_train :
            training learning rate
        momentum :
            momentum for the optimizer
        """
        self.save_model = True   # note: the save_model argument is ignored; saving is always enabled
        self.save_dir = save_dir
        self.save_name = save_name
        self.epsilon = epsilon
        self.epoch_num = epoch_num
        self.lr_train = lr_train
        self.momentum = momentum

    def train(self, device, train_loader, optimizer, epoch):
        """
        Training process.

        Parameters
        ----------
        device :
            device
        train_loader :
            training data loader
        optimizer :
            optimizer
        epoch :
            training epoch
        """
        self.model.train()
        correct = 0
        bs = train_loader.batch_size

        for batch_idx, (data, target) in enumerate(train_loader):

            optimizer.zero_grad()

            data, target = data.to(device), target.to(device)

            data_adv, output = self.adv_data(data, target, ep = self.epsilon)

            loss = self.calculate_loss(output, target)

            loss.backward()
            optimizer.step()

            pred = output.argmax(dim = 1, keepdim = True)
            correct += pred.eq(target.view_as(pred)).sum().item()

            # print every 10 batches
            if batch_idx % 10 == 0:
                print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}\tAccuracy:{:.2f}%'.format(
                      epoch, batch_idx * len(data), len(train_loader.dataset),
                      100. * batch_idx / len(train_loader), loss.item(), 100 * correct / (10 * bs)))
                correct = 0


    def test(self, model, device, test_loader):
        """
        Testing process.

        Parameters
        ----------
        model :
            model
        device :
            device
        test_loader :
            testing dataloader
        """
        model.eval()

        test_loss = 0
        correct = 0
        test_loss_adv = 0
        correct_adv = 0
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)

            # clean accuracy
            output = model(data)
            test_loss += F.cross_entropy(output, target, reduction='sum').item()  # sum up batch loss
            pred = output.argmax(dim = 1, keepdim = True)  # get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()

            # adversarial accuracy
            data_adv, output_adv = self.adv_data(data, target, ep = self.epsilon)

            test_loss_adv += self.calculate_loss(output_adv, target, redmode = 'sum').item()  # sum up batch loss
            pred_adv = output_adv.argmax(dim = 1, keepdim = True)  # get the index of the max log-probability
            correct_adv += pred_adv.eq(target.view_as(pred_adv)).sum().item()

        test_loss /= len(test_loader.dataset)
        test_loss_adv /= len(test_loader.dataset)

        print('\nTest set: Clean loss: {:.3f}, Clean Accuracy: {}/{} ({:.0f}%)\n'.format(
              test_loss, correct, len(test_loader.dataset),
              100. * correct / len(test_loader.dataset)))

        print('\nTest set: Adv loss: {:.3f}, Adv Accuracy: {}/{} ({:.0f}%)\n'.format(
              test_loss_adv, correct_adv, len(test_loader.dataset),
              100. * correct_adv / len(test_loader.dataset)))

    def adv_data(self, data, output, ep = 0.3, num_steps = 40):
        """Generate adversarial data for training; the `output` argument holds the labels.

        Parameters
        ----------
        data :
            data
        output :
            labels
        ep :
            epsilon, perturbation budget
        num_steps :
            iteration steps
        """
        adversary = FGSM(self.model)
        data_adv = adversary.generate(data, output.flatten(), epsilon = ep)
        output = self.model(data_adv)

        return data_adv, output

    def calculate_loss(self, output, target, redmode = 'mean'):
        """
        Calculate loss for training.
        """
        loss = F.cross_entropy(output, target, reduction = redmode)
        return loss
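The attack step this trainer builds on is the fast gradient sign method: one signed-gradient move of size epsilon. A standalone sketch of that update (the real attack lives in deeprobust.image.attack.fgsm.FGSM; fgsm_step here is an illustrative stand-in):

import torch
import torch.nn.functional as F

def fgsm_step(model, x, y, epsilon=0.2):
    """One FGSM step: x_adv = clip(x + epsilon * sign(grad_x CE(model(x), y)))."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    return (x + epsilon * grad.sign()).clamp(0, 1).detach()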
deeprobust/image/defense/pgdtraining.py
ADDED
@@ -0,0 +1,229 @@
"""
This is an implementation of PGD adversarial training.

References
----------
.. [1] Mądry, A., Makelov, A., Schmidt, L., Tsipras, D., & Vladu, A. (2017).
       Towards Deep Learning Models Resistant to Adversarial Attacks. stat, 1050, 9.
"""

import os
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
import torch.nn.functional as F

import numpy as np
from PIL import Image
from deeprobust.image.attack.pgd import PGD
from deeprobust.image.netmodels.CNN import Net
from deeprobust.image.defense.base_defense import BaseDefense


class PGDtraining(BaseDefense):
    """
    PGD adversarial training.
    """

    def __init__(self, model, device):
        if not torch.cuda.is_available():
            print('CUDA not available, using cpu...')
            self.device = 'cpu'
        else:
            self.device = device

        self.model = model

    def generate(self, train_loader, test_loader, **kwargs):
        """Call this function to generate a robust model.

        Parameters
        ----------
        train_loader :
            training data loader
        test_loader :
            testing data loader
        kwargs :
            kwargs
        """
        self.parse_params(**kwargs)

        torch.manual_seed(100)
        device = torch.device(self.device)

        optimizer = optim.Adam(self.model.parameters(), self.lr)
        scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[75, 100], gamma = 0.1)
        save_model = True
        for epoch in range(1, self.epoch + 1):
            print('Training epoch: ', epoch, flush = True)
            self.train(self.device, train_loader, optimizer, epoch)
            self.test(self.model, self.device, test_loader)

            if (self.save_model and epoch % self.save_per_epoch == 0):
                if os.path.isdir(str(self.save_dir)):
                    torch.save(self.model.state_dict(), os.path.join(self.save_dir, self.save_name + '_epoch' + str(epoch) + '.pth'))
                    print("model saved in " + str(self.save_dir))
                else:
                    print("make new directory and save model in " + str(self.save_dir))
                    os.mkdir('./' + str(self.save_dir))
                    torch.save(self.model.state_dict(), os.path.join(self.save_dir, self.save_name + '_epoch' + str(epoch) + '.pth'))

            scheduler.step()

        return self.model

    def parse_params(self,
                     epoch_num = 100,
                     save_dir = "./defense_models",
                     save_name = "mnist_pgdtraining_0.3",
                     save_model = True,
                     epsilon = 8.0 / 255.0,
                     num_steps = 10,
                     perturb_step_size = 0.01,
                     lr = 0.1,
                     momentum = 0.1,
                     save_per_epoch = 10):
        """Parameter parser.

        Parameters
        ----------
        epoch_num : int
            number of training epochs
        save_dir : str
            model dir
        save_name : str
            model name
        save_model : bool
            whether to save the model
        epsilon : float
            attack constraint
        num_steps : int
            number of PGD attack iterations
        perturb_step_size : float
            perturbation step size
        lr : float
            learning rate for the adversarial training process
        momentum : float
            momentum for the optimizer
        save_per_epoch : int
            save a checkpoint every this many epochs
        """
        self.epoch = epoch_num
        self.save_model = True   # note: the save_model argument is ignored; saving is always enabled
        self.save_dir = save_dir
        self.save_name = save_name
        self.epsilon = epsilon
        self.num_steps = num_steps
        self.perturb_step_size = perturb_step_size
        self.lr = lr
        self.momentum = momentum
        self.save_per_epoch = save_per_epoch

    def train(self, device, train_loader, optimizer, epoch):
        """
        Training process.

        Parameters
        ----------
        device :
            device
        train_loader :
            training data loader
        optimizer :
            optimizer
        epoch :
            training epoch
        """

        self.model.train()
        correct = 0
        bs = train_loader.batch_size
        for batch_idx, (data, target) in enumerate(train_loader):

            optimizer.zero_grad()

            data, target = data.to(device), target.to(device)

            data_adv, output = self.adv_data(data, target, ep = self.epsilon, num_steps = self.num_steps, perturb_step_size = self.perturb_step_size)
            loss = self.calculate_loss(output, target)

            loss.backward()
            optimizer.step()

            pred = output.argmax(dim = 1, keepdim = True)
            correct += pred.eq(target.view_as(pred)).sum().item()

            # print every 20 batches
            if batch_idx % 20 == 0:
                print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}\tAccuracy:{:.2f}%'.format(
                      epoch, batch_idx * len(data), len(train_loader.dataset),
                      100. * batch_idx / len(train_loader), loss.item(), 100 * correct / (bs)))
                correct = 0


    def test(self, model, device, test_loader):
        """
        Testing process.

        Parameters
        ----------
        model :
            model
        device :
            device
        test_loader :
            testing dataloader
        """
        model.eval()

        test_loss = 0
        correct = 0
        test_loss_adv = 0
        correct_adv = 0
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)

            # clean accuracy
            output = model(data)
            test_loss += F.cross_entropy(output, target, reduction='sum').item()  # sum up batch loss
            pred = output.argmax(dim = 1, keepdim = True)  # get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()

            # adversarial accuracy
            data_adv, output_adv = self.adv_data(data, target, ep = self.epsilon, num_steps = self.num_steps)

            test_loss_adv += self.calculate_loss(output_adv, target, redmode = 'sum').item()  # sum up batch loss
            pred_adv = output_adv.argmax(dim = 1, keepdim = True)  # get the index of the max log-probability
            correct_adv += pred_adv.eq(target.view_as(pred_adv)).sum().item()

        test_loss /= len(test_loader.dataset)
        test_loss_adv /= len(test_loader.dataset)

        print('\nTest set: Clean loss: {:.3f}, Clean Accuracy: {}/{} ({:.0f}%)\n'.format(
              test_loss, correct, len(test_loader.dataset),
              100. * correct / len(test_loader.dataset)))

        print('\nTest set: Adv loss: {:.3f}, Adv Accuracy: {}/{} ({:.0f}%)\n'.format(
              test_loss_adv, correct_adv, len(test_loader.dataset),
              100. * correct_adv / len(test_loader.dataset)))

    def adv_data(self, data, output, ep = 0.3, num_steps = 10, perturb_step_size = 0.01):
        """
        Generate (adversarial) input data for training; the `output` argument holds the labels.
        """

        adversary = PGD(self.model)
        data_adv = adversary.generate(data, output.flatten(), epsilon = ep, num_steps = num_steps, step_size = perturb_step_size)
        output = self.model(data_adv)

        return data_adv, output

    def calculate_loss(self, output, target, redmode = 'mean'):
        """
        Calculate loss for training.
        """
        loss = F.cross_entropy(output, target, reduction = redmode)
        return loss
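PGD iterates the same signed-gradient move from a random start and projects back into the epsilon-ball after every step. A standalone sketch of the attack that adv_data above delegates to (the real implementation is deeprobust.image.attack.pgd.PGD; pgd_attack is an illustrative stand-in):

import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, epsilon=0.3, step_size=0.01, num_steps=10):
    # random start inside the epsilon-ball
    x_adv = (x + torch.empty_like(x).uniform_(-epsilon, epsilon)).clamp(0, 1)
    for _ in range(num_steps):
        x_adv = x_adv.detach().requires_grad_(True)
        grad = torch.autograd.grad(F.cross_entropy(model(x_adv), y), x_adv)[0]
        x_adv = x_adv + step_size * grad.sign()
        # project back into the epsilon-ball around x, then into the valid pixel range
        x_adv = torch.min(torch.max(x_adv, x - epsilon), x + epsilon).clamp(0, 1)
    return x_adv.detach()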
deeprobust/image/defense/trades.py
ADDED
@@ -0,0 +1,241 @@
"""
This is an implementation of TRADES [1].

References
----------
.. [1] Zhang, H., Yu, Y., Jiao, J., Xing, E., El Ghaoui, L., & Jordan, M. (2019, May).
       Theoretically Principled Trade-off between Robustness and Accuracy.
       In International Conference on Machine Learning (pp. 7472-7482).

This implementation is based on their code: https://github.com/yaodongyu/TRADES
Copyright (c) 2019 Hongyang Zhang, Yaodong Yu
"""

import os

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable
import torch.optim as optim
from torchvision import datasets, transforms

from deeprobust.image.defense.base_defense import BaseDefense
from deeprobust.image.netmodels.CNN import Net
from deeprobust.image.utils import adjust_learning_rate

class TRADES(BaseDefense):
    """TRADES.
    """

    def __init__(self, model, device = 'cuda'):
        if not torch.cuda.is_available():
            print('CUDA not available, using cpu...')
            self.device = 'cpu'
        else:
            self.device = device

        self.model = model.to(self.device)

    def generate(self, train_loader, test_loader, **kwargs):
        """Generate a robust model.

        Parameters
        ----------
        train_loader :
            train_loader
        test_loader :
            test_loader
        kwargs :
            kwargs
        """

        self.parse_params(**kwargs)

        torch.manual_seed(self.seed)

        loader_kwargs = {'num_workers': 1, 'pin_memory': True} if (self.device == 'cuda') else {}

        # init model; Net() can also be used here for training
        optimizer = optim.SGD(self.model.parameters(), lr = self.lr, momentum = self.momentum)

        for epoch in range(1, self.epochs + 1):
            # adjust learning rate for SGD
            optimizer = adjust_learning_rate(optimizer, epoch, self.lr)

            # adversarial training
            self.train(self.device, train_loader, optimizer, epoch)

            # evaluation on natural examples
            self.test(self.model, self.device, test_loader)

            # save checkpoint
            if not os.path.exists(self.save_dir):
                os.makedirs(self.save_dir)
            if epoch % self.save_freq == 0:
                torch.save(self.model.state_dict(),
                           os.path.join(self.save_dir, 'trade_model-nn-epoch{}.pt'.format(epoch)))
                torch.save(optimizer.state_dict(),
                           os.path.join(self.save_dir, 'opt-nn-checkpoint_epoch{}.tar'.format(epoch)))

    def parse_params(self,
                     epochs = 100,
                     lr = 0.01,
                     momentum = 0.9,
                     epsilon = 0.3,
                     num_steps = 40,
                     step_size = 0.01,
                     beta = 1.0,
                     seed = 1,
                     log_interval = 100,
                     save_dir = "./defense_model",
                     save_freq = 10):
        """
        :param epochs : int
            - number of training epochs
        :param lr : float
            - learning rate for the training process
        :param momentum : float
            - momentum for the optimizer
        :param epsilon : float
            - perturbation constraint of the PGD adversarial examples used to train the defense model
        :param num_steps : int
            - number of perturbation steps
        :param step_size : float
            - step size of each perturbation step
        :param beta : float
            - trade-off weight between the natural and robust loss terms
        :param save_dir : str
            - directory path to save the model
        """
        self.epochs = epochs
        self.lr = lr
        self.momentum = momentum
        self.epsilon = epsilon
        self.num_steps = num_steps
        self.step_size = step_size
        self.beta = beta
        self.seed = seed
        self.log_interval = log_interval
        self.save_dir = save_dir
        self.save_freq = save_freq

    def test(self, model, device, test_loader):
        model.eval()
        test_loss = 0
        correct = 0

        with torch.no_grad():
            for data, target in test_loader:
                data, target = data.to(device), target.to(device)
                output = model(data)
                test_loss += F.cross_entropy(output, target, reduction='sum').item()  # 'sum' replaces the deprecated size_average=False
                pred = output.max(1, keepdim=True)[1]
                correct += pred.eq(target.view_as(pred)).sum().item()
        test_loss /= len(test_loader.dataset)
        print('Test: Clean loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)'.format(
              test_loss, correct, len(test_loader.dataset),
              100. * correct / len(test_loader.dataset)))
        test_accuracy = correct / len(test_loader.dataset)

        return test_loss, test_accuracy

    def train(self, device, train_loader, optimizer, epoch):
        self.model.train()
        for batch_idx, (data, target) in enumerate(train_loader):

            optimizer.zero_grad()

            data, target = data.to(self.device), target.to(self.device)

            # calculate robust loss
            loss = self.trades_loss(model = self.model,
                                    x_natural = data,
                                    y = target,
                                    optimizer = optimizer,
                                    step_size = self.step_size,
                                    epsilon = self.epsilon,
                                    perturb_steps = self.num_steps,
                                    beta = self.beta)

            loss.backward()
            optimizer.step()

            # print progress
            if batch_idx % self.log_interval == 0:
                print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                      epoch, batch_idx * len(data), len(train_loader.dataset),
                      100. * batch_idx / len(train_loader), loss.item()))

    def trades_loss(self,
                    model,
                    x_natural,
                    y,
                    optimizer,
                    step_size = 0.003,
                    epsilon = 0.031,
                    perturb_steps = 10,
                    beta = 1.0,
                    distance = 'l_inf'):

        # define KL-loss
        criterion_kl = nn.KLDivLoss(reduction='sum')  # 'sum' replaces the deprecated size_average=False
        model.eval()
        batch_size = len(x_natural)

        # generate adversarial example; .to(x_natural.device) replaces the original .cuda() so CPU runs also work
        x_adv = x_natural.detach() + 0.001 * torch.randn(x_natural.shape).to(x_natural.device).detach()

        if distance == 'l_inf':
            for _ in range(perturb_steps):
                x_adv.requires_grad_()
                with torch.enable_grad():
                    loss_kl = criterion_kl(F.log_softmax(model(x_adv), dim=1),
                                           F.softmax(model(x_natural), dim=1))
                grad = torch.autograd.grad(loss_kl, [x_adv])[0]
                x_adv = x_adv.detach() + step_size * torch.sign(grad.detach())
                x_adv = torch.min(torch.max(x_adv, x_natural - epsilon), x_natural + epsilon)
                x_adv = torch.clamp(x_adv, 0.0, 1.0)

        elif distance == 'l_2':

            delta = 0.001 * torch.randn(x_natural.shape).to(x_natural.device).detach()
            delta = Variable(delta.data, requires_grad=True)

            # setup optimizer for delta
            optimizer_delta = optim.SGD([delta], lr=epsilon / perturb_steps * 2)

            for _ in range(perturb_steps):
                adv = x_natural + delta

                # optimize
                optimizer_delta.zero_grad()
                with torch.enable_grad():
                    loss = (-1) * criterion_kl(F.log_softmax(model(adv), dim=1),
                                               F.softmax(model(x_natural), dim=1))
                loss.backward()
                # renorm gradient
                grad_norms = delta.grad.view(batch_size, -1).norm(p=2, dim=1)
                delta.grad.div_(grad_norms.view(-1, 1, 1, 1))
                # avoid nan or inf if gradient is 0
                if (grad_norms == 0).any():
                    delta.grad[grad_norms == 0] = torch.randn_like(delta.grad[grad_norms == 0])
                optimizer_delta.step()

                # projection
                delta.data.add_(x_natural)
                delta.data.clamp_(0, 1).sub_(x_natural)
                delta.data.renorm_(p=2, dim=0, maxnorm=epsilon)
            x_adv = Variable(x_natural + delta, requires_grad=False)
        else:
            x_adv = torch.clamp(x_adv, 0.0, 1.0)
        model.train()

        x_adv = Variable(torch.clamp(x_adv, 0.0, 1.0), requires_grad=False)
        # zero gradient
        optimizer.zero_grad()
        # calculate robust loss
        logits = model(x_natural)
        loss_natural = F.cross_entropy(logits, y)
        loss_robust = (1.0 / batch_size) * criterion_kl(F.log_softmax(model(x_adv), dim=1),
                                                        F.softmax(model(x_natural), dim=1))
        loss = loss_natural + beta * loss_robust
        return loss
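The value returned by trades_loss is the TRADES objective: clean cross-entropy plus a KL term, weighted by beta, that pulls the prediction on the worst-case neighbor x' back toward the prediction on the clean input (the neighborhood is an l-infinity or l-2 ball depending on the distance argument):

    \min_{f}\; \mathbb{E}_{(x,y)}\Big[\, \mathrm{CE}\big(f(x),\, y\big) \;+\; \beta \max_{\|x' - x\| \le \epsilon} \mathrm{KL}\big(f(x) \,\|\, f(x')\big) \Big]

Larger beta buys robustness at the cost of clean accuracy; beta = 1.0 is the default in parse_params above.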
deeprobust/image/optimizer.py
ADDED
@@ -0,0 +1,914 @@
"""
This module includes the following optimizers:

1. differential_evolution:
   The differential evolution global optimization algorithm,
   https://github.com/scipy/scipy/blob/70e61dee181de23fdd8d893eaa9491100e2218d7/scipy/optimize/_differentialevolution.py
   modified by:
   https://github.com/DebangLi/one-pixel-attack-pytorch/blob/master/differential_evolution.py

2. Basic Adam Optimizer
"""

from __future__ import division, print_function, absolute_import
import numpy as np
from scipy.optimize import OptimizeResult, minimize
from scipy.optimize.optimize import _status_message
from scipy._lib._util import check_random_state
import warnings


__all__ = ['differential_evolution', 'AdamOptimizer']

_MACHEPS = np.finfo(np.float64).eps


def differential_evolution(func, bounds, args=(), strategy='best1bin',
                           maxiter=1000, popsize=15, tol=0.01,
                           mutation=(0.5, 1), recombination=0.7, seed=None,
                           callback=None, disp=False, polish=True,
                           init='latinhypercube', atol=0):
    """Finds the global minimum of a multivariate function.

    Differential Evolution is stochastic in nature (it does not use gradient
    methods) to find the minimum, and can search large areas of candidate
    space, but often requires larger numbers of function evaluations than
    conventional gradient-based techniques.
    The algorithm is due to Storn and Price [1]_.

    Parameters
    ----------
    func : callable
        The objective function to be minimized. Must be in the form
        ``f(x, *args)``, where ``x`` is the argument in the form of a 1-D array
        and ``args`` is a tuple of any additional fixed parameters needed to
        completely specify the function.
    bounds : sequence
        Bounds for variables. ``(min, max)`` pairs for each element in ``x``,
        defining the lower and upper bounds for the optimizing argument of
        `func`. It is required to have ``len(bounds) == len(x)``.
        ``len(bounds)`` is used to determine the number of parameters in ``x``.
    args : tuple, optional
        Any additional fixed parameters needed to
        completely specify the objective function.
    strategy : str, optional
        The differential evolution strategy to use. Should be one of:

        - 'best1bin'
        - 'best1exp'
        - 'rand1exp'
        - 'randtobest1exp'
        - 'currenttobest1exp'
        - 'best2exp'
        - 'rand2exp'
        - 'randtobest1bin'
        - 'currenttobest1bin'
        - 'best2bin'
        - 'rand2bin'
        - 'rand1bin'

        The default is 'best1bin'.
    maxiter : int, optional
        The maximum number of generations over which the entire population is
        evolved. The maximum number of function evaluations (with no polishing)
        is: ``(maxiter + 1) * popsize * len(x)``
    popsize : int, optional
        A multiplier for setting the total population size. The population has
        ``popsize * len(x)`` individuals (unless the initial population is
        supplied via the `init` keyword).
    tol : float, optional
        Relative tolerance for convergence, the solving stops when
        ``np.std(pop) <= atol + tol * np.abs(np.mean(population_energies))``,
        where `atol` and `tol` are the absolute and relative tolerance
        respectively.
    mutation : float or tuple(float, float), optional
        The mutation constant. In the literature this is also known as
        differential weight, being denoted by F.
        If specified as a float it should be in the range [0, 2].
        If specified as a tuple ``(min, max)`` dithering is employed. Dithering
        randomly changes the mutation constant on a generation by generation
        basis. The mutation constant for that generation is taken from
        ``U[min, max)``. Dithering can help speed convergence significantly.
        Increasing the mutation constant increases the search radius, but will
        slow down convergence.
    recombination : float, optional
        The recombination constant, should be in the range [0, 1]. In the
        literature this is also known as the crossover probability, being
        denoted by CR. Increasing this value allows a larger number of mutants
        to progress into the next generation, but at the risk of population
        stability.
    seed : int or `np.random.RandomState`, optional
        If `seed` is not specified the `np.random.RandomState` singleton is
        used.
        If `seed` is an int, a new `np.random.RandomState` instance is used,
        seeded with `seed`.
        If `seed` is already a `np.random.RandomState` instance, then that
        `np.random.RandomState` instance is used.
        Specify `seed` for repeatable minimizations.
    disp : bool, optional
        Display status messages.
    callback : callable, `callback(xk, convergence=val)`, optional
        A function to follow the progress of the minimization. ``xk`` is
        the current value of ``x0``. ``val`` represents the fractional
        value of the population convergence. When ``val`` is greater than one
        the function halts. If callback returns `True`, then the minimization
        is halted (any polishing is still carried out).
    polish : bool, optional
        If True (default), then `scipy.optimize.minimize` with the `L-BFGS-B`
        method is used to polish the best population member at the end, which
        can improve the minimization slightly.
    init : str or array-like, optional
        Specify which type of population initialization is performed. Should be
        one of:

        - 'latinhypercube'
        - 'random'
        - array specifying the initial population. The array should have
          shape ``(M, len(x))``, where len(x) is the number of parameters.
          `init` is clipped to `bounds` before use.

        The default is 'latinhypercube'. Latin Hypercube sampling tries to
        maximize coverage of the available parameter space. 'random'
        initializes the population randomly - this has the drawback that
        clustering can occur, preventing the whole of parameter space being
        covered. Use of an array to specify a population subset could be used,
        for example, to create a tight bunch of initial guesses in a location
        where the solution is known to exist, thereby reducing time for
        convergence.
    atol : float, optional
        Absolute tolerance for convergence, the solving stops when
        ``np.std(pop) <= atol + tol * np.abs(np.mean(population_energies))``,
        where `atol` and `tol` are the absolute and relative tolerance
        respectively.

    Returns
    -------
    res : OptimizeResult
        The optimization result represented as a `OptimizeResult` object.
        Important attributes are: ``x`` the solution array, ``success`` a
        Boolean flag indicating if the optimizer exited successfully and
        ``message`` which describes the cause of the termination. See
        `OptimizeResult` for a description of other attributes. If `polish`
        was employed, and a lower minimum was obtained by the polishing, then
        OptimizeResult also contains the ``jac`` attribute.

    Notes
    -----
    Differential evolution is a stochastic population based method that is
    useful for global optimization problems. At each pass through the population
    the algorithm mutates each candidate solution by mixing with other candidate
    solutions to create a trial candidate. There are several strategies [2]_ for
    creating trial candidates, which suit some problems more than others. The
    'best1bin' strategy is a good starting point for many systems. In this
    strategy two members of the population are randomly chosen. Their difference
    is used to mutate the best member (the `best` in `best1bin`), :math:`b_0`,
    so far:

    .. math::

        b' = b_0 + mutation * (population[rand0] - population[rand1])

    A trial vector is then constructed. Starting with a randomly chosen 'i'th
    parameter the trial is sequentially filled (in modulo) with parameters from
    `b'` or the original candidate. The choice of whether to use `b'` or the
    original candidate is made with a binomial distribution (the 'bin' in
    'best1bin') - a random number in [0, 1) is generated. If this number is
    less than the `recombination` constant then the parameter is loaded from
    `b'`, otherwise it is loaded from the original candidate. The final
    parameter is always loaded from `b'`. Once the trial candidate is built
    its fitness is assessed. If the trial is better than the original candidate
    then it takes its place. If it is also better than the best overall
    candidate it also replaces that.
    To improve your chances of finding a global minimum use higher `popsize`
    values, with higher `mutation` (and dithering), but lower `recombination`
    values. This has the effect of widening the search radius, but slowing
    convergence.

    .. versionadded:: 0.15.0

    References
    ----------
    .. [1] Storn, R and Price, K, Differential Evolution - a Simple and
           Efficient Heuristic for Global Optimization over Continuous Spaces,
           Journal of Global Optimization, 1997, 11, 341 - 359.
    .. [2] http://www1.icsi.berkeley.edu/~storn/code.html
    .. [3] http://en.wikipedia.org/wiki/Differential_evolution
    """

    solver = DifferentialEvolutionSolver(func, bounds, args=args,
                                         strategy=strategy, maxiter=maxiter,
                                         popsize=popsize, tol=tol,
                                         mutation=mutation,
                                         recombination=recombination,
                                         seed=seed, polish=polish,
                                         callback=callback,
                                         disp=disp, init=init, atol=atol)
    return solver.solve()


class DifferentialEvolutionSolver(object):

    """This class implements the differential evolution solver.

    Parameters
    ----------
    func : callable
        The objective function to be minimized. Must be in the form
        ``f(x, *args)``, where ``x`` is the argument in the form of a 1-D array
        and ``args`` is a tuple of any additional fixed parameters needed to
        completely specify the function.
    bounds : sequence
        Bounds for variables. ``(min, max)`` pairs for each element in ``x``,
        defining the lower and upper bounds for the optimizing argument of
        `func`. It is required to have ``len(bounds) == len(x)``.
        ``len(bounds)`` is used to determine the number of parameters in ``x``.
    args : tuple, optional
        Any additional fixed parameters needed to
        completely specify the objective function.
    strategy : str, optional
        The differential evolution strategy to use. Should be one of:

        - 'best1bin'
        - 'best1exp'
        - 'rand1exp'
        - 'randtobest1exp'
        - 'currenttobest1exp'
        - 'best2exp'
        - 'rand2exp'
        - 'randtobest1bin'
        - 'currenttobest1bin'
        - 'best2bin'
        - 'rand2bin'
        - 'rand1bin'

        The default is 'best1bin'.
    maxiter : int, optional
        The maximum number of generations over which the entire population is
        evolved. The maximum number of function evaluations (with no polishing)
        is: ``(maxiter + 1) * popsize * len(x)``
    popsize : int, optional
        A multiplier for setting the total population size. The population has
        ``popsize * len(x)`` individuals (unless the initial population is
        supplied via the `init` keyword).
    tol : float, optional
        Relative tolerance for convergence, the solving stops when
        ``np.std(pop) <= atol + tol * np.abs(np.mean(population_energies))``,
        where `atol` and `tol` are the absolute and relative tolerance
        respectively.
    mutation : float or tuple(float, float), optional
        The mutation constant. In the literature this is also known as
        differential weight, being denoted by F.
        If specified as a float it should be in the range [0, 2].
        If specified as a tuple ``(min, max)`` dithering is employed. Dithering
        randomly changes the mutation constant on a generation by generation
        basis. The mutation constant for that generation is taken from
        U[min, max). Dithering can help speed convergence significantly.
        Increasing the mutation constant increases the search radius, but will
        slow down convergence.
    recombination : float, optional
        The recombination constant, should be in the range [0, 1]. In the
        literature this is also known as the crossover probability, being
        denoted by CR. Increasing this value allows a larger number of mutants
        to progress into the next generation, but at the risk of population
        stability.
    seed : int or `np.random.RandomState`, optional
        If `seed` is not specified the `np.random.RandomState` singleton is
        used.
        If `seed` is an int, a new `np.random.RandomState` instance is used,
        seeded with `seed`.
        If `seed` is already a `np.random.RandomState` instance, then that
        `np.random.RandomState` instance is used.
        Specify `seed` for repeatable minimizations.
    disp : bool, optional
        Display status messages.
    callback : callable, `callback(xk, convergence=val)`, optional
        A function to follow the progress of the minimization. ``xk`` is
        the current value of ``x0``. ``val`` represents the fractional
        value of the population convergence. When ``val`` is greater than one
        the function halts. If callback returns `True`, then the minimization
        is halted (any polishing is still carried out).
    polish : bool, optional
        If True, then `scipy.optimize.minimize` with the `L-BFGS-B` method
        is used to polish the best population member at the end. This requires
        a few more function evaluations.
    maxfun : int, optional
        Set the maximum number of function evaluations. However, it probably
        makes more sense to set `maxiter` instead.
    init : str or array-like, optional
        Specify which type of population initialization is performed. Should be
        one of:

        - 'latinhypercube'
        - 'random'
        - array specifying the initial population. The array should have
          shape ``(M, len(x))``, where len(x) is the number of parameters.
          `init` is clipped to `bounds` before use.

        The default is 'latinhypercube'. Latin Hypercube sampling tries to
        maximize coverage of the available parameter space. 'random'
        initializes the population randomly - this has the drawback that
        clustering can occur, preventing the whole of parameter space being
        covered. Use of an array to specify a population could be used, for
        example, to create a tight bunch of initial guesses in a location
        where the solution is known to exist, thereby reducing time for
        convergence.
    atol : float, optional
        Absolute tolerance for convergence, the solving stops when
        ``np.std(pop) <= atol + tol * np.abs(np.mean(population_energies))``,
        where `atol` and `tol` are the absolute and relative tolerance
        respectively.
    """

    # Dispatch of mutation strategy method (binomial or exponential).
    _binomial = {'best1bin': '_best1',
                 'randtobest1bin': '_randtobest1',
                 'currenttobest1bin': '_currenttobest1',
                 'best2bin': '_best2',
                 'rand2bin': '_rand2',
                 'rand1bin': '_rand1'}
    _exponential = {'best1exp': '_best1',
                    'rand1exp': '_rand1',
                    'randtobest1exp': '_randtobest1',
                    'currenttobest1exp': '_currenttobest1',
                    'best2exp': '_best2',
                    'rand2exp': '_rand2'}

    __init_error_msg = ("The population initialization method must be one of "
                        "'latinhypercube' or 'random', or an array of shape "
                        "(M, N) where N is the number of parameters and M>5")

    def __init__(self, func, bounds, args=(),
                 strategy='best1bin', maxiter=1000, popsize=15,
                 tol=0.01, mutation=(0.5, 1), recombination=0.7, seed=None,
                 maxfun=np.inf, callback=None, disp=False, polish=True,
                 init='latinhypercube', atol=0):

        if strategy in self._binomial:
            self.mutation_func = getattr(self, self._binomial[strategy])
        elif strategy in self._exponential:
            self.mutation_func = getattr(self, self._exponential[strategy])
        else:
            raise ValueError("Please select a valid mutation strategy")
        self.strategy = strategy

        self.callback = callback
        self.polish = polish

        # relative and absolute tolerances for convergence
        self.tol, self.atol = tol, atol

        # Mutation constant should be in [0, 2). If specified as a sequence
        # then dithering is performed.
        self.scale = mutation
        if (not np.all(np.isfinite(mutation)) or
                np.any(np.array(mutation) >= 2) or
                np.any(np.array(mutation) < 0)):
            raise ValueError('The mutation constant must be a float in '
                             'U[0, 2), or specified as a tuple(min, max)'
                             ' where min < max and min, max are in U[0, 2).')

        self.dither = None
        if hasattr(mutation, '__iter__') and len(mutation) > 1:
            self.dither = [mutation[0], mutation[1]]
            self.dither.sort()

        self.cross_over_probability = recombination

        self.func = func
        self.args = args

        # convert tuple of lower and upper bounds to limits
        # [(low_0, high_0), ..., (low_n, high_n)]
        #     -> [[low_0, ..., low_n], [high_0, ..., high_n]]
        self.limits = np.array(bounds, dtype='float').T
        if (np.size(self.limits, 0) != 2 or not
                np.all(np.isfinite(self.limits))):
            raise ValueError('bounds should be a sequence containing '
                             'real valued (min, max) pairs for each value'
                             ' in x')

        if maxiter is None:  # the default used to be None
            maxiter = 1000
        self.maxiter = maxiter
        if maxfun is None:  # the default used to be None
            maxfun = np.inf
        self.maxfun = maxfun

        # population is scaled to between [0, 1].
        # We have to scale between parameter <-> population
        # save these arguments for _scale_parameters and
        # _unscale_parameters. This is an optimization
        self.__scale_arg1 = 0.5 * (self.limits[0] + self.limits[1])
        self.__scale_arg2 = np.fabs(self.limits[0] - self.limits[1])

        self.parameter_count = np.size(self.limits, 1)

        self.random_number_generator = check_random_state(seed)

        # default population initialization is a latin hypercube design, but
        # there are other population initializations possible.
        # the minimum is 5 because 'best2bin' requires a population that's at
        # least 5 long
        self.num_population_members = max(5, popsize * self.parameter_count)

        self.population_shape = (self.num_population_members,
                                 self.parameter_count)

        self._nfev = 0
        if isinstance(init, str):
            if init == 'latinhypercube':
                self.init_population_lhs()
            elif init == 'random':
                self.init_population_random()
            else:
                raise ValueError(self.__init_error_msg)
        else:
            self.init_population_array(init)

        self.disp = disp

    def init_population_lhs(self):
        """
        Initializes the population with Latin Hypercube Sampling.
        Latin Hypercube Sampling ensures that each parameter is uniformly
        sampled over its range.
        """
        rng = self.random_number_generator

        # Each parameter range needs to be sampled uniformly. The scaled
        # parameter range ([0, 1)) needs to be split into
        # `self.num_population_members` segments, each of which has the
        # following size:
        segsize = 1.0 / self.num_population_members

        # Within each segment we sample from a uniform random distribution.
        # We need to do this sampling for each parameter.
        samples = (segsize * rng.random_sample(self.population_shape)
                   # Offset each segment to cover the entire parameter
                   # range [0, 1)
                   + np.linspace(0., 1., self.num_population_members,
                                 endpoint=False)[:, np.newaxis])

        # Create an array for population of candidate solutions.
        self.population = np.zeros_like(samples)

        # Initialize population of candidate solutions by permutation of the
        # random samples.
        for j in range(self.parameter_count):
            order = rng.permutation(range(self.num_population_members))
            self.population[:, j] = samples[order, j]

        # reset population energies
        self.population_energies = (np.ones(self.num_population_members) *
                                    np.inf)

        # reset number of function evaluations counter
        self._nfev = 0

    def init_population_random(self):
        """
        Initialises the population at random. This type of initialization
        can possess clustering; Latin Hypercube sampling is generally better.
        """
        rng = self.random_number_generator
        self.population = rng.random_sample(self.population_shape)

        # reset population energies
        self.population_energies = (np.ones(self.num_population_members) *
                                    np.inf)

        # reset number of function evaluations counter
        self._nfev = 0

    def init_population_array(self, init):
        """
        Initialises the population with a user specified population.

        Parameters
        ----------
        init : np.ndarray
            Array specifying subset of the initial population. The array should
            have shape (M, len(x)), where len(x) is the number of parameters.
            The population is clipped to the lower and upper `bounds`.
        """
        # make sure you're using a float array
        popn = np.asfarray(init)

        # check dimensionality first so the shape[1] access below is safe
        if (len(popn.shape) != 2 or
                popn.shape[0] < 5 or
                popn.shape[1] != self.parameter_count):
            raise ValueError("The population supplied needs to have shape"
                             " (M, len(x)), where M > 4.")

        # unscale to [0, 1] and clip to the unit hypercube, assigning to
        # population
        self.population = np.clip(self._unscale_parameters(popn), 0, 1)

        self.num_population_members = np.size(self.population, 0)

        self.population_shape = (self.num_population_members,
                                 self.parameter_count)

        # reset population energies
        self.population_energies = (np.ones(self.num_population_members) *
                                    np.inf)

        # reset number of function evaluations counter
        self._nfev = 0

    @property
    def x(self):
        """
        The best solution from the solver

        Returns
        -------
        x : ndarray
            The best solution from the solver.
        """
        return self._scale_parameters(self.population[0])

    @property
    def convergence(self):
        """
        The standard deviation of the population energies divided by their
        mean.
        """
        return (np.std(self.population_energies) /
                np.abs(np.mean(self.population_energies) + _MACHEPS))

    def solve(self):
        """
        Runs the DifferentialEvolutionSolver.

        Returns
        -------
        res : OptimizeResult
            The optimization result represented as a ``OptimizeResult`` object.
            Important attributes are: ``x`` the solution array, ``success`` a
            Boolean flag indicating if the optimizer exited successfully and
            ``message`` which describes the cause of the termination. See
            `OptimizeResult` for a description of other attributes. If `polish`
            was employed, and a lower minimum was obtained by the polishing,
            then OptimizeResult also contains the ``jac`` attribute.
        """
        nit, warning_flag = 0, False
        status_message = _status_message['success']

        # The population may have just been initialized (all entries are
        # np.inf). If it has you have to calculate the initial energies.
        # Although this is also done in the evolve generator it's possible
        # that someone can set maxiter=0, at which point we still want the
        # initial energies to be calculated (the following loop isn't run).
        if np.all(np.isinf(self.population_energies)):
            self._calculate_population_energies()

        # do the optimisation.
        for nit in range(1, self.maxiter + 1):
            # evolve the population by a generation
            try:
                next(self)
            except StopIteration:
                warning_flag = True
                status_message = _status_message['maxfev']
                break

            if self.disp:
                print("differential_evolution step %d: f(x)= %g"
                      % (nit,
                         self.population_energies[0]))

            # should the solver terminate?
            convergence = self.convergence

            if (self.callback and
                    self.callback(self._scale_parameters(self.population[0]),
                                  convergence=self.tol / convergence) is True):

                warning_flag = True
                status_message = ('callback function requested stop early '
                                  'by returning True')
                break

            intol = (np.std(self.population_energies) <=
                     self.atol +
                     self.tol * np.abs(np.mean(self.population_energies)))
            if warning_flag or intol:
                break

        else:
            status_message = _status_message['maxiter']
            warning_flag = True

        DE_result = OptimizeResult(
            x=self.x,
            fun=self.population_energies[0],
            nfev=self._nfev,
            nit=nit,
            message=status_message,
            success=(warning_flag is not True))

        if self.polish:
            result = minimize(self.func,
                              np.copy(DE_result.x),
                              method='L-BFGS-B',
                              bounds=self.limits.T,
                              args=self.args)

            self._nfev += result.nfev
            DE_result.nfev = self._nfev

            if result.fun < DE_result.fun:
                DE_result.fun = result.fun
                DE_result.x = result.x
                DE_result.jac = result.jac
                # to keep internal state consistent
                self.population_energies[0] = result.fun
                self.population[0] = self._unscale_parameters(result.x)

        return DE_result

    def _calculate_population_energies(self):
        """
        Calculate the energies of all the population members at the same time.
        Puts the best member in first place. Useful if the population has just
        been initialised.
        """

        ##############
        ## CHANGES: self.func operates on the entire parameters array
        ##############
        itersize = max(0, min(len(self.population), self.maxfun - self._nfev + 1))
        candidates = self.population[:itersize]
        parameters = np.array([self._scale_parameters(c) for c in candidates])  # TODO: vectorize
        energies = self.func(parameters, *self.args)
        self.population_energies = energies
        self._nfev += itersize

        # The per-candidate loop from the original SciPy implementation is
        # kept below for reference:
        # for index, candidate in enumerate(self.population):
        #     if self._nfev > self.maxfun:
        #         break
        #
        #     parameters = self._scale_parameters(candidate)
        #     self.population_energies[index] = self.func(parameters,
        #                                                 *self.args)
        #     self._nfev += 1
        ##############
        ##############

        minval = np.argmin(self.population_energies)

        # put the lowest energy into the best solution position.
        lowest_energy = self.population_energies[minval]
        self.population_energies[minval] = self.population_energies[0]
        self.population_energies[0] = lowest_energy

        self.population[[0, minval], :] = self.population[[minval, 0], :]

    def __iter__(self):
        return self

    def __next__(self):
        """
        Evolve the population by a single generation

        Returns
        -------
        x : ndarray
            The best solution from the solver.
        fun : float
            Value of objective function obtained from the best solution.
        """
        # the population may have just been initialized (all entries are
        # np.inf). If it has you have to calculate the initial energies
        if np.all(np.isinf(self.population_energies)):
            self._calculate_population_energies()

        if self.dither is not None:
            self.scale = (self.random_number_generator.rand()
                          * (self.dither[1] - self.dither[0]) + self.dither[0])

        ##############
        ## CHANGES: self.func operates on the entire parameters array
        ##############

        itersize = max(0, min(self.num_population_members, self.maxfun - self._nfev + 1))
        trials = np.array([self._mutate(c) for c in range(itersize)])  # TODO: vectorize
        for trial in trials:
            self._ensure_constraint(trial)
        parameters = np.array([self._scale_parameters(trial) for trial in trials])
        energies = self.func(parameters, *self.args)
        self._nfev += itersize

        for candidate, (energy, trial) in enumerate(zip(energies, trials)):
            # if the energy of the trial candidate is lower than the
            # original population member then replace it
            if energy < self.population_energies[candidate]:
                self.population[candidate] = trial
                self.population_energies[candidate] = energy

                # if the trial candidate also has a lower energy than the
                # best solution then replace that as well
                if energy < self.population_energies[0]:
                    self.population_energies[0] = energy
                    self.population[0] = trial

        # The per-candidate loop from the original SciPy implementation is
        # kept below for reference:
        # for candidate in range(self.num_population_members):
        #     if self._nfev > self.maxfun:
        #         raise StopIteration
        #
        #     # create a trial solution
        #     trial = self._mutate(candidate)
        #
        #     # ensuring that it's in the range [0, 1)
        #     self._ensure_constraint(trial)
        #
        #     # scale from [0, 1) to the actual parameter value
        #     parameters = self._scale_parameters(trial)
        #
        #     # determine the energy of the objective function
        #     energy = self.func(parameters, *self.args)
        #     self._nfev += 1
        #
        #     # if the energy of the trial candidate is lower than the
        #     # original population member then replace it
        #     if energy < self.population_energies[candidate]:
        #         self.population[candidate] = trial
        #         self.population_energies[candidate] = energy
        #
        #         # if the trial candidate also has a lower energy than the
        #         # best solution then replace that as well
        #         if energy < self.population_energies[0]:
        #             self.population_energies[0] = energy
        #             self.population[0] = trial

        ##############
        ##############

        return self.x, self.population_energies[0]

    def next(self):
        """
        Evolve the population by a single generation

        Returns
        -------
        x : ndarray
            The best solution from the solver.
        fun : float
            Value of objective function obtained from the best solution.
        """
        # next() is required for compatibility with Python2.7.
        return self.__next__()

    def _scale_parameters(self, trial):
        """
        scale from a number between 0 and 1 to parameters.
        """
        return self.__scale_arg1 + (trial - 0.5) * self.__scale_arg2

    def _unscale_parameters(self, parameters):
        """
        scale from parameters to a number between 0 and 1.
        """
        return (parameters - self.__scale_arg1) / self.__scale_arg2 + 0.5

    def _ensure_constraint(self, trial):
        """
        make sure the parameters lie between the limits
        """
        for index in np.where((trial < 0) | (trial > 1))[0]:
            trial[index] = self.random_number_generator.rand()

    def _mutate(self, candidate):
        """
        create a trial vector based on a mutation strategy
        """
        trial = np.copy(self.population[candidate])

        rng = self.random_number_generator

        fill_point = rng.randint(0, self.parameter_count)

        if self.strategy in ['currenttobest1exp', 'currenttobest1bin']:
            bprime = self.mutation_func(candidate,
                                        self._select_samples(candidate, 5))
        else:
            bprime = self.mutation_func(self._select_samples(candidate, 5))

        if self.strategy in self._binomial:
            crossovers = rng.rand(self.parameter_count)
            crossovers = crossovers < self.cross_over_probability
            # the last one is always from the bprime vector for binomial
            # If you fill in modulo with a loop you have to set the last one to
            # true. If you don't use a loop then you can have any random entry
            # be True.
            crossovers[fill_point] = True
            trial = np.where(crossovers, bprime, trial)
            return trial

        elif self.strategy in self._exponential:
            i = 0
            while (i < self.parameter_count and
                   rng.rand() < self.cross_over_probability):

                trial[fill_point] = bprime[fill_point]
                fill_point = (fill_point + 1) % self.parameter_count
                i += 1

            return trial

    def _best1(self, samples):
        """
        best1bin, best1exp
        """
        r0, r1 = samples[:2]
        return (self.population[0] + self.scale *
                (self.population[r0] - self.population[r1]))

    def _rand1(self, samples):
        """
        rand1bin, rand1exp
        """
        r0, r1, r2 = samples[:3]
        return (self.population[r0] + self.scale *
                (self.population[r1] - self.population[r2]))

    def _randtobest1(self, samples):
        """
        randtobest1bin, randtobest1exp
        """
        r0, r1, r2 = samples[:3]
        bprime = np.copy(self.population[r0])
        bprime += self.scale * (self.population[0] - bprime)
        bprime += self.scale * (self.population[r1] -
                                self.population[r2])
        return bprime

    def _currenttobest1(self, candidate, samples):
        """
        currenttobest1bin, currenttobest1exp
        """
        r0, r1 = samples[:2]
        bprime = (self.population[candidate] + self.scale *
                  (self.population[0] - self.population[candidate] +
                   self.population[r0] - self.population[r1]))
        return bprime

    def _best2(self, samples):
        """
        best2bin, best2exp
        """
        r0, r1, r2, r3 = samples[:4]
        bprime = (self.population[0] + self.scale *
                  (self.population[r0] + self.population[r1] -
                   self.population[r2] - self.population[r3]))

        return bprime

    def _rand2(self, samples):
        """
        rand2bin, rand2exp
        """
        r0, r1, r2, r3, r4 = samples
        bprime = (self.population[r0] + self.scale *
                  (self.population[r1] + self.population[r2] -
                   self.population[r3] - self.population[r4]))

        return bprime

    def _select_samples(self, candidate, number_samples):
        """
        obtain random integers from range(self.num_population_members),
        without replacement. You can't have the original candidate either.
        """
        idxs = list(range(self.num_population_members))
        idxs.remove(candidate)
        self.random_number_generator.shuffle(idxs)
        idxs = idxs[:number_samples]
        return idxs


class AdamOptimizer:
    """Basic Adam optimizer implementation that can minimize w.r.t.
    a single variable.

    Parameters
    ----------
    shape : tuple
        shape of the variable w.r.t. which the loss should be minimized
    """
    # TODO: Add reference or rewrite the function.
    def __init__(self, shape):
        self.m = np.zeros(shape)
        self.v = np.zeros(shape)
        self.t = 0

    def __call__(self, gradient, learning_rate, beta1=0.9, beta2=0.999, epsilon=1e-8):
        """Updates internal parameters of the optimizer and returns
        the change that should be applied to the variable.

        Parameters
        ----------
        gradient : `np.ndarray`
            the gradient of the loss w.r.t. the variable
        learning_rate : float
            the learning rate in the current iteration
        beta1 : float
            decay rate for calculating the exponentially
            decaying average of past gradients
        beta2 : float
            decay rate for calculating the exponentially
            decaying average of past squared gradients
        epsilon : float
            small value to avoid division by zero
        """

        self.t += 1

        self.m = beta1 * self.m + (1 - beta1) * gradient
        self.v = beta2 * self.v + (1 - beta2) * gradient ** 2

        bias_correction_1 = 1 - beta1 ** self.t
        bias_correction_2 = 1 - beta2 ** self.t

        m_hat = self.m / bias_correction_1
        v_hat = self.v / bias_correction_2

        return -learning_rate * m_hat / (np.sqrt(v_hat) + epsilon)
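A minimal usage sketch of both entry points (not part of optimizer.py; the sphere objective, bounds, and learning rate are illustrative choices, not values from this repository). Per the CHANGES banners above, the modified solver hands `func` the whole population at once, so the objective must be batched and polishing, which would call the objective with a single 1-D `x`, should be disabled:

import numpy as np
from deeprobust.image.optimizer import differential_evolution, AdamOptimizer

def sphere(pop):
    # Receives the whole population, shape (S, len(x)); returns S energies.
    return (pop ** 2).sum(axis=1)

# polish=False: L-BFGS-B polishing would pass a 1-D x, which this
# batched objective does not support.
result = differential_evolution(sphere, bounds=[(-5, 5)] * 3,
                                polish=False, seed=0)
print(result.x, result.fun)            # a point near the origin

# AdamOptimizer returns the *update* to add to the variable; the descent
# direction (minus the gradient) is handled internally.
opt = AdamOptimizer(shape=(5,))
x = np.zeros(5)
for _ in range(500):
    grad = 2 * (x - 3.0)               # gradient of ||x - 3||^2
    x += opt(grad, learning_rate=0.1)
print(x)                               # close to [3., 3., 3., 3., 3.]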
deeprobust/image/preprocessing/APE-GAN.py
ADDED
@@ -0,0 +1,127 @@
import os
import argparse

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import TensorDataset
import torch.backends.cudnn as cudnn

class Generator(nn.Module):

    def __init__(self, in_ch):
        super(Generator, self).__init__()
        self.conv1 = nn.Conv2d(in_ch, 64, 4, stride=2, padding=1)
        self.bn1 = nn.BatchNorm2d(64)
        self.conv2 = nn.Conv2d(64, 128, 4, stride=2, padding=1)
        self.bn2 = nn.BatchNorm2d(128)
        self.deconv3 = nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1)
        self.bn3 = nn.BatchNorm2d(64)
        self.deconv4 = nn.ConvTranspose2d(64, in_ch, 4, stride=2, padding=1)

    def forward(self, x):
        h = F.leaky_relu(self.bn1(self.conv1(x)))
        h = F.leaky_relu(self.bn2(self.conv2(h)))
        h = F.leaky_relu(self.bn3(self.deconv3(h)))
        h = torch.tanh(self.deconv4(h))
        return h

class Discriminator(nn.Module):

    def __init__(self, in_ch):
        super(Discriminator, self).__init__()
        self.conv1 = nn.Conv2d(in_ch, 64, 3, stride=2)
        self.conv2 = nn.Conv2d(64, 128, 3, stride=2)
        self.bn2 = nn.BatchNorm2d(128)
        self.conv3 = nn.Conv2d(128, 256, 3, stride=2)
        self.bn3 = nn.BatchNorm2d(256)
        if in_ch == 1:
            self.fc4 = nn.Linear(1024, 1)
        else:
            self.fc4 = nn.Linear(2304, 1)

    def forward(self, x):
        h = F.leaky_relu(self.conv1(x))
        h = F.leaky_relu(self.bn2(self.conv2(h)))
        h = F.leaky_relu(self.bn3(self.conv3(h)))
        h = torch.sigmoid(self.fc4(h.view(h.size(0), -1)))
        return h


def main(args):

    # Channel count follows the dataset: 1 for MNIST, 3 for CIFAR-10.
    in_ch = 1 if args.data == "mnist" else 3

    # Initialize GAN model
    G = Generator(in_ch=in_ch).cuda()
    D = Discriminator(in_ch=in_ch).cuda()

    # Initialize optimizers
    opt_G = torch.optim.Adam(G.parameters(), lr=args.lr, betas=(0.5, 0.999))
    opt_D = torch.optim.Adam(D.parameters(), lr=args.lr, betas=(0.5, 0.999))
    loss_bce = nn.BCELoss()
    loss_mse = nn.MSELoss()
    cudnn.benchmark = True

    # Initialize DataLoader with (clean, adversarial) image pairs
    train_data = torch.load("./adv_data.tar")
    train_data = TensorDataset(train_data["normal"], train_data["adv"])
    train_loader = torch.utils.data.DataLoader(train_data, batch_size=args.batch_size, shuffle=True)

    os.makedirs(args.checkpoint, exist_ok=True)

    # Start Training
    for i in range(args.epochs):
        G.train()
        gen_loss, dis_loss, n = 0, 0, 0
        for x, x_adv in train_loader:
            current_size = x.size(0)
            x, x_adv = x.cuda(), x_adv.cuda()

            # Train Discriminator
            t_real = torch.ones(current_size).cuda()
            t_fake = torch.zeros(current_size).cuda()
            y_real = D(x).squeeze()
            x_fake = G(x_adv)
            y_fake = D(x_fake).squeeze()

            loss_D = loss_bce(y_real, t_real) + loss_bce(y_fake, t_fake)
            opt_D.zero_grad()
            loss_D.backward()
            opt_D.step()

            # Train Generator (two updates per discriminator step)
            for _ in range(2):
                x_fake = G(x_adv)
                y_fake = D(x_fake).squeeze()

                loss_G = args.alpha * loss_mse(x_fake, x) + args.beta * loss_bce(y_fake, t_real)
                opt_G.zero_grad()
                loss_G.backward()
                opt_G.step()

            gen_loss += loss_G.item() * x.size(0)
            dis_loss += loss_D.item() * x.size(0)
            n += x.size(0)

        print("epoch:{}, LossG:{:.3f}, LossD:{:.3f}".format(i, gen_loss / n, dis_loss / n))
        torch.save({"generator": G.state_dict(), "discriminator": D.state_dict()},
                   os.path.join(args.checkpoint, "{}.tar".format(i + 1)))

    G.eval()

def get_args():

    parser = argparse.ArgumentParser()

    parser.add_argument("--data", type=str, default="mnist")
    parser.add_argument("--lr", type=float, default=0.0002)
    parser.add_argument("--epochs", type=int, default=2)
    parser.add_argument("--batch_size", type=int, default=128)
    parser.add_argument("--alpha", type=float, default=0.7)
    parser.add_argument("--beta", type=float, default=0.3)
    parser.add_argument("--checkpoint", type=str, default="./checkpoint/test")
    args = parser.parse_args()

    return args


if __name__ == "__main__":
    args = get_args()
    main(args)
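A hedged sketch of the inference side: once training has written a checkpoint, the generator alone can be reloaded to purify adversarial inputs. The epoch number in the path and the `x_adv` batch below are placeholders, and the sketch assumes the `Generator` class defined in the file above:

# Hedged sketch, not part of APE-GAN.py; paths and x_adv are placeholders.
import torch

ckpt = torch.load("./checkpoint/test/2.tar")   # checkpoint written by main()
G = Generator(in_ch=1).cuda()                  # in_ch=1 assumes MNIST-style inputs
G.load_state_dict(ckpt["generator"])
G.eval()

with torch.no_grad():
    x_purified = G(x_adv.cuda())               # x_adv: a batch of adversarial images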
deeprobust/image/preprocessing/prepare_advdata.py
ADDED
@@ -0,0 +1,62 @@
"""
This implementation is used to create an adversarial dataset.
"""
import os

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, models, transforms
from PIL import Image

from deeprobust.image.attack.pgd import PGD
import deeprobust.image.netmodels.resnet as resnet
import deeprobust.image.netmodels.CNN as CNN
from deeprobust.image.config import attack_params
import matplotlib.pyplot as plt

def accuracy(output, target):
    """Number of correctly classified samples in the batch."""
    return output.max(1)[1].eq(target).sum().item()

def main(args):
    # Load model.
    model = resnet.ResNet18().to('cuda')
    print("Load network")

    model.load_state_dict(torch.load(
        os.path.expanduser("~/Documents/deeprobust_model/cifar_res18_120.pt")))
    model.eval()

    transform_val = transforms.Compose([
        transforms.ToTensor(),
    ])
    # Both loaders use CIFAR-10 to match the ResNet18 victim model and the
    # PGD_CIFAR10 attack parameters (the original draft loaded MNIST here,
    # which looks like a copy-paste slip).
    train_loader = torch.utils.data.DataLoader(
        datasets.CIFAR10('deeprobust/image/data', train=True, download=True,
                         transform=transform_val),
        batch_size=128,
        shuffle=True)
    test_loader = torch.utils.data.DataLoader(
        datasets.CIFAR10('deeprobust/image/data', train=False, download=True,
                         transform=transform_val),
        batch_size=128, shuffle=True)  # , **kwargs)

    normal_data, adv_data = None, None
    train_acc, adv_acc, train_n = 0, 0, 0
    adversary = PGD(model)

    for x, y in train_loader:
        x, y = x.cuda(), y.cuda()
        y_pred = model(x)
        train_acc += accuracy(y_pred, y)
        x_adv = adversary.generate(x, y, **attack_params['PGD_CIFAR10']).float()
        y_adv = model(x_adv)
        adv_acc += accuracy(y_adv, y)
        train_n += y.size(0)

        x, x_adv = x.data, x_adv.data
        if normal_data is None:
            normal_data, adv_data = x, x_adv
        else:
            normal_data = torch.cat((normal_data, x))
            adv_data = torch.cat((adv_data, x_adv))

    print("Accuracy(normal) {:.6f}, Accuracy(PGD) {:.6f}".format(
        train_acc / train_n * 100, adv_acc / train_n * 100))
    torch.save({"normal": normal_data, "adv": adv_data}, "data.tar")
    torch.save({"state_dict": model.state_dict()}, "cnn.tar")
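One wiring detail between this script and APE-GAN.py above: this file saves its (clean, adversarial) pairs to ``data.tar``, while APE-GAN.py loads ``./adv_data.tar``, so the file must be renamed (or one path changed) between the two steps. A small sketch of loading the saved pairs back, mirroring the APE-GAN loading code:

# Hedged sketch: reload the pairs written by main() above.
import torch
from torch.utils.data import TensorDataset, DataLoader

saved = torch.load("data.tar")                   # {"normal": ..., "adv": ...}
pairs = TensorDataset(saved["normal"], saved["adv"])
loader = DataLoader(pairs, batch_size=128, shuffle=True)
x_clean, x_adv = next(iter(loader))              # one (clean, adversarial) batch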
deeprobust/image/utils.py
ADDED
@@ -0,0 +1,211 @@
import os
import shutil
import sys
import time
import urllib.request

import numpy as np
import torch
import torchvision
import torchvision.transforms as transforms
from texttable import Texttable

def create_train_dataset(batch_size = 128, root = '../data'):
    """
    Create the training dataset (MNIST).
    """

    transform_train = transforms.Compose([
        transforms.ToTensor(),
    ])
    trainset = torchvision.datasets.MNIST(root=root, train=True, download=True, transform=transform_train)
    trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=True, num_workers=2)

    return trainloader

def create_test_dataset(batch_size = 128, root = '../data'):
    transform_test = transforms.Compose([
        transforms.ToTensor(),
    ])
    testset = torchvision.datasets.MNIST(root=root, train=False, download=True, transform=transform_test)
    testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size, shuffle=False, num_workers=2)
    return testloader

def download_model(url, file):
    print('Downloading from {} to {}'.format(url, file))
    try:
        urllib.request.urlretrieve(url, file)
    except Exception:
        raise Exception("Download failed! Make sure you have a stable Internet connection and entered the right name")

def save_checkpoint(now_epoch, net, optimizer, lr_scheduler, file_name):
    checkpoint = {'epoch': now_epoch,
                  'state_dict': net.state_dict(),
                  'optimizer_state_dict': optimizer.state_dict(),
                  'lr_scheduler_state_dict': lr_scheduler.state_dict()}
    if os.path.exists(file_name):
        print('Overwriting {}'.format(file_name))
    torch.save(checkpoint, file_name)
    # link_name = os.path.join(*file_name.split(os.path.sep)[:-1], 'last.checkpoint')
    # #print(link_name)
    # make_symlink(source = file_name, link_name=link_name)

def load_checkpoint(file_name, net = None, optimizer = None, lr_scheduler = None):
    if os.path.isfile(file_name):
        print("=> loading checkpoint '{}'".format(file_name))
        check_point = torch.load(file_name)
        if net is not None:
            print('Loading network state dict')
            net.load_state_dict(check_point['state_dict'])
        if optimizer is not None:
            print('Loading optimizer state dict')
            optimizer.load_state_dict(check_point['optimizer_state_dict'])
        if lr_scheduler is not None:
            print('Loading lr_scheduler state dict')
            lr_scheduler.load_state_dict(check_point['lr_scheduler_state_dict'])

        return check_point['epoch']
    else:
        print("=> no checkpoint found at '{}'".format(file_name))

def make_symlink(source, link_name):
    """
    Note: overwriting enabled!
    """

    if os.path.exists(link_name):
        print("Link name already exists! Removing '{}' and overwriting".format(link_name))
        os.remove(link_name)
    if os.path.exists(source):
        os.symlink(source, link_name)
        return
    else:
        print('Source path does not exist')

def tab_printer(args):
    """
    Function to print the logs in a nice tabular format.

    Parameters
    ----------
    args :
        Parameters used for the model.
    """
    args = vars(args)
    keys = sorted(args.keys())
    t = Texttable()
    t.add_rows([["Parameter", "Value"]] + [[k.replace("_", " ").capitalize(), args[k]] for k in keys])
    print(t.draw())

def onehot_like(a, index, value=1):
    """Creates an array like a, with all values
    set to 0 except one.

    Parameters
    ----------
    a : array_like
        The returned one-hot array will have the same shape
        and dtype as this array
    index : int
        The index that should be set to `value`
    value : single value compatible with a.dtype
        The value to set at the given index

    Returns
    -------
    `numpy.ndarray`
        One-hot array with the given value at the given
        location and zeros everywhere else.
    """
    # TODO: change the note here.
    x = np.zeros_like(a)
    x[index] = value
    return x

def reduce_sum(x, keepdim=True):
    # sum over every dimension except the batch dimension,
    # keeping the reduced dimensions if keepdim is True
    for a in reversed(range(1, x.dim())):
        x = x.sum(a, keepdim=keepdim)
    return x

def arctanh(x, eps=1e-6):
    """
    Calculate arctanh(x). Note: scales ``x`` in place by (1 - eps)
    to keep the argument strictly inside (-1, 1).
    """
    x *= (1. - eps)
    return (np.log((1 + x) / (1 - x))) * 0.5

def l2r_dist(x, y, keepdim=True, eps=1e-8):
    d = (x - y)**2
    d = reduce_sum(d, keepdim=keepdim)
    d += eps  # to prevent infinite gradient at 0
    return d.sqrt()


def l2_dist(x, y, keepdim=True):
    d = (x - y)**2
    return reduce_sum(d, keepdim=keepdim)


def l1_dist(x, y, keepdim=True):
    d = torch.abs(x - y)
    return reduce_sum(d, keepdim=keepdim)


def l2_norm(x, keepdim=True):
    norm = reduce_sum(x*x, keepdim=keepdim)
    return norm.sqrt()


def l1_norm(x, keepdim=True):
    return reduce_sum(x.abs(), keepdim=keepdim)

def adjust_learning_rate(optimizer, epoch, learning_rate):
    """decrease the learning rate"""
    lr = learning_rate
    if epoch >= 55:
        lr = learning_rate * 0.1
    if epoch >= 75:
        lr = learning_rate * 0.01
    if epoch >= 90:
        lr = learning_rate * 0.001
    for param_group in optimizer.param_groups:
        param_group['lr'] = lr

    return optimizer

# State and helpers for progress_bar below. The bar follows the style of the
# widely used pytorch-cifar utilities; these constants and the format_time
# helper were referenced but not defined in the original file and are
# supplied here.
TOTAL_BAR_LENGTH = 65.
term_width = shutil.get_terminal_size().columns
last_time = time.time()
begin_time = last_time

def format_time(seconds):
    """Render a duration such as '1h2m3s' for progress_bar."""
    out = ''
    for unit, length in (('D', 86400), ('h', 3600), ('m', 60), ('s', 1)):
        value = int(seconds // length)
        seconds -= value * length
        if value > 0:
            out += '%d%s' % (value, unit)
    return out if out else '0s'

def progress_bar(current, total, msg=None):
    global last_time, begin_time
    if current == 0:
        begin_time = time.time()  # Reset for new bar.

    cur_len = int(TOTAL_BAR_LENGTH*current/total)
    rest_len = int(TOTAL_BAR_LENGTH - cur_len) - 1

    sys.stdout.write(' [')
    for i in range(cur_len):
        sys.stdout.write('=')
    sys.stdout.write('>')
    for i in range(rest_len):
        sys.stdout.write('.')
    sys.stdout.write(']')

    cur_time = time.time()
    step_time = cur_time - last_time
    last_time = cur_time
    tot_time = cur_time - begin_time

    L = []
    L.append(' Step: %s' % format_time(step_time))
    L.append(' | Tot: %s' % format_time(tot_time))
    if msg:
        L.append(' | ' + msg)

    msg = ''.join(L)
    sys.stdout.write(msg)
    for i in range(term_width-int(TOTAL_BAR_LENGTH)-len(msg)-3):
        sys.stdout.write(' ')

    # Go back to the center of the bar.
    for i in range(term_width-int(TOTAL_BAR_LENGTH/2)+2):
        sys.stdout.write('\b')
    sys.stdout.write(' %d/%d ' % (current+1, total))

    if current < total-1:
        sys.stdout.write('\r')
    else:
        sys.stdout.write('\n')
    sys.stdout.flush()
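Two of the helpers above have easy-to-miss behavior: `onehot_like` copies shape and dtype from its template array, and `arctanh` shrinks its argument in place before taking the log, so callers should pass a copy. A quick numeric check of both (values are illustrative):

import numpy as np
from deeprobust.image.utils import onehot_like, arctanh

a = np.zeros(4)
print(onehot_like(a, 2, value=1.0))    # [0. 0. 1. 0.]

x = np.array([0.5])
y = arctanh(x.copy())                  # pass a copy: arctanh mutates its input
print(np.tanh(y))                      # ~[0.5], up to the 1e-6 shrink factor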
docs/graph/defense.rst
ADDED
@@ -0,0 +1,109 @@
Introduction to Graph Defense with Examples
===========================================
In this section, we introduce the graph defense methods provided
in DeepRobust.

.. contents::
    :local:

Test your model's robustness on poisoned graphs
------------------------------------------------
DeepRobust provides a series of defense methods that aim to enhance the robustness
of GNNs.

Victim Models:

- :class:`deeprobust.graph.defense.GCN`
- :class:`deeprobust.graph.defense.GAT`
- :class:`deeprobust.graph.defense.ChebNet`
- :class:`deeprobust.graph.defense.SGC`

Node Embedding Victim Models: (see more details `here <https://deeprobust.readthedocs.io/en/latest/graph/node_embedding.html>`_)

- :class:`deeprobust.graph.defense.DeepWalk`
- :class:`deeprobust.graph.defense.Node2Vec`

Defense Methods:

- :class:`deeprobust.graph.defense.GCNJaccard`
- :class:`deeprobust.graph.defense.GCNSVD`
- :class:`deeprobust.graph.defense.ProGNN`
- :class:`deeprobust.graph.defense.RGCN`
- :class:`deeprobust.graph.defense.SimPGCN`
- :class:`deeprobust.graph.defense.AdvTraining`

#. Load pre-attacked graph data

   .. code-block:: python

       from deeprobust.graph.data import Dataset, PrePtbDataset
       # load the prognn splits by using setting='prognn'
       # because the attacked graphs are generated under prognn splits
       data = Dataset(root='/tmp/', name='cora', setting='prognn')

       adj, features, labels = data.adj, data.features, data.labels
       idx_train, idx_val, idx_test = data.idx_train, data.idx_val, data.idx_test
       # Load meta attacked data
       perturbed_data = PrePtbDataset(root='/tmp/',
                                      name='cora',
                                      attack_method='meta',
                                      ptb_rate=0.05)
       perturbed_adj = perturbed_data.adj

#. You can also choose to load graphs attacked by nettack. See details `here <https://deeprobust.readthedocs.io/en/latest/graph/data.html#attacked-graphs-for-node-classification>`_

   .. code-block:: python

       # Load nettack attacked data
       perturbed_data = PrePtbDataset(root='/tmp/', name='cora',
                                      attack_method='nettack',
                                      ptb_rate=3.0) # here ptb_rate means the number of perturbations per node
       perturbed_adj = perturbed_data.adj
       idx_test = perturbed_data.target_nodes

#. Train a victim model (GCN) on the clean/poisoned graph

   .. code-block:: python

       from deeprobust.graph.defense import GCN
       gcn = GCN(nfeat=features.shape[1],
                 nhid=16,
                 nclass=labels.max().item() + 1,
                 dropout=0.5, device='cpu')
       gcn = gcn.to('cpu')
       gcn.fit(features, adj, labels, idx_train, idx_val) # train on clean graph with earlystopping
       gcn.test(idx_test)

       gcn.fit(features, perturbed_adj, labels, idx_train, idx_val) # train on poisoned graph
       gcn.test(idx_test)

#. Train defense models (GCN-Jaccard, RGCN, ProGNN) on the poisoned graph; a GCNSVD sketch follows the two blocks below.

   .. code-block:: python

       from deeprobust.graph.defense import GCNJaccard
       model = GCNJaccard(nfeat=features.shape[1],
                          nhid=16,
                          nclass=labels.max().item() + 1,
                          dropout=0.5, device='cpu').to('cpu')
       model.fit(features, perturbed_adj, labels, idx_train, idx_val, threshold=0.03)
       model.test(idx_test)

   .. code-block:: python

       from deeprobust.graph.defense import RGCN
       model = RGCN(nnodes=perturbed_adj.shape[0], nfeat=features.shape[1],
                    nclass=labels.max()+1, nhid=32, device='cpu')
       model.fit(features, perturbed_adj, labels, idx_train, idx_val,
                 train_iters=200, verbose=True)
       model.test(idx_test)
|
100 |
+
|
101 |
+
|
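:class:`deeprobust.graph.defense.GCNSVD` follows the same pattern: it
preprocesses the perturbed adjacency matrix with a low-rank (truncated SVD)
approximation before training a GCN. A minimal sketch; the rank argument
:obj:`k` here is an assumption for illustration:

.. code-block:: python

   from deeprobust.graph.defense import GCNSVD
   model = GCNSVD(nfeat=features.shape[1],
                  nhid=16,
                  nclass=labels.max().item() + 1,
                  dropout=0.5, device='cpu').to('cpu')
   # k is the rank of the truncated SVD used to clean the adjacency matrix
   model.fit(features, perturbed_adj, labels, idx_train, idx_val, k=15)
   model.test(idx_test)
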
For details on training ProGNN, please refer to `this page <https://github.com/ChandlerBang/Pro-GNN/blob/master/train.py>`_.

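As a rough sketch of how ProGNN is typically driven (hedged: the
hyperparameter fields below mirror the linked training script and are
illustrative, not authoritative):

.. code-block:: python

   from types import SimpleNamespace
   from deeprobust.graph.defense import ProGNN

   # illustrative hyperparameters; see the linked train.py for real values
   args = SimpleNamespace(debug=False, only_gcn=False, lr=0.01,
                          weight_decay=5e-4, alpha=5e-4, beta=1.5, gamma=1,
                          lambda_=0, phi=0, epochs=200, lr_adj=0.01,
                          symmetric=False, inner_steps=2, outer_steps=1)
   # ProGNN wraps a backbone GNN and jointly learns a cleaned graph structure
   prognn = ProGNN(gcn, args, device='cpu')
   prognn.fit(features, perturbed_adj, labels, idx_train, idx_val)
   prognn.test(features, labels, idx_test)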

More Examples
-----------------------
More examples can be found in :class:`deeprobust.graph.defense`. You can also find examples in
`github code examples <https://github.com/DSE-MSU/DeepRobust/tree/master/examples/graph>`_
and more details in the `defense table <https://github.com/DSE-MSU/DeepRobust/tree/master/deeprobust/graph#defense-methods>`_.
docs/graph/node_embedding.rst
ADDED
@@ -0,0 +1,110 @@
Node Embedding Attack and Defense
==================================
In this section, we introduce the node embedding attack algorithms and
corresponding victim models provided in DeepRobust.

.. contents::
    :local:


Node Embedding Attack
-----------------------
Node embedding attacks aim to fool node embedding models into producing
low-quality embeddings. Specifically, DeepRobust provides the following
node embedding attack algorithms:

- :class:`deeprobust.graph.global_attack.NodeEmbeddingAttack`
- :class:`deeprobust.graph.global_attack.OtherNodeEmbeddingAttack`

They take only the adjacency matrix as input, in the format of
:obj:`scipy.sparse.csr_matrix`. You can specify :obj:`attack_type` to either
add or remove edges. Let's take a look at an example; the perturbation
budget itself is controlled by :obj:`n_perturbations`, as shown in the
sketch after this block:

.. code-block:: python

    from deeprobust.graph.data import Dataset
    from deeprobust.graph.global_attack import NodeEmbeddingAttack
    data = Dataset(root='/tmp/', name='cora_ml', seed=15)
    adj, features, labels = data.adj, data.features, data.labels
    model = NodeEmbeddingAttack()
    model.attack(adj, attack_type="remove")
    modified_adj = model.modified_adj
    model.attack(adj, attack_type="remove", min_span_tree=True)
    modified_adj = model.modified_adj
    model.attack(adj, attack_type="add", n_candidates=10000)
    modified_adj = model.modified_adj
    model.attack(adj, attack_type="add_by_remove", n_candidates=10000)
    modified_adj = model.modified_adj

+
The :obj:`OtherNodeEmbeddingAttack` contains the baseline methods reported in the paper
|
39 |
+
Adversarial Attacks on Node Embeddings via Graph Poisoning. Aleksandar Bojchevski and
|
40 |
+
Stephan Günnemann, ICML 2019. We can specify the type (chosen from
|
41 |
+
`["degree", "eigencentrality", "random"]`) to generate corresponding attacks.
|
42 |
+
|
43 |
+
.. code-block:: python
|
44 |
+
|
45 |
+
from deeprobust.graph.data import Dataset
|
46 |
+
from deeprobust.graph.global_attack import OtherNodeEmbeddingAttack
|
47 |
+
data = Dataset(root='/tmp/', name='cora_ml', seed=15)
|
48 |
+
adj, features, labels = data.adj, data.features, data.labels
|
49 |
+
model = OtherNodeEmbeddingAttack(type='degree')
|
50 |
+
model.attack(adj, attack_type="remove")
|
51 |
+
modified_adj = model.modified_adj
|
52 |
+
#
|
53 |
+
model = OtherNodeEmbeddingAttack(type='eigencentrality')
|
54 |
+
model.attack(adj, attack_type="remove")
|
55 |
+
modified_adj = model.modified_adj
|
56 |
+
#
|
57 |
+
model = OtherNodeEmbeddingAttack(type='random')
|
58 |
+
model.attack(adj, attack_type="add", n_candidates=10000)
|
59 |
+
modified_adj = model.modified_adj
|
60 |
+
|
61 |
+
Node Embedding Victim Models
|
62 |
+
-----------------------
|
63 |
+
DeepRobust provides two node embedding victim models, DeepWalk and Node2Vec:
|
64 |
+
|
65 |
+
- :class:`deeprobust.graph.defense.DeepWalk`
|
66 |
+
- :class:`deeprobust.graph.defense.Node2Vec`
|
67 |
+
|
68 |
+
There are three major functions in the two classes: :obj:`fit()`, :obj:`evaluate_node_classification()`
|
69 |
+
and :obj:`evaluate_link_prediction`. The function :obj:`fit()` will train the node embdding models
|
70 |
+
and store the embedding in :obj:`self.embedding`. For example,
|
71 |
+
|
72 |
+
.. code-block:: python
|
73 |
+
|
74 |
+
from deeprobust.graph.data import Dataset
|
75 |
+
from deeprobust.graph.defense import DeepWalk
|
76 |
+
from deeprobust.graph.global_attack import NodeEmbeddingAttack
|
77 |
+
import numpy as np
|
78 |
+
|
79 |
+
dataset_str = 'cora_ml'
|
80 |
+
data = Dataset(root='/tmp/', name=dataset_str, seed=15)
|
81 |
+
adj, features, labels = data.adj, data.features, data.labels
|
82 |
+
idx_train, idx_val, idx_test = data.idx_train, data.idx_val, data.idx_test
|
83 |
+
|
84 |
+
print("Test DeepWalk on clean graph")
|
85 |
+
model = DeepWalk(type="skipgram")
|
86 |
+
model.fit(adj)
|
87 |
+
print(model.embedding)
|
88 |
+
|
89 |
+
After we trained the model, we can then test its performance on node classification and link prediction:
|
90 |
+
|
91 |
+
.. code-block:: python
|
92 |
+
|
93 |
+
print("Test DeepWalk on node classification...")
|
94 |
+
# model.evaluate_node_classification(labels, idx_train, idx_test, lr_params={"max_iter": 1000})
|
95 |
+
model.evaluate_node_classification(labels, idx_train, idx_test)
|
96 |
+
print("Test DeepWalk on link prediciton...")
|
97 |
+
model.evaluate_link_prediction(adj, np.array(adj.nonzero()).T)
|
98 |
+
|
99 |
+
We can then test its performance on the attacked graph:
|
100 |
+
|
101 |
+
.. code-block:: python
|
102 |
+
|
103 |
+
# set up the attack model
|
104 |
+
attacker = NodeEmbeddingAttack()
|
105 |
+
attacker.attack(adj, attack_type="remove", n_perturbations=1000)
|
106 |
+
modified_adj = attacker.modified_adj
|
107 |
+
print("Test DeepWalk on attacked graph")
|
108 |
+
model.fit(modified_adj)
|
109 |
+
model.evaluate_node_classification(labels, idx_train, idx_test)
|
110 |
+
|
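
Node2Vec can be evaluated in the same fashion. A minimal sketch, assuming
its interface mirrors DeepWalk's :obj:`fit()` and
:obj:`evaluate_node_classification()`:

.. code-block:: python

    from deeprobust.graph.defense import Node2Vec

    print("Test Node2Vec on clean graph")
    model = Node2Vec()
    model.fit(adj)
    model.evaluate_node_classification(labels, idx_train, idx_test)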