1. pahelix.featurizers
1.1. featurizer
Compound datasets from pretrain-gnn.
- class pahelix.featurizers.featurizer.Featurizer
This is an abstract class for feature extraction.
It works in two steps:
- First, gen_features converts a single raw_data record into a single data item.
- Second, collate_fn aggregates a list of data items into one batch.
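The two-step pattern can be sketched as follows. This is a minimal, self-contained illustration and not the pahelix implementation: SketchFeaturizer is a hypothetical stand-in for the abstract class, and SmilesLengthFeaturizer, the "smiles" key and the "length" feature are made up for the example. Only the method names gen_features and collate_fn come from the description above.

```python
from abc import ABC, abstractmethod


class SketchFeaturizer(ABC):
    """Hypothetical stand-in for the two-step interface described above."""

    @abstractmethod
    def gen_features(self, raw_data):
        """Convert one raw record into one data item."""

    @abstractmethod
    def collate_fn(self, data_list):
        """Aggregate a list of data items into one batch."""


class SmilesLengthFeaturizer(SketchFeaturizer):
    """Toy subclass: the only 'feature' is the length of a SMILES string."""

    def gen_features(self, raw_data):
        smiles = raw_data["smiles"]
        return {"smiles": smiles, "length": len(smiles)}

    def collate_fn(self, data_list):
        # Turn a list of per-sample dicts into one dict of lists.
        return {
            "smiles": [d["smiles"] for d in data_list],
            "length": [d["length"] for d in data_list],
        }


featurizer = SmilesLengthFeaturizer()
raw_records = [{"smiles": "CCO"}, {"smiles": "c1ccccc1"}]

# Step 1: one raw record in, one data item out.
data_list = [featurizer.gen_features(r) for r in raw_records]
# Step 2: many data items in, one batch out.
batch = featurizer.collate_fn(data_list)
print(batch["length"])  # [3, 8]
```

Keeping the per-sample step separate from the batching step means gen_features can be run once over a dataset and cached, while collate_fn is applied each time a batch is drawn.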
1.2. pretrain_gnn_featurizer
- class pahelix.featurizers.pretrain_gnn_featurizer.PreGNNAttrMaskFeaturizer(graph_wrapper, atom_type_num=None, mask_ratio=None)
Featurizer for the attribute masking model of pretrain GNNs.
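As a rough illustration of what attribute masking involves, the sketch below hides a random fraction of atom types and keeps the originals as prediction targets. It is an assumption-laden example and not the code of this class: mask_atom_types is a hypothetical helper, and using atom_type_num itself as the mask token, as well as masking at least one atom, are choices made only for this sketch. The parameter names atom_type_num and mask_ratio are borrowed from the signature above.

```python
import random


def mask_atom_types(atom_types, atom_type_num, mask_ratio, seed=None):
    """Hypothetical helper: mask a random subset of atom type indices.

    Returns (masked_types, masked_idxes, labels), where labels holds the
    original types of the masked atoms.
    """
    num_atoms = len(atom_types)
    if num_atoms == 0:
        return [], [], []

    rng = random.Random(seed)
    # Assumption: always mask at least one atom.
    num_masked = max(1, int(num_atoms * mask_ratio))
    masked_idxes = sorted(rng.sample(range(num_atoms), num_masked))

    # Assumption: the mask token is one index past the last real atom type.
    mask_token = atom_type_num

    masked_types = list(atom_types)
    for i in masked_idxes:
        masked_types[i] = mask_token
    labels = [atom_types[i] for i in masked_idxes]
    return masked_types, masked_idxes, labels


original = [5, 6, 7, 5, 5, 7, 5, 5]
masked, idxes, labels = mask_atom_types(
    original, atom_type_num=119, mask_ratio=0.25, seed=0
)
print(len(idxes))  # 2
```

A model trained on this task receives masked and is asked to recover labels at the positions in idxes.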
- class pahelix.featurizers.pretrain_gnn_featurizer.PreGNNContextPredFeaturizer(substruct_graph_wrapper, context_graph_wrapper, k, l1, l2)
Featurizer for the context prediction model of pretrain GNNs.
- class pahelix.featurizers.pretrain_gnn_featurizer.PreGNNSupervisedFeaturizer(graph_wrapper)
Featurizer for the supervised model of pretrain GNNs.
- pahelix.featurizers.pretrain_gnn_featurizer.graph_data_obj_to_nx_simple(data)
Converts a graph data object into a NetworkX graph object.
NB: Uses simplified atom and bond features, and represents them as indices.
NB: possible issues with recapitulating relative stereochemistry since the edges in the nx object are unordered.
- Parameters
data (dict) – a dict of NumPy ndarrays containing the graph features.
- Returns
a NetworkX graph object
- Return type
G
- pahelix.featurizers.pretrain_gnn_featurizer.nx_to_graph_data_obj_simple(G)
Converts an nx graph into a graph data object. Assumes node indices are numbered from 0 to num_nodes - 1.
NB: Uses simplified atom and bond features, and represents them as indices.
NB: possible issues with recapitulating relative stereochemistry since the edges in the nx object are unordered.
- Parameters
G – nx graph object
- Returns
a dict of NumPy ndarrays containing the graph features.
- Return type
data(dict)
- pahelix.featurizers.pretrain_gnn_featurizer.reset_idxes(G)
Resets node indices such that they are numbered from 0 to num_nodes - 1.
- Parameters
G – a NetworkX graph object.
- Returns
copy of G with relabelled node indices. mapping:
- Return type
new_G
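The relabelling idea behind reset_idxes can be sketched without any graph library. This is an illustrative sketch only: reset_node_indices is a hypothetical helper operating on a plain node list and edge list instead of a NetworkX object, and returning the old-to-new mapping alongside the relabelled graph is an assumption of the sketch.

```python
def reset_node_indices(nodes, edges):
    """Hypothetical helper: relabel arbitrary node ids to 0..num_nodes-1.

    Node order is preserved, so the i-th node in `nodes` becomes node i.
    """
    mapping = {old: new for new, old in enumerate(nodes)}
    new_nodes = list(range(len(nodes)))
    new_edges = [(mapping[u], mapping[v]) for u, v in edges]
    return new_nodes, new_edges, mapping


# A three-node fragment cut out of a larger graph keeps its old ids.
nodes = [4, 7, 9]
edges = [(4, 7), (7, 9)]

new_nodes, new_edges, mapping = reset_node_indices(nodes, edges)
print(new_edges)  # [(0, 1), (1, 2)]
print(mapping)    # {4: 0, 7: 1, 9: 2}
```

With an actual NetworkX graph, the same effect can be obtained by building such a mapping and passing it to networkx.relabel_nodes with copy=True. Contiguous indices matter because nx_to_graph_data_obj_simple assumes nodes are numbered from 0 to num_nodes - 1.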
- pahelix.featurizers.pretrain_gnn_featurizer.transform_contextpred(data, k, l1, l2)
Randomly selects a root node from the data object and adds two sets of attributes: the substructure, which is the k-hop neighbourhood rooted at that node, and the context substructure, which is the subgraph lying between l1 and l2 hops away from the root node.
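The two node sets can be illustrated with a breadth-first search over a plain adjacency dict. This is a sketch under stated assumptions, not the function's implementation: hop_distances and split_substruct_and_context are hypothetical helpers, the root is passed in explicitly instead of being sampled, and treating "between l1 and l2 hops" as l1 < distance <= l2 is an assumption about the boundary handling.

```python
from collections import deque


def hop_distances(adjacency, root, max_hops):
    """Breadth-first search returning {node: hops from root} up to max_hops."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nbr in adjacency[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def split_substruct_and_context(adjacency, root, k, l1, l2):
    """Hypothetical helper returning (substructure nodes, context nodes)."""
    dist = hop_distances(adjacency, root, max(k, l2))
    # Substructure: everything within k hops of the root.
    substruct = sorted(n for n, d in dist.items() if d <= k)
    # Context: assumed here to be the ring with l1 < distance <= l2.
    context = sorted(n for n, d in dist.items() if l1 < d <= l2)
    return substruct, context


# Path graph 0 - 1 - 2 - 3 - 4 - 5, rooted at node 0.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}

substruct, context = split_substruct_and_context(adjacency, root=0, k=2, l1=1, l2=3)
print(substruct)  # [0, 1, 2]
print(context)    # [2, 3]
```

In this example node 2 belongs to both sets, which shows that the substructure and its context may overlap when l1 is smaller than k.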
1.3. Helpful Link
Please refer to our GitHub repo to see the whole module.