1. pahelix.featurizers ¶

1.1. featurizer ¶

Compound datasets from pretrain-gnn.

class pahelix.featurizers.featurizer.Featurizer[source]¶

This is an abstract class for feature extraction.

It works in two steps:

First, gen_features converts a single raw_data item into a single data item.

Second, collate_fn aggregates a list of data items into one batch.

collate_fn(batch_data_list)[source]¶

Aggregate batch_data_list into a single batch.

Parameters: batch_data_list (list) – a list of data items generated by gen_features.
Returns: a dict of numpy ndarrays.
Return type: batch_data(dict)

gen_features(raw_data)[source]¶

Convert raw_data into data, usually by feature extraction. Return None if the conversion fails.

Parameters: raw_data – can be any type.
Returns: a single data item of any user-defined type, or None if the conversion fails.
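The two-step contract above can be illustrated with a toy subclass. This is only a minimal sketch: the `Featurizer` base class here is a local stand-in that mirrors the two documented methods (it is not imported from pahelix), and `LengthFeaturizer`, the `codes` and `length` keys, and the zero-padding strategy are all hypothetical choices made for illustration.

```python
import numpy as np


class Featurizer:
    """Local stand-in for the abstract class (assumption: only these two methods matter)."""

    def gen_features(self, raw_data):
        raise NotImplementedError

    def collate_fn(self, batch_data_list):
        raise NotImplementedError


class LengthFeaturizer(Featurizer):
    """Hypothetical toy featurizer: a string becomes character codes plus its length."""

    def gen_features(self, raw_data):
        # Return None on failure, as the documented contract describes.
        if not isinstance(raw_data, str) or not raw_data:
            return None
        codes = np.array([ord(c) for c in raw_data], dtype=np.int64)
        return {"codes": codes, "length": np.array([len(raw_data)], dtype=np.int64)}

    def collate_fn(self, batch_data_list):
        # Zero-pad every sample to the longest one, then stack into one batch.
        max_len = max(len(d["codes"]) for d in batch_data_list)
        padded = np.zeros((len(batch_data_list), max_len), dtype=np.int64)
        for i, d in enumerate(batch_data_list):
            padded[i, : len(d["codes"])] = d["codes"]
        lengths = np.concatenate([d["length"] for d in batch_data_list])
        return {"codes": padded, "length": lengths}


featurizer = LengthFeaturizer()
samples = [featurizer.gen_features(s) for s in ["CCO", "C", ""]]
samples = [s for s in samples if s is not None]  # drop failed conversions
batch = featurizer.collate_fn(samples)
```

Filtering out `None` before calling `collate_fn` is a design choice of this sketch; the documentation only states that `gen_features` returns `None` on failure, not who is responsible for discarding such items.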

1.2. pretrain_gnn_featurizer ¶

Featurizers for pretrain-gnn.

Adapted from https://github.com/snap-stanford/pretrain-gnns/tree/master/chem/utils.py

class pahelix.featurizers.pretrain_gnn_featurizer.PreGNNAttrMaskFeaturizer(graph_wrapper, atom_type_num=None, mask_ratio=None)[source]¶

Featurizer for the attribute-masking model of pretrain-gnns.

collate_fn(batch_data_list)[source]¶

Aggregate a list of graph data into a single batch.

gen_features(raw_data)[source]¶

Convert SMILES into graph data.

Returns: a dict of numpy ndarrays consisting of graph features.
Return type: data(dict)

class pahelix.featurizers.pretrain_gnn_featurizer.PreGNNContextPredFeaturizer(substruct_graph_wrapper, context_graph_wrapper, k, l1, l2)[source]¶

Featurizer for the context-prediction model of pretrain-gnns.

collate_fn(batch_data_list)[source]¶

Aggregate a list of graph data into a single batch.

gen_features(raw_data)[source]¶

Convert SMILES into graph data.

Returns: a dict of numpy ndarrays consisting of graph features.
Return type: data(dict)

class pahelix.featurizers.pretrain_gnn_featurizer.PreGNNSupervisedFeaturizer(graph_wrapper)[source]¶

Featurizer for the supervised model of pretrain-gnns.

collate_fn(batch_data_list)[source]¶

Aggregate a list of graph data into a single batch.

gen_features(raw_data)[source]¶

Convert SMILES into graph data.

Returns: a dict of numpy ndarrays consisting of graph features.
Return type: data(dict)

pahelix.featurizers.pretrain_gnn_featurizer.graph_data_obj_to_nx_simple(data)[source]¶

Converts a graph data object into a NetworkX graph object.

NB: Uses simplified atom and bond features, represented as indices.

NB: Relative stereochemistry may not be recapitulated correctly, since the edges in the nx object are unordered.

Parameters: data (dict) – a dict of numpy ndarrays consisting of graph features.
Returns: a NetworkX graph object
Return type: G

pahelix.featurizers.pretrain_gnn_featurizer.nx_to_graph_data_obj_simple(G)[source]¶

Converts an nx graph to graph data. Assumes node indices are numbered from 0 to num_nodes - 1.

NB: Uses simplified atom and bond features, represented as indices.

NB: Relative stereochemistry may not be recapitulated correctly, since the edges in the nx object are unordered.

Parameters: G – nx graph object
Returns: a dict of numpy ndarrays consisting of graph features.
Return type: data(dict)

pahelix.featurizers.pretrain_gnn_featurizer.reset_idxes(G)[source]¶

Resets node indices so that they are numbered from 0 to num_nodes - 1.

Parameters: G – NetworkX graph object.
Returns: copy of G with relabelled node indices. mapping:
Return type: new_G
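The relabelling idea can be sketched without any graph library. The example below is an illustration only, not the pahelix implementation: it uses a plain adjacency dict in place of a NetworkX graph, the name `reset_idxes_sketch` is hypothetical, and returning the old-to-new mapping alongside the new graph is an assumption based on the truncated "mapping:" in the Returns line.

```python
def reset_idxes_sketch(adjacency):
    """Relabel nodes to 0..num_nodes-1.

    Returns a relabelled copy of the graph and the old -> new index mapping
    (returning the mapping is an assumption of this sketch).
    """
    # New indices follow the iteration order of the input nodes.
    mapping = {old: new for new, old in enumerate(adjacency)}
    new_adjacency = {
        mapping[node]: sorted(mapping[nbr] for nbr in neighbours)
        for node, neighbours in adjacency.items()
    }
    return new_adjacency, mapping


# A three-node path 7 - 3 - 9 with arbitrary node labels.
graph = {7: [3], 3: [7, 9], 9: [3]}
new_graph, mapping = reset_idxes_sketch(graph)
```

The input dict is left untouched, matching the documented behaviour of returning a copy rather than relabelling in place.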

pahelix.featurizers.pretrain_gnn_featurizer.transform_contextpred(data, k, l1, l2)[source]¶

Randomly selects a node from the data object and adds attributes containing the substructure formed by the k-hop neighbours rooted at that node, and the context substructure formed by the subgraph between l1 and l2 hops away from the root node.
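The roles of k, l1 and l2 can be made concrete with a small breadth-first search. This is a hedged sketch rather than the library's code: it works on a plain adjacency dict, the helper names `hop_distances` and `substruct_and_context` are hypothetical, the root is passed in explicitly instead of being chosen at random, and treating both l1 and l2 as inclusive bounds is an assumption (the actual boundary handling may differ).

```python
from collections import deque


def hop_distances(adjacency, root, cutoff):
    """Breadth-first search: map each node within `cutoff` hops of `root` to its hop distance."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if dist[node] == cutoff:
            continue  # do not expand beyond the cutoff
        for nbr in adjacency[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def substruct_and_context(adjacency, root, k, l1, l2):
    """Return (substructure nodes, context nodes) around `root`."""
    dist = hop_distances(adjacency, root, max(k, l2))
    # Substructure: every node at most k hops from the root.
    substruct = {n for n, d in dist.items() if d <= k}
    # Context: nodes between l1 and l2 hops away (inclusive bounds are an assumption).
    context = {n for n, d in dist.items() if l1 <= d <= l2}
    return substruct, context


# Path graph 0 - 1 - 2 - 3 - 4 - 5, rooted at node 0.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
substruct, context = substruct_and_context(path, root=0, k=2, l1=1, l2=3)
```

With these settings the substructure and the context overlap on nodes 1 and 2; nodes shared between the two sets are what tie a substructure to its surrounding context.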

1.3. Helpful Link ¶

Please refer to our GitHub repo to see the whole module.
