3. pahelix.networks

3.1. gnn_block

Blocks for Graph Neural Networks (GNN).
pahelix.networks.gnn_block.gat_layer(gw, feature, edge_features, hidden_size, act, name, num_heads=1, feat_drop=0.1, attn_drop=0.1, is_test=False)[source]

Implementation of graph attention networks (GAT)

Parameters
  • gw (GraphWrapper) – pgl graph wrapper object.

  • feature (tensor) – node features with shape (num_nodes, feature_size).

  • edge_features (tensor) – edges features with shape (num_edges, feature_size).

  • hidden_size (int) – the hidden size for gat.

  • act (str) – the activation for the output.

  • name (str) – the prefix of layer param names.

  • num_heads (int) – the number of attention heads in gat.

  • feat_drop – dropout rate for the feature.

  • attn_drop – dropout rate for the attention.

  • is_test – whether in test phase.
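
A minimal usage sketch (not taken from the library's examples): it assumes PGL 1.x's static-graph GraphWrapper, and the feature name ("feat") and sizes are illustrative only; the exact GraphWrapper constructor may differ between PGL versions.

    import pgl
    from pahelix.networks.gnn_block import gat_layer

    FEAT_SIZE = 64  # illustrative node/edge feature size

    # Graph placeholders in PGL 1.x static-graph style (constructor details
    # may vary across PGL versions).
    gw = pgl.graph_wrapper.GraphWrapper(
            name="graph",
            node_feat=[("feat", [None, FEAT_SIZE], "float32")],
            edge_feat=[("feat", [None, FEAT_SIZE], "float32")])

    node_repr = gat_layer(
            gw,
            gw.node_feat["feat"],
            gw.edge_feat["feat"],
            hidden_size=128,
            act="relu",
            name="gat_0",
            num_heads=4,
            feat_drop=0.1,
            attn_drop=0.1,
            is_test=False)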

pahelix.networks.gnn_block.gcn_layer(gw, feature, edge_features, act, name)[source]

Implementation of graph convolutional neural networks (GCN)

Parameters
  • gw (GraphWrapper) – pgl graph wrapper object.

  • feature (tensor) – node features with shape (num_nodes, feature_size).

  • edge_features (tensor) – edges features with shape (num_edges, feature_size).

  • hidden_size (int) – the hidden size for gcn.

  • act (str) – the activation for the output.

  • name (str) – the prefix of layer param names.

pahelix.networks.gnn_block.gin_layer(gw, node_features, edge_features, name)[source]

Implementation of Graph Isomorphism Network (GIN) layer.

Parameters
  • gw (GraphWrapper) – pgl graph wrapper object.

  • node_features (tensor) – node features with shape (num_nodes, feature_size).

  • edge_features (tensor) – edges features with shape (num_edges, feature_size).

  • name (str) – the prefix of layer param names.
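
For comparison, a hedged sketch calling gcn_layer and gin_layer on the same kind of graph wrapper as in the gat_layer sketch above; the gcn_layer call follows the signature shown here, and all names and sizes are illustrative.

    import pgl
    from pahelix.networks.gnn_block import gcn_layer, gin_layer

    FEAT_SIZE = 64  # illustrative feature size
    gw = pgl.graph_wrapper.GraphWrapper(
            name="graph",
            node_feat=[("feat", [None, FEAT_SIZE], "float32")],
            edge_feat=[("feat", [None, FEAT_SIZE], "float32")])

    # GCN layer (the parameter list above also mentions hidden_size; pass it
    # here if your version's signature requires it).
    gcn_repr = gcn_layer(
            gw,
            gw.node_feat["feat"],
            gw.edge_feat["feat"],
            act="relu",
            name="gcn_0")

    # GIN layer; the output size is determined inside the layer.
    gin_repr = gin_layer(
            gw,
            gw.node_feat["feat"],
            gw.edge_feat["feat"],
            name="gin_0")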

pahelix.networks.gnn_block.max_recv(feat)[source]

max pooling

pahelix.networks.gnn_block.mean_recv(feat)[source]

average pooling

pahelix.networks.gnn_block.random() → x in the interval [0, 1).

pahelix.networks.gnn_block.sum_recv(feat)[source]

sum pooling
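
These reduce functions are meant to be used as the reduce_function of PGL's message-passing recv step. A hedged sketch, assuming PGL 1.x's send/recv interface; the message function and feature name are illustrative.

    import pgl
    from pahelix.networks.gnn_block import max_recv, mean_recv, sum_recv

    FEAT_SIZE = 64  # illustrative feature size
    gw = pgl.graph_wrapper.GraphWrapper(
            name="graph",
            node_feat=[("h", [None, FEAT_SIZE], "float32")])

    def copy_send(src_feat, dst_feat, edge_feat):
        # Forward the source node feature as the message (illustrative).
        return src_feat["h"]

    msg = gw.send(copy_send, nfeat_list=[("h", gw.node_feat["h"])])
    pooled_sum = gw.recv(msg, sum_recv)    # sum pooling over incoming messages
    pooled_mean = gw.recv(msg, mean_recv)  # average pooling
    pooled_max = gw.recv(msg, max_recv)    # max pooling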

3.2. lstm_block

Lstm block.

pahelix.networks.lstm_block.lstm_encoder(input, hidden_size, n_layer=1, is_bidirectory=True, param_initializer=None, name='lstm')[source]

The encoder is composed of a stack of lstm layers.

Parameters
  • input – The input of lstm encoder.

  • hidden_size – The hidden size of lstm.

  • n_layer – The number of lstm layers.

  • is_bidirectory – True if the lstm is bidirectional.

  • param_initializer – The parameter initializer for lstm encoder.

  • name – The prefix of the parameters’ name in lstm encoder.

Returns

  • hidden – The hidden units of the lstm encoder.

  • checkpoints – The checkpoints for the recompute mechanism.
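
A hedged static-graph sketch of lstm_encoder, assuming a padded 3-D input of shape (batch, seq_len, emb_size) and that the two returned values are the (hidden, checkpoints) pair listed above; all shapes and names are illustrative.

    import paddle.fluid as fluid
    from pahelix.networks.lstm_block import lstm_encoder

    # Illustrative input: a batch of embedded sequences (batch, seq_len, emb_size).
    tokens = fluid.data(name="token_emb", shape=[None, 100, 128], dtype="float32")

    hidden, checkpoints = lstm_encoder(
            tokens,
            hidden_size=256,
            n_layer=2,
            is_bidirectory=True,  # note the parameter's spelling in this API
            name="lstm_enc")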

3.3. optimizer

class pahelix.networks.optimizer.AdamW(*args: Any, **kwargs: Any)[source]

AdamW object for dygraph

apply_optimize(loss, startup_program, params_grads)[source]

Update params with weight decay.
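
A hedged dygraph sketch. Since the constructor is only documented as *args/**kwargs here, the keyword arguments below (learning_rate, parameter_list, weight_decay) are assumptions borrowed from paddle.fluid's dygraph optimizers and should be checked against your version.

    import numpy as np
    import paddle.fluid as fluid
    from pahelix.networks.optimizer import AdamW

    with fluid.dygraph.guard():
        model = fluid.dygraph.Linear(128, 10)  # any dygraph Layer; illustrative

        # Hypothetical keyword arguments; verify the signature in your version.
        optimizer = AdamW(
                learning_rate=1e-3,
                parameter_list=model.parameters(),
                weight_decay=0.01)

        x = fluid.dygraph.to_variable(np.random.rand(4, 128).astype("float32"))
        loss = fluid.layers.reduce_mean(model(x))
        loss.backward()
        optimizer.minimize(loss)  # apply_optimize applies the weight-decay update
        model.clear_gradients()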

3.4. pre_post_process

pahelix.networks.pre_post_process.pre_post_process_layer(prev_out, out, process_cmd, dropout_rate=0.0, epsilon=1e-05, name='', is_test=False)[source]

Optionally add a residual connection, layer normalization and dropout to the out tensor, according to the value of process_cmd.

This will be used before or after multi-head attention and position-wise feed-forward networks.
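
A hedged sketch, assuming the usual PaddlePaddle Transformer convention for process_cmd in which each character selects one operation: "a" adds the residual (prev_out), "n" applies layer normalization, and "d" applies dropout; this matches the preprocess_cmd='n' / postprocess_cmd='da' defaults of transformer_encoder below. Shapes and names are illustrative.

    import paddle.fluid as fluid
    from pahelix.networks.pre_post_process import pre_post_process_layer

    # Illustrative input: (batch, seq_len, d_model).
    x = fluid.data(name="x", shape=[None, 100, 768], dtype="float32")

    # Pre-process before a sub-layer: layer normalization only ("n");
    # no residual is added, so prev_out can be None.
    normed = pre_post_process_layer(None, x, "n", epsilon=1e-5, name="pre_att_")

    # ... run the sub-layer (e.g. multi_head_attention) on `normed`;
    # a plain fc stands in for it here ...
    sublayer_out = fluid.layers.fc(normed, size=768, num_flatten_dims=2)

    # Post-process after the sub-layer: dropout, then residual add ("da").
    out = pre_post_process_layer(x, sublayer_out, "da",
                                 dropout_rate=0.1, name="post_att_")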

3.5. resnet_block

Resnet block.

pahelix.networks.resnet_block.resnet_encoder(input, hidden_size, n_layer=1, filter_size=3, act='gelu', epsilon=1e-06, param_initializer=None, name='resnet')[source]

The encoder is composed of a stack of resnet layers.

Parameters
  • input – The input of resnet encoder.

  • hidden_size – The hidden size of resnet.

  • n_layer – The number of resnet layers.

  • act – The activation function.

  • param_initializer – The parameter initializer for resnet encoder.

  • name – The prefix of the parameters’ name in resnet encoder.

Returns

  • hidden – The hidden units of the resnet encoder.

  • checkpoints – The checkpoints for the recompute mechanism.
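
A hedged static-graph sketch of resnet_encoder, assuming a padded 3-D input of shape (batch, seq_len, emb_size) and the (hidden, checkpoints) return pair listed above; sizes are illustrative.

    import paddle.fluid as fluid
    from pahelix.networks.resnet_block import resnet_encoder

    # Illustrative input: (batch, seq_len, emb_size).
    seq = fluid.data(name="seq_emb", shape=[None, 100, 128], dtype="float32")

    hidden, checkpoints = resnet_encoder(
            seq,
            hidden_size=256,
            n_layer=4,
            filter_size=3,
            act="gelu",
            name="resnet_enc")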

3.6. transformer_block

Transformer block.

pahelix.networks.transformer_block.multi_head_attention(queries, keys, values, attn_bias, d_key, d_value, d_model, n_head=1, dropout_rate=0.0, cache=None, gather_idx=None, store=False, param_initializer=None, name='multi_head_att', is_test=False)[source]

Multi-Head Attention.

Note that attn_bias is added to the attention logits before the softmax activation is computed, masking selected positions so that they are not considered in the attention weights.
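
A common way to build such an attn_bias is from a padding mask, following the convention of PaddlePaddle's ERNIE-style Transformer code: positions to be masked receive a large negative bias so their softmax weight is effectively zero. A hedged sketch; the shapes, n_head, and the -10000 constant are illustrative assumptions.

    import paddle.fluid as fluid
    from pahelix.networks.transformer_block import multi_head_attention

    n_head, d_model = 12, 768
    # input_mask: (batch, seq_len, 1), 1.0 for real tokens, 0.0 for padding.
    input_mask = fluid.data(name="input_mask", shape=[None, 100, 1], dtype="float32")

    # (batch, seq_len, seq_len): 1.0 where both query and key positions are real.
    self_attn_mask = fluid.layers.matmul(x=input_mask, y=input_mask, transpose_y=True)
    # Map 1.0 -> 0.0 (keep) and 0.0 -> -10000.0 (mask out).
    self_attn_mask = fluid.layers.scale(
            x=self_attn_mask, scale=10000.0, bias=-1.0, bias_after_scale=False)
    # Replicate per attention head: (batch, n_head, seq_len, seq_len).
    attn_bias = fluid.layers.stack(x=[self_attn_mask] * n_head, axis=1)
    attn_bias.stop_gradient = True

    # Self-attention: queries, keys and values are the same tensor.
    q = fluid.data(name="q", shape=[None, 100, d_model], dtype="float32")
    attn_out = multi_head_attention(
            q, q, q, attn_bias,
            d_key=d_model // n_head, d_value=d_model // n_head,
            d_model=d_model, n_head=n_head,
            dropout_rate=0.1, name="self_att")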

pahelix.networks.transformer_block.positionwise_feed_forward(x, d_inner_hid, d_hid, dropout_rate, hidden_act, param_initializer=None, name='ffn', is_test=False)[source]

Position-wise Feed-Forward Networks.

This module consists of two linear transformations with an activation (given by hidden_act) in between, applied to each position separately and identically.
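
A hedged call sketch with BERT-base-like sizes; the shapes and hyper-parameters are illustrative only.

    import paddle.fluid as fluid
    from pahelix.networks.transformer_block import positionwise_feed_forward

    x = fluid.data(name="ffn_in", shape=[None, 100, 768], dtype="float32")
    ffn_out = positionwise_feed_forward(
            x,
            d_inner_hid=3072,
            d_hid=768,
            dropout_rate=0.1,
            hidden_act="gelu",
            name="ffn_0")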

pahelix.networks.transformer_block.transformer_encoder(enc_input, attn_bias, n_layer, n_head, d_key, d_value, d_model, d_inner_hid, prepostprocess_dropout, attention_dropout, act_dropout, hidden_act, preprocess_cmd='n', postprocess_cmd='da', param_initializer=None, name='', epsilon=1e-05, n_layer_per_block=1, param_share='normal', caches=None, gather_idx=None, store=False, is_test=False)[source]

The encoder is composed of a stack of identical layers returned by calling transformer_encoder_layer.

pahelix.networks.transformer_block.transformer_encoder_layer(input, attn_bias, n_head, d_key, d_value, d_model, d_inner_hid, prepostprocess_dropout, attention_dropout, act_dropout, hidden_act, preprocess_cmd='n', postprocess_cmd='da', param_initializer=None, name='', epsilon=1e-05, cache=None, gather_idx=None, store=False, is_test=False)[source]

The encoder layers that can be stacked to form a deep encoder.

This module consists of a multi-head (self-)attention sub-layer followed by a position-wise feed-forward network, both wrapped with the pre_process_layer / post_process_layer to add residual connections, layer normalization and dropout.
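
Putting the pieces together, a hedged sketch of transformer_encoder with BERT-base-like hyper-parameters; attn_bias can be built as in the multi_head_attention sketch above, and all sizes are illustrative.

    import paddle.fluid as fluid
    from pahelix.networks.transformer_block import transformer_encoder

    d_model, n_head, seq_len = 768, 12, 100
    enc_input = fluid.data(name="enc_input", shape=[None, seq_len, d_model],
                           dtype="float32")
    attn_bias = fluid.data(name="attn_bias", shape=[None, n_head, seq_len, seq_len],
                           dtype="float32")

    # Depending on the version, this may also return recompute checkpoints.
    enc_out = transformer_encoder(
            enc_input,
            attn_bias,
            n_layer=12,
            n_head=n_head,
            d_key=d_model // n_head,
            d_value=d_model // n_head,
            d_model=d_model,
            d_inner_hid=d_model * 4,
            prepostprocess_dropout=0.1,
            attention_dropout=0.1,
            act_dropout=0.1,
            hidden_act="gelu",
            preprocess_cmd="n",
            postprocess_cmd="da",
            name="encoder")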