2. pahelix.model_zoo

2.1. pretrain_gnns_model

This is an implementation of pretrain gnns (Strategies for Pre-training Graph Neural Networks): https://arxiv.org/abs/1905.12265

class pahelix.model_zoo.pretrain_gnns_model.AttrmaskModel(*args: Any, **kwargs: Any)[source]

This is a pretraining model used by pretrain gnns for attribute-mask training.

Returns:

the loss variable of the model.

Return type:

loss

forward(graphs, masked_node_indice, masked_node_labels)[source]

Build the network and compute the attribute-mask prediction loss.
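
A minimal usage sketch. The constructor argument order (config dict plus a compound encoder, i.e. a PretrainGNNModel, documented below) follows the pattern in the repo's pretrain_gnns app but is an assumption here, and the config path and graph construction are illustrative only::

    import json
    from pahelix.model_zoo.pretrain_gnns_model import PretrainGNNModel, AttrmaskModel

    # Illustrative path; the pretrain_gnns app ships ready-made JSON configs.
    model_config = json.load(open('model_configs/pregnn_paper.json'))

    compound_encoder = PretrainGNNModel(model_config)
    model = AttrmaskModel(model_config, compound_encoder)  # argument order is an assumption

    # `graphs` is a batched pgl.Graph produced by the repo's featurizers
    # (construction elided); masked_node_indice / masked_node_labels mark
    # which atoms were masked and their true types.
    # loss = model(graphs, masked_node_indice, masked_node_labels)
    # loss.backward()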

class pahelix.model_zoo.pretrain_gnns_model.PretrainGNNModel(*args: Any, **kwargs: Any)[source]

The basic GNN Model used in pretrain gnns.

Parameters:

model_config (dict) – a dict of model configurations.

forward(graph)[source]

Build the network and compute node and graph representations.

property graph_dim

The output dimension of graph_repr.

property node_dim

The output dimension of node_repr.
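
A construction sketch. The config path is illustrative, and the assumption that forward(graph) yields a (node_repr, graph_repr) pair is inferred from the two properties above::

    import json
    from pahelix.model_zoo.pretrain_gnns_model import PretrainGNNModel

    model_config = json.load(open('model_configs/pregnn_paper.json'))  # illustrative path
    encoder = PretrainGNNModel(model_config)

    # Report the widths of node_repr and graph_repr, e.g. for sizing a
    # downstream prediction head.
    print(encoder.node_dim, encoder.graph_dim)

    # node_repr, graph_repr = encoder(graph)  # graph: a (batched) pgl.Graph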

class pahelix.model_zoo.pretrain_gnns_model.SupervisedModel(*args: Any, **kwargs: Any)[source]

This is a pretraining model used by pretrain gnns for supervised training.

Returns:

the loss variable of the model.

Return type:

loss

forward(graphs, labels, valids)[source]

Build the network and compute the supervised training loss.
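
A supervised-training sketch under the same assumptions as the AttrmaskModel example (constructor pattern from the repo's pretrain_gnns app; data construction elided)::

    from pahelix.model_zoo.pretrain_gnns_model import PretrainGNNModel, SupervisedModel

    compound_encoder = PretrainGNNModel(model_config)        # model_config as above
    model = SupervisedModel(model_config, compound_encoder)  # argument order is an assumption

    # labels: per-task targets; valids: 0/1 mask marking which task labels
    # are actually present, so missing labels do not contribute to the loss.
    # loss = model(graphs, labels, valids)
    # loss.backward()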

2.2. protein_sequence_model

Sequence-based models for proteins.

class pahelix.model_zoo.protein_sequence_model.LstmEncoderModel(vocab_size, emb_dim=128, hidden_size=1024, n_layers=3, padding_idx=0, epsilon=1e-05, dropout_rate=0.1)[source]

forward(input, pos)[source]

class pahelix.model_zoo.protein_sequence_model.ResnetEncoderModel(vocab_size, emb_dim=128, hidden_size=256, kernel_size=9, n_layers=35, padding_idx=0, dropout_rate=0.1, epsilon=1e-06)[source]

forward(input, pos)[source]

init_weights(layer)[source]

Initialization hook

class pahelix.model_zoo.protein_sequence_model.TransformerEncoderModel(vocab_size, emb_dim=512, hidden_size=512, n_layers=8, n_heads=8, padding_idx=0, dropout_rate=0.1)[source]

forward(input, pos)[source]

init_weights(layer)[source]

Initialization hook
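
All three encoders share the forward(input, pos) interface. A toy sketch of running the LSTM encoder; treating input as token ids and pos as position indices is an assumption, as is the vocabulary size::

    import paddle
    from pahelix.model_zoo.protein_sequence_model import LstmEncoderModel

    encoder = LstmEncoderModel(vocab_size=30, emb_dim=128, hidden_size=1024, n_layers=3)

    tokens = paddle.randint(1, 30, shape=[2, 100])          # batch of token ids (assumption)
    pos = paddle.arange(100).unsqueeze(0).expand([2, 100])  # position indices (assumption)
    seq_repr = encoder(tokens, pos)                         # per-residue representations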

class pahelix.model_zoo.protein_sequence_model.PretrainTaskModel(class_num, model_config, encoder_model)[source]

forward(input, pos)[source]

class pahelix.model_zoo.protein_sequence_model.SeqClassificationTaskModel(class_num, model_config, encoder_model)[source]

forward(input, pos)[source]

class pahelix.model_zoo.protein_sequence_model.ClassificationTaskModel(class_num, model_config, encoder_model)[source]

forward(input, pos)[source]

class pahelix.model_zoo.protein_sequence_model.RegressionTaskModel(model_config, encoder_model)[source]

forward(input, pos)[source]

class pahelix.model_zoo.protein_sequence_model.ProteinEncoderModel(model_config, name='')[source]

ProteinSequenceModel

forward(input, pos)[source]

class pahelix.model_zoo.protein_sequence_model.ProteinModel(encoder_model, model_config)[source]

forward(input, pos)[source]

class pahelix.model_zoo.protein_sequence_model.ProteinCriterion(model_config)[source]

cal_loss(pred, label)[source]
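
A sketch of assembling an encoder, task model, and criterion, following the ProteinEncoderModel/ProteinModel/ProteinCriterion constructors documented above; the config keys are assumptions (the repo's protein apps ship real JSON configs selecting the encoder and task head)::

    from pahelix.model_zoo.protein_sequence_model import (
        ProteinEncoderModel, ProteinModel, ProteinCriterion)

    model_config = {
        'model_type': 'transformer',  # assumed key selecting the encoder
        'task': 'classification',     # assumed key selecting the task head
        'class_num': 3,               # assumed key
    }

    encoder_model = ProteinEncoderModel(model_config, name='protein')
    model = ProteinModel(encoder_model, model_config)
    criterion = ProteinCriterion(model_config)

    # pred = model(token_ids, pos)
    # loss = criterion.cal_loss(pred, labels)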

2.3. seq_vae_model

class pahelix.model_zoo.seq_vae_model.VAE(vocab, model_config)[source]

The sequence VAE model

Parameters:
  • vocab – the vocab object.

  • model_config – the JSON file of model parameters.

forward(x)[source]

Model forward pass.

forward_decoder(x, z)[source]

Decoder forward pass.

forward_encoder(x)[source]

Encoder forward pass.

sample(n_batch, max_len=100, z=None, temp=1.0)[source]

Generate n_batch samples in eval mode (z may not be on the same device).

Parameters:
  • n_batch – number of sequences to generate

  • max_len – maximum length of generated samples

  • z – (n_batch, d_z) of floats, latent vector z or None

  • temp – temperature of softmax

Returns:

list of strings, the sampled sequences x
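
A sampling sketch using only the signatures documented here; vocab and model_config are assumed to be built as in the repo's seq_VAE app (construction elided)::

    # vocab / model_config come from the seq_VAE app (elided)
    model = VAE(vocab, model_config)
    model.eval()

    z = model.sample_z_prior(n_batch=8)  # latents drawn from the prior N(0, I)
    sequences = model.sample(n_batch=8, max_len=100, z=z, temp=1.0)
    # passing z=None lets sample() draw from the prior itself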

sample_z_prior(n_batch)[source]

Sampling z ~ p(z) = N(0, I)

Parameters:

n_batch – number of latent samples to draw

Returns:

(n_batch, d_z) of floats, sample of latent z

tensor2string(tensor)[source]

Convert tensor values to a sequence string.