Machine Learning NoteBook 0809

Technique Sharing

8/9: to-do

  • finish the improved network
  • run some statistical evaluation on the first version of the model and find the problems in the data

current status

  • started using a model built from 2 ResNet blocks plus a starting block for each of mol and seq

    • each block contains 2 Conv1d modules; the starting block has 1 extra Conv1d module with a 1×1 kernel
    import torch.nn as nn
    import torch.nn.functional as F


    class Block(nn.Module):
        """Residual block: two Conv1d layers plus a skip connection."""

        def __init__(self, in_channels, out_channels, use_conv1=False, strides=1):
            super().__init__()

            # main path: conv -> BN -> ReLU -> conv -> BN
            self.process = nn.Sequential(
                nn.Conv1d(in_channels, out_channels, 3, stride=strides, padding=1),
                nn.BatchNorm1d(out_channels),
                nn.ReLU(inplace=True),
                nn.Conv1d(out_channels, out_channels, 3, padding=1),
                nn.BatchNorm1d(out_channels),
            )

            # optional 1x1 conv so the skip path matches the main path's channels and stride
            if use_conv1:
                self.conv1 = nn.Conv1d(in_channels, out_channels, kernel_size=1, stride=strides)
            else:
                self.conv1 = None

        def forward(self, x):
            left = self.process(x)
            right = x if self.conv1 is None else self.conv1(x)

            return F.relu(left + right)


    class cnnNet(nn.Module):
        def __init__(self):
            super().__init__()

            # stem: 1 input channel -> 32 channels, sequence length roughly halved by stride 2
            self.pre = nn.Sequential(
                nn.Conv1d(1, 32, 7, stride=2, padding=3, bias=False),
                nn.BatchNorm1d(32),
                nn.ReLU(inplace=True),
                nn.MaxPool1d(3, stride=1, padding=1),
            )

            self.layer1 = self._make_layer(32, 16, 2)

        def _make_layer(self, in_channels, out_channels, block_num, strides=1):
            # starting block: uses the 1x1 conv on the skip path to change the channel count
            layers = [Block(in_channels, out_channels, use_conv1=True, strides=strides)]

            # followed by block_num plain residual blocks
            for _ in range(block_num):
                layers.append(Block(out_channels, out_channels))

            return nn.Sequential(*layers)

        def forward(self, x):
            x = self.pre(x)
            x = self.layer1(x)

            return x
  • the first version of the network is finished

    • the model is overfitting; try increasing the $p$ value in dropout
  • found a problem: the $K_i$ values in the data include some extreme values; try removing those values and training the model again

  • download the PDB database to evaluate the model (may need to switch to another database)

  • already started writing the data part of the Methods section and finished the $K_i$ figure (this is another version; the original version could not be found):
    Figure
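
One possible way to drop the extreme $K_i$ values noted above is an interquartile-range rule on log-transformed values. This is only a sketch under assumptions: the log10 transform, the 1.5×IQR cutoff, the helper name `filter_ki_outliers`, and the sample values are all made up for illustration and are not the procedure actually applied to the dataset.

```python
import math
import statistics


def filter_ki_outliers(ki_values, k=1.5):
    """Keep Ki values whose log10 lies within [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper: the log transform and the cutoff k are assumptions.
    """
    # Ki must be positive for the log transform; drop anything else up front
    positive = [v for v in ki_values if v > 0]
    logs = [math.log10(v) for v in positive]

    # quartiles of the log-transformed values
    q1, _, q3 = statistics.quantiles(logs, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr

    return [v for v, lv in zip(positive, logs) if low <= lv <= high]


# made-up Ki values with one extreme entry at the end
sample = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0, 100.0, 1e12]
kept = filter_ki_outliers(sample)
```

Working in log space is a common choice for affinity data because the values span many orders of magnitude, but a fixed threshold or a percentile cut might suit the data better; it is worth comparing the $K_i$ histogram before and after filtering.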

need to do tomorrow

  • make a to-do list of everything that needs to be done tomorrow
  • check the performance of the new model
  • continue working on the paper's introduction and conclusion
  • start making figures, including statistics of the protein sequence data and molecular data

notice

  • when building the network, add nn.BatchNorm1d() after each nn.Linear layer
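
As a rough illustration of this note, combined with the dropout idea for the overfitting problem, a fully connected head could look like the sketch below. The function name `make_head`, the layer sizes, and `p=0.5` are assumptions chosen for the example, not the configuration of the actual model.

```python
import torch
import torch.nn as nn


def make_head(in_features, hidden, out_features, p=0.5):
    """Hypothetical head: Linear -> BatchNorm1d -> ReLU -> Dropout -> Linear."""
    return nn.Sequential(
        nn.Linear(in_features, hidden),
        nn.BatchNorm1d(hidden),    # normalizes each hidden feature over the batch
        nn.ReLU(inplace=True),
        nn.Dropout(p=p),           # only active in train() mode
        nn.Linear(hidden, out_features),
    )


head = make_head(in_features=64, hidden=32, out_features=1)
head.eval()                         # eval mode: BatchNorm uses running stats, Dropout is off
with torch.no_grad():
    out = head(torch.randn(8, 64))  # batch of 8 made-up feature vectors
```

In training mode `nn.BatchNorm1d` needs more than one sample per batch, so very small batches can raise an error. The final `nn.Linear` is left without normalization here so the output scale is not constrained.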

figures and references for paper

  • the figure is shown above