8/17: done
- 9:00 - 10:00 testing the model
- 10:00 - 11:30 nothing happened
- 1:00 - 4:00 finish the main figure 1
current status
- yesterday's results fluctuated heavily, so we are retraining the model with a lower learning rate
:
need to do tomorrow
notice
figures and references for the paper
table for drawing the figure (data points, molecules, and sequences remaining after each step):

| step | description | data | mol | seq |
|------|-------------|------|-----|-----|
| 0 | original | 2278226 | 986143 | 8005 |
| 1 | drop multichain | 2169710 | 944576 | 7850 |
| 2 | only keep data with a $K_i$ value | 490605 | 204901 | 3404 |
| 3 | count how many times each molecule and sequence occurs; remove data whose molecule occurs fewer than 3 times or whose sequence occurs fewer than 6 times | 288115 | 55924 | 1872 |
| 4 | remove invalid $K_i$ values (e.g. $K_i = 0$) | 250481 | 54216 | 1846 |
| 5 | embed molecules and sequences; remove data that cannot be embedded | 250344 | 54177 | 1844 |
| 6 | remove data with $pK_i$ ($\log_{10} K_i$) higher than $8$ | 249517 | 54135 | 1844 |

- the figure:
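
The figure itself is not attached here. As a rough sketch of how the counts in the table could be turned into plot-ready series, the snippet below computes the percentage of the step-0 count that remains after each step. The names `STEPS` and `percent_remaining` are made up for illustration and are not from the project code. The choice of plotting library is left open.

```python
# Counts copied from the filtering table: (step, description, data, mol, seq).
STEPS = [
    (0, "original", 2278226, 986143, 8005),
    (1, "drop multichain", 2169710, 944576, 7850),
    (2, "only keep data with a K_i value", 490605, 204901, 3404),
    (3, "drop rare molecules and sequences", 288115, 55924, 1872),
    (4, "remove invalid K_i values", 250481, 54216, 1846),
    (5, "remove data that cannot be embedded", 250344, 54177, 1844),
    (6, "remove pK_i higher than 8", 249517, 54135, 1844),
]


def percent_remaining(steps):
    """Return (step, (data%, mol%, seq%)) relative to the first row's counts."""
    base = steps[0][2:]
    result = []
    for step, _description, *counts in steps:
        percents = tuple(100.0 * c / b for c, b in zip(counts, base))
        result.append((step, percents))
    return result


if __name__ == "__main__":
    for step, (data, mol, seq) in percent_remaining(STEPS):
        print(f"step {step}: data {data:5.1f}%  mol {mol:5.1f}%  seq {seq:5.1f}%")
```

Each of the three series (data, mol, seq) can then be drawn as one line or one bar group per step. Normalizing to step 0 keeps them on a shared axis even though the raw counts differ by orders of magnitude.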