Commit 5cfddda
1 Parent(s): ecc320f
Update README.md
README.md
CHANGED
@@ -29,7 +29,10 @@ tags:
 This model is trained to predict general binding sites of proteins using on the sequence. This is a finetuned version of
 `esm2_t6_8M_UR50D`, trained on [this dataset](https://huggingface.co/datasets/AmelieSchreiber/general_binding_sites). The data is
 not filtered by family, and thus the model may be overfit to some degree. In the Hugging Face Inference API widget to the right
-there are three protein sequence examples. The first is a DNA binding protein
+there are three protein sequence examples. The first is a DNA binding protein ([see UniProt entry here](https://www.uniprot.org/uniprotkb/D3ZG52/entry)).
+Note there is significant overlap in the predicted binding sites and the binding sites given in UniProt.
+
+The second and third were obtained using [EvoProtGrad](https://github.com/Amelie-Schreiber/sampling_protein_language_models/blob/main/EvoProtGrad_copy.ipynb)
 a Markov Chain Monte Carlo method of (in silico) directed evolution of proteins based on a form of Gibbs sampling. The mutatant-type
 protein sequences in theory should have similar binding sites to the wild-type protein sequence, but perhaps with higher binding affinity.
 Testing this out on the model, we see the two proteins indeed have the same binding sites, which validates to some degree that the model
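The README text in this diff speaks of "significant overlap" between predicted and UniProt-annotated binding sites, and of mutant and wild-type sequences sharing "the same binding sites". A minimal sketch of how such a comparison could be quantified follows. The helper name `site_overlap` and the residue positions are hypothetical and chosen only for illustration; they are not taken from the model's output or from the UniProt entry.

```python
def site_overlap(predicted, reference):
    """Compare two collections of residue positions.

    Returns (precision, recall, jaccard), each in [0, 1].
    Empty inputs give 0.0 instead of raising ZeroDivisionError.
    """
    predicted, reference = set(predicted), set(reference)
    shared = predicted & reference
    union = predicted | reference
    precision = len(shared) / len(predicted) if predicted else 0.0
    recall = len(shared) / len(reference) if reference else 0.0
    jaccard = len(shared) / len(union) if union else 0.0
    return precision, recall, jaccard


# Hypothetical residue positions (assumed for illustration only).
predicted_sites = [10, 11, 12, 45, 46, 80]   # e.g. positions the model labels as binding
annotated_sites = [11, 12, 13, 45, 46]       # e.g. positions annotated in a database

precision, recall, jaccard = site_overlap(predicted_sites, annotated_sites)
print(f"precision={precision:.3f} recall={recall:.3f} jaccard={jaccard:.3f}")
```

The same function could be applied to the predicted positions of a wild-type sequence and one of its mutants, provided the two sequences have equal length so that positions line up.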