This server is a protein structure prediction tool that processes prediction requests from users and returns various scores for protein sequences.
To install the environment, follow these steps:

```bash
git clone https://github.com/Oaklight/protein-score-server.git
cd protein-score-server
conda env create -f environment.yaml
conda activate esm
pip install -r requirements.txt
```

Configuration File:

- Copy `server.yaml.sample` to `server.yaml`:

  ```bash
  cp server.yaml.sample server.yaml
  ```

- Edit `server.yaml` with your settings.
The server uses the `server.yaml` file for configuration. Currently configurable items include:

- `api_key`: API key for Hugging Face Hub login.
- `history_path`: history result storage path.
- `intermediate_pdb_path`: intermediate PDB file storage path.
- `model`: model configuration
  - `name`: model name, `esmfold` or `protenix` (ByteDance's AlphaFold 3 implementation)
  - `replica`: GPU device to replication mapping, which should be in `<device>: <num_replica>` format. For the `esmfold` case, use `_: <num_replica>` instead.
- `task_queue_size`: task queue size, defaults to 50.
- `timeout`: timeout for async prediction result retrieval, defaults to 15 seconds.
- `backbone_pdb`:
  - `reversed_index`: path of the reverse index from PDB id to PDB file path
  - `parquet_prefix`: path prefix for parquet files
  - `pdb_prefix`: path prefix for PDB files
For an example, see `server.yaml`.
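As a quick orientation, the sketch below shows how these fields might be arranged; all values are placeholders, and the `server.yaml.sample` shipped with the repository remains the authoritative reference.

```yaml
# Illustrative sketch only -- values are placeholders, not defaults.
api_key: hf_xxxxxxxxxxxxxxxx      # Hugging Face Hub token
history_path: ./history
intermediate_pdb_path: ./intermediate_pdb
model:
  name: esmfold                   # or: protenix
  replica:
    _: 2                          # esmfold case: `_: <num_replica>`
    # cuda:0: 1                   # otherwise: `<device>: <num_replica>`
task_queue_size: 50
timeout: 15                       # seconds
backbone_pdb:
  reversed_index: ./backbone/reversed_index.json
  parquet_prefix: ./backbone/parquet
  pdb_prefix: ./backbone/pdb
```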
After the configuration is set, run these commands inside the project folder:

```bash
conda activate esm
uvicorn main:app --host 0.0.0.0 --port 8000
```

Users can send POST requests to `http://your-host:8000/predict/` to get predictions. The request body comprises these fields: `seq`, `name`, `type`, and `seq2`.
- `seq`: String, representing the protein sequence.
- `name`: String, representing the name of the reference protein.
- `type`: String, representing the task type; currently supports `"plddt"`, `"tmscore"`, `"sc-tmscore"`, and `"pdb"`.
- `seq2`: String, representing the sequence of the reference protein. Used only for the `sc-tmscore` task; you may choose to provide either `seq2` or `name`.
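For illustration, a request can be submitted with any HTTP client; the minimal sketch below uses the Python `requests` library and a placeholder host (it is not part of the server code):

```python
import requests

# Hypothetical host; replace with your deployment address.
SERVER_URL = "http://your-host:8000"

# Submit a pLDDT prediction for a protein sequence.
payload = {
    "seq": "MKRESHKHAEQARRNRLAVALHELASLIPAEWKQQNVSAAPSKATTVEAACRYIRHLQQNGST",
    "type": "plddt",
}
response = requests.post(f"{SERVER_URL}/predict/", json=payload)
job_id = response.json()["job_id"]  # keep this to fetch the result later
print("submitted job:", job_id)
```

Example request bodies for each task type: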
- pLDDT

```json
{
    "seq": "MKRESHKHAEQARRNRLAVALHELASLIPAEWKQQNVSAAPSKATTVEAACRYIRHLQQNGST",
    "type": "plddt"
}
```

- TMscore

```json
{
    "seq": "MKRESHKHAEQARRNRLAVALHELASLIPAEWKQQNVSAAPSKATTVEAACRYIRHLQQNGST",
    "name": "1a0a.A", # must provide for tasks that require a reference structure
    "type": "tmscore"
}
```

- sc-TMscore

```json
{
    "seq": "MKRESHKHAEQARRNRLAVALHELASLIPAEWKQQNVSAAPSKATTVEAACRYIRHLQQNGST",
    "seq2": "MKRESHKHAEQARRNRLAVALHELASLIPAEWKQQNVSAAPSKATTVEAACRYIRHLQQNGST", # choose to provide either seq2 or name
    "type": "sc-tmscore"
}
```

or

```json
{
    "seq": "MKRESHKHAEQARRNRLAVALHELASLIPAEWKQQNVSAAPSKATTVEAACRYIRHLQQNGST",
    "name": "1a0a.A", # choose to provide either seq2 or name
    "type": "sc-tmscore"
}
```

- pdb

```json
{
    "seq": "MKRESHKHAEQARRNRLAVALHELASLIPAEWKQQNVSAAPSKATTVEAACRYIRHLQQNGST",
    "type": "pdb"
}
```

The server will return a JSON response containing two fields: `job_id` and `prediction`.
- `job_id`: String, representing the task ID.
- `prediction`: String, currently only indicating that the prediction is in progress.
```json
{
    "job_id": "0a98a981748c4b7eacfd5e0957905ced", # this is a uuid4 hex string
    "prediction": ... # not very useful at this moment
}
```

Users can send GET requests to `http://your-host:8000/result/{job_id}` to get prediction results. The request header should contain `Content-Type: application/json`.
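For example, a single result lookup might look like the following sketch (again using the `requests` library; the host and job ID are placeholders):

```python
import requests

# Hypothetical job_id returned by an earlier POST to /predict/.
job_id = "0a98a981748c4b7eacfd5e0957905ced"
response = requests.get(
    f"http://your-host:8000/result/{job_id}",
    headers={"Content-Type": "application/json"},
)
print(response.status_code, response.json())
```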
The server will return a JSON response containing two fields: `job_id` and `prediction`.
```json
{
    "job_id": "0a98a981748c4b7eacfd5e0957905ced", # this is a uuid4 hex string
    "prediction": 0.983124
}
```

When querying for results, use the following guidelines based on the status code:
- 102 Processing: The task is queued. Wait a few seconds before checking again.
- 202 Accepted: The task is being processed. Wait a few seconds before checking again.
- 200 OK: The task is complete. The result is available in the response.
- 404 Not Found: The task ID is invalid. Check the ID or resubmit the task.
- 429 Too Many Requests: The server is busy. Wait and try again later.
- We recommend using an exponential backoff strategy with a base of 3 when querying for results (see the sketch after this list).
- An example of querying is available in `test.py`.
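For reference, a minimal polling sketch with base-3 exponential backoff might look like this (assuming the `requests` library; it is independent of `test.py` and uses a placeholder host):

```python
import time

import requests

def wait_for_result(job_id: str, host: str = "http://your-host:8000", max_retries: int = 6):
    """Poll /result/{job_id}, backing off exponentially with base 3."""
    for attempt in range(max_retries):
        response = requests.get(
            f"{host}/result/{job_id}",
            headers={"Content-Type": "application/json"},
        )
        if response.status_code == 200:   # task complete
            return response.json()["prediction"]
        if response.status_code == 404:   # invalid job id
            raise ValueError(f"Unknown job_id: {job_id}")
        # 102, 202, or 429: wait 1 s, 3 s, 9 s, ... before trying again.
        time.sleep(3 ** attempt)
    raise TimeoutError(f"Job {job_id} did not finish within {max_retries} retries")
```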
To stop the server, use Ctrl+C in the terminal where the server is running.
This server is licensed under the Apache License 2.0.