API for programmatic remote BLAST to SequenceServer instances
SequenceServer Cloud provides dedicated secure servers to BLAST your data and visualize results. Sometimes, you just want to do this programmatically.
This page gives examples for Python and the Unix command line. The same principles carry over to other languages such as R, Go, or Java.
Python API for remote BLAST
This repository provides a lightweight Python API wrapper that lets SequenceServer Cloud users run BLAST queries against the databases on their SequenceServer Cloud instance.
Here’s an example of how it works:
from sequenceserver.sequenceserver_api import SequenceServerApi
import os
base_url = os.environ.get("SEQUENCESERVER_INSTANCE_URL", "https://YOURINSTANCE.sequenceserver.com")
api_token = os.environ.get("SEQUENCESERVER_API_TOKEN", "REDACTED_OBTAIN_FROM_SUPPORT==")
sequence_server = SequenceServerApi(base_url, api_token=api_token)
# Get instance configuration
print(sequence_server.get_configuration())
# Get databases with the basic set of attributes
print(sequence_server.get_databases())
# Get databases with a full set of attributes
print(sequence_server.get_databases(full_response=True))
# Select databases to query e.g. here we're selecting all nucleotide databases
databases = [db["id"] for db in sequence_server.get_databases() if db["type"] == "nucleotide"]
blast_search_type = "tblastn"
query_seq = "QRPSEEKDRKERRRAQRCAGRRAAYRKGCEKAGKLRRKGVARASREGLKISDATAALDLR\nAQTQAQPDFGQLDHQQPQHHHQQQQPPQQQQQQPPPPQQQQQPQHPQQQHNQNPESRPHH\nHLPQQHHHQHHPGNHLHSGDSGGGIGGGGGGGGGGGGGGGGGGGGGGGGGSAGGVAVVAG"
advanced_opts = "-evalue 1.0e-8"
# Submit a BLAST job and get its ID back
job_id = sequence_server.submit_blast_job(blast_search_type, query_seq, databases, advanced_opts)
# Poll for a response; jobs can take a while depending on the DB size and query complexity
response = sequence_server.poll_job(job_id)
# Get the results of the job in selected format
# Available formats: "xml", "std_tsv", "full_tsv"
print(sequence_server.get_job_result(job_id, "xml"))
Command-line API for SequenceServer
This document describes how to access SequenceServer functionality programmatically using the command line. The concept can be extended to R or any other language that can make HTTP requests.
The documentation is based on version 2.0.0.rc4.
Example invocations use curl and jq.
$BASEURL in the examples is the URL of your SequenceServer instance, e.g. http://localhost:4567 if you are running on the default localhost URL.
The accompanying script blastnAllDbs.sh is a working shell script to submit a BLAST job and get results.
Getting list of databases
GET: /searchdata.json
To submit a BLAST job, you need to know the IDs of the BLAST databases. This endpoint returns information about the databases in JSON format.
curl $BASEURL/searchdata.json | jq --raw-output '.database[].id'
The above command prints the ID of each database, one per line.
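The same extraction can be sketched in Python. This is a minimal sketch, assuming only what the jq filter above implies: the response has a top-level "database" array whose entries carry an "id" field. The helper name extract_database_ids and the sample payload (including its "name" values) are hypothetical.

```python
import json
from typing import List


def extract_database_ids(searchdata_json: str) -> List[str]:
    """Return the "id" of every entry under the top-level "database" key."""
    payload = json.loads(searchdata_json)
    return [db["id"] for db in payload.get("database", [])]


# Hypothetical payload, trimmed to the fields this sketch relies on.
sample = json.dumps({
    "database": [
        {"id": "3c0a5bc06f2596698f62c7ce87aeb62a", "name": "example-db-1"},
        {"id": "2f8c0e19d8d5b8ab225962d7284a6cbf", "name": "example-db-2"},
    ]
})

for db_id in extract_database_ids(sample):
    print(db_id)
```

In practice the JSON text would come from a GET request to $BASEURL/searchdata.json rather than from a literal string.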
Submitting a query
POST: /
Form parameters
- method: The name of the BLAST search to use (blastn, blastp, tblastn, etc.)
- sequence: The query sequence
- databases[]: One or more IDs of BLAST databases to search
- advanced: Additional options, e.g. evalue, gapopen, gapextend, etc.
Responses
Successful submission results in a 303 HTTP status code.
- Code: 303 The Location header is a link to the submitted job
Examples
- To query a single database using blastn:
curl -v -X POST -Fsequence=ATGTTACCACCAACTATTAGAATTTCAG -Fmethod=blastn -Fdatabases[]=3c0a5bc06f2596698f62c7ce87aeb62a $BASEURL
- To query multiple databases, add extra -Fdatabases[] arguments, e.g.
curl -v -X POST -Fsequence=ATGTTACCACCAACTATTAGAATTTCAG -Fmethod=blastn -Fdatabases[]=3c0a5bc06f2596698f62c7ce87aeb62a -Fdatabases[]=2f8c0e19d8d5b8ab225962d7284a6cbf $BASEURL
- To capture the Location header, which you need in order to retrieve the results:
jobUrl=$(curl -v -X POST -Fsequence=ATGTTACCACCAACTATTAGAATTTCAG -Fmethod=blastn -Fdatabases[]=3c0a5bc06f2596698f62c7ce87aeb62a --write-out '%{redirect_url}' $BASEURL)
- To alter the evalue threshold and set gap penalties:
curl -v -X POST -Fsequence=ATGTTACCACCAACTATTAGAATTTCAG -Fmethod=blastn -Fdatabases[]=3c0a5bc06f2596698f62c7ce87aeb62a -Fadvanced="-evalue 1.0e-8 -gapopen 1 -gapextend 1" $BASEURL
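For callers working from Python rather than curl, the form parameters can be assembled as a list of name/value pairs, which keeps the repeated databases[] field intact. The sketch below is illustrative only: build_blast_form and job_id_from_location are hypothetical helpers, and it assumes that the job ID is the last path segment of the Location URL, which this document does not state explicitly.

```python
import posixpath
from typing import List, Optional, Tuple
from urllib.parse import urlencode, urlparse


def build_blast_form(method: str, sequence: str, database_ids: List[str],
                     advanced: Optional[str] = None) -> List[Tuple[str, str]]:
    """Build the form fields for POST /, repeating databases[] once per ID."""
    fields = [("method", method), ("sequence", sequence)]
    fields += [("databases[]", db_id) for db_id in database_ids]
    if advanced:
        fields.append(("advanced", advanced))
    return fields


def job_id_from_location(location: str) -> str:
    """Assumption: the job ID is the last path segment of the Location URL."""
    path = urlparse(location).path.rstrip("/")
    return posixpath.basename(path)


fields = build_blast_form(
    "blastn",
    "ATGTTACCACCAACTATTAGAATTTCAG",
    ["3c0a5bc06f2596698f62c7ce87aeb62a", "2f8c0e19d8d5b8ab225962d7284a6cbf"],
    advanced="-evalue 1.0e-8",
)
# Shown URL-encoded for inspection; curl -F sends multipart/form-data instead.
print(urlencode(fields))
print(job_id_from_location("http://localhost:4567/069b56c8-25bd-451e-b117-dc996a1aed24"))
```

The pairs can be handed to whichever HTTP client you prefer. Redirect following should be disabled on that client so that the 303 response and its Location header remain visible. The curl examples send multipart form data, so whether a URL-encoded body is also accepted is an assumption to verify against your instance.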
Retrieving results
GET: /{jobId}.json
Path variables
- jobId: The job ID
Responses
- Code: 202 The BLAST job is still running
- Code: 200 The job is complete, results are in JSON format
Examples
curl -o myresults.json $BASEURL/069b56c8-25bd-451e-b117-dc996a1aed24.json
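Because a 202 response means the job is still running, a client normally repeats the request until it receives a 200. The following is a rough sketch of such a loop, not part of SequenceServer itself: wait_for_results and fetch_job are hypothetical names, and the interval and attempt limit are arbitrary defaults.

```python
import time
import urllib.request
from typing import Callable, Tuple


def fetch_job(base_url: str, job_id: str) -> Tuple[int, str]:
    """GET /{jobId}.json and return (HTTP status, response body)."""
    url = f"{base_url.rstrip('/')}/{job_id}.json"
    with urllib.request.urlopen(url) as resp:
        return resp.status, resp.read().decode("utf-8")


def wait_for_results(fetch: Callable[[], Tuple[int, str]],
                     interval: float = 2.0, max_attempts: int = 30,
                     sleep: Callable[[float], None] = time.sleep) -> str:
    """Call fetch() until it reports 200; treat 202 as 'still running'."""
    for attempt in range(max_attempts):
        status, body = fetch()
        if status == 200:
            return body
        if status != 202:
            raise RuntimeError(f"unexpected HTTP status {status}")
        if attempt < max_attempts - 1:
            sleep(interval)  # wait before asking again
    raise TimeoutError(f"job not finished after {max_attempts} attempts")
```

A call might look like wait_for_results(lambda: fetch_job("http://localhost:4567", job_id)). Passing fetch as a callable keeps the loop independent of the HTTP client and easy to exercise without a running server.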
Results in other formats
GET: /download/{jobId}.{format}
Path variables
- jobId: The job ID retrieved after submission
- format: One of xml, std_tsv or full_tsv
Examples
curl -o myresults.xml $BASEURL/download/069b56c8-25bd-451e-b117-dc996a1aed24.xml
curl -o myresults.tsv $BASEURL/download/069b56c8-25bd-451e-b117-dc996a1aed24.std_tsv
curl -o myresults-full.tsv $BASEURL/download/069b56c8-25bd-451e-b117-dc996a1aed24.full_tsv
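When building these URLs in a script, it can help to check the format name before making the request. A small sketch, where download_url is a hypothetical helper and the set of formats is taken from the list above:

```python
ALLOWED_FORMATS = ("xml", "std_tsv", "full_tsv")


def download_url(base_url: str, job_id: str, fmt: str) -> str:
    """Build the /download/{jobId}.{format} URL, rejecting unknown formats."""
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {ALLOWED_FORMATS}, got {fmt!r}")
    return f"{base_url.rstrip('/')}/download/{job_id}.{fmt}"


print(download_url("http://localhost:4567",
                   "069b56c8-25bd-451e-b117-dc996a1aed24", "std_tsv"))
```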
Downloading hits
Download hits in FASTA format.
POST: /get_sequence
Form parameters
- sequence_ids: A comma-separated list of sequence IDs
- database_ids: A comma-separated list of database IDs
Examples
curl -X POST -Fsequence_ids=SPAC1002.01,SPAC1002.02 -Fdatabase_ids=2f8c0e19d8d5b8ab225962d7284a6cbf $BASEURL/get_sequence
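Both parameters are single comma-separated strings rather than repeated fields, which is easy to get wrong when the IDs are held in lists. A minimal sketch of the conversion, using a hypothetical helper named build_get_sequence_form:

```python
from typing import Dict, List


def build_get_sequence_form(sequence_ids: List[str],
                            database_ids: List[str]) -> Dict[str, str]:
    """Join each ID list into the comma-separated form value described above."""
    return {
        "sequence_ids": ",".join(sequence_ids),
        "database_ids": ",".join(database_ids),
    }


form = build_get_sequence_form(
    ["SPAC1002.01", "SPAC1002.02"],
    ["2f8c0e19d8d5b8ab225962d7284a6cbf"],
)
print(form)
```

The resulting dictionary corresponds to the two -F arguments in the curl example and can be posted to $BASEURL/get_sequence with any HTTP client.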