Search Raw Reads from NCBI’s SRA Database

We are thrilled to unveil an important addition to SequenceServer Cloud.

Our newest feature enables direct searching of raw reads from NCBI’s Sequence Read Archive (SRA) database. This means you can search the stupendously large amounts of raw Illumina, Nanopore, and PacBio reads from each sample in every genomic and RNA-seq experiment, as well as the smaller amounts from old Solexa and 454 runs.

This ability is critical for research questions that cannot be answered simply by examining assembled transcriptomes, genomes, or metagenomes, or by looking at summary counts of mapped reads.


How It Works

As outlined in the video above, it’s straightforward.

  1. Input Your Query: Paste your FASTA query sequence(s) into the search field.
  2. Select Your SRA Runs: Specify the identifiers of the SRA runs of interest.
  3. Specify RNA Analysis (If Applicable): For RNA-related searches, click “Allow spliced alignments”, so splicing isn’t penalized.
  4. Specify Parameters: Set the E-value threshold and the maximum number of aligned sequences to keep.
  5. Submit: Hit Submit. The SRA runs are analyzed in parallel. This takes minutes to hours, depending on the size of the data involved (some SRA datasets are really big!).
  6. View Results: (see below)
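
Steps 1 and 2 are easy to sanity-check locally before you submit. The Python sketch below is not part of SequenceServer Cloud: the helper names `parse_fasta` and `split_run_accessions` are hypothetical, and the accession pattern assumes run identifiers of the form SRR, ERR, or DRR followed by digits.

```python
import re

# Assumption: run accessions look like SRR/ERR/DRR followed by digits.
RUN_ACCESSION = re.compile(r"^[SED]RR\d+$")


def split_run_accessions(text):
    """Split free text into (valid, invalid) lists of run accessions."""
    tokens = [t for t in re.split(r"[\s,;]+", text.strip()) if t]
    valid = [t for t in tokens if RUN_ACCESSION.match(t)]
    invalid = [t for t in tokens if not RUN_ACCESSION.match(t)]
    return valid, invalid


def parse_fasta(text):
    """Return a list of (header, sequence) tuples.

    Raises ValueError if sequence data appears before the first header
    or if a record has no sequence.
    """
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        else:
            if header is None:
                raise ValueError("sequence data before first '>' header")
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    if any(not seq for _, seq in records):
        raise ValueError("record with empty sequence")
    return records
```

Calling `split_run_accessions` on the text you plan to paste returns the identifiers that match the pattern and those that do not, so typos can be caught before a search that may take hours.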

The SRA BLAST user interface lets you easily set parameters such as the E-value and the maximum number of aligned sequences.

Making Sense of SRA BLAST Analysis Results

Results appear progressively as the analysis of each SRA run finishes. For each SRA run, you get:

  1. A results table of all the alignments, which you can open by clicking on the accession number or the “Table” button.
  2. An interactive “Genome Browser” to visualize alignments.
  3. An “Open on NCBI” link to the web page for the SRA accession.
  4. The number of reads that match/hit the query sequence.
  5. Export options for the aligned reads:
    a. Download alignments in SAM/BAM format, which includes CIGAR strings and MD tags that indicate mismatches and indels.
    b. Download all the hit sequences in FASTA format.
    c. Download ASN and TSV reports that contain BLAST alignment statistics.
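
To illustrate what the CIGAR strings and MD tags in a SAM download encode, here is a minimal Python sketch. It assumes the file follows the standard SAM column layout (CIGAR in the sixth column, optional tags from the twelfth onward); the function names are hypothetical and the code has not been checked against an actual SequenceServer Cloud export.

```python
import re
from collections import Counter

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# MD tokens: a run of matches, a deletion (^ plus bases), or one mismatched base.
MD_TOKEN = re.compile(r"(\d+)|(\^[A-Za-z]+)|([A-Za-z])")


def cigar_summary(cigar):
    """Total length per CIGAR operation, e.g. {'M': 21, 'I': 1, 'D': 2}."""
    totals = Counter()
    for length, op in CIGAR_OP.findall(cigar):
        totals[op] += int(length)
    return dict(totals)


def md_mismatches(md):
    """Count mismatched bases in an MD tag value (deleted bases excluded)."""
    return sum(1 for _run, _deletion, base in MD_TOKEN.findall(md) if base)


def summarize_sam_line(line):
    """Summarize indels, skipped bases, and mismatches for one alignment line."""
    fields = line.rstrip("\n").split("\t")
    cigar = fields[5]
    # Optional tags look like TAG:TYPE:VALUE, e.g. MD:Z:10A4^AC6
    tags = {f[:2]: f[5:] for f in fields[11:] if len(f) >= 5 and f[2] == ":"}
    ops = cigar_summary(cigar) if cigar != "*" else {}
    return {
        "read": fields[0],
        "insertions": ops.get("I", 0),
        "deletions": ops.get("D", 0),
        "skipped": ops.get("N", 0),  # N marks skipped regions such as introns
        "mismatches": md_mismatches(tags["MD"]) if "MD" in tags else None,
    }
```

Header lines in a SAM file start with `@` and should be skipped before calling `summarize_sam_line`. The `N` operation is the one to watch when “Allow spliced alignments” is enabled, since it marks a skipped region rather than a deletion.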

Once your SRA BLAST run is complete, you can view and download the results in different formats.

Viewing Results in Table Format

The results table can be viewed interactively or downloaded as a TSV file. The table lists the E-value, percentage identity, and other attributes of each hit. Any column can be sorted to prioritize hits by that metric.
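
The same filtering and sorting can be reproduced offline on the downloaded TSV. The Python sketch below is only an illustration: the column names `evalue`, `pident`, and `bitscore` are assumptions borrowed from common BLAST tabular field names, so check the header row of your own file and adjust them as needed. The function name `top_hits` and its default thresholds are hypothetical.

```python
import csv
import io


def top_hits(tsv_text, max_evalue=1e-5, min_identity=90.0, limit=10):
    """Return up to `limit` rows passing both thresholds, best E-value first.

    Assumes a header row with 'evalue', 'pident', and 'bitscore' columns.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    hits = []
    for row in reader:
        if float(row["evalue"]) <= max_evalue and float(row["pident"]) >= min_identity:
            hits.append(row)
    # Lowest E-value first; ties broken by highest bit score.
    hits.sort(key=lambda r: (float(r["evalue"]), -float(r["bitscore"])))
    return hits[:limit]
```

To use it on a file, read the file contents into a string first, for example with `open(path, newline="").read()`, and pass that string to `top_hits`.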

You can explore your SRA BLAST hits by filtering and sorting on any of the metrics, such as E-value, alignment start/end, and bit score.

Viewing Results in the Interactive Genome Browser

The interactive genome browser lets you view the alignments for each SRA accession. Display options such as SNPs, indels, and read orientation can be toggled (blue circle), and alignments can be explored at different magnifications (green circle). Each query sequence can be selected at the top left (red circle).

You can explore your SRA BLAST hits in the interactive genome browser. Zoom in to specific regions, toggle viewing options, and quickly gain a deeper understanding of your results.

Potential Applications of SRA BLAST

We’re excited to see how SequenceServer Cloud users will use SRA BLAST. Likely applications include:

Happy BLASTing!
