Search Raw Reads from NCBI’s SRA Database
We are thrilled to unveil an important addition to SequenceServer Cloud.
Our newest feature enables direct searching of raw reads from NCBI’s Sequence Read Archive (SRA) database. This means you can search the stupendously large amounts of raw Illumina, Nanopore, and PacBio reads from each sample in every genomic and RNA-seq experiment, as well as the smaller amounts from old Solexa and 454 runs.
This ability is critical for research questions that cannot be answered simply by examining assembled transcriptomes, genomes, or metagenomes, or by looking at summary counts of mapped reads.
How It Works
As outlined in the video above, it’s straightforward.
- Input Your Query: Paste your FASTA query sequence(s) into the search field.
- Select Your SRA Runs: Specify the identifiers of the SRA runs of interest.
- Specify RNA Analysis (If Applicable): For RNA-related searches, click “Allow spliced alignments”, so splicing isn’t penalized.
- Specify Parameters: Set the E-value threshold and the maximum number of aligned sequences to keep.
- Submit: Hit Submit. The SRA runs are analyzed in parallel. This takes minutes to hours, depending on the size of the data involved (some SRA datasets are really big!).
- View Results: (see below)
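If you prepare your inputs in a script before pasting them in, a quick sanity check can catch typos early. The Python sketch below is not part of SequenceServer, and the function names are hypothetical. It assumes that run accessions follow the usual INSDC pattern (SRR, ERR, or DRR followed by digits) and that every query sequence has a FASTA header line.

```python
import re

# Assumption: run accessions look like SRR/ERR/DRR followed by digits.
RUN_ACCESSION = re.compile(r"[SED]RR\d+")


def split_run_accessions(text):
    """Split text on commas/whitespace and return (valid, invalid) tokens."""
    tokens = [t for t in re.split(r"[,\s]+", text.strip()) if t]
    valid = [t for t in tokens if RUN_ACCESSION.fullmatch(t)]
    invalid = [t for t in tokens if not RUN_ACCESSION.fullmatch(t)]
    return valid, invalid


def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
        else:
            raise ValueError("sequence data found before a '>' header line")
    if header is not None:
        records.append((header, "".join(chunks)))
    return records
```

For example, `split_run_accessions("SRR1234567, SRX123")` puts `SRR1234567` in the valid list and flags `SRX123`, which is an experiment accession rather than a run.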
Making Sense of SRA BLAST Analysis Results
Results appear progressively as analysis of each SRA run finishes. For each SRA run, you get:
- Number of Hits: The number of reads that match the query sequence.
- Hit Sequences: Download all the hit sequences in FASTA or FASTQ formats.
- Alignment Details: Extensive details on how each hit sequence aligns. You can see these details in SAM format, including CIGAR strings and MD tags that indicate mismatches and indels.
- Explore Results: You can view hits in a table by clicking on the accession number. The hits page allows both sorting and filtering of the hits.
- Download Results: The results page lets you download ASN and TSV reports that contain E-values, percentage identity, and other attributes for each hit.
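To give a flavor of what the SAM alignment details encode, here is a minimal Python sketch that tallies the operations in a CIGAR string and counts the mismatches recorded in an MD tag, following the SAM specification. The function names and example values are illustrative assumptions, not part of SequenceServer.

```python
import re

# CIGAR operations defined by the SAM specification.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def summarize_cigar(cigar):
    """Return total length per CIGAR operation, e.g. {'M': 60, 'I': 2}."""
    if cigar == "*":  # SAM uses '*' when no CIGAR is available
        return {}
    ops = CIGAR_OP.findall(cigar)
    # Reject strings with characters the pattern did not consume.
    if "".join(length + op for length, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR string: {cigar!r}")
    counts = {}
    for length, op in ops:
        counts[op] = counts.get(op, 0) + int(length)
    return counts


def md_mismatches(md):
    """Count mismatched bases in an MD tag value such as '10A5^AC6'."""
    # Deleted reference bases are written as '^' plus letters; drop them
    # so that only the mismatch letters remain.
    without_deletions = re.sub(r"\^[A-Za-z]+", "", md)
    return sum(1 for ch in without_deletions if ch.isalpha())
```

With these definitions, `summarize_cigar("20M2I10M1D30M")` returns `{'M': 60, 'I': 2, 'D': 1}` and `md_mismatches("10A5^AC6")` returns `1`. In the SAM specification, the `N` operation marks a skipped reference region, which is how an intron is represented in a spliced alignment.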
Potential Applications of SRA BLAST
We’re excited to see how SequenceServer Cloud users will use SRA BLAST. Likely applications include:
- Exploring viral sequences to decipher patterns of evolution and spread across species and over time.
- Examining transposable element activity to understand how they jump, evolve, and shape phenotypes.
- Environmental Genomics: Analyzing microbial communities, from soil microbiomes to the human gut, for example to understand where particular genes and accessory genomes are found, and at what frequencies.
Happy BLASTing!