Using BLAST to check specificity of PCR primer sequences

5 key points to keep in mind

A common challenge when designing primers is avoiding non-specific amplification, which can lead to false positives in downstream analyses: if you are trying to amplify a gene of interest but your primers also amplify a different gene, you could get a positive result even when your target is absent.

Similarly, you want to avoid even one primer binding to two locations in the genome. This is because mis-priming of one primer can lead to lower efficiency of your PCR reaction, and in extreme cases to false negatives.

And if you’re thinking about multiplexing, having primers that match multiple locations can lead to false positives and to false negatives.

Common primer design software such as Primer3 will check for primer-dimers, but it does not check for non-specific amplification.

BLAST, a powerful tool for comparing sequences and searching for similarities, can be used to check the specificity of PCR primers. But there are five key points to keep in mind.

  1. Use shorter word-size to increase sensitivity
  2. Search the entire genome: Avoid filtering
  3. Be specific in the scope of your search
  4. Check BLAST hit coordinates
  5. Changing how BLAST tallies up scores

Let’s dive deeper into each of these points, before summarizing the command-line options you should use.

1. Use shorter word-size to increase sensitivity

By default, BLAST searches with a word-size of 11 or even 28, depending on the task. This means that BLAST will only detect sequence similarity if there is a stretch of at least 11 (or even 28) nucleotides of perfect identity. That is inappropriate for checking primer specificity, because primers are typically only about 20 nucleotides long, and even partial matches can cause mis-priming. So rather than using the default, specify -task blastn-short. This decreases the word-size to 7 and thus increases BLAST’s ability to detect partial matches.
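
To make the effect of word-size concrete, here is a minimal Python sketch that mimics only the seeding idea: it asks whether a primer and a candidate binding site share an exact run of at least word_size nucleotides. This is a simplification and not BLAST’s actual algorithm, and the two sequences are invented for illustration.

```python
def longest_exact_run(primer: str, site: str) -> int:
    """Longest stretch of consecutive identical positions (ungapped, same length)."""
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    longest = current = 0
    for a, b in zip(primer.upper(), site.upper()):
        current = current + 1 if a == b else 0
        longest = max(longest, current)
    return longest


def seed_found(primer: str, site: str, word_size: int) -> bool:
    """True if an exact match of at least word_size nucleotides exists."""
    return longest_exact_run(primer, site) >= word_size


# Hypothetical 20-nt primer and an off-target site with two mismatches.
primer = "ACGTTGCATGCCGATAGCTA"
off_target = "ACGTTGCATCCCGATAGGTA"

for word_size in (28, 11, 7):
    print(word_size, seed_found(primer, off_target, word_size))
```

In this made-up case the longest exact run is 9 nucleotides, so the site would only be seeded with the shorter word-size of 7.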

2. Search the entire genome: Avoid filtering

You also want to search the entire genome. But because of filtering, that normally doesn’t happen. So switch off the filtering of lower-confidence or low-complexity regions: part of your primer may sometimes hit a repetitive or ambiguous sequence, and you want to know about it.

To switch off filtering, specify -dust no -soft_masking false. (Soft-masking ignores lowercase letters in the genome, and DUST is a filter that removes highly repetitive sequences such as microsatellites or minisatellites.)

3. Be specific in the scope of your search

BLAST is more sensitive if it is searching against the most appropriate database. This is because smaller databases lead to stronger (i.e., smaller) e-values.

So reduce the scope of your search. In most cases, this means searching against only your organism of interest, rather than a multi-genome database or something like “all of RefSeq” or “all of NCBI”. But if your sample contains DNA from multiple organisms, as happens in symbiosis or in microbiomes, then you should search against the combination of all relevant genomes.
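
As a rough illustration of why database size matters, the sketch below uses the standard approximation E ≈ m × n × 2^(−S′), where m is the query length, n is the total database length and S′ is the bit score. BLAST itself uses corrected (effective) lengths, so treat the numbers as indicative only. The genome sizes and the bit score are assumptions chosen for the example.

```python
def approx_evalue(bit_score: float, query_len: int, db_len: int) -> float:
    """Approximate e-value from a bit score: E ~ m * n * 2**(-S')."""
    return query_len * db_len * 2.0 ** (-bit_score)


PRIMER_LEN = 20              # nucleotides
BIT_SCORE = 40.0             # assumed bit score of one particular hit
SINGLE_GENOME = 500_000_000  # hypothetical genome size in bp
MULTI_GENOME_DB = 100 * SINGLE_GENOME  # hypothetical large multi-genome database

e_single = approx_evalue(BIT_SCORE, PRIMER_LEN, SINGLE_GENOME)
e_multi = approx_evalue(BIT_SCORE, PRIMER_LEN, MULTI_GENOME_DB)

print(f"single genome:   E ~ {e_single:.3g}")
print(f"multi-genome db: E ~ {e_multi:.3g}")
print(f"ratio: {e_multi / e_single:.0f}x")
```

The same hit, with the same bit score, gets an e-value that grows in proportion to the size of the database searched.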

4. Check BLAST hit coordinates

Ideally, you get a single hit per primer. But if you get multiple hits, check their coordinates. For this, it can be helpful to download the table of all hits into a spreadsheet such as Excel. Make sure you also check the orientation of each hit relative to the query, and which part of the primer is actually aligned. This matters because BLAST doesn’t necessarily align from the first to the last nucleotide (i.e., a hit may cover only the middle part of the sequence that does match perfectly).
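
The same check can also be scripted. The following Python sketch assumes the hits were saved in BLAST’s default 12-column tabular format (-outfmt 6) and reports, for each hit, its strand, its coordinates and whether the whole primer is aligned. The sample rows, the primer name and the function name are hypothetical.

```python
import csv
import io

# Column order of BLAST's default tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def summarize_hits(tabular_text: str, primer_lengths: dict) -> list:
    """Summarize strand, coordinates and primer coverage for each hit."""
    hits = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        rec = dict(zip(COLUMNS, row))
        qstart, qend = int(rec["qstart"]), int(rec["qend"])
        sstart, send = int(rec["sstart"]), int(rec["send"])
        qlen = primer_lengths[rec["qseqid"]]
        hits.append({
            "primer": rec["qseqid"],
            "subject": rec["sseqid"],
            # blastn reports minus-strand hits with sstart > send
            "strand": "plus" if sstart <= send else "minus",
            "start": min(sstart, send),
            "end": max(sstart, send),
            "full_length": qstart == 1 and qend == qlen,
            "covers_3prime_end": qend == qlen,
            "mismatches": int(rec["mismatch"]),
        })
    return hits


# Hypothetical output for one 20-nt primer named "fwd".
SAMPLE = (
    "fwd\tchr1\t100.000\t20\t0\t0\t1\t20\t1500\t1519\t0.002\t40.1\n"
    "fwd\tchr3\t100.000\t14\t0\t0\t4\t17\t88230\t88217\t9.1\t28.2\n"
)

for hit in summarize_hits(SAMPLE, {"fwd": 20}):
    print(hit)
```

A hit that covers only part of the primer, like the second row here, deserves a closer look at where the unaligned nucleotides fall.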

5. Changing how BLAST tallies up scores

For primer searches, some people also modify the way BLAST calculates scores, because a mismatch can severely reduce annealing. So they might add -penalty -3 -reward 1. The same goes for gaps (insertions/deletions): you might add -gapopen 5 -gapextend 2.
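
To get a feel for what the reward and penalty values do, here is a small Python sketch that scores an ungapped, full-length comparison between a primer and a binding site. It is only an illustration of the arithmetic: BLAST reports the best-scoring local alignment, which may be shorter than the full primer. The sequences are invented, and the milder penalty of -2 is included purely for comparison.

```python
def ungapped_raw_score(primer: str, site: str, reward: int = 1, penalty: int = -3) -> int:
    """Raw score of a full-length ungapped comparison: reward per match, penalty per mismatch."""
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    return sum(reward if a == b else penalty
               for a, b in zip(primer.upper(), site.upper()))


# Hypothetical 20-nt primer, its perfect target, and a site with two mismatches.
primer = "ACGTTGCATGCCGATAGCTA"
two_mismatches = "ACGTTGCATCCCGATAGGTA"

print(ungapped_raw_score(primer, primer))                      # perfect match
print(ungapped_raw_score(primer, two_mismatches))              # -reward 1 -penalty -3
print(ungapped_raw_score(primer, two_mismatches, penalty=-2))  # milder penalty
```

With a reward of 1 and a penalty of -3, each mismatch costs as much as four matching nucleotides are worth relative to a perfect match, so imperfect sites drop quickly in the ranking.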

Overall, use the following options:

To pull everything together, use the following BLASTN options for checking PCR primers:

-task blastn-short -dust no -soft_masking false -penalty -3 -reward 1 -gapopen 5 -gapextend 2

In the SequenceServer BLAST interface, you’d simply put that in the “Advanced parameters” box. Your site administrator could also set this up as an option that exists by default.

In the command line, you’d add those arguments to the end of the BLASTN command.
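
As a sketch of what that looks like when scripted, the Python snippet below assembles a blastn command with the options above appended at the end. The file names, the database name and the function names are placeholders, and -outfmt 6 (tabular output) is an added assumption to make the hits easy to inspect. Actually running the command requires BLAST+ to be installed.

```python
import shlex
import subprocess

# The primer-checking options recommended above.
PRIMER_CHECK_OPTIONS = [
    "-task", "blastn-short",
    "-dust", "no",
    "-soft_masking", "false",
    "-penalty", "-3",
    "-reward", "1",
    "-gapopen", "5",
    "-gapextend", "2",
]


def build_blastn_command(query_fasta: str, database: str, out_path: str) -> list:
    """Assemble the blastn argument list (nothing is executed here)."""
    return ["blastn",
            "-query", query_fasta,
            "-db", database,
            "-outfmt", "6",
            "-out", out_path,
            *PRIMER_CHECK_OPTIONS]


def run_primer_blast(query_fasta: str, database: str, out_path: str) -> None:
    """Run blastn; raises CalledProcessError if blastn exits with an error."""
    subprocess.run(build_blastn_command(query_fasta, database, out_path), check=True)


# Placeholder names: primers.fa, my_genome and primer_hits.tsv are hypothetical.
cmd = build_blastn_command("primers.fa", "my_genome", "primer_hits.tsv")
print(shlex.join(cmd))
```

Printing the command first makes it easy to check the options before calling run_primer_blast on real data.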

SequenceServer Cloud makes it easy to perform sequence searches and to interpret their results. For this, it leverages cloud computing, publication-ready graphics that facilitate interpretation, and a powerful graphical interface for configuring BLAST databases. [Try it out]
