Setting Up and Running a Custom BLAST Server: A Comprehensive Guide
Molecular biology, bioinformatics, and genomics are fast-moving fields. Appropriately leveraging powerful tools like BLAST is essential for success. Here, we walk through some of the things you need to consider when setting up and running a custom BLAST server.
When is a custom BLAST server needed?
Why not just use NCBI BLAST? In many cases, a custom server is required. This is typically either because you’re analyzing data that doesn’t exist at NCBI, or because you’re trying to improve your productivity.
Using a custom server to keep your data and analyses confidential and secure
If you’re working with unpublished or proprietary genome data, you can’t use NCBI’s web server because your data isn’t on it, so you need to run a custom BLAST server. Housing unpublished data on your own server also keeps it confidential, safeguarding your intellectual property. And because searches run on your own server, nobody outside your team can know which gene targets you are searching for.
Using a custom BLAST server can improve your productivity
- Collaboration: Foster collaboration by sharing genome and transcriptome databases and analysis results within your team.
- No waiting times: On a shared resource like NCBI’s web server, your BLAST request is put in a queue, and only processed when it’s your turn. Depending on the demands on their servers, you can wait a long time! With your own server, you can run BLAST whenever you want and get results back faster.
- No limits: NCBI’s web server is a shared government-funded resource, so NCBI has to limit the resources any one user can consume. Typically, this means you can only run a few short queries. With your own server, you can use as much computing power as you want and run searches that are as large as you need.
- Community Engagement: Add essential community- or lab-specific information to foster engagement and create a resource hub (with protocols, publications, a genome browser, downloadable files…). You can also modify colors and fonts to suit the rest of your community genome portal or your personal taste. Or add information regarding policies for all of your lab members.
- Integration: Seamlessly integrate it with other in-house tools and databases, promoting a cohesive workflow and increasing team members’ productivity.
Challenges to consider when setting up and running a custom BLAST server
Some parts of running your lab’s or department’s own BLAST server are easy. But there are challenges, too.
Keeping databases and software up-to-date
- Genome Databases: The road doesn’t end at setup; constant updates to the genome databases are essential to remain relevant.
- Software Updates: Tools like NCBI BLAST and SequenceServer continually evolve, with updates every few months. Keeping them up-to-date is crucial for accessing the latest features and fixes, which improve analysis run-time, sensitivity, accuracy, and server security.
Such updates can be time-consuming and, if not done correctly, can lead to errors. Thus, it’s essential to have a plan for how to keep your databases and software up-to-date. This can be a challenge in small labs due to staff turnover and the lack of a dedicated bioinformatician. In well-funded core facilities, such routine work takes up valuable time.
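One way to make database updates less error-prone is to script them. The sketch below rebuilds a nucleotide BLAST database only when its source FASTA file is newer than the last build. It assumes NCBI BLAST+ is installed so that makeblastdb is on the PATH; the function names, the ".built" marker file, and the paths are hypothetical choices for illustration, not part of BLAST or SequenceServer.

```python
import subprocess
from pathlib import Path


def needs_rebuild(fasta: Path, stamp: Path) -> bool:
    """Return True if the FASTA file is newer than the last successful build."""
    if not stamp.exists():
        return True
    return fasta.stat().st_mtime > stamp.stat().st_mtime


def makeblastdb_command(fasta: Path, db_prefix: Path, title: str) -> list[str]:
    """Build the makeblastdb call for a nucleotide database."""
    return [
        "makeblastdb",
        "-in", str(fasta),
        "-dbtype", "nucl",   # use "prot" for protein FASTA files
        "-parse_seqids",     # assumes sequence IDs are unique and well formed
        "-title", title,
        "-out", str(db_prefix),
    ]


def update_database(fasta: Path, db_prefix: Path, title: str) -> bool:
    """Rebuild the database if needed; return True if a rebuild ran."""
    # Hypothetical marker file recording when the last build succeeded.
    stamp = db_prefix.parent / (db_prefix.name + ".built")
    if not needs_rebuild(fasta, stamp):
        return False
    subprocess.run(makeblastdb_command(fasta, db_prefix, title), check=True)
    stamp.touch()
    return True
```

Calling update_database from a scheduled job (cron, for example) means a new genome release only has to be copied into place; the rebuild then happens without anyone needing to remember the exact command.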
Accessing the server from home and when at conferences
Where will you run your server?
Imagine you have a server in the lab.
How will your team members access it from home or when you’re at a conference?
- You need a static IP or hostname (which you will type into the web browser to access your server). In most offices, this isn’t easy to get.
- You need to open a port in your firewall. This is a security risk, and many companies and institutes don’t allow it. In some cases, VPN access is possible, but this is often slow and cumbersome.
What happens when the power goes down? Power outages are rare but inevitable. A server that loses power abruptly can end up with corrupted data. A backup power supply (UPS) can keep your server running through short outages. You’ll also want an automatic shutdown script to shut down your server safely when the power goes out. Who will start it up again when the bioinformatician is on holiday? Automation and good policies are essential.
What happens if the server crashes? Perhaps because someone submitted a search that is too big. You really don’t want that to happen when you’re preparing for a conference, and nobody is there to get it running again.
Using a cloud BLAST service can solve these problems.
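If you do run your own machine, a simple guard against oversized searches is to check each submission before it reaches BLAST. The sketch below counts the sequences and residues in a FASTA query and rejects anything above a limit. The limits and function names are hypothetical; sensible values depend on your hardware and databases.

```python
def fasta_stats(text: str) -> tuple[int, int]:
    """Count header lines and total residues in FASTA-formatted text."""
    n_seqs = 0
    n_residues = 0
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            n_seqs += 1
        else:
            # A bare sequence without a header still counts towards residues.
            n_residues += len(line)
    return n_seqs, n_residues


# Hypothetical limits; tune them to your hardware and typical workload.
MAX_SEQUENCES = 100
MAX_RESIDUES = 1_000_000


def check_query(text: str) -> None:
    """Raise ValueError if the query exceeds the configured limits."""
    n_seqs, n_residues = fasta_stats(text)
    if n_seqs > MAX_SEQUENCES:
        raise ValueError(f"{n_seqs} sequences submitted; limit is {MAX_SEQUENCES}")
    if n_residues > MAX_RESIDUES:
        raise ValueError(f"{n_residues} residues submitted; limit is {MAX_RESIDUES}")
```

A check like this does not make a server crash-proof, but it turns the most common cause of overload into a clear error message for the user.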
How will you keep your server secure?
If you’re running your own machine, you’ll need to have a plan for several elements:
- Software and system updates: Keep your software and operating system up-to-date to ensure you have the latest security patches.
- Access Control: Establish access control measures to ensure only authorized people who know what they are doing can access the server.
- “Hardening” the server as protection against hackers: Harden your server by removing unnecessary software and services, configuring the software to run securely, using firewalls, using intrusion detection systems, and limiting SSH access to the server…
- Security Audits: Regularly conduct security audits to identify and mitigate vulnerabilities.
- Data Backups: Establish robust backup systems and recovery plans to secure data against potential losses.
What are the hardware requirements for running BLAST? What size server do you need?
Your needs depend very much on your usage. Is it just you, or many users (a lab, an institute, or a community)? How often will BLAST be run? What types of queries? How many queries at once? How big are the queries? How big are the databases?
BLAST runs faster if it can load the whole database into memory (RAM). It also runs faster when it can use multiple cores (as long as the disk is fast enough).
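Both points can be checked with a few lines of Python. The sketch below adds up the size of the files in a database directory, compares it with the available RAM, and builds a blastn command that uses several cores through the -num_threads option. The 25% headroom, the function names, and the directory layout are assumptions for illustration; the directory is assumed to hold only formatted BLAST database files.

```python
from pathlib import Path


def database_size_bytes(db_dir: Path) -> int:
    """Sum the sizes of all files under a directory of BLAST databases."""
    return sum(p.stat().st_size for p in db_dir.rglob("*") if p.is_file())


def fits_in_ram(db_bytes: int, ram_bytes: int, headroom: float = 0.25) -> bool:
    """True if the databases fit in RAM with a fraction kept free for other use."""
    return db_bytes <= ram_bytes * (1 - headroom)


def blastn_command(query: str, db: str, out: str, threads: int) -> list[str]:
    """Build a blastn call that spreads the search over several cores."""
    return [
        "blastn",
        "-query", query,
        "-db", db,
        "-num_threads", str(threads),
        "-outfmt", "6",   # tabular output
        "-out", out,
    ]
```

For example, fits_in_ram(database_size_bytes(Path("/data/blastdb")), 64 * 1024**3) asks whether the databases under a hypothetical /data/blastdb would fit on a 64 GB machine, and os.cpu_count() is one way to pick a starting value for threads.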
How many simultaneous queries may occur? Can you keep them from overlapping? A 48-core machine with 512 GB RAM is superb for most labs. But buying and running that kind of machine isn’t cheap!
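On a smaller machine, one way to stop searches from overlapping too much is to queue them and cap how many run at once. The sketch below is a minimal illustration using Python’s standard library. The function names are hypothetical, and each job is assumed to be a callable that launches one BLAST search (for instance via subprocess.run).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def threads_per_search(total_cores: int, max_concurrent: int) -> int:
    """Split the available cores evenly between concurrent searches."""
    return max(1, total_cores // max_concurrent)


def run_queued(jobs: Iterable[Callable[[], object]], max_concurrent: int) -> list[object]:
    """Run jobs with at most max_concurrent active at a time.

    Extra jobs wait in the pool's queue; results come back in submission order.
    """
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        futures = [pool.submit(job) for job in jobs]
        return [future.result() for future in futures]
```

With 48 cores and at most 4 searches at a time, threads_per_search(48, 4) gives each search 12 threads. Setting max_concurrent to 1 runs searches strictly one after another, which is slower for users but keeps memory use predictable.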
Our Cloud BLAST service is simple and reliable
Our SequenceServer Cloud BLAST service does away with the complexities described above.
- We keep servers running, updated, and secure with the latest versions of NCBI BLAST and SequenceServer.
- Your data is entirely confidential - use a point-and-click interface to decide who has access to what and manage your workspace.
- Use a point-and-click interface to upload FASTA files of your genomes and transcriptomes - we’ll format them appropriately. And we’ll keep reference databases up-to-date for you.
- No need to worry about server size - our cloud service scales up and down as needed at no extra cost to you.
People across pharma, biotech, and agroindustry trust SequenceServer Cloud BLAST, as do many academic labs for teaching and research.
We’d love to help you, too.