Stem Pharm Achieves 60% Faster BioInfo Processing with AWS

Cloud303 Also Helps Biotech Company Reduce AWS Spend By 40%

 Migration  Modernization Bioinformatics


Stem Pharm is a biotechnology company focused on developing cutting-edge solutions to diagnose and treat cancer. They specialize in cancer genomics, proteomics, and cell analysis. They needed a cost-effective and elastic environment for processing large amounts of BCL files, and Cloud303, an AWS partner, was brought in to help.

AWS Segment: Life Sciences

Our Customer

Stem Pharm is engineering human biology to redefine neurologic drug discovery and modeling neuroinflammatory diseases with advanced in vitro models. They research physiologically relevant organoids featuring microglia for neuroimmune drug discovery, neurologic disease modeling, target discovery, phenotypic drug screening, and neurotoxicity assessments. They are also developing chemically defined hydrogels and pre-coated cell culture plates that improve outcomes for cell-based applications.

The Challenge

Stem Pharm was struggling to balance cost and performance while processing large volumes of BCL files in their bioinformatics workflows. They required an elastic environment that could handle their growing demands and remain cost-effective. The company needed to securely store and process genomic data and metadata while keeping that data both accessible and confidential for their research team. Stem Pharm also required a solution that could run different applications in sequence and provide a secure, authenticated portal for access to images, genomic files, and metadata stored in S3.

Why Stem Pharm Chose AWS?

Stem Pharm chose AWS due to its expansive portfolio of cloud services tailored to suit the unique needs of biotechnology companies. AWS offered scalability, reliability, and security for the massive amounts of genomic and proteomic data that Stem Pharm needed to process and store. AWS's robust compute resources, like EC2 instances, offered the power and flexibility needed for intensive bioinformatics workflows.

The availability of AWS Batch was particularly appealing to Stem Pharm. It allowed them to run batch computing workloads efficiently without having to manage batch computing software or servers. Additionally, AWS's proven track record with life sciences companies and its HIPAA/HITECH compliance made AWS an attractive choice.

Also, AWS's pay-as-you-go model offered cost-effectiveness, enabling Stem Pharm to only pay for the computing power they needed at any given time, which helped them significantly reduce their operational costs.

Why Stem Pharm Chose Cloud303?

Cloud303, being an experienced AWS partner, was chosen by Stem Pharm due to their strong expertise in the biotechnology sector and their deep understanding of AWS services. Cloud303 has a history of successfully helping biotech companies to optimize their operations and reduce costs using AWS.

Their team was capable of developing comprehensive solutions tailored to Stem Pharm's needs, offering both strategic advice and hands-on implementation support. They provided a blend of domain-specific knowledge in bioinformatics and technical expertise in cloud technologies.

Moreover, Cloud303 demonstrated their capability to deliver complex projects within a tight schedule, which was critical for Stem Pharm to ensure their research and development projects remained on track. Their proactive approach in project management and commitment to delivering high-quality solutions made them a trustworthy partner for Stem Pharm.

Phil Supinski, CEO/Solutions Architect
Sujaiy Shivakumar, CTO/Solutions Architect

AWS Services Employed:
EC2, ECS, VPC, S3, AWS Batch, Amazon CloudWatch

Cloud303's Solution

To address Stem Pharm's requirements for handling their advanced human neural organoids and the associated complex data, Cloud303 devised a comprehensive solution built on several AWS services. The architecture was designed to support the storage, processing, and analysis of imaging, genomic, and transcriptomic data, along with metadata, from Stem Pharm's organoid models.

Data Storage and Management:

Amazon S3 was used as the primary storage solution for Stem Pharm's data, including images, genomic files, transcriptomic data, and metadata. S3's scalability, durability, and security features ensured that data was stored safely and was accessible when needed. Objects in the S3 buckets were organized under a hierarchical prefix structure to enable efficient data retrieval and management.
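As an illustration of the hierarchical organization described above, the following minimal sketch builds object keys from a fixed prefix scheme. The layout (project/data type/sample/file), the category names, and both function names are assumptions made for this example; the case study does not state the actual bucket layout.

```python
from pathlib import PurePosixPath

# Hypothetical categories, based on the kinds of data named in the text.
DATA_TYPES = {"images", "genomic", "transcriptomic", "metadata"}


def build_s3_key(project: str, data_type: str, sample_id: str, filename: str) -> str:
    """Return an object key of the form project/data_type/sample_id/filename."""
    if data_type not in DATA_TYPES:
        raise ValueError(f"unknown data type: {data_type!r}")
    for part in (project, sample_id, filename):
        # Reject empty components and embedded slashes so the hierarchy stays regular.
        if not part or "/" in part:
            raise ValueError(f"invalid key component: {part!r}")
    return str(PurePosixPath(project, data_type, sample_id, filename))


def prefix_for(project: str, data_type: str) -> str:
    """Return the prefix that selects one category of data within a project."""
    return f"{project}/{data_type}/"
```

With a scheme like this, listing everything of one kind for a project is a single prefix query, which is what makes retrieval efficient on a flat object store such as S3.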

Data Processing and Analysis:

AWS Batch was employed to manage and optimize the execution of various bioinformatics applications in sequence. AWS Batch enabled dynamic allocation of compute resources based on the workload requirements, ensuring cost-effective and efficient processing.
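One way to run applications in sequence on AWS Batch is to chain jobs with the `dependsOn` parameter of the SubmitJob API. The sketch below shows the idea; it is not Cloud303's implementation. The step names, queue name, and job definition names are hypothetical, and `batch_client` is assumed to be a boto3 Batch client (for example `boto3.client("batch")`).

```python
from typing import Any, Optional


def build_job_request(name: str, queue: str, job_definition: str,
                      depends_on_job_id: Optional[str] = None) -> dict[str, Any]:
    """Build keyword arguments for the AWS Batch SubmitJob call."""
    request: dict[str, Any] = {
        "jobName": name,
        "jobQueue": queue,
        "jobDefinition": job_definition,
    }
    if depends_on_job_id is not None:
        # dependsOn makes Batch hold this job until the listed job succeeds.
        request["dependsOn"] = [{"jobId": depends_on_job_id}]
    return request


def submit_chain(batch_client, queue: str, steps: list[tuple[str, str]]) -> list[str]:
    """Submit (job name, job definition) steps so each waits for the previous one."""
    job_ids: list[str] = []
    previous: Optional[str] = None
    for name, job_definition in steps:
        response = batch_client.submit_job(
            **build_job_request(name, queue, job_definition, previous)
        )
        previous = response["jobId"]
        job_ids.append(previous)
    return job_ids
```

A call such as `submit_chain(client, "bioinfo-queue", [("demultiplex", "demux-def"), ("align", "align-def")])` would submit both jobs immediately and leave the ordering to Batch, so no process has to stay running to poll for completion.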

Workflow Orchestration and Optimization:

Nextflow pipeline scripts were developed to orchestrate data processing from S3 buckets. The pipeline scripts were designed to automatically manage the execution of multiple applications in sequence, such as bulk and single-cell transcriptomic analysis, image processing, and cell type-specific analysis, while also handling error recovery and parallelization for optimal processing.

Compute Resources:

Amazon EC2 instances, including c5.9xlarge, p3.2xlarge, and g4dn.xlarge, were provisioned to handle the diverse processing requirements of Stem Pharm's organoid data. Instance types were selected based on the specific requirements of each application in the Nextflow pipeline, ensuring optimal performance and cost-efficiency.

Data Security and Access Control:

A secure portal was developed to provide authentication and access control mechanisms for Stem Pharm's data stored in Amazon S3. AWS Identity and Access Management (IAM) was used to define user roles and permissions, ensuring that only authorized personnel could access specific data, images, and files.
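To make the role-based access idea concrete, here is a minimal sketch of an IAM policy document that grants read-only access to a single prefix in a bucket. The bucket name, prefix, and function name are hypothetical, and the case study does not describe the actual policies used; only the policy grammar (`Version`, `Statement`, the `s3:ListBucket` and `s3:GetObject` actions, and the `s3:prefix` condition key) is standard IAM.

```python
import json


def read_only_policy(bucket: str, prefix: str) -> dict:
    """Return an IAM policy allowing list and read access under one prefix."""
    prefix = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is a bucket-level action, so it is scoped by a condition.
                "Sid": "ListPrefix",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {
                # Reading is an object-level action, so it is scoped by the ARN.
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket and prefix, for illustration only.
    print(json.dumps(read_only_policy("example-research-data", "genomic"), indent=2))
```

Attaching a policy like this to a role for one group of users, and a differently scoped one to another role, is one way to ensure that each group sees only the data it is authorized to access.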

Monitoring and Reporting:

Amazon CloudWatch and AWS CloudTrail were implemented to monitor the performance and usage of AWS resources. Custom dashboards were created to provide real-time insights into the status of the data processing pipeline, enabling Stem Pharm to quickly identify and address any issues.

With this architecture, Stem Pharm was able to securely store, process, and analyze the complex data generated from their advanced human neural organoids. The combination of AWS services and Nextflow pipeline scripts enabled seamless data processing, while the integration of machine learning and AI technologies empowered Stem Pharm to gain deeper insights into their organoid models for neurological drug discovery research.


Stem Pharm significantly reduced its costs by moving its bioinformatics processing to the AWS cloud, achieving a 40% cost reduction compared to their on-premises infrastructure. This was achieved through the use of AWS Batch, which allowed Stem Pharm to use only the computing resources they needed at any given time, and through lower operational costs than maintaining an on-premises infrastructure.

The use of AWS Batch and Nextflow pipeline scripts across the different applications and workflows enabled Stem Pharm to significantly increase their bioinformatics processing speed. The processing time for BCL files was reduced by approximately 60%, from an average of 12 hours to 5 hours, enabling Stem Pharm to deliver results to their clients much faster.

The secure portal ensured that Stem Pharm's data was accessible only to authorized personnel. Its authentication and access control mechanisms allowed Stem Pharm to control who could access their data, images, and genomic files stored in S3 buckets.
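The processing-time figures quoted above (an average of 12 hours before and 5 hours after) can be checked with a short calculation. This is a sketch using only those two numbers; the function name is an assumption made for illustration.

```python
def percent_reduction(before_hours: float, after_hours: float) -> float:
    """Return the percentage by which processing time dropped."""
    if before_hours <= 0:
        raise ValueError("before_hours must be positive")
    return (before_hours - after_hours) / before_hours * 100


reduction = percent_reduction(12, 5)
speedup = 12 / 5

print(f"Reduction in processing time: {reduction:.1f}%")  # 58.3%
print(f"Speedup factor: {speedup:.1f}x")                  # 2.4x
```

The exact figure is about 58.3%, which the headline rounds to 60%. Expressed as throughput, the same change means each run finishes 2.4 times faster.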

The EC2 instances used were c5.9xlarge, p3.2xlarge, and g4dn.xlarge. The EBS volume type was gp3, and the S3 storage class was Standard. The AWS Batch compute environment was optimized for EC2 instances, with pricing of $0.01 per vCPU-second. The 12-month TCO analysis put the cost of EC2 instances, EBS volumes, S3 storage, and AWS Batch at $44,618.92, $2,400.00, $2,760.00, and $60,000.00, respectively, for a total 12-month TCO of $109,778.92.
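The TCO total can be verified by summing the four line items given above. The sketch below uses `Decimal` to avoid floating-point rounding on currency. The monthly average and the per-item shares are derived values added for illustration, not figures from the original analysis.

```python
from decimal import Decimal

# 12-month cost line items, in USD, as quoted in the TCO analysis.
COSTS = {
    "EC2 instances": Decimal("44618.92"),
    "EBS volumes (gp3)": Decimal("2400.00"),
    "S3 storage (Standard)": Decimal("2760.00"),
    "AWS Batch": Decimal("60000.00"),
}

total = sum(COSTS.values(), Decimal("0"))
monthly_average = (total / 12).quantize(Decimal("0.01"))

print(f"12-month TCO: ${total:,}")               # $109,778.92
print(f"Monthly average: ${monthly_average:,}")  # $9,148.24

# Share of the total attributable to each line item.
for item, cost in COSTS.items():
    print(f"{item}: {cost / total * 100:.1f}% of total")
```

The sum matches the stated total of $109,778.92.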

AWS Programs/Funding Used: