Case Study

HealthCare

This case study covers the infrastructure behind a healthcare platform that automates revenue cycle management. Initially, AWS was the infrastructure platform, with Bitbucket used for source control and Jenkins for CI/CD. For redundancy and scalability, each of the five environments (dev, qa, beta, staging, prod) had its own VPC spanning multiple availability zones, and each environment was deployed from Terraform templates. The Oregon region (us-west-2) was chosen based on client requirements. In the first iteration of the project, 12 microservices were deployed in individual ECS clusters and scaled with Auto Scaling groups across multiple availability zones. OCR and NLP engines were added to the platform later and ran on compute-optimized EC2 instances, with CodeDeploy used to deploy those applications to the EC2 machines. To support high availability, the backend services were containerized: Jenkins built the Docker images and pushed them to the Elastic Container Registry (ECR).

The images stored in ECR were tagged with their application versions, then pulled and deployed as ECS services. The backend services were scaled automatically by an AWS Auto Scaling group configured on network throughput, CPU, and memory usage. The frontend sites were hosted in S3, with CloudFront serving as a global CDN that cached site content for faster delivery. To simplify scaling, some backend services were split into several smaller services, which raised the total from 12 to 22 microservices. Because the application handles sensitive healthcare data, infrastructure security was a priority. Data at rest was therefore encrypted with KMS keys across all components, including RDS databases, S3, and EBS volumes. Access to the S3 buckets used for file storage was restricted to the VPC.
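One common way to restrict a bucket to traffic from inside a VPC is a bucket policy that denies any request not arriving through a specific S3 VPC endpoint. The sketch below builds such a policy document; it is an illustration under stated assumptions, not the project's actual policy. The function name, bucket name, and endpoint ID are hypothetical; `aws:SourceVpce` is the standard IAM condition key for this purpose.

```python
import json


def vpc_only_bucket_policy(bucket_name: str, vpc_endpoint_id: str) -> dict:
    """Build a bucket policy that denies all S3 actions unless the request
    comes through the given VPC endpoint (hypothetical helper)."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAccessOutsideVpcEndpoint",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both the bucket itself and every object in it.
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                "Condition": {
                    "StringNotEquals": {"aws:SourceVpce": vpc_endpoint_id}
                },
            }
        ],
    }


if __name__ == "__main__":
    # Bucket name and endpoint ID are placeholders, not values from the project.
    policy = vpc_only_bucket_policy("example-file-storage", "vpce-0123456789abcdef0")
    print(json.dumps(policy, indent=2))
```

A blanket deny like this also locks out console and administrative access from outside the endpoint, so in practice it is usually applied and tested on a non-production bucket first.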

Transport encryption was used for file transfers to and from S3. Kafka and ZooKeeper clusters were provisioned for queuing, request queuing was handled by RabbitMQ, and authentication was performed by Keycloak deployed on a 3-node cluster. All backend services retrieved their credentials and variables from Parameter Store, where the values were encrypted with KMS keys. Network ACLs (NACLs) provided an additional layer of security on top of security groups, with all outbound rules blocked to prevent potential data leaks. This stack ran on AWS for over 3 years until the client requested a move to another cloud. In response, the platform was made cloud-independent by migrating all services and components to Kubernetes clusters provisioned with Rancher, which made the stack portable to other clouds.
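As an illustration of how a backend service might read its KMS-encrypted settings from Parameter Store at startup, the sketch below pages through `get_parameters_by_path` with decryption enabled. It assumes a boto3 SSM client is passed in and that parameters are organised under a per-service path prefix; the function name and the path layout are hypothetical, not taken from the project.

```python
from typing import Any, Dict


def load_parameters(ssm_client: Any, path_prefix: str) -> Dict[str, str]:
    """Return {short_name: value} for every parameter under path_prefix.

    ssm_client is expected to behave like boto3.client("ssm").
    WithDecryption=True asks SSM to decrypt SecureString values via KMS.
    """
    params: Dict[str, str] = {}
    kwargs = {"Path": path_prefix, "Recursive": True, "WithDecryption": True}
    while True:
        page = ssm_client.get_parameters_by_path(**kwargs)
        for param in page.get("Parameters", []):
            # Strip the shared prefix so callers see e.g. "DB_USER".
            short_name = param["Name"][len(path_prefix):].lstrip("/")
            params[short_name] = param["Value"]
        token = page.get("NextToken")
        if not token:
            return params
        kwargs["NextToken"] = token


# Usage (requires boto3 and AWS credentials; the path is a placeholder):
#   import boto3
#   config = load_parameters(boto3.client("ssm", region_name="us-west-2"),
#                            "/prod/example-service")
```

Passing the client in, rather than creating it inside the function, keeps the loader easy to exercise with a stub in unit tests.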

Microsoft Azure was selected as the new cloud provider, and Rancher was used to provision the new clusters. Cluster Autoscaler scaled the worker nodes, which were a mix of general-purpose, compute-optimized, and memory-optimized instances. Node labels and nodeSelector fields in the Kubernetes deployment YAML ensured that each service was scheduled on the appropriate node type: compute-intensive services ran on the compute-optimized instances and memory-intensive services on the memory-optimized instances, which improved stability and performance. MinIO replaced AWS S3 for object storage. All supporting applications, including RabbitMQ, Keycloak, MinIO, PostgreSQL, MongoDB, and Jenkins, were provisioned and configured with Helm charts.
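The node placement described above can be sketched as a Deployment manifest whose pod template carries a `nodeSelector`. The example below generates one as JSON, which `kubectl apply -f` accepts alongside YAML. The label key `workload-type`, the service name, and the image are hypothetical; the source does not state which labels were used.

```python
import json


def deployment_manifest(name: str, image: str, node_pool: str, replicas: int = 2) -> dict:
    """Build an apps/v1 Deployment pinned to nodes labelled
    workload-type=<node_pool> (the label key is an assumption)."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # The scheduler only places pods on nodes whose labels
                    # match every key/value pair in nodeSelector.
                    "nodeSelector": {"workload-type": node_pool},
                    "containers": [{"name": name, "image": image}],
                },
            },
        },
    }


if __name__ == "__main__":
    # Placeholder service and image names.
    manifest = deployment_manifest(
        "ocr-engine", "registry.example.com/ocr-engine:1.4.2", "compute-optimized"
    )
    print(json.dumps(manifest, indent=2))
```

For this to work, the matching label has to exist on the nodes, for example via `kubectl label nodes <node-name> workload-type=compute-optimized` or through the node pool configuration.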

In Jenkins, matrix-based security (the Matrix Authorization Strategy) was used to restrict access to authorized users. Code quality testing was performed with SonarQube, which was integrated into the qa build process. The SonarQube results were shared with the QA team by email and uploaded to S3 buckets for future reference. Configuration changes to the applications were tracked through Helm charts committed to a Git repository. To avoid data loss, all database applications were deployed as StatefulSets. External traffic was routed to the backend services by Traefik ingress controllers, and HTTP-to-HTTPS redirects were configured through Traefik middlewares. HTTPS certificates were managed with cert-manager. The migration from AWS to Azure is complete, and the entire stack now runs on Azure with better scalability and performance.
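An HTTP-to-HTTPS redirect in Traefik is typically expressed as a Middleware custom resource using `redirectScheme`. The sketch below generates such a resource; it assumes Traefik's Kubernetes CRD provider with the `traefik.io/v1alpha1` API group (older Traefik v2 releases use `traefik.containo.us/v1alpha1`), and the resource name and namespace are hypothetical.

```python
import json


def https_redirect_middleware(name: str, namespace: str) -> dict:
    """Build a Traefik Middleware that redirects requests to HTTPS."""
    return {
        "apiVersion": "traefik.io/v1alpha1",  # assumption: Traefik v2.10+ or v3
        "kind": "Middleware",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "redirectScheme": {
                "scheme": "https",
                # permanent=True makes Traefik answer with a permanent redirect.
                "permanent": True,
            }
        },
    }


if __name__ == "__main__":
    # Placeholder name and namespace.
    print(json.dumps(https_redirect_middleware("redirect-https", "backend"), indent=2))
```

The middleware then has to be attached to the router serving plain HTTP, either in an IngressRoute's `middlewares` list or, for a standard Ingress, through Traefik's router middlewares annotation.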

 

Challenges

 

  1. Two of our backend services encountered performance problems while handling substantial data loads.

Solution:

To remove the bottlenecks, we split each of the two problematic services into smaller microservices. Each smaller service could then be scaled on its own, so capacity was added only where the data load required it.

 

  2. During the migration to Azure, we could no longer rely on AWS Parameter Store for secret storage.

Solution:

As a workaround, we hosted our own parameter store as a service and integrated it with the Jenkins CI/CD pipelines. We secured the repository by restricting access to authorized users only. We also automated the periodic rotation of the database credentials kept in the parameter store.
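The periodic rotation can be pictured with a small sketch: check whether a stored credential is older than the allowed age, generate a new password, apply it to the database, and only then update the stored record. Everything here is hypothetical (the record layout, the function names, the `apply_password` callback); the source does not describe how the rotation was implemented.

```python
import secrets
import string
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional

ALPHABET = string.ascii_letters + string.digits


def generate_password(length: int = 32) -> str:
    """Generate a random password from a cryptographically secure source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def rotation_due(rotated_at: datetime, max_age: timedelta, now: datetime) -> bool:
    """True when the credential is at least max_age old."""
    return now - rotated_at >= max_age


def rotate_if_due(
    record: dict,
    max_age: timedelta,
    apply_password: Callable[[str], None],
    now: Optional[datetime] = None,
) -> dict:
    """Return an updated {'value', 'rotated_at'} record, or the original
    record if rotation is not yet due.

    apply_password is the caller-supplied step that changes the password
    in the database. It runs before the record is replaced, so if it
    raises, the stored credential still matches the database.
    """
    now = now or datetime.now(timezone.utc)
    if not rotation_due(record["rotated_at"], max_age, now):
        return record
    new_password = generate_password()
    apply_password(new_password)
    return {"value": new_password, "rotated_at": now}
```

A scheduled Jenkins job could call `rotate_if_due` for each database credential and write the returned record back to the store.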

 

Tools/Technologies

  • Cloud Platform
      • AWS
      • Azure
  • Source Code Management
      • Bitbucket
  • Continuous Integration & Deployment
      • CodeDeploy
      • Jenkins
  • Databases
      • MySQL
      • PostgreSQL
  • Infra Provisioning Tools & Configuration Management
      • Terraform
      • AWS CLI
      • Rancher
      • Helm Charts
  • Containerization & Deployment
      • Docker
      • ECS – EC2
      • Kubernetes
  • Message Queuing
      • RabbitMQ
      • Kafka
      • ZooKeeper
      • NATS
  • Authentication
      • Keycloak
  • Encryption
      • AWS KMS
  • Secret Store
      • Parameter Store
  • Logging, Monitoring & Alerting
      • Nagios
      • Graylog
      • Grafana
      • Prometheus
      • AWS CloudWatch
  • Code Quality Testing
      • SonarQube
  • Load Balancing
      • AWS ALB
  • Ingress Controller
      • Traefik
  • Content Delivery Network
      • AWS CloudFront
  • Web Hosting
      • AWS S3
      • AWS Lightsail
      • Windows IIS
  • Storage
      • S3
      • MinIO
      • EBS
  • Collaboration and Ticketing
      • Mantis