Case Study

Fleet Management

The client needed an AI-powered fleet management system that ingests data from approximately 5000 GPS devices located in regions including Kenya, Rwanda, and Mumbai. Initially, the platform served fewer than 1000 IoT devices, managed through a Windows .NET application running on AWS. We scaled the Windows applications to keep each application component highly available. The back-end endpoints that receive data from the IoT devices were scaled with an AWS Auto Scaling group based on the application’s network throughput, CPU, and memory usage. The frontend dashboard was hosted on three Windows IIS web servers, distributed across different availability zones and load-balanced with an ALB.

Once the device count exceeded 1000, we moved from .NET to the Angular and Node frameworks. We replaced the data-receiving endpoints with wrappers written in Node and deployed them on AWS ECS.

The frontend sites were redesigned and hosted on S3, with CloudFront as the CDN caching the website’s content worldwide for faster delivery. Other supporting components included S3, AWS Elasticsearch, Kafka, Kinesis, Lambda, and CloudWatch. Together they let the platform handle 5000 GPS devices sending data at 3-second intervals. The stack ran on AWS for over three years, until our client requested a migration to a different cloud provider, which prompted us to make the platform cloud-independent. Because the platform relied heavily on AWS services, we started a POC that replaced ECS with Kubernetes, S3 with MinIO, Lambda with Docker services, and AWS Elasticsearch with self-hosted Elasticsearch, and that moved the databases to self-hosted instances, among other changes. The POC was successful, and we were able to replace all AWS components with open-source alternatives.
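To put the ingestion load in perspective, the arithmetic below is a rough sketch that uses only the two figures stated above (5000 devices, one report every 3 seconds). It assumes every device reports continuously and evenly, which real fleets will not do exactly, so treat the results as an upper-bound estimate rather than a measured rate.

```python
# Figures taken from the case study text.
DEVICES = 5000
REPORT_INTERVAL_S = 3
SECONDS_PER_DAY = 86_400


def messages_per_second(devices: int, interval_s: int) -> float:
    """Average fleet-wide message rate if every device reports on schedule."""
    return devices / interval_s


def messages_per_day(devices: int, interval_s: int) -> int:
    """Total messages per day under the same always-on assumption."""
    return devices * SECONDS_PER_DAY // interval_s


if __name__ == "__main__":
    rate = messages_per_second(DEVICES, REPORT_INTERVAL_S)
    total = messages_per_day(DEVICES, REPORT_INTERVAL_S)
    print(f"average rate: {rate:.0f} messages/second")
    print(f"daily volume: {total:,} messages/day")
```

Under these assumptions the endpoints see roughly 1,667 messages per second, or 144 million messages per day.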

We managed our entire codebase in a self-hosted GitLab instance and used Jenkins or GitLab CI for CI/CD, depending on the situation. The Kubernetes cluster ran on five master nodes and twenty worker nodes, with a mix of general-purpose and compute-optimized instance types; the compute-optimized nodes handled the heavy computations. We set the nodeSelector attribute in each Helm chart so that services ran on the correct instance type. We ran three separate Elasticsearch clusters configured with Helm charts, used MinIO instead of S3, and moved our Lambda functions to Docker services deployed on Kubernetes. External traffic was routed through the Traefik ingress controller. Graylog and Prometheus replaced CloudWatch for logging and monitoring, Kibana kept tabs on the Elasticsearch clusters, and Nagios handled alerting. This setup reduced the monthly running cost by $5000.
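The node placement described above can be sketched as a small generator of per-service Helm values. This is an illustration only: the label key `node-pool`, the pool names, the service name, and the image URL are hypothetical, since the case study does not state which labels were used. The value keys `nameOverride`, `image.repository`, and `nodeSelector` follow the layout that `helm create` scaffolds by default, which may differ from the charts used in the project.

```python
import json

# Hypothetical node labels; the case study does not name the real ones.
NODE_POOLS = {
    "general": {"node-pool": "general-purpose"},
    "compute": {"node-pool": "compute-optimized"},
}


def helm_values(service: str, image: str, pool: str) -> dict:
    """Build Helm values that pin a service to one node pool via nodeSelector."""
    if pool not in NODE_POOLS:
        raise ValueError(f"unknown node pool: {pool!r}")
    return {
        "nameOverride": service,
        "image": {"repository": image},
        # The scheduler only places the pod on nodes carrying these labels.
        "nodeSelector": dict(NODE_POOLS[pool]),
    }


if __name__ == "__main__":
    values = helm_values(
        service="route-optimizer",                         # hypothetical service
        image="registry.example.com/route-optimizer",      # hypothetical image
        pool="compute",
    )
    print(json.dumps(values, indent=2))
```

Keeping the pool-to-label mapping in one place means a service is moved between instance types by changing a single argument rather than editing each chart by hand.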

Challenges

  1. The migration of the Elasticsearch clusters risked data loss, which could cause errors in the frontend dashboard and incorrect vehicle location details in the daily run report. The risk was significant because the platform ingested about 100GB of GPS data per day.

Solution:

To prevent data loss during the migration, we configured the new Elasticsearch cluster as a secondary endpoint, so that incoming data reached the new cluster as well as the old one. At the same time, we reindexed the historical data from the old Elasticsearch clusters into the new one. Live ingestion and the reindexing of old data therefore ran in parallel. With this approach, we migrated the data from AWS Elasticsearch to the self-hosted Elasticsearch with no data loss, which kept the vehicle daily run report accurate and prevented false location details from appearing on the frontend dashboard.
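The dual-endpoint approach can be sketched as follows. This is a minimal model under stated assumptions, not the project's actual code: `InMemoryIndex` is a hypothetical stand-in for an Elasticsearch index, `dual_write` models the secondary endpoint, and `backfill` models the reindex. The backfill uses create-if-absent semantics (comparable to Elasticsearch's `op_type=create`) so that a historical copy never overwrites a document the live stream has already written to the new cluster.

```python
from typing import Dict, List, Set, Tuple

Doc = Dict[str, object]


class InMemoryIndex:
    """Hypothetical stand-in for an Elasticsearch index keyed by document ID."""

    def __init__(self) -> None:
        self.docs: Dict[str, Doc] = {}

    def index(self, doc_id: str, doc: Doc) -> None:
        # Create or overwrite, like an index operation with an explicit ID.
        self.docs[doc_id] = doc

    def create(self, doc_id: str, doc: Doc) -> bool:
        # Insert only if absent; returns False when the ID already exists.
        if doc_id in self.docs:
            return False
        self.docs[doc_id] = doc
        return True

    def scan(self) -> List[Tuple[str, Doc]]:
        # Snapshot of all documents, so writes during backfill are safe.
        return list(self.docs.items())


def dual_write(old: InMemoryIndex, new: InMemoryIndex, doc_id: str, doc: Doc) -> None:
    """Send each incoming document to both clusters during the migration."""
    old.index(doc_id, doc)
    new.index(doc_id, doc)


def backfill(old: InMemoryIndex, new: InMemoryIndex) -> int:
    """Copy historical documents without overwriting newer live writes."""
    copied = 0
    for doc_id, doc in old.scan():
        if new.create(doc_id, doc):
            copied += 1
    return copied


def missing_ids(old: InMemoryIndex, new: InMemoryIndex) -> Set[str]:
    """IDs present in the old cluster but absent from the new one."""
    return set(old.docs) - set(new.docs)


if __name__ == "__main__":
    old, new = InMemoryIndex(), InMemoryIndex()
    old.index("trip-1", {"km": 12})                 # data from before the migration
    dual_write(old, new, "trip-2", {"km": 7})       # live data during the migration
    print("copied:", backfill(old, new))
    print("missing:", missing_ids(old, new))
```

A check like `missing_ids` returning an empty set is one way to confirm that the cutover is safe; against real clusters the equivalent would be comparing document counts or IDs per index.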

  2. To optimize costs for the entire architecture, we initially separated the data-related services into private subnets and the frontend into a public subnet. However, NAT data processing charges became a significant expense. In addition, our AWS Lambda functions logged to CloudWatch, which added to the bill.

Solution:

When we migrated to Kubernetes, we eliminated this cost by using service discovery and CoreDNS to keep internal traffic within the cluster. We replaced AWS Lambda with Docker services and CloudWatch with Graylog, and configured the Docker services and Graylog to communicate within the cluster, which removed the per-request logging and NAT charges. Overall, we reduced the monthly running cost by $5000.
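As an illustration of in-cluster logging, the sketch below builds a GELF 1.1 message and addresses Graylog through its Kubernetes service DNS name, which CoreDNS resolves inside the cluster. The service name `graylog`, the namespace `logging`, the host `trip-processor`, and the `device_id` field are hypothetical, and the code assumes a GELF UDP input listening on Graylog's conventional port 12201. The case study does not say which Graylog input type was actually used.

```python
import json
import socket
import time


def cluster_dns_name(service: str, namespace: str, cluster_domain: str = "cluster.local") -> str:
    """Fully qualified in-cluster DNS name of a Kubernetes Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def gelf_payload(host: str, message: str, level: int = 6, **fields: object) -> bytes:
    """Encode a GELF 1.1 message; extra fields get the required '_' prefix."""
    payload = {
        "version": "1.1",
        "host": host,
        "short_message": message,
        "timestamp": time.time(),
        "level": level,  # syslog severity, 6 = informational
    }
    for key, value in fields.items():
        payload[f"_{key}"] = value
    return json.dumps(payload).encode("utf-8")


def send_gelf(payload: bytes, service: str = "graylog", namespace: str = "logging", port: int = 12201) -> None:
    """Send one uncompressed GELF datagram to Graylog over the cluster network."""
    address = (cluster_dns_name(service, namespace), port)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, address)


if __name__ == "__main__":
    body = gelf_payload("trip-processor", "daily run report generated", device_id="gps-0042")
    print(cluster_dns_name("graylog", "logging"))
    print(body.decode("utf-8"))
    # send_gelf(body)  # only resolves from inside a cluster with such a service
```

Because the destination is a cluster-internal name, log traffic of this kind stays on the cluster network instead of leaving through a NAT gateway.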

Tools/Technologies

  • Cloud Platform: AWS
  • Source Code Management: Bitbucket, GitLab
  • Continuous Integration & Deployment: CodeBuild, GitLab CI, Jenkins
  • Databases: MS-SQL, MySQL, Elasticsearch, PostgreSQL
  • Infra Provisioning Tools & Configuration Management: Terraform, AWS CLI, Rancher, Helm charts
  • Containerization & Deployment: Docker, ECS (EC2), Kubernetes
  • Message Queuing: RabbitMQ, Kafka, VerneMQ
  • Authentication: Keycloak
  • Logging, Monitoring & Alerting: Kibana, Nagios, Graylog, Prometheus, AWS CloudWatch
  • Load Balancing: AWS ALB
  • Ingress Controller: Traefik
  • Content Delivery Network: AWS CloudFront
  • Web Hosting: AWS S3, AWS Lightsail, Windows IIS
  • Storage: S3, MinIO, EBS
  • Collaboration and Ticketing: Mantis, Skype