AWS Architecture Blog · May 19, 2026

Scaling ML Workloads with Hybrid AWS Architecture for Geological Analysis

This article describes how ALS GeoAnalytics implemented LITHOLENS™, an ML-powered platform for automated core logging, on AWS. The solution uses a hybrid architecture: Amazon EKS runs the compute-intensive deep learning tasks, AWS Lambda handles API orchestration, Amazon S3 stores data, and Amazon RDS holds structured data. The key architectural decisions target scalability, cost efficiency, and performance for variable geological analysis workloads.

Introduction to LITHOLENS™ and the Challenge

LITHOLENS™ by ALS GeoAnalytics automates geological core logging using machine learning and computer vision. This platform addresses significant challenges in traditional mining analysis, such as subjective interpretations by human geologists, remote site access difficulties, underutilized historical data, and scheduling bottlenecks. The goal was to achieve higher accuracy, consistency, and scalability in geological insights.

Machine Learning Pipeline for Geological Analysis

The core of LITHOLENS™ is an ML pipeline that reads core images stored in Amazon S3. A Color Extraction module first identifies the unique pixel colors in each image. A Color Clustering module then groups those colors using algorithms such as K-Means or Gaussian Mixture Models, which reduces image complexity and highlights mineralogical variations. A Percentage Report module quantifies the color composition along the length of the core, enabling spatial analysis. The system also uses specialized deep learning models: RoQE Net for Rock Quality Designation (RQD), and VeinNet and CobbleNet for identifying complex geological features. The article reports that these models are more accurate and scale better than traditional manual methods.
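
The three color modules can be illustrated with a minimal sketch. This is not the LITHOLENS™ implementation: the function names, the tiny synthetic image, and the fixed band height are all assumptions made for illustration, and a plain K-Means written with the standard library stands in for whatever clustering code the platform actually runs.

```python
import random
from collections import Counter

def extract_unique_colors(rows):
    """Color Extraction (sketch): count how often each RGB tuple occurs."""
    return Counter(px for row in rows for px in row)

def _dist2(a, b):
    """Squared Euclidean distance between two RGB points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(color, centroids):
    """Index of the centroid closest to the given color."""
    return min(range(len(centroids)), key=lambda i: _dist2(color, centroids[i]))

def kmeans(colors, k, iterations=20, seed=0):
    """Color Clustering (sketch): K-Means over unique colors, weighted by pixel count."""
    rng = random.Random(seed)
    centroids = rng.sample(sorted(colors), k)  # requires k <= number of unique colors
    for _ in range(iterations):
        sums = [[0.0, 0.0, 0.0] for _ in range(k)]
        weights = [0] * k
        for color, count in colors.items():
            idx = nearest(color, centroids)
            weights[idx] += count
            for channel in range(3):
                sums[idx][channel] += color[channel] * count
        updated = []
        for i in range(k):
            if weights[i] == 0:
                updated.append(centroids[i])  # leave an empty cluster where it is
            else:
                updated.append(tuple(s / weights[i] for s in sums[i]))
        centroids = updated
    return centroids

def percentage_report(rows, centroids, band_height):
    """Percentage Report (sketch): share of each cluster per band of rows along the core."""
    report = []
    for start in range(0, len(rows), band_height):
        band = rows[start:start + band_height]
        counts = Counter(nearest(px, centroids) for row in band for px in row)
        total = sum(counts.values())
        report.append([100.0 * counts[i] / total for i in range(len(centroids))])
    return report

# Synthetic 4x4 "core image" (hypothetical data): two reddish rows above two greyish rows.
RED_A, RED_B = (200, 30, 30), (210, 40, 35)
GREY_A, GREY_B = (120, 120, 120), (130, 125, 128)
image = [[RED_A, RED_B] * 2] * 2 + [[GREY_A, GREY_B] * 2] * 2

colors = extract_unique_colors(image)
centroids = kmeans(colors, k=2)
report = percentage_report(image, centroids, band_height=2)
```

Clustering the unique colors with their pixel counts as weights, rather than every pixel, keeps the clustering step small when an image contains far more pixels than distinct colors. That is one plausible reason to run color extraction first, though the source does not state the motivation.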

Hybrid AWS Solution Architecture

ALS GeoAnalytics deployed LITHOLENS™ using a hybrid AWS architecture designed for both performance and cost efficiency. It combines containerized workloads on Amazon EKS for compute-intensive ML tasks with serverless components via AWS Lambda for lightweight API operations. Amazon S3 serves as the primary data lake for input, intermediate, and output data, while Amazon RDS manages structured metadata.

Unified API Model

The system exposes a unified REST API built with Amazon API Gateway and AWS Lambda. The API is a single access point that fronts multiple services and data streams: users submit analysis jobs, monitor progress, and retrieve results through it. This simplifies client-side integration and automates complex workflows across data sources and departments.
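
A job-oriented API of this kind might be handled by a single Lambda function along the following lines. This is a sketch under stated assumptions, not the article's code: the routes, the `jobId` path parameter, the `imageKey` field, and the in-memory `JOBS` dictionary are hypothetical, and the event fields follow the API Gateway REST API Lambda proxy integration format.

```python
import json
import uuid

# Hypothetical in-memory job store so the sketch runs on its own.
# It does not survive across Lambda execution environments; the article
# names Amazon RDS as the place where structured metadata is kept.
JOBS = {}

def _response(status, payload):
    """Build a Lambda proxy integration response."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }

def handler(event, context):
    """Route job submission (POST) and status lookup (GET with a jobId)."""
    method = event.get("httpMethod")
    path_params = event.get("pathParameters") or {}
    job_id = path_params.get("jobId")  # assumed path parameter name

    if method == "POST" and job_id is None:
        body = json.loads(event.get("body") or "{}")
        if "imageKey" not in body:  # assumed request field: S3 key of the core image
            return _response(400, {"error": "imageKey is required"})
        new_id = str(uuid.uuid4())
        JOBS[new_id] = {"jobId": new_id, "imageKey": body["imageKey"], "status": "QUEUED"}
        return _response(202, JOBS[new_id])

    if method == "GET" and job_id is not None:
        job = JOBS.get(job_id)
        if job is None:
            return _response(404, {"error": "job not found"})
        return _response(200, job)

    return _response(405, {"error": "unsupported operation"})

# Local usage example with hand-built events (no AWS account needed).
submitted = handler({"httpMethod": "POST", "body": json.dumps({"imageKey": "cores/hole-1.png"})}, None)
submitted_job = json.loads(submitted["body"])
looked_up = handler({"httpMethod": "GET", "pathParameters": {"jobId": submitted_job["jobId"]}}, None)
```

Returning 202 on submission signals that the job was accepted but not yet processed, which fits a design where the heavy work runs later on a separate compute tier.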

Key Architectural Decisions for Scale and Efficiency

Amazon EKS for ML Workloads: Chosen to run deep learning model training and inference on GPU-accelerated G6 instances. The cluster scales automatically based on job queue depth, so capacity follows sustained compute demand.
AWS Lambda for API Operations: Used behind Amazon API Gateway for job submission, status checking, and result retrieval. This serverless approach removes the overhead of always-on servers and lowers costs during low-usage periods, because resources are consumed only while requests are being handled.
Pre-configured AMIs: Custom Amazon Machine Images (AMIs) contain all necessary dependencies and model artifacts. This reduces container startup times from minutes to under 30 seconds, which improves job throughput and minimizes idle compute costs.
Automated Resource Management: EKS clusters are configured to scale down to zero when no jobs are queued, so compute resources are used only during active processing. Combined with S3 for data persistence and RDS for metadata, this yields a cost-effective and scalable architecture.
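
The queue-driven scaling rule, including scale-to-zero, can be expressed as a small function. This is only a sketch of the decision logic: the source does not say which autoscaling mechanism is used, and `jobs_per_node` and `max_nodes` are hypothetical tuning parameters with made-up defaults.

```python
import math

def desired_gpu_nodes(queued_jobs, jobs_per_node=4, max_nodes=10):
    """Return how many GPU nodes to run for the current queue depth.

    Zero queued jobs maps to zero nodes (scale-to-zero); otherwise the count
    is the queue depth divided by per-node capacity, rounded up and capped.
    """
    if queued_jobs < 0:
        raise ValueError("queued_jobs must be non-negative")
    if jobs_per_node <= 0 or max_nodes <= 0:
        raise ValueError("jobs_per_node and max_nodes must be positive")
    if queued_jobs == 0:
        return 0
    return min(max_nodes, math.ceil(queued_jobs / jobs_per_node))

# Example: an empty queue, a single job, a moderate backlog, and a burst.
plan = [desired_gpu_nodes(n) for n in (0, 1, 9, 100)]
```

With the defaults above, `plan` evaluates to `[0, 1, 3, 10]`. The cap bounds cost during bursts, and the zero case is what makes fast node startup matter: every job that arrives at an empty cluster waits for a node to boot.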
