Kubernetes • Terraform • AWS • GCP • CI/CD • Observability
Building reliable cloud platforms, scalable Kubernetes systems, and production-ready automation.
GCP-focused DevOps/SRE engineer with 3 years of experience supporting production cloud platforms and Kubernetes workloads. Strong in IaC, observability, cloud migration, and security-by-design.
Production Kubernetes Management
Enterprise Migration Support
Secure Infrastructure as Code
Observability & Incident Response
I am a GCP-focused DevOps and SRE engineer with 3+ years of experience architecting, automating, and supporting production-grade cloud platforms across enterprise environments. My passion lies in building resilient, scalable, and secure infrastructure that empowers development teams and delivers seamless user experiences.
I have strong hands-on expertise in Kubernetes (GKE/EKS), Terraform (Infrastructure as Code), CI/CD automation, cloud networking, and production observability. I have contributed to large-scale AWS-to-GCP migration initiatives, ensuring smooth cutovers, operational stability, and high platform availability throughout critical transitions.
Beyond infrastructure engineering, I specialize in monitoring, logging, and reliability engineering using tools such as Splunk, New Relic, Prometheus, and Grafana. I focus heavily on proactive incident management, root cause analysis (RCA), performance optimization, and reducing operational toil through automation.
My technical background also includes cloud security and governance practices such as IAM, RBAC, secrets management, and infrastructure hardening across cloud-native environments. I enjoy working on high-impact systems where reliability, scalability, and operational excellence are critical.
Production Observability • GitOps • Incident Response
Production-style observability platform built with Prometheus, Grafana, Loki, Promtail, Alertmanager, Slack, Docker, and GitHub Actions for real-time monitoring, centralized logging, alerting, GitOps deployment, and incident response automation across cloud and local systems.
Demo Access
Username: demo_user
Password: demo_user
Use these demo credentials to explore the live monitoring platform.
Cloud Visualization • Interactive UI • Frontend Engineering
An immersive digital environment designed to bridge the gap between complex DevOps concepts and high-performance frontend engineering. Features a custom 3D engine and a reactive UI shell, hosted on Google Cloud Platform.
Performance
Optimized requestAnimationFrame loop & hardware-accelerated GSAP animations.
Graphics
Direct GPU-accelerated 3D rendering for complex particle systems and effects.
Load Time
Asset compression, lazy loading, and low-latency GCP hosting.
Responsive
Mobile-first CSS and dynamic 3D camera resize listeners.
Full-Stack • React + FastAPI • GCP Deployment
Personal shift-tracking web app for two Aramark employees to log hours, calculate pay, and monitor weekly and pay-period targets. Replaces manual spreadsheets with a mobile-first React + FastAPI app featuring automatic break deductions, the Aramark payroll calendar, and CI/CD to GCP via GitHub Actions.
Users
Manoj & Jothesh - each user sees only their own shifts via user_id isolation on every API endpoint.
App Tabs
Log, Overview (donut chart), Shifts table, Salary (pay periods), Schedule (Aramark W02–W52 tax calendar).
Hourly Rate
A 30-minute break is auto-deducted from any shift longer than 6 hours. Two-week Aramark pay cycles are anchored to Sat 14 Mar 2026.
Hosted
Nginx serves React build + proxies /api/ to FastAPI :8000. Systemd keeps backend alive. GitHub Actions CI/CD on push.
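The break and pay-period rules above can be sketched in a few lines of Python. This is a minimal illustration, not the app's actual code: the names paid_hours and pay_period_start are hypothetical, and it assumes the anchor date (Sat 14 Mar 2026) is the first day of a pay period.

```python
from datetime import date, timedelta

# Values taken from the description above.
PAY_PERIOD_ANCHOR = date(2026, 3, 14)  # assumed to be the first day of a period
PAY_PERIOD_DAYS = 14                   # two-week pay cycle
BREAK_THRESHOLD_HOURS = 6.0            # break applies only to shifts longer than this
BREAK_HOURS = 0.5                      # 30-minute unpaid break


def paid_hours(shift_hours: float) -> float:
    """Return payable hours after the automatic break deduction."""
    if shift_hours > BREAK_THRESHOLD_HOURS:
        return shift_hours - BREAK_HOURS
    return shift_hours


def pay_period_start(day: date) -> date:
    """Return the first day of the two-week pay period containing `day`."""
    offset_days = (day - PAY_PERIOD_ANCHOR).days
    # Floor division also handles dates that fall before the anchor.
    periods = offset_days // PAY_PERIOD_DAYS
    return PAY_PERIOD_ANCHOR + timedelta(days=periods * PAY_PERIOD_DAYS)
```

Under these assumptions, an 8-hour shift pays 7.5 hours, a 6-hour shift pays the full 6, and any date from 14 Mar to 27 Mar 2026 maps to the period starting 14 Mar 2026.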
Designed a reusable deployment framework for development, staging, and production Kubernetes environments using Helm, Terraform, and GitHub Actions.
Built a migration support workflow for AWS to GCP transitions covering provisioning, validation checklists, and post-cutover monitoring readiness.
Implemented operational dashboards and alert tuning workflows with Splunk, New Relic, and cloud monitoring tools for production systems.
Developed modular Terraform components for cloud networking, IAM, and workload deployment with security-by-design principles.
A production-style observability platform built using Prometheus, Grafana, Loki, Promtail, Alertmanager, Slack, Docker, and GitHub Actions for real-time infrastructure monitoring, centralized logging, alerting, and incident response automation.
GCP VM, MacBook, website, and Prometheus health monitoring.
Centralized Docker and system logs using Loki and Promtail.
Alertmanager sends real-time Slack alerts and recovery notices.
MacBook: Node Exporter
GCP VM: Docker Stack
Prometheus: Metrics
Grafana: Dashboards
Slack: Alerts
CPU, memory, filesystem, load average, network traffic, SLA, and infrastructure health.
Local machine observability using secure SSH reverse tunneling.
Blackbox monitoring for uptime, latency, HTTP status, and SSL expiry.
Target health, scrape duration, failures, and monitoring pipeline metrics.
Real-time Slack alerts from Alertmanager.
Full alert lifecycle with automatic recovery notifications.
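The SSL expiry check listed above comes down to simple date arithmetic. The sketch below shows that arithmetic only; it is not how the platform's probes are implemented. The function names and the 14-day warning window are assumptions made for illustration, and the timestamp format is the notAfter string that Python's ssl module reports for a certificate.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> int:
    """Whole days between `now` (timezone-aware) and a certificate's notAfter time."""
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).days


def needs_renewal(not_after: str, now: datetime, warn_days: int = 14) -> bool:
    """True when the certificate expires within the (assumed) warning window."""
    return days_until_expiry(not_after, now) < warn_days
```

A check like this would typically feed an alert rule, so the warning arrives well before the certificate lapses rather than after the site starts failing.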
1. Alert: Slack receives incident notification.
2. Metrics: Analyze Grafana dashboards.
3. Logs: Search Loki logs for RCA.
4. Fix: Resolve infrastructure/service issue.
5. Recovery: Alertmanager sends resolved notification.
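Steps 1 and 5 of the workflow above differ only in the alert's status. The sketch below assumes notifications arrive in the shape of Alertmanager's webhook payload, where each entry in alerts carries a status of firing or resolved plus a labels map. The function names and message format are hypothetical, and the platform described here may rely on Alertmanager's built-in Slack receiver instead.

```python
def summarize_alert(alert: dict) -> str:
    """Build a one-line message for a single alert entry."""
    status = alert.get("status", "unknown")
    labels = alert.get("labels", {})
    name = labels.get("alertname", "unnamed alert")
    instance = labels.get("instance", "unknown instance")
    # Map the lifecycle state to a prefix; anything unexpected is flagged.
    prefix = {"firing": "[FIRING]", "resolved": "[RESOLVED]"}.get(status, "[UNKNOWN]")
    return f"{prefix} {name} on {instance}"


def summarize_payload(payload: dict) -> list[str]:
    """Summarize every alert in a webhook-style payload."""
    return [summarize_alert(alert) for alert in payload.get("alerts", [])]
```

Keeping the firing and resolved messages in the same format makes it easy to match a recovery notice to the incident that opened it.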
Designing scalable and secure cloud-native infrastructure.
Production support, incident response, RCA, and MTTR reduction.
Building reusable and repeatable infrastructure provisioning.
Automating build, test, and deployment workflows.
Monitoring, logging, alerting, and incident analysis.
Managing and optimizing multi-cloud environments.
Implementing IAM, access control, and platform governance.
Agile practices, documentation, and cross-team collaboration.
Architecting production environments where reliability meets velocity. My approach centers on automation, security, and deep observability.
Eliminating manual toil through reusable Terraform modules and GitOps workflows. Ensuring repeatable, drift-aware infrastructure deployments.
Managing production Kubernetes workloads with high-availability patterns. Optimizing cluster performance, autoscaling, and secure IAM/RBAC.
Reducing MTTR via Splunk and New Relic. Implementing proactive alerting, incident RCA, and data-driven platform optimizations.
Associate-level certification focused on Google Cloud infrastructure, deployment, monitoring, and operations.
Expected Completion: May 2026
Advanced cloud architecture, scalability, reliability, and security design.
Planned Completion: End of Q2 2026
Kubernetes cluster administration, troubleshooting, networking, and production orchestration expertise.
Planned Completion: End of Q2 2026
Infrastructure as Code (IaC), automation, provisioning, and cloud infrastructure management using Terraform.
Planned Completion: End of Q2 2026
Evolution of Last-Mile Delivery Efficiency & E-Commerce Logistics Performance
Explainable Credit Scoring Using Deep Learning & SHAP
European Airport Traffic Data Visualisation
Statistically Guided ML Pipeline for Diabetic Retinopathy Classification
Classification, Neural Networks & Heart Disease Prediction
Big Data Analytics for Lung Cancer Risk Prediction on GCP
Statistical Analysis Across Multiple Datasets
NETWORK
NODES
Global Communication Channels
GitHub
@manoj-panduraj
LinkedIn
Professional Network
Email
Direct Contact