Abha Kabra

Data Engineer & Solution Architect
Mumbai

Summary

Data Engineer and Big Data Architect with 7+ years of experience designing and delivering scalable, high-performance data pipelines across AWS, GCP, and Azure. Specialized in real-time stream processing, data lake architecture, and cloud-native data engineering using tools such as AWS Glue, EMR, Redshift, Kinesis, Kafka, PySpark, and Lake Formation. Proven success in re-architecting legacy systems, optimizing performance, and enabling secure, reliable data insights across industries including Finance, Hospitality, Retail, SaaS, and Energy.

Overview

7 years of professional experience
4 years of post-secondary education
3 Certifications

Work History

Solution Architect - Big Data

OpenLM Israel
02.2023 - 06.2025
  • Led migration of legacy ETL to AWS-native Spark Streaming, reducing processing time by 3x
  • Delivered scalable cloud-native pipelines using AWS EMR, Redshift, and Delta Lake
  • Implemented secured CRUD pipelines using Lake Formation and IAM-based access controls
  • Enabled anomaly detection and alerting using Kinesis, Kafka, and CloudWatch
  • Translated complex technical concepts into plain terms for non-technical stakeholders, keeping technical teams and stakeholders aligned across projects
  • Optimized resource allocation across multiple projects using project management tools for scheduling and task prioritization
  • Managed project scope, schedule, status and documentation.

Senior Data Engineer

Clearwater Securities
03.2025 - 05.2025
  • Built high-throughput Spark+Scala pipelines processing millions of records via AWS S3, Glue, and EMR
  • Architected Delta Lake-based streaming framework for cross-batch aggregation and alerting
  • Automated anomaly detection workflows integrated with Kinesis and Elasticsearch
  • Designed fault-tolerant, windowed stream processing and real-time analytics pipelines
  • Advised clients on data engineering best practices to optimize their systems for performance and scalability
  • Established standard procedures for version control, code review, deployment, and documentation to ensure consistency across the team's work products.

Big Data Engineer

eXate UK
01.2022 - 03.2025
  • Developed data encryption platform for banking data on Spark and Kafka
  • Built ingestion and transformation frameworks with PySpark and multi-format support (JSON, Parquet, Avro)
  • Hardened data security by rebuilding vulnerable third-party API logic with secure, scalable code
  • Architected hybrid cloud solutions using Kubernetes, OpenShift, and Cloudera
  • Automated routine tasks through scripting languages, reducing manual effort and the risk of human error.
  • Trained junior team members on best practices in big data engineering, fostering a culture of continuous improvement.
  • Optimized data processing by implementing Hadoop and Spark frameworks for big data management.
  • Reduced query response times with efficient database partitioning and indexing techniques.

Big Data Consultant

easidoo GmbH
09.2020 - 01.2022
  • Built ELT pipelines with NiFi, Spark, and GCP Dataflow for energy meter analytics
  • Created real-time dashboards using Elasticsearch and Tableau
  • Enhanced stream ingestion performance and schema evolution support with HDFS and Kafka
  • Led proof-of-concept projects demonstrating the benefits of integrating big data solutions into existing workflows.
  • Improved data quality through rigorous cleansing, validation, and transformation processes.
  • Evaluated emerging big data technologies to stay current on industry trends and maintain competitive advantage.
  • Streamlined ETL processes for seamless integration of various data sources into a unified system.

Data Engineer

L&T Mindtree
01.2018 - 12.2020
  • Developed batch and streaming pipelines using Apache NiFi, Spark, and Kafka
  • Managed AWS infrastructure (EC2, S3, Redshift, Lambda) for scalable and secure solutions
  • Collaborated with DevOps teams to deploy pipelines with Docker and Terraform
  • Integrated federated identity with AWS Cognito and Okta for secure access provisioning
  • Collaborated on ETL (Extract, Transform, Load) tasks, maintaining data integrity and verifying pipeline stability.
  • Migrated legacy systems to modern big-data technologies, improving performance and scalability while minimizing business disruption.

Education

B.Tech - Computer Science

Jaipur Engineering College & Research Centre
Jaipur
01.2014 - 01.2018

Skills

AWS Glue

EMR

Lambda

S3

Redshift

Kinesis

Apache Spark with Scala

PySpark

Kafka

Hive

Tez

NiFi

Shell Scripting

PostgreSQL

MySQL

Certification

AWS Certified Solutions Architect - In Progress

Key Project Highlights

  • Big Data Re-architecture: Migrated and re-architected 3 legacy pipelines, improving performance by 5x and cutting costs by 60%
  • Real-time Streaming on AWS: Designed event-driven pipelines with Kinesis, Kafka Streams, and Glue for fraud detection
  • Cloud Data Lake on AWS: Delivered scalable Delta Lake solution with Lake Formation for row-level security and data cataloging
  • Monitoring & Optimization: Developed end-to-end pipeline observability using CloudWatch, Elasticsearch, and Grafana

Awards & Recognition

  • "A Team" Badge - L&T Mindtree, 2020
  • "Strategic Thinking" Spot-On - L&T Mindtree, 2018
  • Campus Ambassador - Udacity, 2017-2018

Core Technical Skills

AWS Glue, EMR, DMS, SCT, Lambda, Lake Formation, S3, Redshift, RDS, MSK, Kinesis (Streaming, Analytics), Apache Spark (Scala, PySpark), Kafka, Kafka Streams, Hadoop (HDFS, Hive, HBase, Tez, YARN), NiFi, Python, Scala, SQL, Shell Scripting, PostgreSQL, MySQL, Oracle, SQL Server, MongoDB, Cassandra, DynamoDB, DocumentDB, Airflow, Docker, Kubernetes, Terraform, Jenkins, Git, OpenShift, ELK Stack, CloudWatch, Cognito, Okta
