Vipul Karanjkar

Nagpur

Summary

Senior Data Engineer with 6+ years of experience building and scaling production-grade data platforms. Strong background in designing reliable batch and real-time pipelines, lakehouse architectures on Databricks, and analytics-ready data models. Hands-on expertise with Medallion architecture, semantic layers, and enabling trusted data for downstream analytics and decision-making.

Overview

10 years of professional experience
4 years of post-secondary education
1 Certification

Work History

Data Engineer

InfoCepts
Nagpur
12.2025 - Current
  • Built and optimized scalable batch and incremental Apache Spark pipelines, processing vehicle, sales, and service data in Azure Data Lake Storage Gen2
  • Developed end-to-end pipeline monitoring and operational visibility using Azure-native monitoring, logging, and alerting tools
  • Developed data quality validations and governance controls, enforcing schema across curated data layers
  • Developed a governed semantic layer to standardize KPIs and enable self-service analytics for business and analytics teams

Senior Data Engineer

Zendesk
San Francisco
12.2021 - 12.2024
  • Optimized large-scale PySpark batch pipelines using Apache Hudi on Azure, reducing end-to-end processing time from 11 hours to 7 hours and delivering approximately $200K in annual cost savings
  • Designed and built real-time data pipelines using Kafka, Spark, and Flink, processing 1M+ events per day with low-latency delivery to power near real-time analytics dashboards
  • Automated data onboarding and pipeline orchestration using Apache Airflow, cutting environment setup and onboarding time by 55% and accelerating time-to-production
  • Designed and implemented analytics-ready dimensional and star-schema models in Snowflake using dbt, enabling scalable BI reporting and self-service analytics
  • Implemented enterprise-grade metadata management and data cataloging with Alation, improving data discoverability, governance, and analyst productivity
  • Designed and enforced data quality, validation, and monitoring frameworks across batch and streaming pipelines using automated checks and alerts, reducing data incidents and improving trust in downstream analytics and reporting

Software Engineer II - Data Engineer

CDK Global
San Jose
11.2019 - 12.2021
  • Led benchmarking data pipelines for 1,200+ automobile dealerships, facilitating large-scale performance analytics
  • Containerized and operated ETL pipelines on Azure Kubernetes Service (AKS) with Docker, enhancing scalability and reliability
  • Enabled large-scale, multi-tenant benchmarking analytics by architecting containerized ETL workloads on AKS, supporting 1,200+ dealerships with consistent performance, fault isolation, and operational reliability

Software Engineer

Vyako Technologies
Nagpur
06.2016 - 06.2017
  • Built REST APIs using PHP CodeIgniter and optimized SQL queries
  • Assisted with staged data pipelines (raw, refined, curated) for reporting

Education

M.S. - Computer Science

San Francisco State University
08.2017 - 01.2020

B.E. - Computer Engineering

Nagpur University
08.2012 - 04.2016

Skills

  • Python
  • Java
  • SQL
  • Scala
  • Apache Spark
  • Kafka
  • Flink
  • Databricks
  • Airflow
  • Hudi
  • AWS
  • Snowflake
  • dbt
  • Docker
  • Kubernetes
  • Terraform

Certification

  • AWS Data Engineer - Associate, 2024
  • Databricks Data Engineer
  • Databricks Generative AI

Timeline

Data Engineer

InfoCepts
12.2025 - Current

Senior Data Engineer

Zendesk
12.2021 - 12.2024

Software Engineer II - Data Engineer

CDK Global
11.2019 - 12.2021

M.S. - Computer Science

San Francisco State University
08.2017 - 01.2020

Software Engineer

Vyako Technologies
06.2016 - 06.2017

B.E. - Computer Engineering

Nagpur University
08.2012 - 04.2016