Mandira Hawaldar

Data Engineer Assistant Engineer

Jalna

Summary

Data Engineer with 2+ years of experience building and optimizing cloud-based data pipelines on Azure using Azure Databricks, Azure Data Factory, PySpark, SQL, and Python. Experienced in developing scalable ETL workflows, metadata-driven ingestion frameworks, and analytics-ready datasets for the retail and supply chain domains. Proven ability in workflow automation and data quality validation. Strong background in data processing, data modeling, and pipeline orchestration, enabling reliable data delivery for downstream analytics, reporting, and data science initiatives.

Overview

3

years of professional experience

Work History

Data Engineer

PepsiCo

08.2023 - Current

1. Data & Analytics Datasets:

1PEPCAN Dashboard (Sweet, Salty, Weekly Tables)

Tools: Azure Databricks, PySpark, SQL.

Developed end-to-end ingestion pipelines for Nielsen datasets, implementing Bronze–Silver–Gold architecture in Databricks.
Created weekly analytics tables (product, market, fact, and period) for FLC and Quaker product categories used for business reporting.
These datasets feed the 1PEPCAN dashboard, which serves all business units (FLC, Quaker, and Beverages) across Canada.

POS Data Ingestion: Walmart and LCL.

Tools: Azure Databricks, PySpark, SQL.

Built and maintained POS data ingestion pipelines for major retail customers, including Walmart and LCL.
Refactored pipelines to align with the PGT framework and migrated data sources from PSA to Teradata, enabling reliable downstream analytics.

Crossmark Audit Data Processing.

Tools: Azure Databricks, PySpark, SQL.

Implemented and automated Crossmark audit event pipelines, supporting 9 audit events in 2025.
Refactored legacy pipelines to PGT-compliant Databricks workflows for improved reliability and maintainability.
Currently working on the Crossmark Beverage ETL pipeline.

Employee Data ETL Automation (DDH)

Tools: Azure Data Factory, Azure Blob Storage, SQL Server.

Built a metadata-driven ADF pipeline to ingest employee CSV datasets from Blob Storage.
Implemented dynamic file detection, metadata extraction, data loading into SQL Server, and automated archival.

Pricing Scorecard Analytics.

Tools: Azure Databricks, PySpark, SQL.

Built pricing scorecard datasets using Nielsen data to support pricing analytics for FLC and Quaker product lines.

Trifacta to Databricks Migration.

Tools: Azure Databricks, PySpark, SQL.

Converted multiple Trifacta data transformation workflows into Databricks PySpark notebooks, including Brand Ladder, Pricing Scorecard, and RGM flows.
Implemented transformation logic using Spark DataFrame operations to improve scalability and maintainability.

2. Supply Chain Datasets:

Fleet Driver Scorecard Automation.

Tools: Azure Databricks, PySpark, Power Apps, SharePoint.

Designed and implemented an end-to-end automation pipeline for Fleet Driver Scorecard reporting.
Developed Databricks ETL workflows to generate weekly and historical datasets consumed by Power BI dashboards.

Education

B.Tech in C.S.E. with Specialization in IoT

VIT Vellore

Vellore, India

04.2001 -

Skills

Programming Languages
PySpark, Python, SQL

Cloud Platforms
Microsoft Azure, Azure Databricks, Azure Data Factory, Azure Blob Storage

Data Engineering & ETL
Data Pipeline Development, Data Ingestion Frameworks, Metadata-driven ETL, Data Lake Architecture (Bronze/Silver/Gold), Workflow Automation

Databases & Storage
SQL Server, Data Lake Storage, Teradata

Development Practices
Agile Development, CI/CD Awareness, Data Quality Validation, Job Monitoring & Optimization
