
Data Engineer with 2+ years of experience building and optimizing cloud-based data pipelines on Azure using Azure Databricks, Azure Data Factory, PySpark, SQL, and Python. Experienced in developing scalable ETL workflows, metadata-driven ingestion frameworks, and analytics-ready datasets for the retail and supply chain domains. Proven ability in workflow automation and data quality validation. Strong background in data processing, data modeling, and pipeline orchestration, enabling reliable data delivery for downstream analytics, reporting, and data science initiatives.
1. Data & Analytics Datasets:
1PEPCAN Dashboard (Sweet, Salty, Weekly Tables).
Tools: Azure Databricks, PySpark, SQL.
POS Data Ingestion: Walmart and LCL.
Tools: Azure Databricks, PySpark, SQL.
Crossmark Audit Data Processing.
Tools: Azure Databricks, PySpark, SQL.
Employee Data ETL Automation (DDH).
Tools: Azure Data Factory, Azure Blob Storage, SQL Server.
Pricing Scorecard Analytics.
Tools: Azure Databricks, PySpark, SQL.
Trifacta to Databricks Migration.
Tools: Azure Databricks, PySpark, SQL.
2. Supply Chain Datasets:
Fleet Driver Scorecard Automation.
Tools: Azure Databricks, PySpark, PowerApps, SharePoint.
Programming Languages
PySpark, Python, SQL
Cloud Platforms
Microsoft Azure, Azure Databricks, Azure Data Factory, Azure Blob Storage
Data Engineering & ETL
Data Pipeline Development, Data Ingestion Frameworks, Metadata-driven ETL, Data Lake Architecture (Bronze/Silver/Gold), Workflow Automation
Databases & Storage
SQL Server, Data Lake Storage, Teradata
Development Practices
Agile Development, CI/CD Awareness, Data Quality Validation, Job Monitoring & Optimization