Gaurav Bhedi

Snowflake, ETL, ELT, AWS, Python, Airflow
Pune

Summary

Dynamic Technical Architect with over 15 years of extensive experience in Data Engineering and Data Migration, driving innovative cloud-native solutions. Proficient in developing robust data pipelines using Apache Airflow and Snowflake, alongside a strong command of AWS and Python. A proven track record in leveraging cutting-edge technologies to enhance data workflows and optimize performance. Committed to delivering high-quality results that align with business objectives and promote efficient data management strategies.

Overview

16 years of professional experience
3 Certifications

Work History

Technical Architect

Atgeir Solutions
Pune
08.2024 - Current

Associate Data Architect

Atgeir Solutions
08.2021 - 08.2024

Tide.co

Data Platform Lead, July 2022 – Present
Roles & Responsibilities:

  • Data Mesh Transformation (Snowflake, dbt & Fivetran): As Engineering Manager of the data platform team, I led the implementation of a Data Mesh strategy, transforming our data architecture from a monolithic setup to a domain-oriented structure. This shift empowered teams to own their data products and drive insights independently. Leveraging Fivetran, Snowflake, dbt, and Looker, I built a scalable, secure data platform tailored to the organization's varied needs. I also promoted a self-service culture, driving team upskilling and tool adoption, which increased efficiency and accelerated data delivery across departments.
  • Snowflake Data Platform Enhancements:
    • Snowflake cost optimization: In August 2024, Tide ran into a 1.5M overage on its Snowflake contract. I took ownership of cost reduction alongside the domain EMs; through this collaboration we brought the overage down by 200K as of October 2024.
    • Automated role-based access control (RBAC): Implemented RBAC in Snowflake, enforcing tight controls and least-privilege access for users. The entire process was automated as code, ensuring version control and enhanced security (a minimal sketch follows this list).
    • Anonymization of credit card information: Identified and anonymized credit card data present in Snowflake, achieving compliance with UK data regulations. Using stored procedures, the automated process scanned the entire 80+ TB data warehouse at roughly $300 per execution, far cheaper than other options on the market.
    • Voice call data migration and transcription: Developed pipelines to migrate customer call recordings from a third-party application to an S3 bucket via API, then transcribed the audio to text with AWS Transcribe. This produced transcriptions of 6 million voice calls in just 4 days, enabling efficient complaints resolution.
    • Snowflake cost monitoring alerts: Implemented warehouse cost monitoring alerts in Snowflake, providing granular control over costs and surfacing daily cost spikes before they become surprises.
  • dbt Enhancements: Implemented robust data quality measures in dbt using open-source packages such as Elementary and dbt-expectations, significantly improving data reliability and trust across the organization. As the data platform lead, I also introduced critical guardrails for our large-scale dbt project of 4,000+ models: PR templates, CI checks, and enforced best practices to standardize development. These initiatives reduced production costs and minimized issues, supporting a stable, cost-efficient data environment.
  • Airflow Setup & Migration: Managed the setup and maintenance of Airflow on AWS, and oversaw a seamless migration to Amazon Managed Workflows for Apache Airflow (MWAA) when self-managed scaling became challenging. Executed the migration of over 120 DAGs with zero downtime or operational impact, ensuring continuous data pipeline reliability (an illustrative DAG appears after the project environment line below).
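The RBAC automation above can be pictured with a minimal sketch. This is not the Tide implementation; it assumes the snowflake-connector-python package, and the role, database, and connection names are hypothetical.

```python
# Illustrative RBAC-as-code sketch for Snowflake (hypothetical role, database,
# and connection names; assumes the snowflake-connector-python package).
import snowflake.connector

# Role -> least-privilege grant statements, kept under version control.
GRANTS = {
    "ANALYST_RO": [
        "GRANT USAGE ON DATABASE analytics TO ROLE ANALYST_RO",
        "GRANT USAGE ON SCHEMA analytics.marts TO ROLE ANALYST_RO",
        "GRANT SELECT ON ALL TABLES IN SCHEMA analytics.marts TO ROLE ANALYST_RO",
    ],
}


def apply_grants(conn) -> None:
    """Create roles if absent and apply grants; all statements are idempotent."""
    cur = conn.cursor()
    try:
        for role, statements in GRANTS.items():
            cur.execute(f"CREATE ROLE IF NOT EXISTS {role}")
            for stmt in statements:
                cur.execute(stmt)
    finally:
        cur.close()


if __name__ == "__main__":
    # Placeholder credentials; role management usually runs as SECURITYADMIN.
    conn = snowflake.connector.connect(
        account="my_account", user="rbac_bot", password="***", role="SECURITYADMIN"
    )
    apply_grants(conn)
    conn.close()
```

Running such a script from CI keeps every grant reviewable in pull requests, which is the version-control benefit described above.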

Project Environment: Snowflake, dbt, Fivetran, Airflow, Looker
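For flavor, here is a minimal, MWAA-compatible Airflow 2.x DAG of the kind migrated; the dag_id, schedule, and task body are hypothetical placeholders, not one of the actual production DAGs.

```python
# Minimal, MWAA-compatible Airflow 2.x DAG (hypothetical dag_id and task body;
# a stand-in for the 120+ production DAGs, which are not reproduced here).
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_and_load(**context):
    # Placeholder body: a real DAG would trigger an extraction/load job here.
    print("run date:", context["ds"])


with DAG(
    dag_id="example_daily_load",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,  # avoid backfilling historical runs on first deploy
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
) as dag:
    PythonOperator(task_id="extract_and_load", python_callable=extract_and_load)
```

Because MWAA runs stock Airflow 2.x, DAGs written against the standard operators like this one can be moved from self-managed Airflow without code changes, which is what made a zero-downtime migration feasible.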

Associate Data Architect

VSquare Systems Pvt. Ltd.
03.2021 - 07.2021

Technology Lead

VSquare Systems Pvt. Ltd.
03.2019 - 01.2021

Tech Lead

Datametica Solutions Private Limited
01.2018 - 03.2019

StateAuto Insurance – Netezza to GCP Migration

Module Lead, Jan 2018 to Mar 2019

The client is a leading provider of automobile mutual insurance services; StateAuto builds trusted relationships by genuinely understanding its customers, and its pledge of reasonable rates with prompt, fair claim service remains its hallmark to this day.

As part of this project, one business module was migrated from Netezza to BigQuery on Google Cloud Platform (GCP), which delivered significant cost savings and performance improvements for the client. Over one thousand Informatica sessions were migrated from Netezza to write data into BigQuery, covering historical and incremental migration of 2,300 tables and 19 TB of data.

Roles & Responsibilities:

  • Analyzed the source data and the requirements document.
  • Analyzed existing Informatica workflows and Netezza stored procedures.
  • Created Spark functions and a framework for code migration (an illustrative BigQuery load appears after the project environment line).
  • Generated Scala code for all the Informatica sessions/mappings.
  • Developed Oozie job properties and workflow XML for Oozie scheduling.
  • Created new TWS jobs for running GCP code.
  • Completed unit testing for every session.
  • Performed data comparison and validation using Pelican (a Datametica-developed tool).
  • Handled system integration and email alert generation.

Project Environment: GCP, BigQuery, Scala, TWS, Pelican, Oozie, Netezza, Mainframe, Windows batch scripts, Unix, Informatica 9.x, SSIS
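As a rough illustration of the target side of that migration framework (not the generated Scala itself), here is how a migrated table load into BigQuery might look from PySpark; it assumes the spark-bigquery-connector is on the classpath, and the dataset, table, and bucket names are hypothetical.

```python
# Illustrative PySpark load into BigQuery (assumes the spark-bigquery-connector
# jar is available; dataset, table, and bucket names are hypothetical).
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("netezza_to_bq_example").getOrCreate()

# A local DataFrame stands in for one migrated Netezza table.
df = spark.createDataFrame([(1, "policy_a"), (2, "policy_b")], ["id", "policy"])

(df.write.format("bigquery")
   .option("temporaryGcsBucket", "my-staging-bucket")  # GCS staging for the load job
   .mode("append")
   .save("my_dataset.policies"))  # target BigQuery table
```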

IT Analyst

Tata Consultancy Services
05.2012 - 12.2017

Barclays – Huron Data Lake

ETL/Hadoop Developer, Jul 2016 to Dec 2017

Huron Compliance Data Lake is a data exploration platform with a variety of data sources in one place, enabling the ability to do advanced/modern analytics using machine learning and data science.

To ensure success, a Data Lake should be driven by a strategic approach, with a clear idea of the business benefits that the Data Lake is expected to drive.

In Huron, it became essential to define an end-to-end, sustainable engagement model that would deliver the anticipated business outcomes faster by embracing automation and institutionalizing improvements in the org structure, backed by the right technical capabilities and quality governance processes.

  • Extensively involved in the ETL data pipeline starting from data acquisition: extracting terabytes of data using Apache Sqoop from Oracle DB/Composite Studio into HDFS.
  • Cleansed and pre-processed data to discover insights, with analysis of breaches and violations using Hive/Spark.
  • Pushed the output of Hive queries from HDFS to the Elastic DB using Apache Pig, for business analysts to build dashboards.
  • Created Hive reporting tables/views for the Spotfire dashboards exposed to end users.
  • Implemented Slowly Changing Dimension Type 2 methodology to retain the full history of account and transaction information (sketched below).
  • Set up Git/Stash, Nolio, and Jenkins for code versioning and deployment.

Project Environment: Informatica, UNIX, Hadoop, Sqoop, HDFS, Hive, Impala, Spark, Autosys, Composite & Spotfire
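The SCD Type 2 step can be pictured with a simplified PySpark sketch. The project implemented this over Hive tables; the account/address columns, dates, and in-memory frames below are hypothetical stand-ins.

```python
# Simplified SCD Type 2 sketch in PySpark (hypothetical columns and values;
# the project ran this logic over Hive tables rather than in-memory frames).
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("scd2_example").getOrCreate()

HIGH_DATE = "9999-12-31"  # open-ended sentinel marking the current version

dim = spark.createDataFrame(
    [(1, "addr_old", "2020-01-01", HIGH_DATE, True)],
    ["account_id", "address", "valid_from", "valid_to", "is_current"],
)
incoming = spark.createDataFrame([(1, "addr_new")], ["account_id", "address"])
load_date = F.lit("2021-06-01")

# Current rows whose tracked attribute changed in the incoming batch.
cond = F.col("d.account_id") == F.col("i.account_id")
changed = (dim.filter("is_current").alias("d")
           .join(incoming.alias("i"), cond)
           .filter(F.col("d.address") != F.col("i.address")))

# Close out the superseded version of each changed account...
closed = (changed.select("d.account_id", "d.address", "d.valid_from")
          .withColumn("valid_to", load_date)
          .withColumn("is_current", F.lit(False)))

# ...and open a new current version carrying the changed attribute.
opened = (changed.select(F.col("i.account_id").alias("account_id"),
                         F.col("i.address").alias("address"))
          .withColumn("valid_from", load_date)
          .withColumn("valid_to", F.lit(HIGH_DATE))
          .withColumn("is_current", F.lit(True)))

# Rows untouched by this batch pass through unchanged.
unchanged = dim.join(changed.select(F.col("d.account_id").alias("account_id")),
                     "account_id", "left_anti")

history = unchanged.unionByName(closed).unionByName(opened)
history.show()
```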

Morgan Stanley – Project Winter ETL Remediation (Onsite NY)

Module lead

Mar 2015 to June 2016

Client is a global leader in providing the finest financial services, products and execution. Client delivers financial services to companies, governments and institutional investors from around the world, as well as individual investors.

As part of Project Winter, the Wealth Management ETL platform was remediated. The high-level activities performed (spanning 4,500 TWS jobs and 89 applications) are listed below.

  • Implemented Kerberos authentication (kerberization) in the database connections.
  • Implemented Teradata query banding (sketched below).
  • Converted all FTP transfers to SFTP.
  • Converted Mainframe FTP transfers to use NDM.
  • Moved all executables from SAN to AFS.
  • Implemented a standard folder structure across applications.

Project Environment: Teradata, Unix, Informatica PowerCenter 9.x, SSIS, BOXI, TWS
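Query banding tags each Teradata session or query with metadata so DBAs can attribute workload. In this project it was configured in the ETL connections; purely as an illustration, here is the same idea expressed via the teradatasql Python driver, with hypothetical host, credentials, and band values.

```python
# Illustrative Teradata query banding via the teradatasql Python driver
# (hypothetical host, credentials, and band values; the project set the band
# in the Informatica connection configuration instead).
import teradatasql

with teradatasql.connect(host="td_host", user="etl_user", password="***") as conn:
    with conn.cursor() as cur:
        # Tag every query in this session so DBAs can attribute the workload.
        cur.execute(
            "SET QUERY_BAND = 'ApplicationName=WinterETL;JobName=daily_load;' FOR SESSION"
        )
        cur.execute("SELECT 1")
```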

Morgan Stanley – SMTG

ETL Developer

June 2012 to Mar 2015

Client is a global leader in providing the finest financial services, products and execution. Client delivers financial services to companies, governments and institutional investors from around the world, as well as individual investors.

Spend Management group represents a holistic view of the activities involved in the "source-to-pay" process. This process includes spend analysis, sourcing, procurement, receiving, payment settlement and management of accounts payable and general ledger accounts.

In an enterprise, spend management means directing how money is spent to best effect in building products and services. The Spend Management Technology Group (SMTG) develops, deploys, and supports applications for the firm-wide Sourcing and Accounts Payable areas.

  • Interacted with business clients.
  • Analyzed the source data and the requirements document.
  • Developed new and modified existing mappings using all standard transformations and mapplets.
  • Developed Informatica sessions and workflows with dependencies.
  • Developed shell scripts as per requirements.
  • Handled Autosys setup and operations.
  • Developed and documented test cases for data validation between source and target systems.

Project Environment: Sybase, SYTS, Informatica PowerCenter 8.6.0, BOXI, UNIX, Autosys

Developer

Systech Solutions
Chennai
03.2010 - 04.2012

Completed several projects with Systech Solutions, listed below:

1) Twentieth Century Fox – Informatica/Netezza Migration

ETL Developer

December 2011 to April 2012

Project Environment: Netezza, MS SQL Server 2005, Informatica PowerCenter 8.6.1, Informatica PowerCenter 9.0.1, UNIX, SSIS

2) Rambus – PMO Datamart

ETL Developer

August 2011 – December 2011

Project Environment: MS SQL Server 2005, Informatica PowerCenter 8.6.0, OBIEE 11g, MS Excel

3) International Rectifier – HR Asia Datamart

ETL Developer

October 2010 – March 2011

Project Environment: MS SQL Server 2005, Informatica PowerCenter 8.6.0

4) Toyota Financial Services – Netezza Migration

Developer

July 2010 – September 2010

Project Environment: MS SQL Server 2005, Netezza TwinFin 5, Informatica PowerCenter 8.5.1

5) Juniper Networks – EDW Environment and Enhancement

Developer

May 2010 – July 2010

Project Environment: BusinessObjects XI, Oracle 9i, Informatica 8.6

6) Procter & Gamble – Mediabase Data Warehouse

ETL Developer & Tester

March 2010 – May 2010

Project Environment: SQL Server 2005, SSIS

Software Engineer

Prolifics Software & Technologies
Pune
10.2009 - 02.2010

Education

Bachelor of Science - Computer Science

Amravati University
Chikhli, India
07.2025 -

Skills

  • Snowflake
  • Informatica
  • Python
  • Amazon Web Services (AWS)
  • ELT methodologies
  • Matillion ETL
  • dbt modeling
  • Fivetran integration

Certification

SnowPro Advanced Architect Certification, 12-2023

SnowPro Core Certification, 08-2023

Oracle Database 11g: SQL Fundamentals, 12-2009

Timeline

Bachelor of Science - Computer Science

Amravati University
07.2025 -

Technical Architect

Atgeir Solutions
08.2024 - Current

SnowPro Advanced Architect Certification

12-2023

SnowPro Core Certification

08-2023

Associate Data Architect

Atgeir Solutions
08.2021 - 08.2024

Associate Data Architect

VSquare Systems Pvt. Ltd.
03.2021 - 07.2021

Technology Lead

VSquare Systems Pvt. Ltd.
03.2019 - 01.2021

Tech Lead

Datametica Solutions Private Limited
01.2018 - 03.2019

IT Analyst

Tata Consultancy Services
05.2012 - 12.2017

Developer

Systech Solutions
03.2010 - 04.2012

Oracle Database 11g: SQL Fundamentals

12-2009

Software Engineer

Prolifics Software & Technologies
10.2009 - 02.2010