Summary
Results-driven professional with 19.5+ years of experience in Big Data, AI, Machine Learning, and Java/J2EE, including 3 years of specialization in AI and NLP. Proficient in building scalable, high-performance data solutions using Hadoop, Spark, cloud platforms (GCP/AWS/Azure), and cutting-edge AI models such as GPT and Hugging Face Transformers.
Skills
Data Engineering & Architecture: Hadoop, Spark, Kafka, Storm, Airflow, Flink, Oozie, YARN, DBT.
Big Data & Storage: HDFS, Hive, Sqoop, HBase, Cassandra, MongoDB, Snowflake, Redshift, BigQuery, Azure Data Lake.
AI & Machine Learning: GPT Models, LLMs, Scikit-Learn, Hugging Face, TensorFlow, PyTorch, Spark MLlib, NLP, Generative AI.
Streaming & Real-Time Analytics: Kafka (Confluent), Spark Streaming, Apache Storm, Flink.
Cloud Technologies: GCP (Dataproc, BigQuery, Data Fusion), AWS (EMR, Glue, S3), Azure (AI Services, Cognitive Search).
Programming & Development: Python, Scala, Java, J2EE (Spring, Hibernate), C++, PySpark.
Data Modeling & ETL: DBT, Erwin, Lucidchart, Microsoft Visio, Toad Data Modeler, Fact & Dimensional Modeling.
Orchestration & Workflow Automation: Apache Airflow, Oozie, CI/CD for data pipelines.
Reporting & Visualization: QlikView, Power BI, Tableau, Looker.
Performance Optimization: Spark & Hive tuning, distributed cluster management, AI model deployment efficiency.
Projects Worked On
Client: Telus Communication, Canada
Designed and implemented advanced AI and data processing solutions, focusing on data governance, curation, and transformation. Led end-to-end design and development of data pipelines, ensuring seamless integration with AI models to enhance customer experience and operational efficiency.
Technologies: GCP, Data Fusion, Dataproc, BigQuery, GCP Data Lake, Google Gemini, Hugging Face, GPT models, LLMs, AutoML
- Data Governance and Cleansing: Established robust governance policies for customer datasets, ensuring compliance and quality through automated validation workflows.
- Data Curation and Transformation: Designed ETL pipelines for cleaning and restructuring data to improve model accuracy and analytics.
- Data Lake Development: Built a scalable, cloud-based data lake on GCP for storing large datasets, enabling real-time insights and AI model integration.
- AI Integration: Deployed LLMs, including GPT models and Google Gemini, for predictive analytics and customer support optimization.
Client: US Bank, USA
Spearheaded the design and solutioning of data ingestion, governance, and curation processes to consolidate fraud insights and enable predictive risk management.
Technologies: Spark, Databricks, Kafka, Python, Tableau
- Designed and developed scalable data pipelines for ingesting high-velocity financial transaction data.
- Implemented robust governance frameworks to ensure data quality and regulatory compliance.
- Optimized ETL processes for real-time anomaly detection and fraud analytics.
Client: Uplight, USA (Google End Client)
Designed and developed data solutions, emphasizing data governance and transformation for energy analytics. Led the integration of curation workflows with analytics platforms to provide real-time energy usage insights.
Technologies: GCP, Data Fusion, Dataproc, BigQuery
- Built end-to-end pipelines for real-time data ingestion, transformation, and energy data analytics.
- Designed scalable cloud-based data storage and governance solutions to support advanced analytics applications.
- Enhanced data reliability and quality by implementing automated curation and validation processes.
Client: BNY Mellon, USA
Designed and developed scalable data pipelines with a focus on data governance, curation, and transformation for a rule-based billing management system.
Technologies: Spark, Airflow, AWS, Hive, Erwin, Collibra
- Established a Data Lake architecture to centralize financial data processing.
- Implemented data transformation processes to meet complex billing requirements.
- Automated workflows to ensure data quality, reduce manual interventions, and optimize billing accuracy.
Client: Nike, USA
Architected data governance and curation frameworks for an advanced data lake solution. Led the design and development of end-to-end data solutions to support analytics and reporting initiatives.
Technologies: AWS, Spark, Airflow, Hive, Snowflake, Toad Data Modeler
- Designed and implemented an AWS-based Data Lake for processing large-scale customer and sales data.
- Developed automated data pipelines with Apache Airflow to improve data availability and reliability.
- Built data transformation workflows to enable Snowflake-powered advanced analytics and reporting.
EISL (UBS, USA): Kafka-based data orchestration and Spark-based data aggregation.
Chevron EDAP: Built enterprise analytics platforms for data filtration and business dashboards.
COOP NDX Loyalty: Migrated and transformed loyalty data using Spark SQL.
AT&T RAPTOR: Deployed Hadoop/Spark/MongoDB-based service assurance systems.
Vodafone Socio Data Archival: Architected CRM archival and analytics systems using Hadoop.
EXPERIENCE
SENIOR ARCHITECT, Wipro Technologies, Bangalore, Oct 2018 - Nov 2024
ASSOCIATE DIRECTOR, Nielsen India Pvt. Ltd., Bangalore, July 2017 - Feb 2018
SOLUTION ARCHITECT, TechMahindra Ltd., Bangalore, April 2016 - April 2017
MANAGER, Capgemini India Pvt Ltd., Bangalore, Nov 2014 - Nov 2015
SENIOR MANAGER, Accion Labs India Pvt Ltd., Bangalore, Dec 2012 - Oct 2014
LEAD CONSULTANT, ITC Infotech India Ltd., Bangalore, Aug 2011 - Sept 2012
TECH LEAD, Symphony Services, Bangalore, Aug 2010 - Aug 2011
MODULE LEAD, MphasiS, Bangalore, Sept 2009 - July 2010
SSE, Exilant Consulting Pvt. Ltd., Bangalore, March 2009 - July 2009
SSE, Fujitsu Consulting India Pvt. Ltd., Noida, Jan 2008 - Jan 2009
SOFTWARE ENGINEER, L&T Infotech Pvt. Ltd., Mumbai, July 2007 - Jan 2008
SOFTWARE ENGINEER, Patni Computers Pvt. Ltd., Mumbai, Nov 2005 - July 2007
EDUCATION
EXECUTIVE PROGRAMME IN BUSINESS MANAGEMENT, IIM Calcutta, 2011
MASTER OF COMPUTER APPLICATIONS (MCA), Utkal University, 2000-2003