Nayan G.

AI Tech Lead

Available in IST Timezone


Summary

  • 11 years of experience as a Data Scientist, Machine Learning Engineer, and Python Developer.
  • Experience in ensembling machine learning algorithms.
  • Experience in building recommendation systems using a cosine similarity matrix.
  • Knowledge of Natural Language Processing techniques such as Part-of-Speech (POS) tagging, entity extraction, topic modelling, keyword extraction, Word2Vec, GloVe, TF-IDF, and word clouds.
  • Experience with different time series methods, i.e., ARIMA, SARIMA, decomposition, simple moving average, exponential average, and LSTM.
  • Knowledge of deep learning tools like Keras.
  • Experience in building chatbots with the Python Rasa package.
  • Experience in building Q&A chatbots on open-source and proprietary LLMs using LlamaIndex, LangChain, Hugging Face, OpenAI APIs (ChatGPT), and ChatGroq.
  • Experience with Airflow for automation and scheduling.
  • Experience in web scraping using the Python Requests, Selenium, lxml, and Beautiful Soup packages.
  • Experience in extracting tables from both system-generated and scanned PDFs using Camelot, Tabula, PDFMiner, PDF2Table, PDF2Text, and OpenCV.
  • Experience in web development using Python Django, HTML, CSS, JS, and Ajax.
  • Experience in API development using the Python Django REST Framework, Flask, and FastAPI packages.
  • Experience in generating free SSL certificates using Let's Encrypt / Certbot.
  • Experience in deploying web applications using Nginx, Apache, Gunicorn, and supervisorctl.
  • Experience in building CI/CD pipelines using GitLab's Shell and Docker executors.
  • Experience with Docker images and containers.
  • Experience with the end-to-end product development cycle.
  • Experience in team management, Scrum, and Agile methodologies.
  • Knowledge of cloud technologies like AWS and GCP.

Technical Skills

Technologies & Frameworks: Machine Learning, Natural Language Processing, DevOps, SQL, Django, Rasa, Time Series Forecasting, Regression, Anomaly Detection, Clustering.

Languages & Packages: Python, R, LlamaIndex, LangChain, Hugging Face, Scikit-learn, Pandas, Keras, Prophet, NLTK, Requests, Selenium, OpenCV, Tabula, Camelot, PDF2Table, PDF2Text, Gensim, spaCy, PIL, TesseractOCR, R Shiny
Operating Systems: Windows, Linux
Versioning: GitHub, GitLab, Bitbucket
Project Management Tools: JIRA

Projects Worked On

An interactive PaaS for transforming raw data into meaningful analytics:
Project Description: The platform is a Platform as a Service (PaaS) solution that allows customers to seamlessly handle end-to-end data analysis workflows. It provides an integrated environment where users can upload datasets, perform data preprocessing, apply various machine learning algorithms (such as regression, clustering, anomaly detection, and time series analysis), and visualize the results through interactive dashboards. This platform aims to simplify the data analytics process, enabling users with minimal technical knowledge to leverage powerful analytical tools.
Technologies Used:

  • Machine Learning Frameworks: Scikit-learn, pmdarima, Keras, XGBoost, Statsmodels, Ray Tune
  • Data Processing & Preprocessing: Pandas, NumPy

Domain: Machine Learning

Roles & Responsibilities:

  • Data Preprocessing: Develop data cleaning and preprocessing pipelines to handle missing data and outliers and to ensure data consistency.
  • Model Development: Implement machine learning algorithms (regression, clustering, anomaly detection, time series) for different analytical use cases.
  • Algorithm Optimization: Fine-tune algorithms to ensure they are efficient and scalable, adapting to various dataset sizes and complexities.
  • Collaboration: Work with the product and engineering teams to integrate new machine learning functionalities into the platform.
  • Customer Support: Assist clients with data analysis workflows, provide support for choosing the right algorithms, and guide them in interpreting results.
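
The preprocessing step described above can be sketched as follows. This is a minimal, illustrative pipeline (the column name `sales` and thresholds are hypothetical, not taken from the actual platform): missing numeric values are imputed with the median and outliers are clipped using the common 1.5×IQR rule.

```python
import pandas as pd

def preprocess(df, numeric_cols):
    """Fill missing numeric values with the median and clip 1.5*IQR outliers."""
    out = df.copy()
    for col in numeric_cols:
        out[col] = out[col].fillna(out[col].median())
        q1, q3 = out[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        out[col] = out[col].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return out

# Toy dataset with one missing value and one extreme outlier.
raw = pd.DataFrame({"sales": [10.0, 12.0, None, 11.0, 500.0]})
clean = preprocess(raw, ["sales"])
```

A real platform would make the imputation strategy and outlier rule configurable per column; the IQR rule is just one reasonable default.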

 

Q&A Chatbot with Vector Embeddings and Database:
Project Description: Developed a Q&A chatbot on custom data, integrating vector embeddings and database access for accurate and informative responses. The client runs a call center that provides customer care services to different companies. The goal of the project was to surface information quickly and accurately for the questions asked of the customer care executives, so that they do not need to manually review all the product documentation.
Technologies Used:

  • Programming Languages: Python
  • AI/ML Libraries: LlamaIndex, LangChain, Hugging Face, OpenAI's ChatGPT
  • Database: PostgreSQL
  • Other: Prompt Engineering

Domain: Natural Language Processing
Roles & Responsibilities:

  • Worked as an NLP Engineer.
  • Developed the Q&A chatbot for custom data.
  • Evaluated the chatbot and made necessary changes over time, i.e., hyperparameter tuning for proper answers.
  • Performed CRUD operations on the data provided to the LLM and regenerated the vector embeddings accordingly.
  • Evaluated hallucination of the LLM and mitigated it by changing the input data format, hyperparameter tuning, and prompt engineering.
  • QA testing of the chatbot.
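
The core retrieval step behind such a chatbot can be sketched without any framework: documents and queries are embedded as vectors, and the most relevant document is the one with the highest cosine similarity to the query. The document names and the hand-made 3-d vectors below are toy stand-ins; in the actual project the embeddings would come from an embedding model via LlamaIndex or LangChain.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy "embeddings": hypothetical documents with hand-made vectors.
docs = {
    "refund policy": np.array([0.9, 0.1, 0.0]),
    "shipping times": np.array([0.1, 0.9, 0.1]),
    "warranty terms": np.array([0.0, 0.2, 0.9]),
}

def retrieve(query_vec, k=1):
    """Return the k document names most similar to the query vector."""
    scored = sorted(docs.items(),
                    key=lambda kv: cosine_sim(query_vec, kv[1]),
                    reverse=True)
    return [name for name, _ in scored[:k]]

best = retrieve(np.array([0.8, 0.2, 0.1]))
```

The retrieved passages are then fed to the LLM as context, which is what keeps the answers grounded in the client's own documentation.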

 

Airflow DAGs for Data Ingestion and Processing & Time Series Forecasting Model:
Project Description:
Created Airflow DAGs (Directed Acyclic Graphs) to automate and schedule data ingestion and processing pipelines. The client provides IT services for energy conservation in commercial buildings. For this service, the client integrates with different energy data providers, which deliver data via different means such as FTP, API, and email. The goal of the project was to use Airflow to automate fetching the data, cleaning it, preparing it in the required format, and pushing it to the respective tables. The client operates in the Australian energy market, where energy charges vary by season and time of day. To serve end users, we developed a time series based forecasting model. The model uses the decomposition method to separate trend, seasonality, and cycle, then applies SARIMA and ARIMA to forecast the individual components and combines them into a final forecast of energy consumption and the energy bill.
Technologies Used:

  • Workflow Automation: Apache Airflow
  • Data Ingestion: Python libraries like pandas, requests
  • Data Processing: Python libraries like NumPy, Pandas
  • Time Series Model Building: Python libraries like Statsmodels, Prophet, Keras

Domain: Machine Learning, DevOps
Roles & Responsibilities:

  • Worked as a Data Engineer.
  • Developed the Airflow scripts for the ETL process.
  • Supported and maintained the Airflow operations.
  • Deployed a time series based forecasting model to forecast energy demand and the energy bill.
  • Deployed MWAA (AWS-managed Airflow) and made the necessary changes to run Airflow on AWS.
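
The decomposition step of the forecasting model can be illustrated with a simplified additive decomposition. The project used Statsmodels; this stand-in (on a synthetic series with a hypothetical period of 4) estimates the trend with a centered moving average and the seasonal component with per-phase means, with only rough handling at the series edges.

```python
import numpy as np

def decompose_additive(y, period):
    """Split y into trend (moving average), seasonal, and residual parts."""
    n = len(y)
    # Centered moving average as a crude trend estimate (edges are rough).
    trend = np.convolve(y, np.ones(period) / period, mode="same")
    detrended = y - trend
    # Average each phase of the cycle to estimate the seasonal component.
    seasonal = np.array([detrended[i::period].mean() for i in range(period)])
    seasonal = np.tile(seasonal, n // period + 1)[:n]
    resid = y - trend - seasonal
    return trend, seasonal, resid

# Synthetic series: linear upward trend plus a period-4 seasonal pattern.
t = np.arange(48)
season = np.tile([5.0, -2.0, -4.0, 1.0], 12)
y = 0.5 * t + season
trend, seasonal, resid = decompose_additive(y, 4)
```

After such a split, each component (trend, seasonality) can be forecast separately, e.g. with ARIMA/SARIMA, and the forecasts added back together.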

 

Time series Based Inventory Management:
Project Description:
The client is in the glass industry, providing different raw and processed glasses, such as toughened glass and acoustic glass, for different purposes. For this project, the client wanted a smart inventory management system that indicates when to order raw material for glass and in what quantity. We built a time series based demand forecasting model using the Keras LSTM method. The model provides the expected demand for individual months and thus helps manage the inventory.
Technologies Used:

  • Programming Languages: Python
  • Data Processing: Python libraries like NumPy, Pandas
  • Time Series Model Building: Python libraries like Statsmodels, Prophet, Keras

Domain: Machine Learning

Roles & Responsibilities:

  • Worked as an ML Engineer.
  • Developed and deployed the time series based demand forecasting model.
  • Evaluated the model and ran training cycles with new data over time.
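
A key preparation step for any LSTM forecaster is turning the demand history into supervised (input window, next value) pairs. The sketch below shows that windowing step only, on a stand-in series; the actual model training in Keras is omitted, and the lookback of 3 is an arbitrary illustrative choice.

```python
import numpy as np

def make_windows(series, lookback, horizon=1):
    """Turn a 1-D series into (X, y) pairs for sequence models like an LSTM."""
    X, y = [], []
    for i in range(len(series) - lookback - horizon + 1):
        X.append(series[i : i + lookback])
        y.append(series[i + lookback : i + lookback + horizon])
    # Keras LSTMs expect inputs shaped (samples, timesteps, features).
    return np.array(X)[..., None], np.array(y)

demand = np.arange(10, dtype=float)  # stand-in for monthly demand figures
X, y = make_windows(demand, lookback=3)
```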

 

Web Scraping of Financial Data:
Project Description:
This project involves developing a web scraping system to extract financial data from multiple online sources, such as stock exchanges, financial news websites, and investment portals. The primary goal is to organize this data into structured formats for further analysis, such as trend forecasting, portfolio optimization, and risk management. The data will be used to support investment decisions, build predictive models, and visualize key financial metrics for stakeholders.
Technologies Used:

  • Programming Languages: Python
  • Libraries & Frameworks: Beautiful Soup, Scrapy, Selenium, Requests, Pandas, NumPy

Domain: Finance & Investment
Roles & Responsibilities:

  • Worked as a Data Engineer / Web Scraper Developer.
  • Designed and developed web scraping scripts using Requests and Selenium.
  • Handled dynamic content scraping with tools like Selenium.
  • Implemented data cleaning and parsing.
  • Stored scraped data in databases like MSSQL.
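
The parsing half of the scraper can be sketched with the standard library alone (the project used Beautiful Soup and Scrapy; `html.parser` here is a dependency-free stand-in, and the ticker/price table is a made-up miniature of a financial page).

```python
from html.parser import HTMLParser

class CellExtractor(HTMLParser):
    """Collect the text of every <td> cell from an HTML table."""
    def __init__(self):
        super().__init__()
        self.in_td = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_td = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_td = False

    def handle_data(self, data):
        if self.in_td and data.strip():
            self.cells.append(data.strip())

# Miniature stand-in for a fetched stock-quote page.
html = "<table><tr><td>ACME</td><td>123.45</td></tr></table>"
parser = CellExtractor()
parser.feed(html)
```

In the real pipeline the HTML would come from Requests (static pages) or Selenium (JavaScript-rendered pages), and the extracted cells would be normalized with Pandas before database insertion.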

 

Hindi & English Language Chatbot with Python-Rasa, Zulip, and Sendbird:
Project Description:
Developed a chatbot application using Python-Rasa, Zulip, and Sendbird for conversational interaction and information delivery. The client provides IT services in the construction industry. The goal of the project was to develop a chatbot through which anyone can raise an issue at a construction site.
Technologies Used:

  • Programming Languages: Python
  • NLP Libraries: Gensim, spaCy
  • Chatbot Frameworks: Rasa
  • Communication Platforms: Zulip, Sendbird

Domain: Natural Language Processing, Machine Learning

Roles & Responsibilities:

  • Worked as an NLP Engineer.
  • Developed the chatbot for the Hindi and English languages with a prediction model.
  • Evaluated the prediction model and ran training cycles with new data over time.
  • Evaluated the chatbot and made necessary changes, i.e., adjusting intents and actions in Rasa.
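
A bilingual Rasa bot of this kind is trained from YAML training data. The fragment below is purely illustrative (the intent names and example utterances are hypothetical), following Rasa's NLU training-data format, with English and Hindi examples under the same intent:

```yaml
version: "3.1"
nlu:
  - intent: report_issue
    examples: |
      - There is a water leakage on floor 3
      - साइट पर बिजली की समस्या है   # "There is an electricity problem at the site"
  - intent: greet
    examples: |
      - hello
      - नमस्ते
```

Tuning the bot, as described above, largely means editing such intents and their examples and retraining the model.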

 

Recommendation System for Restaurant Billing Software:

Project Description: Developed a recommendation system for an IT company providing billing software to restaurants. The system analyzed customer purchase data and dining preferences to recommend personalized menu items and promotions.

Technologies Used:

  • Programming Language: Python
  • Data Analysis Libraries: Pandas, NumPy
  • Recommendation Algorithms: Collaborative filtering, content-based filtering

Domain: Machine Learning

Roles & Responsibilities:

  • Worked as an ML consultant.
  • Created a prototype of a content-based recommendation system based on the cosine similarity matrix.
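
The cosine-similarity prototype can be sketched as follows. The menu items and their feature vectors are hypothetical (e.g. one-hot ingredient/tag profiles); the point is the pairwise similarity matrix and the lookup of the most similar item.

```python
import numpy as np

# Hypothetical menu items with toy one-hot feature profiles.
items = ["margherita", "pepperoni", "tiramisu"]
features = np.array([
    [1, 1, 0, 0],   # margherita: dough, cheese
    [1, 1, 1, 0],   # pepperoni:  dough, cheese, meat
    [0, 0, 0, 1],   # tiramisu:   dessert
], dtype=float)

# Pairwise cosine similarity matrix: normalize rows, then take dot products.
normed = features / np.linalg.norm(features, axis=1, keepdims=True)
sim = normed @ normed.T

def recommend(item, k=1):
    """Return the k items most similar to the given one (excluding itself)."""
    i = items.index(item)
    order = np.argsort(-sim[i])
    return [items[j] for j in order if j != i][:k]
```

In the real system, the feature vectors were derived from customer purchase data and dining preferences rather than hand-written.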

 

Live Image Recognition Prototype:

Project Description: Developed a prototype for real-time image recognition using the Haar Cascades algorithm for object detection. A planned future extension of the project was detecting skin diseases.

Technologies Used:

  • Programming Language: Python
  • Computer Vision Libraries: OpenCV, PIL
  • Machine Learning Algorithm: Haar Cascades

Domain: Computer Vision

Roles & Responsibilities:

  • Worked as an ML consultant.
  • Created a prototype for real-time image recognition.

 

Disease Prediction Model:

Project Description: Built a disease prediction model using the Naive Bayes algorithm for classifying symptoms and suggesting potential diagnoses. The company operates in the healthcare field, providing EHR facilities. The goal of the project was to assist doctors in predicting diseases based on symptoms.

Technologies Used:

  • Statistical Language: R
  • Machine Learning Algorithm: Naive Bayes

Domain: Natural Language Processing, Machine Learning

Roles & Responsibilities:

  • Worked as an ML consultant.
  • Created a model to predict diseases based on symptoms.
  • Ran several cycles of training, testing, and validation to improve the model.
  • Prepared an NLP data cleaning pipeline and generated a bag of words and word embeddings.
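
To show the core idea, here is a from-scratch multinomial Naive Bayes over symptom tokens, with Laplace smoothing, computed in log space. The project was done in R; this is an illustrative Python re-sketch, and the symptom/diagnosis rows are toy data, not real medical knowledge.

```python
import math
from collections import Counter, defaultdict

# Toy symptom -> diagnosis training data (illustrative only).
train = [
    ("fever cough fatigue", "flu"),
    ("fever cough", "flu"),
    ("sneezing runny_nose", "cold"),
    ("runny_nose cough", "cold"),
]

class_counts = Counter(label for _, label in train)
word_counts = defaultdict(Counter)
vocab = set()
for text, label in train:
    for w in text.split():
        word_counts[label][w] += 1
        vocab.add(w)

def predict(text):
    """Multinomial Naive Bayes with Laplace smoothing, in log space."""
    best, best_lp = None, -math.inf
    for label, n in class_counts.items():
        lp = math.log(n / len(train))            # log prior
        total = sum(word_counts[label].values())
        for w in text.split():
            # Laplace-smoothed log likelihood of each symptom token.
            lp += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        if lp > best_lp:
            best, best_lp = label, lp
    return best
```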

 

Claim Prediction Model:

Project Description: Developed a model to predict claim acceptance or denial using the Naive Bayes algorithm and analyzed claim data. The company operates in the healthcare field, providing EHR facilities. The goal of the project was to determine whether a claim would be accepted or rejected.

Technologies Used:

  • Statistical Language: R
  • Machine Learning Algorithm: Naive Bayes

Domain: Natural Language Processing, Machine Learning

Roles & Responsibilities:

  • Worked as an ML consultant.
  • Created a model to predict claim acceptance or denial.
  • Ran multiple cycles of training, testing, and validation for model improvement.
  • Prepared an NLP data cleaning pipeline and generated a bag of words and word embeddings.

 

Optical Character Recognition Prototype:

Project Description: Created a prototype for extracting text from images using the Tesseract library.

Technologies Used:

  • Programming Language: Python
  • Text Extraction Library: Tesseract

Domain: Machine Learning

Roles & Responsibilities:

  • Worked as an ML consultant.
  • Prepared a prototype to extract text from scanned images.

 

NLP Project for Medicine and Disease Extraction:

Project Description: Utilized Keras and NLP techniques like Word2Vec, GloVe, and LSTM to extract relevant information (medicine names, diseases, symptoms) from medical text data.

Technologies Used:

  • Programming Language: Python
  • Deep Learning Library: Keras
  • NLP Libraries: Gensim, spaCy
  • NLP Techniques: Word2Vec, GloVe, Topic Modeling, Keyword Extraction, POS tagging, Named Entity Recognition, LSTM

Domain: Natural Language Processing

Roles & Responsibilities:

  • Worked as an ML consultant.
  • Prepared an NLP data cleaning pipeline and generated a bag of words and word embeddings.
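
The cleaning-pipeline-plus-bag-of-words step mentioned above can be sketched in plain Python (the stopword list and the sample medical snippets are made up for illustration): lowercase, keep alphabetic tokens, drop stopwords, then build one count vector per document over a shared vocabulary.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "of", "and", "was", "with"}

def clean(text):
    """Lowercase, keep alphabetic tokens, drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

def bag_of_words(docs):
    """Return a shared sorted vocabulary and one count vector per document."""
    cleaned = [clean(d) for d in docs]
    vocab = sorted({w for doc in cleaned for w in doc})
    vectors = [[Counter(doc)[w] for w in vocab] for doc in cleaned]
    return vocab, vectors

# Hypothetical snippets of medical text.
docs = ["The patient was treated with Aspirin.",
        "Aspirin reduced the fever."]
vocab, vectors = bag_of_words(docs)
```

Dense embeddings such as Word2Vec or GloVe replace these sparse count vectors in the actual extraction model, but the cleaning step stays the same.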

 

Credit Scoring Automation and Model Development:

Project Description: Developed and automated the credit scoring process using R and Python. Covered data extraction, transformation, scaling, model training, and performance evaluation. Used WOE and IV approaches for robust model building.

Technologies Used:

  • Programming Languages: R, Python
  • Data Analysis Libraries: Pandas, NumPy
  • Machine Learning Libraries: Statsmodels (Python), Scikit-learn (Python)
  • Model Validation Techniques: WOE (Weight of Evidence), IV (Information Value)

Domain: Machine Learning

Roles & Responsibilities:

  • Worked as a Data Scientist.
  • Developed an end-to-end pipeline for ETL, model updates, and deployment.
  • Researched new algorithms and data processing techniques to improve model accuracy.
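
The WOE/IV step can be made concrete with a small sketch: for each bin of a candidate variable, WOE = ln(share of goods / share of bads), and IV sums (share of goods − share of bads) × WOE over the bins. The income bins and borrower counts below are hypothetical.

```python
import math

def woe_iv(bins):
    """Compute Weight of Evidence per bin and the total Information Value.

    `bins` maps a bin label to a (n_good, n_bad) count pair.
    """
    total_good = sum(g for g, _ in bins.values())
    total_bad = sum(b for _, b in bins.values())
    woe, iv = {}, 0.0
    for label, (g, b) in bins.items():
        pg, pb = g / total_good, b / total_bad   # shares of goods / bads
        woe[label] = math.log(pg / pb)
        iv += (pg - pb) * woe[label]
    return woe, iv

# Hypothetical income bins with (good, bad) borrower counts.
bins = {"low": (100, 60), "mid": (300, 30), "high": (600, 10)}
woe, iv = woe_iv(bins)
```

A high IV flags the variable as predictive of default, and the per-bin WOE values can replace the raw variable as model input, which is what makes WOE binning popular in credit scorecards.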

 

Customer Segmentation and Product Recommendation:

Project Description: Analyzed customer data using NLP techniques like TF-IDF, Topic Modeling, and Keyword Extraction. Identified customer segments and personalized product recommendations.

Technologies Used:

  • Programming Languages: R, Python
  • NLP Libraries: Gensim, NLTK (Python), Stanford CoreNLP (Java)
  • Text Analysis Tools: IBM Watson, Apache OpenNLP
  • Data Visualization Tools: R Shiny, Tableau

Domain: Natural Language Processing, Machine Learning

Roles & Responsibilities:

  • Worked as a Data Scientist.
  • Researched and experimented with different NLP tools to identify customer segments.
  • Used unsupervised learning to make personalized product recommendations.
  • Set up a multi-node Hadoop and Spark cluster for large-scale data processing.
  • Deployed machine learning models using Spark MLLib and developed interactive dashboards using Tibco Spotfire.
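
The TF-IDF weighting used for segmentation can be sketched from scratch (the project used library implementations; this stand-in uses the smoothed form tf × (ln((1+N)/(1+df)) + 1), and the feedback snippets are invented):

```python
import math
import re

def tfidf(docs):
    """Smoothed TF-IDF: term frequency times ln((1+N)/(1+df)) + 1."""
    tokenized = [re.findall(r"[a-z]+", d.lower()) for d in docs]
    n = len(tokenized)
    df = {}                      # document frequency per word
    for doc in tokenized:
        for w in set(doc):
            df[w] = df.get(w, 0) + 1
    weights = []
    for doc in tokenized:
        tf = {w: doc.count(w) / len(doc) for w in set(doc)}
        weights.append({w: f * (math.log((1 + n) / (1 + df[w])) + 1)
                        for w, f in tf.items()})
    return weights

# Hypothetical customer feedback snippets.
docs = ["great pizza great service", "slow service"]
w = tfidf(docs)
```

Words that are frequent in one document but rare across the corpus get the highest weights, which is what lets clustering on these vectors separate customer segments by topic.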

 

Big Data Analytics and Model Deployment on Hadoop:

Project Description: Set up a multi-node Hadoop and Spark cluster for large-scale data processing. Deployed machine learning models using Spark MLLib. Developed interactive dashboards using Tibco Spotfire for data visualization.

Technologies Used:

  • Big Data Platform: Hadoop, Spark
  • Machine Learning Library: Spark MLLib
  • Data Visualization Tools: Tibco Spotfire

Domain: Big Data

Roles & Responsibilities:

  • Worked as a Data Scientist.
  • Researched and experimented with big data tools to test their performance on small datasets.

 

Report Automation with R Shiny and Case Separation via Dictionary-Based Text Mining:

Project Description: Developed interactive reports using R Shiny for automated data analysis and visualization. Implemented a dictionary-based text mining approach in R and Python to separate and categorize cases within the data.

Technologies Used:

  • Programming Languages: R, Python
  • Data Analysis Libraries: Pandas (Python)
  • Interactive Reporting: R Shiny
  • Text Mining: Dictionary-based approach (custom code)

Domain: Machine Learning

Roles & Responsibilities:

  • Worked as a Data Analyst.
  • Converted Excel-based reports into automated R Shiny reports.
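
The dictionary-based case separation can be sketched as simple keyword matching: each category owns a word list, and a case is assigned every category whose dictionary overlaps the case text. The categories, trigger words, and example cases below are hypothetical.

```python
# Hypothetical case-category dictionary of trigger words.
CATEGORIES = {
    "billing": {"invoice", "refund", "payment"},
    "delivery": {"shipment", "courier", "delayed"},
}

def categorize(text):
    """Assign every category whose dictionary shares a word with the text."""
    words = set(text.lower().split())
    matched = [c for c, keys in CATEGORIES.items() if words & keys]
    return matched or ["uncategorized"]

cases = ["Refund not processed for invoice 1182",
         "Courier delayed the shipment",
         "App crashes on login"]
labels = [categorize(c) for c in cases]
```

In the project, the categorized cases fed the R Shiny reports, so each dictionary update immediately changed how cases were grouped in the dashboards.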