Siddharth S

Expert AI/ML Engineer

* Zero Evaluation Fee

Available in IST Timezone


Summary:

  • 5+ years of experience in Python programming, AI/ML development, and scalable system design.
  • Strong expertise in Generative AI, LLMs (OpenAI GPT-4/4.1/mini), and ML model development.
  • Skilled in document processing, entity extraction, and generating business insights using AI/ML.
  • Proficient in backend frameworks (Flask, FastAPI) and in designing RESTful and GraphQL APIs.
  • Experienced in containerization (Docker), cloud deployment on Azure, and setting up CI/CD pipelines.
  • Hands-on with data pipelines, model training, and evaluation across domains like NLP, classification, and recommendation.

 

Skills:

  • Languages: Python, JavaScript, TypeScript
  • Frameworks & AI Tooling: Django, FastAPI, LangChain, LlamaIndex, Hugging Face Transformers
  • LLMs & Gen AI: OpenAI GPT, LLaMA 3.2, Claude, MCP (Model Context Protocol), Tool Calling, Prompt Engineering
  • RAG & Vector DBs: LangChain, FAISS, Weaviate, Pinecone, Qdrant, ElasticSearch, Chunking Optimization
  • Databases: PostgreSQL, MySQL, MongoDB, DynamoDB
  • Cloud Platforms: GCP (Vertex AI, BigQuery, Cloud Run), AWS (S3), Azure (OpenAI)
  • Authentication: OAuth 2.0, API Gateway, JWT, Security Frameworks
  • Frontend: React, TypeScript, Tailwind CSS
  • DevOps: Docker, Jenkins, GitHub Actions, n8n, Agile Delivery
  • Other Tools: Redis, Celery, Kafka, n8n (workflow automation), Pydantic, WebSockets, Asyncio, RESTful APIs

 

Projects:

 

Google Gemini GYM

Technologies Used: FastAPI, Celery, Redis + WebSockets, BigQuery, SQL (optimized queries), Docker, Google Drive API, Python (ETL/async processing), CI/CD integration (environment-based configs, testing frameworks)

Project Description: The Google Gemini GYM project is a backend evaluation and orchestration platform built to simulate scenarios using Large Language Models (LLMs). It provides clean REST APIs with authentication, filtering, and pagination while enabling evaluation workflows through asynchronous task orchestration. Data pipelines are designed to normalize and ingest historical evaluation results for analytics, ensuring fast, consistent, and reliable usage insights. The platform integrates with Google Drive metadata and multiple model providers such as Gemini, OpenAI, and Claude, with containerized deployments and CI-friendly workflows for scalability.

Responsibility:

  • Built a FastAPI backend implementing secure REST APIs with auth, pagination, and filtering.
  • Integrated n8n workflows to automate data normalization, cache priming, and real-time notifications during evaluation runs.
  • Orchestrated asynchronous evaluation workflows using Celery, including parsing Colab notebooks, running models, and broadcasting status updates through Redis and WebSockets.
  • Designed and implemented an ETL pipeline for ingesting and normalizing evaluation data from BigQuery with async batching, idempotent upserts, and cache priming.
  • Added ETL-side validations, schema mapping, and performance optimizations to avoid N+1 queries and ensure high-throughput imports.
  • Developed analytics modules exposing evaluation success rates, execution metrics, and system usage using optimized SQL queries.
  • Introduced a caching layer and accessor-based data access patterns to deliver fast and consistent API responses.
  • Integrated model orchestration support for Gemini, OpenAI, and Claude with resilient error handling and Google Drive metadata ingestion.
  • Containerized API services and background workers using Docker with environment-driven configuration.
  • Wrote unit and integration tests to ensure system reliability in CI/CD pipelines.
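The idempotent-upsert pattern described in the ETL work above can be sketched as a small, self-contained Python function (names here are illustrative, not taken from the actual codebase) — replaying the same batch must leave the store unchanged, which is what makes re-runs of an import safe:

```python
from typing import Iterable


def upsert_batch(store: dict, records: Iterable[dict], key: str = "id") -> dict:
    """Apply a batch of records idempotently: the last write for each
    key wins, and replaying the same batch is a no-op."""
    for rec in records:
        # Merge onto any existing row so partial records don't drop fields.
        store[rec[key]] = {**store.get(rec[key], {}), **rec}
    return store


batch = [
    {"id": 1, "status": "passed", "latency_ms": 120},
    {"id": 2, "status": "failed"},
    {"id": 1, "status": "passed", "latency_ms": 115},  # later row wins
]
store = upsert_batch({}, batch)
store = upsert_batch(store, batch)  # replay: no change
```

In the real pipeline the dict would be replaced by batched `INSERT ... ON CONFLICT DO UPDATE` statements against the warehouse, but the invariant being tested is the same.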

 

Asphare - Multi-Tenant SaaS Platform (Team Lead)

Technologies Used: Python, Django, DRF, PostgreSQL, Docker, Celery, Redis, PayPal API, CI/CD, AWS S3, Terraform, React

Project Description: Asphare is a schema-based, multi-tenant SaaS platform designed to serve multiple businesses in an isolated yet centralized environment. The platform supports dynamic onboarding, multilingual interfaces, RBAC (role-based access control), and a billing system integrated with PayPal. Siddharth led the backend development and architecture, ensuring secure data isolation, scalable deployment pipelines, and compliance with data protection regulations such as GDPR.

Responsibility:

  • Architected and led development of PostgreSQL schema-based multitenancy to isolate client data
  • Designed and implemented secure, modular REST APIs using Django and DRF for multi-tenant routing and data access
  • Integrated PayPal Invoicing APIs manually (without SDK) for recurring subscriptions and billing lifecycle management
  • Automated recurring billing sync, onboarding notifications, and data migrations using N8N workflows alongside Celery/Redis tasks
  • Configured Docker-based environments for development, staging, and production
  • Built and deployed infrastructure using Terraform and CI/CD pipelines for automated rollouts
  • Developed asynchronous background job processing using Celery and Redis (e.g., email notifications, billing sync)
  • Established GDPR-compliant access controls and data security policies
  • Collaborated with frontend, DevOps, and QA teams for seamless delivery
  • Led a team of developers, performed code reviews, and provided architectural guidance.
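Schema-based multitenancy in PostgreSQL typically works by scoping each request's session to the tenant's schema via `search_path`. A minimal sketch of that routing step (hypothetical names; production Django projects often use django-tenants, which wraps the same mechanism):

```python
import re


def set_tenant_sql(schema_name: str) -> str:
    """Return the SQL that scopes a PostgreSQL session to one tenant's
    schema, falling back to `public` for shared tables."""
    # Validate the schema name to keep it safe to interpolate.
    if not re.fullmatch(r"[a-z][a-z0-9_]*", schema_name):
        raise ValueError(f"invalid schema name: {schema_name!r}")
    return f'SET search_path TO "{schema_name}", public'


# In Django, per-request middleware would execute this, e.g.:
#   with connection.cursor() as cur:
#       cur.execute(set_tenant_sql(request.tenant.schema))
sql = set_tenant_sql("tenant_1")
```

Because every tenant's tables live under its own schema, queries issued after this statement only see that tenant's data, which is the isolation guarantee the platform relies on.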

 

Goonj - Real-time Voice Chat with Transcription and Al Integration

Technologies Used: Python, FastAPI, Redis, PostgreSQL, Celery, Docker, OpenAI, Azure Cognitive, AWS EC2, Amazon Connect

Project Description: Goonj is a real-time audio chat platform with built-in voice-to-text transcription and conversational AI capabilities. It allows users to engage in voice-based interactions, which are transcribed using Azure Cognitive Services and processed by OpenAI's LLM to generate contextual replies.

Responsibility:

  • Developed a scalable backend using FastAPI to manage sessions, voice stream handling, and real-time event processing
  • Integrated OpenAI's GPT API to enable contextual response generation from transcribed input
  • Utilized Azure Speech-to-Text API to transcribe voice data in real time
  • Established secure WebSocket channels for interactive, bidirectional communication
  • Implemented RBAC (Role-Based Access Control) and JWT-based token authentication
  • Dockerized the application and deployed it on AWS EC2 with Redis for session management
  • Ensured logging, exception handling, and horizontal scalability for thousands of users.
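The per-chunk flow above (voice in, transcription, contextual reply out) can be sketched as an async pipeline. The `transcribe` and `generate_reply` stubs below are hypothetical placeholders for the Azure Speech-to-Text and OpenAI calls, so the sketch stays runnable without credentials:

```python
import asyncio


async def transcribe(chunk: bytes) -> str:
    # Placeholder for the Azure Speech-to-Text call (hypothetical stub).
    await asyncio.sleep(0)
    return chunk.decode()


async def generate_reply(text: str) -> str:
    # Placeholder for the OpenAI chat-completion call (hypothetical stub).
    await asyncio.sleep(0)
    return f"echo: {text}"


async def handle_stream(chunks: list[bytes]) -> list[str]:
    """Per-chunk pipeline: transcribe each audio chunk, then produce a reply.
    In the real service each reply would be pushed back over the WebSocket."""
    replies = []
    for chunk in chunks:
        text = await transcribe(chunk)
        replies.append(await generate_reply(text))
    return replies


replies = asyncio.run(handle_stream([b"hello", b"how are you"]))
```

In production the loop body would sit inside a FastAPI `@app.websocket` handler, awaiting `websocket.receive_bytes()` instead of iterating a list.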

 

Tradershub - Real-time Trading Analytics & Market Intelligence

Technologies Used: Python, Django, AWS (S3, EC2), MongoDB, MySQL, Celery, Redis, Asyncio

Project Description: Tradershub is a real-time stock and crypto market monitoring platform that delivers personalized insights, alerts, and data visualization to users via web dashboards and Discord integrations. The system leverages live web scraping, AI-driven analytics, and real-time data processing.

Responsibility:

  • Designed scalable scrapers using Asyncio and HTTP clients to collect data from various market sources
  • Developed Django-based backend APIs to serve trading analytics and user-specific insights
  • Built Discord bots for personalized real-time alerts and announcements
  • Used MongoDB for flexible, tenant-based data storage and MySQL for analytics/reporting
  • Integrated Redis and Celery for parallel processing of high-frequency data
  • Hosted backend on AWS EC2 with custom CI/CD and monitoring integration.
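A scalable Asyncio scraper of the kind described above usually caps in-flight requests with a semaphore so high-frequency polling doesn't overwhelm the source. A minimal sketch, with `fetch` as a hypothetical stand-in for a real HTTP client call:

```python
import asyncio


async def fetch(url: str) -> str:
    # Stand-in for a real HTTP request (e.g. via aiohttp); hypothetical stub.
    await asyncio.sleep(0.01)
    return f"<data from {url}>"


async def scrape_all(urls: list[str], max_concurrency: int = 5) -> list[str]:
    """Fetch many market sources concurrently, capping in-flight requests."""
    sem = asyncio.Semaphore(max_concurrency)

    async def bounded(url: str) -> str:
        async with sem:  # at most max_concurrency fetches run at once
            return await fetch(url)

    return await asyncio.gather(*(bounded(u) for u in urls))


urls = [f"https://example.com/ticker/{i}" for i in range(3)]
results = asyncio.run(scrape_all(urls))
```

The same skeleton extends naturally to pushing each result onto a Celery/Redis queue for downstream analytics, as the project did.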

 

Nanonets – Intelligent Document Data Extraction & Classification

Technologies Used: Python, FastAPI, SQL, Celery, Redis, AWS EC2, AWS S3, OCR, PDF Parsing, Selenium

Project Description: An AI-powered document processing system to extract structured data from scanned PDFs, images, and business forms. This solution automated document classification and conversion to usable CSV and JSON formats for downstream systems.

Responsibility:

  • Developed OCR pipelines using Python and tools like PyPDF2, OpenCV, and Tesseract
  • Built RESTful APIs using FastAPI to accept uploads, process content, and return extracted structured data
  • Automated headless scraping and data validation workflows using Selenium and BeautifulSoup
  • Managed job queues using Celery and Redis for background extraction and transformation tasks
  • Enabled secure storage using AWS S3 and audit logging to track document lifecycles
  • Applied text analytics and regex pattern recognition for classification and filtering.
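The regex-based classification step mentioned above can be sketched as a small rule table scored per document. The rules and labels here are illustrative examples, not the project's actual patterns:

```python
import re

# Hypothetical rules: each document class maps to patterns typical of it.
RULES = {
    "invoice": [r"\binvoice\s*(no|number|#)", r"\bamount\s+due\b"],
    "purchase_order": [r"\bpurchase\s+order\b", r"\bPO[-\s]?\d+"],
}


def classify(text: str) -> str:
    """Return the class whose patterns match the text most often,
    or 'unknown' when nothing matches."""
    scores = {
        label: sum(bool(re.search(p, text, re.I)) for p in patterns)
        for label, patterns in RULES.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


label = classify("Invoice No. 42 ... Amount due: $310.00")
```

In the full pipeline this step runs on OCR output, so patterns are kept tolerant of spacing and case noise introduced by Tesseract.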