Hi, I'm Amit Prajapati

Self-driven, quick starter, and passionate programmer with a curious mind who enjoys solving complex, challenging real-world problems.

About

Data Science graduate student at WPI with hands-on experience building scalable ML pipelines, real-time AI systems, and end-to-end data platforms using Python, TensorFlow, AWS, and more. Passionate about creating impactful, data-driven solutions for businesses and communities.

Experience

TrueLight Energy

Machine Learning Engineer
  • Built scalable ML pipelines by automating data ingestion and transformation using REST APIs and AWS EC2, following MLOps best practices.
  • Deployed LSTM-based models (Vanilla, Bi-LSTM) to forecast energy consumption, improving prediction accuracy by 60% and enhancing business demand planning.
  • Designed a normalized PostgreSQL schema for time-series data, accelerating query performance by 90% and reducing dashboard load times.
  • Improved production workflows by integrating model serving and versioning systems, enabling faster and more reliable model updates.
  • Tools: Python, Flask, TensorFlow, PostgreSQL, AWS EC2
Jan 2025 – May 2025 | Boston, USA

Offshore Construction Associates

Data Science Intern
  • Automated data collection pipelines using Selenium, BeautifulSoup, and cron jobs, reducing manual extraction effort by 70% and ensuring real-time tracking of offshore construction metrics.
  • Designed interactive Power BI dashboards to visualize wave sensor data, enhancing decision-making for offshore wind project safety and future strategy planning.
  • Fine-tuned LLaMA 3 models on scraped scientific documents with Hugging Face Transformers, building an agentic RAG-based pipeline for semantic knowledge retrieval and open-ended scientific inquiry simulation.
  • Tools: Python, Selenium, BeautifulSoup, Power BI, Hugging Face Transformers, LLaMA 3
May 2024 – Present | Boston, USA

Munich RE

Data Analyst
  • Analyzed large-scale insurance records from Azure Data Lake using PySpark on Azure Databricks, applying Apriori and FP-Growth algorithms to detect fraud and risk patterns.
  • Developed an R-CNN-based model to classify diverse insurance claims documents, improving triage accuracy, reducing manual processing time, and enhancing claim authenticity validation by 36%.
  • Built interactive dashboards in Power BI to visualize fraud risk patterns, enabling stakeholders to monitor high-risk claims, prioritize investigations, and drive data-driven decisions in fraud management.
  • Tools: PySpark, Azure Databricks, Azure Data Lake, R-CNN, Power BI
Dec 2022 – May 2023 | Mumbai, India

Fidelis Macro Global Fund

Quant Intern
  • Developed and tested multiple NSE option trading strategies using Zerodha API, formatting data for easy access and robust live market validation.
  • Applied technical indicators (VWAP, RSI, MA, MACD) to trigger automated position entries and exits in real-time trading environments.
  • Monitored and optimized financial indicators to enhance ROI by 50% while implementing dynamic risk management strategies to mitigate stop-loss risks.
  • Tools: Python, Zerodha API, Technical Analysis
May 2022 – Sep 2022 | Ebene, Republic of Mauritius

Let the Data Confess

Data Science Intern
  • Directed the development of a loan approval workflow by analyzing credit histories, improving the approval process effectiveness by 70%.
  • Engineered features using VIF and RFE techniques and built classification models achieving 90% accuracy.
  • Deployed the entire ML pipeline on the cloud using Streamlit and GitHub, ensuring seamless accessibility and real-time collaboration for students.
  • Tools: Python, Streamlit, GitHub, Machine Learning (VIF, RFE)
Jan 2022 – Feb 2022 | India

Projects

WPIBot

A campus-specific chatbot using LLaMA 3, FAISS, and Streamlit for real-time Q&A.

Accomplishments
  • Tools: Python, LLaMA 3, Hugging Face, FAISS, Streamlit, AWS EC2
  • Fine-tuned LLaMA 3 via Groq API and built a semantic search engine using FAISS.
  • Scraped and indexed 2,000+ WPI campus pages with Sentence-BERT embeddings.
  • Deployed a low-latency Q&A platform on AWS EC2 using Streamlit.

Object Detection with Satellite Images

Optimizing the trade-off between object detection performance and image resolution in satellite imagery.

Accomplishments
  • Tools: Python, YOLO, OpenCV, Matplotlib
  • Evaluated detection accuracy across varying pixel resolutions using the xView dataset.
  • Generated resolution-performance curves to identify optimal ground sample distance (GSD) "knee points."
  • Reduced data usage by 70% while retaining 90% detection accuracy.

Real-Time Traffic Sign Classifier

Built a real-time traffic sign recognition system with low-latency edge deployment.

Accomplishments
  • Tools: Python, TensorFlow/Keras, OpenCV, Docker
  • Developed a low-latency inference pipeline using a deep learning model trained on traffic signs.
  • Implemented robust preprocessing and data augmentation (directional and non-directional).
  • Containerized the solution using Docker for edge-compatible real-time deployment.

Skills

AI / Machine Learning:

  • TensorFlow, PyTorch, Scikit-learn, Keras, XGBoost
  • Hugging Face Transformers, LLaMA, RAG Pipelines
  • Deep Learning, Computer Vision, NLP

Cloud Platforms & MLOps:

  • AWS EC2, GCP, Azure
  • Docker, Kubernetes (basic), MLflow
  • Streamlit, FastAPI, Heroku

Databases:

  • MySQL, PostgreSQL, MongoDB
  • Azure Data Lake, Snowflake

Programming Languages:

  • Python, Java, JavaScript, C, C++, Bash, HTML5, CSS3

Libraries & Tools:

  • NumPy, Pandas, OpenCV, Matplotlib, SciPy, Dask
  • Git, Power BI, Tableau, Excel

Education

Worcester Polytechnic Institute

Worcester, USA

Degree: Master of Science in Data Science
Duration: Aug 2023 – May 2025

    Relevant Courses:

    • Data Structures and Algorithms
    • Big Data Management
    • Generative AI
    • Business Intelligence
    • Machine Learning

NMIMS University

Mumbai, India

Degree: Bachelor of Technology in Data Science
Duration: Aug 2019 – May 2023

    Relevant Courses:

    • Artificial Intelligence
    • Cloud Computing
    • Statistics
    • Data Ethics
    • Finance
    • Advanced Database Systems
    • Deep Learning

Contact