AI & Machine Learning Engineer / Data Scientist

Sai Sandeep Kethiboina

Five-plus years designing, building, and shipping production AI/ML, deep learning, and generative-AI systems across telecommunications, banking, and healthcare, from raw data pipelines processing 150M+ records to LLM-powered assistants.


05+

Years in production ML

03

Regulated industries

150M+

Records processed

90%+

LLM response relevance


01 Profile

AI/ML engineer who turns messy enterprise data into predictive systems and LLM products that move real business metrics.

I design end-to-end machine learning and generative-AI solutions — predictive analytics, deep learning, NLP, time-series forecasting, fraud detection, and RAG-based assistants — and carry them all the way to production with MLOps/LLMOps discipline. My work spans telecom, banking, and healthcare, where reliability, governance, and explainability are non-negotiable. I care about the unglamorous parts: clean pipelines, monitored models, and outcomes you can measure.

Discipline: AI / ML Engineering

Experience: 5+ years

Current: AI/ML Eng · CVS Health

Education: M.S. Computers & Information Science

Domains: Telecom · Banking · Healthcare

Based: United States

02 Experience

Feb 2025 — Present

Texas, USA

Healthcare

AI / Machine Learning Engineer

CVS Health

Lifted healthcare data availability +40% by building end-to-end AI/ML pipelines across 50M+ patient records with Spark and Airflow.
Raised model accuracy +25% with deep-learning models for patient risk stratification, readmission prediction, and claims-fraud detection.
Hit 90%+ response relevance on LLM clinical assistants using RAG and vector search over FAISS/Pinecone.
Cut manual documentation effort −35% by automating clinical coding with LangChain and LoRA-fine-tuned models.
Slashed deployment time −60% at 99.9% availability through enterprise MLOps on MLflow and Kubeflow with a CI/CD model registry.

Feb 2024 — Dec 2024

Texas, USA

Banking

Machine Learning Engineer

Capital One

Improved model accuracy +30% across credit-risk, fraud, and loan-default models spanning segmentation and scoring.
Reduced data-processing time −45% by building scalable ETL and real-time pipelines on Spark and Airflow.
Reached 90%+ response accuracy with LLM banking assistants powered by RAG and semantic search over FAISS/Pinecone.
Trimmed manual effort −40% by automating customer service, document analysis, and regulatory reporting with GenAI.
Strengthened transaction monitoring and fraud prevention through time-series forecasting and anomaly detection.

Jun 2019 — Nov 2022

Hyderabad, India

Telecom

Data Scientist

Jio

Boosted subscriber retention +25% with churn and behavior models built on XGBoost and Spark MLlib.
Cut processing time −40% by building ETL and a data lake over 100M+ records with Airflow, Spark, and Snowflake.
Reduced incident resolution time −30% via real-time network monitoring on Kafka and Spark Streaming.
Drove capacity planning and subscriber growth through forecasting, segmentation, and demand models.
Lifted decision efficiency +35% with executive dashboards tracking ARPU, CLV, and churn in Power BI and Tableau.

03 Selected Work

Open-source projects built end-to-end — from data and modeling to serving. Each links to its repository on GitHub.

LLM/01

Local RAG Chatbot

Fully local retrieval-augmented chatbot: ingests and embeds Markdown docs into a Chroma vector store and answers via a llama.cpp-served LLM. FastAPI backend with streaming chat, document upload, conversation memory, and incremental re-indexing; React + TypeScript frontend.

FastAPI · llama.cpp · ChromaDB · React

View repository ↗
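The retrieval step behind a chatbot like this can be sketched in a few lines. This is an illustrative toy, not the repository's code: `embed` is a hypothetical bag-of-words stand-in for a real embedding model, the in-memory list stands in for the Chroma vector store, and `build_prompt` only assembles the text that would be sent to the llama.cpp-served model.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts (assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query, chunks, k=2):
    """Assemble retrieved context and the question into one prompt string."""
    context = "\n\n".join(retrieve(query, chunks, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


docs = [
    "Airflow schedules the nightly ETL job",
    "The chatbot streams tokens over FastAPI",
    "Chroma stores document embeddings",
]
print(build_prompt("how does the chatbot stream tokens", docs, k=1))
```

A production setup would replace `embed` with a sentence-embedding model and `retrieve` with a vector-store query; the ranking-then-prompting shape stays the same.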

Generative AI/02

Synthetic Tabular Data with GANs

TGAN/CTGAN generation of synthetic healthcare records, evaluated across statistical similarity (PCA, autoencoders, clustering), a custom privacy-at-risk metric, and downstream ML utility on length-of-stay and mortality prediction.

CTGAN · TGAN · TensorFlow · scikit-learn

View repository ↗

MLOps/03

End-to-End MLOps on GCP

Reference pipeline: Prefect ETL of SF 311 data into BigQuery, dbt transforms, MLflow experiment tracking, and a scikit-learn model served via FastAPI on Cloud Run — provisioned with Terraform and shipped through GitHub Actions.

Prefect · BigQuery · dbt · Terraform · MLflow

View repository ↗

LLM/04

LLaMA From Scratch · 2.3M params

A 2.3M-parameter LLaMA-style language model implemented from scratch in PyTorch — RMSNorm, rotary positional embeddings (RoPE), and SwiGLU — trained on character-level TinyShakespeare.

PyTorch · LLaMA · RoPE · Transformers

View repository ↗
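For readers unfamiliar with the three building blocks named above, here is a minimal pure-Python sketch of each, operating on a single vector. It is not the repository's PyTorch code: the function names are hypothetical, the learned linear projections around SwiGLU are omitted, and RoPE is shown in the adjacent-pair form with the commonly used base of 10000.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide by the root-mean-square of x, then apply a per-feature gain."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [w * v * inv_rms for v, w in zip(x, weight)]


def rope(x, position, base=10000.0):
    """RoPE: rotate each adjacent pair (x[i], x[i+1]) by a position-dependent angle.

    Assumes len(x) is even. Pair index i/2 uses frequency base ** (-i / d).
    """
    d = len(x)
    out = list(x)
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        out[i] = x[i] * c - x[i + 1] * s
        out[i + 1] = x[i] * s + x[i + 1] * c
    return out


def swiglu_gate(gate, up):
    """SwiGLU gating: SiLU(gate) * up, element-wise (projections omitted)."""
    return [(g / (1.0 + math.exp(-g))) * u for g, u in zip(gate, up)]


normed = rms_norm([3.0, 4.0], [1.0, 1.0])
rotated = rope([1.0, 2.0, 3.0, 4.0], position=5)
gated = swiglu_gate([0.5, -1.0], [2.0, 3.0])
print(normed, rotated, gated)
```

Two properties are easy to check by hand: RMSNorm output has a root-mean-square of about 1 when the gain is all ones, and RoPE leaves the vector's length unchanged because it only rotates.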

Time Series/05

Time-Series Forecasting Benchmark

End-to-end forecasting on the Beijing PM2.5 dataset benchmarking ~25 approaches — from ARIMA/SARIMAX and Holt-Winters to XGBoost/LightGBM and LSTM/DeepAR/Prophet — scored on MAE, RMSE, MAPE, and R².

TensorFlow · XGBoost · LightGBM · Prophet

View repository ↗
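The four scores used in the benchmark can be computed as below. This is a stdlib-only sketch under stated assumptions rather than the repository's evaluation code: `forecast_metrics` is a hypothetical name, and skipping zero-valued targets in MAPE is one possible convention (MAPE is undefined where the true value is zero).

```python
import math


def forecast_metrics(y_true, y_pred):
    """Return MAE, RMSE, MAPE (percent), and R² for two equal-length sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # MAPE divides by the true value, so zero targets are skipped (assumption).
    nonzero = [(t, e) for t, e in zip(y_true, errors) if t != 0]
    mape = (
        100.0 * sum(abs(e / t) for t, e in nonzero) / len(nonzero)
        if nonzero
        else float("nan")
    )

    # R² compares residual error against the variance around the mean.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot else float("nan")

    return {"mae": mae, "rmse": rmse, "mape": mape, "r2": r2}


print(forecast_metrics([10, 20, 30, 40], [12, 18, 33, 37]))
```

Reporting all four together is useful because they disagree in informative ways: RMSE penalizes large misses more than MAE, MAPE is scale-free but unstable near zero, and R² shows how much better the model is than predicting the mean.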

Computer Vision/06

Computer Vision Collection

A set of CV notebooks: medical image classification (cataract, pneumonia, eye disease), traffic-sign and emotion recognition, driver-drowsiness detection, and OpenCV / YOLOv3 detection demos.

TensorFlow/Keras · CNN · OpenCV · YOLOv3

View repository ↗

Recommenders/07

Multi-Modal E-commerce Recommender

Fashion recommender exploring four strategies — collaborative, content-based, hybrid, and a multi-modal PyTorch model fusing user/product embeddings with ResNet50 image and SentenceTransformer text features over a GCN layer.

PyTorch · Flask · ResNet50 · PyG

View repository ↗

Data Engineering/08

Airflow ETL → Snowflake (SCD2)

An Apache Airflow DAG extracting HR and salary data from PostgreSQL to S3, diffing against the warehouse, and loading into Snowflake with Slowly Changing Dimension Type 2 to track salary history over time.

Airflow · PostgreSQL · Snowflake · AWS S3

View repository ↗
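The Slowly Changing Dimension Type 2 logic can be illustrated in memory. The sketch below is an assumption-laden toy, not the DAG's actual SQL: the column names (`employee_id`, `salary`, `valid_from`, `valid_to`, `is_current`) and the function `apply_scd2` are hypothetical, and a list of dicts stands in for the Snowflake dimension table.

```python
from datetime import date


def apply_scd2(dimension, incoming, today):
    """Apply a Type 2 update in place and return the dimension.

    dimension: list of row dicts (see column names in the lead-in).
    incoming:  dict mapping employee_id -> latest salary from the source.
    Unchanged salaries are skipped; changed ones close the current row
    and append a new current row, so history is preserved.
    """
    current = {row["employee_id"]: row for row in dimension if row["is_current"]}

    for emp_id, salary in incoming.items():
        row = current.get(emp_id)
        if row is not None and row["salary"] == salary:
            continue  # no change, nothing to record
        if row is not None:
            # Close out the old version instead of overwriting it.
            row["valid_to"] = today
            row["is_current"] = False
        dimension.append(
            {
                "employee_id": emp_id,
                "salary": salary,
                "valid_from": today,
                "valid_to": None,
                "is_current": True,
            }
        )
    return dimension


dim = [
    {
        "employee_id": 1,
        "salary": 100,
        "valid_from": date(2024, 1, 1),
        "valid_to": None,
        "is_current": True,
    }
]
apply_scd2(dim, {1: 120, 2: 90}, today=date(2024, 6, 1))
for row in dim:
    print(row)
```

Re-running the same load is a no-op, which is the property that makes the diff-then-load pattern safe to schedule repeatedly.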

See all repositories on GitHub ↗

04 Stack & Capabilities

Languages /01

Python · SQL · R · Scala · Java · C++ · Bash

AI / ML /02

Deep Learning · NLP · Computer Vision · Reinforcement Learning · Time-Series · Feature Eng.

GenAI & LLMs /03

OpenAI · Hugging Face · LangChain · LangGraph · LlamaIndex · RAG · LoRA / PEFT · Agentic AI

Frameworks /04

TensorFlow · PyTorch · Scikit-learn · Keras · XGBoost · LightGBM · CatBoost

Data & Vector /05

Apache Spark · Kafka · Hadoop · Airflow · Snowflake · FAISS · Pinecone · ChromaDB

Cloud & MLOps /06

AWS SageMaker · Vertex AI · Azure ML · Databricks · MLflow · Kubeflow · Docker · Kubernetes · CI/CD

05 Impact, by the numbers

150M+

Records processed in production pipelines

CVS Health + Jio

−60%

Model deployment time via enterprise MLOps

CVS Health

+25%

Model accuracy on patient-risk & fraud detection

CVS Health

99.9%

Availability on production AI serving

CVS Health

−45%

Data-processing time on banking pipelines

Capital One

+35%

Decision efficiency from analytics dashboards

Jio

06 Contact

Let's build something measurable.

Email: saisandeepk1999@gmail.com
LinkedIn: in/saisandeepkethiboina
GitHub: github.com/SaiSandeepk23
Résumé: Download CV (PDF)
Phone: +1 (561) 214-1251
