SPARK: Proof Assistant for Reasoning

Year

2025

Tech & Technique

LLM Fine-Tuning, PyTorch, DPO/GRPO, LoRA, Qwen2.5, Formal Verification

Description

A step-by-step proof assistant for advanced mathematical reasoning that integrates small Large Language Models (LLMs) with formal verification techniques.
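To make the idea concrete, here is a minimal, hypothetical sketch of a generate-then-verify loop of the kind described above. The verifier interface, the "QED" termination sentinel, and the retry policy are illustrative assumptions, not the project's actual architecture.

```python
# Illustrative sketch only: an LLM proposes proof steps and a formal checker
# accepts or rejects them, with verifier feedback fed back into the prompt.
from dataclasses import dataclass
from typing import Callable

@dataclass
class StepResult:
    text: str       # proposed proof step from the LLM
    accepted: bool  # whether the formal verifier accepted it
    feedback: str   # verifier message, fed back into the prompt on failure

def prove_step_by_step(
    problem: str,
    generate: Callable[[str], str],             # LLM: current context -> next step
    verify: Callable[[str], tuple[bool, str]],  # formal checker: step -> (ok, message)
    max_steps: int = 20,
    max_retries: int = 3,
) -> list[StepResult]:
    """Build a proof one step at a time, re-prompting with verifier feedback
    whenever a proposed step is rejected."""
    context, trace = problem, []
    for _ in range(max_steps):
        if context.rstrip().endswith("QED"):
            break                                         # proof is complete
        for _ in range(max_retries):
            step = generate(context)
            ok, message = verify(step)
            trace.append(StepResult(step, ok, message))
            if ok:
                context += "\n" + step                    # accepted steps extend the context
                break
            context += f"\n# verifier error: {message}"   # error-driven feedback
        else:
            break                                         # too many rejected attempts; stop
    return trace
```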

Key Achievements:
  • Engineered DPO/GRPO fine-tuning pipelines with 4-bit LoRA for specialized models such as Qwen2.5-MATH and Phi-mini-4k (see the sketch after this list).
  • Achieved a +14.7 percentage-point increase on the MATH500 benchmark and surpassed Gemini-2.0 on GSM8K.
  • Developed custom data-preparation and error-driven reward-design methods to improve model accuracy and reasoning.
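As a reference for the pipeline bullet above, the following is a minimal sketch of what a DPO fine-tuning run with 4-bit LoRA (QLoRA) can look like using Hugging Face TRL, PEFT, and bitsandbytes. The checkpoint name, preference-data file, and hyperparameters are illustrative assumptions, not the project's actual configuration.

```python
# Sketch of DPO fine-tuning with a 4-bit quantized base model and LoRA adapters.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig
from trl import DPOConfig, DPOTrainer

model_name = "Qwen/Qwen2.5-Math-1.5B-Instruct"  # assumed checkpoint size

# 4-bit NF4 quantization so the base model fits on a single GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_name, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# LoRA adapters on the attention projections; ranks/targets are assumptions.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Preference pairs: each row has "prompt", "chosen", "rejected" (hypothetical file).
dataset = load_dataset("json", data_files="prefs.jsonl", split="train")

training_args = DPOConfig(
    output_dir="spark-dpo",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    learning_rate=5e-6,
    beta=0.1,  # DPO temperature controlling deviation from the reference policy
)

trainer = DPOTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    processing_class=tokenizer,  # `tokenizer=` on older TRL releases
    peft_config=peft_config,
)
trainer.train()
```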

My Role

As the lead developer, I was responsible for the entire project lifecycle:
  • ✅ Conceptualized and designed the core architecture integrating LLMs with formal verifiers.
  • 🧠 Implemented and optimized the fine-tuning pipelines from scratch.
  • 📊 Conducted rigorous benchmarking and performance analysis.
  • 🔬 Designed the custom reward models for error-driven learning (a minimal sketch follows below).
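The error-driven reward idea can be illustrated with a toy verifiable-reward function of the kind used in GRPO-style training: completions are scored on whether their final boxed answer matches a reference, with small adjustments for answer formatting. The extraction rules and weights below are assumptions for illustration only, not the project's actual reward design.

```python
# Toy error-driven reward: correct answers are rewarded, wrong or unparsable
# answers are penalized, and step-by-step structure gets a small bonus.
import re

def extract_boxed_answer(text: str) -> str | None:
    """Return the contents of the last \\boxed{...} in a completion, if any."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1].strip() if matches else None

def error_driven_reward(completions: list[str], references: list[str]) -> list[float]:
    """Score each completion against its reference answer."""
    rewards = []
    for completion, reference in zip(completions, references):
        answer = extract_boxed_answer(completion)
        if answer is None:
            reward = -0.5   # no verifiable final answer
        elif answer == reference.strip():
            reward = 1.0    # exact-match correct answer
        else:
            reward = -1.0   # confidently wrong answer is penalized hardest
        if re.search(r"(?m)^Step \d+", completion):
            reward += 0.1   # mild shaping toward explicit step-by-step reasoning
        rewards.append(reward)
    return rewards

# Example usage with toy completions.
print(error_driven_reward(
    ["Step 1: 2+2=4.\nStep 2: So the answer is \\boxed{4}.", "The answer is 5."],
    ["4", "4"],
))  # -> [1.1, -0.5]
```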
