AI Engineer | LLM Training & Optimization | High-Performance Computing
Ph.D. physicist turned AI engineer with deep expertise in high-performance computing. I specialize in training, fine-tuning, and optimizing large-scale AI models, from pre-training billion-parameter LLMs on distributed systems to achieving >50% performance gains via LoRA fine-tuning.
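For a concrete taste of the LoRA side, a minimal sketch assuming Hugging Face transformers and peft; the base model, rank, and target modules here are illustrative stand-ins, not my actual setup:

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")  # illustrative base model
config = LoraConfig(
    r=16,                                 # rank of the low-rank adapters
    lora_alpha=32,                        # scaling factor applied to adapter output
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()        # typically well under 1% of weights train

Only the injected low-rank adapters train, which is what makes iterating on billion-parameter models affordable.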
Outside of work, I compete in powerlifting - the kind of systematic, incremental progress that also defines how I approach engineering problems.
Pre-training, LoRA fine-tuning, synthetic data generation
C, MPI, CUDA, distributed computing, Cerebras
ReAct prompting, tool use, multi-step workflows
Vector search, GraphRAG, knowledge graphs
Full publication list on Google Scholar.
$ python convert.py pharia-1
Loading architecture...
Mapping 142 tensors → GGUF
Quantize: Q4_K_M █████ 100%
✓ pharia-1.gguf (1.8 GB)
Reverse-engineered the Pharia model architecture for GGUF conversion. Implemented bespoke inference logic in C++ within llama.cpp for efficient quantized CPU inference.
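To illustrate the conversion step, a hedged sketch of Hugging Face-to-GGUF tensor-name mapping in llama.cpp's naming style; these rules are hypothetical, and the real Pharia mapping covers all 142 tensors:

import re

# Hypothetical HF -> GGUF rename rules; the actual Pharia layout differs.
NAME_RULES = [
    (r"^model\.embed_tokens\.weight$",                     "token_embd.weight"),
    (r"^model\.layers\.(\d+)\.self_attn\.q_proj\.weight$", r"blk.\1.attn_q.weight"),
    (r"^model\.layers\.(\d+)\.self_attn\.k_proj\.weight$", r"blk.\1.attn_k.weight"),
    (r"^model\.layers\.(\d+)\.mlp\.down_proj\.weight$",    r"blk.\1.ffn_down.weight"),
    (r"^model\.norm\.weight$",                             "output_norm.weight"),
]

def map_name(hf_name: str) -> str:
    """Translate a Hugging Face tensor name into its GGUF counterpart."""
    for pattern, replacement in NAME_RULES:
        new_name, n = re.subn(pattern, replacement, hf_name)
        if n:
            return new_name
    raise KeyError(f"no GGUF mapping for {hf_name!r}")

print(map_name("model.layers.7.self_attn.q_proj.weight"))  # -> blk.7.attn_q.weight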
$ agent "Best Wilks, 93kg?"
[Plan] Decomposing query...
[Tool] wilks_calc → 521.3
[Tool] search → openpowerlifting
[Agent] ✓ Answer ready
ReAct-style agent for complex, multi-hop powerlifting questions that standard RAG fails on. Custom tools for structured data retrieval and web search, coordinated by a planner LLM.
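A minimal sketch of the ReAct loop, assuming a call_llm stand-in for the planner model and JSON-formatted steps; the Wilks coefficients are the classic 1997 men's values, and the real agent's tools and prompts differ:

import json

def wilks_calc(total_kg: float, bodyweight_kg: float) -> float:
    """Classic (1997) men's Wilks score for a powerlifting total."""
    a, b, c, d, e, f = (-216.0475144, 16.2606339, -0.002388645,
                        -0.00113732, 7.01863e-06, -1.291e-08)
    x = bodyweight_kg
    return total_kg * 500.0 / (a + b*x + c*x**2 + d*x**3 + e*x**4 + f*x**5)

TOOLS = {"wilks_calc": wilks_calc}  # the real agent also wires in web search

def react_loop(question: str, call_llm, max_steps: int = 5) -> str:
    """Alternate Thought -> Action -> Observation until the LLM answers."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        # call_llm is assumed to return JSON: either {"answer": ...}
        # or {"thought": ..., "action": tool_name, "args": {...}}
        step = json.loads(call_llm(transcript))
        if "answer" in step:
            return step["answer"]
        observation = TOOLS[step["action"]](**step["args"])
        transcript += (f"Thought: {step['thought']}\n"
                       f"Action: {step['action']}\n"
                       f"Observation: {observation}\n")
    return "step budget exhausted"

The loop is what lets the agent chain a calculation onto a retrieval result, which is exactly where single-shot RAG breaks down.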
$ evolve --pop 32 --gens 50
Gen 01 | best=0.42 ██░░░░░░
Gen 25 | best=0.87 ██████░░
Gen 50 | best=0.98 ████████
✓ Best solution: gen 47
Open-source implementation of DeepMind's AlphaEvolve evolutionary algorithm, using an LLM-driven mutation operator to autonomously evolve Python code.
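A hedged sketch of the core loop; llm_mutate and fitness are placeholder callables for the LLM mutation operator and the scoring function:

import random

def evolve(seed_program: str, llm_mutate, fitness, pop: int = 32, gens: int = 50) -> str:
    """Evolve candidate programs with an LLM as the mutation operator."""
    population = [seed_program]
    for gen in range(1, gens + 1):
        # Keep the fittest quarter as parents, then ask the LLM for mutations
        parents = sorted(population, key=fitness, reverse=True)[:max(1, pop // 4)]
        children = [llm_mutate(random.choice(parents)) for _ in range(pop)]
        population = sorted(set(population + children), key=fitness, reverse=True)[:pop]
        print(f"Gen {gen:02d} | best={fitness(population[0]):.2f}")
    return population[0]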
Run this in Python to decode my email address:
"".join(["@", "je", "ecke", ".", "ns", "ai", "lu"][i] for i in [1, 4, 0, 6, 2, 3, 5])
Berlin, Germany