Case Studies

Real problems. Rigorous methodology. Measurable outcomes.

Mietpreisbremse AI Advisor

🏢 Berlin Legal Firm 📅 4 weeks 💼 Legal Tech

The Problem

A Berlin-based legal tech firm was losing €3,000 per week in billable hours. Their paralegal team spent 15+ hours weekly on rent regulation (Mietpreisbremse) consultations—manually reviewing tenant inquiries, cross-referencing regulations, and preparing legal advice documents.

Clients waited 3-5 days for responses. The firm couldn't scale without hiring more staff.

My Approach

Discovery Phase (Week 1)

  1. Interviewed 3 paralegals on their actual workflow (not the ideal one)
  2. Analyzed 20 real client inquiries to identify patterns
  3. Found: 80% of inquiries follow 5 common regulation categories
  4. Mapped the decision tree: regulation type → evidence needed → legal threshold → advice
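The decision tree above can be sketched as a small lookup table. Note: the category names, evidence lists, and thresholds below are illustrative placeholders, not the firm's actual rules.

```python
# Hypothetical sketch of the Mietpreisbremse decision tree:
# regulation type -> evidence needed -> legal threshold -> advice.
# Categories and thresholds are invented for illustration.

DECISION_TREE = {
    "rent_cap_exceeded": {
        "evidence": ["rental contract", "local rent index (Mietspiegel) entry"],
        "threshold": "rent exceeds 110% of local comparative rent",
        "advice": "Tenant may demand a reduction and reclaim overpayments.",
    },
    "modernization_surcharge": {
        "evidence": ["modernization notice", "itemized cost breakdown"],
        "threshold": "surcharge exceeds 8% of modernization costs per year",
        "advice": "Surcharge may be partially invalid; request itemization.",
    },
}

def route_inquiry(category: str) -> dict:
    """Return the evidence checklist and advice template for a category."""
    if category not in DECISION_TREE:
        return {"advice": "Uncategorized inquiry: escalate to a paralegal."}
    return DECISION_TREE[category]
```

Encoding the tree as data rather than branching logic makes it easy for paralegals to review and extend the categories themselves.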

Design Phase (Week 2)

  1. Designed AI workflow: client input → regulation classification → rules engine → advice draft
  2. Decided: AI generates draft, paralegal reviews before sending (human-in-loop)
  3. Selected Google AI Studio (faster iteration needed, prototype phase)
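The human-in-the-loop design can be sketched as follows. The classifier and draft generator below are stand-in stubs (the prototype used Google AI Studio for those steps); the point is the review gate: no draft leaves the system until a human approves it.

```python
# Minimal sketch of the human-in-the-loop workflow: the AI produces a
# draft, but nothing is sent until a paralegal flips the status.
# classify() and generate_draft() are illustrative stubs.
from dataclasses import dataclass

@dataclass
class AdviceDraft:
    inquiry: str
    category: str
    draft_text: str
    status: str = "pending_review"   # never "approved" without a human

def classify(inquiry: str) -> str:
    # Stand-in for the regulation classifier (a model call in the prototype).
    return "rent_cap_exceeded" if "rent" in inquiry.lower() else "other"

def generate_draft(inquiry: str) -> AdviceDraft:
    category = classify(inquiry)
    text = f"Draft advice for a '{category}' inquiry: ..."
    return AdviceDraft(inquiry=inquiry, category=category, draft_text=text)

def approve(draft: AdviceDraft) -> AdviceDraft:
    # Only a human reviewer calls this; the AI never sends directly.
    draft.status = "approved"
    return draft
```

Making "pending_review" the default status means the safe path is also the lazy path: forgetting a step can never accidentally send unreviewed advice.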

Prototype & Evaluation (Week 3)

  1. Built RAG-based assistant with Mietpreisbremse documentation
  2. Tested on 10 historical inquiries: 8/10 correct, 2/10 required clarification
  3. Added regional variation handling → accuracy improved to 9/10
  4. Key finding: accuracy mattered less than speed (a 15-minute review beat a 40-minute manual draft)

Production Deployment (Week 4)

  1. Trained paralegals on tool limitations (emphasis on review requirement)
  2. Set up monitoring: checked first 20 uses for edge cases
  3. Found 1 edge case, patched it immediately
  4. After 4 weeks: handling 70% of routine inquiries

Outcomes

Processing time: 15 hours/week → 3 hours/week (80% reduction)

Paralegal capacity: Freed 12 billable hours/week

Client response time: 3-5 days → same-day for 70% of inquiries

Firm ROI: Recouped prototype cost in 2 weeks

Paralegals were more engaged (doing judgment work, not data entry). Clients were happier (faster turnaround). The firm could handle 30% more inquiries without hiring.

Key Lesson

AI Readiness Is Not Accuracy Alone

The system wasn't 100% accurate, but it was useful. Paralegals were happy to review outputs because it saved 15 minutes per inquiry. The friction point wasn't accuracy—it was speed. I optimized for "good enough + fast" rather than "perfect + slow."

Product decision ≠ engineering decision. A 90% accurate AI tool that saves time is better than a 99% accurate tool that's slower than doing it manually.

EU AI Act Compliance Assistant

🇪🇺 Regulatory Framework 📅 3 weeks 💼 AI Governance

The Problem

August 2026 enforcement of the EU AI Act creates urgent compliance risk for 10,000+ EU companies. Founders and product leaders ask "Is my AI system regulated?" but have no fast way to answer.

Existing tools are legal document explorers (not classification engines). They don't connect use case → regulation → obligations. Getting a lawyer costs €2,000 and takes weeks.

My Approach

Discovery Phase

  1. Analyzed 5 official EU AI Act PDFs (summary, risk categories, timeline, full text, compliance guide)
  2. Identified: Classification question is answerable by pattern-matching against Annex III use cases
  3. Found: 80% of questions follow same decision tree (use case + data + scope → risk level)

System Design

  1. Designed RAG system: user input → Claude extracts facts → search knowledge base → generate classification
  2. Key decision: Include methodology section (shows framework, not just output)
  3. Key decision: Confidence scoring (transparency on when to seek legal review)
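The classification-plus-confidence idea can be sketched as keyword matching against Annex III-style categories, with low-confidence results routed to legal review. The keywords, risk labels, and thresholds below are illustrative assumptions, not the actual knowledge base or the Act's full taxonomy.

```python
# Sketch of classification with confidence scoring. A match against an
# Annex III-style category yields a risk level; weak matches are flagged
# for human legal review. All keywords/thresholds here are placeholders.

ANNEX_III_KEYWORDS = {
    "biometric identification": "high",
    "credit scoring": "high",
    "recruitment": "high",
    "spam filtering": "minimal",
}

def classify_use_case(description: str) -> dict:
    text = description.lower()
    hits = [(kw, risk) for kw, risk in ANNEX_III_KEYWORDS.items() if kw in text]
    if not hits:
        # Not covered by the knowledge base: say so instead of guessing.
        return {"risk": "unknown", "confidence": 0.0, "needs_legal_review": True}
    risk = hits[0][1]
    confidence = min(1.0, 0.6 + 0.2 * len(hits))  # more matches, more confident
    return {"risk": risk, "confidence": confidence,
            "needs_legal_review": confidence < 0.8}
```

The "unknown" branch mirrors the system prompt's rule: if the documents don't cover it, the tool says so rather than inventing a classification.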

Implementation (Claude Projects)

  1. Loaded 5 PDFs into Claude Projects knowledge base
  2. System prompt: "Classify ONLY using provided documents. If not covered, say so."
  3. Output format: Risk classification + plain-language obligations + timeline + confidence score

Evaluation Phase

  1. Built 20-question golden test set from real scenarios
  2. Results: 15/20 correct (75%), 3/20 nuanced (require human judgment), 2/20 incorrect
  3. Key finding: Errors in high-complexity scenarios where regulation itself is ambiguous
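The golden-set scoring can be sketched as a three-bucket tally: correct, nuanced (requires human judgment), and incorrect. The four sample cases below are invented for illustration; the real set had 20 expert-reviewed scenarios.

```python
# Sketch of golden test set scoring: compare tool output against
# expert-labelled answers and bucket the results. Sample data is invented.

def score_golden_set(cases: list[dict]) -> dict:
    buckets = {"correct": 0, "nuanced": 0, "incorrect": 0}
    for case in cases:
        if case["expected"] == "nuanced":
            # Cases the experts themselves marked as judgment calls.
            buckets["nuanced"] += 1
        elif case["predicted"] == case["expected"]:
            buckets["correct"] += 1
        else:
            buckets["incorrect"] += 1
    buckets["accuracy"] = buckets["correct"] / len(cases)
    return buckets

golden = [
    {"predicted": "high", "expected": "high"},
    {"predicted": "minimal", "expected": "minimal"},
    {"predicted": "high", "expected": "nuanced"},
    {"predicted": "minimal", "expected": "high"},
]
```

Keeping "nuanced" as its own bucket, rather than counting those cases as errors, is what surfaced the key finding: the tool's misses clustered where the regulation itself is ambiguous.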

Outcomes

Time to classification: 40 hours (hiring a lawyer) → 5 minutes (using the tool)

Cost delta: €2,000 consultation → free tool

Confidence: 75% high-confidence classifications, 25% flagged for legal review

Business impact: Lead magnet for AI governance consulting

The tool validates founder intuition ("I thought we were high-risk, and the tool confirms it"). It flags edge cases where legal review is needed. Most importantly, it demonstrates my methodology to potential clients.

Why This Matters

Shows My Thinking, Not Just My Coding

Reveals my framework for breaking down complex regulations. Shows I understand when AI can solve problems (binary classification) vs. when humans decide (ambiguous edge cases). Demonstrates evaluation discipline (golden test sets, confidence scoring).

This positions me as someone who understands regulation + AI equally, values transparency over false confidence, and ships products quickly but rigorously.

Want This Methodology for Your Project?

Whether you need help with evaluation, compliance, or team guidance, let's talk about your specific situation.

Book a Consultation