MUNICH · PRAGUE · ON-PREMISE AI ENGINEERS
AI that never leaves the building.
Fine-tuned language models deployed on your GPU servers — air-gapped, GDPR-compliant by architecture, owned by you. For hospitals, law firms, and AI vendors locked out of regulated deals.
Live in 10+ public hospitals · Rowan Legal · T-Systems · Eurowag
10+
Public hospitals running our AI in production
0
External API calls in air-gapped mode
3 wks
Fastest contract-to-production deployment
100%
Data residency on your own hardware
WHAT WE BELIEVE
Cloud AI is the wrong architecture for your data.
When the data is patient records, M&A drafts, or government correspondence, "send it to the cloud" isn't a procurement decision; it's a liability. Every API call to a cloud LLM leaves a trace. Every query can end up as training material or as evidence in court. We built the alternative: AI that runs on your hardware, trained on your data, owned by you.
No tokens. No quotas. No third-party access.
Your data never leaves your network. By architecture.
You own the model. You own the updates. You own the outcomes.
— Krystof Olik, Founder
WHO WE BUILD FOR
Three buyers. One architecture.
Each deployment is the same thesis applied to a different regulator: the model moves in, the data stays put.
FOR HOSPITALS
Clinical documentation without the cloud
Every ambient scribe on the German market routes patient audio through a US cloud. Ours runs in your server room and writes FHIR records into your KIS.
Live in 10+ public hospitals · voice → FHIR · Medicalc HIS
On-prem AI for hospitals
FOR LAW FIRMS
Air-gapped AI for privileged work
Client files, M&A drafts, and case strategy never reach a third party. Agents run inside your network — professional secrecy survives by architecture.
Rowan Legal runs its operations on our air-gapped agents
AI for law firms
FOR AI VENDORS
Your product, deployable on-prem
You are losing hospital, bank, and government deals to data-residency requirements. We port your stack to the customer's hardware.
Cloud-to-on-prem porting · local inference · GPU sizing
Unblock your deals
WHAT WE DEPLOY
Four systems. One footprint: your hardware.
Fine-tuned open-weights models, local inference, and production integrations into the systems you already run.
VOICE → RECORDS
Dictation to structured records
Doctors and lawyers dictate; structured output lands in the system of record. Speaker separation, domain vocabulary, schema-true output.
- Olingo Speech
- FHIR / HL7
- OCR — 99.8% accuracy
- Medical & legal vocabulary
DOCUMENTS → ANSWERS
Knowledge engine over your archive
Decades of contracts, records, and correspondence become a sourced answer system — citations down to the exact paragraph.
- Local RAG
- Vector search
- Source citations
- Legacy DB integration
BACKGROUND AGENTS
Air-gapped agents for operations
Intake classification, document routing, compliance checks — running continuously inside your network with full audit logs.
- Multi-agent orchestration
- Local inference
- Audit logs
- Zero egress
MODELS & HARDWARE
Fine-tuned models on sized hardware
We pick or fine-tune open-weights models per domain and spec the GPU servers they run on. You own both.
- Mistral / Llama / Whisper
- Olingo model line
- NVIDIA DGX / RTX
- CUDA / ROCm
HOW ENGAGEMENT WORKS
Fixed scope. Fixed price. Something you keep at every step.
Three phases. Each one ends with an artifact that is yours — a blueprint, a running system, or both.
Sovereignty assessment
€9,800 fixed
We map your data flows, size the hardware, and design the deployment architecture.
Credited in full against the build if you proceed.
Pilot deployment
from €120,000
One use case in production on your hardware — real users, real data, measured outcomes.
Fixed-price proposal before we start. Hardware procured at cost.
Rollout & managed service
from €6,000 / month
Site-wide rollout, integrations, model updates, monitoring, and SLA — operated by us, owned by you.
No per-token costs. No per-seat licences.
DEPLOYED SYSTEMS
Live systems. Real outcomes.

Healthcare
On-Premise
Olingo Medical
Voice & documents → FHIR records. Live in 10+ public hospitals across the Czech Republic.

Automotive / Telecommunications
On-Premise
T-Systems Connectivity Platform
Fleet connectivity platform with ML-driven predictive analytics — on T-Systems infrastructure.

Fuel & Telematics
On-Premise
Eurowag Legislation Monitor
Multi-jurisdiction legislation monitor — daily traffic-light compliance reports for Eurowag.

Legal
On-Premise
RowanAI
Air-gapped AI agent infrastructure running law firm operations at Rowan Legal. No data leaves the building.
FAQ
Questions buyers actually ask
What does on-premise AI mean in practice?
The AI model and all processing run on hardware in your own data centre or server room. No data is sent to any external cloud service. You own the hardware, control the access, and the system can operate fully air-gapped, with no internet connection at all.
How long does a deployment take?
Our fastest deployment was 3 weeks from contract to production. A typical pilot takes 6–12 weeks depending on integration complexity and your IT readiness.
What does it cost?
Fixed per phase: a €9,800 sovereignty assessment (credited against the build), pilots from €120,000, and managed service from €6,000/month. There are no token-based running costs, only infrastructure and support.
What hardware do we need?
We specify it during the assessment. Production-grade inference for a 70B-class model typically lands between €40,000 and €190,000 in hardware, depending on throughput. We procure at cost, or deploy on GPUs you already own.
Is sending patient or client data to a cloud LLM GDPR-compliant?
Often contested, which is the problem. EU–US transfer frameworks remain under court challenge, and a cloud LLM prompt containing patient or client facts is a disclosure to a third party. On-premise deployment removes the question entirely: the data never leaves your network.
Which models do you deploy?
Open-weights models (Mistral, Llama, Whisper, and others) fine-tuned on your domain, plus our Olingo model line for healthcare and legal workloads. You own the resulting weights.
Who operates the system after go-live?
Your IT team, with our managed service behind it: monitoring, model updates (delivered offline for air-gapped sites), and an incident SLA. Full handover to your team is an option, not a hostage negotiation.
Can you make our cloud AI product deployable on-premise?
Yes. We port cloud AI products to customer hardware: local inference replaces external API calls, the stack is containerised, and updates work air-gapped. Fixed scope, typically four to ten weeks.
CONTACT
Talk to engineers, not sales.
We work with organisations ready to own their AI infrastructure. 30-minute session: we map your data flows and propose a deployment architecture. If we're not the right fit, we'll tell you.
WHAT HAPPENS NEXT
- 01 We map your data flows and existing systems.
- 02 We propose a deployment architecture for your infrastructure.
- 03 You leave with a concrete plan. No pitch.
Or send a question by email — we reply within 48 hours.