MUNICH · PRAGUE · ON-PREMISE AI ENGINEERS

AI that never
leaves the building.

Fine-tuned language models deployed on your GPU servers — air-gapped, GDPR-compliant by architecture, owned by you. For hospitals, law firms, and AI vendors locked out of regulated deals.

Book a 30-min assessment · See deployed systems

Live in 10+ public hospitals · Rowan Legal · T-Systems · Eurowag

10+

Public hospitals running our AI in production

0

External API calls in air-gapped mode

3 wks

Fastest contract-to-production deployment

100%

Data residency on your own hardware

WHAT WE BELIEVE

Cloud AI is the wrong architecture for your data.

When the data is patient records, M&A drafts, or government correspondence, 'send it to the cloud' isn't a procurement decision; it's a liability. Every API call to a cloud LLM leaves a trace. Every query can become training material, or evidence in court. We built the alternative: AI that runs on your hardware, trained on your data, owned by you.

No tokens. No quotas. No third-party access.

Your data never leaves your network. By architecture.

You own the model. You own the updates. You own the outcomes.

— Krystof Olik, Founder

WHO WE BUILD FOR

Three buyers. One architecture.

Each deployment is the same thesis applied to a different regulator: the model moves in, the data stays put.

FOR HOSPITALS

Clinical documentation without the cloud

Every ambient scribe on the German market routes patient audio through a US cloud. Ours runs in your server room and writes FHIR records into your KIS.

Live in 10+ public hospitals · voice → FHIR · Medicalc HIS

On-prem AI for hospitals

FOR LAW FIRMS

Air-gapped AI for privileged work

Client files, M&A drafts, and case strategy never reach a third party. Agents run inside your network — professional secrecy survives by architecture.

Rowan Legal runs its operations on our air-gapped agents

AI for law firms

FOR AI VENDORS

Your product, deployable on-prem

You are losing hospital, bank, and government deals to data-residency requirements. We port your stack to the customer's hardware.

Cloud-to-on-prem porting · local inference · GPU sizing

Unblock your deals

WHAT WE DEPLOY

Four systems. One footprint: your hardware.

Fine-tuned open-weights models, local inference, and production integrations into the systems you already run.

VOICE → RECORDS

Dictation to structured records

Doctors and lawyers dictate; structured output lands in the system of record. Speaker separation, domain vocabulary, schema-true output.

Olingo Speech
FHIR / HL7
OCR — 99.8% accuracy
Medical & legal vocabulary

DOCUMENTS → ANSWERS

Knowledge engine over your archive

Decades of contracts, records, and correspondence become a sourced answer system — citations down to the exact paragraph.

Local RAG
Vector search
Source citations
Legacy DB integration
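A minimal sketch of what paragraph-level citation can look like in a retrieval step. It is illustrative only: bag-of-words cosine similarity stands in for a real embedding model and vector store, and the file names, sample text, and function names are hypothetical rather than taken from any deployed system.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase word tokens; a toy stand-in for an embedding model."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def build_index(documents):
    """Split each document into paragraphs and keep (doc_id, paragraph_no, text, vector)."""
    index = []
    for doc_id, text in documents.items():
        paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
        for number, paragraph in enumerate(paragraphs, start=1):
            index.append((doc_id, number, paragraph, Counter(tokenize(paragraph))))
    return index

def search(index, query, top_k=1):
    """Return the best-matching paragraphs together with their citation."""
    q = Counter(tokenize(query))
    ranked = sorted(index, key=lambda entry: cosine(q, entry[3]), reverse=True)
    return [{"source": e[0], "paragraph": e[1], "text": e[2]} for e in ranked[:top_k]]

# Hypothetical archive: two short documents, two paragraphs each.
documents = {
    "lease_2009.txt": "The lease begins on 1 March 2009.\n\n"
                      "The tenant may terminate with six months notice.",
    "supply_2015.txt": "Deliveries occur weekly.\n\n"
                       "Late deliveries incur a penalty of two percent.",
}
index = build_index(documents)
hit = search(index, "notice period to terminate the lease")[0]
print(hit["source"], "paragraph", hit["paragraph"])
```

The point of the sketch is the index shape: every stored chunk carries its document id and paragraph number, so an answer can always be traced back to the exact passage it came from.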

BACKGROUND AGENTS

Air-gapped agents for operations

Intake classification, document routing, compliance checks — running continuously inside your network with full audit logs.

Multi-agent orchestration
Local inference
Audit logs
Zero egress

MODELS & HARDWARE

Fine-tuned models on sized hardware

We pick or fine-tune open-weights models per domain and spec the GPU servers they run on. You own both.

Mistral / Llama / Whisper
Olingo model line
NVIDIA DGX / RTX
CUDA / ROCm

HOW ENGAGEMENT WORKS

Fixed scope. Fixed price. Something you keep at every step.

Three phases. Each one ends with an artifact that is yours — a blueprint, a running system, or both.

01 · 2 WEEKS

Sovereignty assessment

€9,800 fixed

We map your data flows, size the hardware, and design the deployment architecture.

Credited in full against the build if you proceed.

02 · 6–10 WEEKS

Pilot deployment

from €120,000

One use case in production on your hardware — real users, real data, measured outcomes.

Fixed-price proposal before we start. Hardware procured at cost.

03 · ONGOING

Rollout & managed service

from €6,000 / month

Site-wide rollout, integrations, model updates, monitoring, and SLA — operated by us, owned by you.

No per-token costs. No per-seat licences.

See the full engagement model

DEPLOYED SYSTEMS

Live systems. Real outcomes.

View all case studies

Healthcare

On-Premise

Olingo Medical

Voice & documents → FHIR records. Live in 10+ public hospitals across the Czech Republic.

Automotive / Telecommunications

On-Premise

T-Systems Connectivity Platform

Fleet connectivity platform with ML-driven predictive analytics — on T-Systems infrastructure.

Fuel & Telematics

On-Premise

Eurowag Legislation Monitor

Multi-jurisdiction legislation monitor — daily traffic-light compliance reports for Eurowag.

Legal

On-Premise

RowanAI

Air-gapped AI agent infrastructure running law firm operations at Rowan Legal. No data leaves the building.

FAQ

Questions buyers actually ask

The AI model and all processing run on hardware in your own data centre or server room. No data is sent to any external cloud service. You own the hardware, control the access, and the system can operate fully air-gapped — with no internet connection at all.

Our fastest deployment was 3 weeks from contract to production. A typical pilot takes 6–12 weeks depending on integration complexity and your IT readiness.

Fixed per phase: a €9,800 sovereignty assessment (credited against the build), pilots from €120,000, and managed service from €6,000/month. There are no token-based running costs — just infrastructure and support.
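As a rough illustration of how the phases add up, the sketch below computes a lower bound from the published "from" prices. It assumes that "the build" means the pilot and that the assessment fee is deducted from the pilot invoice; hardware is excluded, and real proposals may be higher.

```python
ASSESSMENT_EUR = 9_800             # fixed assessment fee
PILOT_FROM_EUR = 120_000           # published "from" price, so a minimum
MANAGED_FROM_EUR_PER_MONTH = 6_000 # published "from" price, so a minimum

def minimum_total(months_of_service, proceed_to_pilot=True):
    """Lower-bound spend in EUR, excluding hardware.

    Assumption: the assessment fee is credited against the pilot invoice.
    """
    if not proceed_to_pilot:
        return ASSESSMENT_EUR
    pilot_after_credit = PILOT_FROM_EUR - ASSESSMENT_EUR
    service = MANAGED_FROM_EUR_PER_MONTH * months_of_service
    return ASSESSMENT_EUR + pilot_after_credit + service

print(minimum_total(12))                         # 192000
print(minimum_total(0, proceed_to_pilot=False))  # 9800
```

Under these assumptions the assessment costs nothing extra once the pilot goes ahead: the first year comes to at least €192,000 before hardware.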

We specify the hardware during the assessment. Production-grade inference for a 70B-class model typically lands between €40,000 and €190,000 in hardware, depending on throughput. We procure at cost, or deploy on GPUs you already own.
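A back-of-the-envelope sketch of why the hardware range is so wide. The memory needed for model weights is the parameter count times the bytes per parameter, so precision largely determines how many GPUs a 70B-class model occupies. The 20% overhead factor for KV cache and activations and the 80 GB card size are illustrative assumptions, not sizing guidance; real sizing also depends on context length, batch size, and throughput targets.

```python
import math

def weight_memory_gb(params_billion, bits_per_param):
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_param / 8

def gpus_needed(memory_gb, gpu_memory_gb, overhead=1.2):
    """GPUs required to hold the weights plus an assumed runtime overhead.

    `overhead` is a hypothetical allowance for KV cache and activations.
    """
    return math.ceil(memory_gb * overhead / gpu_memory_gb)

for bits in (16, 8, 4):
    weights = weight_memory_gb(70, bits)
    print(f"{bits}-bit: {weights:.0f} GB weights, "
          f"{gpus_needed(weights, 80)} x 80 GB GPU(s)")
```

At 16-bit precision the weights alone take 140 GB, which needs several 80 GB cards; at 4-bit they take 35 GB and fit on one.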

Whether sending personal data to a cloud LLM is GDPR-compliant is often contested, which is the problem. EU–US transfer frameworks remain under court challenge, and a cloud LLM prompt containing patient or client facts is a disclosure to a third party. On-premise deployment removes the question entirely: the data never leaves your network.

Open-weights models — Mistral, Llama, Whisper, and others — fine-tuned on your domain, plus our Olingo model line for healthcare and legal workloads. You own the resulting weights.

Day-to-day operation sits with your IT team, with our managed service behind it: monitoring, model updates (delivered offline for air-gapped sites), and an incident SLA. Full handover to your team is an option, not a hostage negotiation.

Yes. We port cloud AI products to customer hardware: local inference replaces external API calls, the stack is containerised, and updates work air-gapped. Fixed scope, typically four to ten weeks.
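One small piece of such a port can be sketched as a configuration guard: before the application starts, check that the inference endpoint it is configured to call resolves to the local network rather than an external API. This is an illustrative sketch, not the actual porting process; the allowlisted hostnames and example URLs are hypothetical, and a check like this complements network-level egress blocking rather than replacing it.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical internal hostnames the operator has explicitly approved.
ALLOWED_HOSTNAMES = {"localhost", "inference.internal"}

def is_local_endpoint(url):
    """True if the URL points at an approved hostname or a private/loopback IP."""
    host = urlparse(url).hostname
    if host is None:
        return False
    if host in ALLOWED_HOSTNAMES:
        return True
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # A public-looking hostname that is not on the allowlist is rejected.
        return False
    return address.is_private or address.is_loopback

for url in ("http://10.0.4.12:8000/v1", "https://api.example.com/v1"):
    print(url, "->", "ok" if is_local_endpoint(url) else "rejected")
```

Rejecting unknown hostnames by default keeps the check conservative: anything not explicitly approved is treated as external.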

CONTACT

Talk to engineers, not sales.

We work with organisations ready to own their AI infrastructure. 30-minute session: we map your data flows and propose a deployment architecture. If we're not the right fit, we'll tell you.

WHAT HAPPENS NEXT

01 We map your data flows and existing systems.
02 We propose a deployment architecture for your infrastructure.
03 You leave with a concrete plan. No pitch.

info@ollsoft.com

Or send a question by email — we reply within 48 hours.

