When do you need local models instead of OpenAI/Anthropic?

In at least three cases. First, you process personal data (GDPR, FZ-152) or medical or financial data covered by sector regulation. Second, you handle confidential commercial information such as internal docs or customer correspondence. Third, you have a strategic preference not to depend on US vendors. Otherwise, a cloud model with proper anonymization is usually fine.

How much worse are local models than GPT-4 or Claude?

For most business tasks, Llama 3 70B and Mistral Large are comparable to GPT-4. They may underperform on complex reasoning, multilingual work (English is their strongest language), and creative tasks. You get an exact comparison after pilot testing on your data in the first 3-4 days.

How much does a GPU server for a local model cost?

Llama 3 8B runs on an RTX 4090 (~$2,500 for the hardware, or ~$300/month to rent). Llama 3 70B needs an A100 80GB (~$800/month to rent). For most business cases, 8B-13B models are enough.
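
As a rough illustration of the buy-versus-rent trade-off, the sketch below computes after how many whole months of renting the cumulative rent reaches the hardware price. It uses only the two figures quoted above and assumes everything else (power, hosting, the rest of the server, resale value) is ignored; the function name is ours, not part of any tool.

```python
import math


def break_even_months(hardware_cost: float, monthly_rent: float) -> int:
    """First whole month at which cumulative rent meets or exceeds the hardware price."""
    if hardware_cost < 0 or monthly_rent <= 0:
        raise ValueError("hardware_cost must be >= 0 and monthly_rent must be > 0")
    return math.ceil(hardware_cost / monthly_rent)


# Figures from the answer above: ~$2,500 for the card vs ~$300/month rent.
print(break_even_months(2500, 300))  # 9
```

Under these simplified assumptions, renting is cheaper for a pilot of a few months, and buying starts to pay off for a deployment expected to run longer than about nine months.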

What is prompt injection and why does it matter?

It is an attack in which crafted user input makes the AI agent ignore its system instructions and perform unwanted actions (e.g., disclose other users' data or make a decision that violates policy). We defend in layers: input filtering, isolated tool permissions, a full agent audit log, and limits on knowledge sources.
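
A minimal sketch of how three of these layers can fit together, assuming a deliberately small pattern list and a hypothetical role-to-tool allowlist. The role names, tool names, and patterns are illustrative only, and a pattern filter on its own will not stop a determined attacker, which is why the permission check and the log sit behind it.

```python
import re
from dataclasses import dataclass, field

# Hypothetical patterns; a real filter is broader and tuned on real traffic.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.I),
    re.compile(r"reveal (the |your )?system prompt", re.I),
]

# Hypothetical allowlist: each role may call only the tools listed for it.
TOOL_PERMISSIONS = {
    "customer": {"search_kb"},
    "support_agent": {"search_kb", "read_ticket"},
}


@dataclass
class AuditLog:
    """In-memory stand-in for an append-only audit store."""
    entries: list = field(default_factory=list)

    def record(self, event: str, **details) -> None:
        self.entries.append({"event": event, **details})


def looks_like_injection(text: str) -> bool:
    """Layer 1: flag input that matches a known injection phrasing."""
    return any(p.search(text) for p in SUSPICIOUS_PATTERNS)


def authorize_tool_call(role: str, tool: str, log: AuditLog) -> bool:
    """Layers 2 and 3: check the allowlist and log the decision either way."""
    allowed = tool in TOOL_PERMISSIONS.get(role, set())
    log.record("tool_call", role=role, tool=tool, allowed=allowed)
    return allowed
```

The point of the design is that the permission check does not depend on the model behaving: even if an injected instruction gets past the filter, a tool call outside the caller's allowlist is refused and recorded.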

Can you pass a GDPR audit with an AI deployment?

Yes. The key requirements are processing consent (wording and storage), an access log, data residency, leakage protection, and the right to be forgotten. All are covered by the standard self-hosted plus audit-log architecture. We provide documentation for the auditor.

How much does an audit alone cost, without deployment?

A baseline audit of an existing AI deployment starts at $1,500 and takes 3-5 days. You get a report with a prioritized risk list and concrete recommendations. Deployment is separate.

AI Security Audit, Local LLMs, GDPR-Ready Deployment

When you need this service

Three typical triggers:

Regulatory requirement — GDPR for EU customers, HIPAA for US healthcare, FZ-152 for Russian personal data, sector-specific (banking, insurance, legal)
Strategic independence — you don’t want to depend on OpenAI/Anthropic potentially cutting service, changing prices, or shifting policy
Sensitive data — internal docs, customer correspondence, contracts, R&D research

If even one of these applies, this service is relevant, and it covers all three.

Threat model

During the audit we check seven typical risks:

Vendor leakage: data lands in OpenAI/Anthropic logs and can be subpoenaed by foreign authorities
Prompt injection: a user bypasses the system prompt with crafted input
Data leakage between users: the AI remembers one user’s context and reveals it to another
Jailbreak: bypassing model safety filters to produce harmful content
Tool misuse: the agent uses available tools for unintended actions (e.g., data deletion)
Secret exfiltration: API keys, passwords, and tokens leak into prompts and logs
Compliance gaps: missing access log, consents, or retention policy
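
To make the secret exfiltration risk concrete, here is a minimal sketch of redacting likely secrets before text is sent to a model or written to a log. The two patterns are assumptions chosen for illustration (an `sk-` style key and `name=value` credentials); a real deployment would use a maintained secret scanner with a much larger rule set.

```python
import re

# Illustrative patterns only; they will miss many real secret formats.
SECRET_PATTERNS = [
    # Assumed key shape: "sk-" followed by 20 or more alphanumeric characters.
    re.compile(r"sk-[A-Za-z0-9]{20,}"),
    # Credentials written as name=value or name: value.
    re.compile(r"\b(password|token|api[_-]?key)\s*[:=]\s*\S+", re.I),
]


def redact(text: str) -> str:
    """Replace anything matching a secret pattern with a fixed placeholder."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


print(redact("password=hunter2 please"))  # [REDACTED] please
```

Running the redaction at the boundary (before the prompt is built and before the log line is written) means a secret pasted by a user never reaches either place in clear text.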

“Secure by default” architecture

Default architecture:

Layer isolation — frontend → API gateway with auth → AI service without direct DB access → constrained tool set with permission checks
Encryption at rest — all DBs encrypted (Postgres TDE or filesystem-level)
Secret management — no tokens in code or env files; HashiCorp Vault or Cloudflare Secrets
Audit-log pipeline — every user request and AI decision in a separate read-only DB
RBAC — admin role model, least privilege
Rate limiting — defense against overload and abuse
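
As one example from this list, rate limiting is often implemented with a token bucket. The sketch below is a generic version of that technique, not a description of our production gateway; the class name and parameters are illustrative, and the clock is passed in explicitly so the behavior is easy to test.

```python
from dataclasses import dataclass, field


@dataclass
class TokenBucket:
    """Allow short bursts up to `capacity`, then a steady `refill_per_sec` rate."""
    capacity: float
    refill_per_sec: float
    tokens: float = field(init=False)
    last_ts: float = 0.0

    def __post_init__(self) -> None:
        # Start with a full bucket so the first requests are not delayed.
        self.tokens = self.capacity

    def allow(self, now: float) -> bool:
        """Return True and spend one token if the request at time `now` may proceed."""
        elapsed = max(0.0, now - self.last_ts)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last_ts = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(capacity=2, refill_per_sec=1)
print([bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0), bucket.allow(1.0)])
# [True, True, False, True]
```

In a real service, `now` would come from `time.monotonic()`, and there would be one bucket per user or API key so that one abusive client cannot exhaust capacity for everyone else.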

Local models — selection criteria

Model choice depends on task and hardware:

Llama 3 8B: great for classification, data extraction, and simple QA. Runs on an RTX 3090/4090. 30-80 tok/sec.
Llama 3 70B: close to GPT-4 quality. Needs an A100 80GB or 2× A6000. 8-20 tok/sec.
Mistral 7B / Mistral Large: strong on European languages; the commercial Mistral Large is excellent.
Qwen 2.5: strong reasoning, excellent multilingual support.
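
The hardware pairings above follow from a simple rule of thumb: memory for the weights is roughly the parameter count times the bytes per weight. The sketch below applies that rule; it covers weights only and ignores the KV cache and activations, so treat the results as a lower bound rather than a sizing guarantee.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights only, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


print(weight_memory_gb(8, 16))   # 16.0  (8B at 16-bit)
print(weight_memory_gb(70, 16))  # 140.0 (70B at 16-bit)
print(weight_memory_gb(70, 4))   # 35.0  (70B at 4-bit)
```

By this estimate, an 8B model at 16-bit fits on a 24 GB card, while a 70B model at 16-bit does not fit on a single 80 GB card, which suggests the 70B hardware listed above assumes quantized weights.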

During the audit we test several models on your tasks and pick the one with the best “quality × inference cost” trade-off.
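
One simple way to read that trade-off, shown here as a sketch rather than our exact procedure: set a minimum quality bar, then take the cheapest model that clears it. The record type, field names, and the numbers in the usage example are placeholders, not measured results.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class PilotResult:
    model: str
    accuracy: float     # fraction of pilot tasks answered correctly
    cost_per_1k: float  # inference cost per 1,000 requests, any one currency


def pick_model(results: Iterable[PilotResult], min_accuracy: float) -> Optional[PilotResult]:
    """Cheapest model that meets the quality bar, or None if none does."""
    eligible = [r for r in results if r.accuracy >= min_accuracy]
    return min(eligible, key=lambda r: r.cost_per_1k, default=None)


# Placeholder numbers for illustration only.
pilot = [
    PilotResult("model-a", accuracy=0.88, cost_per_1k=1.0),
    PilotResult("model-b", accuracy=0.95, cost_per_1k=6.0),
]
print(pick_model(pilot, min_accuracy=0.85).model)  # model-a
print(pick_model(pilot, min_accuracy=0.90).model)  # model-b
```

The quality bar matters more than the ranking: with a bar of 0.85 the smaller model wins on cost, while a bar of 0.90 forces the larger one.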

Compliance documentation

Post-deployment deliverables:

Record of processing for GDPR Article 30
Privacy policy and consents in audit-ready format
DPIA (Data Protection Impact Assessment) for high-risk EU deployments
Self-check checklist for your team

Get started

Book a free 2-day audit. We’ll review your current AI deployment (if any) or planned architecture and deliver a written report with prioritized risks and recommendations.

What you get

Data stays inside

Regulator compliance

Prompt injection defense

Encryption and access control

How we work

Current state audit · 2 days

Architecture and plan · 2 days

Local model deployment · 4-5 days

Handoff and training · 1 day

Tech stack

Pricing

Frequently asked

Book a 30-minute audit.