Skilja: IDP Software Vendor

Skilja is a German intelligent document processing (IDP) specialist founded in 2012 and headquartered in Freiburg. The company serves regulated industries with a modular platform combining transformer-based OCR, BPMN-based workflow design, and enterprise identity integration. Its largest deployment processes 100 million documents annually in a single system.

600,000 documents processed daily by TK

87% automation rate at TK (Techniker Krankenkasse)

100M documents per year in the largest single Vinna deployment

1 ms LAERA classification latency

How Skilja processes documents

Skilja's platform rests on three independently deployable components: LESA for OCR, LAERA for classification and extraction, and TEGRA as the integration and orchestration layer. Each module exposes a RESTful API, so organizations can adopt individual components or the full stack without SDK integration.

CEO Alexander Goerke describes the approach: "Intelligent Document Processing is an extension of classical data capture technology. It aims to take documents that were created for human readers and make them intelligible to machines." That framing shapes the architecture. Rather than positioning the platform as general-purpose AI, Skilja focuses on mimicking human document understanding through deep learning trained on business document samples, then layering quality controls to prevent model deterioration from incorrect training data.

The result is measurable at scale. TK (Techniker Krankenkasse), Germany's largest public insurance company, processes 600,000 documents daily at an 87% automation rate. Skilja's largest customer overall runs 100 million documents per year through a single Vinna system, which works out to roughly 274,000 documents per calendar day, or about 400,000 per day if spread over a 250-working-day year. Skilja targets a recall rate above 90% for automatic field recognition without manual review, benchmarking against the typical human error rate of approximately 1%.
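The daily figures quoted here follow from simple division. The short check below is a sketch that assumes a 250-working-day year (an assumption, not a figure stated by Skilja) and uses only the volumes and automation rate given in the text.

```python
# Volumes and rates taken from the text above.
DOCS_PER_YEAR = 100_000_000      # largest single Vinna deployment
TK_DOCS_PER_DAY = 600_000        # TK daily volume
TK_AUTOMATION_PERCENT = 87       # TK automation rate

# Assumption for illustration only: 250 working days per year.
CALENDAR_DAYS = 365
WORKING_DAYS = 250


def docs_per_day(docs_per_year: int, days: int) -> int:
    """Average daily volume, rounded to the nearest document."""
    return round(docs_per_year / days)


per_calendar_day = docs_per_day(DOCS_PER_YEAR, CALENDAR_DAYS)
per_working_day = docs_per_day(DOCS_PER_YEAR, WORKING_DAYS)

# Integer arithmetic avoids floating-point noise in the percentage split.
tk_automated = TK_DOCS_PER_DAY * TK_AUTOMATION_PERCENT // 100
tk_manual = TK_DOCS_PER_DAY - tk_automated

print(per_calendar_day)  # 273973
print(per_working_day)   # 400000
print(tk_automated)      # 522000
print(tk_manual)         # 78000
```

At TK's stated rate, about 78,000 documents per day would still need some human handling, which is why the automation rate matters as much as raw throughput.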

Unlike ABBYY, which pursues broad multi-industry coverage, Skilja's architecture prioritizes depth in regulated verticals: insurance, financial services, and healthcare, where on-premise data residency and audit trails are non-negotiable.

Platform architecture

Skilja's three-component stack is designed so each layer can scale independently.

LESA AI OCR uses deep learning neural networks trained on millions of business document samples. It operates at the word level rather than the character level, applying layout analysis and handwritten text recognition to produce contextually grounded output. The system accepts pre-existing OCR results or PDF text layers as input, bypassing OCR processing when source quality is sufficient and reducing overall processing time.

LAERA handles classification and extraction. The classifier achieves 1-millisecond latency using a voting mechanism with a pretrained large language model (LLM) classifier. The extraction module uses predictive generative algorithms for context-based content discovery across forms, semi-structured documents, and unstructured content. Both modules include visual designers so process owners and business analysts can configure, train, and test models without writing code. When combined, LESA and LAERA reach automation rates in the high 90% range. Classification and extraction run without cloud storage or retention of customer documents, which directly supports on-premise privacy compliance.
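The text does not say how LAERA's voting mechanism is implemented. As a generic illustration of the idea, the sketch below shows a simple majority vote over the labels proposed by several classifiers. The function name `vote`, the `min_agreement` threshold, and the fallback to `None` (read here as "send to manual review") are all hypothetical and are not part of any Skilja API.

```python
from collections import Counter
from typing import Optional, Sequence


def vote(labels: Sequence[str], min_agreement: float = 0.5) -> Optional[str]:
    """Return the winning label if strictly more than `min_agreement`
    of the classifiers agree on it, otherwise None.

    None is interpreted by the caller as "no confident decision".
    """
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    if count / len(labels) > min_agreement:
        return label
    return None


# Two of three hypothetical classifiers agree, so the document is auto-classified.
print(vote(["invoice", "invoice", "claim"]))  # invoice

# A one-to-one split does not clear the threshold, so no label is returned.
print(vote(["invoice", "claim"]))  # None
```

Requiring a strict majority trades some automation for fewer wrong classifications; a production system would likely weight votes by classifier confidence rather than counting them equally.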

TEGRA is the integration layer that connects LESA and LAERA to enterprise systems via RESTful APIs. It accepts JPG, TIF, PNG, and PDF inputs, processes documents asynchronously with priority-based queuing, and returns structured JSON, plain text, or PDF with a full-text layer. TEGRA deploys on Windows and Linux via Docker and Kubernetes, scales horizontally through worker processes, and stores transient document queues and model data in PostgreSQL, MS-SQL, or Oracle.
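Priority-based queuing of the kind described above can be illustrated with a small in-memory heap. This is a minimal sketch of the general technique, not TEGRA's implementation or API: the class `PriorityJobQueue`, the `Job` fields, and the convention that a lower number means higher priority are assumptions made for the example.

```python
import heapq
import itertools
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(order=True)
class Job:
    priority: int                        # lower value = processed first (assumed convention)
    seq: int                             # tie-breaker that preserves submission order
    document: str = field(compare=False)


class PriorityJobQueue:
    """Hypothetical in-memory queue: highest priority first, FIFO within a priority."""

    def __init__(self) -> None:
        self._heap: List[Job] = []
        self._counter = itertools.count()

    def submit(self, document: str, priority: int = 5) -> None:
        heapq.heappush(self._heap, Job(priority, next(self._counter), document))

    def next_job(self) -> Optional[str]:
        """Pop the next document name, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap).document


queue = PriorityJobQueue()
queue.submit("a.pdf", priority=5)
queue.submit("b.tif", priority=1)   # urgent, jumps ahead of earlier submissions
queue.submit("c.png", priority=5)

print(queue.next_job())  # b.tif
print(queue.next_job())  # a.pdf
print(queue.next_job())  # c.png
```

The sequence counter keeps ordering stable for jobs with equal priority, so a burst of routine documents is still processed in arrival order.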

Vinna is the orchestration platform that ties the stack together. Its BPMN-based Process Editor gives non-technical users a no-code interface for building and versioning document workflows. The standard activity library covers Import, OCR, Classification, and Extraction, with support for custom microservices and arbitrary routing logic. Process Monitor provides real-time operational visibility across all processes and work items, with drill-down to document and field level. An end-to-end audit trail captures the full interaction history for each work item across its lifespan.

Vinna 3.0 and 3.1 releases

Vinna 3.0, released after 18 months of development, rebuilt the Process Editor in Angular 10 with componentized functions and full BPMN support. Process version management now supports major and minor versioning (1.0, 1.1, 2.0) with draft protection and separate variable configuration per runtime environment. The release introduced transient Activity Servers that auto-start with virtual machines or Docker containers and shut down when idle, enabling dynamic resource scaling without manual intervention. Vinna 3.0 maintains full backward compatibility with Vinna 2.4, allowing step-by-step upgrades without production disruption.
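The major and minor numbering scheme (1.0, 1.1, 2.0) can be modeled in a few lines. The sketch below is illustrative only: `ProcessVersion` and its methods are hypothetical names, and it does not attempt to model Vinna's draft protection or per-environment variables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessVersion:
    """Immutable major.minor version of a process definition (hypothetical model)."""
    major: int
    minor: int = 0

    def bump_minor(self) -> "ProcessVersion":
        # Small, compatible change: 1.0 -> 1.1
        return ProcessVersion(self.major, self.minor + 1)

    def bump_major(self) -> "ProcessVersion":
        # Larger change: 1.1 -> 2.0, minor counter resets
        return ProcessVersion(self.major + 1, 0)

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}"


v1_0 = ProcessVersion(1)
v1_1 = v1_0.bump_minor()
v2_0 = v1_1.bump_major()

print(v1_0, v1_1, v2_0)  # 1.0 1.1 2.0
```

Making the version object immutable means an existing version can never be edited in place; every change produces a new version value.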

Vinna 3.1 replaced password-based authentication with the OAuth2 protocol, supporting Microsoft Azure Active Directory and Keycloak as identity backends. Three authentication flows are available: the Authorization Code Flow with PKCE for user-facing applications, the password grant flow for username/password login, and the client credentials grant flow for machine-to-machine communication. Skilja's position on internal credential management is direct: "A platform should not have its own user management. This is a no-go for many customers." Role-based access control maps to enterprise user directories rather than storing credentials internally.
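For readers unfamiliar with PKCE, the client-side part of the Authorization Code Flow with PKCE comes down to generating a random code verifier and sending its SHA-256 hash as the code challenge, as defined in RFC 7636. The sketch below shows that generic step with the Python standard library. The function names are illustrative, and nothing here is specific to Vinna, Azure Active Directory, or Keycloak.

```python
import base64
import hashlib
import secrets


def _b64url(data: bytes) -> str:
    """Base64url-encode without '=' padding, as RFC 7636 requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_code_verifier(n_bytes: int = 32) -> str:
    """Random high-entropy verifier; 32 random bytes yield 43 characters."""
    return _b64url(secrets.token_bytes(n_bytes))


def make_code_challenge(verifier: str) -> str:
    """S256 challenge: base64url(SHA-256(verifier))."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return _b64url(digest)


verifier = make_code_verifier()
challenge = make_code_challenge(verifier)

# The client sends `challenge` (with code_challenge_method=S256) in the
# authorization request and later sends `verifier` in the token request.
print(len(verifier))   # 43
print(len(challenge))  # 43
```

Because only the hash travels in the first request, an attacker who intercepts the authorization code cannot redeem it without the original verifier.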

Vinna deploys on Microsoft Azure, AWS, Google Cloud, private cloud, on-premise, or hybrid configurations.

Use cases

German insurance and public sector

All major German public insurance companies have implemented IDP for customer correspondence processing, according to Goerke. The TK deployment, at 600,000 documents per day and 87% automation, is the best-documented example. The transformer-based OCR approach handles complex policy documents, claim forms, and customer correspondence without the brittleness of rule-based character recognition systems. On-premise deployment options address data residency requirements that cloud-only platforms cannot meet.

High-volume enterprise document processing

Organizations requiring throughput above 100,000 documents per day use Vinna's BPMN workflow engine and intelligent load balancing for complex orchestration across distributed environments. The online learning capability continuously improves accuracy from human corrections. Skilja's "sagacious forgetting" mechanism prevents model deterioration by unlearning incorrect patterns, maintaining accuracy as document types and formats evolve over time.

Compliance-sensitive environments

European organizations subject to strict data sovereignty requirements deploy Skilja on-premise with OAuth 2.0 and OpenID Connect (OIDC) authentication and encrypted communication. Veracode Verified certification and regular penetration testing during Vinna development address security requirements in financial services and healthcare. The audit trail captures detailed interaction history for the lifespan of each work item, meeting business-critical compliance requirements for regulated document processing.

Technical specifications

Feature	Specification
OCR technology	Transformer-based word-level deep learning
Classification latency	1 millisecond (LAERA)
Processing capacity	100M+ documents per year (single system)
Input formats	JPG, TIF, PNG, PDF
Output formats	Structured JSON, plain text, PDF with full-text layer
Deployment	On-premise, AWS, Azure, Google Cloud, hybrid
Container support	Docker, Kubernetes
Authentication	OAuth 2.0, OpenID Connect (OIDC), Azure AD, Keycloak
Databases	PostgreSQL, MS-SQL, Oracle
Security certification	Veracode Verified
Workflow design	BPMN-based, no-code Process Editor
Integration	RESTful API, RPA systems, SQL databases
Learning	Online learning with sagacious forgetting

Resources

Skilja product overview
VINNA platform details
TEGRA integration layer
Security and compliance documentation
Technology blog
Classification resources
