LLM-first document processing platform with visual editing, multimodal extraction, and email-driven workflows for finance, legal, and real estate applications.

Sensible.so

150+: Pre-built document parsers
5,000: Max concurrent documents (Batch Extract)
$499/mo: Entry price for 750 documents
100: Free documents per month

Overview

Among intelligent document processing vendors, Sensible provides an API that combines LLM parsing with deterministic layout rules to extract structured JSON from PDFs, emails, and Excel spreadsheets. Founded in San Francisco in 2020 with $8M in funding led by Craft Ventures, the company targets operations-heavy workflows in finance, legal, insurance, and healthcare.

The platform's core differentiator is SenseML, a proprietary config-as-code extraction syntax that pairs LLM-based semantic understanding with deterministic rules. This hybrid approach produces per-field confidence signals and full audit trails with source coordinates, which matters in regulated industries where explainability is a compliance requirement, not a preference. The agentic LLM workflows launched in late 2025 and detailed in early 2026 extend this further: multi-step pipelines can chain LLM prompts via a source_ids parameter, apply conditional logic for format-adaptive extraction, and perform semantic normalization such as mapping diagnoses to ICD-10 codes or standardizing contract clause categories without re-scanning source documents.

Pricing is per document, not per page. A 50-page loan packet counts as one document. Entry starts at $499/month for 750 extractions (approximately 66 cents per document), with a free tier of 100 documents per month and no credit card required to start. Parsli's March 2026 competitive analysis positions this as premium relative to its own $20/month entry point, while noting Sensible's 150+ pre-built configurations and SOC 2 Type II and HIPAA compliance as the justification for the price gap.
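The per-document arithmetic above can be checked with a short sketch. The price and document count come from the text; the function names are illustrative, and the per-page figure assumes every document is a 50-page packet, which is a simplification for comparison with per-page vendors.

```python
ENTRY_PRICE_USD = 499.0   # monthly entry price quoted above
ENTRY_DOCUMENTS = 750     # documents included at that price


def cents_per_document(price_usd: float, documents: int) -> float:
    """Effective cost per document in cents, rounded to one decimal place."""
    return round(price_usd / documents * 100, 1)


def cents_per_page(price_usd: float, documents: int, pages_per_document: int) -> float:
    """Effective cost per page in cents, assuming a uniform page count."""
    total_pages = documents * pages_per_document
    return round(price_usd / total_pages * 100, 2)


per_document = cents_per_document(ENTRY_PRICE_USD, ENTRY_DOCUMENTS)
per_page_50 = cents_per_page(ENTRY_PRICE_USD, ENTRY_DOCUMENTS, 50)
print(f"{per_document} cents per document, {per_page_50} cents per page at 50 pages")
```

The longer the average document, the lower the effective per-page cost, which is why per-document pricing favors high-page-count workloads.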

For teams evaluating open-source alternatives that also use LLMs for structured extraction, LangExtract, an open-source Python library from Google, offers a comparable zero-shot approach.

We've been able to go to 2,000 members and ~$2MM of tracked premiums in a very short time with one person on operations. We couldn't have done that without Sensible.

Stuart Winchester, CEO, Marble Insurance (Sensible.so case study, 2026)

How Sensible.so processes documents

Sensible's extraction pipeline combines three methods: LLM-based parsing for semantic field identification, visual layout rules for position-dependent data, and multimodal analysis for non-text content including embedded images, charts, and handwriting. Users configure parsers through a visual cards interface using plain English descriptions rather than code, with the platform generating the underlying SenseML extraction logic automatically.

Documents enter through REST API endpoints, direct upload in the Sensible app, or email forwarding for attachment-driven workflows. The Classification API routes incoming documents to the appropriate parser before extraction begins. For high-volume scenarios, Batch Extract processes up to 5,000 documents concurrently. The agentic workflow layer handles complex documents requiring multi-step reasoning: extract bank statement transactions in one step, then analyze the extracted data for merchant frequency and maximum transaction amounts in a second step, without re-scanning the original document.
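As a rough illustration of planning work around the classification step and the 5,000-document ceiling, the sketch below groups documents by an already-assigned type and splits each group into batches. It does not call Sensible's API: the `(name, type)` tuples stand in for classification results, and `plan_batches` is a hypothetical helper name.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

MAX_BATCH_SIZE = 5000  # concurrency ceiling quoted above for Batch Extract


def group_by_type(docs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (document_name, document_type) pairs by type, preserving order."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for name, doc_type in docs:
        grouped[doc_type].append(name)
    return dict(grouped)


def chunk(items: List[str], size: int = MAX_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def plan_batches(docs: Iterable[Tuple[str, str]]) -> List[Tuple[str, List[str]]]:
    """Return (document_type, batch) pairs, each batch within the ceiling."""
    return [
        (doc_type, batch)
        for doc_type, names in group_by_type(docs).items()
        for batch in chunk(names)
    ]


# Usage: 12,001 statements and 3 tax forms become four batches.
documents = [(f"stmt-{i}.pdf", "bank_statement") for i in range(12001)]
documents += [(f"w2-{i}.pdf", "tax_form") for i in range(3)]
plan = plan_batches(documents)
print([(doc_type, len(batch)) for doc_type, batch in plan])
```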

Format handling covers PDF, email, Excel (.xlsx), TIFF, JPEG, PNG, and Word. Excel files receive format-specific preprocessing: wide sheets are rescaled via .xlsx metadata to prevent column truncation, and multi-sheet workbooks are paginated with sheet titles inserted at the top of each page, enabling per-sheet routing through the existing portfolios feature as independent documents.
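The multi-sheet pagination described above can be pictured with a small sketch. This is not Sensible's implementation: an in-memory dictionary stands in for a parsed .xlsx workbook, and `paginate_workbook` is a hypothetical name. It only shows the idea of one page per sheet with the sheet title on the first line, so that later routing can key on the title.

```python
from typing import Dict, List


def paginate_workbook(sheets: Dict[str, List[List[object]]]) -> List[str]:
    """Render each sheet as one text page whose first line is the sheet title."""
    pages: List[str] = []
    for title, rows in sheets.items():
        lines = [title]
        # Cells are joined with tabs so column boundaries stay visible.
        lines.extend("\t".join(str(cell) for cell in row) for row in rows)
        pages.append("\n".join(lines))
    return pages


# Usage with an illustrative two-sheet workbook.
workbook = {
    "Rent Roll": [["Unit", "Rent"], ["1A", 1800], ["1B", 1950]],
    "Expenses": [["Category", "Amount"], ["Repairs", 420]],
}
pages = paginate_workbook(workbook)
for page in pages:
    print(page)
    print("---")
```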

The platform migrated from text-davinci-003 to gpt-3.5-turbo-0613 in 2025. The SenseML architecture is designed to absorb model upgrades without requiring parser reconfiguration, which reduces maintenance overhead as underlying LLMs evolve.

1. Ingest: Documents arrive via REST API, direct upload, or email forwarding. The Classification API identifies document type and routes to the matching parser.

2. Extract: SenseML combines LLM semantic parsing with deterministic layout rules. Multimodal analysis handles images, charts, and handwriting in the same pass.

3. Enrich: Agentic workflows chain additional LLM steps on extracted data: semantic normalization, schema versioning, derived fields, and data enrichment without re-scanning.

4. Deliver: Results return as structured JSON via webhook or SDK, with per-field confidence scores and source coordinates for audit trails.
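On the receiving side, the delivery step might be consumed along the lines below. The payload shape is an assumption made for illustration: the key names (`fields`, `value`, `confidence`, `page`, `bounding_box`), the numeric confidence scale, and the 0.9 threshold are all hypothetical and are not taken from Sensible's documentation.

```python
import json
from typing import Any, Dict, List

REVIEW_THRESHOLD = 0.9  # hypothetical cut-off for human review

# Hypothetical result payload; the real response schema may differ.
SAMPLE_RESULT: Dict[str, Any] = json.loads("""
{
  "document_type": "insurance_policy",
  "fields": {
    "policy_number": {
      "value": "PN-10442",
      "confidence": 0.72,
      "page": 1,
      "bounding_box": [72, 140, 210, 156]
    },
    "premium": {
      "value": "1250.00",
      "confidence": 0.98,
      "page": 2,
      "bounding_box": [300, 410, 372, 426]
    }
  }
}
""")


def fields_needing_review(result: Dict[str, Any],
                          threshold: float = REVIEW_THRESHOLD) -> List[str]:
    """Return the names of fields whose confidence falls below the threshold."""
    return [
        name
        for name, field in result["fields"].items()
        if field["confidence"] < threshold
    ]


def audit_line(name: str, field: Dict[str, Any]) -> str:
    """Format one audit-trail line tying a value to its source location."""
    return f"{name}={field['value']} (page {field['page']}, box {field['bounding_box']})"


flagged = fields_needing_review(SAMPLE_RESULT)
for name in flagged:
    print("REVIEW:", audit_line(name, SAMPLE_RESULT["fields"][name]))
```

Routing only low-confidence fields to a reviewer keeps the audit trail intact while limiting manual work to the values that need it.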

Agentic workflows and what they enable

The agentic workflow capability, announced in detail in early 2026, represents the most significant architectural shift since Sensible's founding. Where earlier versions extracted fields from documents in a single pass, agentic pipelines treat extracted data as an intermediate state that subsequent LLM steps can reason over.

Three use cases illustrate the practical scope. Schema normalization detects form versions within a document batch and applies version-specific extraction strategies automatically, solving a persistent problem in insurance and lending where form layouts change across years or jurisdictions. Semantic normalization standardizes free-text clinical diagnoses to ICD-10 codes or maps contract clause language to defined categories, reducing downstream data cleaning. Data enrichment classifies extracted business names by industry type or derives geographic patterns from address fields, adding analytical value without a separate processing step.
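To make the semantic normalization case concrete, here is a deliberately simplified stand-in. In the platform this step is described as LLM-driven; the sketch swaps in a fixed lookup table so it can run on its own. The two ICD-10 codes are real, but the table, the function name, and the matching rule (case and whitespace folding only) are illustrative assumptions.

```python
from typing import Dict, List, Tuple

# Tiny illustrative table; a real mapping would be far larger.
ICD10_BY_PHRASE: Dict[str, str] = {
    "essential hypertension": "I10",
    "type 2 diabetes mellitus without complications": "E11.9",
}


def normalize_diagnoses(diagnoses: List[str]) -> Tuple[Dict[str, str], List[str]]:
    """Map free-text diagnoses to ICD-10 codes; collect anything unmapped."""
    mapped: Dict[str, str] = {}
    unmapped: List[str] = []
    for raw in diagnoses:
        # Fold case and collapse runs of whitespace before the lookup.
        key = " ".join(raw.lower().split())
        code = ICD10_BY_PHRASE.get(key)
        if code is None:
            unmapped.append(raw)
        else:
            mapped[raw] = code
    return mapped, unmapped


mapped, unmapped = normalize_diagnoses(
    ["Essential  Hypertension", "chest pain, unspecified cause"]
)
print(mapped)
print(unmapped)
```

Returning the unmapped values separately, rather than guessing a code, keeps uncertain entries visible for downstream review.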

The source_ids parameter enables this by passing the output of one extraction step as the input context for the next, keeping the full pipeline within Sensible's audit framework rather than requiring external orchestration.
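The chaining idea can be modeled in a few lines of plain Python. This is a conceptual sketch, not SenseML and not Sensible's runtime: the step dictionaries, `run_pipeline`, and both step functions are hypothetical. Only the `source_ids` name and the bank statement scenario (transactions first, then merchant frequency and maximum amount) come from the text.

```python
from collections import Counter
from typing import Any, Dict, List


def run_pipeline(steps: List[Dict[str, Any]], document_text: str) -> Dict[str, Any]:
    """Run steps in order; a step with source_ids sees prior outputs, not the document."""
    outputs: Dict[str, Any] = {}
    for step in steps:
        source_ids = step.get("source_ids")
        if source_ids:
            missing = [sid for sid in source_ids if sid not in outputs]
            if missing:
                raise KeyError(f"step {step['id']!r} references unknown ids: {missing}")
            context: Any = {sid: outputs[sid] for sid in source_ids}
        else:
            context = document_text  # first-pass steps read the source document
        outputs[step["id"]] = step["run"](context)
    return outputs


def extract_transactions(text: str) -> List[Dict[str, Any]]:
    """Stand-in for an extraction step: parse 'merchant, amount' lines."""
    rows = []
    for line in text.strip().splitlines():
        merchant, amount = line.rsplit(",", 1)
        rows.append({"merchant": merchant.strip(), "amount": float(amount)})
    return rows


def summarize(context: Dict[str, Any]) -> Dict[str, Any]:
    """Second step: works only on the extracted transactions."""
    transactions = context["transactions"]
    counts = Counter(t["merchant"] for t in transactions)
    return {
        "most_frequent_merchant": counts.most_common(1)[0][0],
        "max_amount": max(t["amount"] for t in transactions),
    }


STEPS = [
    {"id": "transactions", "run": extract_transactions},
    {"id": "summary", "source_ids": ["transactions"], "run": summarize},
]

STATEMENT = "Coffee Shop, 4.50\nGrocer, 82.10\nCoffee Shop, 5.25\n"
result = run_pipeline(STEPS, STATEMENT)
print(result["summary"])
```

Because the second step receives only the named prior outputs, the original statement is never parsed twice, which mirrors the "without re-scanning" property described above.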

Competitive position

Sensible's comparison against Ocrolus reveals where it competes and where it does not. Sensible handles 150+ document types across insurance, healthcare, logistics, legal, and financial services with no geographic restriction. Ocrolus focuses on lending documents and North American formats, and adds fraud detection capabilities that Sensible does not replicate. Sensible explicitly acknowledges this gap, positioning itself for organizations that need breadth across regulated verticals rather than depth in high-risk lending fraud workflows.

On compliance, both hold SOC 2 Type II and HIPAA. Ocrolus additionally holds ISO 27001 and PCI DSS certifications that Sensible does not advertise, which matters for payment processing and international enterprise procurement.

The SenseML learning curve is a documented friction point. Parsli's analysis notes that extending beyond Sensible's 150+ pre-built configurations requires developer familiarity with SenseML syntax, while Parsli's own plain English schema builder targets business teams without that resource. Graip.AI's 2026 IDP market analysis identifies no-code and self-service automation as an increasingly standard competitive requirement. Sensible's visual editor partially addresses this, but SenseML depth remains a segmentation factor: Sensible targets organizations with developer resources and compliance requirements, while lower-cost alternatives target self-service business teams.

The same Graip.AI analysis reports that 95% of generative AI pilots in enterprises failed to deliver expected value (MIT Sloan Management Review, 2025), with the market consolidating around vendors demonstrating live workloads and measurable ROI. Sensible's published pricing, no-credit-card free tier, and named customer outcomes position it for this consolidation phase more directly than vendors requiring sales contact for pricing.

Teams processing high volumes of financial documents with outcome-based pricing requirements may also evaluate AmyGB, which invoices only after measurable business results are achieved. Teams requiring a self-hostable LLM platform for legal extraction may evaluate Unstract.

Use cases

Financial document processing

Fintech companies integrate Sensible's API to extract data from bank statements, tax forms, and financial reports. The per-document pricing model means a 50-page loan packet costs the same as a single-page form, which changes the economics for high-page-count financial documents relative to per-page competitors. The February 2026 Excel update extended coverage to spreadsheet-based financial data that previously required manual preprocessing, with billing and financial operations named explicitly as target verticals. Agentic workflows add a second layer: extract transaction data from a bank statement, then derive merchant frequency and spending patterns from the extracted data in a subsequent step.

Legal document processing

Law firms use Sensible's email-driven processing to automatically extract data from contract attachments and legal filings received by email, removing the upload step from otherwise automated pipelines. The agentic workflow layer handles complex legal document structures requiring multi-step reasoning, and semantic normalization maps contract clause language to defined categories for downstream contract management systems. HIPAA compliance covers sensitive legal data where health information intersects with legal proceedings. Vendors focused on self-hostable deployment for legal workflows, such as Unstract, offer a point of comparison for teams with data residency requirements.

Insurance and healthcare

The Marble Insurance outcome, 2,000 members and approximately $2MM in tracked premiums managed by one operations person, illustrates the staffing leverage Sensible enables in insurance operations. The platform handles insurance loss runs and multi-page policy documents outside Ocrolus's scope. ICD-10 semantic normalization in agentic workflows directly addresses healthcare extraction requirements where free-text clinical language must map to standardized codes before entering downstream systems.

Real estate automation

Real estate applications use Batch Extract to process thousands of property documents concurrently. The Zapier integration connects extracted property data to Salesforce and databases without custom API work. Real estate is named among the target verticals for the February 2026 Excel format expansion, reflecting the prevalence of spreadsheet-based property data in real estate operations. Vendors focused exclusively on real estate document workflows, such as Evana.ai, offer a point of comparison for teams evaluating property-specific document handling depth against Sensible's horizontal API approach.

Technical specifications

Feature: Specification
Core products: Extract API, Classification API, Visual Editor
Parsing methods: LLM parsing, SenseML deterministic rules, multimodal extraction, agentic workflows
Document types: PDF, Excel (.xlsx), email, TIFF, JPEG, PNG, Word
Excel handling: Wide-sheet rescaling via .xlsx metadata; multi-sheet pagination with sheet-title prepending; per-sheet routing via portfolios feature
LLM integration: gpt-3.5-turbo-0613; agentic chaining via source_ids parameter
Batch processing: Up to 5,000 documents concurrently
Pre-configured parsers: 150+ ready-to-use parsers for common document types
SDKs: JavaScript, Python, Node
Integration: REST API, webhooks, Zapier connector (beta)
Infrastructure: AWS, multi-region deployment
Compliance: SOC 2 Type II, HIPAA (Enterprise plan)
Pricing: Free: 100 documents/month; $499/month for 750 documents (~66¢/document)
Target verticals: Finance, legal, insurance, healthcare, real estate, manufacturing, retail, logistics

Resources

Company information

Headquarters: San Francisco, United States

Founded: 2020

Employees: 11

Funding: $8M seed round led by Craft Ventures, with Engineering Capital and Clocktower Technology Ventures participating.