LLM-first document processing platform with visual editing, multimodal extraction, and email-driven workflows for finance, legal, and real estate applications.

Sensible.so

Overview

Sensible provides a document processing API that combines LLM parsing with visual layout rules to extract data from PDFs, emails, and now Excel spreadsheets. Founded in San Francisco in 2020 with $8M in funding led by Craft Ventures, the company has evolved from a developer-focused extraction tool into an LLM-first platform targeting operations-heavy enterprise workflows.

Through early 2026, Sensible closed two meaningful format and workflow gaps. In January, it launched Visual Document Extraction with natural language editing - users describe fields in plain English rather than writing extraction rules - alongside a Multimodal Engine for embedded images, charts, and handwriting, and email-driven processing that handles attachments by forwarding emails directly to Sensible. In February, it added native Excel (.xlsx) support across all API endpoints and the Sensible app, addressing a persistent gap in PDF-first platforms where operational data living in spreadsheets required manual preprocessing. The implementation handles two structural edge cases specific to Excel: wide sheets are rescaled using .xlsx metadata to preserve all columns on a single page, and multi-sheet workbooks are split into per-sheet pages with sheet titles prepended, enabling each sheet to route through the existing portfolios feature as an independent document.

The company's hybrid approach combines traditional layout-based methods with LLM capabilities. Third-party integrations, including a Raycast extension and a Zapier connector, indicate growing developer adoption in a PDF data extraction market projected at $2.0 billion in 2025 and growing at a 13.6% CAGR. For teams evaluating open-source alternatives that also use LLMs for structured extraction, LangExtract, Google's Python library, offers a comparable zero-shot approach.

How Sensible.so Processes Documents

Sensible's extraction pipeline combines three methods: LLM-based parsing for semantic field identification, visual layout rules for position-dependent data, and multimodal analysis for non-text content. Users configure parsers through a visual cards interface using plain English descriptions - "capture middle names" - rather than code, with the platform generating the underlying extraction logic automatically.
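
As a rough illustration of how plain-English field descriptions could be turned into extraction instructions, the sketch below collects "cards" and renders them into a single prompt. The `FieldCard` class, the `build_instructions` function, and the prompt wording are all hypothetical; they are not Sensible's configuration format or API, only a minimal model of the idea.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FieldCard:
    """Hypothetical card: an output field name plus a plain-English description."""
    field_id: str
    description: str


def build_instructions(cards: List[FieldCard]) -> str:
    """Render the cards as one instruction block an LLM-based parser could consume."""
    lines = ["Extract the following fields and return them as JSON:"]
    for card in cards:
        lines.append(f"- {card.field_id}: {card.description}")
    return "\n".join(lines)


# Usage: a single card built from the description quoted above.
cards = [FieldCard("middle_name", "capture middle names")]
prompt = build_instructions(cards)
```

The point of the design is that the user edits only the description text, while the structure that wraps it is generated.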

Documents enter through REST API endpoints, direct upload in the Sensible app, or email forwarding for attachment-driven workflows. The Classification API routes incoming documents to the appropriate parser before extraction begins. For high-volume scenarios, Batch Extract processes up to 5,000 documents concurrently. Agentic LLM workflows handle complex documents requiring multi-step reasoning across pages or sections.
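
The classify-then-extract flow and the batch ceiling can be sketched in a few lines. This is an illustrative model only: `classify` is a hypothetical filename-based stand-in for a classification call, the document type names are invented, and nothing here calls Sensible's API. Only the 5,000-document limit comes from the description above.

```python
from typing import Dict, Iterable, Iterator, List

MAX_BATCH = 5_000  # concurrency ceiling quoted above for Batch Extract


def classify(filename: str) -> str:
    """Hypothetical classifier: guess a document type from the file name."""
    name = filename.lower()
    if "statement" in name:
        return "bank_statement"
    if "lease" in name:
        return "lease"
    return "unknown"


def route(filenames: Iterable[str]) -> Dict[str, List[str]]:
    """Group documents by the parser (document type) that should handle them."""
    routed: Dict[str, List[str]] = {}
    for filename in filenames:
        routed.setdefault(classify(filename), []).append(filename)
    return routed


def batches(items: List[str], size: int = MAX_BATCH) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Usage: route three files, then split 12,001 documents into submit-sized batches.
routed = route(["jan_statement.pdf", "lease_12.pdf", "memo.pdf"])
batch_sizes = [len(b) for b in batches([str(i) for i in range(12_001)])]
```

Routing before extraction means each batch contains a single document type, so one parser configuration applies to every document in it.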

Format handling covers PDF, email, Excel (.xlsx), TIFF, JPEG, PNG, and Word. Excel files receive format-specific preprocessing: wide sheets are rescaled via .xlsx metadata to prevent column truncation, and multi-sheet workbooks are paginated with sheet titles inserted at the top of each page. Extracted results pass through optional human review before delivery via webhook or SDK.
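
The two Excel preprocessing steps can be modelled with a small sketch. It assumes the workbook has already been read into a mapping of sheet title to rows of cell strings. The function names and the tab-separated page format are illustrative choices, not Sensible's implementation.

```python
from typing import Dict, List

Row = List[str]


def paginate_workbook(workbook: Dict[str, List[Row]]) -> List[str]:
    """Turn each sheet into one text page with the sheet title on its first line."""
    pages = []
    for title, rows in workbook.items():
        lines = [title] + ["\t".join(row) for row in rows]
        pages.append("\n".join(lines))
    return pages


def fit_scale(sheet_width: float, page_width: float) -> float:
    """Scale factor that shrinks a wide sheet to the page width; never enlarges."""
    if sheet_width <= 0 or page_width <= 0:
        raise ValueError("widths must be positive")
    return min(1.0, page_width / sheet_width)


# Usage: a two-sheet workbook becomes two independent pages.
workbook = {
    "Rent Roll": [["Unit", "Rent"], ["1A", "1200"]],
    "Notes": [["Paid in full"]],
}
pages = paginate_workbook(workbook)
scale = fit_scale(sheet_width=1224, page_width=612)
```

Because every page starts with its sheet title, a downstream classifier has a stable signal for treating each sheet as a separate document.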

The platform migrated from text-davinci-003 to gpt-3.5-turbo-0613 in 2025 and added agentic workflow capabilities in December 2025. The architecture is designed to absorb such model upgrades without requiring parser reconfiguration.

Use Cases

Financial Document Processing

Fintech companies integrate Sensible's API to extract data from bank statements, tax forms, financial reports, and - following the February 2026 Excel update - spreadsheet-based financial data that previously required manual handling. The Visual Document Extraction feature allows finance teams to create custom parsers using natural language, while the Multimodal Engine processes embedded charts and signatures. Excel's role in billing and financial operations makes the native .xlsx support particularly relevant here: existing extraction configurations can be reused across sheet-level documents without modification. Teams processing high volumes of financial documents with outcome-based pricing requirements may also evaluate AmyGB, which invoices only after measurable business results are achieved.

Legal Document Processing

Law firms use Sensible's email-driven processing to automatically extract data from contract attachments and legal filings received by email, eliminating the upload step from otherwise automated pipelines. The platform's agentic AI workflows handle complex legal document structures requiring multi-step reasoning. HIPAA compliance covers sensitive legal data where health information intersects with legal proceedings. Teams requiring a no-code LLM platform with hallucination mitigation for similar legal extraction workflows may also evaluate Unstract as a self-hostable alternative.

Real Estate Automation

Real estate applications use Batch Extract to process thousands of property documents concurrently. The Zapier integration enables automated workflows that connect extracted property data to Salesforce and databases. Real estate is explicitly named among the target verticals for the February 2026 Excel format expansion, reflecting the prevalence of spreadsheet-based property data in real estate operations. Vendors focused exclusively on real estate document workflows, such as Evana.ai, offer a point of comparison for teams weighing depth of property-specific document handling against Sensible's horizontal API approach.

Technical Specifications

Feature: Specification
Core Products: Extract API, Classification API, Visual Editor
Parsing Methods: LLM parsing, visual layout rules, multimodal extraction, agentic workflows
Document Types: PDF, Excel (.xlsx), email, TIFF, JPEG, PNG, Word
Excel Handling: Wide-sheet rescaling via .xlsx metadata; multi-sheet pagination with sheet-title prepending; per-sheet routing via portfolios feature
LLM Integration: gpt-3.5-turbo-0613, agentic LLM workflows
Batch Processing: Up to 5,000 documents concurrently
Pre-Configured Parsers: 150+ ready-to-use parsers for common document types
SDKs: JavaScript, Python, Node SDK
Integration: REST API, webhooks, Zapier connector (beta)
Infrastructure: AWS, multi-region deployment
Compliance: SOC 2, HIPAA
Target Verticals: Finance, legal, real estate, manufacturing, retail, billing

Company Information

Headquarters: San Francisco, United States

Founded: 2020

Employees: 11

Funding: $8M (Seed round led by Craft Ventures, with Engineering Capital and Clocktower Technology Ventures)