Nanonets
AI-powered document processing platform offering OCR, workflow automation, and open-source tools for enterprise document operations.

Overview
Nanonets, founded in 2017 by Sarthak Jain and Prathamesh Juvatkar and headquartered in San Francisco, develops cloud-based document processing software that combines OCR, machine learning, and workflow automation. The company raised $29 million in Series B funding in 2024, led by Accel, and serves more than 1,000 enterprises, including 34% of Global Fortune 500 companies.
In August 2025, Nanonets made a strategic shift toward open source with the release of DocStrange, an MIT-licensed Python library featuring a 7B-parameter model. This was the company's first major open-source release, and it offers both cloud API processing, with 10,000 free documents monthly, and fully local processing. The hybrid approach addresses enterprise privacy concerns directly while competing against cloud-only providers like Rossum.
By late 2025, Nanonets ranked fifth among eight leading platforms in an IBTimes analysis of OCR software for the finance sector, which recognized it as a mid-market solution, and was included in TechTarget's roundup of notable low-code/no-code AI tools. Nanonets also appears in Gartner's Intelligent Document Processing Solutions directory alongside 19 competitors, including Google Document AI, ABBYY, and Rossum. This is a standing listing, updated December 2025, and no Magic Quadrant positioning or peer review scores accompany it. In early 2026, the company expanded its educational content strategy with the LLM Structured Outputs Handbook, positioning itself as both technology provider and educational resource.
A product direction worth watching: Nanonets has introduced AI Guidelines for its instant learning models, which let users write plain-language instructions that govern extraction edge cases without model retraining or reconfiguration. Examples include vendor-specific rules, jurisdiction-based tax logic such as VAT handling by region, and page-level filtering to isolate invoice pages from multi-page documents. The feature targets the long tail of document variation that structured rules engines handle poorly, a gap that several competitors, including Affinda and Rossum, are also pursuing through natural-language configuration layers. A publication date for this release has not been confirmed from available sources.
How Nanonets Processes Documents
Nanonets IDP uses AI-powered OCR to extract text from images and scanned documents, claiming up to 99% accuracy and scoring 64.5 ± 1.1 on olmOCR-bench. The DocStrange open-source library provides a 7B-parameter model with local and cloud processing options, enabling template-free processing: data is captured automatically, with no pre-configured templates. A no-code interface allows configuration without technical expertise, while vision language models handle document-to-Markdown conversion. Multi-channel capture accepts documents via email, mobile photo, scan, and API, and pre-built connectors for NetSuite, QuickBooks, Sage, SAP, and Salesforce provide ERP/CRM integration.
The AI Guidelines feature extends this template-free approach: rather than configuring a rules engine, users express extraction logic in natural language. Practical applications include applying different VAT handling rules by jurisdiction, extracting only invoice pages from mixed multi-page documents, and handling vendor-specific formatting exceptions, all without changing the underlying model.
Use Cases
Accounts Payable Automation
Finance teams automate invoice processing from capture through payment: invoices are three-way matched against purchase orders and receiving documents, then routed through approval workflows based on amount thresholds.
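To make the matching and routing logic concrete, the following is a minimal Python sketch of a three-way match followed by threshold-based routing. It is an illustration under stated assumptions, not Nanonets code: the `Line` record, the function names, the price tolerance, and the approval thresholds are all hypothetical, and a real deployment would take these values from extracted document fields and the organization's own policy.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Line:
    """One line item as it might appear on an invoice, PO, or receiving document (hypothetical shape)."""
    sku: str
    quantity: int
    unit_price: Decimal


def three_way_match(invoice, purchase_order, receipt, price_tolerance=Decimal("0.01")):
    """Compare invoice lines against the PO and the receiving document.

    Returns a list of discrepancy messages; an empty list means the invoice matches.
    """
    po_by_sku = {line.sku: line for line in purchase_order}
    received_qty = {line.sku: line.quantity for line in receipt}
    issues = []
    for line in invoice:
        po_line = po_by_sku.get(line.sku)
        if po_line is None:
            issues.append(f"{line.sku}: not on purchase order")
            continue
        if abs(line.unit_price - po_line.unit_price) > price_tolerance:
            issues.append(f"{line.sku}: unit price differs from purchase order")
        if line.quantity > received_qty.get(line.sku, 0):
            issues.append(f"{line.sku}: billed quantity exceeds received quantity")
    return issues


def route_for_approval(total, issues):
    """Pick an approval queue. The thresholds here are assumed values for illustration."""
    if issues:
        return "exception_queue"
    if total < Decimal("1000"):
        return "auto_approve"
    if total < Decimal("25000"):
        return "manager"
    return "finance_director"


# Example: 10 units billed, but only 8 were received.
po = [Line("A-100", 10, Decimal("12.50"))]
goods_received = [Line("A-100", 8, Decimal("12.50"))]
bill = [Line("A-100", 10, Decimal("12.50"))]

problems = three_way_match(bill, po, goods_received)
print(problems)
print(route_for_approval(Decimal("125.00"), problems))
```

Any discrepancy sends the invoice to an exception queue regardless of amount, so the amount thresholds apply only to invoices that match cleanly.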
Healthcare Document Processing
Healthcare providers process patient intake forms, insurance cards, and medical records with HIPAA-compliant encrypted data handling, and populate EHR systems directly with the extracted data.
Expense Management
Organizations automate expense reimbursement through mobile receipt capture, extracting merchant details and categorizing expenses while flagging policy violations.
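The categorize-then-flag step described above can be sketched in a few lines of Python. This is a hedged illustration only: the `Receipt` record, the keyword map, and the per-category limits are hypothetical stand-ins for whatever fields the extraction step returns and whatever policy an organization defines.

```python
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical per-category spending limits; real policies vary by organization.
CATEGORY_LIMITS = {
    "meals": Decimal("75"),
    "lodging": Decimal("300"),
    "transport": Decimal("150"),
}

# Hypothetical keyword map used to categorize a receipt by merchant name.
MERCHANT_KEYWORDS = {
    "hotel": "lodging",
    "cafe": "meals",
    "restaurant": "meals",
    "taxi": "transport",
    "airlines": "transport",
}


@dataclass(frozen=True)
class Receipt:
    """Fields assumed to come out of receipt extraction."""
    merchant: str
    amount: Decimal


def categorize(receipt):
    """Return the category of the first whole word in the merchant name that is a known keyword."""
    for word in receipt.merchant.lower().split():
        if word in MERCHANT_KEYWORDS:
            return MERCHANT_KEYWORDS[word]
    return "uncategorized"


def policy_violations(receipt):
    """Return a list of policy violation messages; empty means the receipt passes."""
    category = categorize(receipt)
    if category == "uncategorized":
        return ["merchant could not be categorized; needs manual review"]
    limit = CATEGORY_LIMITS[category]
    if receipt.amount > limit:
        return [f"{category} amount {receipt.amount} exceeds limit {limit}"]
    return []


print(policy_violations(Receipt("Harbor Hotel", Decimal("420.00"))))
print(policy_violations(Receipt("Blue Cafe", Decimal("18.40"))))
```

Matching on whole words rather than substrings avoids accidental hits inside longer merchant names. A production system would more likely rely on merchant category codes or a trained classifier than on a keyword list.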
Technical Specifications
| Feature | Specification |
|---|---|
| Deployment | Cloud-based SaaS, Local processing (DocStrange) |
| OCR Accuracy | Up to 99% (claimed), 64.5 ± 1.1 on olmOCR-bench |
| Open-Source Model | 7B parameter DocStrange library (MIT license) |
| API | RESTful API with webhooks |
| Document Formats | PDF, JPEG, PNG, TIFF, BMP, Office formats |
| Security | SOC 2, GDPR, HIPAA compliant |
| Pricing | Free startup tier, 10,000 free docs/month (DocStrange) |
| AI Guidelines | Plain-language extraction rules for instant learning models; no retraining required |
Resources
- Website
- DocStrange Open-Source Library
- OCR API
- LLM Structured Outputs Handbook
- Documentation
- Nanonets-OCR-s Model
- AI Guidelines Documentation
- Gartner IDP Solutions Directory
- Nanonets: Competitive Analysis
Company Information
Headquarters: San Francisco, California, United States
Founded: 2017
Founders: Sarthak Jain, Prathamesh Juvatkar
Funding: $29 million Series B (2024, Accel lead)
Address: 156 2nd Street, San Francisco, CA 94105
Phone: +1 650 382 8676
Email: info@nanonets.com
Customers: 1,000+ enterprises, 34% of Global Fortune 500
Compliance: SOC 2, GDPR, HIPAA
Sources:
- 2025-08 [vendor: DocStrange release | github.com] MIT-licensed 7B parameter open-source library (https://github.com/NanoNets/docstrange)
- 2025-11 [third_party: finance OCR ranking | ibtimes.com] Fifth among eight platforms in finance sector analysis (https://www.ibtimes.com/best-ocr-data-extraction-software-finance-accounting-teams-2025-3782986)
- 2025-11 [third_party: low-code AI tools roundup | techtarget.com] Included in TechTarget's notable low-code/no-code AI tools (https://www.techtarget.com/searchenterpriseai/tip/Low-code-no-code-tools-simplify-AI-customization-for-engineers)
- 2025-12 [third_party: vendor directory | gartner.com] Listed in Gartner IDP Solutions directory, updated December 2025 (https://www.gartner.com/reviews/market/intelligent-document-processing-solutions)
- 2026-01 [vendor: educational content | nanonets.com] LLM Structured Outputs Handbook published (https://nanonets.com/cookbooks/structured-llm-outputs/)
- 2026-02 [vendor: product changelog | changelog.nanonets.com] AI Guidelines for instant learning models, date unconfirmed (https://changelog.nanonets.com/)
- 2026-02 [vendor: AI Guidelines docs | docs.nanonets.com] Feature documentation (https://docs.nanonets.com/update/docs/ai-agent-guidelines)