PSPDFKit (Now Nutrient) — PDF SDK and Document AI Platform
On This Page
- Overview
- How PSPDFKit (Now Nutrient) processes documents
- Agentic document editing agent
- MCP servers for agentic AI stacks
- Python SDK
- Nutrient Workflow platform
- Use cases
- Legal and contracts
- Financial services
- Procurement and supply chain
- Healthcare and government
- Developer tooling and AI infrastructure
- Technical specifications
- Resources
- Company information
PSPDFKit is now Nutrient
PSPDFKit rebranded to Nutrient in 2024. See the Nutrient vendor page for current product information, features, and capabilities.
Overview
Founded in 2011 by Peter Steinberger, PSPDFKit built its reputation on high-fidelity PDF rendering, annotation, and editing SDKs across iOS, Android, and JavaScript platforms. As one of the leading document processing vendors, the company expanded through acquisitions of Muhimbi, ORPALIS, Aquaforest, and Integrify following a €100M investment from Insight Partners in 2021. In 2024, those products were unified under the Nutrient brand, marking the company's evolution from a PDF SDK into a broader document processing platform.
By March 2026, Nutrient had repositioned further still. The company launched an agentic document editing agent embedded directly in its Viewer SDK, published three open-source Model Context Protocol (MCP) servers, and released a production-ready Python SDK. The trajectory is deliberate: Nutrient is building document processing infrastructure for agentic AI stacks, not just a PDF tool.
Jonathan Rhyne, co-founder and CEO, framed the shift in the March 25, 2026 press release: "Every enterprise we talk to is trying to solve the same problem: they need AI that works with their documents, not just reads them. But they also need it to run inside their own application, under their own rules."
How PSPDFKit (Now Nutrient) processes documents
PSPDFKit's core processing architecture separates rendering and SDK capabilities from server-side document transformation. The Document Engine handles splitting, merging, watermarking, form filling, redaction, and annotation at the server layer, while client SDKs (iOS, Android, JavaScript) handle viewing, annotation, and editing at the application layer.
As of March 2026, that architecture has a third layer: an agentic reasoning system that plans, executes, and adapts across multi-step document tasks using purpose-built tools for rendering, structure-aware extraction, form operations, annotation, and redaction.
Agentic document editing agent
Nutrient's AI Assistant expansion, announced March 25, 2026, introduced an agentic document editing agent that operates embedded within the Viewer SDK across Web, iOS, Android, and React Native platforms. The editing agent is available now in the Web Viewer SDK; mobile and hybrid platform support arrives later in 2026.
The agent's architecture differs from standard retrieval-augmented generation (RAG) approaches. As Nutrient's technical blog explains, the agent has access to full page renders and low-level document metadata, not just extracted text, enabling it to reason through edge cases, ambiguity, and multi-step workflows that would otherwise require custom code. Edits are staged for review and applied permanently only after approval, maintaining human control of final output.
Policy-governed autonomy is the key differentiator from binary autonomous/manual agent models. Organizations configure three tiers: autonomous actions the agent executes without confirmation, confirmation-required actions that pause for human review, and prohibited actions the agent cannot take. Custom skills, defined as reusable task definitions encoding domain-specific logic, let organizations encode their own governance rules into the agent's behavior.
The agent connects to the organization's choice of LLM provider: OpenAI, Azure OpenAI, AWS Bedrock, Anthropic Claude, self-hosted models, or any OpenAI-compatible API including vLLM, Hugging Face TGI, and Ollama. This keeps inference costs and data residency under enterprise control rather than routing through Nutrient's infrastructure.
SD Times noted that "a reasoning layer plans work, selects tools, and carries multi-step document tasks through to completion using the viewer's high-fidelity rendering and document structure." Planned enhancements include workflow templates for common business processes, deeper visual understanding for complex document layouts, and observability tools for agent decision transparency.
MCP servers for agentic AI stacks
In early 2026, Nutrient published three MIT-licensed MCP servers that expose document processing capabilities to AI agents via natural language. The three servers address distinct deployment contexts rather than overlapping.
The Document Engine MCP Server (@nutrient-sdk/document-engine-mcp-server) targets self-hosted deployments. It wraps the Document Engine and exposes four capability areas to AI agents: content access (text extraction, page rendering, form and table extraction), document editing (splitting, watermarking, form filling), security and redaction (SSN and credit card presets plus custom regex), and annotations (highlights, notes, stamps). The server ships with four reference workflow implementations: a LangGraph contract processing pipeline, OpenAI Agents SDK equivalents, a React plus LangGraph document chat interface, and a LangGraph.js procurement agent that classifies purchase orders, invoices, and payment documents and flags items for human review. The Document Engine requires a commercial ACTIVATION_KEY; evaluation mode adds watermarks.
The DWS MCP Server (@nutrient-sdk/dws-mcp-server) connects AI agents to Nutrient's cloud Document Web Services Processor API via stdio transport, requiring only Node.js 18+ and a NUTRIENT_DWS_API_KEY. Four tools are exposed: document_processor (merge, convert, OCR, watermark, rotate, redact, flatten, extract text, tables, and key-value pairs), document_signer (PAdES-compliant CMS and CAdES digital signatures, visible or invisible), sandbox_file_tree, and directory_tree. Format support spans PDF, DOCX, XLSX, PPTX, PNG, JPEG, WebP, PDF/A, PDF/UA, HTML, and Markdown. Supported platforms include Claude Desktop, Cursor, Windsurf, VS Code with GitHub Copilot, LangChain, LangGraph, and the OpenAI Agents SDK. A free tier is available at dashboard.nutrient.io.
The PDF MCP Server (nutrient-pdf-mcp-server, PyPI: nutrient_pdf_mcp-1.0.0) is explicitly a developer debugging tool, not a production workflow server. It exposes two tools: get_pdf_object_tree (returns a JSON representation of the PDF object tree with lazy loading) and resolve_indirect_object (resolves a specific indirect object by object and generation number). Token efficiency is concrete: lazy mode returns 5-50 lines, shallow resolution 50-100 lines, and deep resolution 500+ lines. Navigation uses dot notation (e.g., Pages.Kids.0.MediaBox.2). Nutrient routes developers across all three servers by use case: this server for debugging, DWS MCP for cloud production workflows, Document Engine MCP for self-hosted production workflows.
Python SDK
The Python SDK launched in February 2026 covers bidirectional document conversion (PDF, Word, Excel, PowerPoint, HTML, Markdown, images), optical character recognition (OCR) across 100+ languages with automated skew correction and noise removal, zone-based permanent redaction claimed GDPR and HIPAA compliant (removing content from file structure rather than applying a visual overlay), key-value pair extraction for invoices, receipts, bank statements, and forms exported to JSON, digital signatures, fillable form creation and batch-fill from databases, and Word template population via JSON. The SDK includes type hints, async support, and compatibility with Django, Flask, and FastAPI. Batch processing scales linearly with CPU cores.
Two capabilities are on the roadmap but not yet available: page-aware architecture enabling true concurrent processing of individual page objects, and AI-powered document understanding returning full document structure as JSON including relationships and hierarchies. A free trial is available at nutrient.io/sdk/try/.
The SDK addresses format preservation failures, library fragmentation, and batch scaling limitations that developers currently solve by stitching together multiple libraries like PyMuPDF and pdfplumber. Developers evaluating Python-based alternatives can find a broader comparison in the Python PDF libraries guide.
Nutrient Workflow platform
Nutrient Workflow is a low-code workflow automation platform combining document processing, AI-powered data extraction, and approval routing. The platform supports AI-powered document data extraction without templates, auto-populating forms with names, dates, and totals from unstructured files. Deployment models include cloud, private cloud, self-managed on-premises, and hybrid options.
Published case studies document concrete outcomes: automation of multilevel asset acquisitions across six Latin American offices handling 236 requests; capital finance approvals with SAP integration and SOX compliance reducing approval cycles from weeks to a single day for 150 employees; and academic workflows reducing approval cycles from weeks to days. Integrations cover SharePoint, Exchange, SAP, SQL Server, Deltek, Salesforce, HubSpot, Email, SFTP, Power Automate, Zapier, and AWS Lambda.
Use cases
Legal and contracts
PDF annotation, redaction, and form filling SDKs are embedded in legal document review workflows. The Document Engine MCP Server ships with a LangGraph contract processing reference implementation, enabling AI agents to extract, classify, and route contract documents autonomously. The agentic editing agent's three-tier approval model addresses the compliance requirement that final document changes remain under human control before execution.
Financial services
Key-value pair extraction via the Python SDK targets invoices, receipts, bank statements, and forms, exporting results to JSON. The DWS MCP Server exposes the same extraction capabilities to AI agents via cloud API. Redaction presets cover SSN and credit card patterns out of the box, with custom regex for additional personally identifiable information (PII) types. Tiny IDP takes a complementary approach in this space, using GPT-4o and Claude Sonnet 3.7 to convert similar financial documents directly to structured JSON via API.
Procurement and supply chain
The Document Engine MCP Server's LangGraph.js procurement agent reference implementation classifies purchase orders, invoices, and payment documents and flags items for human review, serving as a concrete workflow template for procurement automation teams. Unstract is an open-source alternative in this space, offering no-code LLM-based extraction with hallucination mitigation for similar document types.
Healthcare and government
Nutrient positions the agentic system for healthcare (patient demographics extraction, PHI redaction) and government (personal identifier redaction in public records). The HIPAA certification and zone-based permanent redaction that removes content from file structure rather than overlaying it are the relevant technical controls for these verticals.
Developer tooling and AI infrastructure
The three-tier MCP architecture and Python SDK together position Nutrient as document processing infrastructure for agentic AI stacks. Developers building on Claude Desktop, LangGraph, OpenAI Agents SDK, Cursor, Windsurf, or VS Code with GitHub Copilot can connect to Nutrient's document engine without custom integration work. LangExtract, Google's open-source Python library for structured extraction from unstructured text using LLMs, addresses a complementary layer of this same developer stack. Teams evaluating self-hosted alternatives for sensitive document workflows may also consider Captova, which targets government and defense deployments with on-premises processing at claimed speeds exceeding 100 pages per second.
Technical specifications
| Attribute | Detail |
|---|---|
| Deployment | Client SDK (iOS, Android, JavaScript), Server component (Document Engine), Web component, Cloud API (DWS), Embedded Viewer SDK |
| MCP Servers | Document Engine MCP (self-hosted), DWS MCP (cloud), PDF MCP (debugging) |
| MCP Install | npx (Node.js servers), pip (Python PDF MCP server) |
| AI Assistant | Agentic document editing agent; available now in Web Viewer SDK; mobile/hybrid later in 2026 |
| LLM Providers | OpenAI, Azure OpenAI, AWS Bedrock, Anthropic Claude, self-hosted, vLLM, Hugging Face TGI, Ollama |
| Approval Policies | Three-tier: autonomous, confirmation-required, prohibited |
| Python SDK | GA with OCR, redaction, conversion, digital signatures, key-value extraction, form fill, template population |
| Python SDK Roadmap | Page-aware parallel processing (not yet available), AI-powered document understanding (not yet available) |
| OCR Languages | 100+ (Python SDK) |
| Format Support | PDF, DOCX, XLSX, PPTX, PNG, JPEG, WebP, PDF/A, PDF/UA, HTML, Markdown |
| Digital Signatures | PAdES-compliant CMS and CAdES (visible and invisible) |
| Redaction | Zone-based permanent redaction; SSN and credit card presets; custom regex; claimed GDPR and HIPAA compliant |
| Agentic Platforms | Claude Desktop, LangGraph, LangGraph.js, OpenAI Agents SDK, LangChain, Cursor, Windsurf, VS Code + GitHub Copilot |
| Workflow Integrations | SharePoint, Exchange, SAP, SQL Server, Deltek, Salesforce, HubSpot, SFTP, Power Automate, Zapier, AWS Lambda |
| Certifications | HIPAA |
| Licensing | MIT (all three MCP servers) |
| Open Source | Yes (MCP servers) |
| Cloud API Free Tier | Available at dashboard.nutrient.io |
| Python SDK Free Trial | Available at nutrient.io/sdk/try/ |
| Document Engine Evaluation | Watermarked output without commercial ACTIVATION_KEY |
Resources
- Current Nutrient vendor page with product information
- Company website (formerly pspdfkit.com)
- AI Assistant expansion announcement covering agentic document editing agent launch
- Agentic document editing technical overview for Web SDK implementation details
- DWS MCP Server for cloud API agentic document processing (MIT)
- Python SDK announcement covering GA release with OCR, redaction, conversion, and extraction
- Nutrient Workflow platform for low-code process automation capabilities and case studies
- Document processing with Python for Python library comparison and implementation
- Self-hosted document processing for deployment architecture and security considerations
- Agentic document processing guide covering how AI agents transform document workflows
Company information
Vienna, Austria. Founded 2011 by Peter Steinberger.
PSPDFKit rebranded to Nutrient in 2024 following a €100M Insight Partners investment in 2021 and acquisitions of Muhimbi, ORPALIS, Aquaforest, and Integrify. Co-founder and CEO Jonathan Rhyne leads the company. See the Nutrient vendor page for current company information.