PSPDFKit (Now Nutrient): PDF SDK and Document AI Platform

On This Page

Overview
How PSPDFKit (Now Nutrient) processes documents
Agentic document editing agent
MCP servers for agentic AI stacks
Python SDK
Nutrient Workflow platform
Use cases
Legal and contracts
Financial services
Procurement and supply chain
Healthcare and government
Developer tooling and AI infrastructure
Technical specifications
Resources
Company information

PSPDFKit is now Nutrient

PSPDFKit rebranded to Nutrient in 2024. See the Nutrient vendor page for current product information, features, and capabilities.

Overview

Founded in 2011 by Peter Steinberger, PSPDFKit built its reputation on high-fidelity PDF rendering, annotation, and editing SDKs across iOS, Android, and JavaScript platforms. As one of the leading document processing vendors, the company expanded through acquisitions of Muhimbi, ORPALIS, Aquaforest, and Integrify following a €100M investment from Insight Partners in 2021. In 2024, those products were unified under the Nutrient brand, marking the company's evolution from a PDF SDK into a broader document processing platform.

By March 2026, Nutrient had repositioned further still. The company launched an agentic document editing agent embedded directly in its Viewer SDK, published three open-source Model Context Protocol (MCP) servers, and released a production-ready Python SDK. The trajectory is deliberate: Nutrient is building document processing infrastructure for agentic AI stacks, not just a PDF tool.

Jonathan Rhyne, co-founder and CEO, framed the shift in the March 25, 2026 press release: "Every enterprise we talk to is trying to solve the same problem: they need AI that works with their documents, not just reads them. But they also need it to run inside their own application, under their own rules."

How PSPDFKit (Now Nutrient) processes documents

PSPDFKit's core processing architecture separates rendering and SDK capabilities from server-side document transformation. The Document Engine handles splitting, merging, watermarking, form filling, redaction, and annotation at the server layer, while client SDKs (iOS, Android, JavaScript) handle viewing, annotation, and editing at the application layer.

As of March 2026, that architecture has a third layer: an agentic reasoning system that plans, executes, and adapts across multi-step document tasks using purpose-built tools for rendering, structure-aware extraction, form operations, annotation, and redaction.

Agentic document editing agent

Nutrient's AI Assistant expansion, announced March 25, 2026, introduced an agentic document editing agent that operates embedded within the Viewer SDK across Web, iOS, Android, and React Native platforms. The editing agent is available now in the Web Viewer SDK; mobile and hybrid platform support arrives later in 2026.

The agent's architecture differs from standard retrieval-augmented generation (RAG) approaches. As Nutrient's technical blog explains, the agent has access to full page renders and low-level document metadata, not just extracted text, enabling it to reason through edge cases, ambiguity, and multi-step workflows that would otherwise require custom code. Edits are staged for review and applied permanently only after approval, maintaining human control of final output.

Policy-governed autonomy is the key differentiator from binary autonomous/manual agent models. Organizations configure three tiers: autonomous actions the agent executes without confirmation, confirmation-required actions that pause for human review, and prohibited actions the agent cannot take. Custom skills, defined as reusable task definitions encoding domain-specific logic, let organizations encode their own governance rules into the agent's behavior.

The agent connects to the organization's choice of LLM provider: OpenAI, Azure OpenAI, AWS Bedrock, Anthropic Claude, self-hosted models, or any OpenAI-compatible API including vLLM, Hugging Face TGI, and Ollama. This keeps inference costs and data residency under enterprise control rather than routing through Nutrient's infrastructure.

SD Times noted that "a reasoning layer plans work, selects tools, and carries multi-step document tasks through to completion using the viewer's high-fidelity rendering and document structure." Planned enhancements include workflow templates for common business processes, deeper visual understanding for complex document layouts, and observability tools for agent decision transparency.

MCP servers for agentic AI stacks

In early 2026, Nutrient published three MIT-licensed MCP servers that expose document processing capabilities to AI agents via natural language. The three servers address distinct deployment contexts rather than overlapping.

The Document Engine MCP Server (@nutrient-sdk/document-engine-mcp-server) targets self-hosted deployments. It wraps the Document Engine and exposes four capability areas to AI agents: content access (text extraction, page rendering, form and table extraction), document editing (splitting, watermarking, form filling), security and redaction (SSN and credit card presets plus custom regex), and annotations (highlights, notes, stamps). The server ships with four reference workflow implementations: a LangGraph contract processing pipeline, OpenAI Agents SDK equivalents, a React plus LangGraph document chat interface, and a LangGraph.js procurement agent that classifies purchase orders, invoices, and payment documents and flags items for human review. The Document Engine requires a commercial ACTIVATION_KEY; evaluation mode adds watermarks.

The DWS MCP Server (@nutrient-sdk/dws-mcp-server) connects AI agents to Nutrient's cloud Document Web Services Processor API via stdio transport, requiring only Node.js 18+ and a NUTRIENT_DWS_API_KEY. Four tools are exposed: document_processor (merge, convert, OCR, watermark, rotate, redact, flatten, extract text, tables, and key-value pairs), document_signer (PAdES-compliant CMS and CAdES digital signatures, visible or invisible), sandbox_file_tree, and directory_tree. Format support spans PDF, DOCX, XLSX, PPTX, PNG, JPEG, WebP, PDF/A, PDF/UA, HTML, and Markdown. Supported platforms include Claude Desktop, Cursor, Windsurf, VS Code with GitHub Copilot, LangChain, LangGraph, and the OpenAI Agents SDK. A free tier is available at dashboard.nutrient.io.

The PDF MCP Server (nutrient-pdf-mcp-server, PyPI: nutrient_pdf_mcp-1.0.0) is explicitly a developer debugging tool, not a production workflow server. It exposes two tools: get_pdf_object_tree (returns a JSON representation of the PDF object tree with lazy loading) and resolve_indirect_object (resolves a specific indirect object by object and generation number). Token efficiency is concrete: lazy mode returns 5-50 lines, shallow resolution 50-100 lines, and deep resolution 500+ lines. Navigation uses dot notation (e.g., Pages.Kids.0.MediaBox.2). Nutrient routes developers across all three servers by use case: this server for debugging, DWS MCP for cloud production workflows, Document Engine MCP for self-hosted production workflows.

Python SDK

The Python SDK launched in February 2026 covers bidirectional document conversion (PDF, Word, Excel, PowerPoint, HTML, Markdown, images), optical character recognition (OCR) across 100+ languages with automated skew correction and noise removal, zone-based permanent redaction claimed GDPR and HIPAA compliant (removing content from file structure rather than applying a visual overlay), key-value pair extraction for invoices, receipts, bank statements, and forms exported to JSON, digital signatures, fillable form creation and batch-fill from databases, and Word template population via JSON. The SDK includes type hints, async support, and compatibility with Django, Flask, and FastAPI. Batch processing scales linearly with CPU cores.

Two capabilities are on the roadmap but not yet available: page-aware architecture enabling true concurrent processing of individual page objects, and AI-powered document understanding returning full document structure as JSON including relationships and hierarchies. A free trial is available at nutrient.io/sdk/try/.

The SDK addresses format preservation failures, library fragmentation, and batch scaling limitations that developers currently solve by stitching together multiple libraries like PyMuPDF and pdfplumber. Developers evaluating Python-based alternatives can find a broader comparison in the Python PDF libraries guide.

Nutrient Workflow platform

Nutrient Workflow is a low-code workflow automation platform combining document processing, AI-powered data extraction, and approval routing. The platform supports AI-powered document data extraction without templates, auto-populating forms with names, dates, and totals from unstructured files. Deployment models include cloud, private cloud, self-managed on-premises, and hybrid options.

Published case studies document concrete outcomes: automation of multilevel asset acquisitions across six Latin American offices handling 236 requests; capital finance approvals with SAP integration and SOX compliance reducing approval cycles from weeks to a single day for 150 employees; and academic workflows reducing approval cycles from weeks to days. Integrations cover SharePoint, Exchange, SAP, SQL Server, Deltek, Salesforce, HubSpot, Email, SFTP, Power Automate, Zapier, and AWS Lambda.

Use cases

Legal and contracts

PDF annotation, redaction, and form filling SDKs are embedded in legal document review workflows. The Document Engine MCP Server ships with a LangGraph contract processing reference implementation, enabling AI agents to extract, classify, and route contract documents autonomously. The agentic editing agent's three-tier approval model addresses the compliance requirement that final document changes remain under human control before execution.

Financial services

Key-value pair extraction via the Python SDK targets invoices, receipts, bank statements, and forms, exporting results to JSON. The DWS MCP Server exposes the same extraction capabilities to AI agents via cloud API. Redaction presets cover SSN and credit card patterns out of the box, with custom regex for additional personally identifiable information (PII) types. Tiny IDP takes a complementary approach in this space, using GPT-4o and Claude Sonnet 3.7 to convert similar financial documents directly to structured JSON via API.

Procurement and supply chain

The Document Engine MCP Server's LangGraph.js procurement agent reference implementation classifies purchase orders, invoices, and payment documents and flags items for human review, serving as a concrete workflow template for procurement automation teams. Unstract is an open-source alternative in this space, offering no-code LLM-based extraction with hallucination mitigation for similar document types.

Healthcare and government

Nutrient positions the agentic system for healthcare (patient demographics extraction, PHI redaction) and government (personal identifier redaction in public records). The HIPAA certification and zone-based permanent redaction that removes content from file structure rather than overlaying it are the relevant technical controls for these verticals.

Developer tooling and AI infrastructure

The three-tier MCP architecture and Python SDK together position Nutrient as document processing infrastructure for agentic AI stacks. Developers building on Claude Desktop, LangGraph, OpenAI Agents SDK, Cursor, Windsurf, or VS Code with GitHub Copilot can connect to Nutrient's document engine without custom integration work. LangExtract, Google's open-source Python library for structured extraction from unstructured text using LLMs, addresses a complementary layer of this same developer stack. Teams evaluating self-hosted alternatives for sensitive document workflows may also consider Captova, which targets government and defense deployments with on-premises processing at claimed speeds exceeding 100 pages per second.

Technical specifications

Attribute	Detail
Deployment	Client SDK (iOS, Android, JavaScript), Server component (Document Engine), Web component, Cloud API (DWS), Embedded Viewer SDK
MCP Servers	Document Engine MCP (self-hosted), DWS MCP (cloud), PDF MCP (debugging)
MCP Install	`npx` (Node.js servers), `pip` (Python PDF MCP server)
AI Assistant	Agentic document editing agent; available now in Web Viewer SDK; mobile/hybrid later in 2026
LLM Providers	OpenAI, Azure OpenAI, AWS Bedrock, Anthropic Claude, self-hosted, vLLM, Hugging Face TGI, Ollama
Approval Policies	Three-tier: autonomous, confirmation-required, prohibited
Python SDK	GA with OCR, redaction, conversion, digital signatures, key-value extraction, form fill, template population
Python SDK Roadmap	Page-aware parallel processing (not yet available), AI-powered document understanding (not yet available)
OCR Languages	100+ (Python SDK)
Format Support	PDF, DOCX, XLSX, PPTX, PNG, JPEG, WebP, PDF/A, PDF/UA, HTML, Markdown
Digital Signatures	PAdES-compliant CMS and CAdES (visible and invisible)
Redaction	Zone-based permanent redaction; SSN and credit card presets; custom regex; claimed GDPR and HIPAA compliant
Agentic Platforms	Claude Desktop, LangGraph, LangGraph.js, OpenAI Agents SDK, LangChain, Cursor, Windsurf, VS Code + GitHub Copilot
Workflow Integrations	SharePoint, Exchange, SAP, SQL Server, Deltek, Salesforce, HubSpot, SFTP, Power Automate, Zapier, AWS Lambda
Certifications	HIPAA
Licensing	MIT (all three MCP servers)
Open Source	Yes (MCP servers)
Cloud API Free Tier	Available at `dashboard.nutrient.io`
Python SDK Free Trial	Available at `nutrient.io/sdk/try/`
Document Engine Evaluation	Watermarked output without commercial `ACTIVATION_KEY`

2011Founded by Peter Steinberger

€100MInsight Partners investment (2021)

100+OCR languages supported

3Open-source MCP servers (MIT)

Resources

Current Nutrient vendor page with product information
Company website (formerly pspdfkit.com)
AI Assistant expansion announcement covering agentic document editing agent launch
Agentic document editing technical overview for Web SDK implementation details
DWS MCP Server for cloud API agentic document processing (MIT)
Python SDK announcement covering GA release with OCR, redaction, conversion, and extraction
Nutrient Workflow platform for low-code process automation capabilities and case studies
Document processing with Python for Python library comparison and implementation
Self-hosted document processing for deployment architecture and security considerations
Agentic document processing guide covering how AI agents transform document workflows