
Box evolved from cloud storage to intelligent content management, now competing directly in document processing with agentic AI extraction, autonomous multi-step workflows, and governance controls built for regulated industries.

Box

Overview

Box has transformed from a file-sharing service into an AI-powered intelligent content management platform that competes directly with traditional intelligent document processing (IDP) vendors. Founded in 2005, the company pivoted in 2024 with its acquisition of Alphamoon, an IDP specialist whose document classification and structured extraction technology now underpins Box Extract.

Box Extract reached general availability on January 15, 2026 for Enterprise Advanced plan customers, marking the first time Box shipped a purpose-built IDP product positioned explicitly against ABBYY, Hyland, and Kofax. Then, on April 2, 2026, Box Agent reached general availability for Enterprise Plus and Enterprise Advanced customers, adding autonomous multi-step workflows that chain document discovery, extraction, reasoning, and output generation in a single flow. The two products together represent Box's clearest statement yet that it is building content intelligence infrastructure, not just storage.

Analyst Alan Pelz-Sharpe of Deep Analysis called Box Extract "a good first step" while flagging vertical-specific accelerators as the growth ceiling. Constellation Research's Holger Mueller called it "a significant upgrade for the future of work." By April 2026, TechTarget reported that CTO Ben Kus had sharpened the competitive claim: "We're just better at unstructured data than anyone else who has tried it. We have a whole team dedicated to doing this stuff well."

CEO Aaron Levie frames the shift as turning content into context: "Enterprises everywhere are looking to harness AI to transform their businesses, but AI can only reach its full potential if it understands the unique context of an organization. That context lives within contracts, research materials, marketing assets, financial documents, and other forms of enterprise content." Box reported Q3 fiscal 2026 revenue of $301 million, representing 9% year-over-year growth, with strategic partnerships spanning TCS for enterprise digital transformation, Microsoft 365 Copilot, IBM watsonx models, ServiceNow, Salesforce, and Workday.

10x: Faster contract processing at RWS Global
7x: Faster contract review at Congo Brands
$301M: Q3 FY2026 revenue (9% YoY growth)
62%: Organizations cite poor data governance as top AI obstacle

How Box processes documents

Box Extract's pipeline breaks documents into paragraphs, tables, and charts rather than flat text, then applies targeted field extraction stored as Box metadata. This is a meaningful departure from traditional optical character recognition (OCR), which converts images to text without understanding structure or semantic relationships. As Moor Insights & Strategy noted, the agentic reasoning approach enables Box Extract to handle variable layouts and formats without scenario-specific prompts, addressing a core limitation of template-based extraction tools.

Two built-in agent tiers handle different document complexity. The Standard Extract Agent processes simple, uniform documents faster and at lower cost. The Enhanced Extract Agent applies multimodal document structure reasoning for large, complex, or highly variable documents. Customers configure Custom Extract Agents tied to specific folders, triggering extraction automatically on document arrival. Monitoring dashboards surface extraction timestamps, source folder, file, and status per agent. Consumption is metered in Box AI Units.
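The tier split described above can be sketched as a simple routing rule. This is an illustrative sketch only: `DocumentProfile`, its fields, `choose_extract_tier`, and the 50-page threshold are hypothetical names and values chosen for the example, not part of any Box API or documented Box behavior.

```python
from dataclasses import dataclass


@dataclass
class DocumentProfile:
    """Hypothetical summary of a document; fields are illustrative, not Box metadata."""
    page_count: int
    has_tables_or_charts: bool
    layout_varies: bool


def choose_extract_tier(doc: DocumentProfile, large_page_threshold: int = 50) -> str:
    """Return "enhanced" for large, complex, or variable documents, else "standard".

    The threshold is an assumed cut-off for "large"; the source gives no number.
    """
    is_large = doc.page_count > large_page_threshold
    if is_large or doc.has_tables_or_charts or doc.layout_varies:
        return "enhanced"
    return "standard"


if __name__ == "__main__":
    simple_form = DocumentProfile(page_count=2, has_tables_or_charts=False, layout_varies=False)
    carrier_statement = DocumentProfile(page_count=12, has_tables_or_charts=True, layout_varies=True)
    print(choose_extract_tier(simple_form))
    print(choose_extract_tier(carrier_statement))
```

In practice the choice would also weigh cost, since the Standard tier is described as cheaper and consumption is metered in Box AI Units.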

Box Agent, which reached GA in April 2026, extends this further into autonomous multi-step workflows. A single query can trigger document discovery across content libraries with source references, data extraction, reasoning across multiple files, and output generation in Word, PDF, Excel, or PowerPoint formats. File creation remains in beta for Enterprise Advanced customers only. Matt Terrell, Director of Product Management for AI Agents at Box, put the strategic logic plainly: "Metadata is one of the fundamental building blocks" for AI-driven workflows, positioning Box as infrastructure that downstream AI systems consume rather than a consumer of AI itself.
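To make the four chained stages concrete, the following toy sketch runs discovery, extraction, cross-file reasoning, and output generation over an in-memory dictionary of text files. Everything here is an assumption for illustration: the function names, the `Total:` field convention, and the plain-text report are hypothetical and do not represent how Box Agent is implemented or invoked.

```python
import re
from typing import Dict, List, Optional


def discover(library: Dict[str, str], keyword: str) -> List[str]:
    """Step 1: return names of files mentioning the keyword, kept as source references."""
    return sorted(name for name, text in library.items() if keyword.lower() in text.lower())


def extract_total(text: str) -> Optional[float]:
    """Step 2: pull a 'Total: <amount>' field out of one file, if present."""
    match = re.search(r"Total:\s*\$?([\d,]+(?:\.\d+)?)", text)
    return float(match.group(1).replace(",", "")) if match else None


def run_workflow(library: Dict[str, str], keyword: str) -> str:
    """Chain discovery, extraction, reasoning, and output into one flow."""
    sources = discover(library, keyword)
    totals = {name: extract_total(library[name]) for name in sources}
    found = {name: amount for name, amount in totals.items() if amount is not None}
    # Step 3: reason across multiple files (here, a simple sum).
    combined = sum(found.values())
    # Step 4: generate an output document (here, a plain-text report with sources).
    lines = [f"Combined total for '{keyword}': {combined:.2f}"]
    lines += [f"- {name}: {amount:.2f}" for name, amount in found.items()]
    return "\n".join(lines)


if __name__ == "__main__":
    demo_library = {
        "a.txt": "Acme contract. Total: $1,200.50",
        "b.txt": "Acme renewal. Total: 300",
        "c.txt": "Other vendor. Total: 5",
    }
    print(run_workflow(demo_library, "acme"))
```

The point of the sketch is the shape of the flow: each stage consumes the previous stage's result, and the final report keeps the file names so every figure traces back to a source.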

The multi-LLM architecture runs on Google Gemini, Anthropic Claude, and OpenAI models for extraction, with customers selecting their preferred provider. Box AI Studio, with enhancements reaching GA for Enterprise Advanced customers in April 2026, supports models from Amazon, Anthropic, Google, IBM, Meta, OpenAI, and xAI. A remote Box Model Context Protocol (MCP) Server, launched in February 2026, extends Box content to external AI systems including Figma, Cursor, OpenAI, Slack, and Salesforce, which can invoke Box content and write outputs back. Analyst Shashi Bellamkonda summarized the shift: "Box is no longer a file cabinet. It is now a content retrieval engine for agents."

Extraction outputs sync downstream to Databricks and Snowflake and feed Box Relay for end-to-end workflow automation. Confirmed integrations include Microsoft Office, Oracle Fusion Cloud HCM, Salesforce, ServiceNow, Slack, and Workday. The ServiceNow integration, announced December 2025, embeds Box AI Agents directly into ServiceNow Now Assist. The Salesforce integration automatically extracts document data and populates Salesforce fields, eliminating manual data entry. Box Sign for Workday, launched February 2026, integrates e-signatures directly into Workday workflows. Box Apps surfaces extracted metadata in no-code dashboards with natural language querying. Box Automate, a low-code/no-code visual workflow builder supporting Q&A, Compose, Extract, and Research agents, was announced at BoxWorks 2025 for Enterprise Advanced customers, but its GA date remains unconfirmed.

Unlike cloud-only competitors such as Rossum, Box supports hybrid deployment through Box Edge for organizations requiring on-premises processing. Developer API access is available across Business through Enterprise Advanced plans. The Custom Extract Agent UI and autofill metadata templates are Enterprise Advanced only. Developer integrations include LangChain, LlamaIndex, Pinecone, and Weaviate. Teams evaluating open-source LLM-based extraction alternatives may also consider Unstract, which takes a no-code approach to production-grade document processing with hallucination mitigation built in.

Security runs through Box Shield Pro, which adds three AI agents: an AI Classification Agent for automatic sensitivity labeling, a Ransomware Agent that detects mass content encryption in Box Drive and offers remediation, and an AI Threat Analysis Agent for alert summarization. Shield Pro also specifically addresses MCP server security risks through user-based access controls and input cleansing for AI pipelines. It is available as an add-on for Box Shield, Enterprise Plus, and Enterprise Advanced subscribers; pricing is not disclosed.

Use cases

Contract lifecycle management

Box's strongest early IDP traction is in contract lifecycle management (CLM). Box CIO Ravi Malick identifies the pattern: "A big portion of that is the ability to extract metadata and relevant information from a contract and effectively create structure from unstructured data. That lets you categorize contracts, kickstart workflows, and in many cases automate contract creation."

RWS Global, a live events management company, deployed Box Enterprise Advanced across a contract approval workflow spanning Box Doc Gen, legal approval routing, Box Sign, and cloud storage with no code required. Per-contract processing dropped from 20 minutes to under 2 minutes. A 200-hire batch that previously required 8.5 workdays now completes in 5 hours. Jake McCoy, COO at RWS Global, stated: "Partnering with Box allows us to turn unstructured content into governed, AI-ready assets that help our teams make faster, more informed decisions." This is the only deployment in this period with quantified throughput metrics from a third-party source.
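The batch and per-contract figures above can be cross-checked with a short calculation. This is a rough sketch under one stated assumption: an 8-hour workday, which the source does not specify. The function name and return keys are invented for the example.

```python
WORKDAY_HOURS = 8.0  # assumption: the source does not define the length of a workday


def rws_throughput_check(batch_size: int = 200,
                         workdays_before: float = 8.5,
                         hours_after: float = 5.0) -> dict:
    """Derive per-contract times and speedups from the reported batch figures."""
    hours_before = workdays_before * WORKDAY_HOURS
    return {
        "hours_before": hours_before,
        "minutes_per_contract_before": hours_before * 60 / batch_size,
        "minutes_per_contract_after": hours_after * 60 / batch_size,
        "batch_speedup": hours_before / hours_after,
        # Headline per-contract figures from the text: 20 minutes down to 2 minutes.
        "per_contract_speedup": 20 / 2,
    }


if __name__ == "__main__":
    for name, value in rws_throughput_check().items():
        print(f"{name}: {value:.1f}")
```

Under that assumption the batch numbers imply roughly 20.4 minutes per contract before and 1.5 minutes after, which agrees with the "20 minutes to under 2 minutes" claim; the batch-level speedup comes out near 13.6x, somewhat above the headline 10x derived from the rounded per-contract times.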

Congo Brands reported 7x faster contract review using Box Extract Agents, though this figure comes via analyst commentary rather than an independent case study. The Judge Group's Matthew Campana, Vice President of Digital Transformation, confirmed in the Box Agent GA announcement that the product "can activate complex workflows and find essential data from files, while maintaining strict compliance and data protection standards."

Invoice and financial document processing

Box Extract processes invoices, receipts, and financial documents with semantic field relationship recognition, integrating with ERP systems for automated accounts payable workflows while maintaining audit trails and approval routing based on extracted data values.

Valmark Financial Group deployed Box Extract to process account forms, insurance illustrations, and commission statements, including handwritten and variable-layout content. CIO Geoff Moore noted: "It understands the meaning of the document and what we are trying to extract. As formats change over time or between carriers, we don't need a specific prompt for each scenario." Moore cited "exceptional gains in both efficiency and accuracy," though no independent benchmarking accompanies that claim. A separate Box-published customer story cites 98% accuracy at Novo Construction in document processing; that figure carries no independent verification.

Financial services teams evaluating purpose-built alternatives for this document type may want to compare Alkymi, which specializes in extracting and transforming unstructured financial documents into standardized datasets. Teams focused on financial analytics and research automation may also consider Acuity Knowledge Partners, which serves 800+ institutions with agentic AI for document-intensive research workflows.

Government and public sector forms

Texas DMV deployed Box AI to automate extraction of permit types, fees, and expiration dates from permit, title, and registration applications, reducing manual review while maintaining the security and compliance standards required of a public agency. The deployment demonstrates Box Extract's applicability to high-volume, standardized government forms, a document type where the Standard Extract Agent's speed and cost profile fits well. Box achieved FedRAMP High authorization in March 2025, an upgrade from the standard FedRAMP authorization it previously held, which enables deployment in regulated federal environments.

Organizations processing sensitive government video evidence alongside documents may also evaluate VIDIZMO, which combines document AI with evidence management and redaction for public sector workflows.

Technical specifications

Feature | Specification
AI models (Extract) | Google Gemini, Anthropic Claude, OpenAI GPT (provider families confirmed; specific version strings unverified)
AI models (AI Studio) | Amazon Bedrock, Anthropic, Google, IBM watsonx, Meta, OpenAI, xAI
Extract agent tiers | Standard (simple documents, lower cost); Enhanced (complex/variable documents, multimodal reasoning); Custom (folder-triggered, Enterprise Advanced only)
Box Agent capabilities | Document discovery, data extraction, reasoning, output generation in Word, PDF, Excel, PowerPoint; file creation in beta (Enterprise Advanced only)
OCR capabilities | Scanned PDFs, images, handwritten text processing
Storage capacity | Unlimited (Enterprise plans)
File size limits | Up to 150 GB per file
Security | AES 256-bit encryption, SSO, MFA, AI Data Classification
Compliance | GDPR, HIPAA, FINRA, FedRAMP High, SOC
Integration (GA) | Microsoft Office, Oracle Fusion Cloud HCM, Salesforce, ServiceNow, Slack, Workday; 1,500+ app integrations, APIs, webhooks
Integration (planned) | GitHub Copilot, IBM watsonx Orchestrate, ServiceNow AI Agent Fabric
MCP server | Remote Box MCP Server; external systems including Figma, Cursor, OpenAI, Slack, Salesforce can invoke Box content and write outputs back
Downstream data | Databricks, Snowflake sync; Box Relay workflow automation
Deployment options | Cloud; hybrid with Box Edge
Developer support | LangChain, LlamaIndex, Pinecone, Weaviate; API access from Business plan up
Workflow automation | Box Relay (GA); Box Automate (announced, GA date unconfirmed)
Consumption metering | Box AI Units
Plan gating | Developer API: Business through Enterprise Advanced; Custom Extract Agent UI and autofill templates: Enterprise Advanced only; Box Agent: Enterprise Plus and Enterprise Advanced
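
The plan gating row can be read as a minimum-plan lookup. The sketch below encodes only the plan names that appear in this profile; any tiers between Business and Enterprise Plus are deliberately omitted because the text does not enumerate them. The feature keys and the `is_available` helper are hypothetical names for illustration, not Box identifiers.

```python
# Plans named in this profile, lowest to highest. Intermediate tiers are not
# listed in the source, so they are left out rather than guessed.
PLAN_ORDER = ["Business", "Enterprise Plus", "Enterprise Advanced"]

# Lowest plan on which each capability is described as available.
MINIMUM_PLAN = {
    "developer_api": "Business",
    "box_agent": "Enterprise Plus",
    "custom_extract_agent_ui": "Enterprise Advanced",
    "autofill_metadata_templates": "Enterprise Advanced",
}


def is_available(feature: str, plan: str) -> bool:
    """Return True if the plan is at or above the feature's minimum plan."""
    return PLAN_ORDER.index(plan) >= PLAN_ORDER.index(MINIMUM_PLAN[feature])


if __name__ == "__main__":
    print(is_available("box_agent", "Business"))
    print(is_available("box_agent", "Enterprise Plus"))
    print(is_available("custom_extract_agent_ui", "Enterprise Plus"))
```

An unknown plan or feature name raises an error (`ValueError` or `KeyError`) rather than silently returning False, which suits a lookup meant to catch typos in an evaluation checklist.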

Resources

Company information

Box, Inc. was founded in 2005 and is headquartered in Redwood City, California, United States. Q3 fiscal 2026 revenue reached $301 million, with 9% year-over-year growth. The company's IDP strategy is built on the 2024 Alphamoon acquisition and a deliberate positioning as domain-specific content intelligence rather than a general-purpose AI orchestration platform. As Pelz-Sharpe of Deep Analysis put it in April 2026: "People are asking for localized AI that does a good job on this job, not generic 'I can solve the world's problems.'" That framing aligns with Box's stated philosophy: let Claude, OpenAI, and Gemini agents hand off to Box's own agents for security, data governance, and privacy policy enforcement, rather than competing on orchestration breadth.

Website: https://www.box.com
Email: ir@box.com
Phone: +44 808 189 0504