alphamoon was acquired by Box in August 2024 and has since been rebuilt as Box Extract, a prompt-driven AI document extraction platform embedded in Box's enterprise content cloud.

alphamoon

Overview

Poland-based alphamoon, founded in 2017, was acquired by Box in August 2024 as one half of a deliberate two-part platform build. Box simultaneously acquired no-code workflow tool Crooze, with CEO Aaron Levie describing the pair as complementary halves of modern intelligent content management: alphamoon extracts metadata from complex documents; Crooze renders and automates workflows around it. The acquisition ended alphamoon's existence as an independent vendor. Its technology and team folded into Box, and the product is no longer available as a standalone purchase.

The deal closed a gap that Deep Analysis founder Alan Pelz-Sharpe had called "always a bit of an Achilles' heel" for Box: reliance on IDP partners rather than owned technology. IDC Senior Research Manager Amy Machado framed the combined acquisition as giving Box "a complete end-to-end solution, unlike the pure-play IDP vendors," positioning the company to compete directly against standalone IDP players such as Tungsten Automation, ABBYY, and Hyland.

Before its exit, alphamoon had raised just €2.3 million, an unusually modest total for a startup acquired by a public company. Financial terms of the acquisition were not disclosed.

The integrated technology was previewed at BoxWorks 2025 and reached general availability as Box Extract in January 2026. On April 2, 2026, Box announced Box Agent GA for Enterprise Plus and Enterprise Advanced customers, positioning Box Extract as the structured data extraction layer within a wider agentic platform that also includes Box AI Studio, Box Automate, and Box Apps.

Evaluators researching alphamoon should understand they are now evaluating Box Extract within the Enterprise Plus and Enterprise Advanced tiers. The alphamoon brand no longer appears in Box's product marketing.

January 2026: Box Extract GA date
2 minutes: RWS Global contract processing time (down from 20 minutes)
98%: Novo Construction document processing accuracy
7x: Congo Brands contract review speed improvement

How alphamoon processes documents

Box Extract, built on alphamoon's technology, takes a prompt-driven approach that Box explicitly positions against template-based legacy IDP. As Box Senior Product Marketing Manager Scott Picanco wrote at GA launch: "Traditional OCR and legacy IDP tools promise automation but lack true understanding. They demand extensive training and maintenance, and quickly break down as quality of content, handwriting, and document formats change."

The technical stack combines advanced OCR with chain-of-thought prompting, extraction-specific retrieval-augmented generation (RAG), and AI graders. No custom model training is required. Extraction is configured through natural language instructions, which the platform converts into extraction rules using LLMs. When the OCR or extraction layer makes an error, the system feeds the correction back into a document-trained LLM designed to understand business documents rather than general internet content.
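As a rough illustration of configuring extraction through natural language, the sketch below assembles a prompt from plain-language field descriptions. `FieldSpec`, `build_extraction_prompt`, and the example fields are hypothetical names invented for this example; they are not Box Extract's API, and the real platform's conversion of instructions into extraction rules is not documented here.

```python
from dataclasses import dataclass


@dataclass
class FieldSpec:
    """One field to extract, described in plain language (hypothetical structure)."""
    name: str
    instruction: str


def build_extraction_prompt(doc_type: str, fields: list[FieldSpec]) -> str:
    """Assemble a single natural-language prompt from the field instructions."""
    lines = [f"Extract the following fields from this {doc_type}:"]
    for spec in fields:
        lines.append(f"- {spec.name}: {spec.instruction}")
    return "\n".join(lines)


# Illustrative fields only; a real deployment would define its own.
fields = [
    FieldSpec("effective_date", "The date the agreement takes effect, as YYYY-MM-DD."),
    FieldSpec("counterparty", "The legal name of the other party to the contract."),
]
prompt = build_extraction_prompt("contract", fields)
print(prompt)
```

The point of the sketch is that adding or changing a field means editing a sentence rather than retraining a model or redrawing a template.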

The pipeline also reads both printed and handwritten text, including degraded inputs such as crumpled pages and partially illegible text. Document graph analysis maps entities and relationships across complex multi-page documents, while table extraction handles structured data, including mathematical calculations. Confidence scoring is applied to extracted data across all supported LLMs.
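One common way to use per-field confidence scores is to accept high-confidence values automatically and send the rest to a human reviewer. The sketch below shows that pattern under stated assumptions: the `ExtractedValue` shape, the 0.0 to 1.0 score range, and the 0.85 threshold are all hypothetical and not taken from Box's documentation.

```python
from typing import NamedTuple


class ExtractedValue(NamedTuple):
    field: str
    value: str
    confidence: float  # assumed to range from 0.0 to 1.0


def split_by_confidence(values, threshold=0.85):
    """Return (accepted, needs_review) lists based on an assumed threshold."""
    accepted = [v for v in values if v.confidence >= threshold]
    needs_review = [v for v in values if v.confidence < threshold]
    return accepted, needs_review


# Made-up sample values for illustration.
sample = [
    ExtractedValue("invoice_number", "INV-1042", 0.97),
    ExtractedValue("total_amount", "1,250.00", 0.62),
]
accepted, needs_review = split_by_confidence(sample)
```

In practice the threshold would be tuned per field, since a misread total usually costs more than a misread reference number.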

Box Extract supports Google Gemini, OpenAI GPT, and Anthropic Claude model families. The Box Extract GA announcement cited Gemini 3, GPT-5.2, and Claude Opus 4.5; a subsequent independent analysis by Shashi Bellamkonda in April 2026 cited GPT-5.4 and Claude Opus 4.6, suggesting Box updates supported models as newer versions become available. Box states it publishes per-use-case model evaluation results covering contract review, data extraction, and report drafting.

Two-tier extraction model

Box Extract offers two agent tiers that address a cost-versus-accuracy trade-off familiar in IDP procurement. The Standard Extract Agent handles documents under 50 pages with fewer than 20 fields. The Enhanced Extract Agent applies chain-of-thought reasoning to larger, more complex documents. Customers with high-volume, low-complexity documents such as invoices and purchase orders can use the cheaper tier; complex legal or financial documents route to the Enhanced Agent. This split has direct implications for per-document processing cost estimates during procurement.
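For cost estimation, the tier split can be expressed as a simple routing rule. The sketch below uses the page and field limits quoted above; the function name, the tier labels, and the assumption that every document outside the Standard limits goes to the Enhanced Agent are illustrative rather than documented Box behavior.

```python
STANDARD_MAX_PAGES = 50   # Standard tier: documents under 50 pages
STANDARD_MAX_FIELDS = 20  # Standard tier: fewer than 20 fields


def choose_agent_tier(page_count: int, field_count: int) -> str:
    """Pick a tier label for a document (hypothetical helper, not a Box API)."""
    if page_count < STANDARD_MAX_PAGES and field_count < STANDARD_MAX_FIELDS:
        return "standard"
    # Assumption: anything outside the Standard limits routes to Enhanced.
    return "enhanced"


invoice_tier = choose_agent_tier(page_count=3, field_count=8)
lease_tier = choose_agent_tier(page_count=120, field_count=35)
```

Running a rule like this over a sample of a document backlog gives a first estimate of how much volume would fall into each tier.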

Use cases

Enterprise document processing

Box Extract automatically classifies documents and identifies key data points across PDFs, document scans, and images. At acquisition, Box named three specific use cases: loan processing via Box for Salesforce integration, clinical study analysis for life sciences, and financial document processing covering invoices, contracts, bank statements, purchase orders, financial statements, and leases.

Financial services

The platform's financial document processing covers invoices, contracts, bank statements, purchase orders, financial statements, and leases. The Salesforce integration, which launched in December 2025, automatically extracts document data and populates Salesforce fields, enabling loan processing workflows without manual data re-entry between systems.

Life sciences

Clinical study analysis was named as a launch use case at acquisition, targeting document-heavy research workflows where extraction accuracy on complex, multi-format documents is critical.

Workflow automation

Box Extract exports structured metadata directly to Salesforce, Databricks, and Snowflake. Workflow automation integrates with Box Relay. A ServiceNow integration making Box AI Agents accessible from Now Assist workflows also launched in December 2025. Box Automate integration is forthcoming. Full no-code workflow connections to Crooze and Box Relay covering forms and document generation remain on the roadmap without a confirmed date.

Box also launched a Model Context Protocol (MCP) server in February 2026, allowing external AI systems including Figma, Cursor, OpenAI, Slack, and Salesforce to securely invoke Box content and write outputs back.
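MCP clients invoke server tools with JSON-RPC 2.0 messages using the `tools/call` method. The sketch below builds such a request in the general shape the protocol defines. The tool name `example_search_tool` and its arguments are hypothetical placeholders; the tools Box's MCP server actually exposes are not listed in this profile.

```python
import json


def make_tool_call(request_id: int, tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request for the MCP tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and arguments, for illustration only.
request = make_tool_call(1, "example_search_tool", {"query": "master services agreement"})
payload = json.dumps(request)
```

A real client would first discover the available tools from the server and would send the payload over an authenticated transport rather than constructing it by hand.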

Customer outcomes

Three named customers have disclosed measurable results from Box Extract deployments.

RWS Global, covered by Computer Weekly in February 2026, reduced contract processing from 20 minutes to under two minutes per contract. The same deployment cut onboarding processing for 200 hires from 8.5 workdays to five hours. RWS Global COO Jake McCoy stated: "Partnering with Box allows us to turn unstructured content into governed, AI-ready assets that help our teams make faster, more informed decisions."
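The speedups implied by these figures can be checked with simple arithmetic. The sketch below assumes an eight-hour workday, which the source does not state, and treats "under two minutes" as exactly two, so the contract figure is a lower bound.

```python
def speedup(before: float, after: float) -> float:
    """Ratio of the old duration to the new one, in the same units."""
    return before / after


# Contract processing: 20 minutes down to (at most) 2 minutes.
contract_speedup = speedup(20, 2)

# Onboarding for 200 hires: 8.5 workdays down to 5 hours.
HOURS_PER_WORKDAY = 8  # assumption, not stated in the source
onboarding_before_hours = 8.5 * HOURS_PER_WORKDAY
onboarding_speedup = speedup(onboarding_before_hours, 5)

print(f"Contracts: at least {contract_speedup:.0f}x faster")
print(f"Onboarding: about {onboarding_speedup:.1f}x faster")
```

Under those assumptions the contract workflow is at least 10 times faster and the onboarding batch roughly 13.6 times faster.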

Novo Construction reported 98% accuracy in document processing, per Box's own GA announcement. Congo Brands reported contract review running 7x faster using Box AI Extract Agents; this figure comes from Shashi Bellamkonda's independent analysis citing Box blog posts, with no separate third-party verification. Valmark Financial and Texas DMV are also cited as customers in Box's GA materials, without disclosed outcome metrics.

"With Box Extract, that information is now unlocked and can transform how businesses analyze information and make decisions. By turning unstructured content into structured, usable data, organizations can deliver real-world impact by having their content actively work for them across their most important lines of business."

Aaron Levie, co-founder and CEO, Box (SiliconAngle, January 2026)

Technical specifications

| Feature | Specification |
| --- | --- |
| Deployment | Cloud (via Box platform) |
| Supported languages | English, Spanish, German, French, Czech, Romanian, Slovakian, Bulgarian |
| Document formats | PDF, DOCX, TIF, TIFF, JPG, RTF, PNG, document scans |
| LLM support | OpenAI GPT, Anthropic Claude, Google Gemini (versions updated as new models release) |
| Agent tiers | Standard Extract Agent (under 50 pages, under 20 fields); Enhanced Extract Agent (complex documents, chain-of-thought reasoning) |
| Enterprise integrations | Salesforce, Databricks, Snowflake, Box Relay, ServiceNow, Microsoft Office, Oracle Fusion, Slack, Workday |
| MCP server | Launched February 2026; supports Figma, Cursor, OpenAI, Slack, Salesforce |
| Governance | NIST AI Risk Management Framework; OECD AI Principles; customer data not used to train third-party LLMs |
| Pricing | Included for Enterprise Plus and Enterprise Advanced within usage limits; additional volume purchasable |
| Vendor status | Acquired by Box (August 2024). No longer available as standalone product. |
| Roadmap | AI-recommended metadata templates, confidence score review, automated prompt refinement (undated) |

Competitive positioning

Box CTO Ben Kus told TechTarget in April 2026 that Box deliberately does not compete in AI orchestration, instead letting external agents hand off to Box's own task-specific agents for security and governance: "We spend a lot of time making sure our agents work really well on our live data. We're just better at unstructured data than anyone else who has tried it."

This positioning matters for enterprise buyers evaluating whether to consolidate document AI into a general-purpose platform or retain a specialized IDP layer. Rather than competing with general-purpose AI orchestration layers, Box is positioning itself as the secure, governed content execution layer that those orchestrators invoke. Alan Pelz-Sharpe of Deep Analysis characterized the broader enterprise demand as being for "localized AI that does a good job on this job, not generic LLMs."

The governance posture is a genuine differentiator. NIST AI Risk Management Framework and OECD AI Principles alignment, combined with no LLM training on customer data and security inherited from Box's existing permissions model, gives Box Extract a compliance posture that standalone IDP tools often cannot match without significant additional investment. This is directly relevant for financial services, insurance, government, and legal buyers.

One limitation is worth flagging for buyers with mature information governance requirements. Independent analyst Shashi Bellamkonda notes that Box Extract helps extract and structure data, but customers must still build and maintain their own content ontology. Box does not yet offer a data governance platform. Extracted metadata lands in Box and downstream systems such as Salesforce, Snowflake, and Databricks, but classifying and governing content at scale remains a customer responsibility. Dedicated IDP vendors with built-in taxonomy management may be a better fit for organizations that need governed classification alongside extraction.

Resources

Company information

  • Status: Acquired by Box in August 2024. Technology integrated into Box AI and Box Intelligent Content Management platform. No longer available as a standalone product.
  • Founded: 2017
  • Headquarters: Wroclaw, Poland
  • Parent company: Box
  • Website: alphamoon.ai
  • Email: support@alphamoon.ai
  • Total funding before acquisition: €2.3 million