Dataiku: IDP Software Vendor
Universal AI platform with document intelligence capabilities, preparing for a 2026 IPO with $350M+ ARR and a governance-by-design approach.

Overview
Founded in 2013 in Paris, Dataiku provides the Universal AI Platform, an enterprise AI platform that includes intelligent document processing. The company is preparing for a U.S. IPO in H1 2026 with Morgan Stanley and Citigroup as lead underwriters at a $3.7 billion valuation, down from a $4.6 billion Series E valuation in 2021, reflecting recalibrated market conditions. Dataiku surpassed $350M ARR in October 2025, up from $300M+ ARR in January 2025.
The company was recognized as a Leader in IDC's MarketScape for Worldwide Unified AI Governance Platforms 2026, its first major governance-specific analyst recognition, with CEO Florian Douetteau framing the shift: "AI governance has shifted from a checkpoint to a foundation." By February 2026, that positioning had accumulated three milestones in a single month, two of them external validations: a #33 ranking on G2's Best Analytics Software Products list based on verified user reviews, a Most Innovative Agent Development Platform award from SiliconANGLE Media, and the launch of Dataiku's own 575 Lab open-source office. The G2 ranking carries customer-validated weight, drawn from verified reviews across a marketplace reaching all Fortune 500 companies, while the SiliconANGLE award is analyst-judged. Neither is an independent analyst report.
Former Salesforce President Alexandre Dayon joined the Board of Directors in January 2026, strengthening enterprise sales expertise ahead of the IPO.
How Dataiku Processes Documents
Dataiku's document processing runs through the Natif.ai IDP plugin, a modular pipeline that handles PDF, TIFF, and JPEG inputs using computer vision, deep learning, and NLP. The pipeline converts native and scanned content into structured data, with vision-language models (VLMs) extracting information from text, tables, and images in a single pass. Governance controls - audit trails, end-to-end traceability, and compliance checkpoints - are embedded directly in the workflow rather than applied as post-processing overlays.
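The idea of governance embedded in the workflow, rather than layered on afterwards, can be illustrated with a minimal sketch. This is not the Natif.ai plugin's API, which the sources here do not describe. The `Pipeline` class, `AuditEvent` record, `fake_extract` function, and the extension-to-format mapping are all hypothetical names chosen for illustration. The sketch accepts only the three listed formats and writes an audit record inside every step.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from pathlib import Path

# Formats the profile lists as supported; the extension mapping is an assumption.
SUPPORTED = {".pdf": "PDF", ".tif": "TIFF", ".tiff": "TIFF",
             ".jpg": "JPEG", ".jpeg": "JPEG"}


@dataclass
class AuditEvent:
    document: str
    step: str
    detail: str
    timestamp: str


@dataclass
class Pipeline:
    audit_log: list = field(default_factory=list)

    def _record(self, document, step, detail):
        # The audit entry is written by the step itself, not by a later pass.
        stamp = datetime.now(timezone.utc).isoformat()
        self.audit_log.append(AuditEvent(document, step, detail, stamp))

    def process(self, filename, extract):
        """Run one document through intake and extraction, logging each step."""
        fmt = SUPPORTED.get(Path(filename).suffix.lower())
        if fmt is None:
            self._record(filename, "intake", "rejected: unsupported format")
            return None
        self._record(filename, "intake", f"accepted as {fmt}")
        fields = extract(filename)  # hypothetical stand-in for model extraction
        self._record(filename, "extract", f"{len(fields)} fields extracted")
        return fields


def fake_extract(filename):
    # Placeholder extractor with fixed output; a real system would call a model.
    return {"invoice_number": "INV-001", "total": "120.00"}


pipeline = Pipeline()
result = pipeline.process("invoice.PDF", fake_extract)
rejected = pipeline.process("notes.docx", fake_extract)
```

Because every branch of `process` records an event, including the rejection path, the log can be replayed to reconstruct what happened to each document without separate instrumentation.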
The Agent Hub extends this into multi-step agentic workflows: a collaborative workspace where AI agents can be built, shared, and scaled with ROI measurement attached. The AI Factory Accelerator, powered by NVIDIA, targets enterprise-scale deployments with governance integrated natively.
In February 2026, Dataiku moved its governance infrastructure into open source through the 575 Lab, its dedicated Open Source Office. Two toolkits are generally available. Agent Explainability Tools traces decision-making across multi-step agentic workflows and surfaces agent reasoning for data scientists, compliance teams, and end users. Privacy-Preserving Proxies is designed for local deployment and protects sensitive data end-to-end when enterprises use closed-source models. Licensing terms and GitHub repository URLs were not disclosed in available sources. Dataiku simultaneously joined the Linux Foundation and the newly formed Agentic AI Foundation, which is a standards play rather than a product feature. As Douetteau put it: "Enterprises need reusable building blocks that can become the standards for how agentic systems are controlled and inspected."
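The sources do not describe how Privacy-Preserving Proxies works internally. As a generic illustration of the proxy pattern only, the sketch below masks email addresses locally before a prompt reaches an external model, then restores them in the reply. Every name here (`mask`, `unmask`, `proxy_call`, `echo_model`) is hypothetical, and email masking stands in for whatever sensitive-data handling the real toolkit performs.

```python
import re

# Simple email pattern for illustration; real detection would be broader.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask(text):
    """Replace each email with a placeholder; return masked text and the mapping."""
    mapping = {}

    def swap(match):
        token = f"<EMAIL_{len(mapping) + 1}>"
        mapping[token] = match.group(0)
        return token

    return EMAIL.sub(swap, text), mapping


def unmask(text, mapping):
    """Restore the original values in text returned by the model."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text


def proxy_call(prompt, model):
    """Send only the masked prompt to the model, then unmask its reply locally."""
    masked, mapping = mask(prompt)
    reply = model(masked)
    return unmask(reply, mapping)


def echo_model(prompt):
    # Stand-in for a closed-source model endpoint; remembers what it received.
    echo_model.seen = prompt
    return f"Summary: {prompt}"


answer = proxy_call("Contact jane.doe@example.com about the contract.", echo_model)
```

In this sketch the mapping never leaves the local process, which is the property that makes local deployment of such a proxy meaningful: the external model sees only placeholders.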
Use Cases
Enterprise AI Governance
Organizations use Dataiku's unified governance platform to close a traceability gap: 95% of data leaders cannot fully trace AI decisions end-to-end, even though 86% report AI embedded in daily operations. The platform embeds governance controls (traceability, audit logs, compliance checkpoints) directly into AI development workflows rather than adding them as an afterthought. The 575 Lab's Agent Explainability Tools extend this to agentic pipelines, making multi-step agent reasoning inspectable by compliance teams without requiring custom instrumentation.
Retail AI Transformation
Retailers use Dataiku's Retail Accelerator Pack for customer experience optimization and back-office automation. The pack includes seven prebuilt use cases covering entity extraction and LLM-enhanced predictions. Head of AI Architecture Jed Dougherty notes the tension: "The riskiest place to use GenAI in retail is also the most valuable one: the customer experience." The accelerator is designed to compress deployment timelines for teams that cannot build from scratch.
Document Intelligence Workflows
Teams process document collections through the modular Natif.ai pipeline, converting native and scanned content to structured data with embedded governance controls for regulatory compliance and audit trails. The AWS Agentic AI and Healthcare Software Competency certifications extend this into healthcare-specific document workflows on AWS infrastructure, where compliance requirements are most stringent.
Technical Specifications
| Feature | Specification |
|---|---|
| Core Platform | Dataiku Data Science Studio (DSS), Universal AI Platform |
| Document Processing | Natif.ai IDP plugin, modular pipeline |
| AI Governance | Native governance controls, end-to-end traceability, 575 Lab open-source toolkits |
| Agent Platform | Agent Hub with collaborative workspace and ROI measurement |
| Open Source | 575 Lab: Agent Explainability Tools, Privacy-Preserving Proxies (GA; licensing terms not disclosed) |
| Cloud Partnerships | AWS Agentic AI and Healthcare Software Competency, NVIDIA AI Factory Accelerator |
| File Formats | PDF, TIFF, JPEG |
| Deployment | Cloud, on-premises |
| Accelerators | Retail (7 use cases), healthcare, manufacturing |
| Industry Foundations | Linux Foundation member, Agentic AI Foundation member |
Resources
Company Information
Headquarters: Paris, France (US HQ: New York City)
Founded: 2013
Offices: New York, Denver, Washington DC, Los Angeles, Paris, London, Munich, Frankfurt, Sydney, Singapore, Tokyo, Dubai
Funding: $4.6B valuation (Series E, 2021), $200M Series F (2022); IPO targeted H1 2026 at $3.7B valuation with Morgan Stanley and Citigroup as lead underwriters
ARR: $350M+ (October 2025), up from $300M+ (January 2025)
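The funding and ARR figures above imply a few simple ratios, worked through in the sketch below. The ARR values are reported as floors ("$300M+", "$350M+"), so the results are approximations rather than disclosed metrics. The `pct_change` helper is a name introduced here for illustration.

```python
def pct_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100


# Valuation in $B: Series E (2021) versus the targeted IPO valuation.
valuation_change = pct_change(4.6, 3.7)

# ARR floors in $M: January 2025 versus October 2025.
arr_growth = pct_change(300, 350)

# Targeted IPO valuation as a multiple of the October 2025 ARR floor.
arr_multiple = 3700 / 350

print(f"Valuation change: {valuation_change:.1f}%")  # about -19.6%
print(f"ARR growth: {arr_growth:.1f}%")              # about 16.7%
print(f"Valuation / ARR: {arr_multiple:.1f}x")       # about 10.6x
```

Since actual ARR is at or above the stated floor, the true valuation multiple is at most roughly 10.6x.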
2026 Partner of the Year Award Winners
| Category | Winner |
|---|---|
| Global Data Partner of the Year | Snowflake |
| Global Cloud Partner of the Year | AWS |
| Global Systems Integrator of the Year | Accenture |
| Global Reseller Partner of the Year | K.K. Ashisuto |
| Americas SI of the Year | Aimpoint Digital |
| EMEA SI of the Year | Eulidia |
| APJ SI of the Year | ST Engineering (Mission Software & Services) |
| Americas Innovator of the Year | v4c.ai |
| EMEA Innovator of the Year | Infomotion |
| APJ Innovator of the Year | Datasolution |
Snowflake, AWS, and Accenture taking the three global slots reflects a deliberate indirect-channel strategy: deep alignment with the two dominant enterprise data infrastructure providers embeds Dataiku before a procurement conversation begins, while Accenture's SI role routes Dataiku's governance tooling into large-enterprise transformation programs. No revenue or deal-volume metrics were disclosed for any winner.