FormX.ai — Document Extraction API Platform
FormX.ai is a Hong Kong-based intelligent document processing (IDP) API provider specializing in automated data capture from identity documents, receipts, invoices, and business certificates using OCR and machine learning.

Overview
FormX.ai delivers cloud-based document extraction APIs that return structured JSON from scanned documents and images. The platform combines OCR, NLP, and machine learning to extract data from identity documents, receipts, invoices, bank statements, and business certificates. Pre-built extractors cover the most common document types out of the box; custom model training via a web portal handles organization-specific formats without requiring developer resources.
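As a rough illustration of the image-in, JSON-out flow described above, the following minimal sketch builds an HTTPS POST request around a document image. The endpoint URL and both header names are placeholders invented for this example, not FormX.ai's documented API; consult the vendor's API documentation for the real values.

```python
import json
import urllib.request

# Hypothetical values: the endpoint URL and header names are placeholders
# for illustration only, not FormX.ai's documented API.
EXTRACT_URL = "https://api.example.com/v2/extract"


def build_extract_request(
    image_bytes: bytes, access_token: str, extractor_id: str
) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying one document image."""
    return urllib.request.Request(
        EXTRACT_URL,
        data=image_bytes,
        method="POST",
        headers={
            "Content-Type": "image/jpeg",
            "X-Access-Token": access_token,  # assumed header name
            "X-Extractor-Id": extractor_id,  # assumed header name
        },
    )


def extract(
    image_bytes: bytes, access_token: str, extractor_id: str, timeout: float = 30.0
) -> dict:
    """Send the request over HTTPS and decode the JSON response body."""
    request = build_extract_request(image_bytes, access_token, extractor_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating request construction from sending keeps the first function testable without network access or credentials.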
FormX.ai reports 92% overall extraction accuracy, which it attributes to proprietary image pre-processing and OCR post-processing. Reported field-level accuracy is 70% or higher across all document types, rising to 90% or higher on date, time, and amount fields when source image quality is good. Receipt processing completes in 4 to 8 seconds, excluding upload time.
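The quoted 4 to 8 second range translates into a simple throughput estimate. The sketch below is back-of-the-envelope arithmetic only: it ignores upload time, as the figure itself does, and the `workers` parameter is a hypothetical stand-in for parallel requests, which assumes the service accepts concurrent calls.

```python
def receipts_per_hour(seconds_per_receipt: float, workers: int = 1) -> float:
    """Estimate hourly throughput from per-receipt processing time."""
    return workers * 3600 / seconds_per_receipt


slowest = receipts_per_hour(8)  # 450.0 receipts per hour, one at a time
fastest = receipts_per_hour(4)  # 900.0 receipts per hour, one at a time
```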
In early 2026, FormX.ai launched zero-training extraction capabilities that create new document extractors with minimal sample data, reducing deployment friction for teams without dedicated ML engineers. The company also introduced Smart Learning AI, which adapts to user corrections in real time without requiring full retraining cycles, and ImageChecker, a document quality validation tool that screens submissions before extraction begins.
FormX.ai holds ISO 27001 and SOC 2 Type II certifications. The platform is hosted on GCP and Azure, with private cloud deployment available on enterprise plans. Documents are processed in memory and disposed of after extraction; all API calls are encrypted via HTTPS. The vendor states that documents are not used for AI model training unless explicitly requested by the customer.
GPT-4 integration and training efficiency
The most significant recent development is FormX.ai's private beta integrating GPT-4 into its extraction pipeline. The integration targets three specific problems: reducing OCR errors through advanced language processing, improving entity extraction accuracy, and cutting the number of labeled samples needed to train a custom extractor.
The practical effect is a smaller training set for custom models. Conventional model training typically requires anywhere from 15 to several hundred labeled samples per document type. The FormX team states that GPT-4-enhanced extractors require only 3 to 5 samples to reach production-ready accuracy. That reduction removes a significant barrier for organizations processing low-volume or highly variable document types where assembling a large training set is impractical.
The private beta status means no independent benchmark comparisons are yet available. The accuracy improvement claims are self-reported by the FormX team, who noted they have "seen some positive results" and "promising results that can revolutionize document processing and data extraction." Evaluators should treat these figures as directional until third-party validation is published.
Document types and geographic coverage
FormX.ai's pre-built extractors cover identity documents from Hong Kong, Macau, Singapore, and Taiwan, including national ID cards, passports, work permits, and driver's licenses. On the financial document side, the platform handles invoices, receipts, bank statements, and bills of lading. Business certificates round out the pre-configured portfolio.
This geographic focus reflects a deliberate market choice. Rather than competing on breadth against global vendors like ABBYY or Hyperscience, FormX.ai concentrates on East and Southeast Asian identity document formats where regional specificity matters for extraction accuracy. The tradeoff is limited coverage outside that region for identity use cases, though invoice and receipt extraction is format-agnostic.
Integration and mobile capabilities
FormX.ai uses a REST API architecture with Extraction API v2, which returns key-value objects for streamlined downstream consumption. Native multi-page PDF support is included by default. The platform connects directly to UiPath RPA for invoice processing automation, and supports no-code workflow connections via Zapier, N8N, and Google Workspace integrations.
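A key-value response is straightforward to consume downstream. The sketch below assumes a hypothetical payload shape with a top-level `fields` object; the field names and the sample values are invented for illustration and may not match the actual Extraction API v2 schema.

```python
import json

# Hypothetical response body; the real schema may differ.
SAMPLE_RESPONSE = """
{
  "status": "ok",
  "fields": {
    "invoice_number": "INV-1042",
    "invoice_date": "2024-03-15",
    "total_amount": "1280.00"
  }
}
"""


def read_fields(payload: dict, required: list[str]) -> tuple[dict, list[str]]:
    """Return the required fields that are present, plus the names of any missing ones."""
    fields = payload.get("fields") or {}
    values = {
        name: fields[name]
        for name in required
        if fields.get(name) not in (None, "")
    }
    missing = [name for name in required if name not in values]
    return values, missing


payload = json.loads(SAMPLE_RESPONSE)
values, missing = read_fields(payload, ["invoice_number", "total_amount", "due_date"])
# values holds invoice_number and total_amount; missing is ["due_date"]
```

Reporting missing fields explicitly, rather than raising on the first absent key, lets a workflow route incomplete documents to manual review.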
The mobile SDK extends extraction to camera-based capture workflows. It includes real-time document detection, live preview, blur detection, camera stabilization, and long receipt stitching for multi-section receipts. These capabilities target retail expense management and field-based identity verification scenarios where documents are photographed rather than scanned.
Use cases
Identity verification onboarding
Financial services firms and sharing economy platforms automate customer identity verification by integrating FormX.ai's ID extraction API into mobile apps and web portals. Users photograph government-issued IDs; the API extracts names, ID numbers, addresses, and expiration dates, returning structured JSON for validation against customer-provided information. The mobile SDK's blur detection and camera stabilization reduce rejection rates from poor-quality captures before they reach the extraction layer.
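The validation step against customer-provided information could look something like the following sketch. The field names (`name`, `id_number`, `expiry_date`) and the normalization rules are assumptions made for this example, not the API's actual output keys, and the sample values are fictional.

```python
import re
from datetime import date


def normalize_name(name: str) -> str:
    """Compare names case-insensitively and ignore repeated whitespace."""
    return " ".join(name.casefold().split())


def normalize_id(id_number: str) -> str:
    """Drop punctuation and spaces so 'AB-123456' matches 'ab123456'."""
    return re.sub(r"[^0-9A-Za-z]", "", id_number).upper()


def verify_identity(extracted: dict, submitted: dict, today: date) -> dict:
    """Check extracted ID fields against what the customer typed in."""
    expiry = date.fromisoformat(extracted["expiry_date"])
    return {
        "name_matches": normalize_name(extracted["name"])
        == normalize_name(submitted["name"]),
        "id_matches": normalize_id(extracted["id_number"])
        == normalize_id(submitted["id_number"]),
        "not_expired": expiry >= today,
    }


# Hypothetical extraction result and form input.
extracted = {"name": "CHAN Tai Man", "id_number": "AB-123456", "expiry_date": "2030-01-31"}
submitted = {"name": "chan  tai man", "id_number": "ab123456"}
checks = verify_identity(extracted, submitted, today=date(2025, 1, 1))
```

Returning one boolean per check, rather than a single pass or fail, makes it easier to tell the user which field to correct.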
Expense management automation
Corporate finance teams process employee expense receipts by submitting images through the receipt API. The system extracts merchant names, transaction dates, amounts, and line items, returning JSON that populates expense management software fields. The 4 to 8 second processing time supports near-real-time submission workflows.
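Populating expense fields from the returned JSON might be sketched as follows. The keys in the sample receipt (`merchant_name`, `date`, `total_amount`, `line_items`) are hypothetical, chosen to mirror the fields named above rather than taken from the receipt API's schema.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass
class ExpenseEntry:
    merchant: str
    transaction_date: date
    total: Decimal
    line_items_total: Decimal

    @property
    def totals_agree(self) -> bool:
        """True when the line items add up exactly to the stated total."""
        return self.total == self.line_items_total


def to_expense_entry(receipt: dict) -> ExpenseEntry:
    """Convert extracted receipt fields (strings) into typed expense fields."""
    items = receipt.get("line_items", [])
    items_total = sum((Decimal(item["amount"]) for item in items), Decimal("0"))
    return ExpenseEntry(
        merchant=receipt["merchant_name"].strip(),
        transaction_date=date.fromisoformat(receipt["date"]),
        total=Decimal(receipt["total_amount"]),
        line_items_total=items_total,
    )


# Hypothetical extraction result for one receipt.
receipt = {
    "merchant_name": "Cafe Central ",
    "date": "2024-05-02",
    "total_amount": "86.50",
    "line_items": [
        {"description": "Lunch set", "amount": "68.00"},
        {"description": "Coffee", "amount": "18.50"},
    ],
}
entry = to_expense_entry(receipt)
```

`Decimal` avoids floating-point rounding on currency amounts. A mismatch between the two totals does not necessarily mean an extraction error, since tax or service charges may appear outside the line items.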
Invoice and banking document processing
Companies process invoices, bank statements, and bills of lading using pre-built extractors that require no coding. The zero-training capability enables rapid deployment for new document types with minimal sample data, making it practical for organizations that encounter occasional document variants outside the standard portfolio.
Pricing
FormX.ai's pricing has four levels. New accounts receive 100 free pages for testing. Pay-as-you-go processing costs $0.30 per page with no monthly commitment. The Starter plan covers 3,000 pages per month for $299. Enterprise plans carry custom pricing and offer private cloud deployment as an option.
The $0.30 per-page rate positions FormX.ai above free-tier competitors but below enterprise-only vendors that require annual contracts. The pay-as-you-go option makes it accessible for teams evaluating the platform on real document volumes before committing to a monthly plan.
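The published prices imply a break-even point between pay-as-you-go and the Starter plan. The sketch below works through that arithmetic using only the figures quoted above; it ignores the 100 free pages and any overage charges beyond the Starter quota, which are not stated here.

```python
import math
from decimal import Decimal

PAYG_PER_PAGE = Decimal("0.30")   # pay-as-you-go price per page
STARTER_MONTHLY = Decimal("299")  # Starter plan monthly fee
STARTER_PAGES = 3000              # pages included in Starter


def payg_cost(pages: int) -> Decimal:
    """Monthly cost of processing `pages` at the pay-as-you-go rate."""
    return PAYG_PER_PAGE * pages


def break_even_pages() -> int:
    """Smallest monthly volume at which pay-as-you-go costs at least the Starter fee."""
    return math.ceil(STARTER_MONTHLY / PAYG_PER_PAGE)


threshold = break_even_pages()  # 997 pages per month
starter_rate = (STARTER_MONTHLY / STARTER_PAGES).quantize(Decimal("0.0001"))
# starter_rate is 0.0997, i.e. about $0.10 per page at full use of the quota
```

Below roughly 997 pages per month, pay-as-you-go is the cheaper option; above it, the Starter plan costs less, provided volume stays within the 3,000-page quota.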
Pay-as-you-go
No monthly commitment. 100 free pages to start.
- REST API access
- All pre-built extractors
- HTTPS encryption
- In-memory processing
Starter
3,000 pages per month. Best for mid-market teams.
- Everything in pay-as-you-go
- Custom model training via web portal
- Zapier, N8N, Google Workspace integrations
- ISO 27001 and SOC 2 Type II compliance
Enterprise
Private cloud deployment available.
- Everything in Starter
- Private cloud on GCP or Azure
- UiPath RPA integration
- Custom volume pricing
Technical specifications
| Feature | Specification |
|---|---|
| Core technology | OCR, NLP, machine learning, GPT-4 integration (private beta) |
| API version | Extraction API v2 with key-value objects |
| Supported formats | JPEG, PNG, PDF (multi-page) |
| Image requirements | Minimum 1000x750 pixels or 100 DPI recommended |
| Output format | Structured JSON; CSV available |
| Overall accuracy | 92% (self-reported) |
| Field-level accuracy | 70%+ all fields; 90%+ on date, time, amount fields |
| Processing speed | 4–8 seconds per receipt (excluding upload) |
| Document types | ID cards, passports, licenses, receipts, invoices, bank statements, business certificates |
| Geographic coverage | Hong Kong, Macau, Singapore, Taiwan identity documents |
| Languages | Chinese Mandarin, English |
| Mobile SDK | Real-time detection, blur detection, camera stabilization, receipt stitching |
| Integrations | UiPath, Zapier, N8N, Google Workspace |
| Deployment | Cloud API (GCP, Azure); private cloud on enterprise plans |
| Pricing | Pay-as-you-go $0.30/page; Starter $299/month (3,000 pages); Enterprise custom |
| Compliance | ISO 27001, SOC 2 Type II |
| Data handling | In-memory processing; no training data reuse without consent |
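The image requirement in the table above can be checked on the client before upload. The following sketch reads the width and height from a PNG header using only the standard library and compares them with the 1000x750 pixel minimum. Accepting either orientation is an assumption made here, and the alternative 100 DPI criterion is not checked.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MIN_LONG_SIDE = 1000   # pixels, from the stated 1000x750 minimum
MIN_SHORT_SIDE = 750


def png_dimensions(data: bytes) -> tuple[int, int]:
    """Read (width, height) from the IHDR chunk at the start of a PNG file."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # IHDR data starts at byte 16: 4-byte big-endian width, then height.
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def meets_minimum_size(width: int, height: int) -> bool:
    """Apply the minimum in either orientation (an assumption, see above)."""
    long_side, short_side = max(width, height), min(width, height)
    return long_side >= MIN_LONG_SIDE and short_side >= MIN_SHORT_SIDE
```

For a file on disk, reading the first 24 bytes is enough: `png_dimensions(open(path, "rb").read(24))`, where `path` is whatever file the caller is about to upload. JPEG and PDF inputs would need a different parser.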
Company information
FormX.ai is headquartered in Hong Kong and targets mid-market and enterprise organizations in financial services, retail, government, and caregiver services across East and Southeast Asian markets. The company does not publicly disclose employee count or funding history.
The vendor's ISO 27001 and SOC 2 Type II certifications, combined with in-memory document processing and explicit opt-in for training data use, position it for regulated industries where data residency and privacy controls are procurement requirements. Private cloud deployment on enterprise plans extends that positioning to organizations that cannot use shared cloud infrastructure.
Resources
- Website
- API Documentation
- Product Updates
- GPT-4 Integration Blog
- IDP Platform Evaluation Guide