Grooper: IDP Software Vendor
Grooper is an intelligent document processing platform combining patented OCR technology with RAG-enabled AI for no-code document automation and data extraction.

Overview
Grooper, developed by BIS, combines multiple OCR engines, natural language processing, and machine learning for document classification and data extraction. The platform uses a patented multi-pass OCR approach in which recognized text is removed between passes to reduce visual interference on subsequent runs. BIS, in operation for more than 35 years, develops Grooper at its Oklahoma headquarters and positions it as a "transparent AI" alternative to black-box IDP solutions, with built-in unit testing and explainable AI capabilities that let business analysts audit extraction decisions without developer involvement.
Grooper 2021 replaced form types with a "first, middle, and last page concept" classification paradigm, directly addressing the false document matches that users had flagged as a competitive weakness. The change earned praise from a President and COO at a software company, who cited it as resolving duplicate document issues. Despite a 4.6/5 user satisfaction rating from 24 verified enterprise users (71% giving 5 stars), Grooper holds 0.6% IDP market share as of February 2026, ranking #32 in the category. For context, that is roughly one-tenth of the share held by either UiPath IXP (6.6%) or ABBYY Vantage (6.2%). Even those top two vendors together control only 12.8% of tracked share, signaling a highly fragmented category with no dominant player.
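The share comparison can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the paragraph; the variable names are illustrative.

```python
# Figures quoted in the paragraph above (percent of tracked IDP market share).
grooper_share = 0.6
uipath_share = 6.6
abbyy_share = 6.2

# Grooper's share relative to each of the two largest vendors.
ratio_to_uipath = round(grooper_share / uipath_share, 3)
ratio_to_abbyy = round(grooper_share / abbyy_share, 3)

# Combined share of the top two vendors, rounded to one decimal place.
top_two_share = round(uipath_share + abbyy_share, 1)

# 71% of 24 reviewers, rounded to whole reviewers.
five_star_reviewers = round(0.71 * 24)

print(ratio_to_uipath, ratio_to_abbyy, top_two_share, five_star_reviewers)
```

Both ratios come out slightly under 0.1, which is the basis for "roughly one-tenth".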
Its verified customer list, which includes Oklahoma DOT, Mercy Hospital System, the U.S. Nuclear Regulatory Commission, and American Airlines Credit Union, points to a public-sector and regulated-industry footprint rather than broad commercial penetration. The platform supports 268 languages and 523 regional cultures with automatic language detection for dates, currency, and numeric formats.
How Grooper Processes Documents
Grooper's processing pipeline begins with document capture from email, scanners, network folders, and enterprise content management systems via CMIS connectivity. Ingested documents then pass through image enhancement, which offers more than 70 commands including bound region detection, segment reprocessing, cellular validation for multi-column layouts, and font pitch detection, before reaching the OCR layer.
The patented multi-pass OCR engine (U.S. patents 10,740,638 and 10,679,089) synthesizes results from multiple OCR engines, removing recognized text between passes so each subsequent run operates on a cleaner image. Handwritten content is handled separately through Intelligent Character Recognition using Microsoft Azure and neural networks that learn from new handwriting styles. Trainable OCR extends coverage to custom fonts not addressed by standard engines.
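The idea of removing recognized text between passes can be illustrated with a toy model. This is a minimal sketch, not Grooper's patented method: the `Region` type, the `make_engine` helper, and the rule that an engine fails on a cluttered image are all assumptions made for illustration, and the region text stands in for real pixels.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Region:
    region_id: int
    text: str   # stands in for the pixels an engine would read
    kind: str   # illustrative labels such as "machine" or "dot_matrix"


Engine = Callable[[List[Region]], Dict[int, str]]


def make_engine(kind: str, max_regions: int) -> Engine:
    """Build a toy engine that reads one kind of text, and only when the
    image holds at most `max_regions` regions (a stand-in for clutter)."""
    def engine(regions: List[Region]) -> Dict[int, str]:
        if len(regions) > max_regions:
            return {}
        return {r.region_id: r.text for r in regions if r.kind == kind}
    return engine


def multi_pass_ocr(regions: List[Region], engines: List[Engine]) -> Dict[int, str]:
    """Run the engines in order, dropping recognized regions between passes."""
    remaining = list(regions)
    recognized: Dict[int, str] = {}
    for engine in engines:
        found = engine(remaining)
        recognized.update(found)
        # "Erase" what was just read so the next pass sees a cleaner image.
        remaining = [r for r in remaining if r.region_id not in found]
    return recognized


page = [
    Region(1, "INVOICE", "machine"),
    Region(2, "Total due", "machine"),
    Region(3, "Acme Corp", "machine"),
    Region(4, "0042-7781", "dot_matrix"),
    Region(5, "1,250.00", "dot_matrix"),
]
machine_engine = make_engine("machine", max_regions=10)
dot_matrix_engine = make_engine("dot_matrix", max_regions=2)

single_pass = dot_matrix_engine(page)  # too cluttered: reads nothing
multi_pass = multi_pass_ocr(page, [machine_engine, dot_matrix_engine])
```

In this toy setup the second engine reads nothing from the full page, but reads both dot-matrix regions once the first pass has cleared the machine-printed text.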
Documents classified using the "first, middle, and last page concept" introduced in the 2021 version route to extraction jobs configured without code. Todd S. from Change Healthcare IT Services describes the result: "One of the key features of Grooper is the ability to identify, interpret, enhance, and extract data from a large number of media types... there wasn't a requirement for a software developer to write code." Users report 98% codeless configuration in practice.
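One way to picture page-position classification is as a separation step: once each page carries a first, middle, or last label, a batch can be split into documents. The source does not describe how Grooper does this internally, so the `split_documents` function and its label strings below are hypothetical.

```python
from typing import List


def split_documents(page_labels: List[str]) -> List[List[int]]:
    """Group page indices into documents from first/middle/last labels."""
    documents: List[List[int]] = []
    current: List[int] = []
    for index, label in enumerate(page_labels):
        if label == "first" and current:
            # A new first page closes any document still open.
            documents.append(current)
            current = []
        current.append(index)
        if label == "last":
            documents.append(current)
            current = []
    if current:
        # The batch ended without a last page; keep the partial document.
        documents.append(current)
    return documents


labels = ["first", "middle", "last", "first", "last", "first", "middle", "middle"]
documents = split_documents(labels)
print(documents)
```

Treating a fresh "first" label as a boundary means a missed "last" page cannot merge two documents into one, which is the kind of false match the paragraph describes.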
Extracted data is validated against configurable rules, including cellular validation that divides multi-column layouts into independent grids, before routing to downstream systems. Exceptions surface for human review; straight-through records flow directly to integrated databases, cloud platforms, and enterprise applications via the Grooper xChange marketplace. The RAG-enabled AI assistant sits across this pipeline, allowing natural language queries against document repositories and integrated databases without requiring a separate search implementation. Teams evaluating open-source alternatives for similar RAG-based document querying may also consider Unstract, which offers a no-code LLM platform with hallucination mitigation for production extraction workflows.
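The validate-then-route step can be sketched generically as follows. Grooper configures such rules without code, so this is only an illustration of the logic; the field names, the rule set, and the `route` function are assumptions.

```python
import re
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]
Rule = Callable[[Record], bool]

# Hypothetical rules for an invoice-like record.
RULES: Dict[str, Rule] = {
    "invoice_number is present":
        lambda rec: bool(rec.get("invoice_number", "").strip()),
    "date is YYYY-MM-DD":
        lambda rec: re.fullmatch(r"\d{4}-\d{2}-\d{2}", rec.get("date", "")) is not None,
    "total is a decimal amount":
        lambda rec: re.fullmatch(r"\d+\.\d{2}", rec.get("total", "")) is not None,
}


def route(record: Record) -> Tuple[str, List[str]]:
    """Return the queue for a record plus the names of any failed rules."""
    failures = [name for name, rule in RULES.items() if not rule(record)]
    if failures:
        return "review", failures         # exception: send to a human
    return "straight_through", []         # clean: send downstream


clean = {"invoice_number": "INV-1001", "date": "2026-02-01", "total": "1250.00"}
broken = {"invoice_number": "", "date": "02/01/2026", "total": "1250.00"}

clean_route = route(clean)
broken_route = route(broken)
```

Returning the failed rule names alongside the queue gives a reviewer the reason a record was held, not just the fact that it was.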
Use Cases
Financial Document Processing
Banks and lending institutions use Grooper to process checks with mixed fonts, handwriting, and security backgrounds. The multi-pass OCR approach captures account numbers, routing information, and written amounts by removing recognized text between passes. American Airlines Credit Union is among the named customer references. Grooper validates data completeness across loan applications and account opening documents, integrating with core banking systems for straight-through processing while routing exceptions for manual review. Organizations evaluating Grooper for banking compliance workflows may also compare Impactsure, a Mumbai-based platform specializing in banking compliance and global trade finance automation.
Healthcare Records Digitization
Healthcare organizations process patient records, insurance claims, and medical correspondence containing typed clinical notes and handwritten physician entries. Mercy Hospital System and Change Healthcare are both listed as customers. The RAG-enabled AI assistant allows clinicians to query patient document history using natural language, searching across document repositories and integrated databases without custom search development.
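The retrieval half of a RAG query can be sketched with bag-of-words cosine similarity. This is a simplified stand-in: the source does not say how Grooper's assistant indexes or ranks documents, the passages are synthetic, and the generation step (sending the prompt to a language model) is omitted.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, passages: List[str], top_k: int = 2) -> List[str]:
    """Return up to top_k passages that share terms with the query."""
    query_vec = Counter(tokenize(query))
    scored = [(cosine(query_vec, Counter(tokenize(p))), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for score, p in scored[:top_k] if score > 0]


def build_prompt(query: str, context: List[str]) -> str:
    """Assemble the text a language model would receive."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


# Synthetic passages, not real records.
passages = [
    "Discharge summary: patient advised to continue physical therapy for knee.",
    "Insurance claim 1182 denied pending prior authorization.",
    "Lab report: cholesterol panel within normal range.",
]
question = "Why was the insurance claim denied?"
hits = retrieve(question, passages)
prompt = build_prompt(question, hits)
```

A production system would typically swap the term counts for learned embeddings, but the retrieve-then-prompt shape stays the same.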
Oil and Gas Document Processing
Energy companies process leases, revenue statements, and division orders with varied layouts and structures. Grooper handles unstructured lease agreements, tabular revenue data, and semi-structured division orders within single workflows. The cellular validation feature divides multi-column layouts into grids for independent processing. Kyle S. from the Oil & Energy sector reports performance improvements of up to 90% for document-intensive operations; a Data Wizard at P&P Oil & Gas Solutions rated the platform 5.0 out of 5.0 on PeerSpot.
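Dividing a multi-column layout into a grid can be pictured as bucketing recognized words by position. The sketch below assumes a uniform grid and word center coordinates. Both are simplifications chosen for illustration, and `assign_to_cells` is a hypothetical name, not a Grooper command.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Word = Tuple[str, float, float]  # (text, x center, y center) in page units


def assign_to_cells(
    words: List[Word],
    page_width: float,
    page_height: float,
    columns: int,
    rows: int,
) -> Dict[Tuple[int, int], List[str]]:
    """Bucket words into (row, column) cells of a uniform grid."""
    cell_width = page_width / columns
    cell_height = page_height / rows
    cells: Dict[Tuple[int, int], List[str]] = defaultdict(list)
    for text, x, y in words:
        # Clamp so words on the far edge land in the last cell.
        col = min(int(x // cell_width), columns - 1)
        row = min(int(y // cell_height), rows - 1)
        cells[(row, col)].append(text)
    return dict(cells)


# Synthetic two-column revenue statement fragment.
words = [
    ("Owner", 50.0, 20.0),
    ("Interest", 250.0, 20.0),
    ("Smith", 60.0, 120.0),
    ("0.125", 260.0, 120.0),
]
cells = assign_to_cells(words, page_width=400.0, page_height=200.0, columns=2, rows=2)
```

Once words are bucketed, each cell can be processed or validated on its own, so text from one column does not bleed into a neighboring one.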
Government and Regulated Industries
Oklahoma DOT, Oklahoma State University, OLERS, and the U.S. Nuclear Regulatory Commission are among Grooper's named public-sector customers. That footprint reflects the platform's on-premises deployment flexibility and transparent AI architecture, both relevant to agencies with strict data governance requirements. The built-in unit testing capability allows government IT teams to validate extraction logic before production deployment. Organizations in this segment that require specialized on-premises document processing with deep regulatory alignment may also evaluate Captova Technologies, which targets government and defense markets with claimed 100+ pages/second processing speeds.
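Testing extraction logic before deployment follows the same pattern as any unit test: fix known inputs and assert the expected outputs. Grooper's testing is built into the platform, so the example below is only an analogy using Python's standard `unittest` module. The permit-number format and the `extract_permit_number` function are invented for illustration.

```python
import re
import unittest
from typing import Optional

# Hypothetical format: two capital letters, a hyphen, six digits.
PERMIT_PATTERN = re.compile(r"Permit\s+No\.?\s*:?\s*([A-Z]{2}-\d{6})")


def extract_permit_number(text: str) -> Optional[str]:
    """Return the permit number following a 'Permit No' label, if any."""
    match = PERMIT_PATTERN.search(text)
    return match.group(1) if match else None


class ExtractPermitNumberTests(unittest.TestCase):
    def test_reads_labelled_value(self):
        self.assertEqual(
            extract_permit_number("Permit No: OK-004217 issued"), "OK-004217"
        )

    def test_returns_none_when_label_absent(self):
        self.assertIsNone(extract_permit_number("No permit on this page"))

    def test_rejects_malformed_value(self):
        self.assertIsNone(extract_permit_number("Permit No: OK-42"))
```

Saved as a module, the tests run with `python -m unittest`. The negative cases matter as much as the positive one, since a rule that matches too eagerly is what produces false extractions in production.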
Technical Specifications
| Feature | Specification |
|---|---|
| Core Product | Grooper IDP platform |
| OCR Technology | Multi-engine synthesis, patented multi-pass approach |
| Language Support | 268 languages, 523 regional cultures |
| Patents | US 10,740,638 (flexible OCR extraction), US 10,679,089 (OCR systems) |
| Character Recognition | OCR, ICR with neural networks, trainable for custom fonts |
| Image Processing | 70+ commands (bound region detection, cellular validation, font pitch detection) |
| AI Technology | RAG-enabled assistant, NLP, machine learning, K-Means clustering, transparent AI with unit testing |
| PDF Processing | Handles image-based, text-based, and mixed-content PDFs |
| Configuration | No-code visual automation (98% codeless per user reports) |
| Integration | CMIS connectivity, Grooper xChange marketplace, cloud platforms, databases |
| Document Capture | Email, scanners, network folders, enterprise systems |
| Development | 100% developed in USA (Oklahoma headquarters) |
| Market Position | 0.6% IDP market share (#32), 4.6/5 user satisfaction (24 verified users) |
Resources
- Website
- BIS Website
- Patented OCR Technology
- Grooper 2024: AI and IDP White Paper
- Press Releases
- Competitive Analysis: Grooper vs. ABBYY, Tungsten, and Microsoft
Company Information
Developer: BIS, Inc.
Headquarters: Edmond, Oklahoma, USA
In operation: Over 35 years
Phone: 1-800-662-0777
Email: info@grooper.com
Contact: Contact Page