An IDP solution is one of several tools an AI agent might call on to execute an E2E document-based process. However, could you replace an IDP ‘tool’ with a large language model (LLM) like ChatGPT or Claude?
AI models have typically required significant upfront training, with employees manually annotating many documents. However, the latest LLMs have shown strong performance in smaller-scale use cases, using their native understanding and reasoning capabilities to extract the correct data without any task-specific training. Yet larger, enterprise-scale processes demand far more rigor and reliability.
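To make "no training" concrete, here is a minimal, illustrative sketch of zero-shot extraction in Python: the prompt alone defines the task, with no annotated training documents involved. The `call_llm` wrapper, the invoice fields, and the function names are assumptions for illustration, not a reference to any particular product or API.

```python
# Illustrative sketch only: `call_llm` stands in for whichever LLM provider's
# client you use. The prompt alone defines the extraction task.
import json


def call_llm(prompt: str) -> str:
    """Hypothetical wrapper around your LLM provider's chat/completion API."""
    raise NotImplementedError("Swap in your provider's client here.")


def extract_invoice_fields(document_text: str) -> dict:
    """Zero-shot extraction: ask the model for specific fields as JSON."""
    prompt = (
        "Extract the invoice number, invoice date, and total amount from the "
        "document below. Respond with a JSON object containing the keys "
        '"invoice_number", "invoice_date", and "total_amount".\n\n'
        f"{document_text}"
    )
    return json.loads(call_llm(prompt))
```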
LLMs excel at creative, unstructured work, but they struggle to maintain accuracy over time. If an agent calls on an LLM to extract specific information from a complex document, it might succeed on the first few attempts. Mistakes, however, are inevitable. The model might hallucinate an incorrect output, and without monitoring capabilities you have no way of knowing unless you manually review every document. At that stage, you might as well be processing them all manually.

It's also difficult to get consistent, structured outputs from LLMs. That usually takes many hours of trial-and-error prompt engineering, and even then there's no guarantee the model won't hallucinate or deviate from the format you've asked for. Chat-based LLMs are ideal for ad-hoc work, where there's plenty of flexibility and a consistent output isn't always required. But out of the box, they don't provide the confidence or reliability an enterprise needs for high-volume, repeatable document extraction without significant tuning. When you're in a business setting, processing thousands of documents toward the exact same goal, you need reliable, repeatable, and structured outputs. The challenge is to take models that are non-deterministic by their very nature and turn them into more deterministic, predictable tools for repeatable processes.
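As a rough illustration of that challenge, one common mitigation is to validate every response against the expected structure and retry, or escalate to a human, when the model deviates. The sketch below reuses the hypothetical `call_llm` wrapper from the earlier example; the required keys, retry count, and escalation path are assumptions, not a prescribed implementation.

```python
# Sketch: impose structure on a non-deterministic model by validating each
# response and retrying, then escalating to human review on repeated failure.
import json

REQUIRED_KEYS = {"invoice_number", "invoice_date", "total_amount"}


def extract_with_validation(document_text: str, max_attempts: int = 3) -> dict:
    prompt = (
        "Extract the invoice number, invoice date, and total amount from the "
        "document below. Respond with only a JSON object containing the keys "
        '"invoice_number", "invoice_date", and "total_amount".\n\n'
        f"{document_text}"
    )
    for _ in range(max_attempts):
        try:
            result = json.loads(call_llm(prompt))
        except json.JSONDecodeError:
            continue  # malformed output: ask again
        if REQUIRED_KEYS.issubset(result):
            return result  # structurally valid; values still need spot-checking
    # After repeated failures, route the document to manual review rather than
    # silently accepting an unreliable answer.
    raise ValueError("LLM output failed validation; send document to human review.")
```

Even with this kind of guardrail, validation only confirms the shape of the output, not that the extracted values are correct, which is exactly the monitoring gap described above.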