I'm working with a client who receives a massive amount of data from various suppliers, and right now they're stuck doing a lot of manual processing. Most of this data consists of invoices from hundreds of different suppliers, nearly all for the same type of service. Imagine getting cable bills from every cable company across the country, where each one formats its bills completely differently and uses different terms for similar items. Some of these invoices arrive by mail, and people have to sort and scan everything manually, which takes hours. After sorting, someone keys the data into a spreadsheet. That worries me because of human error, though they do have some checks in place to ensure accuracy. They have tried OCR solutions, but the varying formats make mapping the output to specific fields a nightmare. I'm looking for suggestions on how to automate this process or at least make their lives easier. Has anyone dealt with something similar and found a working solution?
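To make the mapping problem concrete, here is a minimal sketch of the normalization step I have in mind, assuming the OCR stage already yields label/value pairs. The field names, the alias table, and the `map_fields` helper are all hypothetical and only there for illustration; the real aliases would have to be collected per supplier.

```python
import re

# Hypothetical alias table: canonical field -> labels different suppliers use for it.
FIELD_ALIASES = {
    "invoice_number": {"invoice no", "invoice number", "bill id", "statement number"},
    "amount_due": {"amount due", "total due", "balance due", "please pay"},
    "due_date": {"due date", "payment due", "pay by"},
}

# Reverse lookup: normalized supplier label -> canonical field.
ALIAS_TO_FIELD = {
    alias: field
    for field, aliases in FIELD_ALIASES.items()
    for alias in aliases
}


def normalize_label(label: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", label.lower())
    return " ".join(cleaned.split())


def map_fields(ocr_pairs: dict[str, str]) -> tuple[dict[str, str], dict[str, str]]:
    """Split OCR label/value pairs into recognized fields and leftovers.

    Leftovers are returned rather than dropped so a person can review them
    and extend the alias table.
    """
    mapped: dict[str, str] = {}
    unmapped: dict[str, str] = {}
    for label, value in ocr_pairs.items():
        field = ALIAS_TO_FIELD.get(normalize_label(label))
        if field is None:
            unmapped[label] = value
        else:
            mapped[field] = value.strip()
    return mapped, unmapped


if __name__ == "__main__":
    sample = {"Invoice No.": " 12345 ", "TOTAL DUE:": "$80.00", "Service Period": "Jan"}
    mapped, unmapped = map_fields(sample)
    print(mapped)
    print(unmapped)
```

The lookup itself is trivial; the pain is that the alias table needs an entry for every wording every supplier uses, and anything unrecognized still lands in front of a person.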
1 Answer
One approach would be to set up Electronic Data Interchange (EDI) arrangements, if the suppliers are open to it. EDI has been around for ages and can run over HTTP(S) for ease of use. If the relationship with the suppliers is contentious, that complicates things, because automating the workflow is much harder without their cooperation. Smaller businesses often assume this is simple; larger enterprises, with years of experience behind them, usually have a more realistic view of the effort involved.

Unfortunately, EDI might not be feasible. Every supplier has its own invoicing system, and pushing them all onto EDI would just create chaos. They would also need an EDI expert to set everything up, and that could become a nightmare, as they've experienced before.