I'm an HVAC estimator in Belgium, and one part of my work is a recurring headache. I receive a specification document (a PDF that can run to over 100 pages) and a bill of quantities in Excel listing around 200 to 400 line items. My job is to extract the technical requirements from the PDF and match them to each line item in the Excel file. This is tedious and time-consuming: hours of repetitive searching and copying.
I've tried using tools like ChatGPT, Gemini, and Claude, but they've all fallen short. They tend to extract the wrong sections, jumble standards together, and provide overly long excerpts instead of concise summaries. Even when I correct one error, new ones seem to arise. I'm looking for solutions from anyone who knows how to tackle this issue effectively. Are there better tools or workflows that can reliably link the specifications in a PDF to the item list in Excel? While I'm not a developer, I'm open to any practical approaches. Additionally, I'm interested in future possibilities where such a system could connect supplier catalogs to streamline sourcing. However, first, I need a way to accurately merge these two documents without errors.
3 Answers
You might want to consider hiring a developer. PDF specs are laid out for human readers rather than for machines, so pulling structured data out of them can be complex. A developer could build a repeatable process for your documents.
If the specs usually come from the same sources in a consistent layout, it may be worth moving away from PDFs altogether. Ask whoever issues the specifications whether they can supply them in a more structured format such as CSV. That would simplify your data extraction greatly.
Extracting data from a CSV is straightforward, but PDFs can be tricky depending on their design. A practical approach might involve first parsing the Excel file to identify keywords, then using those as guides to chunk the PDF accordingly. An AI could potentially handle this if it's set up correctly.
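To make the keyword idea above a bit more concrete, here is a minimal Python sketch, not a finished tool. It assumes the PDF text has already been extracted to a plain string (for example with a library such as pypdf) and that the item descriptions have already been read from the Excel file (for example with openpyxl). The sample spec text, the item list, the function names, and the rule that section headings start with a number like "2.1" are all my own assumptions for illustration; real specs will need their own heading pattern.

```python
import re

# Assumption: a small stopword list; extend it for your own documents.
STOPWORDS = {"the", "and", "for", "with", "shall", "all"}

# Assumption: section headings look like "2.1 Ductwork" at the start of a line.
HEADING = re.compile(r"^(\d+(?:\.\d+)*[ \t]+\S.*)$", re.MULTILINE)


def keywords(text):
    """Return the set of lowercase tokens of 3+ characters, minus stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {t for t in tokens if len(t) >= 3 and t not in STOPWORDS}


def split_sections(spec_text):
    """Chunk the spec into (heading, body) pairs at numbered headings."""
    parts = HEADING.split(spec_text)
    # parts = [preamble, heading1, body1, heading2, body2, ...]
    return [(parts[i].strip(), parts[i + 1].strip())
            for i in range(1, len(parts) - 1, 2)]


def match_items(items, sections):
    """For each item, pick the section sharing the most keywords.

    Returns {item: (heading or None, score)}; a score of 0 means no match,
    so that item should be checked by hand.
    """
    results = {}
    for item in items:
        item_kw = keywords(item)
        best, best_score = None, 0
        for heading, body in sections:
            score = len(item_kw & keywords(heading + " " + body))
            if score > best_score:
                best, best_score = heading, score
        results[item] = (best, best_score)
    return results


# Invented sample data standing in for the extracted PDF text and Excel rows.
SPEC = """1 General
All works comply with the project requirements.
2.1 Ductwork
Galvanized steel ductwork, rectangular, sealed and insulated.
2.2 Circulation pumps
Circulation pumps with variable speed drive and cast iron housing.
"""

ITEMS = [
    "Rectangular galvanized ductwork 400x200",
    "Circulation pump variable speed DN50",
    "Fire damper",
]

for item, (heading, score) in match_items(ITEMS, split_sections(SPEC)).items():
    print(f"{item!r} -> {heading!r} (shared keywords: {score})")
```

The score is only a count of shared words, so treat it as a way to shortlist the likely section for each line item and to flag items with no match, rather than as a reliable answer. Once the right section is isolated, it is short enough to hand to an AI tool for summarizing, which should reduce the mixing of standards from unrelated sections.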

That could really cut down on time! Plus, it would make it easier to automate the extraction and matching process.