AI Prep, Part II: Why Construction Firms Must Start Capturing “All The Other Stuff”
In the past year, AI conversations in the construction industry have evolved from curiosity to urgency. Contractors and owners have experimented with the first wave of AI-enabled tools, many finding them interesting, but not entirely reliable. Those early tests taught an important lesson: you can’t just “switch on” AI.
Before any organization can extract real value from advanced analytics or generative tools, two foundational steps must be in place: (1) the right digital infrastructure and (2) clean, well-organized data. Those are the pillars; put another way, construction firms must build the plumbing before turning on the tap. Companies need to modernize their systems, standardize processes, and get their existing project data into shape; only then can AI deliver the functionality that companies want – and expect.
Now, a new phase of readiness is emerging, and it’s one most construction leaders haven’t yet considered. If the first two steps were about building solid foundations, the next step is about expanding the universe of data you’re preparing to feed into AI. In other words: once your systems are ready and your schedule data is clean, you must begin gathering all the other information: the contextual, adjacent, and often messy data that actually explains why projects go off-track.
A More Mature AI Mindset
Many contractors approach me with a more mature mindset than they had even six months ago. They’re no longer asking, “When will AI be ready?” Instead, they’re asking:
“What do we need to do now so we’re ready when the tools arrive?”
This shift is a direct result of broader exposure to generative AI. As people use ChatGPT and other AI assistants, they’ve learned firsthand that AI is powerful, but only when the inputs are solid.
They’ve also seen what happens when data is incomplete, inconsistent, or flat-out wrong. Now they’re asking the right questions; they understand that preparation matters, and they want to know which sources of information will matter in an AI-driven future. That’s where the conversation takes an important turn.
Construction schedules will always be central to project planning and forecasting. But a schedule alone only tells you what happened, not why it happened.
Anyone who has ever managed a project has seen this scenario: a schedule update in September says the project will finish on October 31. The next update says the new completion date is late January. The cause of that slip rarely appears in the schedule itself. You’ll see the impact in the dates, but the cause lives somewhere else.
That “somewhere else” includes:
- Progress reports
- Coordination logs
- RFI volumes
- Design documents
- Issue analyses
- Handover reports
- Field notes
- Natural-language descriptions of field problems
- Even one-off memos or screenshots
Traditionally, these materials sit in scattered folders, sometimes read once and never referenced again. They’re often unstructured, written in prose, embedded in PDFs, or captured in inconsistent formats. Consequently, they were historically impossible for machine-learning models to process at scale. But that era is coming to a close.
Structured Insights From Unstructured Material
The next generation of AI tools will be capable of extracting structured insights from highly unstructured material. That means models will soon be able to read all of the disparate sources above, then correlate patterns across all of them. This capability is months, not years, away. Tools can already ingest large, unstructured documents and pull out key metrics that humans simply don’t have the bandwidth to extract manually. But these systems are only as valuable as the information you feed them.
So if Step 1 was infrastructure and Step 2 was data cleanliness, Step 3 is to start capturing everything that might matter later.
Companies should begin identifying and recording non-schedule data that reveals stress points in a project. For example:
- Tag the number of RFIs associated with major work packages.
- Capture recurring communication issues noted by project managers.
- Save progress reports, even if they’re narrative instead of numerical.
- Record subcontractor coordination problems at a high level.
- Document early warning signs, not just their schedule impact.
These data sources may seem small or subjective today, but taken together, they form the “why” behind project outcomes. Future AI tools will be able to scan these documents, identify patterns, and alert teams to risks that were previously invisible.
Importantly: not all information is relevant, and companies will need to exercise judgment. But it’s best to err on the side of collecting more rather than less, because AI will play an integral role in determining what truly matters.
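For teams wondering what "capturing" could look like in practice, here is a minimal sketch in Python. It assumes a simple in-house log in which each piece of non-schedule information is stored as one record tagged with a work package. The `ContextRecord` type, its field names, the `kind` labels, and the sample entries are all hypothetical, invented purely for illustration; they are not taken from any particular product or project.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ContextRecord:
    """One piece of non-schedule information tied to a work package.

    Every field name here is an assumption made for this sketch.
    """
    work_package: str   # hypothetical identifier for a major work package
    kind: str           # e.g. "rfi", "progress_report", "coordination_issue"
    recorded_on: date   # when the item was logged
    note: str           # free text; narrative is stored as written


def count_by_work_package(records, kind):
    """Count how many records of the given kind each work package has."""
    return Counter(r.work_package for r in records if r.kind == kind)


# Made-up sample entries, for illustration only.
records = [
    ContextRecord("WP-03 Facade", "rfi", date(2024, 9, 2),
                  "Anchor detail unclear on drawing set"),
    ContextRecord("WP-03 Facade", "rfi", date(2024, 9, 9),
                  "Conflicting panel dimensions"),
    ContextRecord("WP-07 MEP Rough-in", "rfi", date(2024, 9, 10),
                  "Duct routing clashes with beam"),
    ContextRecord("WP-03 Facade", "coordination_issue", date(2024, 9, 12),
                  "Subcontractor crews double-booked on scaffold"),
]

rfi_counts = count_by_work_package(records, "rfi")
for work_package, total in rfi_counts.most_common():
    print(f"{work_package}: {total} RFIs")
```

The point of the sketch is the tagging, not the tooling: a spreadsheet with the same four columns would serve just as well. Keeping the note as plain narrative text is deliberate, since the argument above is that future tools should be able to read it as written.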
Prep Work Is Expanding
Even as AI advances at record speed, the fundamental truth hasn’t changed: you cannot adopt AI overnight. What has changed is the scope of preparation. Clean data is still essential. Infrastructure still matters. But now construction firms must start building a thoughtful backlog of relevant, contextual, real-world information that AI can use to explain why delays happen, not just when.
These tools are coming — fast. In 12 to 18 months, they’ll be able to extract meaningful insights from the kinds of documents that used to be useless to machine learning. Companies that start capturing that information now will have a massive head start.
AI prep is no longer just about organizing your spreadsheets. It’s about capturing the full story of your project, so future AI tools can finally tell you what really happened.
###
ABOUT THE AUTHOR
Daniel Hewson is the Data Capability Manager for Elecosoft. He has a strong background in mathematics, computer science, and engineering, with a focus on machine learning and how to apply it to real-world processes, including construction. Follow him on LinkedIn.