What does good data in construction look like? John Millar, technologist at BIM Academy, offers his views.
In many industries, and certainly in construction, data is the new oil. Our industry has moved away from conventional paper-based documentation – the prevalent but deeply flawed currency of valuable information for many decades – towards digitalised approaches, replacing the inherent limitations and troublesome nature of the former with the versatility and reliability of the latter.
Data – aggregations of raw facts that serve as the basis of information when given context – comes in many shapes and sizes. Regardless of a dataset’s specific form, purpose, status or content, there are certain universal characteristics that determine its overall quality and suitability for use.
As part of any holistic, outcomes-focused approach, those involved in the production and management of data should strive to understand these characteristics and maintain awareness of them, so that their data is fit for purpose and adds value in the manner intended. Here, I look at some of these in more detail.
Accuracy and precision
Accuracy is perhaps the most fundamental metric of data quality: a high degree of accuracy is necessary for the data to be fully representative of its subject matter. That matter could be delivery phase data that communicates the design intent, or operational data that reflects the live performance of the built asset and its constituent systems.
This accuracy is critical to the use of the data, and its importance is neatly summarised in the well-known GIGO principle: garbage in, garbage out! Valuable outputs depend on reliable inputs, and no amount of well-designed structure or process in between can compensate for flawed data.
Precision is distinct from accuracy in that it is not the overall conformance to the truth that is being measured, but rather the degree of exactness in this conformance. Take an architectural example: have the building’s dimensions or gross internal floor area values been provided as integers or to two decimal places?
Standards for precision can be prescribed or subjective depending on the needs of the project, but either way, they should be consistently upheld.
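To make the distinction concrete, here is a minimal Python sketch of the two checks side by side. The surveyed value, the reported value and the 0.1 m² tolerance are all assumptions invented for illustration, not figures from any standard or project:

```python
from decimal import Decimal

# Illustrative values only: a surveyed "ground truth" gross internal
# floor area and the value reported in the dataset.
surveyed_gifa_m2 = Decimal("412.37")
reported_gifa_m2 = Decimal("412.40")

# Accuracy: how closely does the reported value conform to the truth?
# The 0.1 m2 tolerance is an assumed project requirement.
ACCURACY_TOLERANCE_M2 = Decimal("0.1")
is_accurate = abs(reported_gifa_m2 - surveyed_gifa_m2) <= ACCURACY_TOLERANCE_M2

# Precision: is the value expressed to the prescribed two decimal places?
# (The exponent of a Decimal records how many decimal places it declares.)
is_precise = reported_gifa_m2.as_tuple().exponent == -2

print(f"accurate: {is_accurate}, precise: {is_precise}")  # both True here
```

A value recorded as 412.4 would still be accurate to within the assumed tolerance, but would fail the precision check – exactly the distinction drawn above.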
Completeness and timeliness
Another fundamental measurement of data quality is completeness: any given dataset at the point of delivery should contain all the required data.
This is accounted for in the ISO 19650 methodology, where the project’s Asset Information Requirements (AIR) precisely define the required content of the built asset’s structured data.
Meanwhile, the project’s gateways/decision points (e.g. the stages of the RIBA Plan of Work 2020) provide a temporal framework for defining how this data should develop over the course of the project’s delivery phase. They also determine where periodic audits can be scheduled to ensure sustained development and commitment to the relevant information requirements.
A deep understanding of these requirements is necessary to ensure that the dataset delivered will be fit for purpose. Missing data points can create exactly the type of issue that the provision of structured data seeks to solve: where data is unavailable (e.g. missing model number, warranty or spare parts data for a faulty boiler), manual investigative work may be needed as part of the job order. This has implications for the total time, cost and carbon invested, particularly where the fault in question is likely to lead to significant disruption and operational downtime.
As always, the right data is needed at the right time to fully realise the benefits of digitalised methodologies. Thus careful consideration of information requirements is key to a successful outcome.
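As a minimal sketch of what a completeness check might look like in practice, the Python below validates asset records against a set of required attributes. The field names and the AIR excerpt are hypothetical, chosen to echo the boiler example above rather than taken from ISO 19650 or any real project:

```python
# Hypothetical excerpt from a project's Asset Information Requirements:
# attributes that every maintainable asset must carry at handover.
REQUIRED_FIELDS = {"model_number", "warranty_expiry", "spare_parts_supplier"}

def find_missing(asset: dict) -> set[str]:
    """Return the required fields that are absent or empty for one asset."""
    return {field for field in REQUIRED_FIELDS if not asset.get(field)}

assets = [
    {"name": "Boiler B-01", "model_number": "XG-200",
     "warranty_expiry": "2027-06-30", "spare_parts_supplier": "Acme Parts"},
    {"name": "Boiler B-02", "model_number": "XG-200"},  # incomplete record
]

for asset in assets:
    missing = find_missing(asset)
    if missing:
        print(f"{asset['name']}: missing {sorted(missing)}")
```

Run at each decision point, a check of this kind turns the periodic audits described above from a manual review into a repeatable, automated gate.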
Relevance and consistency
Although this may seem obvious, it is important to ensure that the data provided is relevant to the needs of those who will be using it.
Relevance is subjective, and the only way to guarantee this quality is through close and sustained engagement with the end user(s). This should include comprehensive discussion of both existing and potential practices.
The production and management of redundant data is wasteful for all involved, and careful elicitation of information requirements ensures that only useful data is produced. Whatever is produced should be consistent with the formats adopted elsewhere, for example in the other datasets the end user relies on for asset management at the portfolio level.
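A lightweight way to enforce that consistency is to validate incoming values against the conventions already used at portfolio level. In the sketch below, the ISO 8601 date convention and the sample values are assumptions for illustration only:

```python
import re

# Assumed portfolio convention: dates are exchanged as ISO 8601 (YYYY-MM-DD).
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")

warranty_expiry_dates = ["2027-06-30", "30/06/2027", "2027-07-15"]

# Flag any value that does not conform to the portfolio-wide format.
inconsistent = [d for d in warranty_expiry_dates if not ISO_DATE.fullmatch(d)]
print(f"non-conforming dates: {inconsistent}")  # -> ['30/06/2027']
```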
Versatility and interoperability
One key benefit of structured data lies in its flexibility and versatility. Freed from the restrictions of unreliable, unsearchable and easily degradable paper-based formats – or a collection of unorganised PDFs on a pen drive – data can flow from one application to another with little or no loss or degradation, and with no need for significant human intervention.
Open data formats, such as IFC and COBie, are widely recognised within the wider ecosystem of digital tools that can be deployed over the entire lifecycle of the built asset, and this enables an impressively wide array of potential use cases. Such versatility adds considerable value for any organisation looking to embrace the digital economy, so the interoperability of the data provided will always be of great importance.
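As a small illustration of that interoperability, the sketch below uses the open-source ifcopenshell library (an assumption: it must be installed separately, and model.ifc is a placeholder path) to read an IFC file and list its spaces. Any conforming tool could read the same entities from the same file:

```python
import ifcopenshell  # open-source IFC toolkit: pip install ifcopenshell

# Placeholder path for this sketch; any valid IFC file will do.
model = ifcopenshell.open("model.ifc")

# Because IFC is an open, documented schema, the spaces written by one
# authoring tool can be read here without any proprietary translation step.
for space in model.by_type("IfcSpace"):
    print(space.GlobalId, space.Name)
```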
With some consideration of these universal characteristics of data quality, all involved in the production and management of data can ensure that the datasets they provide will be reliable, fit for purpose and add maximum value for the projects and clients they serve.