As part of the TARGET-X European project, co-funded by the Horizon Europe programme, BUILT CoLAB partnered with WatchBuilt to develop and refine an advanced solution for automatic construction progress monitoring. The platform combines Building Information Modelling (BIM), Point Cloud Data (PCD), and Artificial Intelligence (AI) to improve the automation, accuracy, and scalability of monitoring processes within real-world construction environments.
A core challenge in this domain is the precise alignment of unstructured 3D point cloud data with structured BIM models to enable reliable as-built vs. as-designed comparisons. Reality capture methods such as laser scanning produce dense point clouds that often contain noise, occlusions, and varying levels of resolution. These limitations, coupled with the dynamic and constantly evolving nature of construction sites, significantly affect the quality and usability of progress tracking solutions.
To overcome these issues, the project focused on automating the recognition of built structures within point cloud data and accurately associating them with corresponding elements in the BIM model. This enables consistent, objective progress assessments while reducing reliance on manual inspection. Recognizing early in the project that traditional approaches to registration and voxel-based quantification struggled with real-world data variability, three major enhancements were introduced and iteratively refined: (1) semantic segmentation of point clouds prior to registration; (2) a key point-based registration strategy to reduce noise and improve geometric alignment; and (3) dynamic voxelization to allow fine-grained element-wise progress reporting.
By incorporating these improvements into a unified pipeline, the resulting platform offers a robust and efficient foundation for construction monitoring at scale. The following sections detail the iterative development process, design rationale, and performance impact of each component.
From Concept to Function: An Iterative Development Journey
The solution developed in the scope of the TARGET-X project aims to automate and improve the accuracy of construction progress monitoring by integrating Building Information Modelling (BIM), Point Cloud Data (PCD), and Artificial Intelligence (AI) into a unified and modular pipeline. Built upon an original concept that compared as-planned (BIM) with as-built (scan) conditions, the platform evolved through iterative development cycles to overcome key limitations observed during deployment on real construction sites.
To move beyond these constraints, the project team implemented a modular architecture that integrates several advanced techniques, each targeting specific challenges encountered in real-world environments. These techniques were iteratively developed and validated across multiple workflow revisions, leading to a robust and adaptable platform capable of supporting real-time or near-real-time construction monitoring.
The final solution is structured around four main stages: semantic segmentation of raw point cloud data, key point-based registration between real and design models, dynamic voxelization for detailed element-wise progress analysis, and the integration of BIM-PCD analysis tools for structured interpretation of results. Each of these components contributes to a pipeline that is more resilient to common field conditions, such as incomplete scans, data noise, and occluded geometry, while also enabling fine-grained tracking of progress at the level of individual construction elements. Figure 1 shows a flowchart covering all the implementation phases.

This diagram represents the outcome of the iterative development process and illustrates the complete workflow implemented in the final platform – from the initial preprocessing of the scanned point cloud (including outlier removal and downsampling) to the final progress estimation stage, through the introduction of semantic segmentation, the generation of key points for both real and BIM-derived point clouds, and the two-step registration process using RANSAC and ICP. Optional filtering is applied before the dynamic voxelization stage, which leads to an updated element-wise progress status. Additionally, although not represented, metadata may be generated depending on availability (e.g., colours, normal values). This modular and logic-driven structure is detailed in the following sections.
1. Initial Alignment and Registration Workflow
The starting point of the development focused on establishing a baseline registration pipeline for aligning reality-captured data with the BIM model. Figure 2 showcases the selected point cloud and BIM model used as examples. This first iteration relied on direct registration between raw point clouds, without any semantic preprocessing or filtering stages.

The real-world scan data, referred to as PCR (Real Point Cloud), and the synthetic point cloud generated from the IFC model, referred to as PCA (Artificial Point Cloud), were aligned through a key point-based registration process. The pipeline began by assessing whether PCR included essential metadata: if scale information was missing, scale adjustment was enabled during registration; if normal values were absent, an approximation was computed. Key point extraction was then applied to both PCR and PCA to reduce computational complexity. Multiple extraction methods were tested, and the GeDi network [1] was ultimately selected for its stability and suitability in noisy point cloud environments. Registration was conducted in two steps: (1) global alignment using RANdom SAmple Consensus (RANSAC) [2], which estimates an initial transformation matrix from spatial correspondences between key points, and (2) local refinement using Iterative Closest Point (ICP) [3], which optimizes the alignment through iterative minimization of point-to-point distances. Figure 3 details the final result obtained through this registration procedure.
[1] F. Poiesi, D. Boscaini, Learning general and distinctive 3D local deep descriptors for point cloud registration, IEEE Transactions on Pattern Analysis and Machine Intelligence, early access, 2022.
[2] M. A. Fischler, R. C. Bolles, Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography, Communications of the ACM 24 (6) (1981) 381–395.
[3] P. J. Besl, N. D. McKay, A method for registration of 3-D shapes, in: Sensor Fusion IV: Control Paradigms and Data Structures, Vol. 1611, SPIE, 1992, pp. 586–606.
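The GeDi descriptor extraction and the RANSAC stage are too involved for a short excerpt, but the ICP refinement step can be sketched compactly. The following is a minimal NumPy illustration, not the project's implementation: brute-force nearest-neighbour correspondences between PCR and PCA, followed by a Kabsch/SVD rigid-transform solve, iterated a fixed number of times.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t mapping src onto dst (Kabsch/SVD)."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    H = (src - c_src).T @ (dst - c_dst)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:            # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, c_dst - R @ c_src

def icp(pcr, pca, iterations=30):
    """Point-to-point ICP: repeatedly match each PCR point to its nearest
    PCA point, then re-solve the rigid transform on those correspondences."""
    src = pcr.copy()
    for _ in range(iterations):
        dists = np.linalg.norm(src[:, None, :] - pca[None, :, :], axis=2)
        matches = pca[dists.argmin(axis=1)]
        R, t = best_rigid_transform(src, matches)
        src = src @ R.T + t
    return src
```

In practice a k-d tree replaces the O(n·m) distance matrix, iteration stops once the mean residual falls below a tolerance, and the RANSAC result seeds the initial pose so that the nearest-neighbour matching starts close to correct.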

Once aligned, both PCA and PCR were voxelized using a fixed voxel grid. Progress estimation was carried out by comparing voxel occupancy: for each structural element defined in the IFC model, the number of intersecting voxels was compared to the element's total voxel count. For instance, if 5 of a beam's 10 voxels were occupied in both PCA and PCR, the beam was considered 50% complete. Although this first version of the workflow established the basic structure, its limitations quickly became evident in real construction scenarios: in particular, the lack of semantic filtering reduced robustness to noise and irrelevant geometry. These shortcomings prompted a redesign of the preprocessing pipeline, leading to the second iteration.
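The fixed-grid occupancy comparison can be illustrated with a short sketch. This is a simplified stand-in for the platform's implementation, assuming a 0.1 m voxel size:

```python
import numpy as np

def voxel_keys(points, voxel_size=0.1):
    """Map each 3D point to the integer index of the voxel cell containing it."""
    return {tuple(idx) for idx in np.floor(points / voxel_size).astype(int)}

def element_progress(pca_element, pcr, voxel_size=0.1):
    """Share of an element's planned (PCA) voxels that are occupied in the scan (PCR)."""
    planned = voxel_keys(pca_element, voxel_size)
    built = voxel_keys(pcr, voxel_size)
    return len(planned & built) / len(planned)
```

A beam whose planned point cloud spans ten voxels, of which five are also occupied in the scan, yields a progress of 0.5, matching the 5-of-10 example above.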
2. Semantic Segmentation and Filtering Integration
The second iteration introduced major improvements in the preprocessing stage, with the goal of reducing noise and aligning only geometrically relevant components. The key was the addition of a semantic segmentation step, allowing the system to classify and isolate structural elements (e.g., walls, slabs, columns) from the raw point cloud prior to registration.
To implement this, the SuperPoint Transformer (SPT) model [4] was adopted, based on its strong benchmark performance on the S3DIS dataset [5] and an inference time efficient enough for near real-time applications. Due to the large size of the raw point clouds, the segmentation process was preceded by a downsampling step to prevent memory-related crashes and speed up processing. This step involved voxel-based downsampling and was integrated directly into the preprocessing pipeline. Figure 4 shows the final results of the raw point cloud pre-processing, including segmentation.
[4] D. Robert, H. Raguet, L. Landrieu, Efficient 3D semantic segmentation with superpoint transformer, in: Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 17195–17204.
[5] I. Armeni, O. Sener, A. Zamir, H. Jiang, I. Brilakis, M. Fischer, S. Savarese, 3D semantic parsing of large-scale indoor spaces, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 1534–1543.
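The voxel-based downsampling that precedes segmentation can be sketched as a centroid-per-voxel reduction. This is a simplified NumPy illustration; the pipeline's actual implementation and voxel size may differ:

```python
import numpy as np

def voxel_downsample(points, voxel_size=0.05):
    """Replace all points falling in the same voxel cell by their centroid."""
    keys = np.floor(points / voxel_size).astype(int)
    _, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = inverse.ravel()
    counts = np.bincount(inverse).astype(float)
    centroids = np.zeros((counts.size, 3))
    for dim in range(3):                 # accumulate per-voxel coordinate sums
        centroids[:, dim] = np.bincount(inverse, weights=points[:, dim]) / counts
    return centroids
```

This reduces a dense scan to at most one representative point per occupied voxel, bounding the memory footprint of the segmentation network regardless of scanner resolution.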

In addition to segmentation, two filtering methods were tested: distance-based and curvature-based filtering. Distance-based filtering ultimately proved more reliable, especially for large-scale scenes, while curvature-based methods had limited impact and were excluded in later iterations. This version of the workflow allowed registration to operate on clean, semantically meaningful subsets of the real point cloud, reducing misalignment and improving the stability of the transformation matrix. The segmentation-filtering-registration pipeline was executed in two rounds: an initial alignment followed by a refined registration on the filtered output. This produced improved results, especially in scenes with occlusions or repetitive geometry. Figure 5 shows the new registration, now considering the effect of the segmentation.
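The distance-based filter can be approximated as keeping only scan points that lie within a tolerance of the design model. This is a simplified brute-force sketch, and the tolerance value is an assumption:

```python
import numpy as np

def distance_filter(pcr, pca, tolerance=0.1):
    """Keep only PCR points whose nearest PCA point lies within `tolerance`."""
    nearest = np.linalg.norm(pcr[:, None, :] - pca[None, :, :], axis=2).min(axis=1)
    return pcr[nearest <= tolerance]
```

Points belonging to scaffolding, machinery, or rubble far from any designed surface are discarded this way before the refined registration round; production code would use a k-d tree rather than the full distance matrix.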

3. Granular Analysis with Dynamic Voxelization
The final iteration focused on increasing the granularity and interpretability of progress estimation by adapting voxelization to the geometry and context of each individual structural element.
Previous versions relied on a fixed voxel resolution, which often failed to represent partial builds accurately or produced misleading results in edge regions and regions affected by occlusion (e.g., scaffolding, rubble). To address this, the platform implemented dynamic voxelization, in which the grid resolution is automatically adjusted based on the scale and shape of each element and the density of surrounding point data.
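One simple way to realize such an adaptive resolution is to derive the voxel size from each element's bounding box. The sketch below is illustrative only; the target cell count and lower bound are assumed parameters, not the project's tuned values:

```python
import numpy as np

def dynamic_voxel_size(element_points, target_cells=10, min_size=0.02):
    """Choose a per-element voxel size so that the element's thinnest
    dimension is resolved into roughly `target_cells` voxels."""
    extent = element_points.max(axis=0) - element_points.min(axis=0)
    thinnest = extent[extent > 0].min()      # ignore degenerate (flat) axes
    return max(thinnest / target_cells, min_size)
```

A 20 cm slab is then voxelized at 2 cm, fine enough to capture partial pours across its thickness, while bulkier elements receive coarser grids that keep computation cheap.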
The progress calculation strategy was updated. Initially, it assumed that if a voxel in the BIM model had no corresponding match in the scan, it was incomplete. However, this approach led to underestimations in occluded areas. To correct for this, a rule-based logic was added: when patterns consistent with structural continuity (e.g., horizontal slab layers) were detected, missing voxels were inferred as completed unless contradicted by other geometric evidence.
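The structural-continuity rule can be illustrated as a small sketch over voxel index sets. The 60% layer-completion threshold here is an illustrative assumption, not the project's tuned value:

```python
def infer_occluded_voxels(planned, built, layer_threshold=0.6):
    """If most planned voxels sharing a z-index (a horizontal layer, e.g. a
    slab) are confirmed in the scan, treat the layer's missing voxels as
    occluded rather than unbuilt, and count the whole layer as complete."""
    layers = {}
    for key in planned:                      # group planned voxels by height
        layers.setdefault(key[2], []).append(key)
    inferred = set(built) & set(planned)
    for keys in layers.values():
        confirmed = sum(k in built for k in keys) / len(keys)
        if confirmed >= layer_threshold:
            inferred.update(keys)            # fill in the occluded gaps
    return inferred
```

A slab layer scanned at 70% occupancy is thus reported as fully built, while a layer with only scattered hits is left untouched, so genuinely unbuilt elements are not inflated.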
The combination of these enhancements led to a significantly more robust and accurate workflow, capable of detailed element-wise tracking and adaptable to varied construction scenarios. Figure 6 shows the final alignment, and Figure 7 shows how the progress can be estimated.



Key Components of the Final Solution
The resulting registration engine combined multiple innovations:
- Semantic segmentation for element-wise filtering and categorization;
- Key point-driven registration using geometrically meaningful features;
- Dynamic voxelization adapted to element geometry and scale;
- Rule-based progress estimation for detecting incomplete or obstructed elements;
- Modular architecture allowing easy integration into broader monitoring platforms.
These components worked together to deliver detailed, automated analysis of as-built vs. as-planned conditions, facilitating accurate tracking and timely detection of deviations.
Real-world Testing and Outlook
The solution was tested on a reference construction site at RWTH Aachen University's Center Construction Robotics, within the 5G Industry Campus Europe. These trials validated the technical robustness and operational value of the registration pipeline under realistic construction site conditions.
The results mark a substantial advance in the automation of construction monitoring workflows. By integrating semantic understanding, adaptive data processing, and AI-driven insights, the solution represents a promising step toward scalable, real-time progress and quality monitoring in the AEC sector.
BUILT CoLAB remains committed to translating such R&D efforts into practical tools that support the digital transformation of the built environment.
