Page 67 - 《武汉大学学报（信息科学版）》 (Geomatics and Information Science of Wuhan University), 2025, No. 6


Vol. 50, No. 6. SUI Baikai et al.: A High-Point Multi-View Damaged Building Detection Method Based on Geometric Prior Constraints. p. 1089

deep-shallow feature synchronization module that aggregates perception and spatial-channel attention. The aim is to enhance deep features, which are rich in semantic information such as building shape and category, and to focus on shallow features, which are rich in spatial information such as building edges, textures, and lines. This addresses the spatial and semantic differences among features from different perspectives. Subsequently, based on a transformer attention network, an instance segmentation model that takes into account Canny edge detection and entropy disorder is proposed, further enhancing the feature expression of damaged buildings and achieving precise detection of damaged structures. The network consists of three modules: a pixel-level feature extraction module, a transformer attention extraction module, and a detection module. The detection results include both the pixel-level segmentation category of each building and its bounding box, providing rich data support for subsequent 2D-to-3D geographic scene matching and mapping. Finally, a geometric constraint-based position matching method using vertical and horizontal field-of-view segmentation is designed. This method projects the detection results onto a 3D geographic information scene. It is tailored to cameras that provide rich angular information, including ground cameras, high-point gimbal cameras, and drone cameras. Using the actual geographic location of the camera and the horizontal and tilt angles of the observed target, it calculates the real geographic location of each detected target, and then performs scene matching and mapping on a 3D model using ray cutting techniques. Results: The experimental results show that the proposed method outperforms current approaches in detecting buildings and damaged structures from multiple perspectives. The mean average precision of bounding boxes (bbox_mAP) and of segmentation (seg_mAP) reach 50.33% and 46.69%, respectively. At an intersection over union (IoU) threshold of 50%, bbox_mAP50 and seg_mAP50 are 83.10% and 81.91%, respectively, showing a significant improvement in detection accuracy. In the case studies, high-point long-distance monitoring cameras are used to capture image data, and instance segmentation of damaged and normal buildings is performed to obtain detection results. The detected buildings are then matched to a 3D geographic scene using the proposed geographic matching method, which uses vertical and horizontal field-of-view segmentation and enables precise matching and mapping of 2D detection results to the 3D geographic scene. Conclusions: The proposed method not only improves the detection of damaged buildings in high-point multi-view images, but also accurately matches the results with the 3D geographic scene, providing technical support for emergency rescue command at disaster sites.
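The IoU threshold behind the mAP50 figures can be illustrated with a minimal sketch. It assumes axis-aligned boxes given as (x_min, y_min, x_max, y_max); the function names `box_iou` and `matches_at_threshold` are hypothetical and are not taken from the paper or from any evaluation toolkit. A full mAP computation (confidence ranking, precision-recall integration) is not shown.

```python
from typing import Tuple

# Axis-aligned box as (x_min, y_min, x_max, y_max); this layout is an assumption.
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Width and height of the overlap rectangle, clamped at zero when disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def matches_at_threshold(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """A prediction counts as a match when its IoU reaches the threshold."""
    return box_iou(pred, truth) >= threshold
```

With a threshold of 0.5, a predicted box must overlap the ground-truth box by at least half of their combined area to count toward bbox_mAP50; the same criterion applied to masks instead of boxes underlies seg_mAP50.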
Key words: high-point monitoring; damaged buildings; multi-view scene; feature alignment; spatial mapping
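The idea of inferring a target's geographic location from the camera position and its horizontal and tilt angles can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: it assumes an ideal pinhole camera, a local east-north-up (ENU) coordinate frame, pan measured clockwise from north, tilt measured downward from the horizon, and it swaps the paper's ray cutting against a 3D model for intersection with a flat ground plane. The function name `pixel_to_ground` and all parameters are hypothetical.

```python
import math
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def pixel_to_ground(u: float, v: float, width: int, height: int,
                    hfov_deg: float, vfov_deg: float,
                    pan_deg: float, tilt_deg: float,
                    cam_enu: Vec3, ground_z: float = 0.0) -> Optional[Vec3]:
    """Cast a ray through pixel (u, v) and intersect it with a flat ground plane.

    Returns the (east, north, up) hit point, or None if the ray never reaches
    the ground in front of the camera.
    """
    # Focal lengths in pixels derived from the horizontal and vertical fields of view.
    fx = (width / 2) / math.tan(math.radians(hfov_deg) / 2)
    fy = (height / 2) / math.tan(math.radians(vfov_deg) / 2)
    # Normalized offsets of the pixel from the image center (x right, y down).
    xc = (u - width / 2) / fx
    yc = (v - height / 2) / fy

    pan, tilt = math.radians(pan_deg), math.radians(tilt_deg)
    # Camera axes expressed in the ENU frame.
    forward = (math.sin(pan) * math.cos(tilt), math.cos(pan) * math.cos(tilt), -math.sin(tilt))
    right = (math.cos(pan), -math.sin(pan), 0.0)
    down = (-math.sin(tilt) * math.sin(pan), -math.sin(tilt) * math.cos(pan), -math.cos(tilt))

    # Viewing ray through the pixel, in ENU coordinates.
    ray = tuple(xc * r + yc * d + f for r, d, f in zip(right, down, forward))
    if ray[2] >= -1e-12:
        return None  # ray is level or points upward: it never hits the ground
    t = (ground_z - cam_enu[2]) / ray[2]
    if t <= 0:
        return None  # ground plane lies behind the camera
    return tuple(c + t * k for c, k in zip(cam_enu, ray))
```

For example, a camera 100 m above the ground that faces north with a 45 degree downward tilt maps its image center to a point about 100 m north of the camera. In a real 3D scene the plane intersection would be replaced by intersection with terrain and building meshes, and the ENU result would be converted to geographic coordinates.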

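Natural disasters such as earthquakes and floods often cause devastating damage to towns and cities, and building structures are often the hardest hit. Detecting and assessing damaged buildings is complex and time-consuming, even for a single building. However, the actual impact of a disaster usually extends far beyond a single building or local area, so efficient large-area assessment methods are urgently needed. Automatically discovering damaged buildings in high-point multi-view remote sensing monitoring data and mapping them to a geographic scene can therefore provide sufficient data for emergency rescue decision-making and planning. High-point monitoring includes ground high points and low-altitude unmanned aerial vehicle high points. Unlike satellite remote sensing imagery, its imaging angles and modes vary, the monitoring images are multi-view, and the spatial distribution characteristics of buildings differ between views, which poses great challenges for detecting and recognizing damaged buildings in high-point multi-view images.

Damage detection has been studied extensively, and scholars have explored it in depth from different perspectives [1-3]. Existing damage detection methods can be broadly divided into supervised and unsupervised learning. With technological development, computer vision and machine learning methods have been widely applied to structural damage detection. Scholars have tried neural network techniques for building damage recognition: for example, Ref. [4] uses a neural network to distinguish undamaged from damaged structures, and Ref. [5] adopts a neural network-based method that assesses building health status by integrating multi-source monitoring data. In addition, many studies focus on detecting building damage caused by natural disasters such as earthquakes. The appearance of earthquake-damaged buildings varies mainly with four parameters: the type of platform and sensor, the quality of the captured images, the image angle, and the type of building damage [6]. Therefore, the performance evaluation of neural network-based models is also mainly based on these four parameters.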
