Page 67 - 《武汉大学学报(信息科学版)》 (Geomatics and Information Science of Wuhan University), 2025, No. 6

Vol. 50, No. 6    SUI Baikai et al.: A High-Point Multi-View Damaged Building Detection Method Based on Geometric Prior Constraints    1089


deep-shallow feature synchronization module that aggregates perception and spatial-channel attention. The
module aims to enhance deep features, which are rich in semantic information such as building shape and
category, and to focus on shallow features, which are rich in spatial information such as building edges,
textures, and lines. It thereby addresses the spatial and semantic differences between features from different
perspectives. Subsequently, based on the transformer attention network, an instance segmentation model that
takes into account Canny edge detection and entropy disorder is proposed, further enhancing the feature
expression of damaged buildings and achieving precise detection of damaged structures. The network consists
of three modules: a pixel-level feature extraction module, a transformer attention extraction module, and a
detection module. The detection results include both the pixel-level segmentation category of each building
and its bounding-box detection information, providing rich data support for the subsequent 2D-to-3D
geographic scene matching and mapping. Finally, a geometric constraint-based position matching method using
vertical and horizontal field-of-view segmentation is designed. This method projects target detection
information onto a 3D geographic information scene. It is tailored for cameras rich in angular information,
including ground cameras, high-point gimbal cameras, and drone cameras. By leveraging the actual geographic
location of the camera and the horizontal and tilt angles of the observed targets, it infers
and calculates the real geographic locations of the detected targets, and then performs scene matching and
mapping on a 3D model using ray cutting techniques. Results: The proposed method outperforms current
approaches in detecting buildings and damaged structures from multiple perspectives. The mean average
precision of bounding boxes (bbox_mAP) and of segmentation (seg_mAP) reach 50.33% and 46.69%, respectively.
At an intersection over union (IoU) threshold of 50%, bbox_mAP50 and seg_mAP50 are 83.10% and 81.91%,
respectively, showing a significant improvement in detection accuracy. In the case studies, high-point
long-distance monitoring cameras
are used to capture image data, and instance segmentation of damaged and normal buildings is performed to
obtain detection results. The detected buildings are then matched to a 3D geographic scene using the proposed
geographic matching method, which uses vertical and horizontal field-of-view segmentation and enables precise
matching and mapping of 2D detection results to the 3D geographic scene. Conclusions: The proposed method not
only effectively improves the detection of damaged buildings in high-point multi-view images, but also
accurately matches the results to the 3D geographic scene, providing technical support for emergency rescue
command at the disaster site.
Key words: high-point monitoring; damaged buildings; multi-view scene; feature alignment; spatial mapping
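
The position matching step summarized in the abstract (camera location plus horizontal and tilt angles of the target, then intersection with the scene) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes a local metric east/north/height frame, a linear split of the horizontal and vertical field of view across image pixels, and a flat ground plane standing in for the 3D model that the paper intersects by ray cutting. The function names `pixel_to_angles` and `ray_to_ground` and all parameters are hypothetical.

```python
import math


def pixel_to_angles(u, v, width, height, pan_deg, tilt_deg, hfov_deg, vfov_deg):
    """Return (azimuth, depression) in degrees for the ray through pixel (u, v).

    Assumptions (not from the paper): azimuth is measured clockwise from north,
    tilt_deg is the depression of the optical axis below the horizon, and the
    field of view is divided linearly across the image.
    """
    azimuth = pan_deg + (u / width - 0.5) * hfov_deg
    # Image rows grow downward, so a larger v means a steeper depression angle.
    depression = tilt_deg + (v / height - 0.5) * vfov_deg
    return azimuth % 360.0, depression


def ray_to_ground(cam_east, cam_north, cam_height, azimuth_deg, depression_deg,
                  ground_height=0.0):
    """Intersect the viewing ray with a flat ground plane.

    Returns (east, north) of the hit point, or None when the ray does not
    point below the horizon. A real system would intersect a 3D model instead.
    """
    if not 0.0 < depression_deg <= 90.0:
        return None
    horizontal_dist = (cam_height - ground_height) / math.tan(
        math.radians(depression_deg))
    az = math.radians(azimuth_deg)
    return (cam_east + horizontal_dist * math.sin(az),
            cam_north + horizontal_dist * math.cos(az))


# Usage: a camera 100 m above the ground, facing east, looking 45 degrees down.
az, dep = pixel_to_angles(960, 540, 1920, 1080, pan_deg=90.0, tilt_deg=45.0,
                          hfov_deg=60.0, vfov_deg=34.0)
print(ray_to_ground(0.0, 0.0, 100.0, az, dep))
```

For the image center in this example, the ray lands roughly 100 m due east of the camera. The flat-ground assumption breaks down in hilly terrain or for targets on building facades, which is presumably why the paper intersects rays with a 3D model instead.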

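    Natural disasters such as earthquakes and floods often cause devastating damage to towns and cities, and building structures are usually among the hardest hit. Detecting and assessing damaged buildings is complex and time-consuming, even for a single building. The actual impact of a disaster, however, typically extends far beyond a single building or a local area, so efficient means of large-area assessment are urgently needed. Automatically finding damaged buildings in high-point multi-view remote sensing monitoring data and mapping them to a geographic scene can therefore provide sufficient data for emergency rescue decision-making and planning. High-point monitoring includes ground-based high points and low-altitude drone high points. Unlike satellite remote sensing imagery, its imaging angles and modes vary, the monitoring images are multi-view, and the spatial distribution characteristics of buildings differ between views, which poses great challenges for detecting and recognizing damaged buildings in high-point multi-view imagery.

    Damage detection has been studied extensively, and scholars have explored it in depth from different angles [1-3]. Existing damage detection methods fall mainly into two categories: supervised learning and unsupervised learning. As technology has advanced, computer vision and machine learning methods have been widely applied to structural damage detection. Researchers have tried neural networks for building damage identification: Ref. [4] uses a neural network to distinguish undamaged from damaged structures, and Ref. [5] uses a neural-network-based method that integrates multi-source monitoring data to assess the health state of buildings. In addition, many studies focus on detecting building damage caused by natural disasters such as earthquakes. The appearance of earthquake-damaged buildings varies mainly with four parameters: the type of platform and sensor, the quality of the captured images, the image angle, and the type of building damage [6]. The performance evaluation of neural-network-based models is therefore also based mainly on these four parameters.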