Page 458 - Journal of Software (《软件学报》), 2025, No. 10
Sun Rui et al.: Text-Image Person Re-identification Method with Implicit Multi-scale Alignment and Interaction 4855
$$L_{\mathrm{CMPM}} = L_{\mathrm{CMPM}}^{v2t} + L_{\mathrm{CMPM}}^{t2v} \tag{11}$$
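Meanwhile, the blocks attended to by different heads in the multi-head attention module may capture redundant and overlapping semantics. To fully mine the fine-grained details in images and text, we want features at different scales to focus on different information. We therefore impose a diversity constraint loss $L_{\mathrm{div}}$ on the features of different scales to avoid information redundancy, as shown in Eq. (12).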
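$$L_{\mathrm{div}} = \sum_{i=1}^{N} \sum_{j=1,\, j \neq i}^{N} \left( \frac{(f_i^{v})^{\top} f_j^{v}}{\lVert f_i^{v} \rVert_2 \, \lVert f_j^{v} \rVert_2} + \frac{(f_i^{t})^{\top} f_j^{t}}{\lVert f_i^{t} \rVert_2 \, \lVert f_j^{t} \rVert_2} \right) \tag{12}$$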
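To make the diversity constraint concrete, the following is a minimal sketch in plain Python. It assumes that $L_{\mathrm{div}}$ is the unweighted sum of pairwise cosine similarities between the features of distinct scales, taken separately for the visual and the textual branch; any normalization factor used by the authors is not modelled. The function names `cosine` and `diversity_loss` and the toy vectors are hypothetical and chosen only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def diversity_loss(visual_feats, text_feats):
    """Assumed form of L_div: sum of cosine similarities over all
    ordered pairs of distinct scales (i != j), for both modalities."""
    n = len(visual_feats)
    assert len(text_feats) == n, "expect one feature per scale in each modality"
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a scale is not compared with itself
            total += cosine(visual_feats[i], visual_feats[j])
            total += cosine(text_feats[i], text_feats[j])
    return total


# Toy features (hypothetical): two scales, two dimensions.
orthogonal = [[1.0, 0.0], [0.0, 1.0]]  # scales attend to different directions
redundant = [[1.0, 0.0], [2.0, 0.0]]   # scales point in the same direction

print(diversity_loss(orthogonal, orthogonal))  # 0.0
print(diversity_loss(redundant, redundant))    # 4.0
```

Under this reading, the loss is lowest when the scale features are mutually orthogonal and grows as they become parallel, which matches the stated goal of pushing different scales toward different information.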
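In addition, we adopt an identity loss $L_{\mathrm{id}}$ that groups pedestrian images or texts by identity, which ensures identity-level matching. It explicitly accounts for the distance between modalities and ensures that the feature representations of the same image/text group cluster tightly in the joint embedding space. Here, $W_{\mathrm{id}}$ is a weight vector that adjusts the importance of different labels, and $\mathrm{GN}(X)$ is the normalized image feature vector obtained by global normalization. The identity loss is expressed as: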
$$L_{\mathrm{id}}(X) = -\log\big(\mathrm{Softmax}(W_{\mathrm{id}} \times \mathrm{GN}(X))\big) \tag{13}$$
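As an illustration of Eq. (13), the sketch below computes a softmax-based identity loss in plain Python. Several details are assumptions rather than statements from the paper: $\mathrm{GN}$ is approximated by L2 normalization, $W_{\mathrm{id}}$ is treated as a matrix with one row per identity, and the loss is read at the index of the ground-truth identity. The names `l2_normalize`, `softmax`, and `identity_loss` are hypothetical.

```python
import math


def l2_normalize(x):
    """Stand-in for GN(X): scale the feature vector to unit L2 norm (assumption)."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / norm for v in x]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]


def identity_loss(x, w_id, label):
    """-log of the softmax probability assigned to the true identity.

    x     : feature vector of one image or text
    w_id  : list of weight rows, one per identity (assumed shape)
    label : index of the ground-truth identity
    """
    gx = l2_normalize(x)
    logits = [sum(w * g for w, g in zip(row, gx)) for row in w_id]
    return -math.log(softmax(logits)[label])


# Toy example (hypothetical): two identities, two-dimensional features.
feature = [3.0, 4.0]
uniform_w = [[0.0, 0.0], [0.0, 0.0]]   # no preference between identities
aligned_w = [[3.0, 4.0], [-3.0, -4.0]]  # row 0 aligned with the feature

print(identity_loss(feature, uniform_w, 0))  # equals log(2) for two identities
print(identity_loss(feature, aligned_w, 0) < identity_loss(feature, aligned_w, 1))  # True
```

With uninformative weights the loss equals the log of the number of identities, and it drops as the weights for the true identity align with the normalized feature.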
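Under the constraints of the cross-modal projection matching loss, the diversity loss, and the identity loss described above, we can obtain distinct semantic-alignment-aware features from images and text. In summary, the final loss function is: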
$$L = L_{\mathrm{CMPM}} + L_{\mathrm{div}} + L_{\mathrm{id}} \tag{14}$$
3 Experimental Results and Analysis
3.1 Datasets and Performance Evaluation Metrics
To verify the effectiveness of the proposed method, we conduct extensive evaluations on three challenging text-to-image person retrieval datasets: CUHK-PEDES, ICFG-PEDES, and RSTPReid.
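CUHK-PEDES [6] is the first dataset dedicated to text-to-image person retrieval. As shown in Figure 6, it contains 40,206 images and 80,412 text descriptions of 13,003 identities. Following the official split, the training set consists of 11,003 identities, 34,054 images, and 68,108 text descriptions. The validation set contains 3,078 images and 6,156 text descriptions, and the test set contains 3,074 images and 6,148 text descriptions; the validation and test sets each cover 1,000 identities.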
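As a quick consistency check on the CUHK-PEDES statistics quoted above, the short Python sketch below sums the three splits and compares them with the dataset totals. The dictionary layout and variable names are hypothetical; the numbers are the ones stated in the text.

```python
# Split statistics as quoted in the text (layout is illustrative only).
splits = {
    "train": {"identities": 11003, "images": 34054, "descriptions": 68108},
    "val": {"identities": 1000, "images": 3078, "descriptions": 6156},
    "test": {"identities": 1000, "images": 3074, "descriptions": 6148},
}

# Sum each statistic across the three splits.
totals = {
    key: sum(split[key] for split in splits.values())
    for key in ("identities", "images", "descriptions")
}
print(totals)
# {'identities': 13003, 'images': 40206, 'descriptions': 80412}

# Ratio of descriptions to images within each split.
ratios = {name: s["descriptions"] / s["images"] for name, s in splits.items()}
print(ratios)
# {'train': 2.0, 'val': 2.0, 'test': 2.0}
```

The split sums reproduce the reported totals, and the quoted counts correspond to two text descriptions per image in every split.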
[Figure 6: four example groups (Ⅰ-Ⅳ) of pedestrian images, each shown with its English text descriptions, for example "A woman in a pink shirt, a pair of blue jean shorts and a pair of gray shoes."]
Figure 6 Pedestrian image-text pairs from the CUHK-PEDES dataset

