Human pose estimation integrating a self-attention mechanism and event point clouds
-
Abstract
Under challenging conditions such as extreme lighting, high-speed motion, and limited computational resources, human pose estimation with frame cameras tends to lose accuracy because of overexposure and motion blur. This work explores human pose estimation based on event point clouds, which exploits the high dynamic range and high temporal resolution of event cameras. The event stream is converted into a point cloud representation and combined with a fixed-time-window sampling strategy, which keeps event preprocessing efficient. On this basis, a point cloud residual multilayer perceptron is fused with self-attention to build a multi-level feature extraction network that maps human joint coordinates from the 3D event space to the 2D image plane. Experiments on the DHP19 event dataset show that our method achieves a 2D mean per joint position error (MPJPE) of 5.91 pixels and a 3D MPJPE of 67.48 mm.
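The preprocessing and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the event layout `[x, y, t, polarity]`, the random resampling to a fixed number of points per window, and the names `events_to_point_clouds` and `mpjpe` are all assumptions made for illustration. The sketch slices an event stream into fixed-time windows, turns each window into a point cloud with a normalized time axis, and computes MPJPE as the mean Euclidean distance between predicted and ground-truth joints.

```python
import numpy as np


def events_to_point_clouds(events, window_us, num_points, rng=None):
    """Split an (N, 4) event array into fixed-time windows.

    Assumed (hypothetical) column layout: [x, y, t, polarity], t in microseconds.
    Returns one (num_points, 4) point cloud per non-empty window.
    """
    events = np.asarray(events, dtype=np.float64)
    if len(events) == 0:
        return []
    rng = np.random.default_rng(0) if rng is None else rng

    # Sort by timestamp so that windows follow the temporal order.
    events = events[np.argsort(events[:, 2], kind="stable")]
    t0 = events[0, 2]
    window_ids = ((events[:, 2] - t0) // window_us).astype(np.int64)

    clouds = []
    for wid in np.unique(window_ids):
        chunk = events[window_ids == wid].copy()
        # Express time relative to the window start, scaled to [0, 1).
        chunk[:, 2] = (chunk[:, 2] - (t0 + wid * window_us)) / window_us
        # Assumption: resample to a fixed point count, drawing with
        # replacement only when the window holds too few events.
        idx = rng.choice(len(chunk), size=num_points,
                         replace=len(chunk) < num_points)
        clouds.append(chunk[idx])
    return clouds


def mpjpe(pred, target):
    """Mean per joint position error: mean Euclidean distance over joints."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return float(np.linalg.norm(pred - target, axis=-1).mean())


if __name__ == "__main__":
    demo_events = np.array([
        [1, 2, 0, 1],
        [3, 4, 50, 0],
        [5, 6, 120, 1],
        [7, 8, 130, 0],
        [9, 10, 199, 1],
    ])
    for cloud in events_to_point_clouds(demo_events, window_us=100, num_points=4):
        print(cloud.shape)
```

The same `mpjpe` function covers both metrics quoted above: with joints of shape `(J, 2)` the result is in pixels, and with joints of shape `(J, 3)` it is in the units of the 3D coordinates, such as millimeters.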