Radu Bogdan Rusu, Jan Bandouch, Franziska Meier, Irfan Essa and Michael Beetz (2009) “Human Action Recognition Using Global Point Feature Histograms and Action Shapes”, in Journal of Advanced Robotics, volume 23, pages 1873–1908, Koninklijke Brill NV, Leiden and The Robotics Society of Japan.
Abstract
This paper investigates the recognition of human actions from three-dimensional (3-D) point clouds that encode the motions of people acting in sensor-distributed indoor environments. Data streams are time sequences of silhouettes extracted from cameras in the environment. From the 2-D silhouette contours we generate space–time streams by continuously aligning and stacking the contours along the time axis as the third spatial dimension. The space–time stream of an observation sequence is segmented into parts corresponding to subactions using a pattern matching technique based on suffix trees and interval scheduling. Then, the segmented space–time shapes are processed by treating the shapes as 3-D point clouds and estimating global point feature histograms for them. The resultant models are clustered using statistical analysis, and our experimental results indicate that the presented methods robustly derive different action classes. This holds despite large intra-class variance in the recorded datasets due to performances from different persons at different time intervals.
© Koninklijke Brill NV, Leiden and The Robotics Society of Japan, 2009

Overview of the approach.
Keywords: Action recognition, point cloud, global features, action segmentation
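
The contour-stacking step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`centroid`, `stack_contours`), the centroid-based alignment, and the `dt` time-scaling parameter are all assumptions made for illustration. The abstract only states that contours are aligned and stacked along the time axis.

```python
from typing import List, Sequence, Tuple

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]


def centroid(contour: Sequence[Point2D]) -> Point2D:
    """Mean of the contour points (a simple, assumed stand-in for alignment)."""
    n = len(contour)
    return (sum(x for x, _ in contour) / n, sum(y for _, y in contour) / n)


def stack_contours(frames: Sequence[Sequence[Point2D]], dt: float = 1.0) -> List[Point3D]:
    """Stack per-frame 2-D contours into one 3-D space-time point cloud.

    Each contour is translated so that its centroid sits at the origin
    (hypothetical alignment choice), and the frame index times `dt`
    becomes the third coordinate.
    """
    cloud: List[Point3D] = []
    for t, contour in enumerate(frames):
        if not contour:
            continue  # skip frames where no silhouette was extracted
        cx, cy = centroid(contour)
        cloud.extend((x - cx, y - cy, t * dt) for x, y in contour)
    return cloud


# Two toy square "contours"; the second is the first shifted by (5, 5).
square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
shifted = [(x + 5.0, y + 5.0) for x, y in square]
cloud = stack_contours([square, shifted], dt=0.5)
```

Because both toy contours are centred before stacking, the shifted square lands directly above the first one in the cloud, separated only along the time axis. A cloud of this form is the kind of input on which a global shape descriptor could then be estimated.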



AI researchers are interested in building intelligent machines that can interact with people as people interact with each other. Science fiction writers have given us these goals in the form of HAL in 2001: A Space Odyssey and Commander Data in Star Trek: The Next Generation. However, at present, our computers are deaf, dumb, and blind, almost unaware of the environment they are in and of the user who interacts with them. In this article, I present the current state of the art in machines that can see people, recognize them, determine their gaze, understand their facial expressions and hand gestures, and interpret their activities. I believe that building machines with such abilities for perceiving people will take us one step closer to building HAL and Commander Data.