The goal of this project is to detect and mathematically model activity and interaction patterns between humans in a variety of environments, such as offices, corridors, and courtyards. The model should allow interactions to be predicted probabilistically and should indicate when unusual interactions are detected. For instance, a person following another and a group of people meeting by a blackboard should both be recognized as interactions. The model should also be able to represent correlations, for example that following often leads to meetings. (In this project, identities of people are not used in modeling interactions.) The areas of the environment where these interactions occur should also be identified.
This information can be used in applications such as surveillance, and in the design of interiors that facilitate interactions between people. If a robot has knowledge of the interactions in its environment, it can adapt its behavior so that it does not get in the way of interacting people.
The motion detection equipment consists of multiple laser range-finders placed around the edges of the environment. Multiple lasers are used to handle occlusions. The laser range-finders are calibrated with respect to each other using mesh relaxation. The foreground readings from the laser range-finders are clustered and serve as inputs to a particle-filter-based tracker. Since the lasers only measure distances, identities of people are not used. We recorded tracks of people in the Interaction lab and corridors. Interactions in these areas include following and playing ping-pong.
The sequence of positions of a person is converted into a sequence of displacements. These continuous displacements are then discretized into a small set of canonical displacements. The sequences are then segmented, with each segment corresponding to a different type of motion. For instance, a person might walk along a straight line, then turn and walk along a straight line in another direction. In this case, each of the straight-line motions would become a separate segment. Each segment is represented as a probability distribution over the set of canonical displacements.
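The discretization step can be illustrated with a minimal Python sketch. The choice of eight compass headings plus one "stationary" symbol, the `min_step` threshold, and the smoothing constant are all assumptions made for illustration; the project description does not specify the canonical displacement set or how segment boundaries are found, so the sketch only covers turning one already-delimited segment into a distribution.

```python
import math
from collections import Counter

N_DIRECTIONS = 8           # assumed number of canonical headings
STATIONARY = N_DIRECTIONS  # extra symbol for "no significant movement"

def discretize(positions, min_step=0.05):
    """Map consecutive (x, y) positions to canonical displacement symbols."""
    sector = 2 * math.pi / N_DIRECTIONS
    symbols = []
    for (x0, y0), (x1, y1) in zip(positions, positions[1:]):
        dx, dy = x1 - x0, y1 - y0
        if math.hypot(dx, dy) < min_step:
            symbols.append(STATIONARY)
        else:
            # Round the heading to the nearest of N_DIRECTIONS sectors.
            angle = math.atan2(dy, dx) % (2 * math.pi)
            symbols.append(int((angle + sector / 2) // sector) % N_DIRECTIONS)
    return symbols

def segment_distribution(symbols, smoothing=1e-3):
    """Smoothed probability distribution over the canonical displacements."""
    k = N_DIRECTIONS + 1
    counts = Counter(symbols)
    total = len(symbols) + smoothing * k
    return [(counts.get(s, 0) + smoothing) / total for s in range(k)]
```

For example, `discretize([(0, 0), (1, 0), (2, 0), (2, 1)])` yields two steps in heading 0 followed by one step in heading 2. The small smoothing term keeps every entry positive, which is convenient when distributions are later compared with a KL-based measure.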
Distance between people is not always a good indicator of interaction. For instance, desks and chairs in an office are often located close together, so two people sitting at adjacent desks are not necessarily interacting. On the other hand, two people playing ping-pong are separated by a considerable distance, yet their movements are correlated. Thus, in this project we detect interactions by measuring the amount of correlation between people's movements. Two segments are considered to come from interacting persons if their corresponding probability distributions are similar. Similarity is measured using the Kullback-Leibler (KL) distance between the two probability distributions. The entropy of each probability distribution is also used in the detection of interactions: two segments are more likely to come from interacting persons if their probability distributions have high entropy. For instance, segments from stationary people have a low KL distance, but they also have low entropy, unlike segments from ping-pong players, whose movements generate high-entropy distributions.
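The combined KL and entropy test might look like the following sketch. The symmetrized form of the KL distance, the way the two criteria are combined, and the threshold values `max_kl` and `min_entropy` are assumptions for illustration only. The distributions are assumed to be smoothed so that no entry is exactly zero.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence D(p || q); assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def symmetric_kl(p, q):
    """Symmetrized KL distance: D(p || q) + D(q || p)."""
    return kl(p, q) + kl(q, p)

def entropy(p):
    """Shannon entropy in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def likely_interacting(p, q, max_kl=0.5, min_entropy=1.0):
    """Hypothetical decision rule: similar distributions AND both high-entropy."""
    similar = symmetric_kl(p, q) <= max_kl
    varied = min(entropy(p), entropy(q)) >= min_entropy
    return similar and varied

# Two people sitting still: nearly identical, but low-entropy, distributions.
seated_a = [0.97, 0.01, 0.01, 0.01]
seated_b = [0.96, 0.02, 0.01, 0.01]
# Two ping-pong players: similar and spread over many displacement types.
player_a = [0.25, 0.25, 0.25, 0.25]
player_b = [0.30, 0.20, 0.25, 0.25]
```

With these illustrative values, `likely_interacting(seated_a, seated_b)` is `False` because both entropies are low, while `likely_interacting(player_a, player_b)` is `True`.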
Given the above general representation of spatial activity and an automatic method of segmenting tracks into activities, we build Markov models to describe the pattern of behaviors observed in that environment. Updating a model of activities requires a method to check if two tracks correspond to the same kind of activity. We cluster the probability distributions representing activity segments to test if two segments represent the same activity: we assume that the two tracks come from similar activities if their corresponding probability distributions lie in the same cluster. We use hierarchical clustering to generate the activity clusters. The (symmetric) KL distance between two probability distributions is used as the distance metric in the clustering algorithm. At every step of this algorithm, the two probability distributions with the smallest KL distance between them are replaced by their average distribution. This step is repeated until the minimum KL distance among all pairs of distributions exceeds a pre-fixed threshold.
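The merge-until-threshold procedure described above can be sketched as follows. This is a simplified illustration: it assumes an unweighted mean when two distributions are merged, recomputes all pairwise distances on every pass rather than caching them, and uses a hypothetical function name `cluster_activities`. The `symmetric_kl` helper is defined here so that the block runs on its own.

```python
import math
from itertools import combinations

def symmetric_kl(p, q):
    """Symmetrized KL distance; assumes strictly positive (smoothed) entries."""
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))

def cluster_activities(distributions, threshold):
    """Repeatedly merge the closest pair until the minimum distance exceeds threshold."""
    clusters = [list(d) for d in distributions]
    while len(clusters) > 1:
        # Find the pair of cluster representatives with the smallest distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: symmetric_kl(clusters[ij[0]], clusters[ij[1]]),
        )
        if symmetric_kl(clusters[i], clusters[j]) > threshold:
            break
        # Replace the pair by its (unweighted) average distribution.
        merged = [(a + b) / 2 for a, b in zip(clusters[i], clusters[j])]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters

segments = [
    [0.90, 0.05, 0.05],
    [0.88, 0.06, 0.06],
    [0.30, 0.35, 0.35],
    [0.34, 0.33, 0.33],
]
representatives = cluster_activities(segments, threshold=0.1)
```

Here the four example segments collapse into two representative distributions. The identity D(p || q) + D(q || p) = sum((p_i - q_i) * log(p_i / q_i)) is used to compute the symmetric distance in a single pass.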
The states of the Markov model correspond to the activity clusters. The transition probability between two states represents how likely a person is to perform the corresponding activities one after the other. A Markov model is created for every new activity sequence. The transition probabilities are learned by counting transitions between every pair of representative activities in the sequence. We have built such models for the pattern of activity observed in the Interaction lab over a few hours. These activities included playing ping-pong, moving between desks and doors, meetings, and sitting at desks.
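Learning transition probabilities by counting can be sketched in a few lines. The activity labels and the function name below are hypothetical, and each label stands in for the cluster that a segment was assigned to.

```python
from collections import defaultdict

def transition_probabilities(sequence):
    """Estimate first-order Markov transition probabilities from one label sequence."""
    counts = defaultdict(lambda: defaultdict(int))
    for current, following in zip(sequence, sequence[1:]):
        counts[current][following] += 1
    # Normalize each row of counts so that outgoing probabilities sum to 1.
    return {
        state: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for state, row in counts.items()
    }

# Hypothetical sequence of activity-cluster labels for one tracked person.
activities = ["desk", "walk", "meeting", "walk", "desk", "walk", "meeting"]
model = transition_probabilities(activities)
```

In this example, "walk" is followed by "meeting" twice and by "desk" once, so `model["walk"]["meeting"]` is 2/3. States that never appear as a predecessor, or transitions that were never observed, are simply absent from the dictionary.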
We have also attempted to detect anomalous activities and interactions in the tracked data. An anomalous activity is defined as one that occurs with a frequency markedly different from what is observed over a long period. For instance, at the end of a lecture, the number of people exiting the lecture hall is much higher than at other times, and this time could thus be marked as showing anomalous activity. We model the occurrence of each activity as a Poisson distribution. This enables us to compute the expected probability of seeing a particular number of such activities over a given time period. If this probability falls below a pre-fixed threshold, anomalous activity may be flagged.
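A minimal sketch of the Poisson test follows. It assumes the long-run rate of an activity has already been estimated, and the function names, the rate, and the threshold value are illustrative rather than taken from the project.

```python
import math

def poisson_pmf(k, lam):
    """P(K = k) for a Poisson distribution with mean lam > 0, computed in log space."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def is_anomalous(count, rate_per_hour, hours, threshold=0.01):
    """Flag a count whose Poisson probability falls below a pre-fixed threshold."""
    expected = rate_per_hour * hours
    return poisson_pmf(count, expected) < threshold

# Hypothetical long-run rate: 2 exits per hour from a lecture hall.
typical = is_anomalous(count=2, rate_per_hour=2.0, hours=1.0)
burst = is_anomalous(count=10, rate_per_hour=2.0, hours=1.0)
```

With these numbers, observing 2 exits in an hour is unremarkable, while 10 exits in an hour has a probability far below the threshold and would be flagged. Working in log space through `math.lgamma` avoids overflow when counts are large.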
|Video showing two people playing ping-pong. [AVI] (1.5M)|
|Video showing the corresponding laser scans. [MPG] (5.75M)|
|Segmenting of a player's motion track occurs when he moves away to pick up the ball (indicated by a color change). Note that interaction (indicated by a connecting line) is detected only when the ball is in play, i.e., when the players exhibit similar motion patterns. [AVI] (96K) [MPG] (110K)|
This work is supported in part by the DoE RIM grant DE-FG03-01ER45905, and in part by the ONR MURI grant SA3319.