Kanav Kahol Website

Human Motion Analysis: Gesture Annotation

Research Synopsis

We present a novel technique for motion annotation that adapts to a user’s style and vocabulary of basic movements, herein called gestures. Initial testing suggests that software based on this technique could be an effective aid in teaching, documentation, and dissemination of stylized motions, such as dance and sports.

Research Summary

For centuries, choreographers and dancers have created human motion sequences to communicate emotions and storylines. These sequences are part of our cultural heritage and must be conserved for future generations to experience and enjoy. One of the major challenges in teaching and documenting these motion sequences is the lack of a formalized language and notation for generic motion. Such a language and notation system would (1) facilitate teaching and learning of movement styles, (2) permit the writing of universally understood scores of movement, and (3) provide a universal language through which movement specialists (such as choreographers and dancers) from all over the world could communicate. The lack of a widely accepted movement notation has been a major obstacle in the teaching, documentation, and dissemination of dances.

To provide a satisfactory movement notation, it is important to develop a methodology that can capture each individual’s personalized gesture vocabulary, and then use that vocabulary of gestures to analyze and record motion sequences. This approach is based on findings in the psychological literature [6]: music and spoken language have basic units of creation (notes and alphabetic characters, respectively) that are fixed and accepted by a community, whereas movement is created by combining gestures that are creator-specific. Each choreographer has a personal vocabulary of gestures (i.e., elementary units of motion) and concatenates these units of motion into a performance. Using a customized vocabulary facilitates notation, because the motion is annotated in the same vocabulary in which it was conceived by the choreographer, which makes the notation more intuitive and natural. We generate the motion annotation with our gesture segmentation and gesture recognition algorithms.

We developed a hierarchical display for the motion annotation that was consistent with the human body hierarchy. This allowed users to view activity in each of the body segments. Because our gesture recognition scheme analyzes gestures in terms of a sequence of events in the body segments and joints, it is possible to display the entire sequence of events in the body hierarchy. This facilitates comparison and contrast of two motion sequences that are captured while different people perform the same gesture. Such a comparison can be very useful for comparing a student’s performance to a teacher’s performance, and for pointing out exactly how the student might be faltering. It also allows users to compare two gestures and note their similarity.
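
As a rough illustration of the comparison described above, the sketch below aligns a teacher’s and a student’s event sequences for the same gesture and reports where they differ. It is not the recognition algorithm used in this work: the representation of an event as a (segment, action) pair, the sample event names, and the use of Python’s standard difflib alignment are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def compare_event_sequences(teacher, student):
    """Align two event sequences and return (similarity, differences).

    Each event is a hypothetical (body_segment, action) tuple.
    """
    matcher = SequenceMatcher(a=teacher, b=student, autojunk=False)
    # Keep only the aligned spans where the two performances disagree.
    differences = [
        (tag, teacher[i1:i2], student[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
    return matcher.ratio(), differences


# Hypothetical event sequences for one gesture.
teacher = [
    ("upper_body", "raise"),
    ("right_arm", "extend"),
    ("right_wrist", "rotate"),
    ("upper_body", "lower"),
]
student = [
    ("upper_body", "raise"),
    ("right_arm", "extend"),
    ("upper_body", "lower"),
]

similarity, differences = compare_event_sequences(teacher, student)
for tag, expected, performed in differences:
    print(tag, "teacher:", expected, "student:", performed)
```

In this made-up example the student omits the wrist rotation, so the only reported difference is a deletion of that event, which is the kind of feedback a teacher could point to.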

During the training phase, the time taken to perform each sample of a gesture is recorded in order to calculate the average time, and hence the average speed, of performing that gesture. The time required to perform the same gesture in the testing sequence is then compared with this average. This identifies how fast or slowly the gesture is performed, and a message tells the user whether the gesture was performed above, below, or at average speed. Speed is an important parameter in the repeatability of a dance sequence and is indicative of a choreographer’s style.

To display the final annotation, we used the Anvil video annotation software [11]. Anvil is Java-based software that is used for video annotation in a wide range of research areas, including linguistics, human-computer interaction, gesture research, and film studies. It allows a user to define an annotation scheme (based on a finite number of event types), which is then stored as an XML file. The user can then use that scheme to manually annotate video clips by marking events in a set of event tracks. After the user manually annotates a video, that annotation is stored as an XML file, and the event tracks can then be displayed and scrolled simultaneously with the display of the video.
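
The speed check described in this section (comparing the time taken in a test performance with the average from training) can be sketched as follows. The function name, the sample durations, and the 10% tolerance band that counts as "at average speed" are assumptions for illustration; the original work does not state how close to the average a gesture must be.

```python
from statistics import mean


def classify_speed(training_durations, test_duration, tolerance=0.10):
    """Compare a test duration (seconds) with the training average.

    `tolerance` is an assumed band around the average that still counts
    as "at average speed". A shorter duration means a faster gesture.
    """
    average = mean(training_durations)
    if test_duration < average * (1 - tolerance):
        return "above average speed"
    if test_duration > average * (1 + tolerance):
        return "below average speed"
    return "at average speed"


# Hypothetical durations of three training samples of one gesture.
training = [2.0, 2.2, 1.8]
print(classify_speed(training, 1.5))
print(classify_speed(training, 2.05))
print(classify_speed(training, 2.6))
```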

Figure 1. Annotation Board

We created an annotation scheme for each choreographer based on his/her gesture library and applied it to all of the dance sequences he/she created. This annotation scheme consisted of tracks for marking gesture boundaries, events within the body and joint hierarchy, and the time duration of each gesture. The scheme places certain tracks in a parent-child link: if the annotation of the parent track is changed, the software automatically changes the child track’s annotation. For example, if the time duration of an upper-body gesture is changed, the change is automatically applied to the child segments of the upper body in our annotation scheme. Our gesture segmentation and recognition algorithm outputs an XML file that is compatible with the input format of Anvil. This allows a choreographer to open the automatically annotated motion sequence in Anvil and examine the manner in which the gestures were segmented and recognized, based on his/her own profile. The choreographer can then modify the generated score if he/she is not satisfied with the results, and can work with either a 3D data animation display or video data of the recorded performance. See Figure 1 for the main interface of the annotation software.

The top-left window in Figure 1 is the main window for the annotation software. The middle window is the video or 3D display panel. The top-right window is the element window, in which the generated annotation for tracks is displayed and can be modified. The bottom window is the main annotation board. The blue column on the left of the annotation board lists the tracks for which annotation is generated. The annotation board’s timeline, which the user can navigate with the navigation buttons in the main annotation window or by clicking on a particular point in the timeline, is also colored blue. For a given motion sequence, the gesture sequence is marked. For each gesture, the event sequence (in the segments and joints, as determined by our gesture recognition algorithm) is shown. These events are also shown as colored columns in their respective tracks, i.e., the joint and segment tracks. The tracks are grouped in a manner that is consistent with the body segment hierarchy. A user can modify any track annotation. If the annotation of the gesture sequence or the event sequence is changed, the software automatically updates all tracks that are linked to the gesture sequence or event sequence track. For example, modifying the length of a gesture in the gesture sequence track automatically modifies the length of the original score and the detailed score. In addition to the duration information for each gesture, the detailed score and original score are displayed in the final motion annotation.
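
The parent-child linking of tracks described above can be pictured with a small sketch. This is not Anvil’s data model or API: the `Track` class, its fields, and the segment names are hypothetical, and the sketch assumes that a change to a parent’s time interval is simply copied to every linked child track.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Track:
    """Hypothetical annotation track holding one gesture interval (seconds)."""
    name: str
    start: float
    end: float
    children: List["Track"] = field(default_factory=list)

    def set_interval(self, start: float, end: float) -> None:
        # Update this track, then propagate the change to all linked children.
        self.start, self.end = start, end
        for child in self.children:
            child.set_interval(start, end)


# Assumed body hierarchy: upper body -> arms -> right wrist.
right_wrist = Track("right_wrist", 1.0, 3.0)
right_arm = Track("right_arm", 1.0, 3.0, [right_wrist])
left_arm = Track("left_arm", 1.0, 3.0)
upper_body = Track("upper_body", 1.0, 3.0, [left_arm, right_arm])

# Changing the duration on the parent track updates every descendant track.
upper_body.set_interval(1.0, 3.5)
for track in (upper_body, left_arm, right_arm, right_wrist):
    print(track.name, track.start, track.end)
```

Editing a child track directly (for example `right_wrist.set_interval(1.2, 3.0)`) leaves the parent untouched in this sketch, which matches the one-way parent-to-child propagation described in the text.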

Publications

K Kahol, P Tripathi, S Panchanathan, “Documenting Motion Sequences: Development of a Personalized Annotation System”, accepted for publication in IEEE Multimedia Magazine.

K Kahol, P Tripathi, S Panchanathan, “Recognizing Whole Body Movements and Gestures through Activities in Human Anatomy”, published at International Conference on Systemics, Cybernetics and Informatics, January 6-9, Hyderabad, India.

K Kahol, P Tripathi, T McDaniel, S Panchanathan, “Hand anatomy based modeling of manual haptic gestures”, submitted for review at First International Conference on Pattern Recognition and Machine Intelligence (PReMI'05), to be held in Kolkata, India.

K Kahol, P Tripathi, S Panchanathan, “Computational Analysis of Mannerism Gestures”, accepted for publication at IEEE IAPR International Conference on Pattern Recognition, to be held in Cambridge, UK.

CUbiC | ASU | ©2005 Kanav Kahol