Tuesday, May 25, 2010

Four*0

PRACTICAL COMPONENT ENTRY

Three*9

WRITTEN/PRACTICAL COMPONENT ENTRY Avatar Presentation @ Autodesk University

Friday, May 21, 2010

Three*8


WRITTEN COMPONENT ENTRY Style Translation [initial notes].
-       style translation is the process of transforming an input motion into a new style while preserving its original content.
-       Our solution learns to translate by analysing differences between performances of the same content in input and output styles (see the sketch after these notes).
-       Style is a vital component of character animation. In the context of human speech, the delivery of a phrase greatly affects its meaning.
-       In basic actions such as locomotion, the difference between a graceful strut and a defeated limp has a large impact on the tone of the final animation.
-       Applications of human animation often require large data sets that contain many possible combinations of actions and styles.
-       A database of normal locomotion could be translated into crouching and limping styles while retaining subtle content variations such as turns and pauses. Our system can be used to extend the capabilities of techniques that rearrange mocap clips to generate novel content.
-       Two motions in different styles typically contain very different poses.
-       Our model doesn’t account for kinematic interaction with the environment, so the raw output may contain small visual artifacts, such as feet sliding on the ground.
-       Such models (as in the paper’s) often require explicit frame correspondences to be solved during the translation process.
-       Our solution draws its main inspiration from retargeting methods, which preserve content, but transform motion to adjust for skeleton size, kinematic constraints, or physics.
-       Style is often difficult to encode in the procedural forms required by these methods.
-       Statistical analysis of input-output relationships is a fundamental tool in nearly all areas of science and engineering. In computer graphics, such relationships have been used to translate styles of line drawings, complete image analogies and so on. Our method is the result of the comprehensive application of these principles to style translation.
-       Blending and morphing approaches extrapolate from motion sequences with linear combinations.
-       Such techniques allow for flexible synthesis of stylised motion, often with intuitive and tuneable parameters.
-       Parametric approaches postulate a generative model of motion.
-       Such general models of content would be needed to enhance a novice dance performance by removing mistakes.
-       In all our evaluations, we acquired data in a mocap lab and processed it with standard commercial tools.
-       Our data set contained various stylised locomotions: normal, limp, crouch, sneak, flap, proud, march, catwalk and inverse. Each style was performed at three speeds: slow, medium and fast.
-       We found it took few iterations (less than ten) to approach a near-optimal solution and a few more for full convergence.
-       Style translation can operate on complex inputs; however, the translation itself must be relatively simple.
-       Our system is intended for applications in which motions are specified by performance.
-       It was not our objective to supplant existing motion synthesis techniques, but rather to present concepts that could be integrated into past and future work.
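
As a side note to the above: the paper’s actual pipeline uses iterative motion warping to find frame correspondences and then fits linear time-invariant models; the following is only a toy sketch of the underlying idea, with hypothetical array shapes, showing how a per-frame linear style map could be fitted from already time-aligned frame pairs by least squares.

```python
import numpy as np

# Toy stand-ins for time-aligned mocap data: rows are frames, columns are
# joint-angle channels. Real data would come from performances of the same
# content captured in the two styles and time-aligned first.
rng = np.random.default_rng(0)
frames_in = rng.normal(size=(200, 30))                   # input style
mix = np.eye(30) + 0.1 * rng.normal(size=(30, 30))
frames_out = frames_in @ mix + 0.05 * rng.normal(size=(200, 30))  # output style

# Learn a per-frame linear translation (input style -> output style)
# by least squares over the aligned frame pairs.
W, *_ = np.linalg.lstsq(frames_in, frames_out, rcond=None)

# Apply the learned translation to a new motion in the input style.
new_motion = rng.normal(size=(50, 30))
translated = new_motion @ W
print(translated.shape)  # (50, 30)
```

A fixed linear map like this carries the content of the input frames through while shifting their style; the paper’s richer model additionally handles the timing differences that a single aligned map cannot.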

E. Hsu, K. Pulli, J. Popovic. “Style Translation for Human Motion”. Massachusetts Institute of Technology, Nokia Research Center.

Thursday, May 20, 2010

Three*7


WRITTEN COMPONENT ENTRY Practical Mocap in Everyday Surroundings [initial notes].
-       commercial mocap systems produce excellent in-studio reconstructions, but offer no comparable solution for acquisition in everyday environments. We present a system for acquiring motion almost anywhere.
-       Experimental results show that even motions that are traditionally difficult to acquire are recorded with ease within their natural settings.
-       Motion data has revolutionised computer animation in the last decade.
-       An entire industry has emerged in support of these activities, and numerous recordings of human performances are available in large motion repositories (e.g. mocap.cs.cmu.edu and www.moves.com).
-       The majority of current acquisition systems inhibit broader use of motion analysis by requiring data collection within restrictive lab-like environments.
-       Recording the activities, routines and motions of a human for an entire day is still challenging.
-       We explore (in this paper) the design of a wearable self-contained system that is capable of recording and reconstructing everyday activities such as walking, biking and exercising.
-       Our system is not the first acoustic-inertial tracker, but it is the first such system capable of reconstructing configurations for the entire body.
-       The best reconstructions are not perfect, but their quality (with the system’s small size and improved versatility) suggests that our system may lead to new applications in augmented reality, human-computer interaction and other fields.
-       Several mocap technologies have been proposed in the last 2 decades.
-       Optical mocap systems track retro-reflective markers or light-emitting diodes placed on the body. Exact 3D marker locations are computed from the images recorded by the surrounding cameras using triangulation methods. Optical mocap is favoured in the computer-animation community and the film industry because of its exceptional accuracy and extremely fast update rates. Disadvantages of this approach are its extreme cost and lack of portability.
-       Image-based systems use computer vision techniques to obtain motion parameters directly from video footage without the use of special markers. These approaches are less accurate than optical systems, but they are more affordable and portable. They also suffer from line-of-sight problems.
-       Mechanical systems require performers to wear exoskeletons. These systems measure joint angles directly (rather than estimating the positions of points on the body) and can record motions almost anywhere. Exoskeletons are uncomfortable to wear for extended periods and impede motion, although these problems are alleviated in some modern systems.
-       Magnetic systems detect positions and orientations using a magnetic field. These systems offer good accuracy and medium update rates with no line-of-sight issues. They are expensive, have high power consumption and are sensitive to the presence of metallic objects in the environment.
-       Inertial mocap systems measure rotation of the joint angles using gyroscopes or accelerometers placed on each body limb. Like mechanical systems, they are portable, but they cannot measure positions and distances directly, which is a problem for applications that must sample the geometry of the environment. The measurements drift by significant amounts over extended time periods. Also, the motion of the root cannot be reliably recovered from inertial sensors alone, though (in some cases) the issue may be alleviated by detecting foot plants.
-       Acoustic systems use the time-of-flight of an audio signal to compute the marker locations. Most current systems are not portable and handle only a small number of markers. With the ‘Bat’ system, an ultrasonic pulse emitter is worn by the user, while multiple receivers are placed at fixed locations in the environment. A system by Hazas and Ward extends ultrasonic capabilities by using broadband signals; Vallidis alleviates occlusion problems with a spread-spectrum approach; Olson and colleagues are able to track receivers without known emitter locations.
-       Hybrid systems combine multiple sensor types to alleviate their individual shortcomings. They aim to improve performance, rather than decrease cost and increase portability.
-       Our system is capable of acquiring motions ranging from biking and driving, to skiing, table tennis and weight lifting.
-       The results are processed at a rate of 10fps and visualised without any post-processing using an automatically skinned mesh.
-       We evaluated the accuracy of our system by comparing it with Vicon’s optical motion capture system, known for its exceptional precision.
-       Optical mocap is able to recover the root transformation without drift.
-       Our sensing capabilities have led us to explore multiple facets of our pose recovery system.
-       Due to inherent physical limitations of our hardware components, we were unable to acquire high impact motions.
-       Other types of motions that we are unable to acquire with the current prototype include interaction between multiple subjects, such as dancing.
-       Our distance measurements depend on the speed of sound, which is affected by temperature and, to a lesser extent, humidity. To obtain a more precise speed of sound, one could use a digital thermometer or a calibration device prior to each capture session (see the sketch after these notes).
-       The most significant limitation of our system is the lack of direct measurements of the root transformation.
-       We have presented a wearable mocap system prototype that is entirely self-contained and capable of operating for extended periods of time in a large variety of environments.
-       We should enrich motion repositories with varied data sets to further understand human motion. Restrictive recording requirements limit the scope of current motion data sets, which prevents the broader application of motion processing. An inexpensive and versatile mocap system would enable the collection of extremely large sets of data. This enhanced infrastructure could then support large-scale analysis of human motion, including its style, efficiency and adaptability.
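
On the speed-of-sound point above: a minimal sketch (not the authors’ code) of how a temperature-corrected speed of sound changes a time-of-flight distance estimate, using the standard dry-air approximation; the flight time here is hypothetical.

```python
import math

def speed_of_sound(temp_c: float) -> float:
    """Approximate speed of sound in dry air (m/s) at temp_c degrees Celsius."""
    return 331.3 * math.sqrt(1.0 + temp_c / 273.15)

def tof_distance(time_of_flight_s: float, temp_c: float = 20.0) -> float:
    """Emitter-to-receiver distance implied by an acoustic time of flight."""
    return speed_of_sound(temp_c) * time_of_flight_s

# The same 5 ms flight reads ~1.72 m at 20 C but ~1.66 m at 0 C,
# a ~6 cm difference from temperature alone.
print(tof_distance(0.005, 20.0), tof_distance(0.005, 0.0))
```

A 20-degree error in the assumed air temperature therefore shifts a 5 ms flight by roughly 6 cm, which is large at marker scale, hence the suggestion of a thermometer or per-session calibration.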

D. Vlasic, R. Adelsberger, G. Vannucci, J. Barnwell, M. Gross, W. Matusik, J. Popovic. “Practical Motion Capture in Everyday Surroundings”. Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology; Mitsubishi Electric Research Laboratories; ETH Zurich.

Three*5

PRACTICAL COMPONENT ENTRY  Main idea ‘line’ work.
-       ‘A group of individuals observe the world around them in an attempt to see what you cannot see’.
-       ‘A group of individuals, each leading a life unrelated to the next, observe what possibilities may lie beneath the skin of humanity and within their own subconscious’.
-       ‘What do you see in this world?’ (tagline)
-       ‘What do you see?’ (tagline)
-       ‘Everyone is different, we all see different things in the world around us – and it is not necessarily in the way that some would think’
-       ‘Trapped in your own little world, you form your own visions of who people may be. It is unlikely that they appear as who they actually are’.
-       ‘Your perception of the world is unique’.
-       ‘A group of people experience the world in different ways; its people, its events and its times’.
-       ‘Everyone’s experience of the world is unique. This act of individuality is evident through their perceptions of people and events. What do you see?’.
-       ‘An individual human being is unique. What they see, and how they experience it, is something that only they may perceive for themselves’.
-       ‘The human condition is one of prejudice, preconception and selfishness. Each of us has a unique view and opinion of each other and our world. What do you see?’.
-       ‘Each individual is unique; what they see and experience is dependent upon their perceptions. What do you see?’.
-       ‘What would it be like to see through someone else’s eyes? Exploring the world through a different set of rules and guidelines – exploring another’s perception of the world through their psyche’.
-       ‘We are all different. Psychologically and physically. What does this mean for the way we see others?’.
-       ‘See something a little different’. (tagline)
-       ‘Something a little different’. (tagline)
-       ‘Do you see what I see?’. (tagline)
-       ‘Escape into my world’. (tagline)
-       ‘Mine’s different to yours’. (tagline)
-       ‘Mine not Yours’. (tagline)
-       ‘We are all different. Our perceptions and experiences are unique. What does this mean for the way we see the world and its people?’.

Three*6


PRACTICAL COMPONENT ENTRY Monthly Milestones notes.
May 2010
-       main idea ‘line/description’ finalised
-       program learning/tests
June 2010
-       story
-       storyboard
-       scout locations
-       test reel
-       concept art
-       program learning/tests
July 2010
-       scout locations
-       concept art
-       refine (finalise) story
-       refine (finalise) storyboard
-       program learning/tests
-       test reel(s)
-       animatic (rough)
August 2010
-       test reel(s)
-       animatic (finalise)
-       concept art
-       program learning/tests
September 2010
-       test reel(s)
-       final concept art
-       program learning/tests
October 2010
-       In Progress due (Creative Practice) – 15/10/10
o   Pre-production complete
o   Storyboard
o   Animatic
o   Concept art
o   Test/examples reels
November 2010
-       filming
December 2010
-       filming
-       edit
January 2011
-       edit (finalise)
February 2011
-       post-production
March 2011
-       post-production
April 2011
-       post-production
May 2011
-       post-production
-       extra filming (winter)
-       re-edit
June 2011
-       post-production
-       extra filming (winter)
-       re-edit
July 2011
-       post-production
-       finalise vfx
-       finalise edit
August 2011
-       sound
September 2011
-       sound
October 2011 – 14/10/11
-       Submission due.

Tuesday, May 18, 2010

Three*4

WRITTEN COMPONENT ENTRY  Low Cost Mocap notes.
-       motion capture, or mocap, is a technique of digitally recording the movements of real beings, usually humans or animals.
-       mocap is considered a better technique than manual keyframing for accurately generating movements for computer animation.
-       3 types:
o   optical motion capture
o   magnetic motion capture
o   electro-mechanical motion capture
-       in this paper, we describe the design and implementation of a low cost motion capture system that requires two low cost calibrated webcams.
-       as all mocap systems involve a tracking phase, we adopt the mean-shift algorithm as the basis of object tracking.
-       the system is unable to handle occlusion.
-       provides input to animation applications, such as Poser.
-       the mean-shift algorithm is one of the tracking techniques commonly used in computer vision research when the motion of the object to be tracked cannot be described by a motion model.
-       the key notion in the mean-shift algorithm is the definition of a multivariate density function with a kernel function ‘K’ over a region in the image.
-       the commonly used kernel functions are the normal, uniform and Epanechnikov kernel functions.
-       amongst the commonly used kernel functions above, the Epanechnikov kernel, whose kernel profile is a uniform distribution, is preferable to the other two.
-       the Bhattacharyya coefficient is used to calculate the similarity measure between the two distributions (see the first sketch after these notes).
-       the necessary pieces of equipment required are two low-cost webcams, two tripods and a calibration frame.
-       each webcam mounted on a tripod must be calibrated prior to any experiments. The current version of the system focuses on capturing movements of the lower part of the body, which requires white circular markers to be put on the following joints: hip, the two upper legs, knees, ankles and feet.
-       to simplify the tracking process we darken the background.
-       subject wears a dark non-glossy tight suit so that the white circular markers can be easily detected.
-       the two webcams are directly connected to a computer via two USB ports.
-       we currently use functions under the Matlab Image Acquisition Toolbox for image acquisition.
-       camera calibration is a step for determining the 3x4 matrix that maps coordinates in the 3D world into the 2D image.
-       in our system, we use a calibration target with two orthogonal faces, each of which has 6 reference points. The calibration target also implicitly defines in the scene a global coordinate system that can be referenced to in some other applications, such as Poser, for graphics rendering.
-       markers were detected via a thresholding process (see the second sketch after these notes).
-       this involves choosing a threshold value ‘t’ from the pixel intensity range of 0 to 255.
-       the threshold value can then be estimated from the flat region in the intensity value histogram.
-       the 9 markers are automatically labelled using a heuristic method.
-       after all 9 markers have been detected, the system labels the top middle marker as marker #1.
-       the assignment of marker numbers is based on the ‘y’ component of the marker coordinates.
-       the mean-shift algorithm is employed to track the 9 white markers independently.
-       there are 3 free parameters that can be set to fine-tune the performance of the mean-shift algorithm:
o   radius
o   threshold value
o   number of histogram bins
-       for the computation of the 3D coordinates of each marker, the two 3x4 matrices obtained above are combined to give 4 linear equations for the detected image coordinates of the marker in the two images. The 3D coordinates of each marker, relative to the implicit global coordinate system defined by the calibration frame, can be estimated using least-squares (see the third sketch after these notes).
-       in every experiment, we tested our system to track the movement of markers over 200 frames.
-       we found that there is a 0.016-second delay between the acquisition of an image by the first webcam and by the second.
-       we found that the radius of the kernel windows is a crucial parameter to the performance of the mean-shift algorithm.
-       an issue has also been reported: a window size that is too large can cause the tracker to become more easily distracted by background clutter, while a window size that is too small can cause the kernel to roam around on a likelihood plateau around the mode, leading to poor object localisation.
-       the system currently captures movement of the lower body only, but it can be further extended to include the upper body.
-       the notion of low cost mocap is important for demonstrating the fundamental idea of motion capture and for providing inputs for various advanced animation applications.
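
Three small sketches of the machinery in these notes, in Python rather than the authors’ Matlab setup. First, the kernel-weighted histogram and Bhattacharyya similarity at the heart of the mean-shift tracker. This is a minimal illustration, not the authors’ implementation; the window size, bin count and test patches are hypothetical:

```python
import numpy as np

def epanechnikov_weights(size: int) -> np.ndarray:
    """Epanechnikov kernel weights over a square window, normalised to sum to 1."""
    ax = np.linspace(-1.0, 1.0, size)
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx**2 + yy**2
    w = np.where(r2 < 1.0, 1.0 - r2, 0.0)
    return w / w.sum()

def weighted_histogram(patch: np.ndarray, bins: int = 16) -> np.ndarray:
    """Kernel-weighted intensity histogram of a square greyscale patch."""
    w = epanechnikov_weights(patch.shape[0])
    idx = np.clip(patch.astype(int) * bins // 256, 0, bins - 1)
    hist = np.bincount(idx.ravel(), weights=w.ravel(), minlength=bins)
    return hist / hist.sum()

def bhattacharyya(p: np.ndarray, q: np.ndarray) -> float:
    """Bhattacharyya coefficient between two discrete distributions (1 = identical)."""
    return float(np.sum(np.sqrt(p * q)))

# Compare a target marker patch with a slightly perturbed candidate patch.
rng = np.random.default_rng(1)
target = rng.integers(0, 256, size=(21, 21))
candidate = np.clip(target + rng.integers(-10, 10, size=(21, 21)), 0, 255)
print(bhattacharyya(weighted_histogram(target), weighted_histogram(candidate)))
```

Mean-shift itself would repeatedly move the candidate window toward the kernel-weighted centroid, stopping once the Bhattacharyya coefficient no longer improves.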
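
Second, marker detection by thresholding, plus the y-ordering labelling heuristic from the notes. This is an assumed reconstruction using scipy’s connected-component labelling; the threshold value and the synthetic frame are hypothetical:

```python
import numpy as np
from scipy import ndimage

def detect_markers(grey: np.ndarray, t: int = 200):
    """Binarise a greyscale frame at threshold t and return the (row, col)
    centroid of each bright connected region (one per white marker)."""
    binary = grey > t
    labels, n = ndimage.label(binary)
    return ndimage.center_of_mass(binary, labels, range(1, n + 1))

# Synthetic frame: dark background with two bright square "markers".
frame = np.zeros((120, 160), dtype=np.uint8)
frame[20:26, 40:46] = 255
frame[80:86, 100:106] = 255

centroids = detect_markers(frame)
# Heuristic labelling, as in the notes: order markers by their 'y' (row)
# coordinate so the topmost marker becomes marker #1.
markers = sorted(centroids, key=lambda rc: rc[0])
print(markers)  # [(22.5, 42.5), (82.5, 102.5)]
```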
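
Third, the least-squares triangulation step: each calibrated camera contributes two linear equations in the marker’s homogeneous 3D position, and stacking the four equations from both views gives a system solved here via SVD. The synthetic camera matrices in the sanity check are hypothetical:

```python
import numpy as np

def triangulate(P1: np.ndarray, P2: np.ndarray, uv1, uv2) -> np.ndarray:
    """Least-squares 3D position of a marker seen at pixel uv1 in camera P1
    and pixel uv2 in camera P2 (P1, P2 are the calibrated 3x4 matrices)."""
    u1, v1 = uv1
    u2, v2 = uv2
    # Each view contributes two linear equations in the homogeneous point X.
    A = np.vstack([
        u1 * P1[2] - P1[0],
        v1 * P1[2] - P1[1],
        u2 * P2[2] - P2[0],
        v2 * P2[2] - P2[1],
    ])
    # Least-squares solution: right singular vector with smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # de-homogenise to (x, y, z)

# Sanity check with two synthetic cameras and a known point.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X = np.array([0.3, -0.2, 4.0, 1.0])
x1, x2 = P1 @ X, P2 @ X
print(triangulate(P1, P2, x1[:2] / x1[2], x2[:2] / x2[2]))  # ~[0.3, -0.2, 4.0]
```

In the system described, this would run once per marker per frame, after the two mean-shift trackers report the marker’s image coordinates in both views.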

R. Budiman, M. Bennamoun, D. Q. Huynh. “Low Cost Motion Capture”. University of Western Australia.