Using machine learning to teach robots to get dressed
In the Siggraph 2018 paper Learning to Dress: Synthesizing Human Dressing Motion via Deep Reinforcement Learning, a Georgia Institute of Technology/Google Brain research team describes how they taught body-shame to an AI, leaving it with an unstoppable compulsion to clothe itself before the frowning mien of God.
The AI uses machine learning tools to "automatically discover robust dressing techniques," and manages to train a reliable get-dressed model despite the high computational expense of simulating cloth.
The paper's fascinating: the secret to getting an AI to get dressed is haptics, a sense of touch that is used to dynamically retune the AI's coordination to adjust to the rippling, slithering, treacherous textiles. The model also incorporates the ripping point of the cloth and penalizes AIs that rend their garments asunder while clothing themselves.
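To make that concrete, here's a minimal sketch of the kind of reward shaping the paper describes: reward progress toward being dressed, reward gentle garment contact, and penalize cloth strain as it nears the fabric's ripping point. All names and constants here are illustrative assumptions, not the paper's actual implementation.

```python
def dressing_reward(task_progress: float,
                    haptic_contact: float,
                    max_cloth_strain: float,
                    tear_threshold: float = 1.0) -> float:
    """Hypothetical reward combining task progress, touch, and a tearing penalty."""
    reward = task_progress              # closer to dressed = better
    reward += 0.1 * haptic_contact      # encourage gentle garment contact
    if max_cloth_strain >= tear_threshold:
        reward -= 10.0                  # garment rent asunder: large penalty
    else:
        # soft penalty that grows as strain approaches the ripping point
        reward -= 0.5 * (max_cloth_strain / tear_threshold) ** 2
    return reward
```

The hard cliff at the tear threshold is what teaches the policy to treat the cloth gingerly rather than yanking it into place.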
In this task, a t-shirt is initialized on the character's shoulders with the character's neck contained within the collar. To randomize the initial garment state, we apply a random impulse force with fixed magnitude to all the garment vertices at the beginning of the simulation. We allow the garment to settle for 1s before the character begins to move.
The first control policy completes the task of moving the right end effector into gripping range of the specified grip feature. The policy attempts to match a given position and orientation target in the garment feature space. Once the error threshold is reached, control transitions to an alignment policy designed to "tuck" the left end effector and forearm under the waist feature of the garment in preparation for dressing the arm. This policy attempts to contain the arm within the triangle formed by the gripping hand and the shoulders. This heuristic approximates the opening of the garment waist feature. In addition, this policy is rewarded for contact with the garment interior and penalized for geodesic distance from a selected point on the interior of the garment. Once interior contact is detected, and the arm is within the heuristic triangle, control is transitioned to the left sleeve dressing controller which attempts to minimize the end effector contact geodesic distance from the end feature of the sleeve and maximize the containment depth of the arm within the sleeve entrance feature. A task vector is provided which indicates the direction the end effector should move to decrease its contact geodesic distance (or points to the garment feature if not in contact). Once the limb has passed a threshold distance through the sleeve, the re-grip controller directs the hands together into position to exchange grip from right hand to left. Once the left hand is within a threshold distance of its gripping target, the grip exchange is triggered and control is transitioned to the second "tuck" control policy with the same purpose and transition criteria as the first. The second sleeve policy is then run to pass the right arm through the right sleeve. At this point, the seventh and final policy is used to guide the character back to its start pose while avoiding garment tearing.
Learning to Dress: Synthesizing Human Dressing Motion via Deep Reinforcement Learning [Alexander Clegg, Wenhao Yu, Jie Tan, C. Karen Liu and Greg Turk/ACM Transactions on Graphics, Vol. 37, No. 6]
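The quoted passage describes seven controllers chained by hand-designed transition criteria. That structure can be sketched as a simple state machine; the names and the transition check below are placeholders for illustration, since the paper's actual controllers are learned neural-network policies.

```python
# Order of the seven policies quoted above, with their roles as comments.
DRESSING_PIPELINE = [
    "grip_right_hand",      # move right end effector to the grip feature
    "tuck_left_arm",        # align left arm under the garment waist feature
    "dress_left_sleeve",    # pull left arm through the left sleeve
    "re_grip",              # exchange grip from right hand to left
    "tuck_right_arm",       # second tuck, same criteria as the first
    "dress_right_sleeve",   # pass right arm through the right sleeve
    "return_to_rest",       # return to start pose without tearing the garment
]

def run_pipeline(policies, transition_reached, max_steps=10_000):
    """Run each policy until its transition criterion fires, in order."""
    step = 0
    for name in policies:
        while not transition_reached(name):
            # one control step of the current policy would execute here
            step += 1
            if step >= max_steps:
                return None  # failed to finish dressing within the budget
    return "dressed"
```

The point of the sketch is that none of the seven policies knows about the others: each just optimizes its own reward until a hand-coded condition hands control to the next.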
(via JWZ)