PERCEIVER-ACTOR: A Multi-Task Transformer for Robotic Manipulation

Helpful background for understanding this paper: the Transformer

Attention Is All You Need

<aside> 💡 Keywords: Transformers, Language Grounding, Manipulation, Behavior Cloning

</aside>

Abstract

<aside> 💡 “A language-conditioned BC agent that can learn to imitate a wide variety of 6-DoF manipulation tasks with just a few demonstrations per task”

</aside>

The goal is to apply the Transformer, which is widely used in NLP and other fields, to robot learning.

transformer structure

Transformers need large datasets, but robot manipulation datasets are limited and expensive to collect. PerAct is therefore proposed as a model for applying Transformers to manipulation tasks.

Input : [Language Goals(English)] + [RGB-D voxel observation]

→ Perceiver Transformer Encoding → Actor Network

Output : [Discretized Action] for next best voxel action
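
The voxel side of the pipeline above can be illustrated with a small sketch. It is not the PerAct implementation: the function names (`point_to_voxel`, `voxel_to_point`, `best_voxel`), the workspace bounds, and the grid size are all hypothetical, and the per-voxel scores are a plain dictionary standing in for whatever the learned actor network would output. Only the translational part of the action is covered.

```python
from typing import Dict, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def point_to_voxel(point: Point, bounds_min: Point, bounds_max: Point,
                   grid_size: int) -> Voxel:
    """Map a 3D point inside the workspace bounds to integer voxel indices."""
    idx = []
    for p, lo, hi in zip(point, bounds_min, bounds_max):
        # Normalise to [0, 1], scale to the grid, and clamp so that points
        # on or beyond the workspace boundary still land in a valid voxel.
        cell = int((p - lo) / (hi - lo) * grid_size)
        idx.append(min(max(cell, 0), grid_size - 1))
    return (idx[0], idx[1], idx[2])


def voxel_to_point(voxel: Voxel, bounds_min: Point, bounds_max: Point,
                   grid_size: int) -> Point:
    """Return the centre of a voxel in workspace coordinates."""
    x, y, z = (
        lo + (v + 0.5) * (hi - lo) / grid_size
        for v, lo, hi in zip(voxel, bounds_min, bounds_max)
    )
    return (x, y, z)


def best_voxel(scores: Dict[Voxel, float]) -> Voxel:
    """Pick the highest-scoring voxel (assumed stand-in for the actor output)."""
    return max(scores, key=scores.get)


if __name__ == "__main__":
    lo, hi, n = (0.0, 0.0, 0.0), (1.0, 1.0, 1.0), 10  # hypothetical workspace
    # Hypothetical scores; in a real agent these would come from the network.
    scores = {(0, 0, 0): 0.1, (2, 5, 9): 0.7, (1, 1, 1): 0.2}
    target = best_voxel(scores)
    print(target, voxel_to_point(target, lo, hi, n))
```

Framing the output as a choice among discrete voxels turns action prediction into a classification problem over grid cells, and the chosen cell is then decoded back into a position for the gripper.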

Introduction

Perceiver-Actor Model

overview image

Overview