Strojové učení pro řízení simulovaných vozidel

Kučera, Jiří

Machine Learning for Driving of Virtual Vehicles

diploma thesis (DEFENDED)

Permanent link

http://hdl.handle.net/20.500.11956/120970

Identifiers

Study Information System: 210814

Referee

Majerech, Vladan

Faculty / Institute

Faculty of Mathematics and Physics

Discipline

Artificial Intelligence

Department

Department of Software and Computer Science Education

Date of defense

14. 9. 2020

Publisher

Univerzita Karlova, Matematicko-fyzikální fakulta

Language

Czech

Grade

Excellent

Keywords (Czech)

Umělá Inteligence, Strojové Učení, Navigace, Simulace

Keywords (English)

Artificial Intelligence, Machine Learning, Navigation, Simulation

Auta ve virtuálních světech jsou typicky ovládána ručně vytvořenými pravidly. Vytváření těchto pravidel je často časově náročné a každá úprava prostředí může výsledné chování narušit. Hlavním cílem této práce je prozkoumat vhodné metody strojového učení a vytvořit jejich prostřednictvím dobře vypadající simulaci aut jezdících po městské silniční síti. Výsledným modelem je neuronová síť přímo ovládající plyn, brzdu a volant auta. Síť je schopná sledovat cestu a vyhýbat se srážkám s ostatními agenty na křižovatkách bez semaforů. Pro trénování jsme použili algoritmus Proximal policy optimization a trénování jsme vylepšili technikami curriculum learning, GAIL, curiosity a behavioral cloning. V experimentech jsme ukázali, že ačkoli výsledné chování není zcela perfektní, je dostatečně dobré pro potenciální použití v simulaci.

Abstract (English)

Cars in virtual worlds are typically controlled by handcrafted rules. Creating such rules is often time-consuming and has to be repeated every time the environment is altered. The goal of this thesis is to explore suitable machine learning techniques and use them to create a good-looking simulation of cars driving on an urban road network. The resulting model is a feedforward neural network that directly controls the throttle, steering, and brake. The network is capable of following its assigned road and avoiding collisions with other agents at crossroads without traffic lights. The model was trained using the Proximal Policy Optimization algorithm, enhanced by GAIL, curiosity, behavioral cloning, and curriculum learning. Our experiments show that the resulting behavior, while not completely perfect, is good enough for potential use in a simulation.
