Multi-agent trading environment for training robust reinforcement learning agents

Mikuláš, Pavel

Multi-agentní burzovní prostředí pro hledání robustních strategií pomocí zpětnovazebního učení

dc.contributor.advisor	Pilát, Martin
dc.creator	Mikuláš, Pavel
dc.date.accessioned	2024-04-08T08:41:10Z
dc.date.available	2024-04-08T08:41:10Z
dc.date.issued	2024
dc.identifier.uri	http://hdl.handle.net/20.500.11956/188486
dc.description.abstract	This thesis presents a comprehensive study of the application of reinforcement learning to algorithmic trading. Its main focus is the generalization properties of various reinforcement learning algorithms, both from the data perspective and in terms of the applicability of the trained agents to real algorithmic trading. To that end, we develop a training environment that takes into account various real-world factors influencing the performance of algorithmic trading strategies. We also experiment with R2D2, the recurrent replay buffer extension of the DQN algorithm; to the best of our knowledge, we are the first to employ this algorithm for the task of algorithmic trading. Each algorithm is evaluated against traditional algorithmic trading strategies, including the buy-and-hold strategy, to demonstrate the superior performance of the reinforcement learning strategies. We also study how the amount of training data and transaction costs influence the generalization of the algorithms to unseen market conditions. We show that transaction costs significantly increase the task complexity and that the R2D2 algorithm outperforms the commonly used baselines, as well as other state-of-the-art reinforcement learning algorithms, in this task.	en_US
dc.description.abstract	Tato práce přináší rozsáhlou studii aplikace zpětnovazebního učení v oblasti algoritmického obchodování. Práce se zaměřuje zejména na to, jak modely zpětnovazebního učení generalizují, jak z pohledu velikosti trénovací množiny, tak z pohledu jejich následného přenesení na reálné finanční trhy. Za tímto cílem vytváříme simulační prostředí zohledňující důležité faktory, které ovlivňují výsledky obchodní strategie při reálném obchodování. V našich experimentech používáme také rozšíření algoritmu DQN, známé jako R2D2, které dosahuje velice slibných výsledků. Pokud je nám známo, je tato práce první, která algoritmus R2D2 aplikuje na oblast algoritmického obchodování. Algoritmy natrénované ve vytvořeném simulačním prostředí následně vyhodnocujeme oproti obvykle užívaným postupům algoritmického obchodování, abychom demonstrovali sílu modelů zpětnovazebního učení. Dále ukazujeme, jak zvyšování transakčních nákladů zvyšuje náročnost trénování vybraných modelů a že algoritmus R2D2 svými výsledky překonává běžné postupy algoritmického obchodování i ostatní modely zpětnovazebního učení v úloze algoritmického obchodování.	cs_CZ
dc.language	English	cs_CZ
dc.language.iso	en_US
dc.publisher	Univerzita Karlova, Matematicko-fyzikální fakulta	cs_CZ
dc.subject	zpětnovazební učení|algoritmické obchodování|generalizace|R2D2|hluboké učení	cs_CZ
dc.subject	reinforcement learning|algorithmic trading|generalization|R2D2|deep learning	en_US
dc.title	Multi-agent trading environment for training robust reinforcement learning agents	en_US
dc.type	diplomová práce	cs_CZ
dcterms.created	2024
dcterms.dateAccepted	2024-02-13
dc.description.department	Department of Theoretical Computer Science and Mathematical Logic	en_US
dc.description.department	Katedra teoretické informatiky a matematické logiky	cs_CZ
dc.description.faculty	Faculty of Mathematics and Physics	en_US
dc.description.faculty	Matematicko-fyzikální fakulta	cs_CZ
dc.identifier.repId	257479
dc.title.translated	Multi-agentní burzovní prostředí pro hledání robustních strategií pomocí zpětnovazebního učení	cs_CZ
dc.contributor.referee	Neruda, Roman
thesis.degree.name	Mgr.
thesis.degree.level	navazující magisterské	cs_CZ
thesis.degree.discipline	Computer Science - Artificial Intelligence	en_US
thesis.degree.discipline	Informatika - Umělá inteligence	cs_CZ
thesis.degree.program	Computer Science - Artificial Intelligence	en_US
thesis.degree.program	Informatika - Umělá inteligence	cs_CZ
uk.thesis.type	diplomová práce	cs_CZ
uk.taxonomy.organization-cs	Matematicko-fyzikální fakulta::Katedra teoretické informatiky a matematické logiky	cs_CZ
uk.taxonomy.organization-en	Faculty of Mathematics and Physics::Department of Theoretical Computer Science and Mathematical Logic	en_US
uk.faculty-name.cs	Matematicko-fyzikální fakulta	cs_CZ
uk.faculty-name.en	Faculty of Mathematics and Physics	en_US
uk.faculty-abbr.cs	MFF	cs_CZ
uk.degree-discipline.cs	Informatika - Umělá inteligence	cs_CZ
uk.degree-discipline.en	Computer Science - Artificial Intelligence	en_US
uk.degree-program.cs	Informatika - Umělá inteligence	cs_CZ
uk.degree-program.en	Computer Science - Artificial Intelligence	en_US
thesis.grade.cs	Výborně	cs_CZ
thesis.grade.en	Excellent	en_US
uk.abstract.cs	Tato práce přináší rozsáhlou studii aplikace zpětnovazebního učení v oblasti algoritmického obchodování. Práce se zaměřuje zejména na to, jak modely zpětnovazebního učení generalizují, jak z pohledu velikosti trénovací množiny, tak z pohledu jejich následného přenesení na reálné finanční trhy. Za tímto cílem vytváříme simulační prostředí zohledňující důležité faktory, které ovlivňují výsledky obchodní strategie při reálném obchodování. V našich experimentech používáme také rozšíření algoritmu DQN, známé jako R2D2, které dosahuje velice slibných výsledků. Pokud je nám známo, je tato práce první, která algoritmus R2D2 aplikuje na oblast algoritmického obchodování. Algoritmy natrénované ve vytvořeném simulačním prostředí následně vyhodnocujeme oproti obvykle užívaným postupům algoritmického obchodování, abychom demonstrovali sílu modelů zpětnovazebního učení. Dále ukazujeme, jak zvyšování transakčních nákladů zvyšuje náročnost trénování vybraných modelů a že algoritmus R2D2 svými výsledky překonává běžné postupy algoritmického obchodování i ostatní modely zpětnovazebního učení v úloze algoritmického obchodování.	cs_CZ
uk.abstract.en	This thesis presents a comprehensive study of the application of reinforcement learning to algorithmic trading. Its main focus is the generalization properties of various reinforcement learning algorithms, both from the data perspective and in terms of the applicability of the trained agents to real algorithmic trading. To that end, we develop a training environment that takes into account various real-world factors influencing the performance of algorithmic trading strategies. We also experiment with R2D2, the recurrent replay buffer extension of the DQN algorithm; to the best of our knowledge, we are the first to employ this algorithm for the task of algorithmic trading. Each algorithm is evaluated against traditional algorithmic trading strategies, including the buy-and-hold strategy, to demonstrate the superior performance of the reinforcement learning strategies. We also study how the amount of training data and transaction costs influence the generalization of the algorithms to unseen market conditions. We show that transaction costs significantly increase the task complexity and that the R2D2 algorithm outperforms the commonly used baselines, as well as other state-of-the-art reinforcement learning algorithms, in this task.	en_US
uk.file-availability	V
uk.grantor	Univerzita Karlova, Matematicko-fyzikální fakulta, Katedra teoretické informatiky a matematické logiky	cs_CZ
thesis.grade.code	1
dc.contributor.consultant	Schmid, Martin
uk.publication-place	Praha	cs_CZ
uk.thesis.defenceStatus	O