CAT Examination of the Student João Manuel Godinho Ribeiro

Title: "Making Friends In the Dark: Ad Hoc Teamwork under Partial Observability"

Date: 28 September 2022

Time: 10:00

Zoom session link: https://videoconf-colibri.zoom.us/j/91847799319

Thesis Abstract:

One of the many aspects of human intelligence is the ability to cooperate with strangers when unexpected circumstances arise. Such impromptu cooperation is studied in the multiagent systems literature under the name of "ad hoc teamwork", in which an autonomous agent is expected to cooperate with unknown teammates in performing unknown tasks, with no pre-coordination or communication protocols available. To identify unknown teammate behavior and intentions, current approaches rely on assumptions that are often infeasible in real-world scenarios, such as the availability of reward signals, visible teammate actions, and full observability of the environment. In this thesis we relax these assumptions and explore how ad hoc agents can cooperate with unknown teammates in performing unknown tasks under partial observability, enabling more robust approaches and pushing ad hoc teamwork toward scenarios involving, for example, cooperation between robots and humans. We start by addressing the sample complexity of existing ad hoc teamwork approaches based on reinforcement learning. We contribute TEAMSTER, a novel model-based approach to ad hoc teamwork that extends the state-of-the-art PLASTIC architecture. We then address the problem of partial observability in planning-based ad hoc teamwork and contribute novel decision-theoretic approaches that do not require the ad hoc agent to have full observability of the teammates' actions or the environment state. Our approaches, named BOPA and ATPO, successfully identify the task and the teammates that the ad hoc agent is interacting with, effectively engaging in ad hoc teamwork. We test our proposed approaches in benchmark simulation scenarios from the multiagent systems literature and in a robotic task involving human-robot teamwork.
In the remainder of the thesis, we propose to combine the ideas behind ATPO and TEAMSTER, leveraging deep neural networks and reinforcement learning to address large-scale problems. This new line of work will extend TEAMSTER to settings with partial observability and ATPO to large-scale domains. Finally, we propose to conclude this thesis research by investigating the problem of communication in ad hoc teamwork.