PhD Thesis Defended!
PhD · Apr 10, 2024
I have successfully defended my thesis titled “Conversational Agents in Human-Machine Interaction: Reinforcement Learning and Theory of Mind in Language Modeling”. This milestone marks the culmination of years of hard work, research, and learning, all of which I am eager to share with you. For those interested, the thesis is available in open-access format on the Sapienza Library System IRIS, an extended abstract is available on ResearchGate, and you can watch the presentation on YouTube.
TL;DR
My doctoral thesis tackles the critical challenges and advancements in human-computer interaction, particularly focusing on how large language models (LLMs) can align with human values and societal needs. The thesis scrutinizes whether these AI models, widely used in sectors like medicine and economics, can act autonomously yet remain consistent with desired human outcomes.
Key Points:
- Human-Centric AI: The research highlights the shift towards AI that supports rather than replaces human efforts, emphasizing the need for AI systems to be aware of their societal impacts.
- Methodology: The investigation unfolds in three phases:
  - Agency in AI: Using a modified version of “The Werewolf” game to test whether AI agents can independently exhibit strategic behavior.
  - Human-Like Communication: Exploring reinforcement learning to enable AI to communicate in human-understandable language, despite challenges arising from knowledge disparity.
  - Theory of Mind (ToM): Developing auxiliary models that help AI anticipate and respond to human emotional and informational needs, improving both AI agency and alignment.
- Impact: The thesis proposes a framework for understanding both the potential and the limitations of AI in human-like interaction, laying a foundation for future AI development that is responsible and beneficial for society.
This body of work merges insights from machine learning, cognitive psychology, and ethical AI to ensure that AI technologies are developed in ways that are safe, reliable, and aligned with human societal norms.
My Research
My doctoral thesis explores the complex challenges and advancements in human-computer interaction, focusing primarily on the agency and misalignment of large language models (LLMs) within societal contexts. The core question addressed is whether these models can autonomously act in alignment with human values and societal outcomes, a concern amplified by their pervasive integration into various domains like medicine, chemistry, and economics.
Over the last decade, AI development has shifted towards a human-centric approach, advocating for technology that augments rather than replaces human effort. The thesis argues for the necessity of equipping LLMs with an awareness of their societal impacts, proposing a framework for improving AI systems’ comprehension of human contexts. This interdisciplinary challenge centers on training artificial agents to communicate effectively and appropriately in widely spoken natural languages.
Methodology
The thesis structure comprises three phases, each building upon the previous findings and supported by peer-reviewed research. The initial phase investigates whether AI agents can exhibit agency within artificial setups, specifically through an adapted version of “The Werewolf” game. Findings indicated that AI agents developed emergent communication skills that significantly increased their strategic efficacy in the game, suggesting the potential for AI to achieve autonomous agency through appropriate learning environments.
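To give a flavour of the kind of setup this first phase builds on, here is a minimal, self-contained sketch of emergent communication trained with REINFORCE in a Lewis-style signaling game. It is a toy analogue, not the actual Werewolf environment or code from the thesis: a “speaker” villager who knows the werewolf’s identity must invent a discrete message that lets a “listener” villager vote correctly, and both policies are learned from reward alone. All sizes and parameters are illustrative.

```python
# Toy analogue of emergent communication (NOT the thesis's Werewolf setup):
# a speaker learns to signal a hidden identity, a listener learns to vote on it.
import numpy as np

rng = np.random.default_rng(0)
N_SUSPECTS, N_MESSAGES, LR, EPISODES = 4, 4, 0.1, 5000

# Policy logits: speaker maps identity -> message, listener maps message -> vote.
speaker = np.zeros((N_SUSPECTS, N_MESSAGES))
listener = np.zeros((N_MESSAGES, N_SUSPECTS))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

for _ in range(EPISODES):
    werewolf = rng.integers(N_SUSPECTS)           # hidden identity
    p_msg = softmax(speaker[werewolf])
    msg = rng.choice(N_MESSAGES, p=p_msg)         # speaker's discrete signal
    p_vote = softmax(listener[msg])
    vote = rng.choice(N_SUSPECTS, p=p_vote)       # listener's accusation
    reward = 1.0 if vote == werewolf else 0.0     # the "village" wins if correct

    # REINFORCE updates: grad of log-prob of the chosen action, scaled by reward.
    g_s = -p_msg; g_s[msg] += 1.0
    g_l = -p_vote; g_l[vote] += 1.0
    speaker[werewolf] += LR * reward * g_s
    listener[msg] += LR * reward * g_l

# After training, messages tend to map one-to-one onto identities: a small
# communication protocol has emerged without any predefined vocabulary.
```

The point of the sketch is the mechanism, not the game: strategic value accrues to whichever signaling convention the agents converge on, which is the sense in which communication “emerges” from the learning environment.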
Subsequent sections examine how AI can achieve human-like communication through reinforcement learning. This involves training agents to communicate in ways that are interpretable by humans, highlighting both successes in AI communication and the persistent challenges posed by knowledge disparities between communicating agents. The research demonstrates that while agents can adapt to share knowledge effectively, discrepancies in domain-specific understanding still hinder comprehensive communication, leaving the issue of agency only partially resolved.
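As a rough illustration of this idea, the sketch below shows how a reward that favours listener-interpretable language can be optimised with a simple policy-gradient update. The listener vocabulary, candidate utterances, and reward function are hypothetical placeholders rather than the thesis’s training pipeline; they merely show how a knowledge disparity between speaker and listener can be reflected in the reward signal.

```python
# Hedged sketch: reward an agent for utterances the listener can actually parse.
# The vocabulary, candidates, and reward are illustrative stand-ins.
import numpy as np

rng = np.random.default_rng(1)
listener_vocab = {"the", "model", "learns", "from", "examples", "data"}

candidates = [
    "the model learns from data",                  # plain, in-vocabulary
    "the estimator minimises empirical risk",      # correct but jargon-heavy
    "stochastic gradient descent on a surrogate",  # opaque to this listener
]

def interpretability(utterance):
    # Proxy reward: fraction of words the listener knows.
    words = utterance.split()
    return sum(w in listener_vocab for w in words) / len(words)

logits = np.zeros(len(candidates))
LR = 0.2
for _ in range(2000):
    p = np.exp(logits - logits.max()); p /= p.sum()
    a = rng.choice(len(candidates), p=p)
    reward = interpretability(candidates[a])       # "did the listener follow?"
    grad = -p; grad[a] += 1.0
    logits += LR * reward * grad                   # REINFORCE-style update

print(candidates[int(np.argmax(logits))])          # converges to the plain utterance
```

In practice the reward would come from a human or a learned model of one rather than a word list, but the same tension appears: utterances that are precise within the speaker’s domain knowledge can score poorly when the listener lacks that knowledge.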
The final chapters propose innovative approaches to align AI communication with human-like understanding through the integration of cognitive psychology concepts, specifically the Theory of Mind (ToM). This approach enables AI to predict and adapt to human reactions by understanding their informational needs and emotional responses, thus enhancing both AI agency and alignment with human values. The solution involves auxiliary models, or “simulators,” that refine AI’s communicative accuracy and adaptability without the need for extensive retraining of the primary AI models.
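The following sketch illustrates the general pattern described above: a frozen generator proposes candidate replies, and an auxiliary Theory-of-Mind “simulator” reranks them by how well each fits the listener’s inferred informational and emotional state. Every name, scoring rule, and candidate reply here is a hypothetical illustration of the pattern, not the models developed in the thesis.

```python
# Hedged sketch of simulator-based reranking: the generator stays frozen,
# only the auxiliary ToM scorer shapes which reply is chosen.
from dataclasses import dataclass

@dataclass
class ListenerState:
    knows_topic: bool      # inferred informational need
    frustrated: bool       # inferred emotional state

def generate_candidates(prompt: str) -> list[str]:
    # Stand-in for a frozen LLM returning several candidate replies.
    return [
        "Here is the full derivation with all intermediate steps.",
        "Short answer: yes. Want the details?",
        "That depends on several factors; see the documentation.",
    ]

def tom_score(reply: str, listener: ListenerState) -> float:
    # Toy simulator: prefer brevity for frustrated users, offer detail to novices.
    score = 0.0
    if listener.frustrated and len(reply.split()) < 10:
        score += 1.0
    if not listener.knows_topic and "details" in reply.lower():
        score += 0.5
    return score

def respond(prompt: str, listener: ListenerState) -> str:
    candidates = generate_candidates(prompt)
    return max(candidates, key=lambda r: tom_score(r, listener))

print(respond("Does this converge?",
              ListenerState(knows_topic=False, frustrated=True)))
```

The design point is that the adaptation happens entirely in the auxiliary scorer: swapping in a better model of the listener changes behaviour without touching, or retraining, the underlying language model.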
Summary
In summary, the research contributes significantly to understanding and enhancing the agency and alignment of AI within human contexts, proposing a multi-disciplinary approach that integrates advancements in machine learning, cognitive psychology, and ethical AI practices. The insights and methodologies developed offer a promising foundation for future explorations aimed at ensuring AI technologies operate responsibly and beneficially within society.