About me

I am a master’s student in Computer Science at Simon Fraser University, working with Sharan Vaswani on Reinforcement Learning theory from an optimization perspective. Specifically, my research has resulted in practical entropy-regularized policy gradient algorithms with proven convergence guarantees. Papers featuring my work were accepted at the NeurIPS Workshop on Optimization for Machine Learning (OPT 2023) and at the Reinforcement Learning Conference (RLC 2024). I am currently doing an internship at Huawei Noah’s Ark Lab, working on LLM agents in the Embodied AI setting.

From 2017 to 2022, I studied for a bachelor’s in Computer Engineering at Amirkabir University of Technology. My BSc thesis compared Actor-Critic Reinforcement Learning algorithms for stock trading. In high school, I advanced to the 3rd stage of the Iranian National Olympiad in Informatics, which only 80 students reached. At age 11, I qualified as a member of the National Organization for Development of Exceptional Talents, where I discovered my interest in programming, fascinated by how fast computers can explore the solution space of mathematical problems.