Welcome to my professional webpage.
I am a postdoctoral fellow in the Department of Mathematics and Computer Science at the Eindhoven University of Technology. I obtained my PhD on the topic of sequential learning with partial feedback at INRIA SequeL and Université de Lille under the supervision of Prof. Philippe Preux and Dr. Tanguy Urvoy. I received my master’s degree in Computer Science & Engineering from the Indian Institute of Technology Madras under the supervision of Prof. Balaraman Ravindran.
My Erdős number is 3 thanks to Peter Auer (-> Pál Révész -> Paul Erdős).
My research interests span sequential decision-making and machine learning with ethical considerations such as fairness and privacy.
In sequential decision-making, I seek to understand how an intelligent agent “learns” from its interactions with the environment. Generally, I am interested in devising machine learning algorithms with strong mathematical foundations which can work with non-standard forms of feedback often encountered in real-life scenarios.
Secondly, I am also interested in interdisciplinary research with the aim of incorporating human aspects like fairness and privacy in machine learning.
News
- As the lead organizer and presenter, I delivered a tutorial on advances in fairness-aware reinforcement learning, partnering with Mykola Pechenizkiy and Yingqian Zhang at the International Joint Conference on Artificial Intelligence (IJCAI) 2024. See additional information about the tutorial, including the slides at https://fair-rl.github.io/. The tutorial presented advances in fairness-aware reinforcement learning, encompassing both theoretical results and real-world applications. We covered motivating applications, fairness notions and the technical details of incorporating fairness objectives into reinforcement learning models, analyzed fairness in reinforcement learning as a multi-objective optimization problem, and explored impactful future directions.
- A paper accepted at the ICML 2024 Workshop on Models of Human Feedback for AI Alignment. In this paper, we introduced the problem of regret minimization in adversarial multi-dueling bandits. Our work addresses a gap in the literature by considering scenarios where the learner selects multiple arms at each round and observes the identity of the most preferred arm, based on arbitrary preference matrices. We propose a novel algorithm and prove that its expected cumulative regret is upper-bounded by $O((K \log K)^{1/3} T^{2/3})$. We also establish a matching lower bound of $\Omega(K^{1/3} T^{2/3})$. See the paper at arxiv:2406.12475.
- I gave a talk at the Conference on Advancing Behavioral Science through AI and Digital Health held in Ann Arbor, MI, USA. The topic was reinforcement learning-driven pain care recommendations. Here are the slides. (May 2024).
- Mykola Pechenizkiy, Yingqian Zhang and I will be delivering a tutorial on the advances in fairness-aware reinforcement learning at the International Joint Conference on Artificial Intelligence (IJCAI) 2024. In this tutorial, we will present advances in fairness-aware reinforcement learning encompassing theoretical results as well as real-world applications. We plan to cover motivating applications and technical details of incorporating fairness objectives into the reinforcement learning model, analyze fairness in reinforcement learning as multi-objective optimization, and explore impactful future directions. See you all in Jeju! (Apr 2024).
- I have been invited to give a talk at the Conference on Advancing Behavioral Science through AI and Digital Health. I will be speaking on reinforcement learning-driven healthcare recommendations. (Mar 2024).
- A pre-print detailing some of our initial work in my ongoing collaboration with Prof. John D. Piette is online - arxiv:2402.19226. Here our focus is on identifying (and rectifying) gender bias in personalized recommendations for pain care. In this initial work, we show that if certain patient information, such as self-reported pain measurements, is not considered in the decision-making process, then the quality of reinforcement learning-driven pain care recommendations for women can be notably inferior to those for men. (Mar 2024).
- A paper accepted at the Symposium on Intelligent Data Analysis (IDA 2024). Under my guidance, Ronald C. van den Broek, Rik Litjens, Tobias Sagis, Luc Siecker, and Nina Verbeeke collaborated to produce this paper as a culmination of their course project (Jan 2024).
- I have been awarded the UTQ (University Teaching Qualification) i.e. BKO (Basiskwalificatie Onderwijs) certificate. (Oct 2023).
- I am going to co-deliver a tutorial on fair reinforcement learning at the 15th Asian Conference on Machine Learning (Sep 2023).
- I presented our work about autonomous exploration in reinforcement learning at the 32nd International Joint Conference on Artificial Intelligence (IJCAI), 2023. See the slides for the presentation here (Aug 2023).
- Two papers accepted at the European Workshop on Reinforcement Learning (EWRL), 2023. The first paper is on sparse-reward deep reinforcement learning with Jiong Li, who is one of my master’s students. The other is on multi-armed bandits where rewards arrive partially and they are observed with different delays. This paper is the result of a course project completed by Ronald C. van den Broek, Rik Litjens, Tobias Sagis, Luc Siecker and Nina Verbeeke. (Jul 2023).
- Our joint work with Rosa van Tuijn, Tianqin Lu, Emma Driesse, Koen Franken and Dr. Emilia Barakova has been accepted at the 19th International Conference on Human-Computer Interaction (INTERACT), 2023. In this work, we propose a personalized explainable recommendation device for cardiac rehabilitees (May 2023).
- Our joint work with Dr. Peter Auer and Dr. Ronald Ortner has been accepted at the 32nd International Joint Conference on Artificial Intelligence (IJCAI), 2023. In this work, we propose a meta-algorithm which can convert any RL algorithm with sublinear regret into an exploration algorithm with suitable guarantees on its sample complexity (Apr 2023).
- Our joint work with Dennis Collaris, Joost Jorritsma, Mykola Pechenizkiy and Jack (Jarke) van Wijk has been awarded the Runner-up Frontier Prize at the 21st Symposium on Intelligent Data Analysis (IDA 2023). Click here to see the paper. More information about the work can be found at https://explaining.ml/lemon (Apr 2023).
- Our joint work with Rosa van Tuijn, Tianqin Lu, Emma Driesse, Koen Franken and Dr. Emilia Barakova has been accepted as an extended abstract at the second International Conference on Hybrid Human-Artificial Intelligence (HHAI), 2023. In this work, we propose a personalized explainable recommendation device for cardiac rehabilitees. See the paper here (Apr 2023).
- Recently, I have been working on curiosity-driven exploration in sparse-reward deep reinforcement learning with one of my master’s students. Here’s a preliminary version of the work – arxiv:2302.10825. In this work, we propose a method called I-Go-Explore that combines the intrinsic curiosity module with the Go-Explore framework to address some of the limitations of the state of the art. (Mar 2023).
- Our paper titled – LEMON: Alternative Sampling for More Faithful Explanation through Local Surrogate Models, accepted at the Symposium on Intelligent Data Analysis (IDA), 2023 (Feb 2023).
- My paper on providing local differential privacy for sequential decision making in a changing environment accepted at AAAI Privacy Preserving Artificial Intelligence (PPAI), 2023 (Jan 2023).
- I am on the thesis committee for a master’s thesis defense on the topic of multivariate distributional regression techniques at the Eindhoven University of Technology (Jan 2023).
- I completed a pedagogical course on Designing Courses & Projects (Dec 2022).
- I completed a pedagogical course on Teaching Skills (Dec 2022).
- Our paper on posterior sampling for constrained reinforcement learning accepted at the Reinforcement Learning for Real Life Workshop at NeurIPS, 2022 (Dec 2022).
- Our paper on batch-learning in stochastic linear bandits published at the International Conference on Data Mining (ICDM), 2022 (Dec 2022).
- I completed a pedagogical course on Facilitating Learning (Nov 2022).
- I am taking pedagogical courses to obtain the University Teaching Qualification (UTQ/BKO), which is regarded as proof of teaching competence in academic settings in the Netherlands (Oct 2022).
- We are applying for an NWO grant (Open Technology Programme 2022). Watch this space for job announcements! (Oct 2022).
- New paper about posterior sampling for constrained reinforcement learning (Sept 2022).
- In the academic year 2022-2023, I am going to supervise 3 MSc students and a group of BSc students from the Honors Academy in addition to the 2 PhD students I am currently supervising (Sept 2022).
- In the 1st quartile of the academic year 2022-2023, I am teaching a course on reinforcement learning (Sept 2022).
- In the 1st quartile of the academic year 2022-2023, I am co-teaching a course on embodying intelligent behavior in social context.
- New paper about batch learning in stochastic linear bandits to appear at ICDM 2022.
- Working on an extensive survey (and a tutorial) on fairness-aware reinforcement learning. See the initial version here.
- In the 4th quartile of the academic year 2021-2022, I am supervising and evaluating 10 student course projects on reinforcement learning as part of 2AMC15 Data Intelligence Challenge.
- I completed a course on supervision of PhD students.
- In the 2nd quartile of the academic year 2021-2022, I am going to be a part of the assessment committee for bachelor projects in data science.