Latest applications of robotics learning #RiTA2021
22 December 2021
A new beginning #postdoc
15 October 2021
Only 9 days after my PhD defence, I am happy to share that I have started my postdoc at the Mathematics of Imaging and AI group at the University of Twente, focusing on the control of dynamical systems via data-driven methods. This research aims to understand which priors need to be incorporated, and how to incorporate them in practice, to make further progress in controlling dynamical systems.
5 take-home messages of my PhD journey
10 October 2021
Four years ago, I embarked on a quest for understanding:
What is “good” prior knowledge for robotics Reinforcement Learning for improving a) generalisation, b) sample efficiency, and c) robustness of the learned behaviours?
What I found, with blood, sweat, and tears, but also great fun and achievements, is that good prior knowledge for robotics Reinforcement Learning is knowledge of the environment or the task that guides the agent to learn better representations and policies without constraining its learning. In my thesis, which you can find [here], I refer to this as loose prior knowledge.
In particular, here are 5 take-home messages:
How can one learn state representations by incorporating prior knowledge of the environment?
By introducing prior knowledge into the loss functions, through the shaping of their kernels, using knowledge of the world (e.g. physics), reward properties, and the underlying structure of continuous action spaces, we can guide the learning of meaningful state features and regularise the learned state space.
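To make this concrete, here is a minimal, hypothetical sketch of such a prior-informed loss. The encoder and the specific terms are illustrative stand-ins, not the exact losses from the thesis:

```python
import math

def temporal_coherence_loss(s_t, s_next):
    # Prior from physics: the world changes smoothly over time, so
    # consecutive latent states should stay close to each other.
    return sum((a - b) ** 2 for a, b in zip(s_t, s_next))

def reward_separation_loss(s_a, s_b, r_a, r_b):
    # Prior from reward properties: states yielding different rewards
    # should be pushed apart in the latent space.
    dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(s_a, s_b)))
    return abs(r_a - r_b) * math.exp(-dist)

def prior_loss(s_t, s_next, r_t, r_next, w1=1.0, w2=1.0):
    # The weighted sum guides the encoder without hard-constraining it:
    # this is the "loose" flavour of the prior knowledge.
    return (w1 * temporal_coherence_loss(s_t, s_next)
            + w2 * reward_separation_loss(s_t, s_next, r_t, r_next))
```

In practice, terms like these are minimised jointly with the usual representation-learning objectives, so the priors shape the latent space rather than dictate it.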
How can one jointly learn state and action representations by integrating prior knowledge of state and action space?
By assuming an underlying low-dimensional structure of state and action spaces, it is possible to transform a high-dimensional Markov Decision Process (MDP) into a low-dimensional MDP by learning the mappings from observation space to latent state space and from action space to latent action space with neural networks in a self-supervised fashion. The optimal solution of the abstract MDP is sample efficient to find, more robust to disturbances such as previously unseen features, and optimal for the original MDP.
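As an illustration, assuming fixed linear projections in place of the learned neural-network mappings, the abstraction might look like this (names and shapes are hypothetical):

```python
def encode_state(obs, W):
    # Map a high-dimensional observation to a latent state.
    # W stands in for the learned observation-to-state encoder.
    return [sum(w * o for w, o in zip(row, obs)) for row in W]

def encode_action(action, V):
    # Map a high-dimensional action to a latent action.
    return [sum(v * a for v, a in zip(row, action)) for row in V]

def latent_step(z, u, dynamics):
    # Planning and policy learning then happen entirely in the
    # low-dimensional latent MDP.
    return dynamics(z, u)
```

In the actual approach both encoders are neural networks trained in a self-supervised fashion; the point of the sketch is only the shape of the pipeline: observations and actions are compressed first, and the MDP is solved in the compressed space.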
How can one learn optimal policies by exploiting prior knowledge of the problem structures?
By decomposing the policies using our knowledge of the problem, we can exploit problem structures, and by learning the policies from scratch, we can improve the generalisation, flexibility, and learning efficiency of the overall behaviour.
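A toy, hypothetical sketch of this idea: a selector encodes the problem structure by choosing which sub-policy acts, while each sub-policy can be learned from scratch on its own sub-task:

```python
def composite_policy(state, selector, sub_policies):
    # selector encodes our prior knowledge of the problem structure;
    # each sub-policy handles one sub-task and is learned independently.
    return sub_policies[selector(state)](state)

# Illustrative usage with hand-written stand-ins for learned sub-policies:
go_left = lambda s: -1.0
go_right = lambda s: +1.0
selector = lambda s: 0 if s < 0.0 else 1
```

Swapping in a better sub-policy, or reusing one in a new task, does not require retraining the others, which is where the generalisation and flexibility come from.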
How can one learn optimal policies by shaping the reward function using prior knowledge of the maps of the environments?
By incorporating map information, such as the position of the obstacles, entropy, and the novelty of the robot’s pose, into the reward function, we can improve the performance, training efficiency, generalisation, and robustness of the learned policy.
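For intuition, a minimal sketch of such a shaped reward for navigation; the weights and terms are illustrative assumptions, not the exact formulation:

```python
import math

def shaped_reward(base_reward, pose, obstacles, visit_count,
                  w_obs=1.0, w_nov=0.1):
    # Map prior: penalise poses close to known obstacles.
    d_min = min(math.dist(pose, obs) for obs in obstacles)
    obstacle_penalty = w_obs * math.exp(-d_min)
    # Novelty prior: rarely visited poses earn a small bonus.
    novelty_bonus = w_nov / (1 + visit_count)
    return base_reward - obstacle_penalty + novelty_bonus
```

The shaping terms inject the map information into learning without changing the task itself: the closer the agent gets to an obstacle, the larger the penalty, while unexplored poses stay slightly more attractive.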
How can one survive a PhD journey?
Completing a PhD is by far the hardest challenge I have faced so far in my academic career. Your thesis will demand countless hours of effort and time (way more than you planned). However, the achievement is worth the effort, and what is left after my defence is only joy and excitement for what comes next (yes, the PhD is only the first step...). Focus on the (little) wins, clench your teeth when things get tough, and keep pushing till the finish line!
7 October 2021
My PhD journey was an extremely tough triathlon race. The swim was chaotic and unpredictable, the waters were choppy and dark, and it was hard to find the correct direction to move forward. Then the bike began, and the speed started rising. Every day was busy and passed quickly, and, before realising it, I was getting closer and closer to the cut-off time. However, the run was yet to come! The clock started ticking faster, the energy started dropping swiftly, and the legs started hurting, but I could not stop. I was tired, but the race was not done! With the finish line in sight, I want to thank all the people who supported me in this endeavour, from family to friends and from supervisors to colleagues. You made this thesis possible!
The final battle of my PhD journey
5 October 2021
The calm before the storm...
Everything is in order and I am ready for the final battle of my PhD journey: the defence! Looking forward to it!
State representation learning: the key ingredient for sample-efficient and robust robot learning #IROS2021
27 September 2021
What a week! Despite being digital-only, IROS was amazing! So much progress toward intelligent, autonomous, and self-learning robots and systems.
I had the amazing opportunity not only to present our most recent work [article][presentation][code], but also to chair the session on "Representation Learning" (one of my key areas of expertise and research interest).