The short-loading cycle is a repetitive task performed in high quantities, which makes it a good candidate for automation. Expert operators perform this task while maintaining high productivity and minimizing the environmental impact of the energy used to propel the wheel loader. Balancing productivity and environmental performance is essential for the sub-task of navigating the wheel loader between the pile of material and the dump truck receiving it. This task is further complicated by behaviours of the wheel loader such as wheel slip, which depends on the tire-to-surface friction and is hard to model. Such uncertainties motivate the use of data-driven, adaptable approaches like reinforcement learning to automate navigation. In this paper, we examine the possibility of using reinforcement learning for the navigation sub-task. We focus on developing a solution to the complete sub-task by decomposing it into two distinct steps, reversing from the pile and approaching the dump truck, and training two different agents to perform them separately. The agents are trained in a simulation environment in which the wheel loader is modelled. Our results indicate that this task decomposition can be helpful compared to training a single agent for the entire sub-task. We also present unsuccessful experiments using a single agent for the entire sub-task to illustrate the difficulties associated with such an approach. A video of the results is available online at https://youtu.be/IZbgvHvSltI.
This research was conducted with support from Sweden's Innovation Agency through the VALD project under grant agreement no. 2021-05035.