ILSA: Incremental Learning for
Robot Shared Autonomy

In submission

1CMU, 2NIST, 3Pitt

Abstract

Shared autonomy holds promise for improving the usability and accessibility of assistive robotic arms, but current methods often rely on costly expert demonstrations and remain static after pretraining, limiting their ability to handle real-world variations. Even with extensive training data, unforeseen challenges—especially those that fundamentally alter task dynamics, such as unexpected obstacles or spatial constraints—can cause assistive policies to break down, leading to ineffective or unreliable assistance. To address this, we propose ILSA, an Incrementally Learned Shared Autonomy framework that continuously refines its assistive policy through user interactions, adapting to real-world challenges beyond the scope of pre-collected data. At the core of ILSA is a structured fine-tuning mechanism that enables continual improvement with each interaction by effectively integrating limited new interaction data while preserving prior knowledge, ensuring a balance between adaptation and generalization. A user study with 20 participants demonstrates ILSA's effectiveness, showing faster task completion and improved user experience compared to static alternatives.

Motivating Example

A motivating example is illustrated in the two figures below. In a cereal pouring task, the system must move a cereal container next to a bowl to pour. Even a well-designed training set may not account for all real-world variations, such as obstacles between the two objects. If such obstacles appear in deployment, a policy trained without exposure to these cases may struggle to adapt, leading to suboptimal or even infeasible behavior, as shown in the first figure. ILSA addresses this by incrementally learning from user interactions: as the second figure shows, by the fourth trial it has learned effective collision avoidance from past interactions, helping the user complete the task more smoothly.

Figure 1

First Interaction

Figure 2

Fourth Interaction

Methodology

Action Generation Model

ILSA is built upon a transformer-based robot action generation model, which maps task states and user inputs to robot actions.
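The paper does not publish the exact architecture here, but the mapping it describes can be sketched roughly as below. All dimensions, layer sizes, and the class name `ActionGenerator` are illustrative assumptions, not the authors' implementation:

```python
import torch
import torch.nn as nn

class ActionGenerator(nn.Module):
    """Hypothetical sketch of a transformer policy that maps a history of
    task states and user inputs to the next robot action."""

    def __init__(self, state_dim=7, user_dim=3, action_dim=7, d_model=64):
        super().__init__()
        # Embed concatenated (state, user input) per timestep.
        self.embed = nn.Linear(state_dim + user_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # Action head decodes the encoded sequence into a robot action.
        self.action_head = nn.Linear(d_model, action_dim)

    def forward(self, states, user_inputs):
        # states: (B, T, state_dim); user_inputs: (B, T, user_dim)
        x = self.embed(torch.cat([states, user_inputs], dim=-1))
        # Predict the action from the most recent encoded timestep.
        return self.action_head(self.encoder(x))[:, -1]
```

In this sketch the policy conditions on the full interaction history and emits one action per step; the real model may differ in tokenization and output parameterization.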

Fine-tuning Mechanism

At the core of ILSA is a structured fine-tuning mechanism that enables continual improvement with each interaction. This frequent adaptation is particularly challenging due to limited deployment data, which can affect model stability and generalizability.

To address this, our fine-tuning mechanism consists of three key components:

  • Corrected Trajectory Supervision: This design prevents the model from reinforcing suboptimal actions by generating corrected trajectories from user interactions and using them for fine-tuning.
  • Layered Supervision: This design integrates new interaction data with pretraining knowledge to balance adaptation and retention, without requiring manual data reweighting.
  • Partial Model Update: This design selectively fine-tunes critical components to enhance adaptation while maintaining generalization and robustness.
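The partial model update can be sketched as a standard parameter-freezing step before each fine-tuning round. Which components count as "critical" is the authors' design choice; the prefix `action_head` below is a hypothetical placeholder:

```python
import torch.nn as nn

def partial_finetune_params(model, trainable_prefixes=("action_head",)):
    """Freeze every parameter except those whose names start with one of
    the given prefixes; return the trainable ones for the optimizer."""
    trainable = []
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith(trainable_prefixes)
        if param.requires_grad:
            trainable.append(param)
    return trainable
```

The optimizer is then built only over the returned parameters, so gradient updates from the limited interaction data cannot disturb the frozen pretrained components.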

Full ILSA Framework




Synthetic Kinematic Trajectory Generation

While our fine-tuning mechanism enables ILSA to adapt continuously, an effective initialization remains essential for first-time use. Instead of relying on costly human demonstrations, ILSA pretrains on lightweight simulated kinematic trajectories, generated using a rule-based procedure.


This approach provides an effective initial assistive policy while minimizing the need for real-world data collection.
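The page does not spell out the rule-based procedure, but one minimal sketch of such a generator, assuming linear interpolation between end-effector waypoints with small perturbations for coverage, is:

```python
import numpy as np

def synthetic_trajectory(start, goal, n_steps=50, noise_scale=0.01, rng=None):
    """Hypothetical rule-based generator: interpolate end-effector positions
    from start to goal and add small noise to intermediate waypoints so the
    pretraining set covers slight variations. Endpoints stay exact."""
    rng = np.random.default_rng() if rng is None else rng
    start, goal = np.asarray(start, float), np.asarray(goal, float)
    alphas = np.linspace(0.0, 1.0, n_steps)[:, None]
    traj = (1 - alphas) * start + alphas * goal
    traj[1:-1] += rng.normal(0.0, noise_scale, traj[1:-1].shape)
    return traj
```

Many such trajectories, sampled across plausible object placements, would form a lightweight pretraining set without any human demonstrations; the authors' actual rules may encode richer task structure.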

Human Study

To evaluate ILSA, we conducted a user study under a university-approved IRB protocol with 20 participants (none of whom reported prior experience teleoperating robots) using a Kinova robotic arm on two long-horizon tasks designed to introduce unexpected challenges:


Demonstrations


In the first interaction, ILSA lacks the knowledge to avoid collisions, leading to objects hitting obstacles.

After just a few interactions, ILSA has learned effective collision avoidance, helping users complete tasks more smoothly.

Pure teleoperation

Detailed Setting and Results

We evaluate the following two hypotheses:
  • H1: ILSA supports faster robot manipulation task completion times and easier manipulator control as compared to pure teleoperation.
  • H2: ILSA effectively adapts to real-world challenges beyond the scope of training data, continuously improving task performance over time, in contrast to a static shared autonomy method.

To test these hypotheses, we divided the participants into two groups: Group 1 (10 participants) used ILSA and pure teleoperation, while Group 2 (10 participants) used pure teleoperation and a static shared autonomy policy, which was pretrained on the initial simulated kinematic trajectories and kept fixed across the four task trials.

The within-subject comparison in Group 1 provides evidence for H1. Group 2 is included to rule out the possibility that the reduction in task completion time under ILSA stems merely from users' growing familiarity with the shared autonomy system, thereby supporting H2.




Task Completion Times

We primarily evaluate task completion time as a direct and interpretable measure of user efficiency and system adaptability, which is particularly relevant in assistive settings where efficiency and ease of use directly impact user experience.
Figure 1

Figure 2


User-reported Qualitative Metrics

In addition to task completion time, we collected participants' subjective evaluations using a custom-designed 7-point Likert-scale questionnaire (1 = strongly disagree; 4 = neutral; 7 = strongly agree) to assess their perception of robot control. The full questionnaire used in our user study is available here.
Figure 1

Figure 2


User Override Proportion

To further evaluate ILSA, we measured the proportion of time users had to take full control of the robot during shared autonomy. This metric captures how often the system failed to produce satisfactory actions, requiring the user to take over.

Override time is measured as a proportion of the robot's active motion time—excluding periods when the robot was idle, such as when the user paused or was thinking. This provides a more accurate reflection of how often users needed to take over during actual task execution.
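The metric above can be computed directly from timed control segments. The tuple layout below is an illustrative assumption about how a trial log might be segmented, not the study's actual logging format:

```python
def override_proportion(segments):
    """segments: (duration_s, user_override, robot_idle) tuples for one trial.

    Idle periods (e.g., the user pausing or thinking) are excluded from the
    denominator, so the ratio reflects overrides during active motion only."""
    active = sum(d for d, _, idle in segments if not idle)
    override = sum(d for d, ov, idle in segments if ov and not idle)
    return override / active if active else 0.0
```

For example, a trial with 10 s of assisted motion, 5 s of full user control, and 3 s of idle time yields an override proportion of 5/15, independent of how long the user paused.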

Note: This analysis was not included in the formal paper due to space constraints, but we present it here as additional evidence of ILSA's effectiveness in reducing user burden.
User intervention for cereal pouring task

Cereal Pouring Task

User intervention for pill bottle storage task

Pill Bottle Storage Task

BibTeX

@misc{tao2024incrementallearningrobotshared,
      title={Incremental Learning for Robot Shared Autonomy}, 
      author={Yiran Tao and Guixiu Qiao and Dan Ding and Zackory Erickson},
      year={2024},
      eprint={2410.06315},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2410.06315}, 
}