[ETRA’22] Gaze-Hand Alignment: Gaze and Hand Pointers for Interacting with Menus in Augmented Reality

Gaze and gestures are well suited to complement Augmented Reality (AR) and interactions that are situated in the world. Current trends in AR technology reflect this with the integration of both hand- and eye-tracking in head-mounted display (HMD) devices. However, the design of input techniques that rely on gaze and freehand gestures is challenging, as it requires robust segmentation of input from the continuous movement of the user’s eyes and hands.

In gestural interfaces, manual pointing is completed by a distinct gesture such as a tap on a surface or a pinch in mid-air. Where both modalities are available, gaze lends itself better to the initial pointing step, as our eyes naturally focus on objects that we aim to manipulate, whereas our hands are more effective for deliberate input to complete a selection. In past work, this has been demonstrated by combining gaze with pinch as the delimiting mid-air gesture. Gaze&Pinch is also supported by emerging AR headsets (such as the HoloLens 2) as the state-of-the-art gaze-based selection technique.

We propose Gaze-Hand Alignment as a principle for combined gaze and freehand input. In contrast to Gaze&Pinch and comparable techniques, we use both modalities for pointing, and the alignment of their input serves as the selection trigger. The key idea is to leverage that the eyes naturally look ahead to a pointing target, with the hands following. This enables us to use gaze for pre-selection, and manual crossing of a pre-selected target to trigger a selection as soon as the hand catches up with the eyes and aligns with gaze. Based on this concept, we introduce Gaze&Finger and Gaze&Hand as novel techniques for AR context menus.
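
To make the principle concrete, the following is a minimal sketch of the per-frame selection logic, written in Python for illustration only. It assumes that a hit-testing stage already maps the gaze ray and the hand pointer to menu items each frame; all names (`Target`, `update_selection`, `gaze_hit`, `hand_hit`) are hypothetical and not taken from the paper’s implementation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Target:
    label: str


def update_selection(
    gaze_hit: Optional[Target],
    hand_hit: Optional[Target],
    preselected: Optional[Target],
) -> Tuple[Optional[Target], Optional[Target]]:
    """One frame of gaze-hand alignment: gaze pre-selects whatever item it rests on;
    a selection fires only when the hand pointer crosses that pre-selected item."""
    if gaze_hit is not None:
        preselected = gaze_hit  # eyes lead: refresh the pre-selection
    # hand catches up: alignment of both pointers on the same item triggers selection
    selected = preselected if (preselected is not None and hand_hit == preselected) else None
    return preselected, selected


# Example frame sequence: gaze lands on "Copy" first; the hand crosses it later.
copy_item = Target("Copy")
pre, sel = update_selection(gaze_hit=copy_item, hand_hit=None, preselected=None)      # pre-selected only
pre, sel = update_selection(gaze_hit=copy_item, hand_hit=copy_item, preselected=pre)  # aligned -> selected
assert sel == copy_item
```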

  • Gaze&Finger combines gaze with perspective-based manual pointing where a ray is cast from the eye position over the user’s index finger. A user looks at targets of interest and completes selection by lifting their finger into the line of sight. We use the same principle to invoke a menu and to select from it. The menu is warped to the user’s hand to avoid parallax issues and scaled to target distance to support an illusion of direct touch.
  • Gaze&Hand combines gaze with indirect manual input to reduce effort and arm fatigue. The technique uses dwell time to activate the menu and instantiate a cursor, and alignment of cursor and gaze to select from the menu (both hand pointers are sketched in code after this list).
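
As a rough illustration of how the two hand pointers could be constructed, the sketch below shows an eye-to-fingertip ray for Gaze&Finger and a relative-motion cursor with a dwell check for Gaze&Hand; both would feed the alignment logic sketched earlier. The vector math is generic, and the function names, gain, and dwell value are assumptions rather than details of the paper’s implementation.

```python
import numpy as np


def gaze_and_finger_ray(eye_pos: np.ndarray, fingertip: np.ndarray):
    """Gaze&Finger pointer: perspective-based pointing, i.e. a ray cast from the
    eye position through the raised index fingertip (direct, in the line of sight)."""
    direction = fingertip - eye_pos
    return eye_pos, direction / np.linalg.norm(direction)


def gaze_and_hand_cursor(cursor_pos: np.ndarray, hand_delta: np.ndarray, gain: float = 2.0) -> np.ndarray:
    """Gaze&Hand pointer: an indirect cursor driven by relative hand movement,
    scaled by an assumed control-display gain."""
    return cursor_pos + gain * hand_delta


def dwell_complete(fixation_start_s: float, now_s: float, dwell_s: float = 0.5) -> bool:
    """Gaze&Hand menu activation: true once gaze has rested on the menu anchor for
    the dwell time (the 0.5 s default is an assumption, not the paper's value)."""
    return (now_s - fixation_start_s) >= dwell_s
```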

In this paper, we compare the proposed Gaze&Finger and Gaze&Hand techniques to two baseline techniques from the literature.

Both techniques have in common with Gaze&Pinch that initial selection is by gaze and confirmation is by manual gesture. The difference is that alignment is spatial and implicitly guided by gaze, whereas a pinch is semantic and performed separately from where gaze points.

The four techniques were evaluated on a task designed to represent contextual AR menus. We found that all three gaze-based techniques outperformed the manual condition. Gaze&Pinch was fastest for menu activation, and Gaze&Finger was most efficient for item selection. Gaze&Hand was perceived as least physically demanding but had the highest error rate. While providing evidence of the efficacy of gaze assistance for mid-air interaction, the study also contributes insight into trade-offs in the design of selection techniques, specifically spatial versus semantic and direct versus indirect use of gestures.

Abstract

Gaze and freehand gestures suit Augmented Reality as users can interact with objects at a distance without the need for a separate input device. We propose Gaze-Hand Alignment as a novel multimodal selection principle, defined by concurrent use of both gaze and hand for pointing and alignment of their input on an object as selection trigger. Gaze naturally precedes manual action and is leveraged for pre-selection, and manual crossing of a pre-selected target completes the selection. We demonstrate the principle in two novel techniques: Gaze&Finger for input by direct alignment of hand and finger raised into the line of sight, and Gaze&Hand for input by indirect alignment of a cursor with relative hand movement. In a menu selection experiment, we evaluate the techniques in comparison with Gaze&Pinch and a hands-only baseline. The study showed the gaze-assisted techniques to outperform hands-only input, and gives insight into trade-offs in combining gaze with direct or indirect, and spatial or semantic, freehand gestures.

Mathias N. Lystbæk, Peter Rosenberg, Ken Pfeuffer, Jens Emil Grønbæk, and Hans Gellersen. 2022. Gaze-Hand Alignment: Combining Eye Gaze and Mid-Air Pointing for Interacting with Menus in Augmented Reality. Proc. ACM Hum.-Comput. Interact. 6, ETRA, Article 145 (May 2022), 18 pages. https://doi.org/10.1145/3530886
