Frontiers in Robotics and AI

RSS Feed for Frontiers in Robotics and AI | New and Recent Articles
Subscribe to Frontiers in Robotics and AI feed

This paper presents a novel webcam-based approach for gaze estimation on computer screens. Utilizing appearance based gaze estimation models, the system provides a method for mapping the gaze vector from the user’s perspective onto the computer screen. Notably, it determines the user’s 3D position in front of the screen, using only a 2D webcam without the need for additional markers or equipment. The study presents a comprehensive comparative analysis, assessing the performance of the proposed method against established eye tracking solutions. This includes a direct comparison with the purpose-built Tobii Eye Tracker 5, a high-end hardware solution, and the webcam-based GazeRecorder software. In experiments replicating head movements, especially those imitating yaw rotations, the study brings to light the inherent difficulties associated with tracking such motions using 2D webcams. This research introduces a solution by integrating Structure from Motion (SfM) into the Convolutional Neural Network (CNN) model. The study’s accomplishments include showcasing the potential for accurate screen gaze tracking with a simple webcam, presenting a novel approach for physical distance computation, and proposing compensation for head movements, laying the groundwork for advancements in real-world gaze estimation scenarios.

In recent years, virtual idols have garnered considerable attention because they can perform activities similar to real idols. However, as they are fictitious idols with nonphysical presence, they cannot perform physical interactions such as handshake. Combining a robotic hand with a display showing virtual idols is the one of the methods to solve this problem. Nonetheless a physical handshake is possible, the form of handshake that can effectively induce the desirable behavior is unclear. In this study, we adopted a robotic hand as an interface and aimed to imitate the behavior of real idols. To test the effects of this behavior, we conducted step-wise experiments. The series of experiments revealed that the handshake by the robotic hand increased the feeling of intimacy toward the virtual idol, and it became more enjoyable to respond to a request from the virtual idol. In addition, viewing the virtual idols during the handshake increased the feeling of intimacy with the virtual idol. Moreover, the method of the hand-shake peculiar to idols, which tried to keep holding the user’s hand after the conversation, increased the feeling of intimacy to the virtual idol.

Automated disassembly is increasingly in focus for Recycling, Re-use, and Remanufacturing (Re-X) activities. Trends in digitalization, in particular digital twin (DT) technologies and the digital product passport, as well as recently proposed European legislation such as the Net Zero and the Critical materials Acts will accelerate digitalization of product documentation and factory processes. In this contribution we look beyond these activities by discussing digital information for stakeholders at the Re-X segment of the value-chain. Furthermore, we present an approach to automated product disassembly based on different levels of available product information. The challenges for automated disassembly and the subsequent requirements on modeling of disassembly processes and product states for electronic waste are examined. The authors use a top-down (e.g., review of existing standards and process definitions) methodology to define an initial data model for disassembly processes. An additional bottom-up approach, whereby 5 exemplary electronics products were manually disassembled, was employed to analyze the efficacy of the initial data model and to offer improvements. This paper reports on our suggested informal data models for automatic electronics disassembly and the associated robotic skills.

The targeted use of social robots for the family demands a better understanding of multiple stakeholders’ privacy concerns, including those of parents and children. Through a co-learning workshop which introduced families to the functions and hypothetical use of social robots in the home, we present preliminary evidence from 6 families that exhibits how parents and children have different comfort levels with robots collecting and sharing information across different use contexts. Conversations and booklet answers reveal that parents adopted their child’s decision in scenarios where they expect children to have more agency, such as in cases of homework completion or cleaning up toys, and when children proposed what their parents found to be acceptable reasoning for their decisions. Families expressed relief when they shared the same reasoning when coming to conclusive decisions, signifying an agreement of boundary management between the robot and the family. In cases where parents and children did not agree, they rejected a binary, either-or decision and opted for a third type of response, reflecting skepticism, uncertainty and/or compromise. Our work highlights the benefits of involving parents and children in child- and family-centered research, including parental abilities to provide cognitive scaffolding and personalize hypothetical scenarios for their children.

Introduction: The teaching process plays a crucial role in the training of professionals. Traditional classroom-based teaching methods, while foundational, often struggle to effectively motivate students. The integration of interactive learning experiences, such as visuo-haptic simulators, presents an opportunity to enhance both student engagement and comprehension.

Methods: In this study, three simulators were developed to explore the impact of visuo-haptic simulations on engineering students’ engagement and their perceptions of learning basic physics concepts. The study used an adapted end-user computing satisfaction questionnaire to assess students’ experiences and perceptions of the simulators’ usability and its utility in learning.

Results: Feedback from participants suggests a positive reception towards the use of visuo-haptic simulators, highlighting their usefulness in improving the understanding of complex physics principles.

Discussion: Results suggest that incorporating visuo-haptic simulations into educational contexts may offer significant benefits, particularly in STEM courses, where traditional methods may be limited. The positive responses from participants underscore the potential of computer simulations to innovate pedagogical strategies. Future research will focus on assessing the effectiveness of these simulators in enhancing students’ learning and understanding of these concepts in higher-education physics courses.

Along with the development of speech and language technologies, the market for speech-enabled human-robot interactions (HRI) has grown in recent years. However, it is found that people feel their conversational interactions with such robots are far from satisfactory. One of the reasons is the habitability gap, where the usability of a speech-enabled agent drops when its flexibility increases. For social robots, such flexibility is reflected in the diverse choice of robots’ appearances, sounds and behaviours, which shape a robot’s ‘affordance’. Whilst designers or users have enjoyed the freedom of constructing a social robot by integrating off-the-shelf technologies, such freedom comes at a potential cost: the users’ perceptions and satisfaction. Designing appropriate affordances is essential for the quality of HRI. It is hypothesised that a social robot with aligned affordances could create an appropriate perception of the robot and increase users’ satisfaction when speaking with it. Given that previous studies of affordance alignment mainly focus on one interface’s characteristics and face-voice match, we aim to deepen our understanding of affordance alignment with a robot’s behaviours and use cases. In particular, we investigate how a robot’s affordances affect users’ perceptions in different types of use cases. For this purpose, we conducted an exploratory experiment that included three different affordance settings (adult-like, child-like, and robot-like) and three use cases (informative, emotional, and hybrid). Participants were invited to talk to social robots in person. A mixed-methods approach was employed for quantitative and qualitative analysis of 156 interaction samples. The results show that static affordance (face and voice) has a statistically significant effect on the perceived warmth of the first impression; use cases affect people’s perceptions more on perceived competence and warmth before and after interactions. In addition, it shows the importance of aligning static affordance with behavioural affordance. General design principles of behavioural affordances are proposed. We anticipate that our empirical evidence will provide a clearer guideline for speech-enabled social robots’ affordance design. It will be a starting point for more sophisticated design guidelines. For example, personalised affordance design for individual or group users in different contexts.

Introduction: Collaborative robots, designed to work alongside humans for manipulating end-effectors, greatly benefit from the implementation of active constraints. This process comprises the definition of a boundary, followed by the enforcement of some control algorithm when the robot tooltip interacts with the generated boundary. Contact with the constraint boundary is communicated to the human operator through various potential forms of feedback. In fields like surgical robotics, where patient safety is paramount, implementing active constraints can prevent the robot from interacting with portions of the patient anatomy that shouldn’t be operated on. Despite improvements in orthopaedic surgical robots, however, there exists a gap between bulky systems with haptic feedback capabilities and miniaturised systems that only allow for boundary control, where interaction with the active constraint boundary interrupts robot functions. Generally, active constraint generation relies on optical tracking systems and preoperative imaging techniques.

Methods: This paper presents a refined version of the Signature Robot, a three degrees-of-freedom, hands-on collaborative system for orthopaedic surgery. Additionally, it presents a method for generating and enforcing active constraints “on-the-fly” using our previously introduced monocular, RGB, camera-based network, SimPS-Net. The network was deployed in real-time for the purpose of boundary definition. This boundary was subsequently used for constraint enforcement testing. The robot was utilised to test two different active constraints: a safe region and a restricted region.

Results: The network success rate, defined as the ratio of correct over total object localisation results, was calculated to be 54.7% ± 5.2%. In the safe region case, haptic feedback resisted tooltip manipulation beyond the active constraint boundary, with a mean distance from the boundary of 2.70 mm ± 0.37 mm and a mean exit duration of 0.76 s ± 0.11 s. For the restricted-zone constraint, the operator was successfully prevented from penetrating the boundary in 100% of attempts.

Discussion: This paper showcases the viability of the proposed robotic platform and presents promising results of a versatile constraint generation and enforcement pipeline.

Robots have tremendous potential, and have recently been introduced not only for simple operations in factories, but also in workplaces where customer service communication is required. However, communication robots have not always been accepted. This study proposes a three-stage (first contact, interaction, and decision) model for robot acceptance based on the human cognitive process flow to design preferred robots and clarifies the elements of the robot and the processes that affect robot acceptance decision-making. Unlike previous robot acceptance models, the current model focuses on a sequential account of how people decide to accept, considering the interaction (or carry-over) effect between impressions established at each stage. According to the model, this study conducted a scenario-based experiment focusing on the impression of the first contact (a robot’s appearance) and that formed during the interaction with robot (politeness of its conversation and behavior) on robot acceptance in both successful and slightly failed situations. The better the appearance of the robot and the more polite its behavior, the greater the acceptance rate. Importantly, there was no interaction between these two factors. The results indicating that the impressions of the first contact and interaction are additively processed suggest that we should accumulate findings that improving the appearance of the robot and making its communication behavior more human-like in politeness will lead to a more acceptable robot design.

Actuator failure on a remotely deployed robot results in decreased efficiency or even renders it inoperable. Robustness to these failures will become critical as robots are required to be more independent and operate out of the range of repair. To address these challenges, we present two approaches based on modular robotic architecture to improve robustness to actuator failure of both fixed-configuration robots and modular reconfigurable robots. Our work uses modular reconfigurable robots capable of modifying their style of locomotion and changing their designed morphology through ejecting modules. This framework improved the distance travelled and decreased the effort to move through the environment of simulated and physical robots. When the deployed robot was allowed to change its locomotion style, it showed improved robustness to actuator failure when compared to a robot with a fixed controller. Furthermore, a robot capable of changing its locomotion and design morphology statistically outlasted both tests with a fixed morphology. Testing was carried out using a gazebo simulation and validated in multiple tests in the field. We show for the first time that ejecting modular failed components can improve the overall mission length.

One of the greatest challenges to the automated production of goods is equipment malfunction. Ideally, machines should be able to automatically predict and detect operational faults in order to minimize downtime and plan for timely maintenance. While traditional condition-based maintenance (CBM) involves costly sensor additions and engineering, machine learning approaches offer the potential to learn from already existing sensors. Implementations of data-driven CBM typically use supervised and semi-supervised learning to classify faults. In addition to a large collection of operation data, records of faulty operation are also necessary, which are often costly to obtain. Instead of classifying faults, we use an approach to detect abnormal behaviour within the machine’s operation. This approach is analogous to semi-supervised anomaly detection in machine learning (ML), with important distinctions in experimental design and evaluation specific to the problem of industrial fault detection. We present a novel method of machine fault detection using temporal-difference learning and General Value Functions (GVFs). Using GVFs, we form a predictive model of sensor data to detect faulty behaviour. As sensor data from machines is not i.i.d. but closer to Markovian sampling, temporal-difference learning methods should be well suited for this data. We compare our GVF outlier detection (GVFOD) algorithm to a broad selection of multivariate and temporal outlier detection methods, using datasets collected from a tabletop robot emulating the movement of an industrial actuator. We find that not only does GVFOD achieve the same recall score as other multivariate OD algorithms, it attains significantly higher precision. Furthermore, GVFOD has intuitive hyperparameters which can be selected based upon expert knowledge of the application. Together, these findings allow for a more reliable detection of abnormal machine behaviour to allow ideal timing of maintenance; saving resources, time and cost.

Introduction: Children and adolescents with neurological impairments face reduced participation and independence in daily life activities due to walking difficulties. Existing assistive devices often offer insufficient support, potentially leading to wheelchair dependence and limiting physical activity and daily life engagement. Mobile wearable robots, such as exoskeletons and exosuits, have shown promise in supporting adults during activities of daily living but are underexplored for children.

Methods: We conducted a cross-sectional study to examine the potential of a cable-driven exosuit, the Myosuit, to enhance walking efficiency in adolescents with diverse ambulatory impairments. Each participant walked a course including up-hill, down-hill, level ground walking, and stairs ascending and descending, with and without the exosuit’s assistance. We monitored the time and step count to complete the course and the average heart rate and muscle activity. Additionally, we assessed the adolescents’ perspective on the exosuit’s utility using a visual analog scale.

Results: Six adolescents completed the study. Although not statistically significant, five participants completed the course with the exosuit’s assistance in reduced time (time reduction range: [-3.87, 17.42]%, p-value: 0.08, effect size: 0.88). The number of steps taken decreased significantly with the Myosuit’s assistance (steps reduction range: [1.07, 15.71]%, p-value: 0.04, effect size: 0.90). Heart rate and muscle activity did not differ between Myosuit-assisted and unassisted conditions (p-value: 0.96 and 0.35, effect size: 0.02 and 0.42, respectively). Participants generally perceived reduced effort and increased safety with the Myosuit’s assistance, especially during tasks involving concentric contractions (e.g., walking uphill). Three participants expressed a willingness to use the Myosuit in daily life, while the others found it heavy or too conspicuous.

Discussion: Increased walking speed without increasing physical effort when performing activities of daily living could lead to higher levels of participation and increased functional independence. Despite perceiving the benefits introduced by the exosuit’s assistance, adolescents reported the need for further modification of the device design before using it extensively at home and in the community.

Heterogeneous multi-agent systems can be deployed to complete a variety of tasks, including some that are impossible using a single generic modality. This paper introduces an approach to solving the problem of cooperative behavior planning in small heterogeneous robot teams where members can both function independently as well as physically interact with each other in ways that give rise to additional functionality. This approach enables, for the first time, the cooperative completion of tasks that are infeasible when using any single modality from those agents comprising the team.

Introduction: It is crucial to identify neurodevelopmental disorders in infants early on for timely intervention to improve their long-term outcomes. Combining natural play with quantitative measurements of developmental milestones can be an effective way to swiftly and efficiently detect infants who are at risk of neurodevelopmental delays. Clinical studies have established differences in toy interaction behaviors between full-term infants and pre-term infants who are at risk for cerebral palsy and other developmental disorders.

Methods: The proposed toy aims to improve the quantitative assessment of infant-toy interactions and fully automate the process of detecting those infants at risk of developing motor delays. This paper describes the design and development of a toy that uniquely utilizes a collection of soft lossy force sensors which are developed using optical fibers to gather play interaction data from infants laying supine in a gym. An example interaction database was created by having 15 adults complete a total of 2480 interactions with the toy consisting of 620 touches, 620 punches—“kick substitute,” 620 weak grasps and 620 strong grasps.

Results: The data is analyzed for patterns of interaction with the toy face using a machine learning model developed to classify the four interactions present in the database. Results indicate that the configuration of 6 soft force sensors on the face created unique activation patterns.

Discussion: The machine learning algorithm was able to identify the distinct action types from the data, suggesting the potential usability of the toy. Next steps involve sensorizing the entire toy and testing with infants.

The environmental pollution caused by various sources has escalated the climate crisis making the need to establish reliable, intelligent, and persistent environmental monitoring solutions more crucial than ever. Mobile sensing systems are a popular platform due to their cost-effectiveness and adaptability. However, in practice, operation environments demand highly intelligent and robust systems that can cope with an environment’s changing dynamics. To achieve this reinforcement learning has become a popular tool as it facilitates the training of intelligent and robust sensing agents that can handle unknown and extreme conditions. In this paper, a framework that formulates active sensing as a reinforcement learning problem is proposed. This framework allows unification with multiple essential environmental monitoring tasks and algorithms such as coverage, patrolling, source seeking, exploration and search and rescue. The unified framework represents a step towards bridging the divide between theoretical advancements in reinforcement learning and real-world applications in environmental monitoring. A critical review of the literature in this field is carried out and it is found that despite the potential of reinforcement learning for environmental active sensing applications there is still a lack of practical implementation and most work remains in the simulation phase. It is also noted that despite the consensus that, multi-agent systems are crucial to fully realize the potential of active sensing there is a lack of research in this area.

This survey reviews advances in 3D object detection approaches for autonomous driving. A brief introduction to 2D object detection is first discussed and drawbacks of the existing methodologies are identified for highly dynamic environments. Subsequently, this paper reviews the state-of-the-art 3D object detection techniques that utilizes monocular and stereo vision for reliable detection in urban settings. Based on depth inference basis, learning schemes, and internal representation, this work presents a method taxonomy of three classes: model-based and geometrically constrained approaches, end-to-end learning methodologies, and hybrid methods. There is highlighted segment for current trend of multi-view detectors as end-to-end methods due to their boosted robustness. Detectors from the last two kinds were specially selected to exploit the autonomous driving context in terms of geometry, scene content and instances distribution. To prove the effectiveness of each method, 3D object detection datasets for autonomous vehicles are described with their unique features, e. g., varying weather conditions, multi-modality, multi camera perspective and their respective metrics associated to different difficulty categories. In addition, we included multi-modal visual datasets, i. e., V2X that may tackle the problems of single-view occlusion. Finally, the current research trends in object detection are summarized, followed by a discussion on possible scope for future research in this domain.

Cobots are robots that are built for human-robot collaboration (HRC) in a shared environment. In the aftermath of disasters, cobots can cooperate with humans to mitigate risks and increase the possibility of rescuing people in distress. This study examines the resilient and dynamic synergy between a swarm of snake robots, first responders and people to be rescued. The possibility of delivering first aid to potential victims dispersed around a disaster environment is implemented. In the HRC simulation framework presented in this study, the first responder initially deploys a UAV, swarm of snake robots and emergency items. The UAV provides the first responder with the site planimetry, which includes the layout of the area, as well as the precise locations of the individuals in need of rescue and the aiding goods to be delivered. Each individual snake robot in the swarm is then assigned a victim. Subsequently an optimal path is determined by each snake robot using the A* algorithm, to approach and reach its respective target while avoiding obstacles. By using their prehensile capabilities, each snake robot adeptly grasps the aiding object to be dispatched. The snake robots successively arrive at the delivering location near the victim, following their optimal paths, and proceed to release the items. To demonstrate the potential of the framework, several case studies are outlined concerning the execution of operations that combine locomotion, obstacle avoidance, grasping and deploying. The Coppelia-Sim Robotic Simulator is utilised for this framework. The analysis of the motion of the snake robots on the path show highly accurate movement with and without the emergency item. This study is a step towards a holistic semi-autonomous search and rescue operation.

One of the big challenges in robotics is the generalization necessary for performing unknown tasks in unknown environments on unknown objects. For us humans, this challenge is simplified by the commonsense knowledge we can access. For cognitive robotics, representing and acquiring commonsense knowledge is a relevant problem, so we perform a systematic literature review to investigate the current state of commonsense knowledge exploitation in cognitive robotics. For this review, we combine a keyword search on six search engines with a snowballing search on six related reviews, resulting in 2,048 distinct publications. After applying pre-defined inclusion and exclusion criteria, we analyse the remaining 52 publications. Our focus lies on the use cases and domains for which commonsense knowledge is employed, the commonsense aspects that are considered, the datasets/resources used as sources for commonsense knowledge and the methods for evaluating these approaches. Additionally, we discovered a divide in terminology between research from the knowledge representation and reasoning and the cognitive robotics community. This divide is investigated by looking at the extensive review performed by Zech et al. (The International Journal of Robotics Research, 2019, 38, 518–562), with whom we have no overlapping publications despite the similar goals.

We explore an alternative approach to the design of robots that deviates from the common envisionment of having one unified agent. What if robots are depicted as an agentic ensemble where agency is distributed over different components? In the project presented here, we investigate the potential contributions of this approach to creating entertaining and joyful human-robot interaction (HRI), which also remains comprehensible to human observers. We built a service robot—which takes care of plants as a Plant-Watering Robot (PWR)—that appears as a small ship controlled by a robotic captain accompanied by kinetic elements. The goal of this narrative design, which utilizes a distributed agency approach, is to make the robot entertaining to watch and foster its acceptance. We discuss the robot’s design rationale and present observations from an exploratory study in two contrastive settings, on a university campus and in a care home for people with dementia, using a qualitative video-based approach for analysis. Our observations indicate that such a design has potential regarding the attraction, acceptance, and joyfulness it can evoke. We discuss aspects of this design approach regarding the field of elderly care, limitations of our study, and identify potential fields of use and further scopes for studies.