MOBILEHCI Adjunct '24: 26th International Conference on Mobile Human-Computer Interaction

Full Citation in the ACM Digital Library

SESSION: Late Breaking Work/Posters

A Touch of Gold - Spraying and Electroplating 3D Prints to Create Biocompatible On-Skin Wearables

Iterative design cycles for tangible user interfaces and wearable devices require efficient prototyping techniques to optimize development and to elevate the overall design efficacy. A key challenge for rapid prototyping techniques such as cardboard prototyping, 3D printing, or laser cutting is the integration of conductive surfaces. Additional wiring, conductive paint, or special materials like conductive filament often lack the high conductivity and durability needed for designing on-skin wearables that measure muscle activity or electrically stimulate the skin and muscles. To solve this problem, we propose combining spraying and electroplating to create surfaces that exhibit high conductivity, are solderable, corrosion-resistant, and skin-friendly, and embody both practical functionality and aesthetic value. In this paper, we describe an effective spraying and electroplating process for rapid prototyping and demonstrate its applicability using several examples of tangible user interfaces. Further, we discuss advantages and disadvantages and describe limitations of the approach.

Understanding Technological needs of Nigerians towards community policing engagement: An interview-based study

Community policing (CP) initiatives are increasingly relying on technology to engage stakeholders and improve public safety efforts. Applying such technology is critical to the effective implementation of such initiatives in Nigeria. However, no research has investigated the technology concerns and needs of Nigerian CP stakeholders to enhance their engagement in CP initiatives. This study, using qualitative interviews, explores the perspectives of Nigerian citizens and police officers on their technology needs for meaningful participation in community policing activities. Using a purposive sampling approach and thematic analysis of interview data, this study identifies key technological concerns and desired features of mobile surveillance and communication technologies that can enhance participation in community policing. Further findings reveal conflicting preferences between citizens and police regarding the visibility of surveillance devices. We propose a conflict-sensitive design approach for users with conflicting preferences, along with other design implications that address the specific technology needs of both groups for an enhanced community policing initiative in Nigeria.

"What's this?": Understanding User Interaction Behaviour with Multimodal Input Information Retrieval System

Human communication relies on integrated multimodal channels to facilitate rich information exchange. Building on this foundation, researchers have long speculated about the potential benefits of incorporating multimodal input channels into conventional information retrieval (IR) systems to support users’ complex daily IR tasks more effectively. However, the true benefits of such integration remain uncertain. This paper presents a series of exploratory pilot tests comparing Multimodal Input IR (MIIR) with Unimodal Input IR (UIIR) across various IR scenarios, concluding that MIIR offers distinct advantages over UIIR in terms of user experience. Our preliminary results suggest that MIIR could reduce the cognitive load associated with IR query formulation by allowing users to formulate different query components in a unified manner across different input modalities, particularly when conducting complex exploratory search tasks in unfamiliar, in-situ contexts. The discussion stemming from this finding suggests new angles for designing and developing MIIR systems.

DriveStats: a Mobile Platform to Frame Effective Sustainable Driving Displays

Phone applications to track vehicle information have become more commonplace, providing insights into fuel consumption, vehicle status, and sustainable driving behaviors. However, testing what resonates with drivers without deep vehicle integration requires a proper research instrument. We built DriveStats: a reusable library (and an encompassing mobile app) to monitor driving trips and display related information. By presenting estimated cost and emission reductions in a goal-directed framework, we demonstrate how information utility can increase over the course of a 10-day diary study with a group of North American participants. Participants who were initially interested in monetary savings reported increased utility for emissions-related information as app usage increased, resulting in self-reported sustainable behavior change. The DriveStats package can be used as a research probe for a plurality of mobility studies (driving, cycling, walking, etc.) to support mobile transportation research.

Exploring Trigger-Action Programs for Designing Self-Control Tools in Mobile Devices

Individuals may spend three to five hours interacting with their smartphone screens daily. Many of them want to reduce their screen time but fail, despite the many digital wellbeing tools currently available. For example, digital self-control tools (DSCTs) support user self-control of digital device use by raising awareness of usage patterns or letting users set time limits for specific websites, but their long-term effectiveness remains little explored. We conducted 7 focus groups with 39 participants to investigate the use and non-use of current DSCTs on mobile devices. We further explored user attitudes toward trigger-action programming (TAP, if-this-then-that rules) for designing customized DSCTs and elicited their preferences via a sketching session during the focus groups. Data analysis was grounded in the framework of the Habit Alteration Model. Findings show how nuanced individual self-control needs can be met with TAPs. Two smartphone design prototypes are presented to demonstrate our study findings.

MagSerea: Fingerprinting Magnetic Field of Specified Area with Wearable Sensors

In environments where GPS is impractical for indoor positioning, magnetic information has emerged as a promising alternative. This study proposes a novel location fingerprinting method called "MagSerea (Magnetic Field of Specified aREA)," which utilizes time-series data of three-axis magnetic information obtained by a wearable device attached to the arm when opening a door. Experimental results indicate that the door-opening motion restricts the deviations in the device’s position and angle, thereby maintaining high identification accuracy despite aging and changes in the user’s possessions. These findings suggest that MagSerea demonstrates high identification accuracy in real-world conditions, offering lower installation costs and fewer constraints on reference points compared to existing methods, and thus holds potential for a wide range of applications.

Train Me: Exploring Mobile Sports Capture and Replay for Immersive Sports Coaching

In recent years, a wide variety of instructional materials and applications have been developed to enhance athletes’ learning in different sports. The number of instructional videos, in particular, has increased significantly, though these often impose a high cognitive load as users must map displayed instructions to their own actions and body movements. Mobile Augmented Reality (AR) interfaces can reduce this burden by presenting instructional information directly where it is needed. This paper explores a mobile sports capture and replay approach for immersive self-training, aiming to help users improve their skills without needing coaches on-site. We investigate different capturing methods, including an exocentric capturing method, to enhance instructor mobility and flexibility. Using the captured data, we visualize sports training instructions in an immersive 3D environment on an AR headset. We propose three visualization methods and create a first prototype that allows us to explore the approach’s feasibility across different sports.

Eyes-free Circular Gestures on Smartphones

Smartphones are used in situations where users have limited visual focus, such as when walking or driving. Eyes-free gestures offer a way to interact with smartphones without requiring visual attention. This research delves into circular eyes-free gestures and elucidates the advantages they offer over other types of gestures in facilitating eyes-free interaction. We carried out two experiments to explore the ability of participants to accurately draw arcs with varying angles in a smartphone’s eyes-free context. The results of the first experiment revealed that participants commonly tended to exceed the intended arc lengths, regardless of whether they were drawing arcs clockwise or counterclockwise. The results of the second experiment showed that there is high variation in how the same user draws eyes-free circular gestures. However, this variation decreases if the second gesture is produced immediately after the first one.

Vision Beyond Boundaries: An Initial Design Space of Domain-specific Large Vision Models in Human-robot Interaction

The emergence of large vision models (LVMs) follows in the footsteps of the recent success of large language models (LLMs). However, there is a noticeable gap in structured research applying LVMs to human-robot interaction (HRI), despite extensive evidence supporting the efficacy of vision models in enhancing interactions between humans and robots. Recognizing this vast and anticipated potential, we introduce an initial design space that incorporates domain-specific LVMs, chosen for their superior performance over general-purpose models. We delve into three primary dimensions: HRI contexts, vision-based tasks, and specific domains. An empirical evaluation was conducted with 15 experts across five metrics, showcasing the design space's efficacy in relevant decision-making scenarios. We explore the ideation process and potential application scenarios, envisioning this design space as a foundational guideline for future HRI system design, emphasizing accurate domain alignment and model selection.

Towards Detecting and Mitigating Cognitive Bias in Spoken Conversational Search

Spoken Conversational Search (SCS) poses unique challenges in understanding user-system interactions due to the absence of visual cues and the complexity of less structured dialogue. Tackling the impacts of cognitive bias in today’s information-rich online environment, especially as SCS becomes more prevalent, this paper integrates insights from information science, psychology, cognitive science, and wearable sensor technology to explore potential opportunities and challenges in studying cognitive biases in SCS. It then outlines a framework for experimental designs, ranging from various experiment setups to multimodal instruments. It also analyzes data from an existing dataset as a preliminary example to demonstrate the potential of this framework and discuss its implications for future research. Finally, it discusses the challenges and ethical considerations associated with implementing this approach. This work aims to provoke new directions and discussion in the community and enhance understanding of cognitive biases in Spoken Conversational Search.

DAP: Develop pair-Authentication Protocol with DAP

In today’s interconnected world, secure authentication is crucial for both high-security environments and everyday interactions. Traditional authentication methods like passwords and biometrics are designed for individual use, but new challenges emerge in interactive gaming, theme parks, and collaborative virtual reality (VR), where multiple participants must authenticate collectively. This study introduces a multi-person authentication method that leverages cooperative actions to enhance security. By analyzing synchronized sensor data from cooperative actions, the system ensures the presence and consent of all participants, making impersonation difficult. We propose a pair-authentication method using inertial sensors during a complex handshake known as Dignity And Pride (DAP). Our research evaluates the accuracy of pair authentication, the impact of behavioral degradation over time, and resistance to attacks. Experiments with university students demonstrate high authentication accuracy and robustness against time degradation, though vulnerabilities to spoofing attacks were identified, suggesting areas for improvement in secure cooperative authentication.

MultiSurf-GPT: Facilitating Context-Aware Reasoning with Large-Scale Language Model Agents for Multimodal Surface Sensing

Surface sensing is widely employed in health diagnostics, manufacturing, and safety monitoring. Advances in mobile sensing afford the potential for context awareness in mobile computing, typically with a single sensing modality. Emerging multimodal large-scale language models offer new opportunities. We propose MultiSurf-GPT, which utilizes the advanced capabilities of GPT-4o to process and interpret diverse modalities (radar, microscope, and multispectral data) uniformly based on prompting strategies (zero-shot and few-shot prompting). We preliminarily validated our framework by using MultiSurf-GPT to identify low-level information and to infer high-level context-aware analytics, demonstrating its capability to augment context-aware insights. This framework shows promise as a tool to expedite the development of more complex context-aware applications in the future, providing a faster, more cost-effective, and integrated solution.

Cultural influence on RE activities: An extended analysis of state of the art

Designing mobile software that aligns with cultural contexts is crucial for optimizing human-computer interaction. Considering cultural influences is essential not only for the actual set of functional and non-functional requirements, but also for the whole Requirements Engineering (RE) process. Without a clear understanding of cultural influences on RE activities, it is hardly possible to produce a correct and complete set of requirements. This research explores the impact of national culture on RE-related activities based on recent studies. We conducted a Systematic Literature Review (SLR) of studies published in 2019-2023 and compared them to an older SLR covering 2000-2018. We identified 17 relevant studies, extracted 33 cultural influences impacting RE activities, and mapped them to the Hofstede model, widely used for cultural analysis in software development research. Our work highlights the critical role of national culture in RE activities, summarizes current research trends, and helps practitioners consider cultural influences in mobile app and software development.

Exploiting Air Quality Monitors to Perform Indoor Surveillance: Academic Setting

Changing public perceptions and government regulations have led to the widespread use of low-cost air quality monitors in modern indoor spaces. Typically, these monitors detect air pollutants to augment the end user’s understanding of their indoor environment. Studies have shown that having access to one’s air quality context reinforces the user’s urge to take necessary actions to improve the air over time. Thus, users’ activities significantly influence indoor air quality. This correlation can be exploited to infer sensitive indoor activities from side-channel air quality fluctuations. This study explores the odds of identifying eight indoor activities (i.e., enter, exit, fan on, fan off, AC on, AC off, gathering, eating) in a research lab with an in-house low-cost air quality monitoring platform named DALTON. Our extensive data collection and analysis over three months shows 97.7% classification accuracy on our dataset.

To Touch, or Not to Touch: Evaluating Manual Page Turning Modalities for Digital Sheet Music During Piano Play

The widespread adoption of digital sheet music by performers in recent years appears promising, yet not without limitations. While enhancing interactivity and convenience, a persistent challenge remains: the act of turning pages while musicians’ hands are engaged on the piano keys. To investigate perceived user experience of page turning in digital sheet music, we conducted a controlled laboratory experiment with fifteen participants (N = 15), comparing five state-of-the-art page turning modalities. The results revealed that hands-free modalities were generally preferred over touch-based modalities, with head gestures showing promise as hands-free alternatives to foot pedals, particularly in terms of perceived efficiency, attractiveness, stimulation, and novelty.

Draw4CM: Detecting Cervical Myelopathy via Hand Drawings Captured by Mobile Devices

Cervical myelopathy (CM), which causes numbness and pain in the hands, can affect manual dexterity and render daily life difficult. The characteristics of this disease are expected to be apparent in drawing, owing to the fine motor control it requires. Therefore, this study proposed a method to screen for CM by asking patients to draw a figure on a tablet screen using a stylus. The drawing was captured as an image and fed into our machine-learning model to predict CM. The proposed models exhibited sensitivity and specificity comparable to or better than conventional physical methods while avoiding examiner subjectivity and bias. Further, we developed a smartphone application that uses the camera to capture images of figures drawn on paper for CM screening. Moreover, because smartphones and tablets are ubiquitous and can capture disease characteristics, the proposed methods make it easier to reach potential patients.

Usability Evaluation of a Mobile Application for Foot Health Monitoring of Smart Insoles: A Mixed Methods Study

Increasing research on intelligent wearable technologies, such as smart insoles for mobile health (mHealth) monitoring, brings new challenges for user-centered design, particularly in data visualization. Research shows various developments in mobile apps for monitoring foot health with smart insoles, while emerging trends like chatbot interactions and improved analytic visualizations offer new opportunities to enhance user experience. However, the usability of various health data visualizations remains unvalidated. A mixed-method experimental user study with 30 participants was conducted to assess the usability of three prototype mHealth applications for smart insoles: Analytical, Basic, and Chatbot visualizations. Quantitative results showed that Basic visualization achieved the highest usability, followed by Analytical, and finally Chatbot. Qualitative feedback supported these findings but also highlighted the potential of Chatbot interactions to enhance data understanding in mHealth apps. We discuss implications for future mHealth applications to monitor foot health and propose design recommendations to improve usability in chatbot interactions.

Exploring the Feasibility of a Repeated Mobile One-Minute PVT In-the-Wild

Researchers have established that quality sleep is important for cognitive functions. Now that consumer-grade wearables can provide accurate measurements of sleep quality, we should be able to inform individuals about how their sleep impacts cognitive functions. However, as our internal clock follows cycles of alertness and sleepiness, it is still necessary to identify at what time of day the sleep quality impacts cognitive performance. In this study, we investigated the relationship between sleep stages and psychomotor vigilance using a 1-minute Psychomotor Vigilance Task (PVT) test. Participants wore a sleep-tracking device and performed the PVT test six times daily for three weeks. The results suggest that increased Rapid Eye Movement (REM) sleep duration led to better performance after lunch. Additionally, we developed a model to predict average response time based on the sleep data. The results show a promising step towards giving tangible meaning to sleep data.

Enhancing Mobile Interaction: Practical Insights from Smartphone and Smartwatch Integration

In the realm of mobile technology, smartwatches have emerged as valuable complements to smartphones, offering unique features such as enhanced activity tracking and in-situ data analysis. This study delves into the synergistic relationship between smartwatches and smartphones through practical applications developed by university students. We guided three groups of students from a computer-related program to design and develop cross-mobile device applications tailored to their daily contexts. Our evaluation of these student-developed applications revealed several key insights. Firstly, there was a strong preference for using smartwatches for straightforward notification purposes, indicating their convenience and immediacy in delivering critical information. However, their effectiveness as extended displays was found to depend heavily on a well-crafted approach to information delivery, necessitating careful design considerations. The study also highlighted the complexities inherent in designing intuitive and efficient cross-device interaction gestures, underlining the importance of thoughtful gesture selection and customization to enhance user experience. Furthermore, our findings indicated that users found it unnecessary to use smartwatches for voice input when paired with smartphones, as a single smartphone was deemed sufficient for this purpose. This research provides valuable insights into enhancing cross-device interaction, aiming to foster more seamless and user-focused integration of technology into daily life. These insights can guide developers in creating more effective and user-friendly applications, ultimately contributing to the broader field of human-computer interaction.

DesignWatch: Analyzing Users' Operations of Mobile Apps Based on Screen Recordings

Screen recordings of users’ operations to complete tasks in a mobile app are vital resources for designers to assess the app’s usability. However, analyzing these recordings at a large scale can be mentally challenging. In this paper, we present DesignWatch, which assists designers in analyzing users’ operations of mobile apps based on collected screen recordings. DesignWatch supports interactive visual analyses of multiple users’ operation paths in the app and prompts GPT-4 with vision to simulate users’ thoughts during each operation. We conducted expert interviews with four designers, which highlight DesignWatch’s usefulness in helping them quickly understand users’ operation patterns in the app, identify potentially problematic UI design pages, and gain insights for improving the app design. We conclude with design implications for facilitating usability tests with interactive visualization and generative models.

EmoFoot: Can Your Foot Tell How You Feel when Playing Virtual Reality Games?

Understanding feelings in Virtual Reality (VR) games is vital for enhancing engagement in human-computer interaction. Traditional methods for assessing emotions and player experience, both subjective and objective, often fall short of capturing players’ comprehensive and nuanced experiences. This preliminary study introduces EmoFoot, a novel approach leveraging foot pressure sensors to decode player feelings during VR gameplay. We show the diverse patterns of VR game experiences using subjective reports focused on immersion, competence, negative and positive affect, flow, tension, challenge, and engagement. By integrating smart insoles, our research investigates the potential of using foot pressure data to identify valence and arousal levels. We use Machine Learning models to discover how players’ feet can reveal their emotions. EmoFoot aims to introduce a seamless and unobtrusive method for monitoring player experience, contributing to immersive technology by enhancing our understanding of the objective indicators of player emotions and improving the overall gaming experience.

SESSION: Demos

AuditNet: Conversational AI Security Assistant

In the age of information overload, professionals across various fields face the challenge of navigating vast amounts of documentation and ever-evolving standards. Ensuring compliance with standards, regulations, and contractual obligations is a critical yet complex task across various professional fields. We propose a versatile conversational AI assistant framework designed to facilitate compliance checking on the go, in diverse domains, including but not limited to network infrastructure, legal contracts, educational standards, environmental regulations, and government policies. By leveraging retrieval-augmented generation using large language models, our framework automates the review, indexing, and retrieval of relevant, context-aware information, streamlining the process of verifying adherence to established guidelines and requirements. This AI assistant not only reduces the manual effort involved in compliance checks but also enhances accuracy and efficiency, supporting professionals in maintaining high standards of practice and ensuring regulatory compliance in their respective fields. We propose and demonstrate AuditNet, the first conversational AI security assistant designed to assist IoT network security experts by providing instant access to security standards, policies, and regulations.

Demonstrating TOM: A Development Platform for Wearable Intelligent Assistants in Daily Activities

Advanced wearable digital assistants can significantly enhance task performance, reduce user burden, and provide personalized guidance to improve users’ abilities. However, developing these assistants presents several challenges. To address this, we introduce TOM (The Other Me), a conceptual architecture and open-source software platform (https://github.com/TOM-Platform) that supports the development of wearable intelligent assistants that are contextually aware of both the user and the environment. Collaboratively developed with researchers and developers, TOM meets their diverse requirements. TOM facilitates the creation of intelligent assistive AR applications for daily activities and supports the recording and analysis of user interactions, integration of new devices, and the provision of assistance for various activities.

Geofence-to-Conversation: Hierarchical Geofencing for Augmenting City Walks with Large Language Models

This study presents a geofence-based service architecture for city-wide audio augmented reality, tailored for the era of large language models. Traditional geofencing mechanisms, which monitor user entry into geofences, struggle to provide continuous storytelling in areas with few points of interest, degrading the audio tour experience for pedestrians. Our proposed geofencing architecture consistently incorporates complex and multilayered city features, enabling seamless audio tour experiences. Furthermore, this paper introduces prompt engineering that enables large language models to generate entertaining guide scripts: the geofence-to-conversation technique. A mobile application developed for field deployment demonstrates the feasibility of our proposed architecture and highlights future challenges in enhancing users’ interaction with a city.

GestureShirt: Exploring Gestures in Front of the Body for Truly Mobile Interaction while Running

Running is the most practiced sport worldwide. Modern technology significantly enhances the running experience. Runners frequently use audio devices, monitoring apps, and wearable sensors like heart rate monitors. Utilizing these technologies while in motion presents unique challenges, as continuous movement can obstruct user interaction. Common devices, such as smartphones, smartwatches, and fitness trackers, are widely employed, but their small interfaces and touch screens often necessitate slowing down for effective use. In this work, we explore the space around the body for truly mobile interaction during a run. This demo paper introduces GestureShirt, a novel garment that employs removable distance sensors to explore gestural interaction around the body. Based on an exploratory study (N = 16), we distilled three strategies for utilizing gestures in front of the body for truly mobile interaction. In the demonstration, visitors can explore these strategies through functioning prototypes.

The Atlas of AI Incidents in Mobile Computing: Visualizing the Risks and Benefits of AI Gone Mobile

Today’s visualization tools for conveying the risks and benefits of AI technologies are largely tailored for those with technical expertise. To bridge this gap, we have developed a visualization that employs narrative patterns and interactive elements, enabling the broader public to gradually grasp the diverse risks and benefits associated with AI. Using a dataset of 54 real-world incidents involving AI in mobile computing, we examined design choices that enhance public understanding and provoke reflection on how certain AI applications—even those deemed low-risk by law—can still lead to significant incidents. Visualization: https://social-dynamics.net/atlas

SESSION: Industrial Perspectives

Stress Testing Gestures for Smartphone System Navigation

The Android operating system (OS) started in 2008 with buttons (Home, Back, etc.) to enable users to navigate their smartphone. We describe a novel research methodology used during the development of Android 10’s first gesture-based navigation system. Specifically, we describe a methodology that takes inspiration from using wear-and-tear as a measure of the quality of a product. For example, a shoe that can endure more steps is higher in quality than one that can endure fewer steps. We apply this logic to the research methodology to test gestures for smartphone system navigation: the more times users can gesture without discomfort, the higher the ergonomic quality of the gesture. We use this method in conjunction with surveying participants on their stated preferences. The combination of data affords us greater confidence in our conclusions.

Leveraging AI for Improved User Feedback in Mobile Survey Video Prototypes

This paper explores the transformative impact of artificial intelligence (AI) on user feedback collection, particularly focusing on the transition from traditional textual surveys to video-based prototypes with AI-generated voiceovers. As digital landscapes evolve, the methods for capturing user feedback must adapt to meet changing user preferences and technological capabilities. We propose that integrating AI technology in survey design not only enhances the accessibility and engagement of surveys but also significantly reduces the cognitive load on respondents.

Designing GPS: A User Engagement Module for Mobile Software Applications

In a collaboration between Rexuniversal S.L. in Marbella (Spain) and Universidad Carlos III de Madrid, we have designed a mobile application module called GPS that enhances user engagement through gamification, personalization, and social networking. The module also considers other user engagement elements, such as interactivity, design quality, privacy, content quality, and utility, and provides a personalization algorithm that adapts to user preferences. This paper presents the proposed module's fundamentals and its design and prototyping process. We also elaborate on how academic and industrial evaluations assessed the prototype design against key design and prototyping aspects: usability, aesthetics, functionality, satisfaction, and engagement. The initial findings of the evaluation of the GPS module showed strengths in visual design and engagement but indicated areas for improvement in navigation, consistency, and information presentation. An industrial evaluation with expert feedback confirmed the module's potential and provided actionable insights for refinement. Software designers can benefit from the insights of this study for engineering and designing engaging mobile applications.

SESSION: Workshop Proposals

Affective Computing for Mobile Technologies

Mobile technologies have become integral to daily life, and understanding users’ emotional states during interactions is crucial for enhancing user experience. However, integrating affective perception, behavior analysis, and affective computing for mobile technologies presents multifaceted challenges, ranging from technological limitations to ethical considerations. This workshop proposes a collaborative exploration of cutting-edge solutions for affective computing for mobile technologies. We aim to bring together researchers and practitioners from both academia and industry to explore topics such as user behavior analytics, user experience design, affective computing applications, cultural and contextual considerations, and the ethical implementation of affective computing, and to identify 1) innovative solutions, 2) novel applications, and 3) key challenges in this area to drive research in the coming decade. The long-term goal is to create a strong interdisciplinary research community that includes researchers and practitioners from HCI, HRI, Ubiquitous Computing, Cognitive Psychology, Mobile Technology, Interaction Techniques, User Privacy, and Design. We envision ongoing research collaborations and accelerating innovations in affective computing for mobile technologies.

mobiCHAI - 1st International Workshop on Mobile Cognition-Altering Technologies (CAT) using Human-Centered AI

The quest for enhanced cognition has been a driving force behind human advancement, fostering innovation and personal fulfillment. Cognition-Altering Technologies (CAT) hold immense promise in elevating the quality of life across diverse domains including education, decision-making, healthcare, and fitness. The current proliferation of Artificial Intelligence (AI), particularly the widespread adoption of Generative AI and foundational models, presents an unprecedented opportunity to prototype new CAT that can augment human capabilities. This workshop aims to unite interdisciplinary research communities to explore the potential of leveraging GenAI and human-centered AI to develop relevant CAT. Taking place at MobileHCI 2024, this one-day workshop invites researchers, practitioners, and designers from fields such as artificial intelligence, ubiquitous computing, human-computer interaction, and social sciences to collaborate and chart the future of cognitive enhancement through technology.

Designing Age-Inclusive Interfaces: Emerging Mobile, Conversational, and Generative AI to Support Interactions across the Life Span

We are concurrently witnessing two significant shifts: voice and chat-based conversational user interfaces (CUIs) are becoming ubiquitous (especially more recently due to advances in generative AI and LLMs - large language models), and older people are becoming a very large demographic group (and increasingly adopting the mobile technology on which such interfaces are present). However, despite the recent increase in research activity, age-relevant and inter/cross-generational aspects continue to be underrepresented in both research and commercial product design. Therefore, the overarching aim of this workshop is to increase the momentum for research within the space of hands-free, mobile, and conversational interfaces that centers on age-relevant and inter- and cross-generational interaction. For this, we plan to create an interdisciplinary space that brings together researchers, designers, practitioners, and users to discuss and share challenges, principles, and strategies for designing such interfaces across the life span. We thus welcome contributions of empirical studies, theories, design, and evaluation of hands-free, mobile, and conversational interfaces designed with aging in mind (e.g. older adults or inter/cross-generational use). We particularly encourage contributions focused on leveraging recent advances in generative AI or LLMs. Through this, we aim to grow the community of CUI researchers across disciplinary boundaries (human-computer interaction, voice and language technologies, geronto-technologies, information studies, etc.) engaged in the shared goal of ensuring that the aging dimension is appropriately incorporated in mobile / conversational interaction design research.

SESSION: Student Research Competition

CN-T9: Optimization of T9 Chinese Input Layout

This paper presents a novel T9 keyboard layout designed specifically for Chinese users, considering the influence of the Pinyin system. The traditional T9 layout, originally developed for English typing, is inefficient for Chinese input, leading to a higher cognitive load and reduced typing speed. Our proposed layout optimizes key placement based on the frequency and distribution of Pinyin syllables, incorporating intuitive methods for inputting tone markers. By minimizing keystrokes and enhancing ergonomic design, the new layout aims to improve typing efficiency and comfort. We conducted comprehensive evaluations comparing the new layout with traditional methods, showing significant improvements in input speed and user satisfaction. This research highlights the importance of tailoring keyboard designs to specific language requirements, contributing to the broader field of human-computer interaction and providing a better typing experience for Chinese users.
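A layout optimization of this kind needs an objective to minimize, such as the frequency-weighted keystroke count. The sketch below illustrates that idea only: the standard T9 letter-to-key mapping is real, but the toy syllable frequencies and the one-tap-per-letter scoring rule are assumptions for illustration, not the paper's actual model.

```python
# Illustrative scoring of a T9-style layout by expected keystrokes per
# Pinyin syllable. The frequency table is invented; a real optimizer would
# use corpus statistics and search over alternative letter-to-key mappings.

# Standard T9 letter groups and their keys.
T9_LAYOUT = {
    "abc": "2", "def": "3", "ghi": "4", "jkl": "5",
    "mno": "6", "pqrs": "7", "tuv": "8", "wxyz": "9",
}
LETTER_TO_KEY = {ch: key for letters, key in T9_LAYOUT.items() for ch in letters}

def keystrokes(syllable: str) -> int:
    """One keypress per letter under predictive (one-tap) T9 input."""
    return sum(1 for ch in syllable if ch in LETTER_TO_KEY)

def expected_keystrokes(freqs: dict[str, float]) -> float:
    """Frequency-weighted average keystrokes per syllable for this layout."""
    total = sum(freqs.values())
    return sum(keystrokes(s) * f for s, f in freqs.items()) / total

# Toy Pinyin syllable frequencies (illustrative only).
freqs = {"de": 0.3, "shi": 0.25, "yi": 0.2, "zhong": 0.15, "guo": 0.1}
print(expected_keystrokes(freqs))  # 2.8
```

A candidate layout with a lower expected-keystroke score (and, in the paper's setting, lower ambiguity among high-frequency syllables) would be preferred.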

By the Fire: A Children's Game for Promoting Hani Culture

The Hani ethnic group, which is mainly distributed in China, has produced many distinctive cultural traditions throughout its history. However, the lack of a writing system and the impact of modern society have caused some Hani cultural practices to face decline or even extinction. By the Fire: Hani's Epic is a game made for mobile tablets, such as an iPad. It integrates important events and culturally significant items in the history of the Hani people into the gameplay. The game combines card-based and minigame elements to capitalize on the curiosity of children aged 6-10, leading to increased appreciation and understanding of the Hani culture. The use of physical accessories also enhances the game immersion and strengthens the reflective experience outside the game, culminating in a greater understanding of Hani culture for both children and their parents.

Distancebit: Designing a Technology Probe to Envision AR Glasses Enhancing Embodied Cross-Cultural Social Interaction

In cross-cultural social interactions, understanding non-verbal cues is as crucial as language. Distancebit is a critical design that reflects on how personal computing devices influence these interactions and explores the evolution of Augmented Reality (AR) in enhancing cross-cultural social experiences. Using a speculative design approach, we envision a future where everyone uses AR glasses. Our design focuses on differing interpersonal space (IPS) preferences in contact and non-contact cultures, helping users recognize culturally appropriate IPS and signal discomfort to the outside world when it is violated. Prototype evaluations indicated that Distancebit increased participant engagement and reflection on cross-cultural interactions. We also discuss cultural diversity, embodiment in cross-cultural settings, biosignal visualization, and human-technology relations, envisioning a future AR wearable device more seamlessly embedded into the socio-cultural fabric, communicating the user’s information to the outside environment and thereby enhancing understanding and social interaction.

Express Yourself Simply: Mobile AI for Bridging Communication Gaps

Doodle Me is an AI mobile application designed to bridge communication gaps and barriers. This tool is beneficial for individuals with articulation and phonological speech disorders. By using advanced machine learning algorithms, particularly a Convolutional Neural Network (CNN), the application interprets user-created drawings in real-time. This app provides a universal mode of expression that transcends traditional language barriers. The CNN model, trained on the QuickDraw dataset, achieves an accuracy of 92.6%.

The application is accessible and practical, functioning seamlessly across all smartphones. Doodle Me enhances communication by recognizing and categorizing doodles, and provides visual and auditory feedback to users. This tool provides a holistic approach by aiding in the rehabilitation of numerous medical conditions. It accommodates multiple languages and age groups and is also eco-friendly.

The application serves as an assistive tool in the healthcare system. It can support a wide range of medical conditions, including post-ischemic stroke aphasia, facial paralysis, stuttering/stammering, Autism Spectrum Disorder (ASD), post-operative laryngectomy or vocal cord paralysis, Alzheimer’s disease, and rehabilitation for post-cochlear implant patients (where children are taught about speech recognition by giving visual cues) and elderly with presbycusis (hearing loss at high frequencies).

Doodle Me addresses communication barriers and promotes biosustainability by reducing the need for unnecessary stationery items. It is a practical application aimed at making a positive societal impact by facilitating seamless integration.

SESSION: Doctoral Consortium

Ethical Implications of Pervasive Augmented Reality

As a context-aware and omnipresent technology, Pervasive Augmented Reality (Pervasive AR) is poised to become a significant wearable technology in our daily lives. While extensive research has focused on the technical capabilities and social acceptance of Pervasive AR devices, there remains a notable gap in understanding their broader impacts. Given the continuous augmentation and ubiquitous nature of this technology, it is crucial to examine its effects on users, bystanders, and society as a whole. This work aims to qualitatively explore the potential ethical implications of Pervasive AR by exposing participants to predetermined scenarios using technology probes. Our findings will inform design recommendations to address identified ethical concerns. I believe the Doctoral Consortium would be an excellent platform to present my work, especially in the context of my upcoming projects, to expert researchers and receive constructive feedback on my PhD research.

Exploring Visual Discomfort and Opportunities for Vision Augmentations: Visual Noise Cancellation and Head-worn LCD Light Actuators for Perception Modulation

Recently, Head-Mounted Displays have shown great potential for augmenting vision beyond traditional visual aids, such as compensating for complex visual impairments or providing superhuman vision to the unimpaired eye. Example applications addressed colour vision deficiency or emphasised visual cues for people with low vision. However, Visual Discomfort or Visual Noise, and how to address it in vision augmentations, is less explored, which is surprising given that acoustic noise control and cancellation are widely applied in modern audio headphones. In my PhD, we explore the area of Visual Noise and Discomfort through a series of studies. Specifically, we examine what constitutes Visual Discomfort and Visual Noise, build first prototypes to gather feedback on the concept of Visual Noise cancellation, including novel prototypes for vision augmentation using LCD light actuators, and review the literature on the current state of vision augmentation.

Meaningful Interaction with Digital Data in Motion

The emergence of Augmented Reality (AR) technologies has revolutionized the way information can be presented, offering novel opportunities for data visualization and enhanced interactivity. However, investigating AR concepts outdoors, especially on-the-move, has proven to be a challenging task. Although previous work has examined use cases for AR in everyday activities in urban environments, most of the explored contexts involving people on-the-move outdoors remain at a conceptual stage and mainly regard AR from a utilitarian perspective. This thesis’ goal is to design and evaluate meaningful AR concepts that assist people while on-the-move in a context-sensitive and personalized manner, hoping to empower them by leveraging new forms of interaction while merging the virtual and physical world. I aim to contribute an overview of the design space for AR while people are on-the-move, provide insights for meaningful AR visualizations, and establish design guidelines for such concepts.

Situated Instructions and Guidance For Self-training and Self-coaching in Sports

Recent advancements in Virtual Reality (VR) and Augmented Reality (AR) have made remote education immersive and 3D, gaining traction in various fields like medicine, entertainment, education, and engineering. Remote sports training has also gained attention, leading to the development of applications for analyzing movements and improving skills through 3D visualization. However, automatic guidance systems are not well-researched, and limitations of existing motion capture setups have been highlighted. In this thesis, we propose to use a combination of video analysis, AR, and activity recognition for remote sports training. Our approach includes two capture modes (egocentric and exocentric) for indoor and outdoor activities and utilizes computer vision-based methods for motion estimation instead of relying on wearable sensors or expensive motion-capture facilities. We explore different visualization modes and plan to develop a deep-learning model to provide automatic guidance during training sessions.

Towards Enhanced Context Awareness with Vision-based Multimodal Interfaces

Vision-based Interfaces (VIs) are pivotal in advancing Human-Computer Interaction (HCI), particularly in enhancing context awareness. However, there are significant opportunities for these interfaces due to rapid advancements in multimodal Artificial Intelligence (AI), which promise a future of tight coupling between humans and intelligent systems. AI-driven VIs, when integrated with other modalities, offer a robust solution for effectively capturing and interpreting user intentions and complex environmental information, thereby facilitating seamless and efficient interactions. This PhD study explores three application cases of multimodal interfaces to augment context awareness, each focusing on one of three dimensions of the visual modality (scale, depth, and time): fine-grained analysis of physical surfaces via microscopic images, precise projection of the real world using depth data, and rendering haptic feedback from video backgrounds in virtual environments.

Post Training Quantization Strategies for Diffusion Models

This study addresses the critical need for quantizing audio diffusion models to enable efficient synthesis on resource-constrained devices. We propose to design objective audio quality metrics for realism and fidelity, which can be utilized to optimize the denoising process of diffusion models. The current state of our research includes the application of quantization strategies from the image domain to off-the-shelf audio diffusion models such as AudioLDM and Make-an-Audio. The proposed work consists of exploring the specific operations in the U-Net architecture that should be quantized, focusing on weights and activations and how to effectively calibrate them. We plan to evaluate our framework on mobile devices to cater to a range of hardware, both off-the-shelf and customized.
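To make the calibration step concrete, the following is a minimal sketch of symmetric int8 post-training quantization with max-abs calibration, the textbook baseline behind quantizing weights and activations. It illustrates the general technique only, not the specific strategy proposed in this work; the example tensor is invented.

```python
# Minimal sketch: symmetric int8 post-training quantization with max-abs
# calibration. A calibration pass picks a scale per tensor; values are then
# rounded to integers and can be dequantized back with the same scale.

def calibrate_scale(values: list[float], num_bits: int = 8) -> float:
    """Choose a scale so the largest magnitude maps to the int range edge."""
    qmax = 2 ** (num_bits - 1) - 1          # 127 for int8
    max_abs = max(abs(v) for v in values)
    return max_abs / qmax if max_abs else 1.0

def quantize(values: list[float], scale: float, num_bits: int = 8) -> list[int]:
    """Round to integers and clip to the signed num_bits range."""
    qmax = 2 ** (num_bits - 1) - 1
    qmin = -qmax - 1
    return [max(qmin, min(qmax, round(v / scale))) for v in values]

def dequantize(qvalues: list[int], scale: float) -> list[float]:
    return [q * scale for q in qvalues]

weights = [0.05, -1.27, 0.8, 0.0]            # toy weight tensor
scale = calibrate_scale(weights)             # 1.27 / 127 = 0.01
q = quantize(weights, scale)                 # [5, -127, 80, 0]
recovered = dequantize(q, scale)             # close to the original weights
```

Activation quantization works the same way, except the calibration statistics come from running representative inputs through the model; choosing which U-Net operations get this treatment is exactly the design space the abstract describes.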