Posters & Demos

For the best interactive experience with the posters and demonstrations, please refer to the map and to the details of each presentation (abstract, authors, and more) below.

Technologies Thrust

Ruihan Gao (CMU), Wenzhen Yuan (PI), Jun-Yan Zhu (PI from CMU)

Generating a high-resolution model of the world is a challenge. In this work, we leverage deep generative models to create a multi-sensory experience where users can touch and see the synthesized object when sliding their fingers on a haptic surface. The main challenges lie in the significant scale discrepancy between vision and touch sensing and the lack of explicit mapping from touch sensing data to a haptic rendering device. To bridge this gap, we collect high-resolution tactile data with a GelSight sensor and create a new visuotactile clothing dataset. We then develop a conditional generative model that synthesizes both visual and tactile outputs from a single sketch. Finally, we introduce a pipeline to render high-quality visual and tactile outputs on an electroadhesion-based haptic device for an immersive experience, allowing for challenging materials and editable sketch inputs.

Vivian Shen, Tucker Rae-Grant, Joe Mullenbach, Chris Harrison, and Craig Shultz

Generalized haptic feedback displays are an open research topic that has received recent attention due to high-profile use cases such as AR/VR and telemanipulation. Building such devices, however, is difficult, as it draws on cross-disciplinary fields including engineering, human-computer interaction, materials science, psychophysics, and more. In this work, we present a new approach to create high-resolution shape-changing fingerpad arrays with 20 haptic pixels/cm². Unlike prior pneumatic approaches, our actuators are low-profile (5 mm thick), low-power (approximately 10 mW/pixel), and entirely self-contained, with no tubing or wires running to external infrastructure. We show how multiple actuator arrays can be built into a five-finger, 160-actuator haptic glove that is untethered and lightweight (207 g, including all drive electronics and battery). We will discuss initial results from a technical performance evaluation and a suite of eight user studies quantifying the diverse capabilities of the system. These include recognition of object properties such as complex contact geometry, texture, and compliance, as well as expressive spatiotemporal effects.

Zilin Si (CMU), Tianhong Yu (Cornell), Kojo Vandyck, James McCann (PI from CMU), Wenzhen Yuan (PI)

Digital machine knitting stands at the forefront of transformative technologies, merging traditional textile craftsmanship with digital programming to produce smart fabrics and interactive garments. In the immersive technology landscape, its role is pivotal, particularly in the development of textiles embedded with sensors and actuators. These innovations redefine user experiences in virtual reality (VR) and gaming, offering tactile interfaces that respond to and enhance user interactions.

In the realm of immersive technologies, the marriage of digital machine knitting and haptic sensor and actuator development is groundbreaking. By incorporating tactile feedback into knitted garments, we elevate the user experience in virtual reality and gaming. Imagine a knitted glove, its integrated sensors translating nuanced hand movements into virtual actions, enhancing immersion and interactivity. Beyond entertainment, haptic feedback holds promise in healthcare, where a knitted sleeve with embedded actuators can facilitate therapeutic vibrations for rehabilitation exercises.

Highlighting the advantages of machine knitting, we emphasize its customizability, scalability, simplicity, and rapid fabrication. Our ongoing RobotSweater project serves as a tangible example, showcasing a machine-knitted pressure-sensitive tactile skin for robots. This project addresses the challenge of creating large-scale and flexible tactile skins for robots, offering a scalable and customizable solution using off-the-shelf yarns and an automated knitting machine. Its parameterized design allows customization, ensuring adaptability to diverse robot shapes and sizes. This project holds immense promise for reshaping Human-Robot Interaction (HRI) dynamics and enhancing social touch experiences, ushering in a new era of natural and intuitive engagements between humans and robots.

RobotSweater’s influence extends beyond functionality. The textile nature and flexibility of the tactile skin contribute not only to its utility but also to the aesthetic appeal of robots, promoting social acceptability. The impact of our work extends beyond robotics into healthcare, gaming, and VR, highlighting the versatility of machine knitting in creating wearable solutions. The integration of wearable sensors and actuators into knitted fabrics offers a dynamic platform for diverse applications. This presentation aims to showcase the simplicity, scalability, and broad applicability of digital machine knitting, emphasizing its pivotal role in shaping the future of wearable technologies, immersive experiences, and human-machine interactions.

Linzhan Mou (CS), Jun-Kun Chen (CS), Yuxiong Wang (CS)

Bridging Dreams and Reality with Instruct 4D-to-4D. The enthralling, lifelike virtual world depicted in "Ready Player One" has captivated audiences, illustrating a widespread desire for deeply immersive experiences. To achieve such realism, the ability to generate and edit 3D/4D scenes becomes essential. In pursuit of this goal, we introduce Instruct 4D-to-4D, a novel approach to editing dynamic 4D environments. Our innovation lies in treating 4D scenes as pseudo-3D scenes (3D scenes with videos as their views), addressing temporal consistency in video editing, and applying editing operations to pseudo-3D scenes effectively. This model employs natural language instructions for manipulating 4D scenes, enhancing the creation of immersive virtual realities. The advanced Instruct-Pix2Pix (IP2P) model, augmented with an anchor-aware attention module, facilitates both intra-batch and inter-batch connections and ensures consistent frame-to-frame edits, surpassing the limitations of traditional 2D diffusion models. Optical flow-guided appearance propagation, applied in a sliding window fashion, ensures precise, coherent modifications, capturing motion and transformation smoothly. A depth-based projection method is designed to propagate the editing between pseudo-3D views, maintaining spatial integrity and geometric consistency. The iterative editing process in our pipeline ensures each modification converges toward a cohesive and polished final result. Extensively evaluated across various scenes and editing instructions, Instruct 4D-to-4D achieves spatially and temporally consistent results with enhanced detail and sharpness. Applicable in both monocular and complex multi-camera scenarios, this advancement in dynamic 4D scene editing paves the way for more realistic and interactive immersive technologies, mirroring the aspiration to create dream-like virtual worlds as envisioned in popular culture.

Sirui Xu (CS), Zhengyuan Li (CS), Yuxiong Wang (CS), Lynna Gui (CS)

The generation of virtual human-object interactions is important in immersive technologies, bridging the gap between digital and physical realities. In virtual environments, realistic human-object interactions are crucial for creating convincing, engaging experiences that closely mimic real-life scenarios. This capability is essential not only for enhancing user immersion but also for practical applications. For example, in virtual training environments, accurately simulated interactions allow for effective skill learning in fields such as medicine, aviation, and military operations, without the risks or costs associated with real-world training. In design and architecture, the ability to interact with virtual models in a realistic manner aids in better understanding and visualizing complex designs. Therefore, to further approximate real interactions in virtual worlds, we present a novel task of anticipating 3D human-object interactions (HOIs). Most existing research on HOI synthesis lacks comprehensive whole-body interactions with dynamic objects; it is often limited to manipulating small or static objects. Our task is significantly more challenging, as it requires modeling dynamic objects with various shapes, capturing whole-body motion, and ensuring physically valid interactions. To this end, we propose InterDiff, a framework comprising two key steps: (i) interaction diffusion, where we leverage a diffusion model to encode the distribution of future human-object interactions; (ii) interaction correction, where we introduce a physics-informed predictor to correct denoised HOIs in a diffusion step. Our key insight is to inject prior knowledge that the interactions under reference with respect to contact points follow a simple pattern and are easily predictable.
Experiments on multiple human-object interaction datasets demonstrate the effectiveness of our method for this task. It produces realistic, vivid, and remarkably long-term 3D HOI predictions, showing promise for future integration into immersive computing platforms.

Doug Friedel (NCSA), Steven Gao (CS), Jeffrey Liu (CS), Qinjun Jiang (CS), Yihan Pang (CS), Rahul Singh (CS), Mark van Moer (NCSA)

ILLIXR is the first open-source end-to-end XR system (1) with state-of-the-art components, (2) integrated with a modular and extensible multithreaded runtime, (3) providing an OpenXR-compliant interface to XR applications (e.g., game engines), and (4) with the ability to report (and trade off) several quality of experience (QoE) metrics. Recent efforts have enabled ILLIXR to work with OpenXR applications developed with various game engines (Godot, Unreal, etc.) and to offload several key components in the XR pipeline to the cloud for more efficient power usage on device. This presentation will offer an overview of the ILLIXR project, its objectives, and key features. It will also showcase, through several published and ongoing projects, ILLIXR's potential to empower researchers with the tools needed to conduct experiments spanning system-level architecture to application-level studies.

Qinjun Jiang (CS), Yihan Pang (CS), William Sentosa (CS), Steven Gao (CS), Muhammad Huzaifa (CS), Brighten Godfrey (CS), Javier Perez-Ramirez (Intel Corporation), Dibakar Das (Intel Corporation), David Gonzalez-Aguirre (Intel Corporation), Sarita Adve (CS)

To unlock the full potential of extended reality (XR), the availability of comfortable, all-day wearable devices capable of providing rich immersive experiences is crucial. One primary constraint in designing such XR devices is power consumption. With head tracking identified as one of the top consumers of power, in this work, we reduce XR power consumption by offloading head tracking to a remote server. To deal with the unpredictable latency associated with the network, our design offloads the visual-inertial odometry (VIO) component of head tracking, as it is CPU-intensive, while compensating for the delays in its responses by the on-device IMU integration. We implement this approach as a novel end-to-end XR system, allowing us to experimentally evaluate power and user experience. We found that even with considerable compression of data sent to VIO and unusually large network latency, our system produces an end-to-end XR experience comparable to a system without offloading, while cutting average CPU power by 45.5% and full system power by 17.7%. Moreover, we find that user experience does not correspond cleanly to commonly used trajectory error metrics, motivating the need for future work on benchmarks and metrics tuned for XR research.

Meanwhile, to better accommodate bandwidth-limited scenarios, as well as further reduce the power consumption of camera image capturing, compression, and transmission, we employ a selective frame-dropping strategy. Frames that do not contribute sufficient new features for head tracking are intentionally omitted. Rather than performing the compute-intensive feature extraction directly on the device, we adopt an alternative approach, predicting the similarity between two images based on their pose differences. By implementing this method, we anticipate substantial savings in bandwidth usage without compromising the overall user experience.
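The selective frame-dropping idea can be illustrated with a minimal sketch. It assumes a pose is a position plus a single yaw angle and uses hypothetical thresholds (`MIN_TRANSLATION_M`, `MIN_ROTATION_RAD`) as a stand-in for the similarity predictor; the actual predictor and its parameters are not described in the abstract.

```python
import math
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration.
MIN_TRANSLATION_M = 0.05   # metres moved since the last transmitted frame
MIN_ROTATION_RAD = 0.10    # radians turned since the last transmitted frame


@dataclass
class Pose:
    x: float
    y: float
    z: float
    yaw: float  # radians; a full system would use a quaternion


def pose_difference(a: Pose, b: Pose) -> tuple[float, float]:
    """Return (translation distance, absolute yaw difference) between two poses."""
    translation = math.dist((a.x, a.y, a.z), (b.x, b.y, b.z))
    # Wrap the yaw difference into [-pi, pi) before taking its magnitude.
    dyaw = (b.yaw - a.yaw + math.pi) % (2 * math.pi) - math.pi
    return translation, abs(dyaw)


def select_frames(poses: list[Pose]) -> list[int]:
    """Return indices of frames to transmit; all other frames are dropped."""
    if not poses:
        return []
    kept = [0]  # always send the first frame
    for i in range(1, len(poses)):
        # Compare against the last frame that was actually transmitted.
        translation, rotation = pose_difference(poses[kept[-1]], poses[i])
        if translation >= MIN_TRANSLATION_M or rotation >= MIN_ROTATION_RAD:
            kept.append(i)
    return kept
```

Comparing each frame against the last transmitted frame, rather than the previous frame, keeps slow drift from being dropped indefinitely. The decision needs only poses that the device already computes, so no on-device feature extraction is required.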

Steven Gao (CS), Jeffrey Liu (CS), Qinjun Jiang (CS), Talal Touseef (CS), Brighten Godfrey (CS), Sarita Adve (CS)

Power consumption is a major constraint in extended reality (XR) because truly immersive experiences require devices that can be worn for prolonged periods of time. One method to address this problem is to offload computation to a remote server. Rendering is a prime target for offloading because it is a major consumer of power and a client headset's power constraints generally limit the quality of rendering that can be done on-device. In this work, we reduce XR power consumption by offloading rendering to a remote server. To compensate for network-related latency challenges, we implement a set of spatiotemporal reprojection techniques to reuse previous frame data for new frame extrapolation. These are computationally cheaper and faster methods that can be used to temporarily generate frames asynchronously while new frames from the server are still being processed. We implement this approach as an end-to-end XR system in ILLIXR, allowing us to evaluate power and user experience experimentally. We also analyze the tradeoffs between different reprojection techniques, which require differing types of information and therefore have differing network bandwidth and latency requirements. This framework also enables future research in asynchronous reprojection and/or supersampling, network protocols, and user experience metrics.

Vignesh Suresh (ECE), Bakshree Mishra (CS), Zeran Zhu (ECE), Ying Jing (ECE), Naiyin Jin (ECE), Charles Block (CS), Paolo Mantovani (Columbia), Davide Giri (Columbia), Joseph Zuckerman (Columbia), Luca Carloni (Columbia), Sarita Adve (CS)

Resource-constrained systems-on-chips (SoCs) like extended reality (XR) headsets are becoming increasingly heterogeneous with specialized accelerators for various tasks. However, end-to-end speedup with hardware acceleration is diminished by acceleration taxes like control and data movement taxes. At the same time, emerging workloads such as XR workloads are becoming increasingly task-diverse and have several fine-grained acceleration candidates. This motivates the need for a paradigm of disaggregated acceleration that provides benefits of flexibility, reuse, and utilization, albeit at the cost of increased impact of acceleration taxes such as control and data taxes.

In this work, we seek to minimize these taxes through a combination of (1) a lightweight accelerator synchronization interface (ASI) to reduce the control tax, and (2) the flexible Spandex-FCS coherence protocol to reduce the data movement tax. Using an FPGA implementation, we perform a comprehensive evaluation of multiple hardware configurations and multiple benchmarks, which include real-world use cases such as the 3D spatial audio decoder in an XR system. Our results show that ASI and Spandex-FCS, by minimizing the acceleration taxes, enable fine-grained acceleration for different edge systems. Further, we enable new opportunities to efficiently exploit accelerator-level parallelism, such as accelerator chaining and pipelining, and demonstrate their benefits in our real-world use cases. Finally, we show that the efficient composition of disaggregated accelerators with ASI and Spandex-FCS can provide performance close to a monolithic specialized accelerator.

Rajalaxmi Rajagopalan (student), Yu-lin Wei (student), Romit Roy Choudhury (PI)

We consider the problem of personalizing audio to maximize user experience. Briefly, we aim to find a filter that, when applied to any music or speech, will maximize the user's satisfaction. This is a black-box optimization problem since the user's satisfaction function is unknown.

Today's black-box optimization methods focus on optimizing unknown functions in high-dimensional spaces. In certain human-centric applications, evaluating these functions is expensive, hence the optimization must be performed under tight sample budgets. Past work has approached black-box optimization using Gaussian Process Regression (GPR), and sample efficiency has been improved by utilizing information about the shape/structure of the function.

We propose to discover the structure of the human-perception function by learning a GPR kernel. Learning the kernel is achieved via an auxiliary optimization in kernel space. The optimal kernel is expected to best "fit" the unknown function, helping lower the sample budget. Results show that our method, KerGPR, effectively minimizes the black-box function at a substantially lower sample budget.

These results hold not only in synthetic black-box functions, but also in real-world audio personalization applications, where music is optimized to maximize a user's personal satisfaction.

Ahan Gupta (UIUC), Yueming Yuan (UIUC), Devansh Jain (UIUC), Yuhao Ge (UIUC), David Aponte (Microsoft), Yanqi Zhou (Google), Charith Mendis (UIUC)

Multi-head-self-attention (MHSA) mechanisms achieve state-of-the-art (SOTA) performance across natural language processing and vision tasks. Unfortunately, their quadratic dependence on sequence lengths has bottlenecked inference speeds. To circumvent this bottleneck, researchers have proposed a variety of sparse-MHSA mechanisms, where a subset of full-attention is computed. In spite of the promise of sparse-MHSA, current software tools do not support high-performance implementations for a variety of these sparse structures. On one end, code-generation strategies and optimizations used by deep learning compilers and vendor libraries target either extremely high or low levels of sparsity, leading to suboptimal performance for the moderate levels of sparsity in sparse-MHSA. On the other end, hand-written optimized implementations lack generality, are cumbersome to implement, and are specific to only one sparse-MHSA pattern.

To fill this gap, we introduce SPLAT: an optimized code generation framework that is general enough to cover a variety of sparse-MHSA structures and appropriately targets their moderate levels of sparsity. The key insight behind SPLAT is that sparsity patterns in these structures are regular. We introduce a new sparse format, affine compressed sparse-row (ACSR), that captures this regularity, and develop novel, optimized GPU code-generation algorithms that operate on ACSRs to accelerate sparse-MHSA kernels. We show SPLAT's generality by using it to implement a variety of sparse-MHSA models, achieving geomean speedups of 2.05x and 4.05x over hand-written kernels written in Triton and TVM respectively. Moreover, its interfaces are intuitive and easy to use with existing implementations of MHSA in JAX.

Yuan Li (Zhejiang University), Zhi-Hao Lin (CS), David Forsyth (CS), Jia-Bin Huang (UMD), Shenlong Wang (CS)

Physical simulations produce excellent predictions of weather effects. Neural radiance fields produce SOTA scene models. We describe a novel NeRF-editing procedure that can fuse physical simulations with NeRF models of scenes, producing realistic movies of physical phenomena in those scenes. Our application, ClimateNeRF, allows people to visualize what climate change outcomes will do to them.

ClimateNeRF allows us to render realistic weather effects, including smog, snow, and flood. Results can be controlled with physically meaningful variables like water level. Qualitative and quantitative studies show that our simulated results are significantly more realistic than those from SOTA 2D image editing and SOTA 3D NeRF stylization.

Hongchi Xia (SJTU), Zhi-Hao Lin (CS), Wei-Chiu Ma (MIT), Shenlong Wang (CS)

Creating high-quality and interactive virtual environments, such as games and simulators, often involves complex and costly manual modeling processes. In this paper, we present Video2Game, a novel approach that automatically converts videos of real-world scenes into realistic and interactive game environments. At the heart of our system are three core components: (i) a neural radiance fields (NeRF) module that effectively captures the geometry and visual appearance of the scene; (ii) a mesh module that distills the knowledge from NeRF for faster rendering; and (iii) a physics module that models the interactions and physical dynamics among the objects. By following the carefully designed pipeline, one can construct an interactable and actionable digital replica of the real world. We benchmark our system on both indoor and large-scale outdoor scenes. We show that we can not only produce highly realistic renderings in real time, but also build interactive games on top.

Zhi-Hao Lin (CS), Bohan Liu (CS), Yi-Ting Chen (UMD), David Forsyth (CS), Jia-Bin Huang (UMD), Anand Bhattad (CS), Shenlong Wang (CS)

We present UrbanIR (Urban Scene Inverse Rendering), a new inverse graphics model that enables realistic, free-viewpoint renderings of scenes under various lighting conditions with a single video. It accurately infers shape, albedo, visibility, and sun and sky illumination from wide-baseline videos, such as those from car-mounted cameras, differing from NeRF's dense view settings. In this context, standard methods often yield subpar geometry and material estimates, such as inaccurate roof representations and numerous 'floaters'. UrbanIR addresses these issues with novel losses that reduce errors in inverse graphics inference and rendering artifacts. Its techniques allow for precise shadow volume estimation in the original scene. The model's outputs support controllable editing, enabling photorealistic free-viewpoint renderings of night simulations, relit scenes, and inserted objects, marking a significant improvement over existing state-of-the-art methods.

Rahul Singh (CS), Muhammad Huzaifa (CS), Jeffrey Liu (CS), Anjul Patney (NVIDIA), Hashim Sharif (CS), Yifan Zhao (CS), Sarita Adve (CS)

Extended reality (XR) devices, including augmented, virtual, and mixed reality, provide a deeply immersive experience. However, practical limitations like weight, heat, and comfort put extreme constraints on the performance, power consumption, and image quality of such systems. In this work, we study how these constraints form the tradeoff between Fixed Foveated Rendering (FFR), Gaze-Tracked Foveated Rendering (TFR), and conventional, non-foveated rendering. While existing papers have often studied these methods, we provide the first comprehensive study of their relative feasibility in practical systems with limited battery life and computational budget. We show that TFR with the added cost of the gaze-tracker can often be more expensive than FFR. Thus, we co-design a gaze-tracked foveated renderer considering its benefits in computation, power efficiency, and tradeoffs in image quality. We describe principled approximations for eye tracking which provide up to a 9× speedup in runtime performance with approximately a 20× improvement in energy efficiency when run on a mobile GPU. In isolation, these approximations appear to significantly degrade the gaze quality, but appropriate compensation in the visual pipeline can mitigate the loss. Overall, we show that with a highly optimized gaze-tracker, TFR is feasible compared to FFR, resulting in up to 1.25× faster frame times while also reducing total energy consumption by over 40%.

Boyuan Tian (ECE), Muhammad Huzaifa (CS), Yihan Pang (CS), Shenlong Wang (CS), Sarita Adve (CS)

Mobile vision is crucial for empowering mobile devices to understand and interpret their visual surroundings. Its applications are widespread, spanning areas such as AR/VR, autonomous driving, and smartphones. These applications often require highly accurate results processed in real time for prompt interaction, but must also operate on mobile platforms with limited battery life and energy constraints. Therefore, it is essential to consider not only low latency and high accuracy but also low energy consumption, a factor that is often overlooked in algorithm designs.

To avoid prioritizing marginal algorithmic latency and accuracy at the expense of excessive energy usage, we introduce the concept of energy-proportional mobile vision - a discipline that seeks to develop and execute mobile vision algorithms consuming energy in proportion to user-perceivable benefits in latency and accuracy.
We present a case study to navigate the trilemma in energy, latency, and accuracy. We adopt TSDF Fusion, a classic 3D reconstruction framework that progressively integrates posed RGB-D frames into a voxel-based signed distance map. We demonstrate energy disproportionality from three perspectives: algorithmic design, execution strategy, and data selection, revealing opportunities to achieve energy-proportional mobile vision. By synergistically co-optimizing all three aspects, we achieve an average reduction in energy consumption by 24.53X and a decrease in frame latency by 1.42X, without compromising reconstruction quality.
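For readers unfamiliar with TSDF Fusion, the per-voxel update it performs for every frame can be sketched as a weighted running average of truncated signed distances. The sketch below works on a single voxel along one camera ray; `TRUNCATION` and `MAX_WEIGHT` are hypothetical values, and this is the textbook update rather than the optimized pipeline evaluated in the case study.

```python
from dataclasses import dataclass

TRUNCATION = 0.1   # metres; hypothetical truncation distance
MAX_WEIGHT = 64.0  # hypothetical cap on the accumulated weight


@dataclass
class Voxel:
    tsdf: float = 1.0    # normalised signed distance, 1.0 means "free space"
    weight: float = 0.0  # number of observations fused so far (capped)


def integrate(voxel: Voxel, voxel_depth: float, measured_depth: float) -> None:
    """Fuse one depth observation into a voxel along a camera ray.

    voxel_depth: distance of the voxel centre from the camera along the ray.
    measured_depth: depth reported by the sensor for that ray.
    """
    sdf = measured_depth - voxel_depth  # positive in front of the surface
    if sdf < -TRUNCATION:
        return  # voxel lies well behind the surface (occluded): skip it
    sample = min(1.0, sdf / TRUNCATION)  # truncate and normalise
    new_weight = voxel.weight + 1.0
    # Weighted running average of all samples seen so far.
    voxel.tsdf = (voxel.tsdf * voxel.weight + sample) / new_weight
    voxel.weight = min(new_weight, MAX_WEIGHT)
```

Because this update runs for every voxel in the view frustum on every frame, the choice of which frames to integrate and how many voxels to touch directly drives energy use, which is where the data-selection and execution-strategy perspectives in the text come in.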

Hashim Sharif, Yifan Zhao, Maria Kotsifakou, Akash Kothari, Benjamin Schreiber, Elizabeth Wang, Yasmin Sarita, Nathan Zhao, Keyur Joshi, Vatsin Shah, Arun Sivakumar, Mateus Valverde, Mohd. Abdulrahman, Vikram Adve, Sasa Misailovic, Sarita Adve, Girish Chowdhary 

Future multiuser immersive applications designed around rich, interactive user interface modalities will present difficult performance, energy, and human interface challenges that will require aggressive new optimization strategies. One important dimension we can exploit in these applications is their interactive nature: because human perception has limits in visual resolution, response time, and other metrics, the applications and the underlying systems can exploit a variety of approximation techniques that enable aggressive optimizations as long as user-perceived quality metrics are preserved. We describe ApproxTuner, a framework for selection and tuning of approximation techniques that maximizes performance and energy efficiency while ensuring that desired, measurable application-level quality metrics are preserved. In the context of immersive applications, we aim to explore new approximation techniques (including techniques for reducing and tolerating user-to-user latencies) and a variety of application-specific quality metrics that can collectively achieve far greater gains than are achievable today.

Yifan Zhao, Hashim Sharif, Vikram Adve, Sasa Misailovic 

Obtaining high-performance implementations of tensor programs, such as deep neural networks, on a wide range of hardware remains challenging. Search-based tensor program optimizers can automatically find high-performance programs on a given hardware platform, but the search process in existing tools suffers from low efficiency, requiring hours or days to discover good programs due to the size of the search space.

To address this problem, we present Felix, a novel optimization framework for tensor programs. As a stark departure from current tensor compilers that target commodity GPUs, Felix creates a symbolic and differentiable space of TVM tensor programs and applies continuous relaxation on the space of programs. Through this process, Felix creates a differentiable estimator of program latency, allowing an efficient search of program candidates using gradient descent, in contrast to conventional approaches that search over a non-differentiable objective function over a discrete search space. We extensively evaluate six deep neural networks for vision and natural language processing tasks on two GPU-based platforms. Our experimental results show that Felix surpasses the performance of both manually written and autotuned kernel libraries in PyTorch and TensorFlow within 5 minutes of search time on average. Felix also finds optimized programs significantly faster than TVM Ansor, a state-of-the-art search-based optimization framework for tensor programs.

We are excited to discuss with the broader XR community the opportunities to speed up neural networks and other tensor-based computation (e.g., used in image and video processing) in applications with strict latency requirements and/or on custom hardware for AR/VR. To enable future low-latency and accurate XR, we aim to outline techniques for optimizing trade-offs across multiple properties: execution time, energy, reduced accuracy (due to optimizations like network pruning or quantization), and robustness (i.e., the ability of computations to execute reliably in the presence of input perturbations like image rotations or changes in lighting).
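The general idea of relaxing a discrete schedule parameter, descending a differentiable latency estimate, and rounding back can be shown with a toy sketch. The latency model below (`work / tile + pressure * tile`) is purely hypothetical and is not Felix's estimator; it only makes the contrast with discrete search concrete.

```python
def latency(tile: float, work: float = 64.0, pressure: float = 1.0) -> float:
    """Toy differentiable latency estimate for a relaxed (real-valued) tile size.

    Small tiles pay per-tile overhead (work / tile); large tiles pay
    resource pressure (pressure * tile). Both terms are assumptions.
    """
    return work / tile + pressure * tile


def latency_grad(tile: float, work: float = 64.0, pressure: float = 1.0) -> float:
    """Analytic derivative of the toy latency with respect to the tile size."""
    return -work / tile**2 + pressure


def optimize_tile(candidates: list[int], start: float = 2.0,
                  lr: float = 0.1, steps: int = 500) -> int:
    """Gradient descent on the relaxed tile size, then round to a valid candidate."""
    lo, hi = min(candidates), max(candidates)
    tile = start
    for _ in range(steps):
        tile -= lr * latency_grad(tile)
        tile = max(lo, min(hi, tile))  # stay inside the valid range
    # Map the continuous optimum back to the nearest discrete choice.
    return min(candidates, key=lambda c: abs(c - tile))


if __name__ == "__main__":
    print(optimize_tile([1, 2, 4, 8, 16, 32]))
```

A discrete search would have to measure or estimate each candidate separately; with a differentiable estimate, each step moves all relaxed parameters at once, which is what makes the search cheaper as the number of parameters grows.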

TP5 - Hardware/Software Co-Design of Data-Centric Computers

Ryan Wong (CS), Yiqiu Sun (CS), Minh S. Q. Truong (Carnegie Mellon Univ.), Pratik Sampat (CS), Sudhanshu Agarwal (CS), Yilin Shen (ECE), Arjun Tyagi (CS), Saugata Ghose (CS)

Immersive computing applications rely on retrieving and processing increasingly large amounts of data. However, conventional computer hardware wastes significant energy and time on large-scale data processing, preventing platforms for immersive computing from scaling down in size, and constraining the ability to efficiently distribute computation in a multi-node setting. Our work examines several approaches to building data-centric computer systems, where we rethink key principles of hardware/software design to avoid the inefficiencies of conventional computers. These range from changes that are implementable for existing systems, to ones that use emerging technologies to propose long-range solutions for data handling. Our poster highlights two key areas where we are focusing our efforts. First, we have a large ongoing effort on hardware and software for processing-in-memory, where we can use non-conventional systems to perform general-purpose operations where data is stored, eliminating the large amount of energy spent by current systems on data movement. Second, we have several projects that rethink key abstractions between memory/storage, operating systems, compilers, and applications, in an attempt to intelligently manage data throughout the system. As these efforts progress, we aim to prototype full-stack solutions for programmable computing platforms.

Yongzhou Chen, Ammar Tahir, Francis Yan, Radhika Mittal

It is challenging to meet the bandwidth and latency requirements of interactive real-time applications such as virtual reality on time-varying 5G cellular links. Today’s feedback-based congestion controllers try to match the sending rate at the endhost with the estimated network capacity. However, such controllers cannot precisely estimate a cellular link capacity that changes at timescales smaller than the feedback delay. We instead propose a different approach for controlling congestion on 5G links for such applications. We send real-time data streams using an imprecise controller (that errs on the side of overestimating network capacity) to ensure high throughput, and then adapt the transmitted content by dropping appropriate packets in the cellular base stations to match the actual capacity and minimize delay. We build a system called Octopus to realize this approach. Octopus provides parameterized primitives that applications at the endhost can configure differently to express different content adaptation policies. Octopus transport encodes the corresponding app-specified parameters in packet header fields, which the base station logic can parse to execute the desired dropping behavior. Our evaluation shows how real-time applications involving standard and volumetric videos can be designed to exploit Octopus, and achieve 1.5–18× better performance than state-of-the-art schemes.
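The in-network content adaptation described above can be illustrated with a small sketch. The abstract does not specify Octopus's primitives or header layout, so the `priority` field and the greedy policy below are hypothetical stand-ins for an app-specified dropping policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    seq: int          # assumed to increase with arrival order
    size_bytes: int
    priority: int     # hypothetical header field: 0 = most important


def adapt_to_capacity(queue: list[Packet], capacity_bytes: int) -> list[Packet]:
    """Keep the most important packets that fit; return them in arrival order."""
    budget = capacity_bytes
    kept = set()
    # Visit packets from most to least important; ties keep arrival order.
    for pkt in sorted(queue, key=lambda p: (p.priority, p.seq)):
        if pkt.size_bytes <= budget:
            kept.add(pkt.seq)
            budget -= pkt.size_bytes
    # Everything not kept is dropped at the base station.
    return [p for p in queue if p.seq in kept]
```

The sender can overshoot the link capacity because the drop decision happens where the instantaneous capacity is known, and the application's header fields decide which content is sacrificed first.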

Talal Touseef, William Sentosa, Milind Kumar Vaddiraju, Debopam Bhattacherjee, Balakrishnan Chandrasekaran, Brighten Godfrey, Shubham Tiwari, Sarita Adve, Shenlong Wang, Javier Perez-Ramirez

Interactive networked applications like XR require high throughput, low latency, and high reliability from the network to provide a seamless user experience. While meeting these three requirements simultaneously is difficult, there has been an emergence of heterogeneous virtual channels (HVCs) which support some subset of them at the expense of the others. For instance, URLLC sacrifices throughput to achieve low latency and reliability in 5G NR. In the case of Wi-Fi, Wireless Time Sensitive Networking (WTSN) provides a channel with deterministic latency at the expense of throughput and Multi-Link Operation (MLO) provides reliability through redundancy. Prior work either focuses on aggregating the bandwidth of these channels whilst neglecting their unique properties or fails to generalize in the sense of achieving high performance across different applications and channels. To utilize HVCs to their fullest, we argue that there are challenges and opportunities across the network, transport and application layers, and the application-transport interface of the network stack. In this work, we identify the constituting principles of a design that is general, performant, and deployable. Finally, we argue for a transport layer solution design based on MPQUIC that is aware of the properties of underlying paths, combines congestion control with steering between individual paths and can utilize information from the application layer to boost performance.

Applications Thrust

Brad Sutton, James Evans, Jenny Amos, Matthew Bramlet, Andres Maldonado, Eliot Bethke, Aaron Anderson, Graham Huesmann, Connor Davey

Epilepsy is one of the most prevalent neurological conditions in the world. For patients with drug-resistant epilepsy, stereoelectroencephalographic (SEEG) electrodes are placed into the brain to localize seizure foci for resection. Due to the multimodal complexity of SEEG data collected from the entire brain volume, visualization tools are needed to effectively interpret the recorded electrical activity and improve its clinical use. We developed a Python-based software tool to automatically extract SEEG recorded activity and visualize it in three-dimensional, virtual reality (VR), anatomical space. We analyzed SEEG data from four epilepsy patients who underwent evaluation at the OSF Saint Francis Medical Center in Peoria, Illinois, in accordance with an IRB-approved protocol. All patients underwent preoperative magnetic resonance imaging (MRI) and post-implant computed tomography (CT). Our algorithm handles all the necessary steps to merge the MRI, CT, and SEEG data and display them in a coherent, subject-specific model for the surgical team. It performs the following steps: co-registration of the MRI to the post-implant CT; extraction of brain volumes (gray/white matter and CSF) from the MRI; segmentation of the SEEG contacts from the CT image by applying image erosion and thresholding steps followed by replacement with synthetic 3D electrodes; conversion of imaging data into a merged isotropically sampled space; and conversion of all volumes into VR-ready data formats. Imaging data was converted to object files using Lewiner’s marching cubes algorithm. A semi-automated method is used to map the identified electrodes to SEEG channels based on the naming convention used during implantation. SEEG data is filtered to extract activity in the ripple band (80-250 Hz). Activity is displayed to the surgeon by modulating the size of the electrodes in the 4D (3D space and time) VR object.
For this display, we computed the normalized windowed power of the signal at each frame of the future animation for each contact. These normalized powers are min-max scaled to show the difference between contacts. Animation is automated using Blender scripts by scaling the size of the electrode contacts to the scaled power at that frame, where the electrode grows with high signal power and shrinks with low signal power. A 3D timeline is imported into the model to track where animation is in relation to the seizure onset. The generated model is exported as an FBX file which can be loaded into VR to view the anatomical and temporal data merged into a single model. Viewing the combined model in the VR space enables a better understanding of the anatomical position of electrodes with high activity which contribute to the seizure onset zone. During a seizure, we can watch the signals propagate both along the electrode and from one electrode to another. This provides a brand-new method of viewing seizure activity for surgeons and epileptologists to use.
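
The power-to-size mapping described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' tool: it assumes the signal has already been band-pass filtered to the ripple band, it uses mean squared amplitude as the windowed power, and the function names, window length, and scale range (`lo`, `hi`) are hypothetical.

```python
def windowed_power(signal, window, step):
    """Mean squared amplitude in windows of `window` samples, advancing by `step`."""
    powers = []
    for start in range(0, len(signal) - window + 1, step):
        chunk = signal[start:start + window]
        powers.append(sum(x * x for x in chunk) / window)
    return powers

def min_max_scale(rows, lo=0.5, hi=2.0):
    """Map every value in `rows` to [lo, hi] using the global min and max,
    so that sizes remain comparable across contacts."""
    flat = [v for row in rows for v in row]
    vmin, vmax = min(flat), max(flat)
    span = vmax - vmin
    if span == 0:
        return [[lo for _ in row] for row in rows]
    return [[lo + (hi - lo) * (v - vmin) / span for v in row] for row in rows]

# Two toy contacts, 8 samples each, assumed already ripple-band filtered.
contacts = {
    "A1": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    "A2": [1.0, -1.0, 1.0, -1.0, 2.0, -2.0, 2.0, -2.0],
}
powers = [windowed_power(sig, window=4, step=4) for sig in contacts.values()]
scales = min_max_scale(powers)  # one scale factor per contact per frame
print(scales)
```

Each inner list holds the per-frame scale factor for one contact; in an animation script these values would be written to the electrode object's size at the matching frame.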

Daniel Robertson, MD; Matthew Bramlet, MD

Virtual Reality (VR) creates a 3-dimensional (3D) computer-generated environment in which orientation and interaction with the 3D environment are possible. Head-mounted displays (HMD) project the environment in front of the user’s eyes creating a wide field of view and a feeling of full immersion. Motion tracking systems allow interaction with this generated environment. Patient specific 2D information is used for virtual reality modeling (VRM) in pediatric surgical oncology for preoperative planning. Models can be generated within 48 hours of data acquisition or less if required. The anatomic complexity of oncology cases makes use of VRM particularly valuable. Survey data reveals that confidence of pediatric surgeons increases after review of VRM and the operative plan changes 8.3% of the time. More in-depth qualitative assessments are being used to further study the impact. Model creation will be explained. Specific models will demonstrate the utility and clinical impact. Ongoing studies and future directions will be discussed, including the foundation of an NIH 3D library as an open repository for VRM of tumors.

Irfan S. Ahmad (CIMED, HMNTL), Alexa Lauinger (CIMED student), Bilal Karim (BioE student), Alexander Smith (MD/PhD student CIMED/CS), Sarita Adve (CS), Brad Sutton (BioE), Colleen Bushell (NCSA), Mark Cohen (CI MED)

The US NSF recently approved the addition of the Carle Illinois College of Medicine (CI MED) as a site for the Center for Medical Innovations in eXtended Reality (MIXR), an NSF Industry-University Collaborative Research Center (IUCRC) led by the University of Maryland, College Park, in collaboration with the University of Maryland Baltimore and the University of Michigan, Ann Arbor. MIXR's Phase 1 IUCRC integrates computer science and engineering with biomedical and clinical sciences to advance the development of extended reality applications in medicine. The center focuses on enhancing public health, patient management, and healthcare outcomes across various clinical practices, from diagnostics to therapeutics. MIXR's roadmap includes clinical practice, medical education, training, and establishing regulatory standards. Collaborating with industry partners, MIXR addresses challenges in making extended reality (XR) an integral part of medical education and clinical care delivery. In clinical practice, MIXR explores XR efficacy in areas such as cardiac catheterization, intubation, post-trauma care, pain management, extra-ventricular drainage, and rural specialty care. The MIXR center, with the inclusion of the UIUC site, focuses on four key themes: evaluating XR effectiveness in clinical applications, enhancing medical education and training, establishing regulatory standards in partnership with the FDA, and promoting diversity in global healthcare by democratizing XR technologies. With commitments from ~19 companies, MIXR aims to accomplish its mission.

Angel Chatterton (UIUC - Gies Accountancy)

This research proposal explores the innovative integration of Augmented Reality (AR), Virtual Reality (VR), and Artificial Intelligence (AI) as an assessment tool to enhance auditing education within a learning platform. The primary objective is to enhance learning experiences and assessment capabilities. The proposal investigates how AR and VR (collectively known as Extended Reality – XR) together with AI can create immersive, interactive auditing scenarios, allowing learners to engage in auditing tasks and experiences. XR will be utilized to understand whether assessments delivered using this technology heighten the learners’ comprehension of auditing concepts in comparison to traditional methods of educational delivery. Using ChatGPT’s API, an interactive dialogue will be deployed to provide simulated challenges and guidance encountered during audits. The research will focus on evaluating learners’ satisfaction, perceived effect of knowledge retained, and confidence in the application of audit learning objectives. Integration with a learning platform should improve access to these technologies, allowing educators to track learners’ progress and incorporate XR activities into their existing curriculum. This proposal includes development of the auditing assessment simulation and a pilot study. The findings will contribute to the field by enhancing our understanding of XR and AI in transforming business education and inspiring further research.

Kingsley Osei-Asibey (Athletics), Farzaneh Masoud (IFSI), Chief J.P. Moore (IFSI), Brendan McGinty (NCSA), Kalina Borkiewicz (NCSA), David Bock (NCSA)

Videos from VR prototypes developed by the Data Analysis and Visualization (DAV) Lab at NCSA, in collaboration with other units, will be presented. They cover three different applications: 1) The University of Illinois Football play recognition 3D training project presents a new teaching and learning method for football formations, for improved reaction time and decision-making. 2) The Firefighting VR prototype, developed in collaboration with first responders, simulates a firefighter site evaluation and includes different levels of house fires, as well as realistic details such as condensation on the mask. 3) The Drone Piloting project features a simulation of the pilot's view while operating a drone, for testing heads-up displays, data overlays (e.g., buildings, lidar data), and real-time, dynamic terrain generation (Cesium).

Eric Shaffer, Dan Cermak

The Immersive Learning Lab at ECE Illinois is engaged in the development of curated virtual reality (VR) experiences aimed at facilitating classroom instruction on the complex and abstract concepts pertaining to Electricity and Magnetism. Owing to the three-dimensional nature of these concepts and the lack of intuitive familiarity with the phenomena, engineering students often face challenges in understanding these principles that do not translate well to traditional two-dimensional platforms. The team is developing immersive and intuitive learning experiences, leveraging cutting-edge VR technology to enable students to better grasp the principles associated with electromagnetism and thus enhance their academic performance. Each of these VR laboratory experiences is designed to support insights into physical laws or other related phenomena, as well as provide context for the discoveries. Gamification elements are used to maximize the time students spend inside the VR environment while learning, exploring, and engaging with the experiment. This paper discusses topics pertaining to “Laboratory development and innovation,” such as the process of VR experience design, starting from the conception, software development, testing, in-class usage, and learning assessments for one experience. The VR software development is done by undergraduate students, allowing students to participate in the education of their peers while also learning about E&M concepts themselves. The VR tools and evaluation techniques presented here have the potential to change the landscape of how STEM topics are taught both in the classroom and within students' homes. These novel teaching methods fully leverage the technological advancements in user-driven experiences and results-driven teaching methods to facilitate the learning of complex topics.
In addition, we discuss the modalities through which the team promotes the participation of diverse groups of undergraduate students in research and creates opportunities for them to acquire technical and critical thinking skills needed by individuals living in a technologically advanced and competitive society.

Anthony Nepomuceno (CIMED), Inki Kim (ISE), Celeste Schultz (Univ. of Illinois at Chicago), Caroline G.L. Cao (ISE)

The paucity of research on the design and integration of VR applications that can effectively enhance nursing practice and education has motivated us to pursue a new simulation modality that closes the loop between care team, patients and their dynamic health conditions, and the physical care environment. We present a new simulation modality through the integration of immersive virtual reality (VR) simulation and digital twins (DTs) that allows a group of nurses to experience, in a virtual-world platform, realistic care situations that involve patients, their electronic health records, and physical care environments. Specifically, this new simulation architecture integrates client applications for running the virtual world, cloud services for data processing, and a physical environment (i.e., physical twin [PT]) to implement team interactions within DTs. Complex team-based dynamics for decision-making, behaviors, and communication will be represented in a dynamic Bayesian network (DBN). Multi-modal data collection systems, including camera, depth, IMU, BLE, and FMCW-based systems, are employed to gain real-time insights into the physical environment and on-site users and to provide data to the digital twin. A demonstration will illustrate integration of VR with the digital twin to show both online and offline immersive learning capabilities. Future work will validate this new simulation modality with a simulated nursing triage scenario among three patients with various levels of severity: intracranial hemorrhage, septic ketoacidosis, and a routine intensive care checkup. If successful, the DT built here will become the first clinical “experiential learning” reference source for nurses, allowing a user in a virtual world to directly navigate through and learn from high-fidelity instances of nursing practice in hospital units.

Beitong Tian (CS), Shiv Trivedi (CS), Mingyuan Wu (CS), Klara Nahrstedt (CS)

Researchers working in academic cleanrooms play a pivotal role in advancing cutting-edge scientific research, particularly in areas like chip manufacturing and material research. When engaged in experiments within the cleanroom environment, researchers not only conduct experiments but also perform various essential tasks. These tasks encompass ensuring that the conditions and equipment are suitable for experiments, tracking and recording the experimental process, searching for information, requesting assistance, and ensuring their own safety. Some of these tasks are tedious and can divert researchers' attention from their primary research objectives. To relieve researchers of these burdensome tasks and provide intelligent assistance throughout their research activities, we have developed MAINTGLASS, a glasses-like wearable device. MAINTGLASS was designed from scratch, taking into account multiple aspects such as cost, researcher-specific needs, the cleanroom environment, and privacy considerations. It incorporates a monocular display, a camera, a microphone, speakers, and environmental sensors like temperature and humidity sensors. All components are controlled by a Raspberry Pi and powered by a battery bank. Both hardware and software development for MAINTGLASS are currently underway. We have a working prototype that allows users to query and visualize data from two previously developed IoT infrastructures: an environmental monitoring system (SENSELET) and a machine condition monitoring system (MAINTLET). This capability enables users to understand the surrounding environmental and equipment conditions. The end-to-end and open-source design of MAINTGLASS also positions it as a versatile platform and testbed for research in artificial intelligence, system and networking, security and privacy, and human-computer interaction. We aim to introduce MAINTGLASS to our community and seek additional collaboration opportunities to unlock its full potential.

Patrick Naughton* (CS), James Seungbum Nam* (MechSE), Andrew Stratton (CS, graduated), Kris Hauser (CS)

Teleoperated avatar robots allow people to transport their manipulation skills to environments that may be difficult or dangerous to work in. Current systems are able to give operators direct control of many components of the robot to immerse them in the remote environment, but operators still struggle to complete tasks as competently as they could in person. We present a framework for incorporating open-world shared control into avatar robots to combine the benefits of direct and shared control. This framework preserves the fluency of our avatar interface by minimizing obstructions to the operator's view and using the same interface for direct, shared, and fully autonomous control. In a human subjects study (N=19), we find that operators using this framework complete a range of tasks significantly more quickly and reliably than those that do not.

Anthony Nepomuceno (CIMED), Wanning Cheng (CS), Chin-Hao Lo (CS), Nimit Kapadia (ISE), Celeste Schultz (Univ. of Illinois at Chicago), Avinash Gupta (CS)

The current nursing shortage has been exacerbated by the aftereffects of the SARS-CoV-2 pandemic along with a large population of the nursing workforce nearing retirement age. The AACN’s 2021–2022 report states: US nursing schools turned away 91,938 qualified applicants to baccalaureate and graduate nursing programs due to faculty shortages, insufficient clinical sites and clinical preceptors, limited classroom space, and budget constraints. VR provides a way to help address the insufficient staffing and site resources for certain areas of training. Current simulators have been made to focus on procedural skills, but few have focused on bedside manner and communication skills. VR can provide a better medium for teaching communication skills than the current 2D simulators. The aim of this project is two-fold: 1) Create a VR clinical environment, 2) Create VR clinical nursing scenarios focused on bedside manner and communication skills. To develop the VR nursing simulator, Unity, Character Creator 4, and was leveraged to provide the overall game engine, avatar generation, and AI-driven natural language processing (NLP), respectively. This resulted in the creation of the VR environment as well as the three (3) clinical communication scenarios: 1) Health History, 2) Bedside Conversation, 3) Pediatric Conversation. In spring of 2024, human subject trials will take place to validate the application. Such trials will test for face validity, content validity, and usability with the subject populations of nursing students and faculty.

Courtney McBeth (UIUC), Isaac Ngui (UIUC), Ananya Kommalapati (UIUC), Manel Piera (UPC), Fernando Sakabe (Insper), Shreya Vinjamuri (UIUC), Marco Morales (UIUC and ITAM), Luciano Soares (Insper), Nancy Amato (UIUC)

Motion Planning (MP) is the problem of finding a valid sequence of motions for a robot, or other object, to move from a start to a goal configuration in some environment. Using highly immersive virtual reality environments for motion planning opens up new alternatives for solving problems and provides unique opportunities to incorporate human insight with automated task and motion planning methods. Many interesting motion planning questions can be effectively addressed by leveraging virtual and augmented reality techniques with high-quality graphics and instant user input. While the low-precision real-time path planning and collision detection commonly used in virtual reality may not be suitable for certain professional problems, coupling them with sophisticated, high-precision planning solvers can lead to faster and more effective solutions. This potential can be further explored through augmented reality, allowing specialists to visualize motion plans in real places. Moreover, there is also the possibility of integrating these two domains with machine learning techniques like reinforcement learning, which is more feasible today due to the availability of high-performance GPUs. This integration enhances the adaptability and efficiency of motion planning processes, providing an efficient approach to addressing complex challenges. Sampling-based algorithms address the motion planning problem by constructing a graph that includes representative paths for the robot within the environment. The nodes of this graph are valid robot poses, while its edges are local paths between nodes. Although sampling-based motion planning algorithms can find paths for robots with many degrees of freedom quite efficiently, their performance can improve greatly when they have some guidance in the exploration. This guidance can be obtained algorithmically, but it can also be obtained from user input.
Virtual reality (VR), augmented reality (AR), and haptic interfaces can provide an immersive experience for users to provide guidance, speeding up different aspects of motion planning. The main topics being investigated include:
- Co-design of environment and tasks: Users provide environment parameters such as object location and sketch paths. Similarly, they also provide task specifications and constraints relevant to achieving the tasks.
- Visualization of motion planning data structures embedded in the environment, either in virtual or augmented reality modes.
- Providing feedback to the planner, such as providing approximate solutions, setting up sampling biases to different regions, and nudging paths.
- Bringing learning into the planning process and gathering data to be used in learning.
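
To make the idea of user-guided sampling concrete, here is a minimal sketch of a probabilistic-roadmap-style planner for a 2D point robot in which a user-marked region biases where samples are drawn. It is an illustration only, not the group's planner: the obstacle, the `corridor` region, the `bias_prob` parameter, and all function names are hypothetical.

```python
import math
import random
from collections import deque

OBSTACLE = ((0.5, 0.5), 0.2)  # hypothetical circular obstacle: (center, radius)

def is_free(p):
    (cx, cy), r = OBSTACLE
    return math.hypot(p[0] - cx, p[1] - cy) > r

def edge_free(a, b, steps=20):
    """Check the straight local path a -> b at evenly spaced points."""
    return all(
        is_free((a[0] + (b[0] - a[0]) * t / steps, a[1] + (b[1] - a[1]) * t / steps))
        for t in range(steps + 1)
    )

def sample(rng, bias_region, bias_prob):
    """Draw one valid configuration, favoring the user-marked region."""
    while True:
        if bias_region is not None and rng.random() < bias_prob:
            (x0, y0), (x1, y1) = bias_region
            p = (rng.uniform(x0, x1), rng.uniform(y0, y1))
        else:
            p = (rng.random(), rng.random())
        if is_free(p):
            return p

def build_roadmap(start, goal, n, radius, rng, bias_region=None, bias_prob=0.5):
    """Nodes are valid poses; edges are collision-free local paths."""
    nodes = [start, goal] + [sample(rng, bias_region, bias_prob) for _ in range(n)]
    edges = {i: [] for i in range(len(nodes))}
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            if math.dist(nodes[i], nodes[j]) <= radius and edge_free(nodes[i], nodes[j]):
                edges[i].append(j)
                edges[j].append(i)
    return nodes, edges

def find_path(nodes, edges):
    """Breadth-first search from node 0 (start) to node 1 (goal)."""
    parent = {0: None}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        if u == 1:
            path = []
            while u is not None:
                path.append(nodes[u])
                u = parent[u]
            return path[::-1]
        for v in edges[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return None

rng = random.Random(0)
start, goal = (0.1, 0.1), (0.9, 0.9)
corridor = ((0.0, 0.6), (0.4, 1.0))  # hypothetical region marked by the user
nodes, edges = build_roadmap(start, goal, n=150, radius=0.25, rng=rng, bias_region=corridor)
path = find_path(nodes, edges)
print("path found" if path else "no path", len(nodes), "nodes")
```

Setting `bias_region=None` recovers uniform sampling, which makes it easy to compare roadmaps built with and without the user's hint.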

Matthew Bramlet, Brad Sutton, Megan Griebel

XR immersive technology promises the opportunity to expedite the core educational principle of transference of knowledge from educator to learner, but barriers of content creation and distribution can limit institutional adoption. Conversion of slideware formats to VR formats consistently demonstrates that learning occurs four times faster in VR than in the classroom and 1.5 times faster than e-learn technologies. This project aims to demonstrate feasibility of iterative transition to XR educational methodologies utilizing an internally developed VR authoring tool, Enduvo, in a flipped classroom format. Enduvo was utilized in a flipped classroom format for a fall 2022 302 Modeling Human Physiology Course at UIUC. For this project, new VR content was created to cover both the nervous system and musculoskeletal system. Each VR assignment was designed to replace a single in-person lecture in content and time. Given the inclusion of interactive assessment tied to each concept, the expected time spent in VR by each learner was a combination of 1. Proctored VR Experience and 2. Learner Interaction including assessment. As such, 25 minutes of created VR content was the target of content creation, allowing an equal amount of 25 minutes of learner exploration and interaction outside of the recorded content. Utilizing existing lecture content, it was found that a single proctored digital VR classroom (25 minutes of recorded content) could be created from material from 1.5 to 2x traditional lectures. To test engagement, a survey was administered. It was broken up into three question types: social, emotional, and cognitive. Of the 23 questions there were 7 cognitive, 13 emotional, and 3 social questions. Perceived learning was also analyzed in survey form. Performance in Enduvo directly correlated to final exam performance. The purpose of each Enduvo lecture was to replace a specific lecture of conceptual learning.
On average, students spent 53.84 minutes in Enduvo, a good match to lecture time. Another important aspect of assessing students’ interaction with the material is evaluating their engagement. Enduvo had its highest score for cognitive engagement, at 3.75, on a scale where 3 was the neutral rating and 5 meant “highly agree” that Enduvo engaged the learner's cognitive abilities. Emotional engagement had a rating of 3.38 and social engagement a rating of 2.52, which makes sense as the Enduvo activities were performed outside of class on an individual basis. Based on the highest average scores for perceived learning, it is clear that Enduvo has positively impacted many students. There was strong cognitive engagement as well as strong correlation between performance in Enduvo’s flipped classroom digital experience and final exam performance. Based on the survey responses, students believe that their overall understanding of the concepts is enhanced, and that they are applying their notes and the knowledge gained to their overall understanding of the course and connecting it to the mathematical concepts involved with these systems.

Ann Sychterz (CEE), Eric Shaffer (CS), Jacob Henschen (CEE), Marci Uihlein (Arch), Nishant Garg (CEE)

The use of Virtual Reality (VR) technology has expanded beyond video games and into various fields due to its advancements in the development of real-life models. Research in engineering education has revealed that the implementation of VR has a positive impact on the learning process, motivating students to understand new concepts and enhancing their spatial skills. The increased availability of Building Information Models (BIM) and affordable immersive display systems has led to the increased usage of VR in civil engineering and architecture education. This work presents the lessons learned from a case study that investigated the feasibility of using VR as a teaching tool for three undergraduate courses: a senior structural design course, a junior civil engineering course in concrete design, and an undergraduate course in the School of Architecture. Gaming engines such as Unity and BIM software such as REVIT were used to develop the VR apps. The study involved conducting affective surveys and cognitive assessments on classes of students in the mentioned courses to evaluate their comfort level using the VR app and their overall experience. The results of the study demonstrate that students were comfortable during their experience, indicating that VR has potential as a teaching tool in these courses. Additionally, a qualitative cognitive study found that VR can be used as a complementary tool to the traditional teaching method.

Steve Garrou (CEO), Tim Quinn (Client Relations)

Enduvo is a content creation platform for immersive communication and learning. We enable organizations to leverage the power of immersive technology to easily transform their training programs. Our no-code platform makes it easy for anyone to create high-quality simulated content, without any technical experience required. With Enduvo, users can develop interactive learning experiences that help people learn faster, retain more information, and collaborate more effectively. Our platform is widely deployed across educational, medical, industrial and enterprise use cases.

Our company spun out of the University of Illinois in 2018 and has been successfully providing an immersive content creation and distribution platform since that time. We have been awarded multiple US government and private industry contracts and hold 15 patents to date. We look forward to working with the University of Illinois and the Immerse program going forward.

William Sherman (National Institute of Standards and Technology), Simon Su (National Institute of Standards and Technology), Judith Terrill (National Institute of Standards and Technology), Scott Wittenburg (Kitware Inc.), Cory Quammen (Kitware Inc.)

The visualization of scientific data has long been one of the applications of virtual reality, with the expectation that immersive interfaces could enhance the ability to explore data and find correlations, and enable researchers to better annotate and explain data. Over the course of the past decade, the ParaView desktop visualization tool has been adapted to accommodate both headset (HMD) and projection (CAVE) style VR systems. This talk will look at the promise of immersive visualization, and how ParaView can help researchers jump from the desktop into their data.

Kalina Borkiewicz, Jeff Carpenter, AJ Christensen, Donna Cox, Stuart Levy, Robert Patterson, Brad Thompson, Matthew Turk

Stereoscopic visualization, interactive and pre-rendered, on a large display with a shared view, of several physical-science data sets.

NCSA's Advanced Visualization Lab's work focuses on making cinematic high-quality rendered animations, based on spatially-organized scientific data such as astronomical simulations or earth-science measurements. In creating those, we often create interactively-viewable, lower resolution previsualizations, and use a Space Pilot to navigate through them. We will show a mix of pre-rendered movies and interactive scenes, and let participants fly through the latter.

Visitors can sit cozily on couches, wear passive polarizing glasses, and view our ~10' passive stereo display. This is located in NCSA room 1005, adjacent to the main floor atrium.

Harris Nisar, Avinash Gupta, M. Jawad Javed, Nicole Rau

Exposure to procedures varies in the Neonatal Intensive Care Unit (NICU). A method to teach procedures should be available without patient availability, expert oversight, or simulation laboratories. To fill this need, we developed a virtual reality (VR) simulation for umbilical vein catheter (UVC) placement and sought to establish its face and content validity and usability. Engineers, software developers, graphic designers, and neonatologists developed a VR UVC placement simulator following a participatory design approach. The software was deployed on the Meta Quest 2 head-mounted display (HMD). Neonatal Nurse Practitioners (NNPs) from a level 4 NICU interacted with the simulator and completed an 11-item questionnaire to establish face and content validity. Participants also completed the validated Simulation Task Load Index and System Usability Scale to understand the usability of the simulator. Group 1 tested the VR simulation, which was optimized based on feedback, prior to Group 2’s participation. 14 NNPs with 2-37 years of experience participated in testing. The participants scored the content and face validity of the simulator highly, with most participants giving a score ≥4/5 regarding questions about those aspects. The simulator was deemed usable based on the relatively high average System Usability Scores for both groups (Group 1: 67.14 ± 7.8, Group 2: 71 ± 14.1) and low SIM-TLX scores indicating the load in various categories was manageable while using the simulator. After optimization, Group 2 found the UVC simulator to be realistic and effective as an educational tool. Both groups felt that the simulator was easy to use and did not cause physical or cognitive strain. All fourteen neonatal nurse practitioners felt the UVC simulator provided a safe environment to make mistakes, and the majority of NNPs would recommend this experience to new trainees.
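
For readers unfamiliar with how System Usability Scale values such as those above are obtained, the standard SUS scoring rule can be sketched as follows. The abstract does not describe the authors' scoring code, so this is only the conventional calculation: ten items rated 1 to 5, odd-numbered items contribute (response - 1), even-numbered items contribute (5 - response), and the sum is multiplied by 2.5 to give a score from 0 to 100. The function name and the example responses are hypothetical.

```python
def sus_score(responses):
    """Return the System Usability Scale score (0-100) for one participant.

    `responses` holds the ten item ratings in questionnaire order,
    each an integer from 1 (strongly disagree) to 5 (strongly agree).
    """
    if len(responses) != 10 or any(r not in range(1, 6) for r in responses):
        raise ValueError("SUS needs ten ratings, each between 1 and 5")
    total = 0
    for item, rating in enumerate(responses, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (rating - 1) if item % 2 == 1 else (5 - rating)
    return total * 2.5

# Hypothetical participant: agrees with positive items, disagrees with negative ones.
example = [4, 2, 4, 2, 4, 2, 4, 2, 4, 2]
print(sus_score(example))
```

A group mean such as those reported above would then be the average of these per-participant scores.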

Heidi Phillips (CVM), Janet Sinn-Hanlon (iLearning@Vetmed)

Although the traditional “See one, Do one, Teach one” model of surgical training has required learners to imitate actions of skilled mentors in the clinical environment for over 100 years, seeing a procedure once is not sufficient to prepare today’s veterinary surgeon. While the volume and complexity of patient load has increased, a growing number of professionals have left veterinary medicine as an epidemic of suicide brought on by the stressors of the profession has overtaken the field. Therefore, the number, capacity, and availability of expert mentors to supervise veterinary students in an apprenticeship model has been drastically diminished, causing worsening anxiety in those requiring mentorship, especially concerning high-stakes surgical procedures. When the Covid-19 pandemic forced academicians to develop remote teaching methodologies, we asked ourselves, “Could we use remote learning tools to create a novel and innovative approach to teaching surgery that would reduce anxiety and stress brought on by learning to perform surgeries for the first time?” Our team of veterinary educators, multimedia specialists, and programmers from iLearning@VetMed, ITG and HCSEC are creating an online course that teaches the surgical techniques needed to perform a canine ovariohysterectomy ("spay").  The multimedia course will include elements of virtual reality that will provide students endless opportunities to observe and practice surgical steps in the safe space of VR before attempting to perform the traditional surgery in a clinical setting.

Matthew Bramlet, Jayishnu Srinivas, Tehan Dassanayaka, Mark Shaddad, Connor Davey, Bradley P. Sutton

Aortic dissection carries a mortality as high as 50%, but surgical palliation is also fraught with morbidity risks of stroke or paralysis. As such, a significant focus of medical decision making is on longitudinal aortic diameters. This lab hypothesizes that 3D modeling affords a more efficient methodology toward automated measurement for longitudinal aortic arch surveillance. The first step is to demonstrate accuracy of automated measurement of manually segmented 3D models of the aorta. This research team developed an algorithm to analyze a 3D segmented aorta and output the maximum dimension of minimum cross-sectional areas in a stepwise progression from diaphragm to aortic root. From January 2021 to June 2022, sixty-six 3D non-contrast steady-state free precession (SSFP) magnetic resonance images of aortic pathology with clinical aortic measurements were identified; 3D aorta models were manually segmented. A novel mathematical algorithm, with a success rate of 76%, was applied to each model to generate maximal aortic diameters from diaphragm to root, which were then correlated to clinical measurements. Results: The resulting 50 3D aortic models were analyzed utilizing the automated measurement tool. There was an excellent correlation between the automated measurement and the clinical measurement. The intra-class correlation coefficient (ICC) and p-value for each of the 9 measured locations of the aorta were: sinus of Valsalva, 0.99, <0.001; sino-tubular junction, 0.89, <0.001; ascending aorta, 0.97, <0.001; brachiocephalic artery, 0.96, <0.001; transverse segment 1, 0.89, <0.001; transverse segment 2, 0.93, <0.001; isthmus region, 0.92, <0.001; descending aorta, 0.96, <0.001; aorta at diaphragm, 0.3, <0.001. Automating diagnostic measurements in a way that earns clinical confidence is a critical first step in a fully automated process. Demonstrating excellent correlation between the measurements derived from manually segmented 3D models and the clinical measurements gives confidence toward transitioning the mindset from automating measurement off of 2D sliced images toward measurements derived from 3D models. Combined with automated segmentation, this opens the door to cohort-specific z-scores and improved longitudinal analysis with non-contrast, non-irradiating imaging formats.
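
As a rough illustration of how agreement between automated and clinical diameters can be quantified, the sketch below computes one common form of the intra-class correlation coefficient, ICC(2,1) (two-way random effects, absolute agreement, single measurement). The abstract does not state which ICC form was used, so this choice is an assumption, and the paired diameters in the example are invented for illustration.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per subject and one column per method."""
    n = len(ratings)      # number of subjects (e.g. aorta models)
    k = len(ratings[0])   # number of methods (e.g. automated, clinical)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: subjects, methods, and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical paired diameters in mm: [automated, clinical] per model.
paired_diameters = [[31.0, 30.5], [36.2, 36.0], [42.1, 43.0], [28.4, 28.0], [39.0, 38.2]]
print(round(icc_2_1(paired_diameters), 3))
```

Values near 1 indicate that the two methods give nearly interchangeable measurements; a systematic offset between methods lowers this form of the ICC because it measures absolute agreement.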

Matthew Bramlet, Jenny Amos, Eliot Bethke, James Evans, Brad Sutton

Pre-surgical planning for pediatric cardiology is a complex and multi-disciplinary exercise, which makes it challenging to study. For clinical reasoning during the pre-surgical process, components such as domain-specific knowledge, experience, and intuition are essential [1,2]. It is also widely understood that interpreting traditional 2D imaging presents challenges for visualizing and planning the surgical approach, especially for newer surgeons [3]. While past studies have considered newer modalities for pre-surgical imaging review, including Virtual Reality (VR), they tend to focus on prospective medical records, study the VR tool in isolation from clinical context, or rely solely on subjective, self-reported measures from users [3,4]. In our work, we have designed a study that aims to provide a richer picture of why, when, and how VR confers benefits by implementing a think-aloud exploration of real pre-surgical planning sessions with pre and post questionnaires. Our pre and post questionnaires follow from the well-known NASA Task Load Index (NASA-TLX) and give us a baseline of the surgeon’s expectations and self-reported mental demands of reviewing the case. We have found early emergent themes, including surgeons seeking confirmation of assumptions made in surgical conference, as well as exploring and refining mental models before surgery. On average, users reported that cases were 2.0 points less demanding on a 10-point scale to review and explore in VR than in surgical conference (mean 4.1 in VR vs. 6.1 in surgical conference, N=7). Our team is continuing to explore how VR impacts clinical planning and reasoning to further identify and understand specific benefits VR provides.

1. Meterissian SH. A novel method of assessing clinical reasoning in surgical residents. Surgical Innovation. 2006 Jun;13(2):115-9.
2. Norman G. Research in clinical reasoning: past history and current trends. Medical Education. 2005 Apr;39(4):418-27.
3. Lan L, Mao RQ, Qiu RY, Kay J, de Sa D. Immersive Virtual Reality for Patient-Specific Preoperative Planning: A Systematic Review. Surgical Innovation. 2023;30(1):109–122.
4. Napa S, Moore M, Bardyn T. Advancing Cardiac Surgery Case Planning and Case Review Conferences Using Virtual Reality in Medical Libraries: Evaluation of the Usability of Two Virtual Reality Apps. JMIR Human Factors. 2019;6(1):e12008. doi:10.2196/12008

Kaiyuan Wang, Yuxiang Zhao, Ishfaq Aziz, Mohamad Alipour (PI)

The behavior of structures governed by engineering mechanics is challenging to grasp in a theory-only classroom environment due to the gap between abstract theoretical descriptions versus hands-on experience and perception of deformations. To bridge this gap, we develop a VR application that allows users to interact with digital twins of physical objects in real time with their hands. Users can deform virtual entities using realistic hand gestures, and the sense of stiffness is evoked by using the speed of deformation as a proxy for the members' resistance to deformation. Results demonstrate a promising pathway for immersive experiential learning of engineering concepts.
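
One way to picture the speed-as-stiffness idea is sketched below: the displayed deflection follows the user's hand, but its maximum speed is inversely proportional to the member's stiffness. This is an illustrative sketch under stated assumptions, not the authors' implementation; the function names, the `gain` constant, and the example beam are hypothetical. The cantilever tip stiffness formula k = 3EI/L^3 is the standard result for an end-loaded cantilever.

```python
def cantilever_tip_stiffness(E, I, L):
    """Tip stiffness (N/m) of an end-loaded cantilever: k = 3 * E * I / L^3."""
    return 3.0 * E * I / L ** 3


def step_deflection(current, target, stiffness, dt, gain=1000.0):
    """Move the displayed deflection toward the hand-driven target for one frame.

    The speed limit is gain / stiffness, so stiffer members respond more slowly.
    `gain` is a hypothetical tuning constant (N/s), not a value from the abstract.
    """
    max_step = gain / stiffness * dt
    delta = target - current
    if abs(delta) <= max_step:
        return target
    return current + max_step if delta > 0 else current - max_step


# Hypothetical steel beam: E = 200 GPa, I = 1e-6 m^4, L = 2 m.
k = cantilever_tip_stiffness(200e9, 1e-6, 2.0)
deflection = 0.0
for _ in range(5):  # five frames at 10 frames per second
    deflection = step_deflection(deflection, target=0.01, stiffness=k, dt=0.1)
print(k, deflection)
```

With this kind of mapping, a slender member reaches the hand position almost immediately while a stiff one visibly lags, which is the perceptual cue the abstract describes.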

Daniel Cermak and Robbie Sieczkowski

The stu/dio is a student-driven, work-for-hire management group with the goal of bringing clarity and process to the development of applications on campus. We do not do development in the stu/dio; we hire project teams based on in-depth design, staffing, and budgeting plans and then manage the team to project completion. We work with all current technologies, development platforms, and tools used by the industry.

Our students are very skilled and our sponsors have excellent ideas, but there has been a lack of focus on the processes needed to take an idea to a full design and then that design to a final product.

The main components of the stu/dio are:

  • Our student leads: Our four leads (art, design, programming and project management) do all the hiring, reviews, project support, and sponsor communication.
  • Strong sponsor support: in-depth designs, staffing plans, budgets, and communication plans drive the sponsor relationship.
  • The protocols and review process: We developed protocols to avoid issues related to semester-to-semester development, and we perform ongoing reviews of design, code, and art to ensure the protocols are being followed.
  • The Student database: over 120 students have signed up and are available to work on projects. This database is available to projects campus wide.

The goal is to provide students with an awareness of the processes and productivity requirements of today’s industries.  We also want to bring design, fiscal, and budgetary awareness to the sponsors so they can make effective decisions about the projects they want to build.  

Tianhang Cheng, Wei-Chiu Ma, Kaiyu Guan, Antonio Torralba, and Shenlong Wang

Our world is full of identical objects (e.g., cans of Coke, cars of the same model). These duplicates, when seen together, provide additional and strong cues for us to effectively reason about 3D. Inspired by this observation, we introduce Structure from Duplicates (SfD), a novel inverse graphics framework that reconstructs geometry, material, and illumination from a single image containing multiple identical objects. SfD begins by identifying multiple instances of an object within an image, and then jointly estimates the 6DoF pose for all instances. An inverse graphics pipeline is subsequently employed to jointly reason about the shape and material of the object and the environment light, while adhering to the shared geometry and material constraint across instances. Our primary contributions involve utilizing object duplicates as a robust prior for single-image inverse graphics and proposing an in-plane rotation-robust Structure from Motion (SfM) formulation for joint 6DoF object pose estimation. By leveraging multi-view cues from a single image, SfD generates more realistic and detailed 3D reconstructions, significantly outperforming existing single-image reconstruction models and multi-view reconstruction approaches with a similar or greater number of observations.

Dakarai Crowder, Wenzhen Yuan (PI)

Investigating human-robot interactions to gather data can be arduous, particularly without an established robot system in place. Challenges such as occlusion arise when relying on cameras for interaction data collection. Virtual reality (VR), however, is a promising research tool for human-robot interaction. We are developing a VR environment to study human-robot interaction dynamics. This platform enables users to engage with a robot within a virtual space, whether the robot is static or teleoperated. Our system will be able to collect data about user interactions, including touch duration, timing, and location on the robot, without the need for handheld controllers for gesture control. The environment will also allow the user to answer Likert questions for a more seamless experimentation experience. One application of this system involves fine-tuning the specifications required for tactile sensors in robots used in human-robot interaction. Leveraging VR simulations allows us to observe and analyze various types of gestures and their respective touchpoints. Through this detailed analysis, we gain valuable insights into interaction nuances, informing the optimal granularity and strategic placement of sensors on robots. Ultimately, this research endeavor in virtual reality helps foster deliberate implementations, improving the efficacy of human-robot interaction technologies.

Bo Chen, Klara Nahrstedt, Yinjie Zhang, Zhe Yang, Zhisheng Yan

Video codecs are essential for video streaming. While traditional codecs like AVC and HEVC are successful, learned codecs built on deep neural networks (DNNs) are gaining popularity due to their superior coding efficiency and quality of experience (QoE) in video streaming. However, using learned codecs built with sophisticated DNNs in video streaming leads to slow decoding and low frame rate, thereby degrading the QoE. The fundamental problem is the tight frame referencing design adopted by most codecs, which delays the processing of the current frame until its immediate predecessor frame is reconstructed. To overcome this limitation, we propose LiFteR, a novel video streaming system that operates a learned video codec with loose frame referencing (LFR). LFR is a unique frame referencing paradigm that redefines the reference relation between frames and allows parallelism in the learned video codec to boost the frame rate. LiFteR has three key designs: i) the LFR video dispatcher that routes video data to the video encoder and decoder based on LFR, ii) LFR learned codec that enhances bandwidth efficiency by exploiting spatial-temporal correlation in LFR, and iii) streaming adaptations that support adaptive bitrate streaming with learned codecs. In our evaluation, LiFteR consistently outperforms existing video streaming systems. Compared to the existing best-performing learned and traditional systems, LiFteR demonstrates up to 23.8% and 19.7% QoE gain, respectively. Furthermore, LiFteR achieves up to a 3.2X frame rate improvement through its adaptive frame rate approach.
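
To see why the referencing pattern limits frame rate, the toy sketch below counts how many sequential decoding stages a group of frames needs when each frame must wait for its reference. The abstract does not describe the actual LFR reference structure, so the fixed-stride pattern used here (frame i references frame i - `ref_distance`) and the function name are assumptions for illustration only.

```python
def decode_stages(num_frames, ref_distance):
    """Sequential decoding stages when frame i references frame i - ref_distance.

    Frames whose reference index would be negative are treated as independently
    decodable, which is a simplifying assumption of this sketch.
    """
    if ref_distance < 1:
        raise ValueError("ref_distance must be at least 1")
    stage = [0] * num_frames
    for i in range(num_frames):
        ref = i - ref_distance
        # A frame can be decoded one stage after its reference is available.
        stage[i] = 1 if ref < 0 else stage[ref] + 1
    return max(stage) if stage else 0


# Tight referencing (distance 1) serializes every frame in the group.
print(decode_stages(8, 1))
# A looser, hypothetical distance of 4 lets four frames be decoded per stage.
print(decode_stages(8, 4))
```

Fewer sequential stages mean more frames can be processed in parallel, which is the general intuition behind relaxing the frame-to-frame dependency.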

Human Experiences Thrust

Sebastian Kelle (CS), James Planey (SCD), Dejan Trencevski (Gies)

Building upon the results of the ImPRESS project (Immersive Production of Representations Exploring Science Sketching), we present a pragmatic educational framework in the form of an online curriculum for professional development that was designed by a grassroots collaboration among the departments of Computer Science (College of Engineering), Molecule Maker Lab Institute (Siebel Center for Design), and IT Partners (Gies College of Business). The Applied Immersive Technologies for Teaching and Learning online curriculum will give educators an overview of the XR technology ecosystem: hardware for XR, 3D computer graphics software, game engine platforms, software development kits (SDKs), and Komodo, an open-source platform developed at UIUC that enables XR-development, deployment, and data collection infrastructure in the context of higher education learning environments. This curriculum will be offered to teaching faculty who are interested in leveraging immersive technologies in their own teaching, but can also be applied across a broader context. As the interactive element of our presentation, we will offer a hands-on experience with the ImPRESS prototype to showcase an example of immersive VR-based learning techniques.

PI: John Toenjes, Professor, Department of Dance; Programmers: Luke Puchner-Hardman, Arav Chheda, Jerry Xu, M. Landon, Robbie Sieczkowski
Costume design/animation: Rochele Gloor
Dancers: Jody Sperling, David Marchant
3D artists: Hajra Lat, Han Ni
Project management: Uliana Ovsiannikova

A virtual reality adventure/activity, Master Dancer is being developed as a template for learning about historical personages through contextualizing them, through interaction with their virtual avatar, and through exploring and learning aspects of their work through game play. Master Dancer is the first of these historical adventure/activities. It features the dancer and theatrical innovator Loïe Fuller, whom we consider a pioneer of technology-based dance theater. After the introduction, the user explores the lobby of a “Folies-Bergère” theater in Paris, where Fuller made her name. In the lobby, various historical items are seen: a gallery of posters, videos of dances by Fuller and her imitators, and a gallery of collaborators and artists who were in her sphere. These will function in the virtual world as quest markers, which lead to a virtual encounter by the user with a virtual Loïe Fuller. Virtual Loïe explains that awakening the creative spirit is what maintains this magical theater, and she is counting on the user to contribute to it by learning how to unleash their movement creativity through playing two movement “training games.” The games teach “movement qualities,” which are the basic tools underlying creative dance. One game, where the player hits stars that are coming at them to create constellations in the sky, trains “direct and forceful” movement. The other game, a music- and light-based game, trains “light and indirect” movement. Once the user finishes the first game, they are transported back to the lobby, where additional quest markers will be encountered and there is additional interaction with Fuller. After finishing the second game, the user will meet Fuller again, who beckons them into the theater for a virtual dance performance.

Chris Palaguachi (ED), Rodrigo Hidalgo (ED), Ivan Zhang (CS), Robin Jephthah Rajarathinam (ED), Jina Kang (ED)

Augmented reality (AR) emerges as a transformative technology that integrates the physical and digital realms, promising significant advantages in educational contexts. Within the realm of education, AR holds the potential to revolutionize learning experiences by offering enhanced motivation, fostering collaboration, and improving spatial abilities through immersive encounters. However, the widespread adoption of AR in education faces challenges, notably in the form of cognitive load and usability issues. This research delves into the usability of AR in the specific domain of science education, concentrating on a novel prototype application named HoloOrbits. To assess its effectiveness, a comprehensive study was conducted involving experts in instructional design and subject matter. Their valuable insights unveiled positive experiences concerning immersion and subject matter representation. However, the study also brought to light certain challenges related to menu navigation and interaction methods. In response to these challenges, user-centric approaches have been proposed, advocating for the incorporation of voice commands and intuitive controls. These recommendations aim to enhance the overall user experience and usability of AR applications in educational settings. The significance of prioritizing user experience in the design of AR applications is underscored by these findings, emphasizing the potential of AR as a powerful tool for science learning. This study provides valuable insights into the nuanced relationship between technology and education, laying the groundwork for future advancements in the field and fostering a greater understanding of the potential role of augmented reality as a powerful tool for enhancing science learning experiences.

Chenyang Zhang, Tiansu Chen, Arnav Shah, Sam Hurh, Elahe Soltanaghai

Gaze interaction offers a promising avenue in Virtual Reality (VR) due to its intuitive and efficient user experience. However, the depth control inherent in our visual system is not fully utilized in current methods. In this demo, we introduce FocusFlow, an innovative hands-free interaction method that capitalizes on human visual depth perception within the 3D scenes of Virtual Reality. We begin by developing a binocular visual depth detection algorithm to understand eye input characteristics. We then propose a layer-based user interface and introduce the concept of a "Virtual Window." This window provides an intuitive and robust gaze-depth VR interaction, overcoming the challenges of visual depth accuracy and spatial precision at greater distances. To help users master depth control, we design a learning procedure using different visual cue stages. Our user studies with 24 participants demonstrate the usability of our proposed virtual window concept as a gaze-depth interaction method. Additionally, our findings reveal that the user experience can be improved through an effective learning process, even with weak visual cues. This helps users develop muscle memory for this new input mechanism.
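
The geometric idea behind estimating gaze depth from both eyes can be sketched with the textbook vergence relation: for a fixation point on the midline, depth is (IPD / 2) / tan(vergence / 2). This is not the FocusFlow algorithm, whose details are not given in the abstract; the function name and the example interpupillary distance are assumptions for illustration.

```python
import math


def vergence_depth(ipd_m, vergence_deg):
    """Fixation distance in meters for symmetric convergence on the midline."""
    if vergence_deg <= 0:
        return math.inf  # parallel gaze rays: fixation at optical infinity
    half_angle = math.radians(vergence_deg) / 2.0
    return (ipd_m / 2.0) / math.tan(half_angle)


# Hypothetical interpupillary distance of 64 mm.
IPD = 0.064
for angle_deg in (7.0, 3.5, 1.0, 0.5):
    print(angle_deg, round(vergence_depth(IPD, angle_deg), 2))
```

Because depth grows rapidly as the vergence angle shrinks, a small angular error translates into a much larger depth error for far targets than for near ones, which is consistent with the accuracy challenge at greater distances noted in the abstract.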

Dr. Robb Lindgren (PI) University of Illinois, Dr. Jina Kang (co-PI) University of Illinois, Dr. Emma Mercier (co-PI) University of Illinois, Mr. Nathan Kimball (Concord Consortium)

In this talk we describe the CEASAR (Connections of Earth and Sky with Augmented Reality) project, which aims to support collaborative learning in the area of astronomy using a mix of devices including AR headsets. For this project we created a persistent night sky simulation that can be accessed from immersive devices such as headsets or more traditional interfaces such as tablet computers. By structuring the learning task in a way that required students to work across the devices to share knowledge and perspectives, students were able to leverage the affordances of the various tools and collaborate in novel and productive ways. We will discuss the implications of designing immersive spaces that make strategic and sparing use of expensive technologies such as headsets.

Anthony Nepomuceno (CIMED), Avinash Gupta (ISE), Mae Vogel (CIMED), Athena Ryals (CIMED), Shandra Jamison (CIMED), Caroline G.L. Cao (ISE)

While the benefits of high-fidelity medical simulation are numerous, access for medical students is often limited by staffing, budgetary, and time constraints. Extended Reality (XR) immersive experiences provided in virtual reality and augmented reality mediums can close this gap in access to medical simulation, while complementing the undergraduate medical education (UME) curriculum, which consists of traditional didactic and high-fidelity, physical simulation training components. However, as simulation training is not required by the LCME (Liaison Committee on Medical Education), guidance and standards for simulation scenario development, equipment procurement, and integration of simulation into the UME curriculum are currently lacking. Therefore, an integrative, systematic, and rigorous approach is required to deliver a multi-modal simulation program leveraging the complementary strengths of both XR and physical simulation. The Jump Simulation Center Urbana XR development team has adopted a human-centered design approach, with an emphasis on iterative and participatory design, to address the training needs of medical students at Carle Illinois College of Medicine, while satisfying the learning objectives required by the LCME. This presentation will showcase results of this effort, illustrating our process: 1) translating LCME learning objectives into simulation design requirements, 2) storyboarding by a curricular designer, 3) developing VR assets and environments on a common platform, 4) evaluating the XR application with end-users, 5) iteratively refining and re-evaluating, and 6) deploying and implementing. Each of the XR simulation modules will be validated for face, content, convergent, and divergent validity, and compared to its high-fidelity counterpart. A demonstration will be offered to illustrate the outcomes of our approach in developing immersive XR simulation for undergraduate medical education at CIMED.

Nanzeeba Tabassum (CEE), Eric Gene Shaffer (CS) and Nishant Garg (CEE)

The advancement of immersive extended reality (XR) devices holds promising implications for classroom learning, particularly within STEM disciplines. A well-designed educational game can enhance the immersive experience, thereby making learning fun and effective. Thus, a new VR game, ‘Crystal Vision’, was developed as a tool for teaching basic crystallographic concepts to undergraduate students. Understanding the basic concepts of crystallography requires exploring spatially complex ideas and 3D visualizations. A systematic testing process was applied to a set of 84 undergraduate students to analyze the effect of introducing the VR-based educational game. The results showed a non-trivial rise in test scores and a decrease in test completion times. Moreover, findings from the conducted surveys emphasized the importance of considering user comfort within a VR game for a better outcome. A modified version of the game, ‘Crystal Vision 2.0’, is in progress; it aims to change major game mechanics, enhance game aesthetics, and adjust VR-game interfacing to achieve a successful learning outcome.

Dr. Matthew Lira (PI) University of Iowa, Dr. Robb Lindgren (Co-PI) University of Illinois, Dr. Ece-Demir Lira (Co-PI) University of Iowa, Dr. Thenkurussi (Kesh) Kesavadas (Co-PI) University at Albany

This talk will describe a partnership between Illinois, University of Iowa, and University at Albany to create a novel biochemistry simulation environment that tries to elicit undergraduate students' representational hand gestures using a custom-built haptic glove. The simulation and the associated glove are designed to make salient the multiple forces determining the cell membrane potential (the distribution of ions inside and outside a cell). The research we are conducting is intended to understand how haptic feedback and physical engagement (i.e., gesturing) contribute to learning within immersive spaces. In addition to traditional learning measures (written and verbal assessments), we are also using fNIRS neuroimaging to detect changes in brain function as a result of interacting with the immersive simulation.

George Mois, Emre Eraslan, Sainath Ganesh, Qiyuan Cheng, Jacob Stolker, Ben Guan, Willencia Louis-Charles, Afnaan Afsar Ali, Anika Katherine Urbonas, Dhruti Rajesh, Avinash Gupta, and Wendy A. Rogers

Maintaining cognitive health holds a critical role in supporting quality of life and wellbeing throughout the aging process. Changes in life circumstances, health declines, and mobility limitations can negatively impact the number of opportunities older adults have to participate in cognitive, social, and activity engagement, all of which are closely intertwined with one’s cognitive health. The evolution and accessibility of virtual reality (VR) systems present unique opportunities to foster cognitive, social, and activity engagement among older adults from the comfort of their own homes. However, challenges related to usability and acceptance of VR applications can limit applicability to support this segment of the population. In this project, the aim is to apply human factors methodologies to develop a suite of VR-based cognitive, social, and activity engagement applications. The initial step in our development was to create a VR-based multiplayer environment capable of supporting social interactions through interactions with virtual objects along with verbal communication capabilities. Once this foundation was established, we proceeded to build out the environment for our first application, which included multiple types of interaction opportunities through games such as chess, checkers, and Jenga. To help reduce user burden, we built a starting area that allows players to select among the three activities and create a private room with an entry code to support privacy. Subsequently, we turned our focus to improving the user experience by optimizing the controls, adding option menus, and inserting tutorial videos. In the future, we plan to conduct a study that will involve VR-curious older adult participants. The goal of the study will be to evaluate their experiences using our application along with other commercially available applications through a mix of general and VR-focused questionnaires. Furthermore, we are applying the lessons learned through our initial application development to create two new VR applications, one focused on outdoor recreation and another focused on simulating an arcade-like environment.

Rochele Gloor (CITL), Jamie Nelson (CITL), Bob Dignan (CITL), Jake Metz (Library), Michel Bellini (CITL)

The CITL (Center for Innovation in Teaching & Learning) Innovation Studio and VR (Virtual Reality) Lab would be pleased to participate in the 2024 IMMERSE Symposium with both a brief presentation and a demo. During the academic year of 2022, the Innovation Studio hosted 2537 visitors, comprising students, faculty, staff, international scholars, high school students, youth camps, home school groups, colleagues from other educational institutions, and corporate visitors. We provided customized first experiences in emerging technologies as well as educational, technical, and human experience guidance for complex projects in immersive computing, converging Artificial Intelligence (AI), Virtual Reality (VR), Extended Reality (XR), Augmented Reality (AR), and metaverse technologies. The presentation will give an overview of the projects (and disciplines and audiences) that we seeded, implemented, and enabled in our spaces. We will show data about users from our Annual Report, highlighting the importance of spaces that host innovation as well as of integrating immersive computing to enhance teaching and learning. The mission to integrate these emerging technologies across various disciplines (social work, music, applied arts, sustainable design, fashion design, computer science, dance, and others) to enrich the human experience is fully in line with the core theme of the conference. Our studio's approach focuses on immersive technologies as catalysts for innovation and human-centered design. The following are the varied disciplines that visit and reside in our space and the technologies we provide. For Social Work, we utilize VR to foster empathy and understanding through experiences such as body swaps and exposure therapy. In the School of Art and Design, we hosted fashion illustration classes using VR Open Brush and a sustainable design class that used AI tools for illustrating research and model ecological education. We provided computer software and space for implementation of a VR game using technologies such as cloth simulation, in collaboration with the Department of Dance and Stu/dio (a Game Studies and Design initiative). We are currently developing digital avatars at the Innovation Studio, continuing an initiative that started with Gies College of Business. For the demo, in line with the abstract's focus on integrating immersive computing, the Innovation Studio is actively exploring the potential of metaverse experiences, particularly platforms like Meta Horizon Workrooms. We are interested in finding dynamic ways to interact and share information, enhancing user experiences, and fostering collaboration. Within the virtual reality workroom, we have been experimenting with different numbers of participants appearing as avatars in the virtual space, along with others attending through a traditional video-conferencing interface. We are also exploring human factors and usability of online broadcasting, content sharing, screen recording, and sharing images and videos on a large virtual monitor, as well as user experience and workflows. For the demo, we will enter Meta Horizon Workrooms with people located in different geographical locations and at the university and perform various activities while casting the session to a large monitor so the audience can see the interactions. One person in the audience will join online from a computer through a more Zoom-like interface.

Juan Salamanca (Art & Design)

Coalescing Currents, an interactive mural at the University of Illinois at Urbana-Champaign, stands out for its transformative experience that transcends traditional immersive interactions. Unlike static displays, this mural invites users to break free from a fixed location and engage in dynamic exploration. Featuring 50 cases of networked innovation from diverse disciplines, the mural incorporates two-dimensional, three-dimensional, and augmented reality (AR) components, creating a multidimensional narrative. Emphasizing a cognitive approach, the design encodes numerical and categorical data into conceptual spaces, enabling users to navigate through 150 years of research agendas. The two-dimensional component flows chronologically, representing themes over decades, while the three-dimensional aspect features 3D-printed 'pebbles' in radar diagram layouts symbolizing innovation cases. The AR component takes the experience further by allowing users to interact with a dynamic network of flying ribbons and labels connecting instances of innovation. Crucially, the mural transforms the viewing experience into an embodied journey. Users physically navigate intersections of disciplinary domains, walk through the mural, and engage with virtual pebbles through AR. This emphasis on mobility and exploration distinguishes Coalescing Currents, offering a groundbreaking approach to interactive data visualization that goes beyond the confines of traditional immersive displays, allowing users to actively shape their experience.

Bradly Alicea (iSchool)

Characterizing immersion as human-machine symbiosis requires considering how symbiosis can be treated as a dynamical system that enables unique opportunities for adaptation. This includes forms of adaptive and maladaptive accommodation, as well as enhancements in performance (learning). Biological systems are typically supervised by social (tutoring) and externalized (symbolic) cues. For immersive environments, however, there are unique challenges of procedural learning that may be overcome with naturally supervised learning. Rare or unusual physical relationships and unique world geometries are particularly difficult to learn from instruction. Furthermore, the ability to orient oneself and develop embodied interactions in unusual simulated physics is something that is not easily taught or acquired without assistance. What is required is a supervisory process that regulates the mapping between sensory inputs and behavioral outputs, with the goal of maintaining isomorphy between the two. In cases where the nervous system cannot adequately serve to regulate this mapping, naturally supervised learning can serve to fill in the gaps of the action-perception loop. We can enhance learning by adding nonlinear or superadditive components to the learning curve. As a dynamical system, cognition in immersive settings must often do this on the fly, being embedded in the interaction itself. Specifically, immersion can trigger latent supervisory processes. Examples from dynamic motor learning and multisensory perception in virtual environments will be used. In the former, two potential mechanisms are switching and resonance. The latter can be induced by the strategic induction of sensory signals, with similarities to resonance in motor learning. These examples are interpreted in terms of insights from Ecological Psychology and Psychophysics, which provide avenues to a broader theory of non-Euclidean immersion. Overall, naturally supervised learning allows us to identify properties of symbiotic systems dynamics and offers a potentially controllable technique for encouraging immersive interactions.

Mingyuan Wu, Yinjie Zhang, Qian Zhou, Bo Chen, Shiv Trivedi, Yuhan Lu, Klara Nahrstedt (UIUC CS); Lingdong Wang, Michael Zink, Ramesh Sitaraman (UMass); Simran Singh, Jacob Chakareski (NJIT)

In a real-life meeting environment, individuals often demonstrate a remarkable ability to selectively focus their attention on specific visual information. This ability allows them to naturally concentrate on a specific region of interest while tuning out others. Understanding and exploiting such selective attention remain unexplored in user-centric teleconferencing systems, where there is potential to customize video streaming and foveated rendering based on the viewer's attention. In this work, we propose a novel user-centric scene analysis module that leverages selective attention for online meeting scenarios and recognizes the unequal importance of individual pixels in the videos. The module determines the user's selective attention from the meeting context. The contextual representation of the meeting is modeled as a combination of two primary components: proactive user interaction within the system and passive real-time analysis of high-level visual semantics from the scenes. As the meeting progresses, the interactive scene analysis module dynamically updates its contextual representation, offering a dual advantage: (a) videos can be selectively and adaptively streamed within a user's attention, resulting in bandwidth savings of up to 78 percent; and (b) the module enhances the overall quality of the user experience by facilitating more user interactivity, particularly in meeting-related tasks such as screen sharing, privacy-preserving user blocking, background removal, and automatic detection of user attention shifts. We believe that our interactive scene analysis module makes a significant stride towards creating an efficient, immersive, interactive, and intelligent teleconferencing system.

Mark Cohen1,3, Inki Kim2*, James Rehg2, Charlie Hawknuff3, Carlos Brown1,3, Shandra Jamison1, Angelia Deweese4, Mae Vogel1, Debapriya Dutta3, Linda Owens3, Emily Wee3, Cassie Cox3, Staci Hoffman3
1 Carle Illinois College of Medicine at the University of Illinois, Urbana-Champaign, 506 S Mathews Ave, Urbana, IL 61801, United States
2 Health Care Engineering Systems Center at the University of Illinois, Urbana-Champaign, 1206 W Clark St, Urbana, IL 61801, United States 
3 Carle Foundation Hospital, 611 W Park St, Urbana, IL 61801, United States
4 Carle BroMenn Medical Center, 1304 Franklin Ave, Normal, IL 61761, United States

As the number one driver of health care costs and hospital deaths worldwide, with rising rates in the US, sepsis is a condition in critical need of improved methods for early diagnosis and aggressive treatment. The goal of this effort is to improve survival and shorten hospital length-of-stay for sepsis patients through use of a novel, scalable virtual reality (VR) educational training experience that helps learners recognize different presentations of sepsis early and comply better with a standardized “bundled” treatment protocol. Using protocol-specific immersive VR experiences that expose learners to diverse patient presentations of sepsis, learners can gain relevant sepsis cues that strengthen earlier diagnosis, while becoming more aware of the professional interactions and controversies associated with bundle compliance. Our central hypothesis is that this more robust, customized VR procedural simulation, developed to measure and test the user specifically for time to sepsis recognition as well as compliance with the sepsis bundle order set, will show improved bundle compliance and time to treatment, leading to improved survival and shorter hospital length-of-stay. To test this hypothesis, we first identified gaps and variables central to the screening, recognition, diagnosis, and treatment of the septic patient and compliance with the new health system sepsis bundle. We aimed to formally represent clinicians’ perceptions of the set of gaps and variables associated with sepsis recognition and bundle compliance. In our preliminary study, we conducted a survey of 68 medical providers (38 registered nurses and 30 physicians and advanced practice providers) in the EDs of the Carle Health system regarding their perceived difficulties, educational gaps, and confidence in sepsis knowledge.
Specifically, the participants were presented with the set of variables derived from the literature and were asked to check all that applied to their own perception of difficulties. At the same time, to expand the initial set, the respondents were asked to add new variables if needed and to elaborate on their perceptions of those variables in their own words. Applying Multiple Correspondence Analysis (MCA) to the survey responses identified two primary dimensions: the first dimension separates “all-or-nothing” from multifaceted reasoning associated with sepsis recognition and bundle compliance, and the second separates system-wide from case-specific foci in sepsis interventions. Together, the two dimensions explained 65.5% of the total variability. In addition, statistical tests confirmed that clinicians’ experiences with sepsis, levels of medical education, and clinical roles had significant relationships to their perceptions, supporting the need for custom VR education tailored to these learner segments. These preliminary findings will allow us to design highly realistic and holistic VR training scenarios that can translate into improved diagnostic and interprofessional skills along with quality reasoning for bundle compliance and antibiotics initiation. As next steps, we will iteratively develop and prospectively evaluate novel clinical VR educational scenarios to enhance learner diagnosis of sepsis and compliance with the health system sepsis bundle orders.

Charlie Nudelman (Department of Speech and Hearing Science), Pasquale Bottalico (Department of Speech and Hearing Science)

This study explores the influence of visual input on voice production in virtual reality with healthy participants. The effects of the size and fullness of virtual reality rooms on acoustic voice parameters and vocal status ratings are examined. Voice production from 30 participants was recorded in six virtual conditions. After each condition, the participants provided vocal status ratings, and after all voice production tasks were complete, they ranked the virtual reality conditions. The voice recordings were processed to calculate various acoustic parameters. The effects of the virtual reality conditions on these voice acoustic parameters and the vocal status ratings were analyzed. The full virtual reality rooms resulted in significantly worse vocal fatigue and vocal discomfort ratings, and the smallest virtual reality room, when sparsely occupied, was ranked as best by the participants. The virtual reality room size had statistically significant effects on mean sound pressure level, the standard deviation of sound pressure level, mean fundamental frequency, mean pitch strength, and cepstral peak prominence smoothed. This study provides evidence that larger and more densely occupied virtual reality rooms contribute to perceived vocal fatigue, vocal discomfort, and changes in objective voice acoustic parameters in healthy speakers.

Pasquale Bottalico (Department of Speech and Hearing Science), Carly Wingfield (School of Music), Charlie Nudelman (Department of Speech and Hearing Science), Joshua Glasner (School of Graduate Studies, Delaware Valley University), Yvonne Gonzales Redman (School of Music)

In the realm of classical singing, performances of identical repertoires in diverse acoustic and visual settings display significant adaptations. These adjustments in the singer's delivery are influenced by a myriad of factors, encompassing the artist's perception of the acoustic surroundings, the visual aesthetics of the performance venue, and the measured acoustic properties of the space. Voice production behaviors were evaluated to explore the effects of room acoustics on five voice parameters: vibrato rate, vibrato extent, vibrato jitter (Jvib), vibrato shimmer, and quality ratio (QR), an estimation of the singer's formant power. The subjects were ten classically-trained professional singers (five males and five females). Subjects sang the aria da camera “Caro mio ben” by Giordani in their preferred key in three different performance venues with different acoustics and dimensions, with and without a virtual reality headset that simulated the same room. The study revealed significant adaptability in voice production behaviors across the three venues. Performance was consistent between the condition in the real room and the condition with VR simulating the same room. Notably, all five voice parameters remained consistent between these two conditions due to the successful immersion provided by the virtual reality technology. These results emphasize the complex interplay between room acoustics, visual perception, and vocal parameters in influencing the delivery of classical singing performances, shedding light on the multifaceted nature of artistic adaptation.

Rachel Switzky, Sarita Adve, Eric Shaffer, and Amber Dewey Schultz

The Siebel Center for Design (SCD) is a multidisciplinary hub at the University of Illinois Urbana-Champaign with a mission to practice, model, and teach design thinking, using human-centered design to re-imagine our campus, community, and collective world. Human-centered design (HCD) is a problem-solving approach that uses design thinking methods and tools to identify the unmet needs of a population in order to collaboratively and iteratively develop solutions.
SCD is offering a new Certificate Program in Immersive Experience Design. This Interdisciplinary Certificate presents an opportunity for undergraduate students to delve into the world of immersive design. By tapping into the power of human-centered design, this enriching program empowers creative students to shape the future of immersive experiences. Students will gain the skills to create captivating experiences that go beyond the ordinary, connecting deeply with audiences on an emotional level when designing everything from augmented and mixed reality to real environments.
Our presentation will offer insights into the dynamic collaboration between SCD and IMMERSE, shaping the development of the Immersive Experiences Certificate. This partnership aligns with IMMERSE's three thrusts: Technologies, Applications, and Human Experience. A focus on human needs lies at the core of both human-centered design and IMMERSE's Human Experience thrust, offering a unique opportunity for synergy among technology, applications, and human-centric principles within our certificate program. By including methods, tools, and ethical considerations for immersive interventions within the curriculum, our certificate program will support the multidisciplinary students’ integration of the virtual and the physical, drawing insights from IMMERSE's Technologies thrust.
This collaboration echoes the shared mission of SCD and IMMERSE (to assemble diverse teams for collaborative problem-solving) and aligns with the Applications thrust.