This article was written by Pierre-Luc Lapointe, Director of R&D and XR design at StellarX
AI is the new kid on the block, competing for attention against terms like XR and Spatial Computing in the public’s imagination. However, experts agree that these emerging technologies are actually converging into a new computing paradigm: XR helps AI applications solve otherwise unsolvable challenges, while AI makes XR tools more powerful and more accessible. Nowhere is this more true than in the workplace.
The synergy between XR and AI is natural; in a way, XR (extended reality) and AI technologies have the potential to extend both the body and the mind by augmenting our perceptual and cognitive abilities. That convergence transcends traditional screens, making the body and mind novel computing interfaces. Advanced AI models with multimodal capabilities are ideal for XR devices equipped with vision, audio, motion and tactile sensors. This combination may enable immersive, augmented and context-aware experiences that enhance decision-making, training, safety and operational efficiency in the workplace.
Yet, Generative AI and XR technologies are often treated separately through the lens of gaming, entertainment and consumer applications, overshadowing their vast potential in other environments. Just this week, Meta released Meta AI on the Meta Quest. Much of the conversation has centered on using Meta AI for fashion tips and bringing a new dimension to games. So I want to focus on how AI can enhance the way we work.
Understanding Generative AI
First, let’s clear up any misinformation about what Generative AI actually is. Artificial Intelligence is evolving quickly; there’s a lot of talk about all kinds of AI that will change your life, but since these technologies are new to most people, it can be overwhelming to decipher what they all are. Generative AI is one of the terms that I find sometimes confounds people; here’s an article on the StellarX blog breaking it down. As a general definition, though, Generative AI refers to a type of deep learning model that can generate new content based on the data it was trained on.
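To make that definition concrete, here is a toy sketch of the core generative loop: learn statistics from training data, then sample new content from those statistics. Real Generative AI replaces this little lookup table with a deep neural network, but the generate-from-what-you-learned principle is the same. Everything in the snippet (the training text, the bigram table) is purely illustrative.

```python
import random
from collections import defaultdict

# Toy "generative model": learn which word tends to follow which in the
# training data, then sample new sequences from those statistics.
training_text = "the robot sees the room and the robot maps the room"

words = training_text.split()
follows = defaultdict(list)
for current, nxt in zip(words, words[1:]):
    follows[current].append(nxt)

def generate(start, length=6, seed=0):
    """Sample a new word sequence from the learned follow-statistics."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        options = follows.get(out[-1])
        if not options:  # no known continuation for this word
            break
        out.append(rng.choice(options))
    return " ".join(out)

print(generate("the"))  # new text sampled from learned statistics
```

The output is "new" in the sense that the exact sequence may never appear in the training text, yet every transition in it was learned from that text, which is the essence of generation.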
From Sensing to Adapting with AI
If we augment our capacity to “sense” the world through XR devices, it seems natural to feed all the data coming in from those device sensors to an AI model. That model could assist us in our cognitive tasks and help us make better decisions, not only by generating content but, most importantly, by adapting to various contexts.
In high-risk work environments, where workers’ perception of danger can be dulled by fatigue or impairment, multimodal AI agents could predict dangerous situations in real time by leveraging XR wearables’ sensors and providing visual and auditory assistance, giving the user better spatial awareness. The multimodal capabilities of XR and AI systems could be combined to provide an augmented work experience that helps minimize fatalities.
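As a thought experiment, that sense-then-assist loop might look like the sketch below. Everything here is hypothetical: the sensor fields, the `risk_score` heuristic, and the alert thresholds all stand in for a real multimodal model and real XR hardware.

```python
from dataclasses import dataclass

@dataclass
class SensorFrame:
    # Hypothetical readings an XR wearable might expose.
    heart_rate: int      # beats per minute, from a wearable band
    head_motion: float   # rad/s; sudden motion can suggest a stumble
    proximity_m: float   # metres to the nearest detected hazard

def risk_score(frame: SensorFrame) -> float:
    """Toy stand-in for a multimodal model that fuses sensor
    streams into a single 0..1 danger estimate."""
    fatigue = min(frame.heart_rate / 180, 1.0)
    instability = min(frame.head_motion / 5.0, 1.0)
    closeness = max(0.0, 1.0 - frame.proximity_m / 10.0)
    return max(fatigue, instability, closeness)

def assist(frame: SensorFrame) -> str:
    """Turn the risk estimate into visual/auditory guidance."""
    score = risk_score(frame)
    if score > 0.8:
        return "ALERT: highlight hazard in headset, play warning tone"
    if score > 0.5:
        return "CAUTION: dim overlay, suggest a break"
    return "OK: no intervention"

print(assist(SensorFrame(heart_rate=70, head_motion=0.2, proximity_m=8.0)))
print(assist(SensorFrame(heart_rate=120, head_motion=0.3, proximity_m=0.5)))
```

The interesting design question is the last step: the AI’s output is not text on a screen but a spatial, audiovisual intervention delivered through the headset itself.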
In a survey, 45% of Canadian employers cited finding qualified candidates as the biggest hurdle in the hiring process. Job seeking can feel like a black box, with candidates unsure about how to prepare for an interview despite being qualified.
Job seekers could benefit from AI-XR technology to develop their interview skills. Virtual agents, powered by Generative AI, could interview them; informed by specific job postings, they could hold natural and helpful conversations. AI models could be used to score the interviews, or simply to help job seekers gain confidence in interviewing.
This sort of technology can also be used to upskill employees in the workplace. If a trainee is struggling in a particular competency, an XR training scenario can automatically adapt using AI to help the trainee exercise that skill. On the other hand, if a trainee is doing exceptionally well, AI can increase the difficulty and further test the trainee.
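A minimal sketch of that adaptive loop is below. The thresholds and the 1–5 difficulty scale are made up for illustration; in practice the adjustment would be driven by a model’s assessment of the trainee rather than a fixed rule.

```python
def adjust_difficulty(level: int, recent_scores: list[float]) -> int:
    """Raise or lower scenario difficulty (1..5) based on recent
    performance scores in 0..1. Thresholds are illustrative."""
    if not recent_scores:
        return level
    avg = sum(recent_scores) / len(recent_scores)
    if avg >= 0.85:           # doing exceptionally well: push harder
        return min(level + 1, 5)
    if avg < 0.5:             # struggling: ease off and drill the skill
        return max(level - 1, 1)
    return level              # in the sweet spot: keep practicing

# A struggling trainee drops a level; a strong one moves up.
print(adjust_difficulty(3, [0.4, 0.45, 0.3]))   # -> 2
print(adjust_difficulty(3, [0.9, 0.95, 0.88]))  # -> 4
```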
I’d also argue that healthcare is one of the most exciting arenas for AI-XR. Think of a surgeon being guided by an AI assistant that can analyze situations in real time and identify issues that are easy for the human eye to miss.
Research indicates that missing cancerous tumors during surgery is a significant issue. Additionally, cancer can recur after surgery due to residual cancer cells that were not detected and removed during an initial operation. Using Mixed Reality devices to autonomously collect data during procedures, and integrating it with AI to process collected data and detect cancerous cells, could be a life-saving technology.
If we think more creatively, AI-XR could also be an easy way to generate fully interactive immersive training scenarios from simple 2D storyboards that can also be “automagically” animated from basic sketches and notes.
Experiments at OVA
At OVA, we are currently prioritizing an AI NPC (Non-Player Character) Agent feature to start making some of these hypothetical scenarios a reality. You can read about some of the other cool work we’re doing with Scale AI, using AI techniques like NeRF to make training at ArcelorMittal safer.
Our work with the AI Agent aims to enable dynamic content generation in various languages, producing text, speech, and images within seconds. We are allowing users to customize conversations using specific data and documents, enabling the AI to respond with different tones, roles, or personalities.
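Under the hood, this kind of persona and document customization often comes down to assembling a system prompt before each exchange. Here is a rough sketch of that idea; the function name, fields, and example content are all hypothetical, not StellarX’s actual implementation.

```python
def build_system_prompt(role: str, tone: str, documents: list[str]) -> str:
    """Compose instructions for a conversational agent from
    user-supplied role, tone, and reference documents."""
    context = "\n".join(f"- {doc}" for doc in documents)
    return (
        f"You are {role}. Answer in a {tone} tone.\n"
        f"Ground your answers in the following material:\n{context}"
    )

prompt = build_system_prompt(
    role="a safety trainer for steel-plant operators",
    tone="calm and encouraging",
    documents=["Furnace lockout procedure, rev. 4", "Site evacuation map"],
)
print(prompt)
```

The same agent can then take on a different role, tone, or knowledge base simply by rebuilding this prompt, which is what makes the customization feel instantaneous to the user.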
Additionally, we are exploring how the AI Agent can use its vision feature to describe visual inputs and trigger specific behaviors based on conversation and spatial context. By developing tools that enable training and learning scenarios to dynamically adapt to various contexts, we aim to enhance interactive experiences in XR with Generative AI. This will improve the content creation process, allowing subject-matter experts to focus on what matters most.
This feature, combined with StellarX’s easy-to-use creation tools, will allow people to prototype various experiences using Generative AI while staying completely immersed in their XR environment.
Responsible AI is Still Relevant
Ensuring data protection and privacy is crucial as we develop these technologies: user data must be secure and used responsibly. We need to build these systems so that they do not operate as black boxes; we want them to be open and transparent, and to give users maximum control over them. At the same time, we need to make sure that all the data involved is secure and accessible only by its owners.
That policy is even more important when developing AI that’s not just based on text inputs, but also on computer vision, speech, and more.
We’ve been exploring AI technologies for a while now, and in many ways, so these are considerations we’ve already thought about deeply. Data is always secure with us: we make sure people know how their data is put to work, and we outline a plan to keep things safeguarded within teams and organizations.
So, what do you think? Are there any ways you think AI-XR can be used for work use cases that I didn’t cover in this article?