Disclaimer: I've been working on VR as a Product Manager at Google for a few years, and I was interested in the space for some time before that (e.g. see this). However, when push comes to shove: this is my personal blog and the views expressed on these pages are mine alone, and not those of my employer.
Since late 2014, accessible and simple smartphone-powered VR viewers like Google Cardboard have brought the first taste of VR to larger audiences than ever before. In January 2016, Google announced that over 5 million headsets had shipped.
Ever since the original Cardboard launch in 2014, one feature has puzzled the press and users more than anything else: the lack of a headstrap.
Here's why including a headstrap with Cardboard-like devices wouldn't be a good idea.
VR presents a fundamentally difficult challenge of replacing the "natural" set of photons that would reach your eye from your surrounding environment with the set of "artificially" generated photons coming from the display. In order for you to feel comfortable in VR, this artificial set of photons has to approximate the behavior of the natural ones as closely as possible.
For example, when you turn your head, your brain expects the new set of photons to reach the eyes almost instantly. After all, this is how the real world always works.
Unfortunately, things in VR are slightly different.
To begin with, you don't actually need to get new visual input to know that your head position has changed. If you close your eyes and turn your head, your brain will get feedback from your neck muscles stretching and contracting, the tension on ligaments and tendons changing, the nerves in surrounding areas getting stimulated, angular acceleration being registered by the vestibular system, and so on.
What this effectively means is that when you turn your head, proprioception has already told your brain about the movement, so it expects the new set of photons to arrive right away. If there is a noticeable delay between your head movement and the new set of photons reaching your eyes, your brain realizes that something is wrong and copes with it using the oldest trick in the book. Hundreds of thousands of years ago, if you ate berries from a plant you weren't supposed to and the toxins in them affected your levels of neurotransmitters, your brain would tell your body to get rid of the toxins in the fastest way possible: by making you throw up. It reacts to the sensory mismatch in the same way.
In VR's case, if the delay between a head movement and the corresponding new visual input (known as motion-to-photon latency) exceeds 15-20ms, you start experiencing simulator sickness. Unfortunately, most of today's smartphones, when used for VR, have a motion-to-photon latency of about 80-100ms.
There are two main reasons for this. The first is that today's smartphone operating systems have not been optimized for VR -- modern mobile OSes contain deep rendering stacks with double/triple buffering, no direct front buffer access, etc., which increase the time between a new frame being ready and its pixels being sent to the display. The second is that the underlying hardware components in the smartphones themselves were also not built for VR -- current smartphones ship with high-persistence displays (pixels take a long time to switch on and off), low spatiotemporal resolution IMUs, and so on.
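To get a feel for why buffering alone matters, here is a rough sketch in Python. It uses a deliberately simplified model in which every finished frame waiting in the queue costs one display refresh interval; the 60 Hz refresh rate and the function name `queue_delay_ms` are my own assumptions for illustration, not measurements of any particular phone or OS.

```python
def queue_delay_ms(queued_frames: int, refresh_hz: float = 60.0) -> float:
    """Delay added by finished frames waiting for scanout.

    Simplified model (an assumption): each queued frame waits exactly
    one refresh interval before it reaches the display.
    """
    if queued_frames < 0 or refresh_hz <= 0:
        raise ValueError("queued_frames must be >= 0 and refresh_hz > 0")
    return queued_frames * 1000.0 / refresh_hz


# Upper end of the 15-20 ms comfort threshold mentioned above.
COMFORT_BUDGET_MS = 20.0

if __name__ == "__main__":
    for queued in (0, 1, 2):
        delay = queue_delay_ms(queued)
        print(
            f"{queued} queued frame(s) at 60 Hz: {delay:.1f} ms added "
            f"(whole budget: {COMFORT_BUDGET_MS:.0f} ms)"
        )
```

Under these assumptions, two queued frames at 60 Hz add roughly 33 ms on their own, which is already more than the entire comfort budget before sensor, rendering, and display delays are counted.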
So how does this all relate to the headstraps on Cardboard-like devices?
Well, it turns out that the human peak angular neck velocity (~300 deg/s [1] [2]) is around 2-3 times faster than the peak angular torso velocity! (If you'd like to convince yourself, do a simple experiment: set a timer to 30 seconds and see how many times you can twist your neck side-to-side with your hands resting on your lap vs turning side-to-side with your hands held behind your head.)
By removing the headstrap and requiring the user to hold the viewer to their head with their hands, Cardboard-like devices effectively shift the user's yaw rotation plane from neck to torso, which in turn (pun intended) limits the maximum speed at which the user can change the position of their field of view, and masks the motion-to-photon latency.
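The masking effect can be put into rough numbers. If the head rotates at a constant speed, the image lags behind by roughly the rotation speed multiplied by the latency. The Python sketch below applies that to the figures from this post; the torso speed is an assumption (the neck peak divided by 2.5, the midpoint of the "2-3 times" range), and the function name `angular_lag_deg` is made up for this example.

```python
def angular_lag_deg(angular_velocity_deg_s: float, latency_ms: float) -> float:
    """Degrees by which the displayed view trails the head,
    assuming a constant rotation speed over the latency window."""
    return angular_velocity_deg_s * latency_ms / 1000.0


NECK_PEAK_DEG_S = 300.0                     # peak neck velocity cited in the post
TORSO_PEAK_DEG_S = NECK_PEAK_DEG_S / 2.5    # assumption: midpoint of "2-3 times" slower

if __name__ == "__main__":
    for label, velocity in (("neck", NECK_PEAK_DEG_S), ("torso", TORSO_PEAK_DEG_S)):
        for latency_ms in (20.0, 100.0):
            lag = angular_lag_deg(velocity, latency_ms)
            print(f"{label:5s} at {velocity:.0f} deg/s, {latency_ms:.0f} ms latency: {lag:.1f} deg behind")
```

With 100 ms of latency, a peak-speed neck turn leaves the image about 30 degrees behind the head, while a peak-speed torso turn (under the assumed 120 deg/s) leaves it about 12 degrees behind. The latency is the same in both cases; the slower rotation just makes the resulting mismatch smaller.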
Since reaching peak angular velocities is definitely not a hypothetical scenario in VR (imagine if you were walking through a dimly-lit dungeon in VR and a zombie appeared in the periphery of your vision), reducing the speed of head movement to hide the motion-to-photon latency yields a better user experience overall: it's better to be forced to take a small break every 10 minutes or so because your hands get a little bit tired than to constantly experience simulator sickness because of latency.
To end on a brighter note, the obstacles related to the smartphone motion-to-photon latency do not require any major technological breakthroughs: better hardware components can be put into the phones, and operating systems can be improved. As shown by Samsung Gear VR, when the motion-to-photon latency is reduced sufficiently, headstraps start making a ton of sense. In the meantime, the success of Cardboard-like devices also makes sense. They provide a simple way for the billions of existing smartphone users to understand what VR is about, to enjoy tens of thousands of "snackable" VR experiences already developed for the platform, and to share the fun with friends, family and colleagues.
References
[1] Bussone. Linear and Angular Head Accelerations in Daily Life. Master's thesis, Virginia Polytechnic Institute and State University, USA, 2005.
[2] Öhberg et al. Chronic Whiplash Associated Disorders and Neck Movement Measurements: An Instantaneous Helical Axis Approach. IEEE Transactions on Information Technology in Biomedicine. 01/2004; 7(4):274-82.