For a long time it has been suggested that visual immersive analytics based on virtual reality (VR), augmented reality (AR) and other advanced forms of human-computer interaction has enormous potential to assist thinking processes in scientific research and education, especially in areas of science that deal with abstract objects, objects much smaller or larger than human dimensions, objects that are hard to acquire and handle due to high cost, scarcity or fragility, and very large amounts of data (O’Donoghue et al., 2010; Matthews, 2018; Krichenbauer et al., 2018; Sommer et al., 2018). Chemistry and structural biology are examples of such disciplines, in which AR and VR are credited with high potential for education and research by providing hybrid physical/computational interfaces to handle and explore virtual molecules in real 3D space, augmented with real-time overlays of information from databases and calculations. However, the actual impact of immersive technologies on teaching, learning and working in chemistry still requires deep evaluation (Fjeld & Voegtli, 2002; Pence, Williams & Belford, 2015; Matthews, 2018; Bach et al., 2018; Yang, Mei & Yue, 2018). Such evaluation has progressed very slowly due to the complex software setups and specialized hardware needed, which limit availability, reach, adoption, and thus testing. Indeed, this limitation is shared more broadly with other potential applications of AR and VR in science, which so far “[…] remain niche tools for scientific research” (Matthews, 2018).
In the last decade several groups have been studying ways to achieve immersive environments for chemistry and structural biology using VR and AR (Gillet et al., 2004, 2005; Maier, Tönnis & Klinker, 2009; Maier & Klinker, 2013; Hirst, Glowacki & Baaden, 2014; Berry & Board, 2014; Martínez-Hung, García-López & Escalona-Arranz, 2016; Vega Garzón, Magrini & Galembeck, 2017; Balo, Wang & Ernst, 2017; Goddard et al., 2018a, 2018b; Wolle, Müller & Rauh, 2018; O’Connor et al., 2018; Ratamero et al., 2018; Müller et al., 2018; Stone, 2019). Such interfaces allow handling molecules with six degrees of freedom and with both hands, in immersive 3D. They are expected to overcome the limitations of traditional software based on screen, mouse and keyboard, thus enabling more intuitive, fluid exploration of molecular features and data. At the time of writing, most of these works are not complete AR or VR versions of fully-fledged programs, but rather prototypes, proofs of concept and case studies on how humans interact with virtual molecules in AR or VR. Notable highlights moving towards complete immersive molecular visualization and modeling programs are the rewrite of Chimera, ChimeraX, which was optimized for modern GPUs and incorporates support for VR experiences (Goddard et al., 2018b), VRmol (Xu et al., 2019), new VR plugins for the VMD molecular graphics program (Stone, 2019), and a few commercial programs like Autodesk’s Molecule Viewer (https://autodeskresearch.com/groups/lifesciences) or Nanome (https://nanome.ai/), all with interfaces specifically tailored for VR.
Most of these works suffer from two severe limitations. First, all but a few exceptions require hardware such as head-mounted displays (helmets, headsets or goggles like MS Hololens, Oculus Rift, HTC Vive, etc.) or immersive installations with large surround screens plus 3D-handheld input devices and the corresponding computer and GPU hardware. The few remarkable exceptions are prototypes using ordinary webcam-enabled smartphones (Balo, Wang & Ernst, 2017) or laptops (Gillet et al., 2004). The second limitation is the need for specialized programs that often must be correctly interfaced to couple the different components required for an AR or VR experience, that is, tracking limbs, handheld devices or AR markers, then running calculations on molecular data, and finally displaying results and molecular graphics on screen (see Ratamero et al., 2018; Gillet et al., 2005). Some of these programs are only compatible with specific VR devices, and many are not free software. Overall, then, despite dropping costs, access to these tools still requires investment on the order of hundreds to a few thousand US dollars per user, plus software interfacing that may not be accessible to lay students and teachers. It is therefore unlikely that VR will achieve the ideal of one device per student within the next few years. To date, these solutions are not widely used across the world, and their costs place them entirely out of reach for educational centers in developing countries. Additionally, most current setups are limited to VR, but it has been shown that AR is better suited for educational purposes because it does not occlude the view of the user’s own limbs, resulting in better motion coordination and object handling than VR (Sekhavat & Zarei, 2016; Gaffary et al., 2017; Krichenbauer et al., 2018). Furthermore, in AR the view of the world is not obstructed, thus allowing students and teachers to interact more easily.
This article is organized in two parts. Part 1 provides a practical overview of the main building blocks available as of 2019 to program AR apps in web pages, with a focus on ways to achieve molecular visualization and modeling. It also briefly explores ways to handle gesture- and speech-based commands, molecular mechanics, calculation of experimental observables, concurrent collaboration through the world wide web, and other human-computer interaction technologies available in web browsers. Part 2 of the article showcases prototype web apps for specific tasks of practical utility in pedagogical and research settings. These web apps are based on open, commodity technology that requires only a modern browser “out of the box”, so educators, students and researchers are free to try out all these examples on their computers right away by following the provided links.
Part 1: Overview of Building Blocks for Immersive Molecular Modeling in Web Browsers
Virtual and augmented reality
Object detection and tracking
It is important to note that in marker-based AR, different viewers looking at the same physical marker receive different perspectives of it and hence of the rendered virtual object, just as if it were a real object in real space (Fig. S3). This easily enables multi-user AR in a common room, as would be useful in a classroom setting where students and teachers look at the same virtual molecule.
A speech-based interface can be very useful in situations where the user’s hands are busy holding objects, as in AR/VR applications. In-browser APIs make speech recognition very easy to implement, especially through libraries like Annyang (Ater, 2019), which is used in some of the examples of this article. These libraries usually allow working in two modes: one where the browser waits for specific commands (while accepting variable arguments), and one where the browser collects large amounts of free text that are then made available to the environment. The former allows direct activation of functions without the need for the user to click on the screen. The second option opens up the possibility of automatically detecting subjects, actions and concepts that are fed to artificial intelligence routines, or simply to predefined rules, which the computer analyzes in the background. For example, when two users discussing the interaction surface between two proteins mention certain residues, the computer could automatically mine NIH’s PubMed repository for papers mentioning said residues. This may seem far-fetched, but it is essentially the same technology that underlies automatic advertising and suggestions based on users’ inputs and usage statistics in software and websites. The problem of intelligently suggesting chemical and biological information related to a topic or object has already been addressed for some time, for example in information augmentation tools like Reflect (Pafilis et al., 2009) and advanced text-mining tools (Rebholz-Schuhmann, Oellrich & Hoehndorf, 2012). The evolution of science-related standards is very important in this regard, including formats and content for the semantic web (Hendler, 2003) and machine-readable scientific databases and ontologies that standardize knowledge and data.
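For illustration, the command mode can be sketched in a few lines of plain JavaScript. The pattern syntax mimics Annyang’s “:argument” placeholders, but the helper names and the example command are hypothetical (in a real web app, Annyang would feed recognized utterances into such a dispatcher automatically):

```javascript
// Minimal sketch of annyang-style speech-command dispatch.
// ":name" in a pattern marks a variable argument, e.g. "show :molecule".
const commands = {};

// Register a command pattern with its callback.
function addCommand(pattern, callback) {
  commands[pattern] = callback;
}

// Try to match a recognized utterance against the registered patterns;
// returns true and fires the callback (with captured arguments) on a match.
function dispatch(utterance) {
  const words = utterance.trim().toLowerCase().split(/\s+/);
  for (const [pattern, cb] of Object.entries(commands)) {
    const parts = pattern.split(/\s+/);
    if (parts.length !== words.length) continue;
    const args = [];
    let ok = true;
    for (let i = 0; i < parts.length; i++) {
      if (parts[i].startsWith(':')) args.push(words[i]); // variable argument
      else if (parts[i] !== words[i]) { ok = false; break; }
    }
    if (ok) { cb(...args); return true; }
  }
  return false;
}

// Example: activate display of a molecule without touching the screen.
let shown = null;
addCommand('show :molecule', (mol) => { shown = mol; });
dispatch('show ubiquitin');
```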
Further building blocks
Finally, a particularly interesting aspect of software running on web browsers is the ease with which different users can connect to each other, just over the internet. Web apps can exploit web sockets to achieve direct browser-to-browser links over which data can be transmitted freely, with a server only intervening to establish the initial connection (Pimentel & Nickerson, 2012). For example, two or more users can concurrently work on a JSmol session by sharing just mouse rotations and commands, appearing on all other users’ screens with a minor delay (Abriata, 2017b). Such collaborative working technology could be adapted to complex immersive environments to allow multiple users to work on chemical problems at a distance, essential for scientific collaborations, demonstrations, and online teaching (Lee, Kim & Kang, 2012).
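The session-sharing idea can be sketched as follows: each user’s actions are serialized to small JSON messages that, once relayed to the peers, are applied identically on every session. The class and function names here are hypothetical, and in JSmol the received script commands would be passed to the actual viewer rather than logged:

```javascript
// Sketch of collaborative session sharing: actions become JSON messages
// that every connected peer applies to its own local session.
function encodeAction(type, payload) {
  return JSON.stringify({ type, payload, t: Date.now() });
}

class Session {
  constructor() {
    this.rotation = { x: 0, y: 0 }; // current view rotation, degrees
    this.log = [];                  // received script commands
  }
  applyMessage(msg) {
    const { type, payload } = JSON.parse(msg);
    if (type === 'rotate') {
      this.rotation.x += payload.dx;
      this.rotation.y += payload.dy;
    } else if (type === 'script') {
      this.log.push(payload); // a real app would execute this in the viewer
    }
  }
}

// Two peers receiving the same message stream stay in sync.
const a = new Session(), b = new Session();
const msg = encodeAction('rotate', { dx: 15, dy: -5 });
[a, b].forEach((s) => s.applyMessage(msg));
```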
Part 2: Prototype Web Apps Showcasing Example Applications
This section presents example AR web apps compatible with major web browsers on modern computers, introducing features of increasing complexity. All examples were verified to run out of the box in multiple web browsers under the Windows, Linux and MacOS operating systems, on laptop and desktop computers. All examples are accessible through links in Table 1 and at https://lucianoabriata.altervista.org/papersdata/tablepeerjcs2019.html, which further contains links to demo videos. To run these examples the user needs to print the Hiro, Kanji or cube markers as needed in each case (Fig. 1A; Figs. S1 and S2, and links on the web pages). For simpler handling, the flat markers (Hiro and Kanji) may be glued on a flat surface mounted on a device that can be easily rotated from the back, such as a small shaft perpendicular to the marker plane. The cube marker is printed on a single page, folded, and optionally glued to a solid cube made of wood, plastic, rubber or similar material. Readers interested in the inner workings and in developing content can inspect the source code of each web page (Ctrl+U in most browsers). Several recommendations and basic troubleshooting tips for developers and users are given in Table 2.
|Software and hardware requirements|
|* A webcam- and internet-enabled computer (desktop or laptop) is needed.
* Ensure using https URLs; otherwise webcams will not activate.
* Enable the webcam when prompted.
* Check that ad blockers and firewalls do not block the webcam and other content.
* Free web hosting services work, as web pages only need to be hosted but run in the client.
* Given the regular updates in W3C standards, APIs and libraries, routine tests are recommended to ensure normal functioning.
* Examples from this paper were verified to work “out of the box” on Safari in multiple MacOS 10.x versions and on multiple Chrome and Firefox versions in Windows 8, Windows 10, Linux RedHat Enterprise Edition, Ubuntu and ArchLinux.
* Support in tablets and smartphones is currently limited and heterogeneous, so these devices are not recommended.
* Pages containing large Wavefront files may take time to load (half a minute to a few minutes).
|Augmented reality markers|
|* Print markers on a regular printer; try different sizes for different applications.
* When using the Hiro and Kanji markers together, ensure they are printed at the same size.
* Ensure that markers have a white frame around the black drawing, at least 10% of the marker size.
* To improve marker recognition, avoid glossy paper; opaque paper is best.
* Lighting may also affect the quality of the AR experience.
* Markers are easier to handle if glued on solid surfaces (but avoid wrinkles).
* The cubic marker can be glued on a solid rubber cube cut to the appropriate size.
Introducing web browser-based AR for visualization
The simplest way to achieve AR in web pages consists of displaying, on the AR markers, representations exported from programs like VMD (Humphrey, Dalke & Schulten, 1996) in Wavefront (OBJ+MTL) format. This can be achieved with a few lines of HTML code thanks to libraries like AR.js for A-Frame, enabling very easy creation of content for displaying any kind of object handled by the exporting program. Figure 2A exemplifies this with a small molecule, 2-bromobutane, shown as balls and sticks. This small molecule is chiral at carbon 2; the Hiro marker displays its R enantiomer while the Kanji marker displays the S enantiomer, both rendered from the same pair of OBJ+MTL files but scaled as required for chirality inversion. Figure 2B shows on the Hiro marker a protein complex rendered as cartoons with small-molecule ligands rendered as sticks (PDB ID 1VYQ, the same example used by Berry & Board (2014)); and Fig. 3 shows a cartoon representation of a protein bound to a short segment of double-stranded DNA rendered as sticks (from PDB ID 1FJL) spinning on the Kanji marker. Figure 2D exemplifies display of VMD isosurfaces to show a volumetric map of a bacteriophage attached to its host as determined through electron microscopy (EMDB ID 9010). Two further examples feature combined representations of atomic structure and volumetric data: Fig. 2E shows a small peptide rendered as sticks and docked inside the experimental electron map shown as a mesh (from PDB ID 3HYD), and Fig. 2F shows the frontier molecular orbitals of BH3 and NH3 (from Wavefront files kindly provided by G. Frattini and Prof. D. Moreno, IQUIR, Argentina).
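A minimal sketch of such a web page is shown below, using AR.js for A-Frame to place a Wavefront object on the printed Hiro marker. The library URLs reflect versions available around 2019, and the file names and scale factor are placeholders for files exported from VMD:

```html
<!-- Minimal marker-based AR scene with AR.js for A-Frame.
     molecule.obj/.mtl are placeholders for files exported from VMD. -->
<script src="https://aframe.io/releases/0.9.2/aframe.min.js"></script>
<script src="https://raw.githack.com/jeromeetienne/AR.js/master/aframe/build/aframe-ar.js"></script>
<body style="margin: 0">
  <a-scene embedded arjs>
    <a-assets>
      <a-asset-item id="mol-obj" src="molecule.obj"></a-asset-item>
      <a-asset-item id="mol-mtl" src="molecule.mtl"></a-asset-item>
    </a-assets>
    <!-- The virtual molecule appears on, and moves with, the Hiro marker -->
    <a-marker preset="hiro">
      <a-entity obj-model="obj: #mol-obj; mtl: #mol-mtl"
                scale="0.05 0.05 0.05"></a-entity>
    </a-marker>
    <a-entity camera></a-entity>
  </a-scene>
</body>
```

Serving this page over https with the two Wavefront files alongside it is, in essence, all that is needed for the simplest visualization examples of Fig. 2.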
Adding interactivity: examples on small molecules
Similar emulation strategies could easily be used to build “interactive animations” for exploring chemical and physical phenomena of great pedagogical use, as in the PhET interactive simulations (Moore et al., 2014) but using AR to directly, almost tangibly, handle molecules. For example, the app shown in Fig. 3B illustrates stereoselectivity in the Diels-Alder reaction in interactive 3D. This reaction occurs between a dienophile and a conjugated diene in a concerted fashion, such that the side of the diene where the initial approach occurs defines the stereochemistry of the product. The web app in this example allows users to visualize this in 3D as they bring a molecule of 1,3-cyclohexadiene on the Hiro marker close to a molecule of chloroethene on the Kanji marker. As the two pairs of reacting C atoms approach each other, the new bonds gain opacity until the product is formed. Additionally, the product formed in this reaction is in itself an interesting molecule to visualize and manipulate in AR, because it contains two fused six-membered rings which are hard to understand in 2D.
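The distance-to-opacity mapping behind such a “forming bond” effect can be sketched in a few lines; the distance thresholds here are hypothetical values in scene units, not those of the actual app:

```javascript
// Sketch: map the distance between two reacting atoms to the opacity of
// the forming bond. Above dStart no bond is drawn; below dBond the bond
// is fully formed; in between, opacity fades in linearly.
function bondOpacity(distance, dStart = 4.0, dBond = 1.5) {
  if (distance >= dStart) return 0; // atoms too far apart: no bond
  if (distance <= dBond) return 1;  // bond fully formed
  return (dStart - distance) / (dStart - dBond);
}
```

On every animation frame, the web app would recompute the inter-marker atom distances and feed the resulting opacities to the rendered bond geometries.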
It should be noted that the examples provided here emulating reactivity are merely pictorial visualizations of the mechanisms, and not based on any kind of quantum calculations. Such calculations are too slow to be incorporated into immersive experiences where energies need to be computed on the fly. However, novel machine learning methods that approximate quantum calculations through orders-of-magnitude faster computations (Smith, Isayev & Roitberg, 2017; Bartók et al., 2017; Paruzzo et al., 2018) could in the near future be coupled to AR/VR systems to interactively explore reactivity in real time. Obviously, such tools could be useful not only in education but also in research, for example to interactively test the effect of chemical substituents on a reaction, estimate effects on spectroscopic observables, probe effects of structural changes on molecular orbitals, etc. It is already possible to integrate AR/VR with a physics engine, to add realistic mechanics to the simulation. The web app in Fig. 2H uses Cannon.js to simulate thermal motions and thus give a sense of dynamics to the visualized system. In this web app, Cannon.js handles the multi-atom system by treating atoms as spheres of defined radii connected by fixed-distance constraints and whose velocities are continuously updated to match the set temperature, leading to rotations around bonds. However, extension of Cannon.js to include additional force field terms like dihedral angle terms and electrostatic interactions would be needed to achieve a more complete and realistic modeling experience.
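In the web app the constraints are delegated to Cannon.js objects, but the core idea of a fixed-distance constraint can be sketched as a simple position-based correction applied between two bonded “atom” spheres (a deliberately simplified stand-in, not the Cannon.js solver itself):

```javascript
// Sketch of a fixed-distance (bond) constraint: after each integration
// step, nudge both atoms along the bond axis so their separation
// returns to the rest length, splitting the correction between them.
function enforceBond(a, b, restLength) {
  const dx = b.x - a.x, dy = b.y - a.y, dz = b.z - a.z;
  const d = Math.sqrt(dx * dx + dy * dy + dz * dz);
  const corr = (d - restLength) / (2 * d); // half of the error per atom
  a.x += dx * corr; a.y += dy * corr; a.z += dz * corr;
  b.x -= dx * corr; b.y -= dy * corr; b.z -= dz * corr;
}
```

Iterating such corrections over all bonds each frame, while velocities are rescaled to the set temperature, yields the rotations around bonds described above.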
AR-based modeling of biological macromolecules
Figures 2 and 3 already include examples for visualizing certain biological macromolecules, and the previous section introduced ways to incorporate interactivity into these web apps. This section digs deeper into the development of interactive content more relevant to education and research in structural biology, by exploring the incorporation of restraints, simple force fields and on-the-fly simulation of experimental observables for biological macromolecules.
The web app shown in Fig. 4A allows exploration of the interaction space of two proteins that are known to form a complex in solution, specifically ubiquitin (red trace) and a ubiquitin-interacting motif (UIM, blue trace) taken from PDB ID 2D3G (Hirano et al., 2006). The web app simulates on-the-fly the small-angle X-ray scattering (SAXS) profiles expected from the relative arrangement of the two proteins, and displays them overlaid onto an experimental profile in real time as the user moves the proteins. This offers a way to interactively test the compatibility of possible docking poses with the experimental data. Although it cannot compete with the extensive sampling achievable with molecular simulations, such an interactive tool could be useful for preliminary analysis of SAXS data before starting complex calculations, or to assist interpretation of the results of such calculations. For simplicity and speed, in this example the SAXS profile calculation is based on the Debye formula iterated through pairs of residues instead of through all pairs of atoms as the full equation requires (Debye, 1915); however, realistic SAXS profiles of actual utility in modeling can be achieved with coarse-graining strategies and proper parameterization of the scattering centers (Stovgaard et al., 2010). This web app further includes a rudimentary residue-grained force field (i.e., describing each amino acid with one backbone and one side-chain bead) to detect clashes, and a predefined binding coordinate which upon activation brings the two molecules together. Activation of the SAXS profile simulation, the clash-detecting force field and the binding coordinate are controlled by voice commands, required because the user’s hands are busy handling the markers. This proceeds through the browser’s cloud-based speech recognition API, so it does not consume significant local resources.
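The residue-level Debye sum used for the on-the-fly profile can be sketched as follows, with all form factors set to 1 for simplicity (in the app, proper per-residue scattering factors and an outer loop over q values would be used):

```javascript
// Sketch of the Debye formula iterated over residue centers:
//   I(q) = sum_i sum_j f_i f_j sin(q r_ij) / (q r_ij)
// with unit form factors f_i = f_j = 1 as a simplification.
function debyeIntensity(coords, q) {
  let I = 0;
  for (let i = 0; i < coords.length; i++) {
    for (let j = 0; j < coords.length; j++) {
      if (i === j) { I += 1; continue; } // sin(x)/x -> 1 as x -> 0
      const dx = coords[i][0] - coords[j][0];
      const dy = coords[i][1] - coords[j][1];
      const dz = coords[i][2] - coords[j][2];
      const r = Math.sqrt(dx * dx + dy * dy + dz * dz);
      const x = q * r;
      I += Math.sin(x) / x;
    }
  }
  return I;
}
```

Because only residue centers are summed, the double loop stays small enough to be re-evaluated on every frame as the user moves the markers.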
The successful combination of all these different elements (AR, 3D visualization, calculations, and speech recognition) illustrates the superb integration capability of libraries for client-side scripting in web browsers. The modularity and simplicity of client-side web programming allows easy adaptation to other kinds of experimental data; for example to residue-specific paramagnetic relaxation enhancements as done by Prof. Rasia (IBR-CONICET-UNR, Argentina) at https://rrasia.altervista.org/HYL1_1-2/Hyl1_12_minima.html.
Another example, presented in Fig. 4B, shows how AR can help to explore residue-residue contact predictions from residue coevolution data. Such predictions provide useful restraints in modern approaches for modeling proteins and their complexes (Simkovic et al., 2017; Abriata et al., 2018; Abriata, Tamò & Dal Peraro, 2019), but often include a fraction of false positives that introduce incorrect information if undetected. Interactive inspection of residue-residue contact predictions could help to actively detect false positives through human intervention before the actual restraint-guided modeling proceeds. The example in Fig. 4B shows contacts predicted from coevolution analysis of large sequence alignments for the pair of proteins in chains A and B of PDB ID 1QOP, taken from the Gremlin server (Ovchinnikov, Kamisetty & Baker, 2014). Each protein is driven by one marker, and the predicted contacts are overlaid as dotted lines connecting the corresponding pairs of residues. These lines are colored green, olive and red according to decreasing coevolution score as in the Gremlin website, and their widths reflect in real time the distance between the pairs of residues, expected to be minimal when a correctly predicted contact is satisfied.
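The styling of each predicted contact line can be sketched as a small pure function of its coevolution score and the current inter-residue distance; the score cutoffs and width mapping here are hypothetical, not the exact values used by Gremlin or the web app:

```javascript
// Sketch: derive the color and width of a predicted-contact line from
// the coevolution score (color, decreasing confidence green > olive > red)
// and the current inter-residue distance (width, thicker when closer).
function contactStyle(score, distance) {
  const color = score > 0.8 ? 'green' : score > 0.5 ? 'olive' : 'red';
  // Thinner line the further the residues are from contact distance.
  const width = Math.max(0.5, 4 - distance / 4);
  return { color, width };
}
```

Re-evaluating this function on every frame, as marker positions update, gives the real-time visual feedback described above.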
The last prototype application shows rudimentary handling of highly disordered protein regions, in this case to test how a flexible linker made of six glycine residues restricts the space available for exploration and the possible docking poses of two well-folded domains (Figs. 5A–5C). Each folded domain (ubiquitin and ubiquitin-interacting motif, taken from PDB ID 2D3G) is modeled at a resolution of two spheres per residue, one centered at the backbone’s alpha carbon and one at the center of mass of the side chain (i.e., a description slightly coarser than that of the MARTINI force field (Marrink et al., 2007)). All spheres representing residues of the folded domains are kept in fixed positions relative to their AR marker, and have radii assigned as the cubic root of the amino acid volumes to roughly respect the relative amino acid sizes (Abriata, Palzkill & Dal Peraro, 2015). The glycine residues of the flexible linker are modeled as single spheres centered at the alpha carbons, with their radii set to the cubic root of glycine’s volume. Using Cannon.js, the spheres representing the glycine residues of the flexible linker (in orange in Fig. 5) are allowed to move freely but under a fixed-distance constraint from each other and from the corresponding ends of the folded domains. This very simple model can help to answer questions related to the separation of the anchor points and the maximal extension of the linker when straight: How far apart can the two proteins go with the given linker containing six residues? Can both proteins be docked through certain interfaces while keeping the linker in a relaxed configuration? The user’s investigations are assisted by on-the-fly estimation of the entropy associated with given extensions of the linker, estimated with a worm-like chain model from polymer physics (Marko & Siggia, 1995), and by an estimation of the strain experienced by the linker when the user pulls its glycine residues apart beyond their equilibrium distance.
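The strain estimate can be sketched with the Marko–Siggia worm-like chain interpolation formula, which gives the restoring force as a function of the linker’s end-to-end extension; the contour length, persistence length and temperature used below are illustrative values, not those of the app:

```javascript
// Sketch of the Marko-Siggia worm-like chain interpolation:
//   F(x) = (kT/P) * [ 1/(4(1 - x/L)^2) - 1/4 + x/L ]
// with extension x, contour length L, persistence length P,
// and kT ~ 4.11 pN nm at ~298 K (consistent length units assumed).
function wlcForce(x, L, P, kT = 4.11) {
  const t = x / L; // fractional extension, 0 <= t < 1
  return (kT / P) * (0.25 / ((1 - t) * (1 - t)) - 0.25 + t);
}
```

The force is zero at zero extension and diverges as the extension approaches the contour length, which is what produces the felt “strain” as the user pulls the glycine spheres apart.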
Achieving seamless integration of immersive visualizations, haptic interfaces and chemical computation stands as one of the key “grand challenges” for the simulation of matter in the 21st century (Aspuru-Guzik, Lindh & Reiher, 2018). Such integration is expected to help us to more easily grasp and explore molecular properties and to efficiently navigate chemical information. In the last two decades several works introduced different ways to achieve AR and VR, as presented in the Introduction section. In education, such tools could replace or complement physical (such as plastic-made) modeling kits, augmenting them with additional information about forces, charges, electron distributions, data facts, etc. In research, such tools could help better visualize and probe molecular structure, simulate expected outcomes of experiments and test models and simulated data against experimental data, etc., all through intuitive cues and fluent human-computer interactions.
For educational applications, the next stage would be to develop actual content of value for teachers and students. The simplest content could merely provide visualizations to explore molecular geometries, surfaces, orbitals, etc., with specific sets of demos to assist learning of key concepts such as chirality, organic molecules, metal coordination complexes and biomacromolecular structures, to mention just a few cases. By adding mechanics, more complex demos could be created where students could, for example, interactively explore torsion angles as in the textbook cases of probing the energy landscape of rotation around the central C–C bond of butane, swapping between chair and boat conformations of six-membered rings, or exploring hydrogen-bonding patterns between neighboring beta strands in a protein. Importantly, every student with a computer at hand could use these apps, not only at school or university but also at home; this could therefore become an actual learning tool of full-time, individual use. The possibility of reaching the masses with this kind of web technology for AR-based molecular modeling in turn opens up the opportunity of performing large-scale evaluations of its actual impact in education.
As actively investigated by others, there is also a need to explore whether full working programs for AR-based molecular modeling may actually become powerful enough to also assist research. Here again, web-based tools like those discussed in this article could help to carry out such tests at large scales. Some of the prototypes presented here advance possible uses in research, as in the simulation of data from protein–protein docking poses and comparison to the corresponding experimental data in real time. However, some issues should be addressed before creating fully-fledged web programs for research: (i) improving AR marker detection and tracking to stabilize inputs (Gao et al., 2017), (ii) developing some kind of AR marker that is clickable so that users can drag and drop objects in space (hitherto unexplored), (iii) improving graphics, where the possibility of adapting existing web molecular graphics libraries like NGL (Rose & Hildebrand, 2015), 3dmol (Rego & Koes, 2015), Litemol (Sehnal et al., in press), Mol* (Sehnal et al., 2018), JSmol (Hanson et al., 2013), etc. is particularly enticing, and (iv) developing force fields that correctly handle molecular dynamics and energetics for different tasks, which may imply different levels of granularity for different applications. Another global improvement, also important for pedagogical applications, would be incorporating proper object occlusion, which is still non-trivial and a subject of ongoing study in the AR community (Shah, Arshad & Sulaiman, 2012; Gimeno et al., 2018).
Further directions that could be explored in the near future include fluently connecting users in physically distant locations through a shared AR experience, so that they can collaborate on a research project or teach and learn at a distance. Adapting future standards for commodity AR/VR in smartphones (plugged into cardboard goggles for stereoscopy) is also worth exploring, as this would lead to an even more immersive experience than with the mirror-like apps proposed here. However, since smartphones are more limited in power, their use for AR molecular modeling may require coupling to external computing power. Lastly, pushing the limits towards fully immersive visual analytics for molecular modeling, and especially thinking about solutions useful for research, a few especially enticing additions include support for haptic devices for force feedback, AR without markers (i.e., just by tracking the user’s hands and fingers) and with proper occlusion, and, as described above, the capability to respond to visual or audio cues by automatically checking databases and mining the literature to propose relevant pieces of information.