Human Performance in Cooperative Virtual Environments: the Effect of Visual Aids and Oral Communication

Abstract—Cooperative virtual environments, in which users simultaneously manipulate objects, are a subfield of collaborative virtual environments (CVEs). In this paper we simulate the use of two string-based parallel robots in a cooperative teleoperation task. Two users, sitting at separate machines connected through a local network, operate one robot each. In this context, we investigate the effects of visual aids and oral communication on cooperation, co-presence and user performance. Ten volunteer subjects had to cooperatively perform a peg-in-hole task. A second group of ten subjects performed the same task in a single-user setup. The objective of the two experiments is twofold: first, to compare the task complexity of the single-user setup with that of the cooperative environment; second, to examine the degree of influence of visual aids and oral communication on user performance in the two setups. Results revealed that shadow has a significant effect on task execution, while arrows and oral communication not only increase user performance but also enhance the sense of co-presence and awareness. We also observed that cooperative manipulation was more complex than single-user manipulation.


I. INTRODUCTION
The successful advancement of high-quality computer graphics and the capability of inexpensive personal computers to render high-end 3D graphics realistically have made virtual reality (VR) feasible in many areas such as industrial design, data visualization and training. Other VR application domains include medicine [26,22,11], textile and fashion [16], assembly, repair and education [5].
Human beings often perform their work (from simple to complex tasks) in a collaborative manner, which is why VR scientists initiated the development of virtual environments (VEs) supporting collaborative work. A CVE is a computer-generated world that enables people in local or remote locations to interact with synthetic objects and with representations of the other participants. Applications of such environments include military training, telepresence, collaborative design and entertainment. Interaction in a CVE may take one of the following forms [21]:
• Asynchronous: the sequential manipulation of distinct or the same attributes of an object. For example, one person changes an object's position and then another person paints it; or one person moves an object to a place and another person then moves it further.
• Synchronous: the concurrent manipulation of distinct or the same attributes of an object. For example, one person holds an object while another paints it, or two or more people lift or displace a heavy object together. Concurrent manipulation is also termed cooperative manipulation or cooperative work.
In order to carry out a cooperative task efficiently, the participants need to feel the presence of the others and have means of communicating with each other. The communication may be verbal or non-verbal, such as pointing, looking, gestures or facial expressions. The participants must also have a common protocol for task execution. The design and implementation of a system with these capabilities, especially for distant users, has been a challenging job for researchers. For example, the architecture of the virtual world may be client-server or replicated. In the client-server case the known problems of network load and latency arise.
Similarly, in a replicated solution the consistency of two or more sites needs to be addressed. We implement the VE designed for cooperative work with a replicated architecture and address network load/latency and VE consistency in a unique way. To let users feel the presence of others and to make cooperative work easier and more intuitive, we augment the environment with visual aids and oral communication and investigate their effects on user performance in a peg-in-hole task [10,32].
This section is followed by the related work. Section 3 describes the proposed system and the hardware platform used for the experiments. Section 4 presents the peg-in-hole experiment I in the single-user setup. Section 5 discusses experiment II, in which the same task is performed cooperatively. Section 6 is dedicated to a comparative analysis of experiments I and II. Section 7 concludes and gives some tracks for future work.

II. RELATED WORK
A lot of work has already been done in the field of CVEs; for example, MASSIVE provides a collaborative environment for teleconferencing [12]. Most of this work concerns the general software design, the underlying network architecture [7,30] and frameworks [4,18]. Basdogan et al. investigated the role of force feedback in cooperative tasks; they connected two monitors and two haptic devices to a single machine [6]. Similarly, Sallnäs et al. reported the effect of force feedback on presence, awareness and task performance in a CVE, also connecting two monitors and haptic devices to a single host [27]. A heterogeneous scalable architecture that supports haptic interaction in collaborative tasks has also been proposed [29]. Other important systems that support the cooperative manipulation of objects in a VE include [15,14,19,3], but all of them require heavy data exchange between the two nodes to keep them consistent.
Visual, auditory and tactile cues have already been used both in single-user VR and in teleoperation systems as substitutes for haptic feedback [17,23]. Sensory substitution may also be used as a redundant cue to avoid possible force-feedback instabilities in the presence of small delays.

III. DESCRIPTION OF THE SYSTEM
In this section we present a system that enables two users, connected through a Local Area Network (LAN), to cooperatively manipulate virtual objects using string-based simulated parallel robots in a VE. We then present how oral communication and visual aids (shadows and arrows) may assist the cooperative manipulation of objects.
The VE for cooperative manipulation has a simple cubic structure (with a side of 36 cm), consisting of three walls, a floor and a ceiling. The VE contains four cylinders, each with a distinct color, standing lengthwise in a line (see Fig. 1). In front of each cylinder, at a distance of 30 cm, there is a torus of the same color. All cylinders are of the same size (1.5 cm). The red, green, blue and yellow toruses have inner radii of 1.6, 1.8, 2.0 and 2.2 cm respectively. Adjacent cylinders and toruses are 4 cm apart. We have modeled two SPIDAR (3-DOF) devices to be used as robots [28,24,31]. At each corner of the cube a motor for one of the SPIDARs is mounted. The end effectors of the SPIDARs are represented by two spheres of distinct colors. Each end effector uses four wires (of the same color) for connection to its corresponding motors. The user's movements are therefore constrained by the wire arrangement of the SPIDAR.
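For concreteness, the scene constants above can be collected in code. This sketch (constant names are ours, not from the original implementation) also makes explicit that the clearance between the 1.5 cm cylinder and its torus grows from red to yellow, so the red cylinder is the tightest fit:

```cpp
#include <array>
#include <cstddef>

// Scene constants from the paper (all lengths in centimetres).
constexpr double kCubeSide       = 36.0;  // side of the cubic VE
constexpr double kCylinderRadius = 1.5;   // all four cylinders are the same size
// Torus inner radii, in cylinder order: red, green, blue, yellow.
constexpr std::array<double, 4> kTorusInnerRadii = {1.6, 1.8, 2.0, 2.2};

// Clearance left when a cylinder is inserted into its matching torus.
constexpr double clearance(std::size_t torusIndex) {
    return kTorusInnerRadii[torusIndex] - kCylinderRadius;
}
```

With these numbers the red torus leaves only 0.1 cm of clearance while the yellow one leaves 0.7 cm, which suggests the peg-in-hole difficulty decreases along the row.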
One of the important tasks in a collaborative/cooperative system is the representation of the users in the virtual world. This is normally done with avatars [21,13,14,9] or other representations such as virtual hands or balls [8,5,19,20]. We use two spheres, identical in size but different in color (one red and one blue), so that each user may feel the presence of the other. Each pointer controls the movements of an end effector: once a pointer collides with its corresponding end effector, the latter follows the movements of the former. In order to lift and/or transport a cylinder, the red end effector always rests on the right of the cylinder and the blue one on its left.
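The attach-and-follow behaviour of a pointer and its end effector can be sketched as follows. This is a minimal illustration under assumed names and a simple sphere-sphere contact test, not the paper's actual code:

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

double distance(const Vec3& a, const Vec3& b) {
    return std::sqrt((a.x - b.x) * (a.x - b.x) +
                     (a.y - b.y) * (a.y - b.y) +
                     (a.z - b.z) * (a.z - b.z));
}

// A SPIDAR end effector driven by one user's pointer sphere.
struct Effector {
    Vec3 pos{0, 0, 0};
    bool attached = false;
};

// Once the pointer sphere collides with its end effector, the effector
// follows the pointer on every subsequent frame.
void updateEffector(Effector& e, const Vec3& pointer, double sphereRadius) {
    // Two equal spheres touch when their centres are 2*r apart.
    if (!e.attached && distance(e.pos, pointer) <= 2.0 * sphereRadius)
        e.attached = true;   // first contact: latch onto the pointer
    if (e.attached)
        e.pos = pointer;     // follow the pointer's movements
}
```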

Use of visual aids and oral communication in cooperative work
Cooperative work is a challenging research area, especially when the users are connected through a LAN or WAN, because a number of points must be addressed. For example, sensing the presence of the other partners and being aware of where they are and what their status is are essential and may have profound effects on cooperation. The cooperating persons should also have feedback telling them when they can start together, when they can release the object (once the task is finished), or whether there has been an interruption during the task. For this purpose we exploit visual feedback and oral communication. In the visual channel we make use of arrows and object shadows.
If a user touches a cylinder on its proper side, an arrow appears pointing in the direction opposite to the force applied by the end effector on the object (see Fig. 2). The arrow has several advantages; for example, it indicates the collision between an end effector and the cylinder. Similarly, during transportation, if a user loses control of the cylinder, his/her arrow disappears and the cylinder stops moving. The second user then simply waits for the first one to come back into contact with the cylinder. The two users are thus aware of each other's status via the arrows throughout the task.
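The arrow behaviour described above can be sketched as follows; this is an illustrative reconstruction (the real system renders an arrow mesh, here we only compute its direction and visibility):

```cpp
#include <optional>

struct Vec3 { double x, y, z; };

// While an end effector pushes on the cylinder, an arrow is drawn
// pointing opposite to the applied force; when contact is lost the
// arrow disappears, telling the partner that control was lost.
std::optional<Vec3> arrowDirection(bool inContact, const Vec3& appliedForce) {
    if (!inContact)
        return std::nullopt;  // no arrow: the partner sees contact is lost
    return Vec3{-appliedForce.x, -appliedForce.y, -appliedForce.z};
}

// The cylinder only moves while both users are in contact,
// i.e. while both arrows are visible.
bool cylinderMayMove(bool userAInContact, bool userBInContact) {
    return userAInContact && userBInContact;
}
```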
Our current system is a desktop environment and does not support stereoscopic display. In order to convey the relative positions of the various objects in the VE, we render a shadow for every object in the environment. The shadows not only give information about the two end effectors' contact with the cylinder but also provide feedback about the cylinder's position relative to its corresponding torus during transportation. Human beings frequently make use of oral communication while performing a collaborative and/or cooperative task. In order to accomplish the cooperative work in a more natural manner, to achieve high performance and to increase co-presence and awareness, we therefore also support oral communication. For this purpose we use the TeamSpeak software, which allows the two users to communicate over the network using headphones equipped with microphones [1]. The oral communication allows the users to negotiate and inform each other about various events, such as an increase or decrease in speed, losing control of the cylinder, or arriving over the torus. The following conditions are checked once the two end effectors touch a cylinder (see Fig. 3).

Fig. 2. Illustration of the appearance of arrow
In equation 1, Dh represents the horizontal distance between the centers of the two spheres, Rc is the radius of the cylinder and K is a positive constant. This check ensures that the spheres do not completely penetrate the cylinder and remain visible during the task. In equation 2, Dv represents the vertical distance between the centers of the two spheres, which must be less than or equal to a threshold T. When the conditions of equations 1 and 2 are both satisfied, the users can cooperatively move the cylinder.
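Since the equations themselves are not reproduced here, the following sketch implements the two checks with assumed inequality forms: equation 1 is taken as Dh >= 2*Rc + K (the sphere centers must stay at least a cylinder diameter plus a margin apart, so neither sphere is fully swallowed by the cylinder), and equation 2 as Dv <= T. Both forms are our reconstruction from the prose, not the paper's stated formulas:

```cpp
// Grasp test, reconstructed from the text:
//   Dh - horizontal distance between the two sphere centres,
//   Rc - cylinder radius, K - positive visibility margin,
//   Dv - vertical distance between the centres, T - alignment threshold.
bool canGrasp(double Dh, double Dv, double Rc, double K, double T) {
    const bool noFullPenetration = Dh >= 2.0 * Rc + K;  // equation 1 (assumed direction)
    const bool verticallyAligned = Dv <= T;             // equation 2
    return noFullPenetration && verticallyAligned;
}
```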

Framework for Cooperative VE
The framework plays a very important role in the success of collaborative and/or cooperative VEs. It determines how different users get access to the same virtual world and data (i.e. centralized, distributed or replicated), which protocol (TCP, UDP, etc.) is used, and what kind of data must flow through the network to maintain consistency [8]. We use a completely replicated approach and install the same copy of the VE on the two machines.
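A minimal sketch of why replication keeps network traffic low: each machine applies the same small input events to its own copy of the deterministic scene, so no bulk state needs to be transferred. The struct layout and names below are illustrative, not the paper's actual protocol:

```cpp
#include <cstdint>

// A per-frame input event: only the tracker sample crosses the network.
struct TrackerUpdate {
    std::uint32_t frame;   // frame counter, keeps replicas in step
    std::uint8_t  userId;  // 0 = red pointer, 1 = blue pointer
    float x, y, z;         // Polhemus position sample
};

// Each machine holds an identical replica of the (deterministic) scene.
struct Replica {
    float pos[2][3] = {};  // latest position of each user's pointer

    // Applying the same updates in the same order on both machines
    // keeps the two copies of the VE consistent.
    void apply(const TrackerUpdate& u) {
        pos[u.userId][0] = u.x;
        pos[u.userId][1] = u.y;
        pos[u.userId][2] = u.z;
    }
};
```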

Experimental Setup
We installed the software on two Pentium 4 personal computers connected through a local network. Each machine had a 3 GHz processor and 1 GB of memory and was equipped with standard graphics and sound cards. Both systems used 24-inch flat LCD screens for display.
Each VR station is equipped with a Polhemus Patriot [2] as input device. The workspace that the Polhemus supports is a half-sphere of 50 cm radius. The software was developed using C++ and the OpenGL library.

IV. EXPERIMENT I
In this experiment the peg-in-hole task was carried out in a single-user setup. For this purpose two Polhemus sensors were used on the same machine. The two sensors were attached to the right hand of the user such that the sensors corresponding to the red and blue spheres were on the index finger and thumb respectively (see Fig. 5). The experiment was performed by ten volunteers, five male and five female. All of them were master students and right-handed. Each subject was given a pre-trial along with a short briefing. The task was to grasp the cylinders via the end effectors and put each one in its corresponding torus. We tested only conditions C1 (Shadow), C2 (Shadow + Arrow) and C4 (No aid), in a counterbalanced manner. There were four trials under each condition and the order of selection was sequential, starting from the red cylinder. The evaluation is based on task completion time, errors and the users' responses collected through a questionnaire.

Fig. 5. Illustration of the Polhemus sensors in the single-user setup

Comparing the task completion times of C1 (6.34 sec, std 1.51) and C4 (8.09 sec, std 1.14) gives a significant ANOVA, showing that users performed better under condition C1 than under C4 (see Fig. 6). Similarly, C2 (6.27 sec, std 0.82) versus C4 (8.09 sec, std 1.14) also gives a significant ANOVA result, so users performed better under C2 than under C4. On the other hand, comparing the task completion times of C1 (6.34 sec, std 1.51) and C2 (6.27 sec, std 0.82) yields a non-significant ANOVA. These results show that the shadow has an influence on task performance in the single-user setup, but the arrow does not.

Errors in task completion
Fig. 7 illustrates the average errors in the single-user setup for conditions C1 (Shadow), C2 (Shadow + Arrow) and C4 (No aid). The loss of control of a cylinder is counted as an error (drop). We recorded the number of errors for each cylinder under each condition and present a global error analysis. For errors in task completion, the ANOVA (F(3,9) = 0.49, p > 0.05) is not significant. C1 has a mean of 1.16 errors (std 0.72); C2 and C4 have means of 1.17 (std 0.48) and 1.47 (std 1.02) respectively. Users had good control over their finger positions, which resulted in few errors under all conditions.

Subjective evaluation
For the subjective evaluation, users responded to a questionnaire after task completion. The questionnaire had the following questions.
• What condition did you prefer? (C1, C2, C4) Here 90% of the users opted for C1 and 10% for C2.
• What feedback helped you more for task accomplishment? (C1, C2, C4) Here 80% of the users chose C1 and 20% chose C2.
• Which part of the task was more difficult? (Grasping, Transportation, Placement) The responses were 20%, 70% and 10% for grasping, transportation and placement respectively.

User learning
Learning is defined as the improvement of group performance during task repetition. In Fig. 8 the four trials (T1, T2, T3 and T4) are represented along the x-axis, and the conditions are differentiated by colors. Under condition C1 the subjects completed the task in 7.1 sec (std = 1.32) in the first trial and in 5.37 sec (std = 0.92) in the fourth trial. Under condition C2 they needed a mean time of 6.22 sec (std = 1.07) in the first trial and 5.77 sec (std = 0.95) in the fourth. Similarly, under condition C4 the mean time was 8.05 sec (std = 2.02) for the first trial and 6.9 sec (std = 1.04) for the last. This corresponds to performance improvements of 24.36, 7.24 and 14.28 percent for conditions C1, C2 and C4 respectively.

V. EXPERIMENT II
The second experiment concerns cooperative manipulation, whereas in the first experiment the same task was performed in a single-user setup.

Procedure
In order to evaluate the system and investigate the effect of visual aids and oral communication on user performance in cooperative object manipulation, we carried out experiment II. A new group of ten volunteers, five male and five female, participated. They were master and PhD students aged from 22 to 35. All participants performed the experiment with the same partner, an expert in the domain and in the proposed system.
Each subject was given a short briefing about the experiment to become familiar with the system, and a pre-trial in which they experienced all feedbacks. The users then started the application on their respective machines.
After a successful network connection between the two computers, the users could see the two spheres (red and blue) as well as the two SPIDAR end effectors on their screens. They were then required to bring their Polhemus-controlled spheres into contact with their respective end effectors (i.e. red with red and blue with blue). The red sphere was assigned to the expert while the subject was in charge of the blue one. In order to pick up a cylinder, the expert had to touch it from the right while the subject had to rest on its left. The experiment was carried out under the following four conditions.
• C1 = only shadow
• C2 = shadow + arrows
• C3 = shadow + arrows + oral communication
• C4 = no aid
All ten groups performed the experiment using counterbalanced combinations of the four conditions. We recorded the task completion time for each cylinder. The time counter starts for a cylinder once the two end effectors make initial contact with it, and stops when it is properly placed in the torus. The indicator of proper placement is the change of the torus's color to white. Similarly, we recorded the number of times a cylinder was dropped as errors. After task completion each user filled in a questionnaire to provide subjective feedback. Fig. 9 shows a novice user at his station while performing the task.
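The timing and error-counting rules above can be sketched as a small state machine. This is an illustrative reconstruction under assumed names, not the original code:

```cpp
// Per-cylinder timing: the clock starts when both end effectors first
// touch the cylinder and stops when it is seated in its torus (the
// torus turning white). A drop mid-transport counts as one error.
struct TrialTimer {
    double startTime   = -1.0;
    double elapsed     = 0.0;
    int    errors      = 0;
    bool   running     = false;
    bool   wasTouching = false;

    void update(double now, bool bothTouching, bool placed) {
        if (!running && bothTouching && startTime < 0.0) {
            startTime = now;              // initial two-handed contact
            running = true;
        }
        if (running && wasTouching && !bothTouching && !placed)
            ++errors;                     // cylinder dropped: one error
        if (running && placed) {
            elapsed = now - startTime;    // torus turned white: stop
            running = false;
        }
        wasTouching = bothTouching;
    }
};
```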

Task
The task was to cooperatively pick up a cylinder and put it into the torus whose color matches the cylinder. The users were required to place all cylinders in their corresponding toruses in a single trial. Each group performed exactly four trials under each condition; thus each user made 64 cylinder manipulations over all conditions. The order of selection of the cylinders was the same for all groups, i.e. starting from the red and proceeding sequentially to the yellow (rightmost) cylinder.
In the following subsections we present and analyze the results for task completion time and for the errors made during task accomplishment. The users' responses collected through the questionnaire are also examined and discussed.

Task completion time
For task completion time the ANOVA (F(3,9) = 16.02, p < 0.05) is significant. Comparing the task completion times of C1 and C2, we have 30.07 sec (std 6.17) and 22.39 sec (std 3.10) respectively, with a significant ANOVA. This result shows that the arrow has an influence on task performance. Similarly, comparing C4 (mean 38.31 sec, std 7.94) with C1 also gives a significant ANOVA, indicating that "shadow" alone also increases user performance compared to "no aid". Comparing the mean of C2 (22.39 sec, std 3.10) with that of C3 (24.48 sec, std 3.93), the ANOVA result is not significant: users had almost the same level of performance under C2 and C3. On the other hand, the comparisons of C2 and C3 with C4 (mean 38.31 sec, std 7.94) are both statistically significant (see Fig. 10).
Fig. 10. Task completion time under various conditions

Error in task completion
When one or both users became detached from the cylinder during task accomplishment, it was counted as an error. We recorded the number of errors for each cylinder under each condition and present a global error analysis per condition (see Fig. 11). C1 has a mean of 8.6 errors (std 4.6); C2, C3 and C4 have means of 6.6 (std 3.5), 6.4 (std 3.2) and 11.7 (std 5.7) respectively. The errors are fewer under conditions C2 and C3 than under C1 and C4.

Subjective evaluation
In this section we analyze the responses collected through the questionnaire. The questionnaire had five questions with three to four options each. For each question the subjects had to rank the options in order of preference.
• Q1: What condition did you prefer? Classify in order of preference. (a) C1 (b) C2 (c) C3 (d) C4
For this question 90% of the subjects placed C3 as their first choice while 10% put it second. C2 was ranked by 10%, 70% and 20% of the subjects as their first, second and third choice respectively. C1 received 30% and 70% of the votes for the second and third positions respectively. C4 was placed last by all users.
• Q2: What feedback helped you more in task accomplishment? Classify in order of preference. (a) Shadow (b) Arrow (c) Oral communication
For this question the shadow was marked by 50%, 30% and 20% of the users as their first, second and third choice respectively. Only 10% of the users placed the arrow first, while 40% and 50% placed it second and third respectively. Oral communication was ranked second by 40% of the users, while the first and third positions each received 30% of the votes.
• Q3: In which condition did you perceive better the actions of your collaborator? Classify in order of preference. (a) C1 (b) C2 (c) C3 (d) C4
For this question 90% of the subjects placed C3 first while 10% put it third. C2 received 10%, 70% and 20% of the votes for the first, second and third positions respectively. C1 was ranked second by 30% and third by 70% of the users. C4 was placed last by all users.
• Q4: In which condition did you sense more the presence of your collaborator? Classify in order of preference. (a) C1 (b) C2 (c) C3 (d) C4
For this question too, 90% of the subjects placed C3 first while 10% put it second. C2 received 10%, 70% and 20% of the votes for the first, second and third positions respectively. C1 was ranked second by 20% and third by 80% of the users. C4 was placed last by all users.
• Q5: What condition helped you more to establish coordination with your collaborator? Classify in order of preference.
To summarize, C3 (shadow + arrows + oral communication) is the most preferred condition and users placed it in first position. Under C3 users deduced the cylinder's position with respect to the torus via the shadow and the status of the collaborator through the arrows, while the oral communication enhanced awareness and realism. C2, C1 and C4 were placed in second, third and fourth position respectively.

User learning
In Fig. 12 the four trials (T1, T2, T3 and T4) are represented along the x-axis, and the conditions are differentiated by colors. Under condition C1 the subjects completed the task in 30.27 sec (std = 3.25) in the first trial and in 27.2 sec (std = 3.52) in the fourth trial. Under condition C2 they needed a mean time of 29.15 sec (std = 6.55) in the first trial and 23.6 sec (std = 3.6) in the fourth. Similarly, we have mean times of 30.75 sec (std = 4.05) under condition C3 for the first trial and 28.75 sec (std = 3.52) for the last, and of 39.27 sec (std = 5.51) under condition C4 for the first trial and 33.15 sec (std = 6.94) for the fourth. This gives performance improvements of 4.46, 19.03, 7.15 and 15.58 percent for conditions C1, C2, C3 and C4 respectively.
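The improvement percentages reported in both learning analyses are consistent with computing the relative gain from the first to the fourth trial. A small helper illustrates this (our own formulation, inferred from the reported numbers, which may differ slightly by rounding):

```cpp
// Learning effect: percentage improvement from the first to the
// fourth trial, computed as (t1 - t4) / t1 * 100.
double improvementPercent(double firstTrial, double lastTrial) {
    return (firstTrial - lastTrial) / firstTrial * 100.0;
}
```

For example, the experiment I condition C4 times (8.05 sec down to 6.9 sec) give roughly 14.3%, matching the reported 14.28%, and the experiment II condition C4 times (39.27 sec down to 33.15 sec) give roughly 15.58%.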

VI. COMPARATIVE ANALYSIS
In this section we compare the task completion times for C1, C2 and C4 of experiment I with those of the same conditions in experiment II. Condition C1 has means of 30.07 sec (std 6.17) and 6.34 sec (std 1.5) for experiments II and I respectively, giving a significant ANOVA (F(2,9) = 139.54, p < 0.05). Condition C2 has means of 22.39 sec (std 3.11) and 6.27 sec (std 0.82) for experiments II and I respectively, and the ANOVA (F(2,9) = 250.36, p < 0.05) is also significant. Similarly, for C4 we have means of 38.31 sec (std 7.94) and 8.09 sec (std 1.13) for experiments II and I respectively, and the ANOVA (F(2,9) = 141.83, p < 0.05) is significant.
Similarly, comparing the errors for C1 (8.6, std 4.6), C2 (6.6, std 3.5) and C4 (11.7, std 5.7) of experiment II (see Fig. 11) with the errors for the corresponding conditions of experiment I (C1 = 1.16, std 0.72; C2 = 1.17, std 0.48; C4 = 1.47, std 1.02; see Fig. 7) gives significant differences. All this indicates that the task is easier to accomplish in the single-user setup than in the cooperative setup. It is also clear that the shadow aids users in task accomplishment in both setups, while the arrow enhances user performance in cooperative manipulation. The arrow is useful in cooperative manipulation because it gives the user feedback not only about his own end effector but also about the collaborator's.

VII. CONCLUSION
In this paper we simulated the use of two string-based parallel robots in a cooperative teleoperation task. Two users, sitting at separate machines connected through a local area network, operated one robot each. The use of visual aids (shadows and arrows) and oral communication was investigated for its effects on cooperation, co-presence and user performance. Ten volunteer subjects cooperatively performed a peg-in-hole task; another group of ten subjects performed the same task in a single-user setup. Results revealed that the shadow has a significant effect on task execution, while arrows and oral communication not only increase user performance but also enhance the sense of co-presence and awareness. We also observed that cooperative manipulation is more complex than single-user manipulation. Moreover, we found that the addition of visual cues (arrows and shadows) and oral communication greatly helped users in the cooperative manipulation of objects in the VE; these aids, especially the arrows and oral communication, also increased user performance and enabled the users to perceive each other's actions. Future work will integrate the force-feedback modality and examine its effects on the cooperative task. Furthermore, we will implement the system over a long-distance network (i.e. the Internet) and investigate the influence of network delay.

Fig. 3. Illustration of conditions and way of cooperative manipulation

Fig. 6. Task completion time under various conditions in the single-user setup

Fig. 7. Illustration of errors for various conditions in the single-user setup

Fig. 8. Illustration of user learning in the single-user setup for various conditions

Fig. 11. Illustration of errors for various conditions

Fig. 12. Illustration of cooperative user learning under various conditions