Assessment of Stereoscopic Multi-resolution Images for a Virtual Reality System

A camera and monitor system that projects actual real-world images has yet to be developed because existing cameras cannot simultaneously acquire high-resolution and wide-angle images. In this research, we try to resolve this issue by superimposing images, a method that is effective because, given the perceptual characteristics of the human visual system, the entire wide-angle image does not need to be of high resolution. First, we examined the minimum resolution required across the field of view. The results indicated that a triple-resolution image, in which positions more than 20 and 40 deg from the center of the visual field were decreased to 25% and approximately 11% of the resolution at the gaze point, respectively, was perceived as similar to a completely high-resolution image. Next, we investigated whether the participants could distinguish between the original completely high-resolution image and processed images, which included triple-resolution, dual-resolution, and low-resolution images. Our results suggested that the participants could not differentiate between the triple-resolution image and the original image. Finally, we developed a stereoscopic camera system based on our results.

Index Terms-Multi-resolution image, Stereoscopic system, Peripheral visual acuity.

Manuscript received on September 11, 2009. E-mail: ogawam@brain.is.kyushu-u.ac.jp

I. INTRODUCTION
In the field of virtual reality, numerous systems employing cameras, monitors, and projectors have been developed to create an environment where users can obtain a deep sense of immersion (e.g., [1,2]). However, a camera and monitor system capable of projecting realistic images has yet to be developed because of the technical limitation that high-resolution and wide-angle images are incompatible. There are two ways to resolve this problem. One is to develop high-resolution digital television systems (e.g., [3,4]), but this solution requires a few more decades of research. The other strategy is to combine high-resolution images and wide-angle images obtained using existing devices.
Dual-resolution video systems have been proposed (e.g., [5][6][7][8]). In these systems, high-resolution images taken with telephotographic cameras are inserted into low-resolution images taken with pantoscopic cameras, and the combined images are displayed at full size on the monitor. However, observers can see the low-resolution images in the peripheral area of the visual field because these systems are not designed to provide realistic images. Naemura, Sugita, Takano, and Harashima have proposed a quasi-tri resolution stereoscopic system [9], which includes dual-resolution images with different high-resolution areas for the right and left eye. The authors reported that when the central high-resolution area was enlarged for the left eye, observers had difficulty in recognizing low-resolution objects in the periphery of the visual field. We hypothesize that, with an improved multi-resolution image composition, observers would be unable to recognize the low-resolution objects, because human visual acuity decreases with retinal eccentricity. We adopted the second strategy, combining images from existing devices, to develop a new stereoscopic system in which observers perceive a multi-resolution image as a single, completely high-resolution image. To this end, we conducted fundamental research for our system based on human visual characteristics.
Numerous researchers have investigated the relation between visual acuity and eccentricity. For example, Wertheim examined this relationship using grating detection [10], and Millodot investigated it using Landolt rings [11]. Both studies observed the same tendency: human visual acuity rapidly decreases as the degree of eccentricity increases. Anstis confirmed this tendency using Snellen letters [12]. Kondo, Nakamizo, and Araragi also confirmed this using Japanese Hiragana letters [13].
On the other hand, little research has been performed on the effect of monitor pixel density (resolution) on perception. The relationship between resolution discrimination and eccentricity must be clarified to develop our multi-resolution image system. Howlett reported that a high-resolution image can be inserted within a range of 5 or 15 deg from the viewpoint in a dual-resolution head-mounted stereoscopic display (HMD) [14]. Iwamoto, Maeda, and Tanie examined the resolution required to design a dual-resolution HMD, and explained that the high-resolution area can be within 9 deg of the observer's visual angle and that the low-resolution area, which can be decreased to 30% of the high-resolution area, can be within 110 deg [15]. However, when we repeated Iwamoto's experiment, we recognized the low resolution of the peripheral area in the dual-resolution stereoscopic images. Iwamoto et al. created the low-resolution area using a smoothing algorithm, which generated a blurred image. We consider blurred images to be different from images with a decreased resolution because the smoothing algorithm fills in the gaps of the decreased resolution as blur. Although this algorithm enhances the low-resolution image, it requires significant computing power and processing cost.
Thus, we conducted the following experiments to develop our stereoscopic multi-resolution image system while attempting to keep the cost of the system to a minimum. Experiment 1 was designed to determine the resolution necessary across the visual field such that an observer cannot distinguish the high resolution of the central area from the decreased resolution of the peripheral area. We used images in which the size of the high-resolution area and the resolution of the peripheral area were varied randomly. Experiment 2, which was based on the results of experiment 1, tested whether a multi-resolution stereoscopic image composed in this way is perceived similarly to the completely high-resolution image.

II. EXPERIMENT 1

Participants
Six male university students between 22 and 24 years of age participated in this study. All of them had normal or corrected-to-normal visual acuity, and could distinguish the 2048 × 1536-pixel image from the 1600 × 1200-pixel image at a viewing distance of 450 mm.

Apparatus and Stimuli
A Wheatstone stereoscopic viewer using monitors and angled mirrors was used to present the stimuli. The stimuli, which were computer graphic images created using a personal computer (CPU: Intel Xeon 3.06 GHz; graphics board: NVIDIA Quadro FX3000), were displayed on two LCD monitors (iiyama: AQ5311D-BK, 20.8 inch). The monitors faced each other, and the mirrors were positioned between them at 45-deg angles. At a viewing distance of 450 mm (100 mm from the participant's eyes to the mirrors, 350 mm from the mirrors to the LCD), each monitor subtended 54 × 41 deg of visual angle.
Forty-one types of images were used as the stimuli (Fig. 1). The original image (resolution: 2048 × 1536 pixels) was created using OpenGL. In the image, a number of cubes, measuring 35 mm . Under a stereoscopic view, the cubes were alternately presented in anterior and posterior positions to provide depth perception to the observers. The even-numbered cubes on the even-numbered rows and the odd-numbered cubes on the odd-numbered rows were perceived to be 470 mm from the observers. The remaining cubes were perceived to be approximately 450 mm from the observers.
The other 40 stimuli with deteriorated resolutions were created by processing the original high-resolution image with the Nearest Neighbor (NN) algorithm using Photoshop Elements (Adobe). The processing decreased the resolution in all areas except for a circular central area.
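The stimulus construction described above can be illustrated with a minimal sketch. It is not the authors' Photoshop Elements workflow: it assumes NumPy, a plain index-based nearest-neighbour resampler, and a central radius given in pixels rather than degrees. The function names `nn_resample` and `degrade_periphery` are hypothetical.

```python
import numpy as np

def nn_resample(img, out_h, out_w):
    """Nearest-neighbour resize of an (H, W) or (H, W, C) array."""
    in_h, in_w = img.shape[:2]
    # Map every output row/column back to one source row/column.
    rows = (np.arange(out_h) * in_h) // out_h
    cols = (np.arange(out_w) * in_w) // out_w
    return img[rows[:, None], cols]

def degrade_periphery(img, low_h, low_w, radius_px):
    """Keep a circular central area at full resolution (assumed radius in
    pixels) and replace everything outside it with a version that was
    downsampled to low_h x low_w and scaled back up, both by NN."""
    h, w = img.shape[:2]
    low = nn_resample(nn_resample(img, low_h, low_w), h, w)
    yy, xx = np.ogrid[:h, :w]
    inside = (yy - (h - 1) / 2) ** 2 + (xx - (w - 1) / 2) ** 2 <= radius_px ** 2
    out = low.copy()
    out[inside] = img[inside]
    return out
```

For a 2048 × 1536 original, a call such as `degrade_periphery(img, 480, 640, 400)` would give a 640 × 480 periphery around a full-resolution disc; the radius of 400 pixels is an arbitrary illustration, not a value from the experiment.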

Procedure
The experiment was performed in a dark room. The original high-resolution image was displayed first. While viewing an image, each participant was to fixate on its center with both eyes. When the participant sensed that the left and right images had fused into a single image, he clicked a mouse button. Then one of the 41 stimuli (the original or one of the 40 deteriorated images) was randomly presented for 1000 msec. After that, a one-digit or two-digit number was randomly presented for 100 msec at the center of the monitor. The participant then judged whether the two images were the same and pressed the corresponding mouse button. In addition, the participant reported the presented number aloud. Fig. 2 shows the flow of a trial. The reading task was performed to confirm that the participant was fixating on the center of the image. Each stimulus was presented 10 times. Half of the participants initially completed the stereoscopic session and then the non-stereoscopic session. The time required was about 30 min.

Results and Discussion
We excluded trials in which the participants made a mistake in the reading task (correct answer rate: 98.8% in the stereoscopic session and 99.2% in the non-stereoscopic session). Fig. 3 and Fig. 4 show the mean proportion of trials in which the participants perceived the two images as the same for the stereoscopic and non-stereoscopic sessions, respectively. The proportions were high when the peripheral resolution was high. Furthermore, the proportions increased rapidly when the central area was more than 20 deg in both sessions. After the experiment, we collected feedback from the participants. When they answered "not same," they reported perceiving jaggies, a pixelated effect, in the peripheral visual field.
We conducted Dunnett's multiple t-test to compare the proportion for the original image with those for the processed images and thereby determine the specifications for a stereoscopic multi-resolution system. Fig. 5 shows the minimum resolution for each view angle, defined as the poorest peripheral resolution of the processed images that did not differ significantly from the original. The stereoscopic and non-stereoscopic images produced similar results. Based on the results of Dunnett's multiple t-test, we determined the composition of a multi-resolution image in which observers could not perceive the decreased resolution of the peripheral area. Participants perceived the images as the same when the completely high-resolution image was paired with an image whose resolution was reduced to 25% (1024 × 768 pixels) of the original at positions more than 20 deg from the center. Moreover, in areas more than 40 deg from the gaze point, we surmised that a resolution of approximately 11% (640 × 480 pixels) of the resolution at the gaze point was sufficient.
Participants could not perceive a decrease in the resolution of the peripheral area because visual acuity is decreased, especially in the peripheral field. However, the attenuation of the necessary resolution is more gradual than that of visual acuity [11]. The decreased resolution generated jaggies, which were very noticeable and changed the sharpness of the image. Moreover, it is easier to detect jaggies than to detect a grid.

III. EXPERIMENT 2
Experiment 2 was designed to confirm whether a multi-resolution image system based on the results of experiment 1 produced the same results as a completely high-resolution image system. The results of experiment 1 indicated that observers cannot perceive a decrease in resolution to 25% at positions 20 deg or more from the gaze point. Moreover, observers cannot recognize a decrease in resolution to approximately 11% at positions 40 deg or more from the gaze point. Based on these results, we examined whether triple-resolution stereoscopic images could be distinguished from high-resolution stereoscopic images. Within 20 deg of the visual field, the image could be divided into more zones to provide more detail. However, if the resolution after processing was not a divisor of the original image resolution, the observer could detect slight differences in the image: because the resolution on the monitor is always an integer value, such processed images were distorted.

Participants
Seven male university students between 22 and 25 years of age participated in this study. All of them had normal or corrected-to-normal visual acuity.

Apparatus and Stimuli
The environment in which the stimuli were presented was the same as in experiment 1. Eye movements were monitored using a video camera (Sony: DCR-HC41) sampling at a rate of 30 Hz. The video stream was recorded using another personal computer. Gaze points were measured by image processing of the pupil with the HALCON 7.1 software library (MVTec).
There were four different resolution images, each of which included a pair of images with binocular disparity. The original high-resolution image (resolution: 2048 × 1536 pixels) was created using OpenGL. A cross was drawn at the center of the image to provide the gaze point. In the image, a number of blue spheres with a diameter of 20 mm [CIE xyY (0.21, 0.18, 14.71), about 16.1 cd/m²] were shown at equal intervals in a 13 × 17 array on a white background [CIE xyY (0.32, 0.34, 138.36), about 110.2 cd/m²]. A decrease in image resolution considerably affects the curvature of the spheres and is seen as jaggies. The spheres were alternately presented in anterior and posterior positions to provide depth perception to the observers. The even-numbered spheres on the even-numbered rows and the odd-numbered spheres on the odd-numbered rows were perceived to be 470 mm from the observer. The remaining spheres were perceived to be approximately 450 mm from the observer. Each sphere occupied approximately 2.8 deg of the visual angle for the participant.
The other three resolution images, which were triple-resolution, dual-resolution, and low-resolution images, were created by processing the original high-resolution image with the NN algorithm using Photoshop Elements (Adobe). The resolution of the central area, which subtended 16 × 12 deg in the triple-resolution image, was the same as that in the high-resolution image. The pixel density of the area between 16 × 12 deg and 32 × 24 deg was decreased to 25% of that of the central area. The pixel density of the remainder of the visual field (from 32 × 24 deg to 54 × 41 deg) was decreased to approximately 11% of that of the central area (Fig. 6). In the dual-resolution image, the pixel density of the area outside the central 16 × 12 deg was decreased to approximately 11% of that of the central area. In the low-resolution image, the pixel density was decreased to 11% of that of the high-resolution image throughout the image (640 × 480 pixels).
Close-up sphere of the central area. Resolution is the same as the high-resolution image (2048 × 1536 pixels).
Close-up sphere of the middle area. Resolution was decreased to 25% of the central area (1024 × 768 pixels).
Close-up sphere of the peripheral area. Resolution was decreased to about 11% of the central area (640 × 480 pixels).
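The triple-resolution layout described for experiment 2 can be sketched as nested rectangular zones. This is only an illustration under stated assumptions: the input is a 2048 × 1536 NumPy array, degrees are mapped to pixels linearly across the 54 × 41 deg field (a flat screen does not map angles exactly linearly), and NN resampling is done by index arithmetic rather than with Photoshop Elements. The names `ZONES`, `nn_degrade`, `centered_box`, and `triple_resolution` are hypothetical.

```python
import numpy as np

FIELD_DEG = (54.0, 41.0)  # full field (width, height) in deg, from the text

# (zone extent in deg, resolution the zone is resampled to), outermost first.
ZONES = [
    ((54.0, 41.0), (640, 480)),     # peripheral area, about 11%
    ((32.0, 24.0), (1024, 768)),    # middle area, 25%
    ((16.0, 12.0), (2048, 1536)),   # central area, full resolution
]

def nn_degrade(img, low_w, low_h):
    """NN downsample to low_w x low_h, then NN upsample back to full size."""
    h, w = img.shape[:2]
    rows = (np.arange(h) * low_h // h) * h // low_h
    cols = (np.arange(w) * low_w // w) * w // low_w
    return img[rows[:, None], cols]

def centered_box(extent_deg, shape):
    """Pixel slices of a centered rectangle (assumed linear deg-to-pixel map)."""
    h, w = shape
    bw = round(w * extent_deg[0] / FIELD_DEG[0])
    bh = round(h * extent_deg[1] / FIELD_DEG[1])
    x0, y0 = (w - bw) // 2, (h - bh) // 2
    return slice(y0, y0 + bh), slice(x0, x0 + bw)

def triple_resolution(img):
    """Paint the zones from the outside in, so inner zones overwrite outer ones."""
    out = np.empty_like(img)
    for extent_deg, (low_w, low_h) in ZONES:
        box = centered_box(extent_deg, img.shape[:2])
        out[box] = nn_degrade(img, low_w, low_h)[box]
    return out
```

Dropping the middle entry from `ZONES` would give the dual-resolution condition, and keeping only the first entry would give the low-resolution condition.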

Results and Discussion
Fig. 7 shows the mean proportion of trials in which the participants perceived the two images as the same. The proportions were 0% for the low-resolution trials, 4% for the dual-resolution trials, 77% for the triple-resolution trials, and 93% for the high-resolution trials. A one-way ANOVA with repeated measures was used to analyze the results from the various trials. There was a significant main effect of the type of trial [F (3,18)

IV. GENERAL DISCUSSION
We examined the relationship between image resolution and viewing angle in the design of multi-resolution images in which observers cannot perceive resolution changes. We conducted this fundamental research using static images with a fixed gaze point for the observer. Under the conditions of our experiments, we can provide an initial definition of the relationship between image resolution and view angle. In the future, we will study the effects of using moving images without limiting the observer's gaze point.
In experiment 1, a triple-resolution image, where positions more than 20 and 40 deg from the center of the visual field were decreased to 25% and ~11% of the resolution of the gaze point, respectively, was perceived as similar to a completely high-resolution image. In experiment 2, participants could not differentiate the high-resolution stereoscopic images from the triple-resolution stereoscopic images that were composed according to the results of experiment 1. However, observers were able to perceive the low-resolution area in the dual-resolution stereoscopic images. Based on these data, we concluded that the central high-resolution area should be wider than that indicated by an earlier study [15]. Our conjecture is that the discrepancy is due to differences between the interpolation algorithms. The NN algorithm is fast but produces jaggies, whereas smoothing algorithms (e.g., Bilinear, Bicubic, and Lanczos interpolation) do not produce jaggies but impose a significant processing cost on the image and the overall system. We can adopt better multi-resolution composition processing if the computing power of the hardware system is increased such that it can process the interpolation.
In experiment 1, the peripheral area resolutions of the stimuli were set based on computer display standards. To determine the general effect of resolution on perception, the resolution should be varied more finely as a parameter. However, if our results are applied to an actual system, then the specifications of devices such as monitors and cameras must also be considered.
There are differences between the central and the peripheral fields of view in pattern discrimination [16,17]. Hence, we need to examine a number of variables, including spatial frequencies, contrasts, colors, and objects used for the stimuli. Moreover, we should compare still images to moving images, as the movement of an object may alter the perception of image resolution. Additionally, we should examine depth perception with stereoscopic multi-resolution images. Devisme, Drobe, Monot, and Droulez examined the difference between disparity and depth perception in a peripheral field, and found that decreasing the resolution may decrease the disparity information in stereoscopic images [18]. Hence, the low-resolution area of a multi-resolution image may influence depth perception.
Our observations suggest that the triple-resolution configuration lets us compress image data without a perceptible loss of image quality. Such compression is convenient for network transmission. The triple-resolution image, which could not be distinguished from the QXGA image, was created from three VGA images. The number of pixels in the overall image was 921,600 (640 × 480 × 3), whereas there were 3,145,728 pixels (2048 × 1536) in the QXGA image. Thus, the amount of data in the former is about 29% of that in the latter.
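The pixel-count comparison above can be checked with a few lines of arithmetic. The sketch below only restates the numbers given in the text; the variable names are hypothetical.

```python
# Resolutions as (width, height) in pixels, taken from the text.
VGA = (640, 480)
QXGA = (2048, 1536)

def pixel_count(resolution):
    width, height = resolution
    return width * height

triple_pixels = 3 * pixel_count(VGA)   # three VGA images per eye
qxga_pixels = pixel_count(QXGA)        # one completely high-resolution image
ratio = triple_pixels / qxga_pixels

print(triple_pixels)      # 921600
print(qxga_pixels)        # 3145728
print(f"{ratio:.1%}")     # 29.3%
```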
Finally, we produced a triple-resolution stereoscopic camera system based on our experimental results (Fig. 8) [19]. Images were obtained for the left and right eyes using three cameras each (Point Grey Research: FLEA-HICOL-CS) to create a stereoscopic view. All of the obtained images contained 640 × 480 pixels (VGA). For each eye, the cameras for the central and middle areas of the superimposed image had varifocal lenses (Omron: 13VM550T), the angles of which were set at 21 and 42 deg, respectively. The optical axes of these cameras were matched using a half mirror. The camera for the peripheral area was fitted with a wide-angle lens (Omron: 13FM28IR), which had an angle of 68 deg. These camera angles were based on the results of the experiments and the settings of the Wheatstone stereoscopic viewer. Observers viewed the stereoscopic triple-resolution images through a Wheatstone stereoscopic viewer. The observers reported that, as long as they did not move their eyes, they did not perceive a decrease in the peripheral resolution of the triple-resolution image. To improve the system, in a future study the cameras will be moved according to the participant's gaze point.
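As a rough check on how the lens angles relate to the target resolutions, the following sketch estimates the angular pixel density of each camera. It assumes that the quoted angles are horizontal fields of view, that each image is 640 pixels wide, and that pixels are spread uniformly over the angle (lens distortion is ignored). None of these assumptions is stated in the text, and the names are hypothetical.

```python
CAMERA_WIDTH_PX = 640  # VGA image width, from the text

# Lens angles in deg, from the text; treated here as horizontal fields of view.
FOV_DEG = {"central": 21.0, "middle": 42.0, "peripheral": 68.0}

def pixels_per_degree(width_px, fov_deg):
    """Average linear pixel density under a uniform-sampling assumption."""
    return width_px / fov_deg

density = {name: pixels_per_degree(CAMERA_WIDTH_PX, fov)
           for name, fov in FOV_DEG.items()}

# Pixel density per unit area relative to the central camera
# (linear ratio squared).
relative_area_density = {name: (d / density["central"]) ** 2
                         for name, d in density.items()}

for name in FOV_DEG:
    print(f"{name}: {density[name]:.1f} px/deg, "
          f"{relative_area_density[name]:.1%} of central density")
```

Under these assumptions the middle camera comes out at 25% of the central pixel density and the peripheral camera at roughly 10%, which is in line with the zone resolutions derived from experiment 1.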

Fig. 3. Mean proportion of "same" answers under a stereoscopic view as a function of the size of the central area and the resolution of the peripheral area. Each line shows the proportion of answers for one central area size. *** indicates a significance probability of less than 0.001 in Dunnett's multiple t-test comparing the proportion for the original image with the proportions for the processed images.

Fig. 4. Mean proportion of "same" answers under a non-stereoscopic view as a function of the size of the central area and the resolution of the peripheral area. Each line shows the proportion of answers for one central area size. *** indicates a significance probability of less than 0.001 in Dunnett's multiple t-test comparing the proportion for the original image with the proportions for the processed images.

Fig. 6. Triple-resolution image and close-up images of the decreased-resolution areas.