A Robust High-dimensional Data Reduction Method

In this paper, we propose a robust high-dimensional data reduction method. The model assumes that the pixel reflectance results from a linear combination of pure component spectra contaminated by additive noise. The abundance parameters appearing in this model satisfy positivity and additivity constraints. These constraints are naturally expressed in a Bayesian framework by using appropriate abundance prior distributions. The posterior distributions of the unknown model parameters are then derived. The proposed algorithm consists of two parts: a Bayesian inductive cognition part and a hierarchical reduction model part. The reduction algorithm based on the Bayesian inductive cognitive model decides which dimensions are advantageous and outputs the recommended dimensions of the hyperspectral image. The algorithm can be interpreted as a robust reduction inference method for a Bayesian inductive cognitive model. Experimental results on high-dimensional data demonstrate useful properties of the proposed reduction algorithm.


INTRODUCTION
The reduction of hyperspectral images has been widely used in remote sensing signal processing for data analysis [1][2][3][4]. Its underlying assumption is that all data sample vectors are mixtures of a number of so-called endmembers assumed to be present in the data. Two models have been investigated in the past to describe how this mixing takes place. One is the macroscopic mixture model, which describes a mixed pixel as a linear mixture of endmembers, as opposed to the model suggested by Hapke, referred to as the intimate mixture model, which describes a mixed pixel as a nonlinear mixture [5][6]. A second question is how to estimate these endmembers once their number is determined. Such statistics are said to be sufficient if they capture all the "relevant information" in the sample about the reduction of the hyperspectral image [7][8].
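The linear (macroscopic) mixture model above can be sketched as follows; the dimensions, endmember matrix, abundance values, and noise level are all hypothetical, chosen only to illustrate the model's structure:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: L spectral bands, R endmembers.
L, R = 50, 3

# Endmember matrix M (L x R): each column is a pure component spectrum.
M = rng.uniform(0.1, 0.9, size=(L, R))

# Abundance vector a: positivity and additivity (sum-to-one) constraints.
a = np.array([0.5, 0.3, 0.2])
assert np.all(a >= 0) and np.isclose(a.sum(), 1.0)

# Observed pixel reflectance: linear mixture plus additive Gaussian noise.
noise = rng.normal(0.0, 0.01, size=L)
y = M @ a + noise
```

The intimate (nonlinear) mixture model replaces the matrix product `M @ a` with a nonlinear function of the endmembers and abundances.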
In this work we propose a way of quantifying this information using information-theoretic notions and show how features which maximize this information can be extracted. Due to its link with the statistical concept of reduction, one view of this setup is as a generalization of the problem of nonlinear regression: there is often more information between the input and response variables than can be captured by a single conditional expectation [9][10][11][12][13]. Our problem then is to find several such functions, or regressions, that together capture more of the information and structure of the variables. As will be shown in this paper, this problem can be cast as finding a dimension reduction of the hyperspectral image which captures the mutual information in a two-way contingency table. It is thus related to a long line of work in statistics. (Manuscript received on 12 November 2008. E-mail: longcunjin@163.com)
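The mutual information of a two-way contingency table, which the proposed reduction seeks to capture, can be computed directly from the table's joint and marginal distributions; a minimal sketch:

```python
import numpy as np

def contingency_mutual_information(counts):
    """Mutual information (in nats) of the joint distribution
    defined by a two-way contingency table of nonnegative counts."""
    p = counts / counts.sum()              # joint distribution p(x, y)
    px = p.sum(axis=1, keepdims=True)      # marginal p(x)
    py = p.sum(axis=0, keepdims=True)      # marginal p(y)
    nz = p > 0                             # skip log(0) terms
    return float(np.sum(p[nz] * np.log(p[nz] / (px @ py)[nz])))

# An independent table carries zero information; a diagonal table
# (perfect association between rows and columns) carries log(2) nats.
table_indep = np.array([[1.0, 1.0], [1.0, 1.0]])
table_diag = np.array([[1.0, 0.0], [0.0, 1.0]])
```

Features maximizing this quantity are those whose discretized values remain maximally informative about the class variable.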
As explained above, the linear reduction model is classically used to model the spectrum of a pixel in the observed scene, and the dimension reduction of hyperspectral images has already received much attention in the literature [3][4][14][15][16][17]. Consequently, estimating the abundances requires a quadratic programming algorithm with linear equalities and inequalities as constraints. Different estimators have been developed using these ideas [18][19][20]. This paper studies a Bayesian inductive cognition estimator which allows one to estimate the abundances in a nonlinear reduction model. The proposed algorithm defines appropriate prior distributions for the unknown signal parameters and estimates these unknown parameters from their posterior distributions.
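The constrained abundance estimation mentioned above (least squares under positivity and sum-to-one constraints, a quadratic program with linear equality and inequality constraints) can be sketched as follows; SciPy's SLSQP solver is used here as a generic stand-in, not the paper's estimator:

```python
import numpy as np
from scipy.optimize import minimize

def estimate_abundances(y, M):
    """Least-squares abundance estimate under positivity and
    sum-to-one constraints: a quadratic program with linear
    equality and inequality constraints."""
    R = M.shape[1]
    objective = lambda a: 0.5 * np.sum((M @ a - y) ** 2)
    result = minimize(
        objective,
        x0=np.full(R, 1.0 / R),           # start from uniform abundances
        method="SLSQP",
        bounds=[(0.0, 1.0)] * R,          # positivity constraint
        constraints={"type": "eq",        # additivity constraint
                     "fun": lambda a: a.sum() - 1.0},
    )
    return result.x

# Recover known abundances from a noiseless synthetic mixture.
rng = np.random.default_rng(1)
M = rng.uniform(0.1, 0.9, size=(30, 3))
a_true = np.array([0.6, 0.3, 0.1])
a_hat = estimate_abundances(M @ a_true, M)
```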
The prior distributions used in the present paper depend on hyperparameters which have to be determined. Estimating the unknown parameters from their posterior distributions requires appropriate simulation methods such as hierarchical Bayesian methods [21][22][23]. It is well known that the dimensionality of the input space strongly affects the performance of many classification methods. This requires the careful design of new algorithms that are able to handle hundreds of spectral bands while minimizing the effects of the "curse of dimensionality". We first illustrate a well-known phenomenon in hyperspectral data. To exploit the fact that certain parts of the spectrum provide a much richer descriptor for classification than other parts, approaches such as straightforward feature selection or a block-based approximation to the covariance matrix can be applied [24][25].
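As a toy illustration of the kind of simulation such hierarchical Bayesian methods rely on, the following Gibbs sampler alternates two conjugate conditional updates for a simple Gaussian model; the model and the hyperparameter values are illustrative, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic observations from N(mu = 2, sigma = 0.5).
y = rng.normal(2.0, 0.5, size=200)
n, ybar = y.size, y.mean()

# Hyperparameters (assumed fixed here; the paper estimates them).
tau2, a0, b0 = 10.0, 2.0, 1.0

mu, sigma2 = 0.0, 1.0
mu_draws = []
for _ in range(2000):
    # mu | sigma2, y : conjugate Gaussian update.
    prec = n / sigma2 + 1.0 / tau2
    mu = rng.normal((n * ybar / sigma2) / prec, np.sqrt(1.0 / prec))
    # sigma2 | mu, y : inverse-gamma update, drawn via a gamma variate.
    shape = a0 + 0.5 * n
    rate = b0 + 0.5 * np.sum((y - mu) ** 2)
    sigma2 = 1.0 / rng.gamma(shape, 1.0 / rate)
    mu_draws.append(mu)

# Posterior mean estimate after discarding burn-in draws.
mu_hat = float(np.mean(mu_draws[500:]))
```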
The remainder of the paper is organized as follows. An overview of the related work is given in Section II. The proposed algorithm is discussed in Section III. Section IV presents the experimental settings and performance evaluation. Section V concludes this paper.

II. RELATED WORK
The natural variability of the material spectra and the noise added by the transmission medium and sensor system make the use of statistical methods necessary for information extraction and pattern recognition on hyperspectral data. Hyperspectral imaging technology has found applications beyond earth remote sensing in agriculture, medicine, biology, pharmaceuticals, forensics, color vision, target detection, archaeology, and many other near-field applications. However, classification of hyperspectral data is primarily performed on a pixel-by-pixel basis with classification accuracies in the range of 79%~84%, and these figures have not changed significantly in the recent decade [4,6,10,[26][27][28]. The scale-space framework introduced by the diffusion equation has also been used for image reduction, in conjunction with level sets to detect movement in image sequences, and for information extraction, image restoration, registration, and classification integrating level sets in a common framework [17][18]. It continuously transforms the original image into a space of progressively smoother images, identified by the scale or level of image smoothing in terms of pixel resolution.
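The scale-space transformation mentioned above can be sketched as explicit Euler steps of the heat (diffusion) equation, each step producing a progressively smoother image; the step size, grid size, and iteration count below are illustrative:

```python
import numpy as np

def diffuse(image, iterations, dt=0.2):
    """Linear scale-space via explicit Euler steps of u_t = laplacian(u),
    with periodic boundaries; larger iteration counts correspond to
    coarser scales (smoother images)."""
    u = image.astype(float).copy()
    for _ in range(iterations):
        # Five-point discrete Laplacian with periodic wrap-around.
        lap = (np.roll(u, 1, 0) + np.roll(u, -1, 0)
               + np.roll(u, 1, 1) + np.roll(u, -1, 1) - 4.0 * u)
        u += dt * lap
    return u

rng = np.random.default_rng(0)
img = rng.normal(size=(32, 32))
smooth = diffuse(img, 50)
```

Diffusion reduces variance while conserving the mean, which is what makes the resulting family of images a consistent multi-scale representation.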
In recent years, hyperspectral data reduction and classification have generally been considered as a processing step for spectral classification, target detection, or segmentation. For each of these problems, different methods exist. Feature extraction methods such as projection pursuit or Bayesian approaches construct a lower-dimensional feature space by transforming the original data to preserve the most discriminative information content, but they change the physical meaning of the components [10,13,15,20]. Spectral reduction methods, as discussed in [4,13], reduce the number of spectra by keeping the representative ones. Such methods ignore the spatial structure of hyperspectral data; the data are only considered as a set of spectra. In [3], the authors use independent component analysis as a band-selection reduction method. The aim in this case is to identify the spectral signatures of the materials, each one having specific characteristics in certain bands. The authors of [24] utilized a Gaussian mixture model to reduce the size of the spectral vector. In our study, we account for the spatial structure of the data through a hierarchical Bayesian model, through which we also account for the noise and thus estimate a typical mean spectrum for each class. The authors in [7] show that the reduction matrix obtained might be far from the true one. The classical hierarchical Bayesian model does not take into account any information about the noise or any prior information on the signal of interest [29][30][31]. In this paper, we propose a Bayesian estimation framework for the sources with a common Bayesian classification variable which is modeled as an inductive cognition field. Each value of this Bayesian variable corresponds to a characteristic mean spectrum of a given region.

III. THE PROPOSED ALGORITHM
Since the Bayesian model with a concentration hyperparameter defines a prior on all partitions of the n_k hyperspectral image points in D_k (the value of this hyperparameter is directly related to the expected number of classes), the prior on the merged hypothesis is the relative mass of all n_k points belonging to one class versus all other partitions of those n_k data points consistent with the tree structure. The proposed algorithm applies an approximate inference approach to the Bayesian inductive cognitive model. This prior can be computed bottom-up as the tree is built, as shown in Fig. 1. The main idea in our Bayesian inductive cognition model is to use the hyperspectral model (3) and the prior distributions (4) to obtain the posterior law. Our main goal is to incorporate the idea of multi-scale hyperspectral data reduction into extended Bayesian inductive cognitive transformations. However, if the reduction patterns do not have regular properties across the hyperspectral data, an adaptive scheme is needed to ensure good experimental results. In order to extend our algorithm and its closing operations to hyperspectral images, let us consider a hyperspectral image f defined on R^N. Given a Bayesian model of minimal size, the extended opening by reconstruction is defined using Eqs. (18) and (20); by duality, we obtain the derivative of the closing Bayesian model. Given all of the above, we obtain the multi-scale opening characteristic and, similarly, the Bayesian model of the multi-scale closing characteristic. The overall architecture of the proposed reduction algorithm is shown in Fig. 2.
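The bottom-up computation of the merged-hypothesis prior can be sketched as follows. The recursion below follows one common formulation of Bayesian hierarchical clustering (Heller and Ghahramani's), which we assume matches the intent here; the concentration value is illustrative:

```python
import math

def merge_prior(alpha, n_k, d_left, d_right):
    """Prior mass pi_k that all n_k points under a tree node belong
    to one class, versus all tree-consistent partitions of those
    points. alpha is the concentration hyperparameter; d_left and
    d_right are the normalizers of the two child subtrees."""
    d_k = alpha * math.gamma(n_k) + d_left * d_right
    pi_k = alpha * math.gamma(n_k) / d_k
    return pi_k, d_k

# Leaves initialize d = alpha; merging two singleton leaves with
# alpha = 1 gives equal mass to "one class" and "two classes".
alpha = 1.0
pi, d = merge_prior(alpha, 2, alpha, alpha)
```

As `alpha` grows, `pi_k` shrinks, reflecting the stated link between the concentration hyperparameter and the expected number of classes.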

IV. EXPERIMENTAL RESULTS
In this section, we present the effectiveness of the dimensionality reduction approach based on the Bayesian model on the hyperspectral image data set. However, since the data are separated into various discontinuous intervals, the dimensionality reduction approach cannot be applied as-is and needs to be adapted for the hyperspectral data. Once adapted, the regular Bayesian reduction algorithm can be applied on each interval. The decomposed hyperspectral data output is composed of the decomposed output of each considered interval, appended to each other. This mechanism is illustrated in Fig. 4. The hyperspectral data interval decomposition can be done during the pre-processing stage.
Such an algorithm is composed of three major phases: pre-processing, processing, and post-processing. The pre-processing phase initializes the problem variables and orders the intervals from the hyperspectral data file. The processing step determines, for each pixel and its interval, the maximum number of Bayesian dimensional reductions which leads to a viable correlation. The granularity of the intervals used during the Bayesian processing can vary. It can match the number of intervals of the hyperspectral data, namely 17 intervals. On the other hand, the granularity can be coarser, matching the three major subintervals. This is possible since the intervals composing those subintervals generate contiguous spectral ranges. However, special care must be taken to preserve spectral contiguousness inside each major subinterval, i.e., reading intervals 1, 2, 3, 4 in this order, according to Table 1, for the first major subinterval. Table 1 shows the level of Bayesian dimension reduction achieved for different correlation values, as well as the resulting number of bands, per major sub-interval. Each sub-interval is identified by its interval components. The total reduction line gives the overall reduction efficiency. Table 1 considers only three sub-intervals, concatenating close intervals to produce contiguous ranges in the frequency domain. On the other hand, Table 2 considers each interval independently. It can be noticed that by breaking the spectrum into smaller domains, a different level of reduction can be achieved for each interval.
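The interval-wise decomposition described above can be sketched as follows; a plain SVD projection stands in for the Bayesian reduction step, and the band counts and interval boundaries are illustrative:

```python
import numpy as np

def reduce_by_intervals(cube, intervals, keep):
    """Reduce each contiguous band interval independently and append
    the results. Each interval's bands are projected onto its `keep`
    leading principal directions (a stand-in for the Bayesian step)."""
    parts = []
    for lo, hi in intervals:
        block = cube[:, lo:hi]                  # pixels x bands in interval
        centered = block - block.mean(axis=0)
        # SVD-based projection as a placeholder per-interval reduction.
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        parts.append(centered @ vt[:keep].T)
    # Appended interval outputs form the decomposed data product.
    return np.concatenate(parts, axis=1)

rng = np.random.default_rng(0)
cube = rng.normal(size=(100, 12))               # 100 pixels, 12 bands
out = reduce_by_intervals(cube, [(0, 4), (4, 8), (8, 12)], keep=2)
```

Keeping the intervals in spectral order, as in the sketch, is what preserves the contiguousness requirement inside each major subinterval.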
In our work, we conduct experiments in hyperspectral data dimensionality reduction in order to demonstrate the feasibility of the proposed algorithm. Our datacube was acquired using an Airborne Visible/Infrared Imaging Spectrometer (AVIRIS). The original scene, with a size of 256 × 256 pixels, was acquired by the AVIRIS sensor over a mixed river/city area in the south of China, early in the growing season. The scene comprises 220 spectral channels used in the experiments. Laboratory spectra, convolved in accordance with AVIRIS wavelength specifications, are used to assess endmember signature purity in this work. At this point, it is important to note that many of the mineral spectra in the USGS library are not from the south-of-China area. Thus, the best match between a hyperspectral image endmember and a spectrum in the USGS library does not necessarily correspond to the true endmember. In addition, some minerals do not occur in pure form in the area, specifically at the 20-m spatial resolution of the sensor. For illustrative purposes, Fig. 3 shows high-dimensional data rendered by computer graphics and a reduction result of hyperspectral data obtained using our proposed reduction based on the Bayesian inductive cognition model. The unknown parameters were estimated by a Bayesian sampling strategy. These posterior distributions provided estimates of the unknown parameters as well as information about their uncertainties, such as standard deviations or confidence intervals. The proposed algorithms were developed depending on whether the endmembers belonging to the mixture are known or belong to a known area. Simulation results conducted on real images illustrated the performance of the proposed reduction algorithm based on the Bayesian inductive cognitive model. The hierarchical reduction methodologies developed in this paper could be modified to handle more complicated models.
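Matching an extracted endmember against USGS library spectra is commonly done with the spectral angle, which is invariant to illumination scaling; a minimal sketch (the library spectra below are toy values, not USGS entries):

```python
import numpy as np

def spectral_angle(s, r):
    """Spectral angle (radians) between a pixel spectrum s and a
    reference library spectrum r; smaller means a better match."""
    cos = np.dot(s, r) / (np.linalg.norm(s) * np.linalg.norm(r))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def best_library_match(s, library):
    """Index of the library spectrum with the smallest angle to s."""
    angles = [spectral_angle(s, r) for r in library]
    return int(np.argmin(angles))

# Toy three-band "library" of reference spectra.
library = np.array([[1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0],
                    [1.0, 1.0, 0.0]])
```

As the text cautions, the smallest angle identifies the best *available* library match, which need not be the true endmember when the material is absent from the library or never occurs in pure form.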
For instance, it would be interesting to extend the proposed algorithm to reduce hyperspectral data composed of homogeneous regions within the hyperspectral image by introducing correlation between dimensions via Bayesian models. Future work will investigate the use of complex hyperspectral data for the selection of the number of reduced bands and of the correlation threshold, as well as for detecting data anomalies. The dimension reduction technique based on the Bayesian model will also be compared to the PCA technique to estimate the efficiency of each on hyperspectral data.
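For the planned PCA comparison, a baseline band reduction can be sketched via the SVD; the data and component count below are illustrative:

```python
import numpy as np

def pca_reduce(pixels, n_components):
    """PCA baseline: project mean-centered spectra onto the leading
    right singular vectors, and report the fraction of spectral
    variance retained by the kept components."""
    centered = pixels - pixels.mean(axis=0)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T
    explained = float(np.sum(s[:n_components] ** 2) / np.sum(s ** 2))
    return scores, explained

# Synthetic low-rank data: 200 pixels, 20 bands, intrinsic rank 3.
rng = np.random.default_rng(0)
pixels = rng.normal(size=(200, 3)) @ rng.normal(size=(3, 20))
scores, explained = pca_reduce(pixels, 3)
```

The explained-variance fraction gives one simple efficiency measure against which the Bayesian reduction can be compared.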