1



Information Processing and Extraction Group

Group Leader: Dr. Miguel Vélez-Reyes

Members: Dr. Sandra Cruz-Pol, Dr. Shawn Hunt, Dr. Hamed Parsiani,

Dr. Bienvenido Vélez, Dr. Manuel Rodríguez, and Dr. Pedro Rivera

[pic]

Figure 7.1.1: Components of the IPEG Research Group.

1 Introduction

NASA’s Office of Earth Science (OES) studies Earth as an interconnected system of atmosphere, oceans, continents, and life in order to understand the total Earth system and the effects of natural and human-induced changes on the global environment. Using higher spectral and spatial resolution imagery and the multi-mode sensing modalities that are (or will be) available from space and sub-orbital platforms, NASA acquires, processes, and makes available very large (gigabyte to terabyte) volumes of remote sensing data and related observations and information to private, public, and governmental entities. This information is used, or has the potential to be used, by scientists to understand and solve major scientific questions, and by practitioners and policy makers to solve practical societal problems and establish sound policy. To reach the full potential of the available information, we need to improve our ability to extract and manage the information about our environment provided by remote sensing data and related observations.

Signal processing is a broad field that encompasses the acquisition of data from the world around us, the manipulation or processing of that data into a useful form, the extraction of information from that data, and the interpretation of that information. The breadth and power of signal processing is what makes it one of the key enabling technologies of the information age. Our group’s research focuses on the development of signal processing algorithms to extract information from remotely sensed and related data. Our approach is based on the development and combination of physical and statistical models for both observations and prior knowledge, and the subsequent use of these models for optimal or near-optimal processing.

Figure 7.1.1 shows the different projects under IPEG. A detailed description of the research work is given in Section 7.2. Sections 7.3-7.13 give detailed information about outcomes such as publications, students, and collaborative efforts. Table 7.1.1 gives a summary of important IPEG outcome statistics.

Table 7.1.1 IPEG Statistical Data.

|# Investigators (Total # faculty and research associates) |8 |

|# Undergrad students supported (total) |3 |

|# Undergrad students supported (US Citizen, UMD only) |3 |

|# Undergraduate degrees awarded (US Citizen, UMD only) |1 |

|# Masters students supported |3 |

|# Masters students supported (U.S. Citizen, UMD only) |3 |

|# Doctoral Students Supported |3 |

|# Doctoral Students Supported (U.S. Citizen, UMD only) | |

|# Grad students supported (total) |6 |

|# Grad students supported (U.S. Citizen, UMD only) |4 |

|# Masters degrees awarded (US Citizen, UMD only) |3 |

|# Doctoral degrees awarded (U.S. Citizen, UMD only) | |

|# Post-docs supported (total) |1 |

|# Post-docs supported (U.S. Citizen, UMD only) | |

|# Refereed Papers and Book Chapters Accepted/Published |3 |

|# of student authors or co-authors | |

|# Refereed Papers and Book Chapters submitted but not yet accepted or published | |

|# of student authors or co-authors | |

|# Presentations at NASA Installations | |

|# Presentations national/international research conferences |19 |

|# given by students |9 |

|# Presentations at Faculty seminars | |

|# Presentations to K-12 Schools | |

|# K-12 Students Contacted | |

|# K-12 Educators Contacted | |

|# Patents applied/awarded | |

|# NASA MUREP Panels served on (peer review, advisory, etc.) | |

|# Other NASA Panels served on (peer review, advisory, etc.) |1 |

|Total research $ awarded by NASA other than URC | |

|Total research $ awarded by other Agencies |$625k |

|Infrastructure $ leveraged from NASA sources other than URC | |

|Infrastructure $ leveraged from other Agencies during 1998 as a direct result from URC | |

| | |

|UMD = Underrepresented Minority and Disabled | |

2 Research Projects

1 Hyperspectral Image Processing: Miguel Vélez-Reyes and Vidya Manian

1 Problem Statement

The main objective of this project is the development of information extraction techniques based on hyperspectral imagery (HSI) for environmental applications. In hyperspectral imaging, or imaging spectroscopy, high-spectral-resolution and spatial information about the scene under study is collected. Because the object of interest is embedded in a translucent medium (e.g., the atmosphere or coastal waters), the measured spectral signature is a distorted version of the original object signature mixed with clutter. The high spectral resolution can allow discrimination between the medium and object contributions, enabling the retrieval of information about the object of interest. Our approach is based on integrating physical models and statistical methodologies where appropriate. Algorithm validation is done using simulated data, real data collected using laboratory setups, and data available from existing AVIRIS and HYPERION imagery. Validated algorithms are incorporated into the Hyperspectral Image Analysis Toolbox being developed at LARSIP.

The work on this project can be divided into two components:

• Object Recognition

• Estimation Problems

Object recognition has been the prime focus of our work for many years. Mature classification, feature extraction, and training algorithms are being incorporated into a hyperspectral image processing toolbox being developed in the MATLAB environment. Work on estimation problems has focused on developing algorithms for unmixing of hyperspectral images.

2 Relevance to NASA

Hyperspectral sensors are being used by NASA scientists for different applications such as land cover classification and mineralogy on earth and planetary[1] surfaces. The proposed research in information extraction algorithms for hyperspectral imagery can provide NASA scientists with the tools needed to study important terrestrial and extraterrestrial planetary processes.

3 Goals of the Project

• To develop algorithms that incorporate spectral and spatial information for hyperspectral image classification.

• To develop algorithms for subpixel object classification using unsupervised unmixing approaches.

4 Component Accomplishments

• Implemented the constrained positive matrix factorization to solve the unsupervised unmixing problem. Experimental results show that it can retrieve endmembers with physical significance, together with their abundances, and that it achieves a better fit than the state-of-the-art two-stage approach.

• Implemented texture features using wavelets for hyperspectral image classification.

• A new version of the MATLAB hyperspectral toolbox, including abundance estimation algorithms, has been released. An executable version was developed that does not require access to a MATLAB license.

5 Technical Summary

The main objective of this project is the development of techniques for information extraction from hyperspectral remote sensing. Research sponsored under the NASA grant focused on three areas this year:

• Integration of spatial and spectral information for hyperspectral image classification

• Spectral unmixing

• Hyperspectral image analysis toolbox development

1 Integration of Spectral and Spatial Information in Hyperspectral Classification

Our main objective is the development of tools for detection and classification in hyperspectral imagery that can deal with the high dimensionality of the image and take full advantage of the information in the spectral and spatial domains. Work this year focused primarily on the integration of spectral and spatial information in hyperspectral image classification.

Introduction

Hyperspectral images are characterized by large amounts of data spread over several spectral bands. Traditionally, these images are classified using spectral information alone [1]. Several works report the performance of spatial methods in classifying these images. Gaussian Markov random fields have been used to model texture information in these images [2]. Filter banks have been used in [3] and [4] for recognition of regions. In this work we present a set of statistical and multiresolution features for improving hyperspectral image classification.

Spatial Texture Features

Let [pic] represent a hyperspectral image, where N1, N2, and B are the number of rows, columns, and bands, respectively. The spatial texture features are a set of six statistical and multiresolution features that capture the gradient, directional variations, and residual energies of texture regions in a spectral band. The texture features include the average (f1), the standard deviation (f2), the average deviation of the gradient magnitude (f3), the average residual energy (f4), and the average deviations of the horizontal (f5) and vertical (f6) directional residuals of pixel intensities, given by

[pic], [pic],

[pic],[pic],[pic],

and

[pic],

respectively ([pic] is the mean value of the spectral texture). These features capture the edge information of the texture and are, therefore, useful in characterizing its coarseness and randomness.

The second set includes multiresolution features computed from wavelet decomposition of texture regions. A 2-level wavelet transform of the region is constructed by convolving it with a wavelet filter of length n. This produces 4 sub-images: the approximate, vertical, diagonal and horizontal detail images. The approximate sub-images are further decomposed in a similar manner; specifically, the sub-images at level r are half the size of the sub-images at level r-1. In this work, wavelet features are the wavelet coefficients of the sub-images.

Hyperspectral Image Classification Algorithm

Given a set of training samples for the classes present in the image, the algorithm partitions it into two sets: a training set and a testing set. Spectral features are the pixel values in each band. Texture features as described above are extracted for each of the samples. A texture class is represented by the mean feature vector of the training samples of that class. A test sample is assigned to a particular class using the minimum distance obtained from the distance metric [pic], where fp is the feature vector, [pic] is the feature vector for texture class i, [pic] is the feature vector for texture class j, and P is the number of features. σ(fp) is the standard deviation of the features over the texture classes.
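
As a simple illustration of this minimum-distance rule, the sketch below (Python/NumPy) assigns a test feature vector to the class with the nearest mean feature vector. Since the exact distance metric is not reproduced above, the sketch assumes a variance-normalized Euclidean distance; the function names and the normalization are illustrative assumptions, not the implementation used in this work.

import numpy as np

def train_class_means(features, labels):
    # features: (n_samples, P) texture feature vectors; labels: (n_samples,) class labels
    classes = np.unique(labels)
    means = np.array([features[labels == c].mean(axis=0) for c in classes])
    sigma = means.std(axis=0) + 1e-12  # spread of each feature over the texture classes
    return classes, means, sigma

def classify_min_distance(fp, classes, means, sigma):
    # assumed variance-normalized Euclidean distance to each class mean feature vector
    d = np.sqrt(np.sum(((fp - means) / sigma) ** 2, axis=1))
    return classes[np.argmin(d)]

# toy usage with random six-feature vectors and two texture classes
X = np.random.rand(20, 6)
y = np.repeat([0, 1], 10)
cls, mu, sg = train_class_means(X, y)
print(classify_min_distance(X[0], cls, mu, sg))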

Classification Results and Discussion

We have implemented the wavelet texture feature extraction method for reducing the number of bands in the hyperspectral image, followed by Support Vector Machine (SVM) classification based on the reduced set of features. The SVM algorithm was originally developed for detecting objects/targets in a scene and can also be applied to automatic target detection (ATD) in hyperspectral images. Extensions of the algorithm to hyperspectral images based on pixel spectra are presented in [5]. In [6], the number of bands for the Indian Pines image was reduced to 9. Ninety training samples from each of the 7 classes are used to first compute the wavelet features. The wavelet filtering is done up to the maximum number of decomposition levels, J1 = log2(200) = 7.6 ≈ 8. Hence, there are 9 features per sample (8 energy features from the 8 levels of detail coefficients and 1 feature from the approximation coefficients). The parameter d (degree of the polynomial kernel) is tuned between 1 and 15 and C (the regularization parameter) between 10^0 and 10^6. In this experiment, the polynomial degree is d = 14, and the SVM classifier is trained with the mapped features and regularization parameter C = 15,000. Fig. 7.2.1.1 shows the image of size 145 x 145 with the 7 classes. The Airborne Visible Infrared Imaging Spectrometer (AVIRIS) hyperspectral data acquired over the Indian Pines test site in northwestern Indiana in 1992 is used. The image originally has 220 spectral bands; after excluding the water-absorption and noise bands, 200 bands are retained. The number of testing samples per class is 200. Table 7.2.1.1 shows the confusion matrix for the wavelet-SVM method, where each entry Xii (ith row and ith column) is the percentage of samples from class i classified correctly as class i and Xij is the percentage of samples from class i classified incorrectly as class j. The producer’s accuracy in the last column is 100% minus the percent omission error. The last row of the table gives the user’s accuracy, which is equal to 100% minus the commission error. The overall accuracy (OA) is 94.29%. For comparison, the SVM classifier is also trained with the 200 spectral features per class. Table 7.2.1.2 gives the confusion matrix for this experiment. It can be seen that the wavelet-SVM method improves accuracy by about 2.5%.
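
The sketch below illustrates the wavelet-SVM idea described above under stated assumptions: per-level energies of a 1-D wavelet decomposition of each pixel spectrum are used as features (on the order of the 9 features per sample mentioned in the text), and a polynomial-kernel SVM with the reported d = 14 and C = 15,000 is trained on them. The Haar wavelet, the synthetic stand-in data, and the helper names are assumptions; PyWavelets and scikit-learn are assumed to be available.

import numpy as np
import pywt
from sklearn.svm import SVC

def wavelet_energy_features(spectrum, wavelet="haar"):
    # decompose the pixel spectrum up to the maximum number of levels (about log2 of the band count)
    level = pywt.dwt_max_level(len(spectrum), wavelet)
    coeffs = pywt.wavedec(spectrum, wavelet, level=level)
    # one energy feature per coefficient set: the approximation plus each detail level
    return np.array([np.sum(c ** 2) for c in coeffs])

# synthetic stand-in for labeled pixel spectra (200 bands, two classes)
rng = np.random.default_rng(0)
X = rng.random((180, 200)) + np.repeat([0.0, 0.5], 90)[:, None]
y = np.repeat([0, 1], 90)
F = np.array([wavelet_energy_features(s) for s in X])
# polynomial-kernel SVM with the degree and regularization reported above
clf = SVC(kernel="poly", degree=14, C=15000).fit(F, y)
print("training accuracy:", clf.score(F, y))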

[pic]

Fig. 7.2.1.1. Subset of AVIRIS hyperspectral image of Indian Pine with ground truth classes.

Table 7.2.1.1. Confusion matrix for Indian Pines image using wavelet-SVM method.

|Classes |Soy notill |Soy min |

|Seagrass |88 |88 |

|Sand |82.05 |92.31 |

|Coral reef |98.08 |86.08 |

|Mangrove |100 |100 |

|Water |81.5 |100 |

|OA |84.73 |93.28 |

2 Spectral Unmixing

Linear Mixing Model

Analytical models for the spectral mixing of materials provide the basis for the development of techniques to inversely recover estimates of the constituents and their proportions from mixed pixels. In the typical mixing process, the surface is portrayed as a checkerboard mixture. The incident light interacts with the surface, and the received light conveys the characteristics of each material in a proportion determined by the area it covers and its reflectivity. The typical model for such an interaction is given by [7, 8]

x = Σ_{i=1}^{p} a_i s_i + w = Sa + w        (7.2.1.1)

where x is the pixel of interest, s_i is the spectral signature of the i-th endmember, a_i is the corresponding fractional abundance, w is the measurement noise, m is the number of spectral channels, and p is the number of endmembers. The matrix S = [s_1 s_2 … s_p] is the matrix of endmembers and a = [a_1 a_2 … a_p]^T is the vector of spectral abundances. Notice that all elements of S, a, and x are constrained to be non-negative, and Σ_{i=1}^{p} a_i = 1. Typically m > p, so we are dealing with an overdetermined linear system of equations. In the literature this is called the linear mixing model (LMM) [7, 8].

Unmixing Algorithm

For the entire image, the linear mixing model given in Equation (7.2.1.1) above can be written in matrix form as

X = SA + W

where X = [x_1 x_2 … x_N] is the matrix of image pixels, S is the endmember matrix, A = [a_1 a_2 … a_N] is the abundance matrix, W is the noise matrix, and N is the number of pixels in the image.

Full spectral unmixing can be formulated as the procedure by which we factor a given image X into the product of the endmember matrix S and the abundance matrix A. As mentioned earlier, most published methods for unmixing solve the problem in two stages. In the first stage, the endmembers are determined by one of several methods, such as PPI and N-FINDR [7, 8]. In the second stage, the abundances are estimated by solving a linear least squares problem under different constraints [9, 10, 12, 11, 13]. We call this approach supervised unmixing.
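
As an illustration of the abundance-estimation stage, the sketch below solves a non-negative, approximately sum-to-one least squares problem for a single pixel using SciPy's nnls together with the common trick of appending a heavily weighted row of ones that encodes the sum-to-one constraint. The weight delta and the synthetic endmembers are illustrative assumptions; this is not the specific algorithm of [9]-[13].

import numpy as np
from scipy.optimize import nnls

def abundances_sum_to_one(S, x, delta=1e3):
    # S: (m, p) endmember matrix; x: (m,) pixel spectrum.
    # A heavily weighted extra row of ones softly enforces sum(a) = 1.
    S_aug = np.vstack([S, delta * np.ones((1, S.shape[1]))])
    x_aug = np.append(x, delta)
    a, _ = nnls(S_aug, x_aug)
    return a

# toy check: 3 endmembers, 50 bands, true abundances [0.2, 0.5, 0.3]
rng = np.random.default_rng(1)
S = rng.random((50, 3))
x = S @ np.array([0.2, 0.5, 0.3]) + 0.001 * rng.standard_normal(50)
print(abundances_sum_to_one(S, x))  # approximately [0.2, 0.5, 0.3]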

Our research work is focused on unsupervised unmixing, where endmembers and abundances are estimated jointly with minimal intervention from the image analyst. A potential mathematical formulation for joint endmember and abundance estimation is the following optimization problem

min_{S ≥ 0, A ≥ 0} ||X − SA||_F^2   subject to   A^T 1_p = 1_N        (7.2.1.2)

where p is the number of endmembers, S is the matrix of endmember signatures, A is the matrix of abundances, || · ||_F is the Frobenius norm (other norms or seminorms are possible), and 1_r is an r-dimensional vector of ones. As shown in [14, 15, 16, 17], this problem is a special case of the positive matrix factorization (PMF) problem [18, 19, 20] with the additional constraint that A^T 1_p = 1_N. We call this the constrained PMF, or cPMF [19, 15, 16].

Last year we studied solving (7.2.1.2) using a penalty-type method to enforce the sum-to-one constraint, minimizing

J(S, A) = ||X − SA||_F^2 + λ P(A),   S ≥ 0, A ≥ 0,

where P(A) penalizes violations of the sum-to-one constraint A^T 1_p = 1_N and λ ≥ 0 is a positive constant. The number of endmembers p was fixed a priori.

Another potential algorithm for (7.2.1.2) can be derived by noticing that (7.2.1.2) is multilinear in S and A. The multilinear structure of (7.2.1.2) makes the Gauss-Seidel, also called alternating least squares, algorithm of interest. This method solves the optimization problem (7.2.1.2) by alternating between two steps. In the first step, the endmember matrix S is fixed and the corresponding abundance matrix is estimated by solving a non-negative, sum-to-one constrained linear least squares problem [10]. In the second step, the abundance matrix A is fixed to its current estimate and the endmember matrix estimate is updated by solving a non-negative linear least squares problem. The basic structure of the algorithm is a Gauss-Seidel, or coordinate descent, optimization method given by

A^{k+1} = arg min_{A ≥ 0, A^T 1_p = 1_N} ||X − S^k A||_F^2        (7.2.1.3)

S^{k+1} = arg min_{S ≥ 0} ||X − S A^{k+1}||_F^2        (7.2.1.4)

This is repeated until convergence is achieved. The time complexity of this algorithm is O(MNP) per iteration, where M is the number of spectral bands, N is the number of pixels, and P is the number of endmembers. This is a coordinate descent optimization algorithm and convergence to a local minimum is guaranteed. In addition to looking at different algorithms, we studied the effect of different initialization schemes: using randomly selected pixels from the image, using pixels with the largest norm, and using endmembers obtained with standard approaches such as PPI and N-FINDR. Different stopping criteria for the iterative methods were also studied. Because evaluating the objective ||X − SA||_F is computationally expensive, simple stopping criteria are used, i.e., a maximum number of iterations, a specific error value, or the difference between the estimated parameter values in two successive iterations falling below a specified threshold. Both ||S^{k+1} − S^k|| and ||A^{k+1} − A^k|| are required to be smaller than a specified value. This criterion works well, as can be seen later in the experimental results. Some convergence criteria are given in [21].
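
A compact sketch of the alternating (Gauss-Seidel) scheme above is shown below, reusing the augmented-row trick from the previous sketch for the constrained A-step and plain non-negative least squares for the S-step. The initialization from randomly chosen pixels, the tolerance, and the iteration cap are illustrative choices, not the settings used in the experiments reported here.

import numpy as np
from scipy.optimize import nnls

def cpmf_als(X, p, n_iter=50, tol=1e-4, delta=1e3, seed=0):
    # X: (m, N) image matrix with pixels as columns; p: assumed number of endmembers
    rng = np.random.default_rng(seed)
    m, N = X.shape
    S = X[:, rng.choice(N, size=p, replace=False)].astype(float)  # initialize S with random pixels
    A = np.full((p, N), 1.0 / p)
    prev_err = np.inf
    err = prev_err
    for _ in range(n_iter):
        # A-step: non-negative, sum-to-one constrained least squares, one pixel at a time
        S_aug = np.vstack([S, delta * np.ones((1, p))])
        for j in range(N):
            A[:, j], _ = nnls(S_aug, np.append(X[:, j], delta))
        # S-step: non-negative least squares, one spectral band at a time
        for i in range(m):
            S[i, :], _ = nnls(A.T, X[i, :])
        err = np.linalg.norm(X - S @ A, "fro")
        if abs(prev_err - err) < tol:  # simple stopping criterion on the change in the objective
            break
        prev_err = err
    return S, A, err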

Table 7.2.1.4: Results obtained by the Gauss-Seidel and penalty approaches.

|Data Initialized(E0=309.4084) |PPI Initialized(E0=1576.9) |

| |# iterations |Time (sec) |Error Value |# iterations |Time (sec) |Error Value |

|Gauss-Seidel |173 |213.1 |1.584 |161 |191.32 |1.6109 |

|Penalty |552 |74.107 |1.8251 |557 |105.35 |1.8949 |

[pic]

(a)

[pic]

(b)

Figure 7.2.1.3: Images of the Enrique reef study area: (a) IKONOS (b) HYPERION.

Experimental Results

A small subset (29x49x90) of the Hyperion Enrique Reef image shown in Figure 7.2.1.3 was used in the experiment. Figure 7.2.1.3(a) shows a 1 m resolution image of the area taken with the IKONOS multispectral sensor, and Figure 7.2.1.3(b) shows another image of the same area taken with Hyperion, a 30 m spatial resolution hyperspectral sensor. The spatial degradation caused by the low spatial resolution is clearly seen. The 1 m resolution image is also used to qualitatively evaluate the results of the unmixing process. Five endmembers were identified in the image: sand, seagrass, reef head, mangrove, and deep water. Figure 7.2.1.4 shows the spectra of the endmembers extracted using the Pixel Purity Index (PPI) method in ENVI and Figure 7.2.1.5 shows the five endmembers extracted using N-FINDR [22].

The cPMF was computed using the two approaches described previously. The abundance matrix was initialized randomly using a Gaussian distribution with zero mean and unit variance. The randomly generated matrix is normalized to force the sum-to-one constraint on its columns. The endmember matrix was initialized either using pixels from the measured data or using the endmembers selected by PPI. Figure 7.2.1.6 and Figure 7.2.1.7 show the estimated spectra using the penalty approach. Figure 7.2.1.8 and Figure 7.2.1.9 show the estimated spectra using the Gauss-Seidel approach. The endmember estimates using cPMF (both methods) are different from those retrieved using N-FINDR and PPI. The mangroves have the most distinctive spectral signature and all approaches obtain very similar estimates. The general features of the seagrass signature are well retrieved, but there is variability across estimates. PPI, cPMF using the penalty approach, and cPMF using Gauss-Seidel retrieve similar features for the reef crest. The estimates for sand and water are quite variable across the four estimators, although a certain level of agreement in general features can be observed between the two cPMF approaches.

Another way to evaluate the performance of the estimators is to look at how well the estimated endmembers and abundances can reconstruct the original pixels in the image. For this comparison, the R2 value for each pixel was computed using

R2 = 1 − Σ_i (x_i − x̂_i)^2 / Σ_i (x_i − x̄)^2,

where x_i and x̂_i are the measured and reconstructed values in spectral band i and x̄ is the mean of the measured pixel spectrum, as a measure of model fit. The closer R2 is to one, the better the fit. Figure 7.2.1.10 shows the R2 values for the Gauss-Seidel approach. Figure 7.2.1.11 shows the R2 values for the penalty approach. Figure 7.2.1.12 shows the R2 values when using the endmembers obtained by PPI and N-FINDR. Clearly, both cPMF approaches retrieved models that fit the data better than either PPI or N-FINDR.
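
The per-pixel R2 maps shown in Figures 7.2.1.10-7.2.1.12 can be computed directly from the estimated factors, as in the short sketch below (a straightforward coefficient of determination over the spectral bands of each pixel; the variable names are illustrative).

import numpy as np

def r2_per_pixel(X, S, A):
    # X: (m, N) measured pixels; S @ A: reconstruction from the estimated factors
    Xhat = S @ A
    ss_res = np.sum((X - Xhat) ** 2, axis=0)
    ss_tot = np.sum((X - X.mean(axis=0, keepdims=True)) ** 2, axis=0)
    return 1.0 - ss_res / ss_tot  # one R2 value per pixel; reshape to the scene size to display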

Figure 7.2.1.13 shows the value of the Frobenius norm at each iteration and Table 7.2.1.4 shows a summary of timing and iteration results for the Gauss-Seidel and penalty approaches to cPMF. The results in Table 7.2.1.4 show that the Gauss-Seidel algorithm needs fewer iterations to converge, but each iteration is more costly, resulting in higher CPU time, while the penalty method needed more, but computationally cheaper, iterations, resulting in lower CPU time despite the larger iteration count.

[pic]

Figure 7.2.1.4: Estimated spectra of five endmembers extracted using PPI

[pic]

Figure 7.2.1.5: Estimated spectra for five endmembers using N-FINDR

[pic]

Figure 7.2.1.6: Estimated spectra for five endmembers using penalty approach (data initialized).

[pic]

Figure 7.2.1.7: Estimated spectra for five endmembers using penalty approach (PPI initialized).

[pic]

Figure 7.2.1.8: Estimated spectra for five endmembers using Gauss-Seidel approach (data initialized).

[pic]

Figure 7.2.1.9: Estimated spectra for five endmembers using Gauss-Seidel approach (PPI initialized).

[pic] [pic]

(a) (b)

Figure 7.2.1.10: R2 value for Gauss-Seidel approach (a) data initialized (b) PPI initialized.

[pic] [pic]

(a) (b)

Figure 7.2.1.11: R2 value for penalty approach (a) data initialized (b) PPI initialized.

[pic] [pic]

(a) (b)

Figure 7.2.1.12: R2 value for estimation error using (a) PPI (b) N-FINDR estimated endmembers.

[pic] [pic]

(a) (b)

Figure 7.2.1.13: Squared error for the penalty and Gauss-Seidel approaches (a) data initialized (b) PPI initialized.

6 Hyperspectral Image Analysis Toolbox

The Hyperspectral Image Analysis Toolbox is an image analysis package written in MATLAB [23, 24]. It is intended for the analysis of multispectral and hyperspectral images. A new version of the toolbox is now available. Since 2005, we have refined the GUI and added abundance estimation algorithms. Figure 7.2.1.14 shows a functional block diagram of the toolbox and Figure 7.2.1.15 shows an example of the toolbox GUI and help. The toolbox can be downloaded from . More than 500 downloads have been recorded from industry, academia, and government, with applications ranging from remote sensing and biomedical imaging to food processing.

[pic]

Figure 7.2.1.14: Toolbox functional block diagram.

[pic][pic]

Figure 7.2.1.15: Hyperspectral Image Analysis Toolbox.

7 Future Work

• Continue work on algorithms for integration of spectral and spatial information in hyperspectral image classification.

• Continue work in unsupervised unmixing algorithms. Focus on validation and fast implementation.

• Integrate developed algorithms in the Hyperspectral Image Analysis Toolbox.

8 Technology Transfers

• A new version of the Hyperspectral Image Analysis MATLABTM Toolbox has been released to the community and described in several publications [23, 24]. Over 500 downloads have been recorded.

9 References

1] D. A. Landgrebe, Signal theory methods in multispectral remote sensing, Wiley, NJ, 2003.

2] G. Rellier, X. Descombes, F. Falzon and J. Zerubia, “Texture feature analysis using a Gauss-Markov model in hyperspectral image classification,” IEEE Trans. GRSS, Vol. 42, No. 7, pp. 1543-1551, July 2004.

3] P. S. Hong, L. M. Kaplan and M. J. T. Smith, “Hyperspectral image segmentation using filter banks for texture augmentation,” Proceedings. IEEE Workshop on Advances in Techniques for Analysis of Remotely Sensed Data, pp. 254-258, 2004.

4] M. Shi and G. Healey, “Hyperspectral texture recognition using a multiscale opponent representation,” IEEE Trans. GRSS, Vol. 41, No. 5, pp. 1090-1095, May 2003.

5] J.A. Gualtieri, “Hyperspectral analysis, the support vector machine, and land and benthic habitats,” Proceedings of IEEE workshop on Advances in techniques for analysis of remotely sensed data, pp. 354-363, Oct 2003.

6] V. Manian and M. Velez-Reyes, “Support vector classification of land cover and benthic habitat from hyperspectral images,” presented at Intl. Conf. ISSSR 2006, Bar Harbor, Maine.

7] N. Keshava and J.F. Mustard, “Spectral Unmixing.” In IEEE Signal Processing Magazine, pp. 44-57, January 2002.

8] N. Keshava and J. F. Mustard, “A survey of spectral unmixing algorithms.” In Lincoln Laboratory Journal, Vol. 14, No. 1, pp. 55-78, 2003.

9] M. Velez-Reyes, S. Rosario, A. Puetz, R. B. Lockwood, “Iterative algorithms for unmixing of hyperspectral imagery.” In Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery IX, Proceedings of SPIE Vol. 5093, April 2003.

10] C.L. Lawson and R.J. Hanson, Solving Least Squares Problems, Prentice-Hall, 1974.

11] M. Vélez-Reyes and S. Rosario, “Solving Abundance Estimation in Hyperspectral Unmixing as a Least Distance Problem.” In Proceedings of the International Geoscience and Remote Sensing Symposium, September 2004.

12] S. Rosario-Torres and M. Vélez-Reyes, “An algorithm for fully constrained abundance estimation in hyperspectral unmixing.” In Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery XI, Proceedings of SPIE Vol. 5806, April 2005.

13] C. I. Chang, H. Ren, et al., “Subpixel Target Size Estimation for Remotely Sensed Imagery.” In Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery IX, Proceedings of SPIE Vol. 5093, April 2003.

14] Masalmah, Y.M., M. Velez-Reyes, and S. Rosario-Torres, “An algorithm for unsupervised unmixing of hyperspectral imagery using positive matrix factorization.” In Proceeding of SPIE Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery XI, Vol. 5806, April 2005.

15] Y.M. Masalmah and M. Vélez-Reyes, “Unsupervised unmixing of hyperspectral imagery using the constrained positive matrix factorization.” In Proceedings of SPIE: Independent Component Analyses, Wavelets, Unsupervised Smart Sensors, and Neural Networks IV, Vol. 6247, April 2006.

16] Y.M. Masalmah and M. Vélez-Reyes, “A comparison of algorithms to compute the positive matrix factorization and their application to unsupervised unmixing.” In Proceedings of SPIE: Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery XII, Vol. 6233, May 2006.

17] Y.M. Masalmah and M. Vélez-Reyes, “Unsupervised Unmixing of Hyperspectral Imagery.” In Proceedings of the IEEE Midwest Symposium on Circuits and Systems, San Juan, Puerto Rico, 2006.

18] Lee, D. D. and H. S. Seung, “Learning the parts of objects by non-negative matrix factorization.” In Nature, Vol. 401, pp. 788-791, 1999.

19] Lee, Daniel D. and H. Sebastian Seung, “Algorithms for non-negative matrix factorization.” In Advances in Neural Information Processing, pp. 556-562, 2001.

20] Lee, D. D. and H. S. Seung, “Unsupervised Learning by Convex and Conic Coding.” In Proceedings of the Conference on Neural Information Processing Systems 9, pp. 515-521, 1997.

21] Chih-Jen Lin, “Projected Gradient Methods for Non-negative Matrix Factorization,” Information and Support Services Technical Report ISSTECH-95-013, Department of Computer Science, National Taiwan University, 2005.

22] Winter, M. E., “Fast autonomous spectral end-member determination in hyperspectral data.” In Proceedings of the Thirteenth International Conference on Applied Geologic Remote Sensing, Vol. II, pp. 337-344, Vancouver, B.C., Canada, 1999.

23] E. Arzuaga-Cruz, L.O. Jimenez-Rodriguez, M. Velez-Reyes, D. Kaeli, E. Rodriguez-Diaz, H.T. Velazquez-Santana, A. Castrodad-Carrau, L.E. Santos-Campis, and C. Santiago. “A MATLAB Toolbox for Hyperspectral Image Analysis.” In Proceedings IEEE International Geosciences and Remote Sensing Symposium, Alaska, September 2004.

24] S. Rosario-Torres, E. Arzuaga-Cruz, M. Velez-Reyes and L.O. Jiménez-Rodríguez, “An Update on the MATLAB Hyperspectral Image Analysis Toolbox.” In Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery XI, Proceedings of SPIE Vol. 5806, April 2005.

2 Digital Signal Quantization by Dr. Shawn Hunt

Most signals today are converted into digital form because of the ease of transmission, manipulation, and storage. The process of conversion from analog to digital, or from a high-resolution digital signal to a lower-resolution one, usually adds quantization noise. This research investigated methods of determining whether dither is needed when re-quantizing, that is, lowering the resolution of, a digital signal. As digital signals have become common, re-quantization of these signals has come into widespread use. Re-quantization lowers the resolution of a digital signal in the digital domain, for example from 24-bit to 16-bit resolution, or from 16 bits to 8 bits. Truncation typically introduces undesirable artifacts, so dither is generally used. This dither signal is uncorrelated noise that decreases the undesirable artifacts, but it also decreases the signal-to-noise ratio of the re-quantized signal. Two methods are used to determine whether dither is needed: the independence of the quantization error, and the mutual information between the original and re-quantized signals. Simulations were run using 24-bit synthetic and real audio signals. Results indicate that the 24-bit signals can be truncated to 16 bits with no loss of information in most cases.

1 Relevance to NASA

All sensor signals are digitized before processing or transmission. In order to reduce the transmission or storage requirements, the minimum number of bits to accurately represent the signal must be used. Having algorithms that reduce the number of bits used by improving the digital representation means that the signals will have smaller transmission or storage requirements.

2 Goals of the Component

• Determining whether quantization will introduce unwanted harmonics to the signal.

3 Component Accomplishments

• C code has been developed to implement the various algorithms. This code can determine whether adding dither to the signal is necessary or will simply increase the noise level.

4 Technical Summary

Digital signals have become widespread and are favored in many applications because of their noise immunity, which is a result of their being discrete in both time and amplitude. Once the signal has been discretized, it can be stored or transmitted without additional noise being added. There are many applications, however, where the discretization in time or amplitude needs to be changed in the discrete domain. For example, if two digital systems operate at different sampling frequencies, a multirate system is used for the sampling rate conversion. The amplitude discretization can also be changed; going from a 16-bit to an 8-bit representation, for example, would reduce the memory required to store a signal. The process of lowering the amplitude resolution of a signal is called quantization. This research deals with lowering the amplitude quantization of digital signals.

It is the quantization scheme that determines the amplitude resolution of the digital signal. The model for the lowered resolution signal is:

xq[n] = x[n] + q[n],

where xq is the lower-resolution quantized signal, x the original signal, and q the quantization noise. The simplest method of lowering the resolution is truncation, where the least significant bits are discarded. This can, however, introduce various undesirable artifacts, namely additional harmonics related to the signal being quantized [1, 2]. This occurs when the quantization error is not independent of the signal x and so has harmonic content that changes with x. To avoid this, a dither signal is generally added to the signal before quantizing. The purpose of the dither signal is to ensure that the quantization error, the difference between the signal before and after quantizing, is uncorrelated with the signal being quantized. Ideally, the quantization error is independent of the signal being quantized, but it is generally accepted that having the first and second moments uncorrelated is sufficient.
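
The sketch below illustrates the two re-quantization options discussed here for integer-valued signals: plain truncation of the lower bits, and truncation after adding a dither signal. The triangular-PDF dither spanning roughly one new quantization step on either side is used only as an assumption for illustration; the study itself does not prescribe a particular dither.

import numpy as np

def truncate(x, bits_removed):
    # drop the lower bits of an integer signal (24-bit -> 16-bit uses bits_removed = 8)
    return (x >> bits_removed) << bits_removed

def dither_and_truncate(x, bits_removed, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    step = 1 << bits_removed
    # assumed triangular-PDF dither spanning roughly +/- one new quantization step
    d = rng.integers(0, step, x.shape) - rng.integers(0, step, x.shape)
    return truncate(x + d, bits_removed)

# quantization error q[n] = xq[n] - x[n], used by the whiteness and mutual-information tests below
x = np.random.default_rng(0).integers(-2**23, 2**23, size=10000)
q_trunc = truncate(x, 8) - x
q_dith = dither_and_truncate(x, 8) - x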

The disadvantage of adding dither is that since the dither is basically a noise signal, the signal to noise ratio of the final requantized signal is lowered. Because of this, it is desirable to know if truncation will introduce the undesired harmonics. In some cases, the quantization noise will be independent of the desired signal even though no dither is added. In these cases, the signal can be truncated with no added harmonics, and no loss of signal to noise ratio. This generally occurs in two cases. If the signal to be quantized has a high noise level, this will act as a dither, and additional dither is not needed to make the quantization error independent of the signal. In other cases the amplitude change of the signal is large from sample to sample compared to the quantization noise. This ensures that the quantization error will be white, and thus independent of the signal.

The purpose of this research is to automatically determine the level to which a signal can be truncated without adding undesirable harmonic content. This is done in two ways. The first is to study the quantization noise: if the quantization noise is an i.i.d. sequence, the signal can be quantized to this level with no additional harmonics. The second is to study the mutual information between the original and quantized signals: if the mutual information is very close to the entropy of the original signal, then very little information is lost in the quantization process.

The basic purpose of dither is to ensure that the quantization noise is independent of the signal. This can be determined by studying the quantization noise. Since xq is the sum of x and q, it will contain undesired harmonics depending on the contribution of q. If q is an independent sequence, its power spectral density will be uniform, and no undesired harmonics are introduced. The first studies therefore look at determining whether the quantization sequence is an independent sequence.

The first order entropy of a signal x is defined as

H(x) = −Σ_i p(x_i) log2 p(x_i).

The joint entropy between two signals, x and y, is defined as

H(x, y) = −Σ_i Σ_j p(x_i, y_j) log2 p(x_i, y_j).

Another measure, called mutual information, can then be defined using the first order and joint entropies,

I(x; y) = H(x) + H(y) − H(x, y).

Exactly what mutual information measures is seen more clearly when it is written in a different form using conditional entropies. The conditional entropy of x given y is

H(x | y) = −Σ_i Σ_j p(x_i, y_j) log2 p(x_i | y_j).

The conditional entropy is the information in x that is not in y. The joint entropy can then be written as

H(x, y) = H(y) + H(x | y).

Substituting this into the mutual information equation, we have

I(x; y) = H(x) − H(x | y).

From the equation above we see that the mutual information is the information in x minus the information in x that is not in y. In other words, the mutual information is the information in x that is also contained in y. If two signals contain the same information, the mutual information is simply the information in one of the signals. If the signals are independent, the mutual information is zero. We can use mutual information with both the quantization noise, q, and the lower-resolution signal xq.

The mutual information between x and xq will tell us if any information has been lost in the quantization process. Alternatively, the mutual information between x and q will tell us if the quantization noise is independent of the signal. Both have been calculated here.

In order to calculate the mutual information, first order and joint pdfs must be estimated. This turns out to be difficult in practice. The signals being used are 24-bit digital signals, so each signal can take 2^24 possible values. When calculating the joint or conditional entropies, the joint densities of two signals must be estimated, and this joint pdf has 2^24 × 2^24 = 2^48 possible values. This causes problems both for memory requirements and for the amount of data required to accurately estimate the pdf. To get around these problems, the pdfs are estimated using an N by N contingency matrix, where N is 1024, 2048, or 4096.
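
A minimal sketch of the contingency-matrix estimate described above is given below: both signals are coarsely binned into an N = 1024 cell grid, the joint histogram approximates the joint pdf, and the mutual information follows from I(x; y) = H(x) + H(y) − H(x, y). The binning limits and helper names are illustrative assumptions.

import numpy as np

def entropy_bits(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mutual_information(x, y, n_bins=1024, full_scale=2**23):
    # joint pdf approximated with an n_bins x n_bins contingency matrix
    edges = np.linspace(-full_scale, full_scale, n_bins + 1)
    pxy, _, _ = np.histogram2d(x, y, bins=[edges, edges])
    pxy /= pxy.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    return entropy_bits(px) + entropy_bits(py) - entropy_bits(pxy.ravel())

# e.g. I(x; xq) after truncating a 24-bit signal x to 16 bits:
# mi = mutual_information(x, truncate(x, 8))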

The mutual information between x, q, and xq was calculated for two synthetic signals and one real signal. The synthetic signals were a cosine and a randomly changing cosine signal. All signals originally have 24-bit resolution.

The first experiment determined the mutual information between the original signal, x, and the quantized signal, xq. The results are shown in Figure 7.2.2.1. As seen in the figure, quantizing from 24 to 16 bits produces practically no change in the mutual information.

[pic]

Figure 7.2.2.1. The mutual information between the original and quantized signal vs. bits.

Another method of determining whether dither is needed is to look at just the quantization noise. A sufficient condition for the re-quantized signal to be harmonic-free is for the quantization noise to be white [3, 4]. This whiteness condition is the same as saying the samples are independent. Since the quantization noise comes from a single process, it is assumed in this research that the noise samples come from one distribution; the sufficient condition thus becomes that the quantization noise is independent and identically distributed (i.i.d.). Of the three methods used for determining the whiteness of the quantization signal, two operate in the time domain and one in the frequency domain. The time-domain tests are based on the autocorrelation, and the spectral test is based on the power spectral density. The tests use the fact that if a stochastic process is i.i.d., its autocorrelation is an impulse and its power spectral density is constant.

The time-domain methods are the Box-Pierce test and its improved version, the Ljung-Box test [5], which examine the autocorrelation of the quantization noise. The test statistics have a chi-squared distribution and indicate how likely it is that the signal under test has an impulse as its autocorrelation. Since an audio signal can change over time, the procedure is to segment the signal and apply the tests on each segment. This is done to strengthen the stationarity assumption and to allow each segment to be dithered independently, if necessary. Once the signal is segmented, it is quantized, and the quantization noise is scaled to improve the performance of the tests. A threshold is used to determine whether dither is added to each segment.
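
The segment-wise whiteness check can be sketched as follows, using the Ljung-Box statistic Q = n(n+2) Σ_{k=1..h} ρ_k^2/(n−k) compared against a chi-squared threshold. The number of lags, segment length, and significance level are illustrative assumptions, not the settings used in this work.

import numpy as np
from scipy.stats import chi2

def ljung_box_white(q_noise, n_lags=20, alpha=0.05):
    # returns True when the segment's quantization noise looks white (i.i.d.)
    x = q_noise - q_noise.mean()
    n = len(x)
    acf = np.correlate(x, x, mode="full")[n - 1:] / (x @ x + 1e-30)
    lags = np.arange(1, n_lags + 1)
    Q = n * (n + 2) * np.sum(acf[lags] ** 2 / (n - lags))
    return Q < chi2.ppf(1 - alpha, df=n_lags)

def segments_needing_dither(q_noise, seg_len=4096):
    # flag the segments whose quantization error fails the whiteness test
    starts = range(0, len(q_noise) - seg_len + 1, seg_len)
    return [not ljung_box_white(np.asarray(q_noise[i:i + seg_len])) for i in starts]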

The spectral based method tests the distribution of the sampled power spectral density (PSD) of the quantization noise. As stated previously, the PSD of a white signal is constant. Because the signals are finite, the sampled PSD will never be constant, but will have a variance that depends on the variance of the signal in time. This variance will be bounded, since the quantization noise is always bounded and depends on the reduction in number of bits. The test presented here tests the shape of the PSD distribution and the number of samples above a pre-determined threshold.

Experiments were performed using these tests on synthetic and real signals. The performance of each test was determined using synthetic signals with varying levels of ‘whiteness’. The final tests were done with real audio signals, which were reduced from 24 bits to N bits, where N went from 23 down to 1. As seen in the results for mutual information, quantizing from 24 to 16 bits introduces no additional artifacts for real signals.

5 References

1. S. P. Lipshitz, R. A. Wannamaker and J. Vanderkooy. “Quantization and Dither: A theoretical survey”. J. Audio Eng. Soc, vol 40, No 5, pp. 355-374 (1992 May).

2. R.M. Gray and D.L. Neuhoff, "Quantization," IEEE Transactions on Information Theory, Vol. 44, pp. 2325-2384, October 1998. (Commemorative Issue, 1948-1998).

3. L. Schuchman, “Dither Signals and Their Effect on Quantization Noise,” IEEE Trans. Commun. Technol., vol. COM-12, pp. 162–165 (1964 Dec).

4. R. A. Wannamaker, S. P. Lipshitz and J. Vanderkooy, “A Theory of Non-Subtractive Dither,” IEEE Transactions on Signal Processing, vol. 48, No. 2, pp. 499-516 (February 2000).

5. Ljung and Box, “On a measure of lack of fit in time series models,” Biometrika, No. 65, pp. 297-303 (1978).

3 Microwave Remote Sensing of the Atmosphere by Sandra Cruz Pol

1 General Overview

Microwave remote sensing work is being performed at the UPRM/ECE Laboratory for Cloud Microwave Measurements of Atmospheric Events (CLiMMATE). Our main goals are the development and calibration of atmospheric and cloud models using data from radar and radiometer measurements. Past work includes model calibration for stratus clouds, cirrus ice crystals, atmospheric attenuation and precipitation.

Our goal is to develop algorithms for the calibration of atmospheric models to better retrieve the physical and radiative characteristics of the atmosphere. These include water vapor content, vertical air motion, liquid water content, and raindrop size distribution for clouds and/or precipitation. We also look at the atmospheric attenuation suffered by the radar signal as it travels through the clear atmosphere due to water vapor and oxygen gases. This attenuation varies, among other factors, with frequency, radar scanning angle, air temperature, and pressure. During the last 5 years, we have worked with data collected from the UMass Cloud Profiling Radar System (CPRS) operating at Ka and W bands, the NASA TRMM PR at Ku band, the NASA EDOP at X band, the NWS WSR-88D at S band, a NOAA wind profiler at S band, microwave radiometers, and NWS radiosondes, among other sources. In addition, we are in the initial stages of developing an atmospheric Doppler radar for weather applications.

Past work included the use of NCAR cirrus crystal measurements together with radar reflectivities measured with the UMass CPRS at the Blackwell Airport, CO, in spring 2000. We modeled the ice crystals found in cirrus clouds with data measured by an airborne instrument, the Video Ice Particle Sampler (VIPS), from the National Center for Atmospheric Research (NCAR). We computed the backscattering of the ice crystals from the ice particle size distribution N(D) and compared it with the radar reflectivity measurements. Data from cirrus clouds on March 13, 2000, from the CPRS and VIPS were found that matched satisfactorily in time and space. The equivalent reflectivities obtained from both instruments compare favorably when the instruments were close in time and space and when the Brown and Francis [1995] density was applied; their RMS differences were also smaller when the points compared were closer (see Fig. 3.13). It was found that the effect of the ice particle shape was negligible at 33 GHz, but at 95 GHz the effect causes significant changes. By taking the shape effect into account at 95 GHz, there was better agreement between the radar’s and the VIPS’s Ze for all of the crystal ice densities used. IWC-Ze relationships for the three densities used in this investigation were obtained at 33 GHz; the relationship obtained using the B&F density, [pic], proved the most reliable (based on its shape and the low variability of the discrete values) and is the new calibrated model recommended in this study. Meanwhile, when using the Heymsfield and solid-ice densities, the results were very similar (their IWC-Ze relationship coefficients were similar), and the variability was higher compared with other common relationships.

In addition, two Doppler radars working at W band and S band, together with radiosonde observations, were used to retrieve physical parameters such as rainfall rate and vertical air motion in rain. These instruments were deployed at the U.S. Department of Energy Atmospheric Radiation Measurement (DOE ARM) Cloud and Radiation Testbed (CART) site in Lamont, Oklahoma, where rain data were collected in November 2001.

2 Drop Size Distributions and Expected Radar Reflectivity from a 2-D Video Disdrometer deployed in Puerto Rico

Accurate estimation of drop size distribution (DSD) for all rain rates is necessary to develop and validate rainfall retrieval algorithms. The Two-Dimensional Video Disdrometer (2dvd) presented in this research offers a new approach to measuring DSDs.

A 2dvd was deployed in Puerto Rico for 18 months (July 2004 – January 2006). An initial DSD characterization was performed to compare with previous studies. The event considered was Tropical Storm Jeanne, which passed over the island on September 15 and 16, 2004. Preliminary results confirmed that DSDs are highly variable between coastal and mountainous regions, even on a small island, as suggested by Ulbrich (1999).

1 Introduction

The purpose of this study is to improve the calculation of expected precipitation from WSR-88D radar reflectivity measurements. We analyze the characteristics of the drop size distribution for a coastal-zone tropical environment. A previous study by Ulbrich, Petitdidier, and Campos found DSD characteristics similar to those of continental regions. As their study was performed in a mountainous area, they suggested DSD variations between higher areas and coastal zones. From DSD measurements, reflectivity Z and rainfall rate R are calculated using theoretical equations. From these results, Z-R relationships can be derived, from which weather radars determine rainfall rates after measuring Z. Those relationships are compared with known relations used by current radars. We also compare the Z measured by NEXRAD with that computed from DSD measurements. The available NEXRAD data are in 5 dBZ increments at 5- or 6-minute time intervals and were provided by the NWS local weather forecast office. DSD data, on the other hand, are in one-minute intervals, so adjustments were made to compare data from the same instants. Additional adjustments in time were needed to correct for the difference in measurement altitude.

Quantitative Precipitation Estimation (QPE) aims to improve precipitation estimates and enhance the reliability of flood prediction. In rainfall estimation techniques, the drop size distribution (DSD) is the most fundamental component, since it governs all the microwave and rainfall integral relations. It is characterized by a high temporal and spatial variability that affects both microwave measurements from radars and ground validation. Therefore, its accurate estimation for all rain rates is necessary in order to develop and validate rainfall retrieval algorithms [Rincón, 2002].

This work intends to increase the reliability of DSD measurements in tropical climates, contributing to better estimation of rainfall rates. It will provide information on reflectivity calculations to improve rainfall retrieval algorithms as well. Accurate DSD estimates will provide better attenuation correction specific to the tropics as well as improved Z-R relationships. It is necessary to solve discrepancies among previous precipitation estimation studies using NEXRAD.

2 Literature Review

Measurement of rain intensities by remote sensing methods is of great interest for a wide field of applications [Schönhuber, 1995]. The reliability of those methods needs verification from point-monitoring ground instruments, such as disdrometers and rain gauges. The 2dvd allows for a detailed verification of rain measurements, in some cases more detailed than traditional rain gauges, because of the enhanced sensitivity the 2dvd can achieve compared to rain gauge measurements. A typical tipping-bucket rain gauge will indicate just a “trace” of rain when the accumulation is less than 0.1 mm. Schönhuber [1995] illustrates this with a widespread rain event recorded in Graz, Austria, on October 24, 1994.

To estimate the rain that will fall over a certain area, the rain rate is derived from the DSD and the drops’ diameters and terminal velocities, as shown by Kruger and Krajewski [2001] and by Doviak and Zrnić [1993]. The relation between these parameters is

R = (π/6) ∫ D^3 v_t(D) N(D) dD (1.1)

where R is the rainfall rate, D and v_t(D) are the drop diameter and terminal velocity in mm and m·s-1, respectively, and N(D) is the DSD (a unit-conversion constant is needed to express R in mm·hr-1 for these units). Accurate measurements of these quantities are consequently needed. Both the diameters and terminal velocities of falling hydrometeors can be obtained accurately from the 2dvd, as demonstrated in [Kruger, 2001] and [Schönhuber, 1995, 2000]. The 2dvd software arranges drop information to construct the DSD, according to Kruger.

Atlas et al. [1973] provide a useful formula for calculating the terminal velocities of water droplets that produces less than 2% error relative to the precise measurements made by Gunn and Kinzer [1949] when the drop diameter is between 0.6 and 5.8 mm. This formula is the one used by the 2dvd to compute the drops’ vertical velocities, and is expressed as

v_t(D) = 9.65 − 10.3 exp(−0.6 D) m/s (1.2)

when D is in mm. The aforementioned velocity measurements were performed in stagnant air. For the 2dvd, low-wind conditions are necessary in order to obtain accurate and detailed information on drop size, velocity, and shape; high-wind conditions may introduce errors in the instrument readings.

The conversion of radar reflectivities to rainfall parameters uses standard models for DSDs [Schönhuber, 1995], such as the well-known Marshall-Palmer DSD (MP-DSD) and others such as the Joss-Drizzle (JD-DSD) and Joss-Thunderstorm (JT-DSD) models [Schönhuber, 2000]. These three are exponential models of the form

N(D) = N0 exp(−ΛD) (1.3)

where N0 is the scaling factor, fixed to 8,000, 30,000, and 1,400 m-3 mm-1, respectively, for the models mentioned above, and Λ is a function of the rainfall rate. The parameter Λ of the exponential distribution that fits the MP-DSD is 4.1R^-0.21 mm-1 when the rainfall rate is in millimeters per hour. Although the MP-DSD is very popular for computing rainfall rates derived from radar reflectivity measurements, actual drop sizes change significantly with geographic location, type of storm, season, and region within the storm. Regarding tropical climates, all three models mentioned produce large overestimations of rainfall rates [Schönhuber, 2000], because they overestimate the number of large drops compared to what is measured by the 2dvd. This is shown in Figure 1-1, which depicts the measured DSD and the estimation curves for the three models from an experiment in Lae, Papua New Guinea, on August 30, 1995.

[pic]

Figure 1-1 DSD of tropical rain, recorded on 30 August 1995, 22:10-22:30, in Lae, Papua New Guinea, with the MP-DSD (yellow), JT-DSD (red), and JD-DSD (green) indicated. Mean R = 25.24 mm/hr. As shown in the figure, all three models overestimate the number of large raindrops [Schönhuber, 2000].
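
The sketch below ties equations (1.1)-(1.3) together: it evaluates a Marshall-Palmer DSD, the Atlas et al. [1973] terminal-velocity formula, and numerically integrates the rain rate. The drop-diameter grid and the unit-conversion factor (giving R in mm/hr for D in mm, v in m/s, and N(D) in m-3 mm-1) are stated assumptions for illustration.

import numpy as np

def v_terminal(D):
    # Atlas et al. [1973] fit (Eq. 1.2): D in mm, result in m/s
    return 9.65 - 10.3 * np.exp(-0.6 * D)

def n_marshall_palmer(D, R):
    # exponential DSD (Eq. 1.3) with N0 = 8000 m^-3 mm^-1 and Lambda = 4.1 R^-0.21 mm^-1
    return 8000.0 * np.exp(-4.1 * R ** (-0.21) * D)

def rain_rate(D, N_D):
    # Eq. (1.1) with the constant that gives R in mm/hr for the units above
    integrand = D ** 3 * v_terminal(D) * N_D
    return np.pi / 6.0 * 3600.0 * 1e-6 * np.sum(integrand) * (D[1] - D[0])

D = np.linspace(0.1, 6.0, 600)  # drop diameters in mm
print(rain_rate(D, n_marshall_palmer(D, R=10.0)))  # roughly recovers the assumed 10 mm/hr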

Ulbrich (1983) proposed a general gamma distribution of the form

N(D) = N0 D^μ exp(−ΛD) (1.4)

to better represent the DSD. In this case Λ = (3.67 + μ)/D0 [Meneghini, 2003]. As more parameters are introduced in these equations, more remotely sensed measurements are required to specify them. Since the WSR-88D is a single-polarization radar, only exponential functions can be considered, as one parameter can be fixed and the other determined. To upgrade this system, the National Oceanic and Atmospheric Administration (NOAA) has planned to convert the WSR-88D at TJUA to a dual-polarization radar by the year 2007. But even though two-parameter models reduce rainfall rate estimation errors significantly, there are still large differences compared to measured rates [Schönhuber, 2000]. This discrepancy is attributed to the assumptions made when converting radar reflectivities to rainfall rates: that conversion uses models for DSDs, such as the ones mentioned above, whereas converting disdrometer measurements to reflectivities uses the actual measured DSDs. Radar systems convert reflectivity Z to rainfall rate R using a Z-R relationship [Vázquez, 2001]. This relationship is of the form

Z = aR^b (1.5)

The WSR-88D precipitation processing system (PPS) converts reflectivity Z to rainfall rate R using a Z-R relationship [Vásquez and Roche, 2001], as stated before. In their study, Vásquez and Roche proposed new Z-R relationships, owing mostly to underestimation when the rainfall accumulation for Hurricane Debbie (August 22-23, 2000) exceeded 75 mm. Furthermore, their results show that there is some overestimation in about half the occurrences when rain amounts are less than 75 mm. These results were calculated using the Rosenfeld Tropical Z-R relationship, Z = 250R^1.2. A list of the NWS Radar Operational Center (ROC) accepted Z-R relationships is presented in Table 1-1. The new relationships suggested by Vásquez and Roche yielded better approximations of rainfall estimation than the Rosenfeld Tropical relationship when compared to rain gauge accumulations. The new Z-R equations were selected using the gauge/radar (G/R) ratio, defined by Wilson and Brandes (1979) as the sum of the observed amounts at all gauges with rainfall divided by the sum of the radar estimates for those gauges. Figure 1-2 presents the relation between the suggested Z-R relationships and gauge data. The problem encountered with these new relationships is that they produce rainfall overestimation for rainfall amounts of less than 76.2 mm; however, for flood warnings it is much more important to have proper early estimation of heavy precipitation.

Table 1-1 NWS Radar Operational Center (ROC) accepted Z-R relationships.

|RELATIONSHIP |RECOMMENDED USE |2ND RECOMMENDATION |

|Marshall-Palmer: Z = 200R^1.6 |General stratiform precipitation | |

|East-Cool Stratiform: Z = 130R^2.0 |Winter stratiform precip. east of continental divide |Orographic rain-East |

|West-Cool Stratiform: Z = 75R^2.0 |Winter stratiform precip. west of continental divide |Orographic rain-West |

|WSR-88D Convective: Z = 300R^1.4 |Summer deep convection |Other non-tropical convection |

|Rosenfeld Tropical: Z = 250R^1.2 |Tropical convective systems | |
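
The PPS conversion from measured reflectivity to rain rate simply inverts Equation (1.5). A minimal sketch, assuming the reflectivity is given in dBZ and using the Marshall-Palmer and Rosenfeld Tropical coefficients from Table 1-1:

import numpy as np

Z_R_COEFFS = {
    "marshall_palmer": (200.0, 1.6),     # Z = 200 R^1.6
    "rosenfeld_tropical": (250.0, 1.2),  # Z = 250 R^1.2
}

def rain_rate_from_dbz(dbz, relation="rosenfeld_tropical"):
    a, b = Z_R_COEFFS[relation]
    Z = 10.0 ** (dbz / 10.0)     # dBZ -> linear reflectivity factor (mm^6 m^-3)
    return (Z / a) ** (1.0 / b)  # invert Z = a R^b to get R in mm/hr

for dbz in (30, 40, 50):
    print(dbz, "dBZ ->", round(rain_rate_from_dbz(dbz), 1), "mm/hr")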

Other factors found in this study that could affect rainfall accumulation estimates are: 1) the ground-clutter suppression technique used by NEXRAD’s PPS, which removes power contributions from hydrometeorological targets; since it is applied to stationary echo returns, it will also eliminate contributions from targets moving perpendicular to the radar beam; 2) lack of detection by the radar beam, because the lowest height of the scanned volume increases with the square of the range from the radar site (TJUA) and the radar beam overshoots rain clouds at all ranges, as TJUA is located at 850 m above sea level (see Figure 1-3); 3) a glitch in the PPS algorithm that truncates estimates at the decimal point, which would not be of much importance during short periods of rain, but prolonged accumulation estimates can produce significant errors, especially if using the User Selectable Precipitation product; and 4) radar calibration, which can produce ±1 inch of error with only a ±4 dB reflectivity error, according to Joe and Cynthia Christman [1999].

Regarding radar calibration, studies by Ulbrich and Miller [2000] demonstrated that the use of the default Z-R relation for that zone (South Carolina), with no adjustments for a possible calibration offset, produces very low estimated amounts, lower by half or more [Ulbrich, 2001]. Comparing disdrometer data with a tipping-bucket rain gauge indicated good agreement between them, implying that the radar calibration was incorrect, with an average offset of -4.7 dB for the nine storms examined. Variations of the Z-R relation did not yield improvements in the radar-measured rainfall unless adjustments for the possible calibration offset were applied.

[pic]

Figure 1-2 Gauge accumulations for Hurricane Debbie (2000) and radar-estimated rainfall vs. regressed reflectivities from alternative Z-R relationships for the same event.

3 Ancillary Sensors

In order to validate and compare 2dvd measurements, different types of auxiliary sensors were used. First we used rain gauges at the NWS and San Juan International Airport (SJU) premises and later we compared the results with radar measurements.

Rain Gauges

Puerto Rico is sampled by several rain gauge networks with over 170 reporting stations combined. These include mostly gauges from the United States Geological Survey (USGS), others installed at airports, and those from cooperative observers’ networks. USGS gauges are usually installed at river dams or other river monitoring stations, leaving a lack of information for locations in the metropolitan area. The 2dvd is not located near any of the USGS gauges for the reason stated previously; therefore, other rain gauges were considered.

The Federal Aviation Administration (FAA) has put in place an Automated Surface Observing System / Automated Weather Observing System (ASOS/AWOS). This system is a suite of sensors that measures, collects, and broadcasts weather data to help meteorologists, pilots, and flight dispatchers prepare and monitor weather forecasts, plan flight routes, and provide the information necessary for takeoffs and landings. SJU Airport (Luis Muñoz Marín Airport) is part of this system, and though it is not at the exact same location as the 2dvd (about 1 mile west), its rain gauge gave us initial validation guidelines.

Two additional rain gauges, one 4-inch and one 8-inch, are located at the NWS premises, right by the 2dvd location. However, data collection from these is done manually, creating deficiencies in data management and availability. Consequently, we were forced to use data from the ASOS station, which is available online and collected automatically at least every hour.

WSR-88D Radar

Located in the eastern-central part of the Island is the Weather Surveillance Radar-1988 Doppler (WSR-88D), also known as NEXRAD. For NWS purposes this radar is identified as TJUA.

The WSR-88D precipitation processing system (PPS) converts reflectivity (Z) to rainfall rate (R) using a Z-R relationship. Because of the complex mountainous terrain, the TJUA radar detects localized ground clutter over several areas. A ground clutter suppression technique is used to filter those non-meteorological, ground-based targets that otherwise would produce false reflectivity echoes affecting the PPS algorithm estimates. Ground clutter suppression may also affect rainfall accumulation by removing power contribution of hydro-meteorological targets. This technique is applied on stationary echo returns, thus hydro-meteorological targets moving perpendicular to the radar beam may be filtered as well.

Another contributing factor to inaccurate precipitation estimation is the lack of detection by the radar beam. As the radar beam samples the atmosphere, the lowest height of the scanned volume increases as the square of the range from the RDA. The elevation angles of the radar beam used by the PPS to estimate rainfall rates are the lowest four tilts: 0.5°, 1.5°, 2.4°, and 3.4°. In addition, the WSR-88D site (TJUA) is located on a mountaintop at an elevation of 2,900 feet above sea level. Thus, the effect of the radar beam overshooting rain clouds at all ranges contributes to additional radar underestimation. Figure 1-3 illustrates the minimum height of NEXRAD’s main beam coverage across Puerto Rico. Note that over the western part of the Island the coverage starts above 2,500 m.
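As an illustration of why the lowest scanned height grows with range, the sketch below evaluates the standard 4/3-effective-earth-radius beam-height expression (Doviak and Zrnic, 1993). The antenna height of 884 m simply converts the 2,900 ft quoted above, and the function and parameter names are ours; this is not the PPS implementation.

import numpy as np

def beam_height_m(range_km, elev_deg, antenna_height_m=884.0):
    # Center-of-beam height above sea level, 4/3-effective-earth-radius model.
    ke_a = (4.0 / 3.0) * 6371.0e3                     # effective earth radius [m]
    r = range_km * 1.0e3
    theta = np.deg2rad(elev_deg)
    h = np.sqrt(r**2 + ke_a**2 + 2.0 * r * ke_a * np.sin(theta)) - ke_a
    return h + antenna_height_m

print(beam_height_m(100.0, 0.5))                      # lowest tilt at 100 km: roughly 2.3 km ASL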

[pic]

Figure 1-3 NEXRAD minimum beam height in meters along the island of Puerto Rico [Donovan, 2004].

|[pic] |[pic] |

|(a) shows camera locations |(b) shows lamp locations. |

Figure 1-4 Sensor Unit without aluminum covers

Two-Dimensional Video Disdrometer (2dvd)

The 2dvd was developed by Joanneum Research of Graz, Austria, and ESA/ESTEC (European Space Agency / European Space Research and Technology Centre). Joanneum Research, with 15 research units, is one of the largest non-university research institutions in Austria. Additionally, students of the Technical University of Graz have also contributed to its development.

A 2dvd is a precipitation gauge working on the basis of video cameras. This optical device is used for raindrop size, shape, and velocity measurements. It detects the shadows made by passing drops using light sources and line-scan cameras. The information obtained allows for computation of drop size distribution (DSD), oblateness, equivolumetric diameter, rain rate, and velocity (horizontal and vertical).

The advantages of this two-dimensional video configuration are several. First, it avoids the problem of counting splashes of drops as additional individual drops. Second, the special arrangement of the cameras in different planes allows for the actual measurement of fall velocities.

The Sensor Unit houses the two cameras, two light sources, and several mirrors (see Figure 1-4 (a) and (b)). Mirrors are used to deflect light as lamps are not directly in front of the cameras. Each camera-lamp pair is orthogonal to the other, providing the two-dimensional aspect of the measurement, as shown in Figure 1-5.

When a raindrop falls into the measuring area, cameras 1 and 2 detect the drop shadow. The two orthogonal projections provide 3D raindrop shape information that is used to describe the raindrop. The sensor unit operates at a frequency of 34.1 kHz, taking drop measurements every 29 microseconds approximately.

[pic]

Figure 1-5 Sketch of Sensor Unit components [Schönhuber, 1994].

The Outdoor Equipment Unit (OEU) (see Figure 1-6) consists of an embedded computer (PC), power supply – to power lamps and cameras – and connections for power and video signals from the cameras. It receives those video signals from cameras, pre-processes raw data, and runs software for data acquisition and plane alignment. Every 3 seconds, data are "packaged" by the OEU PC and transmitted via TCP/IP to the Indoor User Terminal, a third component that is described below.

[pic] [pic]

Figure 1-6 OEU, open at left, closed at right (back).

Indoor User Terminal (IUT)

The IUT is a regular PC that receives the pre-processed data from the OEU and performs the final computations. It also displays these calculations via proprietary software called VIEW_HYD. An image of the software display is shown in Figure 1-7. The IUT is commonly referred to as the indoor computer or indoor PC (personal computer).

Several parameters are computed from measurements taken by the Sensor Unit, including rainfall rate, drop size distribution (DSD) and oblateness. Other measured parameters are compared with calculations from well-known models.

To begin with, consider rainfall rates. These are not based on time as one might expect, but on the quantity of rain in a given amount of time: the rainfall rate displayed is the rain accumulation for the last 30 minutes since the last 0.1 mm increment.

Another parameter displayed by VIEW_HYD is the DSD. It is calculated using

[pic]

(3.1)

where Δt is the integration time interval in seconds, ΔD is the width of the size class in mm, Aj is the effective measuring area of drop j in m2, vj is the velocity of drop j in m•s-1, i is the drop size class, j is the single drop, Mi is the number of drops in class i during Δt, and Di is the diameter of class i.
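The expression in (3.1) is embedded as an image; a plausible rendering, consistent with the variable definitions just given and with standard 2dvd processing as described by Kruger and Krajewski [2001], would be

N(D_i) = \frac{1}{\Delta t \, \Delta D} \sum_{j=1}^{M_i} \frac{1}{A_j \, v_j} \qquad [\mathrm{mm^{-1}\,m^{-3}}]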

[pic]

Figure 1-7 Image of the VIEW_HYD software that provides rainfall and raindrops information.

As stated before, vertical velocity is measured by the difference in distance between light planes, but it also is compared with computed velocity determined after Atlas et al [1973]. This relation is given by

[pic] (3.2)

where V is the velocity in m•s-1 and D is the diameter in mm.
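Equation (3.2) is likewise embedded as an image; the commonly quoted Atlas et al. [1973] fit, with these same units, is

V(D) = 9.65 - 10.3\, e^{-0.6 D} \qquad [\mathrm{m\,s^{-1}}], \quad D \ \text{in mm},

which we assume is the relation used for the comparison plotted by VIEW_HYD.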

One important parameter computed by the 2dvd is the oblateness, which is defined as the geometrical mean of the height/width (H/W) ratios of a raindrop’s front and side views. As the drop is actually measured by the cameras, oblateness is not estimated from the diameter as done by other models developed by Pruppacher and Beard [1970] and Poiares Baptista [1992]. The NWS provided their facilities as the first site for the 2dvd installation in Puerto Rico.

The ground surroundings had enough clearance from buildings, trees and fences that encircle the installation area. In addition, a small building used by NWS hydrologists was within reasonable distance to install the IUT and to connect power. Finally, NWS rain gauges and other weather sensors were located in the same area, providing for measurements validation.

3 Comparison with traditional rain gauge

Considering the 2dvd as a precipitation gauge, it was reasonable to compare its measurements to those of a traditional rain gauge. The rain gauge information was obtained from the ASOS station. An initial comparison was performed during Tropical Storm Frances, which affected Puerto Rico on August 30-31, 2004. This comparison set the basis for relying on 2dvd data as a means of measuring precipitation parameters.

ASOS information is available on an hourly basis, so the 2dvd rain rate was converted to hourly accumulation. Rain rate and DSD information are obtained using a program provided by Colorado State University, called FIRM_DSD. This program reads 2dvd proprietary-formatted files and converts them to text files with the following information: time, diameter range, DSD values, and rain rate. For each minute (time periods are user-defined in terms of seconds), a DSD value is reported for each diameter range, from 0-0.25 mm up to 10-10.25 mm. First we evaluated the performance of FIRM_DSD against what was displayed by VIEW_HYD. Results from FIRM_DSD shown in Table 1-2 were very close to VIEW_HYD, as expected, since the instrument and the FIRM_DSD software had been in use for some time.
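The conversion to hourly accumulation is straightforward; a minimal sketch (names are ours) that turns per-minute rain rates into an hourly total comparable to the ASOS gauge is:

def hourly_accumulation_mm(rain_rates_mm_per_hr):
    # Each one-minute sample at R mm/hr contributes R/60 mm of depth.
    return sum(r / 60.0 for r in rain_rates_mm_per_hr)

print(hourly_accumulation_mm([2.0] * 60))   # an hour of constant 2 mm/hr -> 2.0 mm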

[pic] [pic]

Figure 1-8 Picture of 2dvd’s sensor and OEU completed installation with personnel from NWS. IUT was installed at the building at the far right (red roof).

Table 1-2 FIRM_DSD performance evaluation during Frances pass over San Juan, PR.

|Total Precipitation |Aug 30th |Aug 31st |

|Total precipitation VIEW_HYD |4.28 mm = 0.17 in |4.60 mm = 0.18 in |

|Total precipitation FIRM_DSD |4.2755 mm |4.6031 mm |

Next we compared the 2dvd and ASOS rain gauge daily accumulations during Frances. Figure 1-9 and Table 1-3 show detailed results of the comparison. Results yielded less than 0.5% difference between the two instruments, providing the confidence to start working with larger data sets towards our DSD characterization goal.

[pic]

Figure 1-9 Comparison between 2dvd and ASOS hourly precipitation accumulation for Aug. 30, 2004 during Tropical Storm Frances pass over San Juan, PR.

Table 1-3 Comparison between 2dvd and ASOS rainfall accumulation during Tropical Storm Frances, August 30-31, 2004.

|Total Precipitation |Aug 30th |Aug 31st |

|Total precipitation 2dvd (mm) |4.6369 |4.2448 |

|Total precipitation 2dvd (in) |0.1826 |0.1671 |

|Total precipitation ASOS(in) |0.25 |0.22 |

|% difference |0.38 |0.40 |

Despite these good results, several problems were found after the installation of the 2dvd, which resulted in data loss and time-consuming trips to the instrument.

4 Operation problems encountered

1 Remote monitoring and data downloading

The original plan for monitoring the 2dvd was to use commercial network computing software that allows viewing and fully interacting with one computer from any other computer on the Internet [RealVNC web site]. This was planned because of the long driving distance between the University campus and the NWS office – around two and a half hours – which prevented us from regularly monitoring the 2dvd and downloading data. Problems started with firewall and information systems security issues at both sites, but mostly at NWS, as it is part of the U.S. federal government and security has been a priority since the 9/11 events. Remote connections can only be established from the NWS out, but not into their network.

For several months this practice took place, connecting from the IUT to our computer or to the Colorado State University network, as their firewall did not drop the connection as easily as ours (Mayagüez campus). This was done by us or by the very cooperative personnel at NWS when we requested it. However, when the IUT started presenting what looked like communication problems with the OEU, it was believed that new IP addresses could be “confusing” the internal TCP/IP transmission, causing the system to fail and consequently to lose valuable data. The manufacturer recommended that we stop downloading data remotely to avoid more system failures, which led to scheduling more frequent trips to the equipment to monitor its stability and retrieve archived data.

2 Communication problems between computers

Together with the previous problems, the IUT computer started presenting messages regarding communication problems between it and the OEU. Most of the time the computer responded by rebooting itself; the opinion of most people involved in the project was that losing communications created a major system error that caused the computer to restart. When it booted and there was nobody there – which was the case most of the time – the system did not log itself in, and as a result the data collection program did not start. This appears to have happened approximately every two weeks. The solution, known security issues notwithstanding, was to eliminate the need for logging in when starting up the computer. Security was not a major concern because access to NWS premises is limited to employees and to citizens or press requiring weather information. In addition, the building where the IUT was located was not used frequently or by many people.

Kruger and Krajewski [2001] found several causes for the 2dvd to fail. These include some that have been corrected with newer software versions, such as problems synchronizing the IUT and OEU clocks, which caused the IUT to lock up. Other problems they found were heat-related issues and blocking of the light planes by insects or small objects; this last problem is inherent to the equipment design. Because of these issues they recommend not operating the instrument unattended for more than four to five days. As we were not able to monitor the instrument with this frequency, we were not able to identify the specific problem causing the system to shut down.

Nevertheless, we knew that the coaxial cable connecting the indoor and outdoor computers was much longer than needed. The distance between the OEU and IUT was on the order of 100 feet, while the communications cable shipped with the instrument was a 1,000-foot reel. We were asked by the manufacturer not to cut it and make new connectors, because communication problems could arise. Instead, after trying several indoor computer configurations, the manufacturer shipped a new cable of the required length. After replacing the coaxial cable, the continuous working time without system failures was considerably extended, from about one or two weeks to one month or more.

5 Drop size distribution characterization for tropical storm

1 Procedure

DSD information is obtained using the FIRM_DSD program provided by Colorado State University. To characterize DSDs, two parameters were computed, according to Bringi and Chandrasekar (2001): the mass-weighted mean diameter, Dm, and the normalized intercept, Nw, using

|[pic] (mm), |(4.1) |

|[pic] (drops * mm-1 * m-3), |(4.2) |

Both parameters are used to normalize DSDs, reducing the scatter of data points. This is useful in comparing the shapes of distributions with widely different rain rates [Bringi, 2001]. Other characterizations use the median volume diameter D0 and the N0 parameter. For an exponential DSD, D0 is defined such that drops with diameters less than D0 contribute half the total rainwater content W, i.e.

|[pic] |(4.3) |
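Equations (4.1)-(4.3) are embedded as images; for reference, the normalized-gamma definitions given by Bringi and Chandrasekar (2001), which match the units quoted above and which we assume correspond to (4.1) and (4.2), are

D_m = \frac{\int N(D)\,D^4\,dD}{\int N(D)\,D^3\,dD} \ \ [\mathrm{mm}], \qquad N_w = \frac{4^4}{\pi \rho_w}\,\frac{10^3\,W}{D_m^4} \ \ [\mathrm{mm^{-1}\,m^{-3}}],

with W the rainwater content in g m^-3 and ρ_w = 1 g cm^-3.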

In order to compare with results from similar studies, stratiform rain was defined as that with R greater than 0.5 mm/hr and standard deviation of R less than 1.5 mm/hr. Conversely, convective rain was defined as that with R greater than 5 mm/hr and standard deviation of R greater than 1.5 mm/hr.
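A minimal sketch of this rain-type classification follows; the thresholds are those just stated, while the sliding-window length used for the standard deviation is our illustrative choice.

import numpy as np

def classify_rain(rates_mm_hr, half_window=5):
    # Label each sample using R and the standard deviation of R over a window.
    rates = np.asarray(rates_mm_hr, dtype=float)
    labels = []
    for k, r in enumerate(rates):
        lo, hi = max(0, k - half_window), min(len(rates), k + half_window + 1)
        sd = rates[lo:hi].std()
        if r > 0.5 and sd < 1.5:
            labels.append("stratiform")
        elif r > 5.0 and sd > 1.5:
            labels.append("convective")
        else:
            labels.append("undefined")
    return labels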

For September 15, 2004, 712 minutes of data were recorded by the 2dvd, whereas for September 16th, 970 minutes were available for analysis. To reduce computing time, the data were averaged every 2 minutes for the DSD characterization analysis; thus 356 data points were used for the 15th and 485 for the 16th.

TABLE 1-4 Dm and log10(Nw) Results Summary

|Day/Rain Type |Dm [mm] |log10(Nw) |

|Sep. 15, 2004 stratiform |1.14 |4.19 |

|Sep. 15, 2004 convective |1.38 |3.96 |

|Sep. 16, 2004 stratiform |0.98 |4.61 |

|Sep. 16, 2004 convective |1.04 |4.80 |

|whole event-stratiform |1.12 |4.51 |

|whole event-convective |1.30 |4.35 |

2 Dm and Nw results

TABLE 1-4 summarizes the values found for both Dm and log10(Nw); these were as expected, with maritime characteristics, even though the convective rain results were not consistent with previous studies made on the Island. These studies will be discussed later.

Figure 1-10 (a) shows results from previous studies on stratiform rain parameters as well as findings from this work (see the San Juan marker). Interestingly, it also shows results of the summer 2004 studies in Colorado, in which we briefly participated. Regarding stratiform rain, about 50% of data points were classified as this type on September 15th, while about 33% were selected on September 16th; for the combined data set, a total of 40% of points were classified as stratiform.

On the other hand, as opposed to findings by Ulbrich, Petitdidier and Campos (1999) in a mountainous region of the island, where continental properties were found in the DSDs, the log10(Nw) versus Dm plot shows characteristics similar to the Maritime Cluster (see Figure 1-10 (b)).

Since less than 20% of data points were classified as convective rain on either of the two days, these results require further comparison with future events in order to make them statistically significant.

We traced scatter plots and histograms characterizing DSDs for the event under consideration. A small number of points were classified as convective, a little less than 18% when considering the event as a whole.

[pic]

Figure 1-10 (a) The value of log10(Nw) (with 1σ std dev bars) versus Dm from 2dvd data (numbered open circles) and dual-polarization radar retrievals (open squares as marked) for stratiform rain. The dotted line is the least squares fit. Note that Nw is the 'normalized' intercept parameter and Dm is the mass-weighted mean diameter of a 'normalized' gamma DSD.

[pic]

Figure 1-10 (b) As in (a) except data for convective rain. Note that Nw is the 'normalized' intercept parameter and Dm is the mass-weighted mean diameter of a 'normalized' gamma DSD.

When comparing these results to preliminary results from the previous summer in Colorado – provided by CSU – mean values of Dm differ by around 0.45 mm and values of Nw by around 0.3 drops•mm-1•m-3. These confirm disparities in weather types, as the two sites are in very different geographical areas. Nevertheless, Nw results from convective rain, as seen in Figure 1-11 (b), tend to approach maritime cluster characteristics, as has happened in other instances in the same area. Therefore we understand that DSDs can be highly variable even within the same location.

|[pic] |[pic] |

|(a) Stratiform type rain |(b) convective rain |

Figure 1-11 Log10(Nw) vs. rain rate scatter plot for the Tropical Storm Jeanne, affecting Puerto Rico on September 15-16, 2004.

|[pic] |[pic] |

|(a) mean = 1.1247 mm, mode = 1.1812 mm |(b) mean= 1.2988 mm, mode= 1.2656 mm |

Figure 1-12 Dm histogram for the Tropical Storm Jeanne, affecting Puerto Rico on September 15-16, 2004. (a) Stratiform type rain; (b) convective.

3 Uncertainties in DSD computations

Observational noise can occur in 2dvd measurements because small drops within the sample may come from cloud regions where rain rates and microphysical processes are quite different from the regions where large drops originate. The study of the microphysical processes responsible for the formation of DSDs and their evolution is an interesting but difficult subject because of the great variability of the possible processes involved. Figure 1-13 depicts the effect of wind in mixing drops from different cloud regions.

4 Sequential Intensity Filtering Technique (SIFT)

SIFT was developed by Lee and Zawadzki to filter out observational noise by concentrating on the stability of the Z-R relationship during a physically uniform situation (a minimal averaging sketch is given after the list below). For that purpose, a record is taken of one-minute DSDs extending over several hours and the distributions are averaged over a variable number (from one to 120) of

a) random samples (random averaging),

b) samples sequential in time (time averaging), and

c) samples sequential in either Z or R (that is, SIFT with either R or Z as the intensity parameter).
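The following minimal sketch illustrates SIFT-style averaging with R as the intensity parameter; the group size and all names are ours for illustration and do not reproduce Lee and Zawadzki's implementation.

import numpy as np

def sift_average(dsd_records, rain_rates, n_avg=20):
    # Sort one-minute DSD records by rain rate and average them in groups of n_avg.
    order = np.argsort(rain_rates)
    dsd = np.asarray(dsd_records, dtype=float)[order]
    r = np.asarray(rain_rates, dtype=float)[order]
    avg_dsd = [dsd[i:i + n_avg].mean(axis=0) for i in range(0, len(r), n_avg)]
    avg_r = [r[i:i + n_avg].mean() for i in range(0, len(r), n_avg)]
    return np.array(avg_dsd), np.array(avg_r)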

[pic]

Figure 1-13 Effect of the wind in mixing drops from different cloud regions.

The average percent difference between the averaged 2dvd rain rate and the averaged recalculated R for September 16, 2004 is 12.40%. We noticed that this difference was of the order of the difference between R from FIRM_DSD and the calculated R using DSD information, that is 11.25% (see Figure 1-14).

[pic]

Figure 1-14 Differences between calculated (theoretical) R from 2dvd’s DSD information (red plot) and FIRM_DSD computed R (blue plot).

As opposed to the FIRM_DSD rainfall rate, the R computed from theoretical equations uses velocity equations developed for stagnant air. A very detailed experiment by Kruger and Krajewski (2001) concluded that the 2dvd performs well under low-wind conditions. They identified several issues caused or aggravated by airflow around the instrument: 1) deformation of drop shapes, 2) raindrops blown directly through the light-plane slits onto the mirrors, and 3) spatial distribution distortion in the measuring area. However, the instrument described in their work is an earlier 2dvd, not a low-profile (lower height) unit like the one considered in our work, so some of these problems have already been minimized. Next we discuss our experiences with wind and airflow around the instrument, to continue the study on this subject as Kruger and Krajewski suggested, and to contribute our insights.

5 Wind effects and fall velocities

Our first experience with the influence of wind on 2dvd measurements came from a problem with the VIEW_HYD software. Tropical Storm Jeanne data produced a file slightly larger than 50 MB. Sometimes, when a file was large, we saw an error that did not allow us to display the data with the 2dvd software. When this happened, the corresponding raw file was sent to the manufacturer, Joanneum Research, specifically to Dr. M. Schönhuber, to re-process the file and fix any data corruption. However, we were able to handle such data sets using the FIRM_DSD program. The data file from September 15, 2004 was found to be corrupt, so it was sent to the manufacturer as stated.

The manufacturer applies matching algorithms to these files, because problems could originate from mismatched particles from an area outside the virtual measuring area (refer to Figure 1-13). This treatment was applied to the September 15th file, which was later sent back to us. While we kept working with the original file, we obtained very good results, which agreed closely with results from the ASOS rain gauge.

When we started analyzing the new file we found discrepancies of about 50% with respect to the original file. After several communications with the manufacturer, we agreed on the conclusion that strong winds affected the measurements considerably. In a personal communication, Dr. M. Schönhuber (2005) pointed out the nature of the 2dvd as an area-related measuring device, and that FIRM_DSD used measured velocities instead of an accepted drop diameter-velocity relationship.

We noticed that when using computed instead of measured velocities, our results agreed better with rain gauge measurements. Later we confirmed this with Kruger and Krajewski [2001], who used three different methods to calculate R and found good agreement when comparing with rain gauge measurements. Moreover, taking into account the outliers on the diameter-velocity plots from VIEW_HYD for the day under study – September 15, 2004 – we were more confident in the computed velocities (see Figure 1-15). Kruger and Krajewski [2001] also discussed the outlier problem, confirming, with the manufacturer’s support, the hypothesis that outliers originate from particles crossing the light planes outside the measuring area.

The manufacturer recommended not using the computed velocity, as its derivation would be physically doubtful. He also explained how to change the virtual measuring area according to the wind component, to observe the differences in the FIRM_DSD computed R. This analysis, as well as rainfall accumulation results, was provided by Dr. Schönhuber, as he could manage raw data we were not able to analyze. After examining the wind components for September 15th we noticed a strong wind component from the northeast (camera A points north). Therefore it was reasonable to change the measuring area to the lower left quadrant.

[pic]

Figure 1-15 VIEW_HYD horizontal velocity plot for September 16, 2004.

Table 1-5 FIRM_DSD.SET parameter values and respective rainfall accumulation computation from modified data sets for September 15, 2004.

|USE_LO_A |USE_HI_A |USE_LO_B |USE_HI_B |rainfall accumulation [mm] |

|0 |700 |0 |700 |49.48 |

|0 |317 |318 |635 |65.90 |

|0 |200 |435 |635 |68.86 |

After several changes in A and B’s limits, rainfall accumulation calculations were considerably different. Results are presented in Table 1-5. Our results using computed velocity yielded 95.76 mm of accumulation for the complete day, while ASOS rain gauge measured 93.98 mm. It appears to us that applying matching algorithms is taking too many points out of the data set, resulting in significant data loss and discrepancies between 2dvd computations and other instruments.

When analyzing data for September 16, 2004, differences between computed accumulations using computed and measured velocities were not significantly different: 62.26 mm and 67.55 mm, respectively. ASOS rain gauge measurement for the same day was 60.21 mm, again demonstrating that measured falling velocities were still influenced by drop mismatching. By looking at Figure 1-15, it can be noticed that even when wind had a major easterly component it was not as strong as for the previous day. It is expected that stronger horizontal wind components will produce more mismatching and induce larger errors in calculations.

6 Reflectivity Computations

1 Computations from 2dvd DSD information

Determination of proper DSDs – or N(D) – is crucial to calculate reflectivity Z. The integral form

|[pic] |(5.1) |

is used for this purpose, and defines Z as the sixth moment of the hydrometeor diameter summed over all hydrometeors in a unit volume [Ulaby et al., 1995]. Radars, such as the National Weather Service (NWS) WSR-88D, measure Z and use rainfall retrieval algorithms to determine the amount of precipitation expected, hence the importance of an accurate determination of DSDs. Even though smaller drops are more numerous, raising D to the sixth power causes the fewer large-diameter drops to contribute more to Z. In many cases DSD models overestimate the number of big drops for tropical climates (see Figure 1-2), which explains the rain overestimation observed in some cases. This makes it all the more important to obtain detailed information about DSDs, especially for smaller-diameter drops, since less weight is given to them in this calculation; an accurate drop count will account for their appropriate contribution to Z.
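Written out, the sixth-moment definition described here (our reading of the embedded equation (5.1)) is

Z = \int_0^{D_{max}} N(D)\, D^6\, dD \qquad [\mathrm{mm^6\,m^{-3}}],

with D in mm and N(D) in mm^-1 m^-3.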

Rainfall rate R and Z have been related through

|[pic] |(5.2) |

an equation known as the Z-R relationship. The WSR-88D precipitation processing system (PPS) converts Z to R using a Z-R relationship [Vázquez, 2001]. A list of NWS Radar Operational Center (ROC) accepted Z-R relationships is presented in Table 1-1. Other relationships have been suggested by previous studies made in Puerto Rico, such as the one by Vázquez and Roche, but variations in rainfall characteristics make them suitable for some but not all cases.

In Puerto Rico, the Rosenfeld Tropical relationship, Z=250R1.2, is widely used, though lately it has been modified for certain events, following local NWS findings from their research.

With the intention of comparing and determining Z-R relationships, new Rs are calculated from 2dvd DSD information. To calculate R the following equation was used from Bringi and Chandrasekar (2001).

|[pic] |(5.3) |

This equation yields R in mm*hr-1 when D is in mm, v(D) is in m*s-1, and N(D)dD is in drops*m-3. The drop terminal velocity v(D) was found from the same work (Bringi and Chandrasekar, 2001).

|[pic] |(5.4) |

After calculating Z and R using (5.1) and (5.3), respectively, it is necessary to determine the Z-R relationships between them, fitting the data to find the coefficients a and b that optimally satisfy equation (5.2). Base-10 logarithms were taken on each side of (5.2) so that a linear fit could be used instead of a power or exponential fit. Table 1-6 presents a summary of the coefficients found for each day for different Z-R relationships. It also shows the differences caused by differences in R between using the FIRM_DSD output and calculating R with (5.3). Since FIRM_DSD uses measured fall velocities instead of (5.4), the results account for that difference.
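A minimal Python sketch of this computation and of the log-log fit is given below; the rain-rate constant 6π×10^-4 follows the standard form of (5.3) for the units stated above, and the function and variable names are ours.

import numpy as np

def z_and_r_from_dsd(D_mm, N_mm_m3, v_m_s, dD_mm):
    # Reflectivity (5.1) in mm^6 m^-3 and rain rate (5.3) in mm/hr from a binned DSD.
    Z = np.sum(N_mm_m3 * D_mm**6 * dD_mm)
    R = 6.0 * np.pi * 1e-4 * np.sum(D_mm**3 * v_m_s * N_mm_m3 * dD_mm)
    return Z, R

def fit_zr(Z_vals, R_vals):
    # Fit Z = a * R**b by linear regression in log10 space, as described above.
    b, log_a = np.polyfit(np.log10(R_vals), np.log10(Z_vals), 1)
    return 10.0 ** log_a, b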

Table 1-6 Summary of results for a and b coefficients found for Z-R relationships.

|TS Jeanne |15 Sep 2004 |16 Sep 2004 |

|Using calculated R from 2dvd-measured DSDs |Z = 228.89 R^1.24 |Z = 428.35 R^1.29 |

|Using R from FIRM_DSD |Z = 414.4 R^1.38 |Z = 523.26 R^1.32 |

|Rainfall characterization |Mostly stratiform, some convective |Mostly stratiform |

| |349 stratiform, 137 convective |156 stratiform, 1 convective |

| |Total: 608 continuous data points (10.13 |Total: 239 continuous data points (3.98 |

| |hrs). |hrs). |

|Known Z-R relationships |Rosenfeld tropical |Tropical Maritime |

| |Z = 250 R^1.2 |Z = 335 R^1.37* |

| |Convective |Thunderstorms |

| |Z = 300 R^1.4 |Z = 450 R^1.46** |

| | |*Tokay et al., 1995 |

| | |**Fujiwara, 1965 |

7 Comparisons to NEXRAD measured reflectivity

NWS provided us with reflectivity data from NEXRAD for the two days of the storm, September 15 and 16, 2004. The only data available was in 5 dBZ increments with 5-minute time resolution. It was expected that the coarse resolution of these data, compared to the data computed from the 2dvd’s DSDs, would yield significant differences. Figure 1-16 shows reflectivity vs. time plots for both days of the event, after both plots were matched in time for minimum difference.

|[pic] |[pic] |

|(a) September 15th |(b) September 16th, 2004. |

Figure 1-16 Comparison between NEXRAD measured Z (red +) and Z (blue plot) computed from 2dvd measured DSDs: (a) September 15th, and (b) September 16th, 2004.

Above the NWS premises, NEXRAD collects measurements at approximately 4,000 feet (about 1,219 m). Therefore, a time lag is expected between NEXRAD measurements and 2dvd measurements of drops falling from the same region. We calculated the difference between the Z values from both instruments and determined from these numbers the time shift needed to match both measurements, identifying the shift for which the maximum differences were lowest over the period. The time shifts required to match the plots are presented in Table 1-7 along with the average winds for each day.
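A minimal sketch of how such a best-matching shift can be found follows; the function names and the search range are ours for illustration.

import numpy as np

def best_time_shift(z_radar, z_2dvd, max_shift=10):
    # Shift the 2dvd series and keep the lag that minimizes the maximum |difference|.
    z_radar = np.asarray(z_radar, dtype=float)
    z_2dvd = np.asarray(z_2dvd, dtype=float)
    best_lag, best_cost = 0, np.inf
    for lag in range(max_shift + 1):
        shifted = z_2dvd[lag:]
        n = min(len(z_radar), len(shifted))
        cost = np.max(np.abs(z_radar[:n] - shifted[:n]))
        if cost < best_cost:
            best_lag, best_cost = lag, cost
    return best_lag, best_cost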

Table 1-7 Shifting time required to match Z plots and corresponding average horizontal wind.

|Day |Time difference to match both Zs |Average horizontal wind |

| |(minutes) |(m/s) |

|September 15, 2004 |2 |6.8 NE |

|September 16, 2004 |4 |2.7 E |

As seen from Table 1-8, where the comparison results are shown, even though fewer data points were available for comparison, the differences in Z for September 16th are lower than for the 15th. A factor that might explain these discrepancies is wind speed, which during this storm reached a maximum of 31.5 m/s on September 15th and was much calmer – 7.5 m/s maximum – on the 16th. These winds had a strong northeast component on the first day of the storm and a strong east component on the second; together with vertical winds – data not available for this study – they add to the fall time of drops, accounting for the time shift between measurements. These factors make it more difficult to match both quantities.

Even though bigger differences were expected in the results, Table 1-8 demonstrates that the measurements for this event matched quite well. The calculated percent differences are less than 1% and 3% for the 15th and 16th of September, respectively. This leads us to assess expected rain accumulation calculations for both days, to verify whether they match as well.

Table 1-8 Differences between reflectivity measured by NEXRAD (ZNx) and reflectivity computed using 2dvd-measured DSDs (Z2dvd) for the time period shown in Figure 1-16. Calculations were performed after the best match of the graphs.

|Comparison results |September 15th |September 16th |

|(dBZ) | | |

|max diff |13.8982 |11.6318 |

|min diff |0.0052 |0.0388 |

|avg ZNx |30.8945 |24.2222 |

|avg Z2dvd |30.7486 |23.6475 |

|Percent difference |0.47% |2.4% |

The uncertainties discussed above introduce inconsistencies in the measurements that could cause the 2dvd to measure drops coming from a cloud region different from the one NEXRAD is measuring. This shows in the results presented here regarding the differences in Z, although, given the coarse resolution of the NEXRAD data provided by NWS (a 5 dBZ range for each measurement), the Z comparison outcomes were better than expected. Furthermore, the time resolution of NEXRAD measurements is 5 minutes while the 2dvd’s is one minute, adding to the difficulty of comparing both.

8 Conclusions and Future Work

Fundamentally, at least two different types of DSDs are involved in Puerto Rico’s rainfall systems. This validates findings by Ulbrich, Petitdidier, and Campos (1999) that DSDs are highly variable between coastal and mountainous zones, even for a region as small as the island of Puerto Rico, which possesses a very complex orography. The importance of considering variations of Z-R relationships between ocean and land surfaces, as proposed by Ulbrich et al. (1999), is therefore preliminarily confirmed. This work contributes supplementary data analysis for tropical environments, which has been recognized as necessary.

More events will be required to validate the rainfall type differences found in this work. It was expected to find a bigger convective component in a tropical storm event, so further analysis of events with considerable amounts of precipitation is suggested. It is also recommended to use other methods to separate stratiform from convective rain; other methods use reflectivity values and/or vertical reflectivity images to identify stratiform or convective events. We have been told that events in our region are localized and therefore convective by nature. Our recommendation comes from the examination of histograms of Dm and log10(Nw), where we observed a small additional component of other values besides the mean and mode values (see Figure 1-12 and Figure 1-11).

Differences in measured and computed fall velocities caused differences in the Z-R relationships when using the FIRM_DSD rain rate output versus R calculated from measured DSDs. These differences were higher when stronger horizontal wind components were present. Our study was based on a tropical storm event, where strong winds are expected; therefore, other rain events with lower wind velocities should be considered to validate the DSD characterization. It is necessary to compare both results with rain gauge or other validated data, as well as with NEXRAD data, to decide the best path to follow when computing expected Z. It is for this reason that the 2dvd literature specifies that the instrument provides accurate data under low-wind conditions.

As future work, additional significant rainfall data from other events is available; these could be used to compare and validate these results. Data from other tropical regions can be considered and used for comparison purposes.

In terms of Z, some low-resolution data (3-minute, 5 dBZ intervals) has been obtained from the NWS local weather forecast office, and even though it will not serve for exact comparisons, it will impart confidence in our results.

Finally, existing techniques to filter errors that can be introduced by drops coming from different regions of the clouds are being analyzed. As bigger drops have different fall velocities than smaller drops, the DSD estimated on the ground may not be in accordance with what radars are measuring above, causing errors in the DSDs and therefore in the Z computations. These errors are considered observational noise. One of these techniques is the Sequential Intensity Filtering Technique (SIFT), which concentrates on the stability of the Z-R relationship during a physically uniform situation [Schönhuber, 2000]. It is recommended that SIFT, aimed at filtering out observational noise, be further investigated, applying it to the data sets considered here and to future sets as they become available from this ongoing measurement experiment.

9 References

Automated Surface Observing System (ASOS) description [Online]. Available from , 1999.

Aydin, K., and S.E. A Daisley, “Relationships Between Rainfall Rate and 35 GHz Attenuation and Differential Attenuation: Modeling the Effects of Raindrop Size Distribution, Canting, and Oscillation,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 40, No. 11, November 2002.

Bennett, S., V. Grusbisic, and R.M. Rasmussen, “Gravity Waves, Rainbands, and Deep Convection Induced by Trade Wind Flow Past Puerto Rico.” Available from .

Bringi, V.N., and V. Chandrasekar, Polarimetric Doppler Radar, First edition, 2001.

Bringi, V.N., T.D. Keenan, and V. Chandrasekar, “Correcting C-Band Radar Reflectivity and Differential Reflectivity Data for Rain Attenuation: A Self-Consistent Method With Constraints,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 39, No. 9, September 2001.

Carter, M., J.B. Elsner, and S. Bennett, “Monthly Rainfall Climatology for Puerto Rico,” Southern Topics (National Weather Service Southern Region Headquarters publication), 1997. Available from .

Collaborative Adaptive Sensing of the Atmosphere (CASA) Engineering Research Center, “Overview” [Online]. Available from , 2003.

Cruz Pol, S. L., José Maeso, Margarita Baquero “DSD characterization and computations of expected reflectivity using data from a Two-Dimensional Video Disdrometer deployed in a Tropical Environment”, IEEE IGARSS 05, Korea., 2005.

Cruz Pol, S. L., Margarita Baquero, V. Bringi and V. Chandrasekar, ”Rain-Rate Estimate Algorithm Evaluation and Rainfall Characterization in Tropical Environments Using 2DVD, Rain Gauges and TRMM data”, IEEE IGARSS 05, Korea, 2005.

Denby, B., P. Golé, and J. Tarniewicz, “Structured Neural Network Approach for Measuring Raindrop Sizes and Velocities,” IEEE Signal Processing Society Workshop on Neural Networks for Signal Processing, 1998.

Doviak, R. J.and D. S. Zrnic, Doppler Radar and Weather Observations, Second edition, Academic Press, San Diego, 1993.

Förster, J., “Rain Measurement on Buoys Using Hydrophones,” IEEE Journal of Oceanic Engineering, Vol. 19, No. 1, January 1994.

IDL information page [Online]. Available from .

Joss, J. and A. Waldvogel, “Raindrop size distribution and sampling errors,” Journal of Atmospheric Sciences, Vol. 26, pp. 566-569, 1967.

Kafando, P., and M. Petitdidier, “An attempt to calibrate the UHF strato-tropospheric radar at Arecibo using NEXRAD and disdrometer data,” MST10 Workshop, CETP/CNRS, Velizy, France, 2003.

Kruger, A., W.F. Krajewski, “Two-Dimensional Video Disdrometer: A description,” Journal of Atmospheric and Oceanic Technology, Vol. 19, pp. 602-617, 2001

Lee, GyuWon, and I. Zawadzki, “Errors in Rain Measurement by Radar due to the Variability of Drop Size Distributions,” Sixth International Symposium on Hydrological Applications of Weather Radar, Melbourne, Australia, 2-4 February 2004.

León-Colón, L.V. , S.L. Cruz-Pol, and S.M. Sekelsky, "Active Rain-Gauge Concept for Liquid Clouds using Verticaly oriented W-band and S-band Doppler radars", SPIE 9th International Symposium on Remote Sensing, Crete, Greece, 2002.

McLaughlin, D. , V. Chandrasekar, K. Droegemeier, S. J. Frasier, J. Colom, J. Kurose, and S. Cruz-Pol. "Distributed Collaborative Adaptive Sensing (DCAS) for Improved Detection, Understanding, and Prediction of Atmospheric", 2005 American Meteorological Society (85th AMS Annual Meeting), Ninth Symposium on Integrated Observing and Assimilation Systems for the Atmosphere, Oceans, and Land Surface (IOAS-AOLS), 2005.

Meneghini, R., S. W. Bidwell, R. Rincon, and G. M. Heymsfield, “Differential-frequency Doppler weather radar: Theory and experiment,” Radio Science, Vol. 38, No. 3, February 2003.

Morales J., J. Trabal, S. L. Cruz-Pol, and S. M. Sekelsky, "Ice Water Content (IWC) Retrieval from Cirrus Clouds Using Millimeter-Wave Radar and In-Situ Ice Crystal Airborne Data," IEEE IGARSS 05, Korea., 2005.

RealVNC (Virtual Network Computing) page [Online]. Available from , 2002.

Rincon, R. F., Roger Lang, Robert Meneghini, Steven Bidwell, and Ali Tokay,. “Estimation of Path-Average Rain Drop Size Distribution using the NASA/TRMM Microwave Link,” IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Vol. 3, 9-13 July, 2001.

Rincón, R.F. and R. Lang, “Microwave Link Dual-Wavelength Measurements of Path-Average Attenuation for the Estimation of Drop Size Distribution and Rainfall,” IEEE Transactions Geoscience and Remote Sensing, Vol. 40, No. 4, pp.760-770, April 2002.

Schönhuber, M., “Distrometer results obtained in various climates and their application to weather radar data inversion,” ESA SP-444 Proceedings, Millennium Conference on Antennas & Propagation, Davos, Switzerland, April 9-14, 2000.

Schönhuber, M., H.E. Urban, J.P.V. Poiares Baptista, W.L. Randeu and W. Riedler., “Weather Radar versus 2D-Video-Distrometer Data,” Proceedings of the III International Symposium on Hydrological Applications of Weather Radars, São Paulo, Brazil, August 20-23, 1995.

Schönhuber, M., H.E.Urban, J.P.V.Poiares Baptista, W.L.Randeu, and W.Riedler. “Measurements of Precipitation Characteristics by a New Distrometer,” Institute for Applied Systems Technology, JOANNEUM RESEARCH, Graz/Austria, 1994.

Simpson, J., C. Kummerow, W.-K. Tao, and R. F. Adler, “On the Tropical Rainfall Measuring Mission (TRMM),” Meteor. Atmos. Phys., vol. 60, pp. 19-36, 1996.

Takahashi, N.; H. Uyeda, Y. Asuma, S. Shimizu, Y. Fujiyoshi, S. Sato, T. Endoh, K. Takeuchi, R. Shirooka and A. Sumi, “Radar observations of tropical clouds on the Manus Island, Papua New Guinea,” Geoscience and Remote Sensing Symposium,. IGARSS '93. Better Understanding of Earth Environment., International, vol.2, pp. 615 –617, 1993

Trabal, J, Cruz-Pol, S, Colom, J.G., Sekelsky, S, “Puerto Rico Radar Network Design; Site Survey “, IGARSS 03, France, 2003.

Ulaby, F. T., R. K. Moore and A. K. Fung, Microwave Remote Sensing: Active and Passive, Vol. 1, Artech House, Norwood, 1981.

Ulbrich, C. and N.E. Miller, “Experimental Test of the Effects of Z-R Law Variations on Comparison of WSR-88D Rainfall Amounts with Surface Rain Gauge and Disdrometer Data”, Weather and Forecasting, Vol. 16, pp. 369-374, American Meteorological Society, June 2001.

Ulbrich, C. W., “Natural variations in the analytical form of the raindrop size distribution,” J. Climate Appl. Meteor., vol. 22, pp. 1764-1775, 1983.

Vázquez, M. and A. Roche, “An evaluation of WSR-88D rainfall estimates across Puerto Rico during Hurricane Debbie,” Internal Document from NWS San Juan Weather Forecast Office, Carolina, Puerto Rico.

4 Material type determination of Subsurface Large Objects or Layers Using Ground Penetrating Radar, PI: Dr. Hamed Parsiani

1 Technical Summary

This research has concentrated on soil characterization using a Ground Penetrating Radar (GPR). Soil characterization consists of obtaining estimates of soil properties, such as dielectric constant and conductivity; from these estimates the soil type and moisture content can be inferred. We approach the problem by identifying a signature that represents the soil type and its moisture content. This signature is fed to a Neural Network (NN) to obtain our results. We refer to our approach as the Material Characteristics in Frequency Domain (MCFD) method.

Soil type determination can be readily achieved if the samples used (both training and testing) are dry. When wetness is present, it is harder to determine the soil type, and a second NN is needed to obtain good results. Our laboratory experiments concerning soil type determination were done using materials that were as dry as possible. After the soil was wet, soil characterization concentrated on moisture content determination. Open field experiments also concentrated on moisture content determination since the field may be a mixture of different types of materials, thus limiting the application of our approach.

Results are presented hereafter. Three different types of experiments were conducted: laboratory experiments, open field experiments with the antenna placed on top of the soil surface, and open field experiments with the antenna being pulled by a moving vehicle at two feet above the ground. All three types of experiments produced acceptable results in terms of soil type determination (only in laboratory experiments), and soil moisture determination.

2 Laboratory Experiments

1 Description

Experiments in controlled conditions help us analyze the strengths and weaknesses of the proposed approach, as well as determine its feasibility before testing it in more challenging scenarios. Material type and soil moisture experiments were performed using three different soil types, namely sand, loam, and clay.

A reasonably large amount of soil was put in a plastic bag and placed inside a plastic container, which was then hung from a support about 5 feet above the ground, as depicted in Figure 1.

[pic]

Figure 1 Laboratory setup for data acquisition.

The experiments began with dry soil, and then water was added gradually. Every time that water was added, the soil was mixed as to obtain a homogeneous mixture. The GPR was placed on top of the soil and several scans were taken, then the actual moisture content was measured with a Theta Probe. This process was repeated until the soil was fully saturated.

To quantify the error between the actual soil moisture, measured by Theta Probe, and the soil moisture obtained using our approach, the relative error is used.

[pic]

where T.Probe is the moisture content measured by the Theta Probe, GPR is the moisture content obtained using the MCFD/NN method.
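The relative error above can be checked directly against the tabulated laboratory results (e.g., the first clay sample in Table 1); a minimal sketch, with names of our choosing:

def relative_error_pct(theta_probe_pct, gpr_pct):
    # |Theta Probe - GPR| / Theta Probe, expressed in percent.
    return abs(theta_probe_pct - gpr_pct) / theta_probe_pct * 100.0

print(round(relative_error_pct(15.10, 16.73), 2))   # 10.79, matching Table 1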

2 Results and Discussions

The method was first tested for the purpose of soil type determination. Training, validation, and testing sets were selected from the pool of measurements presented in Table 1. The validation set is required in order to see how well the model obtained by the NN fits data that was not present in the training set. Towards this goal, the training procedure shown in Figure 2 was implemented. Assume that the training has converged to the desired threshold. Present the validation set and let ε be the MSE on this set and δ the desired validation threshold. If ε ≥ δ, then a neuron is added to the hidden layer. All the weights in the network are re-initialized and training is resumed with this new configuration.

[pic]

Figure 2 Neural Network training procedure.

When ε < δ, δ is recomputed as [pic], where α is an arbitrarily chosen positive constant and 0 < α ≤ 1. The network is saved as the best so far in the training. With this training procedure it is ensured that, when training has finished, the best network found will have been saved.
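The following schematic sketch illustrates this grow-and-reinitialize loop, written with scikit-learn for concreteness; the framework choice, the threshold-update rule δ ← α·ε (the exact recomputation is embedded as an image above), and all names are our assumptions rather than the implementation used in this work.

from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

def grow_network(X_tr, y_tr, X_val, y_val, delta=1e-3, alpha=0.9, max_neurons=30):
    # Train single-hidden-layer networks of increasing size; keep the best one
    # whose validation MSE falls below the (progressively tightened) threshold.
    best_net = None
    for n in range(1, max_neurons + 1):
        net = MLPRegressor(hidden_layer_sizes=(n,), max_iter=5000)
        net.fit(X_tr, y_tr)                                   # train to convergence
        eps = mean_squared_error(y_val, net.predict(X_val))   # validation MSE
        if eps < delta:
            best_net, delta = net, alpha * eps                # save best, tighten threshold
        # otherwise: add a neuron and re-initialize on the next iteration
    return best_net                                           # may be None if no net meets the threshold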

Results of soil type determination are presented in Figure 3. As shown in the figure, there is one sample of clay and two samples each of loam and sand. Clearly, the soil type of each of the five soil samples in the testing set was determined with good accuracy. The samples used in this experiment ranged from dry to wet, so the moisture being introduced may have been a source of error. Nevertheless, the proposed approach seems effective, and its usefulness will be reinforced with laboratory experiments for soil moisture determination, and later with more challenging experiments in uncontrolled environments.

Now, we turn our attention to soil moisture determination. There were three different NNs trained in this case, one for each of the three distinct soil types. Results show good accuracy as presented in figures 4 through 6.

Figure 4 shows the results for the clay experiment. The training samples are presented in red, and the testing samples are in green. Although the amount of data is not very large, a constraint that became inevitable due to the nature of the experiments, good results were obtained. As seen in figure 4, only one of the samples fell way off the line. This may be due in part to a small training set, and in part due to the soil not being homogeneously mixed to obtain a uniform moisture distribution, which may have caused the Theta Probe to give inaccurate readings.

[pic]

Figure 3 Soil Type determination.

[pic]

Figure 4 Results of soil % moisture determination for clay soil.

Results for the loam are shown in Figure 5. We can see that in this case greater accuracy was obtained, with all testing samples falling close to the line. The loam soil is easier to manage than the clay, so the errors due to poor mixing were reduced.

3 Field Experiments

1 Description

A four-day field campaign extending over a period of six months (two mini-campaigns, two days each and six months apart) was conducted at the baseball field inside the University of Puerto Rico Mayagüez campus with the objective of obtaining open-field soil moisture information.

Soil moisture determination in true open-field conditions with GPR is a difficult task that few have tackled, because an open field is full of hardships that are beyond the control of the researcher, such as inhomogeneities; clutter from grass, rocks, and other non-soil items; field saturation due to extreme precipitation; etc. These conditions constantly change the characteristics of the field. The effect of some of these factors can be reduced with efficient signal processing techniques. Some work can be found in [7] using ground wave techniques, and more recently in [24] based on electromagnetic inversion.

The first part of the campaign dates back to February-March 2005, while the second part took place during September 2005. Field conditions underwent noticeable changes between March and September, going from no grass to an almost overgrown field that showed scattered wetness due to the continuous precipitation characteristic of the area. These changes in field conditions produced soil moisture data in the range of 6% to 37%, as measured by a Theta Probe. The exact distribution is shown in Table 2.

Due to the inhomogeneous (even at the scale of the antenna footprint) nature of the soil in an uncontrolled environment, there can be sharp dielectric contrasts within a small area. Such contrast is observed as variations in the readings obtained by the Theta Probe when probing the area under the footprint of the GPR. In order to measure a true soil moisture value to be used as the target in NN training, and to reduce the effect of soil inhomogeneity, three Theta Probe readings were obtained at each measurement point along the main diagonal of the antenna footprint. The median value was used as the target value for NN training. Several scans were taken in order to select the best available scan by visual inspection.

[pic]

Figure 5 Results of soil % moisture determination for loam soil

[pic]

Figure 6 Results of soil moisture determination for sand soil.

Figure 6 depicts the results when the soil being used was sand. Again, better results were obtained compared to Figure 4. Table 1 shows the same results in tabular form, with the relative error specified for each case.

Table 1 Relative error calculation for soil moisture laboratory experiments.

| |Theta Probe reading (%) |GPR soil moisture (%) |Relative Error (%) |

|Clay |15.10 |16.73 |10.79 |

| |18.40 |19.56 |6.30 |

| |19.10 |18.17 |4.87 |

| |23.18 |17.87 |22.91 |

|Loam |13.70 |15.38 |12.26 |

| |19.80 |20.88 |5.45 |

| |24.40 |24.21 |0.78 |

|Sand |4.60 |2.67 |41.96 |

| |13.30 |13.18 |0.90 |

| |20.30 |23.82 |17.34 |

Table 2 Distribution of Measurements

|Moisture Percent range |Measurements |

|5% - 9% |15 |

|10% - 15% |20 |

|16% - 20% |7 |

|21% - 25% |0 |

|26% - 30% |2 |

|31% - 35% |5 |

|36% - 40% |1 |

To measure the accuracy to which soil moisture is obtained, as compared to actual soil moisture, relative errors are used. The definition of the relative error given in equation 1 is modified here to take into account the variations in the Theta Probe readings at the footprint of the GPR, as described above. The relative error of the ith measurement point is given by,

[pic] 1

where i indexes the test points, T.Probei is the actual soil moisture as obtained by the Theta Probe, GPRi is the soil moisture obtained using the MCFD/NN method, and Ri is defined as the range of the variations in the Theta Probe readings,

[pic] 2

2 Results and Discussion

The data collected following the procedure given in section 1.2.1 were analyzed, and data points where the three Theta Probe readings differed by less than 3% were kept; otherwise they were rejected. The remaining data points were divided into training, validation, and testing sets. The training set consisted of the data points that showed more consistency in the Theta Probe readings.

Training of the NN was done following the procedure presented in Figure 2. Four NNs were created. The first NN was trained and tested on data taken from the first day of measurements, back in February 2005. The second NN was trained and tested on data taken from the second day (March). Combined data from the first two days was used to train and test the third NN. Finally, a fourth NN was created. Training data was taken from the first two days (February-March); this NN was tested on data taken from the September leg of the campaign, six months later. With this test, we will be able to assess the capabilities of the MCFD/NN approach to obtain accurate soil moisture information, over time.

The results are shown in figures 7 through 10, along with relative error calculations for the testing set in each of the four cases. As the results show, soil moisture was determined in an open-field, uncontrolled environment with good accuracy. The obtained accuracy can be seen from tables 3 through 6. In particular, table 6 presents the results when the NN was trained and tested on data taken six months apart. Average relative errors are presented in table 7 for each of the four NNs. The smallest relative error, on average, resulted when the NN was trained on data from the first two days of the campaign and tested on data from the last two days, a six-month difference.

An average relative error of 13.55%, which amounts to about a 2% difference in moisture content, is not a prohibitively large price to pay when determining soil moisture over large tracts of land. The proposed approach is well suited for on-line implementation, allowing the user to obtain soil moisture maps for a given field. Also, the method has an important application when data are sought for satellite validation.

Table 3 Relative error calculation for data in figure 7.

|T.Probe reading (%) |GPR moisture (%) |R (%) |Rel. Error (%) |

|6.77 |8.20 |1.09 |19.34 |

|7.40 |7.43 |1.11 |0.36 |

|14.88 |15.60 |1.11 |4.34 |

|16.0 |14.96 |1.06 |6.15 |

[pic]

Figure 7 Soil moisture determination. Day one data.

[pic]

Figure 8 Soil moisture determination. Day two data.

Table 4 Relative error calculation for data in figure 8.

|T.Probe reading (%) |GPR moisture (%) |R (%) |Rel. Error (%) |

|9.10 |8.80 |1.07 |3.09 |

|10.43 |12.78 |1.02 |22.10 |

|12.56 |14.82 |1.10 |16.37 |

|17.16 |17.10 |1.13 |0.31 |

Table 5 Relative error calculation for data in figure 9.

|T.Probe reading (%) |GPR moisture (%) |R (%) |Rel. Error (%) |
|11.6 |10.7 |1.28 |6.27 |
|17.8 |20.4 |1.18 |12.2 |
|32.5 |29.4 |1.12 |8.42 |
|17.3 |11.1 |1.26 |28.3 |
|15.7 |10.0 |1.21 |29.9 |
|33.3 |36.0 |1.08 |7.55 |
|19.9 |15.6 |1.19 |18.2 |
|14.2 |13.1 |1.19 |6.49 |
|9.40 |8.44 |1.40 |7.24 |
|7.40 |8.72 |1.11 |16.1 |
|6.70 |8.74 |1.09 |27.9 |
|9.40 |9.63 |1.32 |1.82 |
|20.2 |12.9 |1.22 |29.4 |
|14.8 |15.9 |1.11 |6.68 |
|10.1 |13.4 |1.14 |28.6 |
|15.6 |14.4 |1.17 |6.33 |
|12.6 |11.5 |1.08 |7.78 |
|9.4 |10.0 |1.33 |4.89 |
|19.6 |25.8 |1.17 |27.1 |
|10.5 |9.87 |1.02 |5.89 |

Table 6 Relative error calculation for data in figure 10

|T.Probe reading (%) |GPR moisture (%) |R (%) |Rel. Error (%) |
|11.6 |14.4 |2.70 |8.92 |
|14.2 |15.9 |1.19 |9.83 |
|9.4 |7.76 |1.40 |12.5 |
|15.0 |14.9 |1.07 |0.70 |
|14.0 |10.3 |1.03 |25.4 |
|10.8 |12.9 |1.17 |17.1 |
|11.3 |15.7 |1.13 |34.4 |
|9.80 |10.3 |1.17 |4.44 |
|9.40 |8.67 |1.11 |6.97 |
|15.5 |11.9 |1.13 |20.7 |
|10.7 |8.15 |1.12 |21.3 |
|9.90 |9.94 |1.12 |0.33 |

[pic]

Figure 9 Soil moisture determination. 4-day combined data.

[pic]

Figure 10 Soil moisture determination. Trained day 1 and day 2. Tested day 3 and day 4.

Table 7 Average Relative & Moisture Content Errors of the four NNs for the open field experiment

|Data |Average Relative Error (%) |Average Moisture Content Error (%) |
|Trained and Tested from day 1 |7.55 |0.81 |
|Trained and Tested from day 2 |10.47 |1.24 |
|Trained and Tested from all 4 days |14.35 |2.62 |
|Trained from days 1 & 2, Tested from days 3 & 4 |13.55 |1.99 |

4 Continuous Soil Moisture Determination

In this experiment, the GPR was attached to the back of a moving vehicle, as shown in figure 11. The experiment was designed to obtain validation data for the ALOS satellite. For this reason, the study area was divided into five-by-five-meter squares representing pixels in the satellite image, yielding a total of 24 pixels aligned in five rows across the field.

[pic]

Figure 11 GPR being pulled by moving vehicle.

To obtain ground truth, actual soil moisture was measured with a Theta Probe. Five measurements were taken for each pixel to account for within-pixel variability; these readings were averaged, and the average moisture was used as the target value for the MCFD/NN algorithm.

GPR measurements were taken by driving the vehicle along each row of pixels in the field. As the GPR passed through a pixel, four scans were taken. For each pixel, a "best" scan was chosen, based on its waveform characteristics, to train the NN; the remaining three scans were used for validation and testing.

Two scatter plots were produced showing the soil moisture obtained with the MCFD/NN algorithm versus the Theta Probe readings. In the first plot (figure 12), the GPR-derived soil moisture is plotted for all testing scans of each pixel. In the second plot (figure 13), the soil moisture obtained from the individual scans of each pixel was averaged; per-pixel average moisture is reported because this is the quantity conveyed by the target values with which the NN was trained. An average absolute error of 3.98% was obtained from the first plot and 3.62% from the second. The small improvement in the second case is expected: since the NN was trained on average moisture, averaging the per-scan results gives the more realistic comparison.
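
The per-pixel averaging used for figure 13 can be sketched as follows (Python, hypothetical data layout), assuming each pixel carries its averaged Theta Probe moisture and the moisture estimated from each of its testing scans.

    # Sketch of the per-pixel comparison: the NN was trained on per-pixel average
    # Theta Probe moisture, so the per-scan estimates are averaged before computing error.
    def average_absolute_error(pixels):
        """pixels: list of dicts with 'probe_avg' (%) and 'scan_estimates' (list of %)."""
        errors = []
        for px in pixels:
            gpr_avg = sum(px["scan_estimates"]) / len(px["scan_estimates"])
            errors.append(abs(gpr_avg - px["probe_avg"]))
        return sum(errors) / len(errors)

    # Hypothetical example for one 5 m x 5 m pixel:
    example = [{"probe_avg": 12.4, "scan_estimates": [11.0, 13.1, 12.0]}]
    print(average_absolute_error(example))   # absolute error in moisture percentage points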

Considering the individual errors in figure 12, the maximum deviation from the average moisture content of a pixel was 7%. Given the within-pixel moisture variability, the results can be considered accurate.

5 Summary and Conclusions

Soil characterization has been accomplished with good accuracy, as reported above. Soil type was determined for sand, loam, and clay soils under laboratory conditions. For soil moisture, laboratory experiments were performed on the same soil types, and results of open-field experiments were presented showing the effectiveness of the MCFD/NN approach. The MCFD method captured the relevant characteristics of the field under study across varying moisture conditions, allowing the NN to learn an adequate mapping for the field, as shown in figure 10. The average relative error obtained was 13.55%.

These results suggest that, once a suitable model has been obtained with the MCFD/NN approach, the model remains valid over an extended period of time, at least the six months covered by this study. This is particularly useful in agriculture, where fast, accurate soil characterization must be obtained at low cost.

[pic]

Figure 12 Continuous soil moisture determination results.

[pic]

Figure 13 Continuous soil moisture determination – average moisture per pixel

6 References

[1] G.C. Topp, J.L. Davis, and A.P. Annan “Electromagnetic determination of soil water content: Measurements in coaxial transmission lines.” Water Resour. Res., 16:574-582, 1980

[2] G.C. Topp, J.L. Davis, and A.P. Annan “Electromagnetic determination of soil water content using TDR I: Application to wetting fronts and steep gradients.” Soil Sci. Soc. Am. J., 46:672-678, 1982

[3] G.C. Topp, J.L. Davis, and A.P. Annan “Electromagnetic determination of soil water content using TDR II: Evaluation of installation and configuration of parallel transmission lines.” Soil Sci. Soc. Am. J., 46:678-684, 1982

[4] G.C. Topp and J.L. Davis, "Measurements of soil water content using time-domain reflectometry (TDR): A field evaluation." Soil Sci. Soc. Am. J., 49:19-24, 1985

[5] Dasberg, S. and F.N. Dalton, "Time domain reflectometry field measurements of soil water content and electrical conductivity." Soil Sci. Soc. Am. J., 49:293-297, 1985.

[6] Huisman, J. A., Hubbard, S. S., Redman, J. D., Annan, A. P. “Measuring Soil Water Content with Ground Penetrating Radar: A Review.” Vadose Zone Journal 2:476-491, 2003

[7] S.S. Hubbard, K. Grote, and Y. Rubin, "Mapping the volumetric water content of a California vineyard using high-frequency GPR ground wave data." Leading Edge Explor., 21:552-559, 2002

[8] S. Hubbard, K. Grote, M.B. Kowalsky, and Y. Rubin. “High Resolution estimation of near-subsurface water content using surface GPR ground wave information.”

[9] A. Chanzy, A. Tarussov, A. Judge, and F. Bonn. “Soil water content determination using a digital ground penetrating radar.” Soil Sci. Soc. Am. J., 60:1318-1326, 1996

[10] K. Grote, S. Hubbard, and Y. Rubin, "Field-scale estimation of volumetric water content using ground-penetrating radar ground wave techniques." Water Resources Research, 39, 2003

[11] Lambot, S., “Hydrogeophysical Characterization of Soil Using Ground Penetrating Radar.” PhD Thesis, Université catholique de Louvain, November 2003

[12] Freeman, J. A., Skapura, D. M, “Neural Networks: Algorithms, Applications and Programming Techniques”, Addison-Wesley Publishing Company, 1991

[13] Dalton, F.N., "Measurement of soil water content and electrical conductivity using time-domain reflectometry." International Conference on Measurement of Soil and Plant Water Status, Centennial of Utah State Univ., pp. 95-98, 1987

[14] Dasberg, S. and A. Nadler, “Field sampling of soil water content and electrical conductivity with time domain reflectometry.” International Conference on Measurement of Soil and Plant Water Status. Centennial of Utah State Univ., pp. 99-102, 1987

[15] Topp, G.C., “Electromagnetic determination of soil water content: measurements in coaxial transmission lines.” Water Resour. Res., 16:574-582, 1980.

[16] Topp, G.C., "The application of time-domain reflectometry (TDR) to soil water content measurement." International Conference on Measurement of Soil and Plant Water Status, Centennial of Utah State Univ., pp. 85-94, 1987

[17] Guy Serbin and Dani Or. “Ground-penetrating radar measurement of soil water content dynamics using a suspended horn antenna.” IEEE Trans. Geoscience and Remote Sensing, 42:1695-1705, 2003

[18] Simon Haykin. “Adaptive Filter Theory.” Prentice Hall, 2002

[19] Leon Peters Jr., Jeffrey J. Daniels, and Jonathan D. Young. “Ground-penetrating radar as a subsurface environmental sensing tool.” Proceedings of the IEEE, 82:1802-1822, 1994

[20] Hamed Parsiani, Pedro Torres, and Maritza Torres. “Material characteristics in Fourier domain (MCFD) formulation, a signature to determine soil type, moisture, and vegetation health based on multilayer ground penetrating radar reflection.” Proceedings of IASTED SIP, 2005

[21] D.J. Daniels. “Surface-Penetrating Radar.” The Institution of Electrical Engineers, 1996

[22] “SIRveyor SIR-20.” Geophysical Survey Systems, 2002

[23] Lennart Ljung. "System Identification: Theory for the User." 2nd Ed., Prentice Hall, 1998

[24] S. Lambot, L. Weihermller, I. van den Bosch, M. Vanclooster, and E.C. Slob. “Full-wave inversion of off-ground monostatic GPR signal focused on the surface reflection for identifying surface dielectric permittivity.” Proceedings of the 3rd International Workshop on Advanced Ground Penetrating Radar, 2005

5 TerraScope: Dynamic Image Retrieval and Composition Services in Distributed Information Systems by Bienvenido Vélez, Manuel Rodríguez and Pedro I. Rivera

1 Problem Statement

The emergence of large-scale distributed environments, such as the Internet, has provided access to vast collections of satellite images, hyperspectral images, GIS data, measurements and other kinds of data products essential for carrying out high-impact research in many scientific and engineering disciplines [11,17,19]. Many universities, research laboratories and government agencies provide access to these data sets using a Web-based interface, which allows the user to select data products from a pre-defined set of available data holdings [17]. However, these solutions provide limited support for novel services that:

• Efficiently integrate data from multiple data sources into a single data stream that can be used by institutions to create new data products and services.

• Enable end-users (e.g. DSP engineers) to supply custom tailored processing algorithms that can be used during the process of generating new image products.

• Respect the administrative autonomy of information providers while at the same time supporting collaboration to provide integrated access to all available data.

The TerraScope system embodies our vision of a novel Distributed Information System where multiple data sources (e.g. hyperspectral image databases) can be integrated into a coherent system to harvest the data products that each data center stores. These data products may include images, metadata, GIS features, and any other relevant type of information. In this system, new data products can be generated on demand from base products, and users can supply their own application-specific code to assist in the dynamic generation of the new data products. The system should support operations such as:

• Retrieval and visualization of images such as hyperspectral, MODIS, AVHRR and ETM+ imagery from multiple data sources. This could be extended to images obtained from different sensor modalities.

• Dynamic sub-setting (slicing) of images to create mosaics from two or more images extracted from different data sources, or generated on-the-fly by a previous query.

• Overlay of images to form composite images with different levels of details on each layer and higher information content.

• Dynamic generation of MPEG, QuickTime or Flash movies that illustrate changes in the time domain. Examples of such changes are coral reef destruction, growth of internal tumors inside live organisms, changes in ocean bottom topology, and other kinds of natural phenomena captured via different sensor modalities and subsurface sensing techniques.

2 Relevance to NASA

According to the NASA Administrator, Honorable Sean O'Keefe, NASA's mandate is: to improve life here, to extend life to there and to find life beyond. To aid in the achievement of this mission, NASA continuously maintains one of the largest collections of multi-media data under both its Space Sciences and Earth Sciences enterprises. Information is collected both internally by NASA as well as by NASA-supported research centers across the world, such as the Tropical Center for Earth and Space Studies hosted at the University of Puerto Rico Mayagüez. This information is used to reach into hidden worlds not reachable by more standard means, including deep space and the deep earth.

Such information can only be of value if it becomes accessible to the scientists and engineers who in the end can use it to carry out their research and generate new knowledge. A system like TerraScope could serve as an important vehicle for the effective sharing and dissemination of this valuable information. TerraScope’s design goals have been chosen with this objective in mind. For instance, the system follows a peer-to-peer architecture in which any information provider serves information to both clients as well as other information providers. Each TerraScope server is capable of handling queries and contacts other peer servers to gather all information relevant to these queries. This type of architecture facilitates load balancing among all available peer servers and avoids single points of failure that may render the system inaccessible and are the norm in more centralized approaches including integration servers. Furthermore, the TerraScope user interface will support dynamic data product composition potentially using user-provided data processing algorithms.

3 Benefits to Society

The TerraScope System has been designed using only standard and open protocols and off-the-shelf components available in multiple platforms. This choice provides the greatest opportunities for the wide proliferation of tools and information among data providers and users. For instance, the TerraScope Image Navigator (TIN) was completely developed using Macromedia Flash MX, a tool originally designed for interactive movie authoring. Users wishing to access TIN can do so from any web browser like Internet Explorer or Netscape Navigator and only need to install a freely available Flash player to run the application. Thus the required software tools are available on virtually all major platforms including MS Windows and Linux.

Open standards and protocols facilitate the dissemination of NASA information beyond NASA's scientific partners. Society at large can greatly benefit from having access to user-friendly tools and information collected by NASA. In particular, K-12 students can use this information to learn about the Earth and the universe in a way amenable to individual creativity and exploration that extends beyond simple navigation of a static repository of information. Students can exploit TerraScope's support for dynamic image and video subsetting and composition in order to not only find but also create the answers to their inquiries.

4 Goals of the Component (Short Term)

• Design a middleware framework to support a peer-to-peer distributed image database system

• Design a web-enabled client capable of supporting image browsing and retrieval via the Internet and using standard protocols and off-the-shelf software components.

• Disseminate research results in peer-reviewed conferences and through the web

5 Component Accomplishments

• Updated our literature review on state-of-the-art peer-to-peer application development frameworks

• Selected FreePastry as the development platform for the peer-to-peer version of the Terrascope SRE

• Developed an MS level research proposal for student Abdiel Aviles

• Began the implementation of the peer-to-peer version of the Terrascope SRE and TIN based on the FreePastry peer-to-peer application development framework.

6 Student Accomplishments

The Terrascope subcomponent of the IPEG component supported student Lizvette Malavé. Lizvette completed her MS thesis and obtained the MS degree in Computer Engineering in May of 2005. She published her work in a paper entitled "Terrascope Image Clustering: Applying Clustering Techniques to Image Agglomeration in Image Retrieval Systems" at the Third IASTED International Conference on Communications, Internet, and Information Technology (CIIT 2004), and presented the work at the conference herself.

7 Technical Summary

The emergence of multimedia technologies and the possibility of sharing and distributing geospatial satellite data through broadband computer networks have intensified the need for geographical information technologies. In a typical scenario, geospatial data is periodically collected at geographically distributed ground stations that often span several technical, administrative and even political domains. Solutions attempting to offer integrated access to distributed data by relying on centralization of data repositories are often not feasible.

Scientists around the world frequently need effective access to information necessary for their research. Many satellite data collection centers are distributed around the world and provide web access to their databases of satellite images and other geographic data. The Terrascope system, developed at the University of Puerto Rico-Mayagüez Campus, provides effective and transparent access to these inherently distributed images.

The inherently distributed nature of geospatial data should be of no concern to scientists, students or other data consumers. The information system should enable and facilitate collaboration among a set of distributed data providers who wish to present the image of a single data repository with minimal loss of their individual autonomy over their data. Based on this principle we are designing and developing the TerraScope Earth Science distributed peer-to-peer database middleware system, composed of a Search and Retrieval Engine (SRE) [6] and TIN, the graphical user interface (GUI) module.

The Terrascope system follows the client-server architecture depicted in Figure 1. The server module is called the Search and Retrieval Engine (SRE) [6] and the client module is called the Terrascope Image Navigator (TIN) [11]. The server (SRE) consists of a set of Java servlet modules running inside a web server. These modules implement the abstraction of a single data repository by communicating with multiple peer TerraScope SREs. The SRE computes the result set by forwarding queries, when necessary, to other SREs believed to hold data pertaining to the query specified by the user. Hence, each SRE acts as a server to the TIN clients and as a client of other SREs; this type of architecture is often called peer-to-peer. For more details on the SRE and TIN, the reader is referred to [6] and [11].

[pic]

Figure 1. Terrascope Architecture

The TerraScope Image Navigator (TIN) provides the end user graphical interface to access the collection of SREs. It is an image browser that provides ubiquitous and efficient access to distributed information. The TIN prototype delivers satellite images with their corresponding metadata, GIS characteristics, and other information to any web browser with a Flash MX player installed. Initially, TIN displays a map showing the overall geographical region covered by the satellite ground stations contributing data to the distributed repository (see Figure 2). The current prototype includes data collected by the Tropical Center for Earth and Space Studies (TCESS), the Center for Subsurface Sensing and Imaging Systems (CenSSIS) both at the University of Puerto Rico Mayaguez Campus and data collected by the Aster sensor from the Southern Urals Region of Russia. Using familiar GUI controls TIN users can restrict the scope of their search to a specific data repository, geographical region, type of satellite sensor (e.g. MODIS, RADARSAT and Landsat 7) and data collection date.

TIN displays polygons of the images found in the database corresponding to the user query and automatically geo-references these images within the query image (see Figure 3). The user can click on the geo representation he/she is interested in and the image will be displayed in the main window. Once an image of interest is selected and displayed, users may easily search the database for images contained within this image (sub-images) or for images overlapping it (see Figure 4). In other words, each browsed image provides a geospatial context from which future exploration may proceed; a feature that we call recursive navigation.
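
A simplified sketch of the spatial predicate behind this recursive navigation is shown below (Python; the actual SRE would issue a spatial database query rather than this in-memory filter), treating each image footprint as a longitude/latitude bounding box.

    # Simplified sketch: find images contained in, or overlapping, a selected image footprint.
    def contains(outer, inner):
        """outer/inner: (min_lon, min_lat, max_lon, max_lat) bounding boxes."""
        return (outer[0] <= inner[0] and outer[1] <= inner[1] and
                outer[2] >= inner[2] and outer[3] >= inner[3])

    def overlaps(a, b):
        return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])

    def sub_images(selected, catalog):
        return [img for img in catalog if contains(selected["bbox"], img["bbox"])]

    def overlapping_images(selected, catalog):
        return [img for img in catalog
                if img is not selected and overlaps(selected["bbox"], img["bbox"])]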

[pic]

Figure 2. TIN’s Initial Window

[pic]

Figure 3. Geo Representation of images retrieved by TIN in response to a query

Previous user studies have shown that geo-representation of images helps users visualize the location of images and provides an effective way to present this kind of data. However, when the database contains many highly overlapping images, geo-representation may instead make it harder for users to find images.

[pic]

Figure 4. Recursive Navigation Example in TIN

During the reporting period we focused our research on image clustering techniques that could be applied to ameliorate this image clobbering problem. The databases that Terrascope is intended to support contain many overlapping images because the satellites that collect them repeatedly pass over the same geospatial region, and the system must support temporal queries across all such overlapping images. As shown in Figure 5, the problem is that the user cannot access all the requested data when images overlap, and cannot request the download of images hidden behind other images. In order to access the data, the user may be forced to issue several narrower queries, and devising suitable queries to filter out unwanted information can often be a tedious and time-consuming process.

[pic]

Figure 5. Example of Image Overlapping in TIN

Another problem confronted by the user is that properties of data cannot be effectively visualized in the result set. For instance, the results in Figure 5 come from different sources and sensors, but this is not easily identified by just looking at the geo-referenced polygons.

[pic]

Figure 6. Initial Context Geographical Area

TIC (Terrascope Image Clustering) exploits image clustering to automatically classify the retrieved images into smaller groups and present them to the user in a way that facilitates browsing and finding images of interest. The central hypothesis of our work is that image clustering can improve the effectiveness of image retrieval/browsing systems with characteristics similar to TIN's. We test this hypothesis with two types of experiments: a comparison of several image clustering algorithms and a user study. Our experimental results provide evidence that TIC improves the presentation of results and the users' ability to visualize the properties of the data.

1 A Peer-to-Peer Communication Infrastructure for Terrascope

The TerraScope Earth Science database middleware system arose as a solution to the need for unified geographical information technologies and the possibility of sharing and distributing geospatial satellite data through broadband computer networks. Typically, geospatial data is periodically collected at ground stations that are usually geographically distant and span several technical, administrative and even political domains. To offer an integrated and centralized solution to access the inherently distributed spatial data is not feasible. The TerraScope system intends to provide an integrated yet distributed solution. By applying peer-to-peer techniques our system takes advantage of its distributed nature by delegating storage and computing power to a network of collaborating peers, rather than relying on a centralized scheme.

Centralization often suffers from scalability and reliability problems. The nature of peer-to-peer systems adds scalability, fault tolerance, self-organization and the potential for large amounts of storage and computing power. For our application, the peer-to-peer model fits the problem of geographically distributed spatial data. Our approach is to hide this complexity behind a system composed of the Search and Retrieval Engine (SRE) [6] peer server and the Terrascope Image Navigator (TIN), the graphical user interface (GUI) module [11].

The current implementation of the TerraScope system provides only basic functionality in terms of the project's objectives. The system has limitations in terms of scalability, load balancing and a practical peer-to-peer architecture, and the web-based graphical user interface lacks some functionality crucial to the inherently dynamic nature of the system. Addressing these limitations is therefore of great importance for the system to be of practical use.

The current approach of this research consists of various improvements to the existing system. First, a peer-to-peer communication infrastructure for the SRE servers has been developed using FreePastry as the development infrastructure; the old SREs did not implement a true peer-to-peer architecture and lacked the dynamic characteristics of practical peer-to-peer networks. Load balancing techniques will be applied, as well as hierarchy-based organization of peer nodes; for this last issue, algorithms and part of the implementation have already been completed based on Scribe, a large-scale, decentralized application-level multicast infrastructure [5] built upon Pastry. This will add scalability and improve the performance of the current system. The TerraScope web-based graphical user interface is also being improved, specifically the source discovery, presentation, and result digestion algorithms.

2 Update Literature Review

Peer-to-Peer Technologies

Peer-to-peer (P2P) technology as popularized by Napster, an immensely famous music exchange system, refers to a network architecture as old as the Internet [7]. The Internet was originally conceived as a peer-to-peer system where the goal was to share computing resources around the U.S. Peer-to-peer systems and applications are distributed systems without any centralized control or hierarchy. The software running at each node is equivalent in functionality, removing the client and server differences. Recent peer-to-peer applications include redundant storage, persistence, selection of nearby servers, anonymity, search, authentication, and hierarchical naming. It might look like a full spectrum of applications but the main functionality is the efficient location of peers and data objects. A decentralized P2P system provides the potential for robustness against faults and intentional attacks, network scalability and collaboration among peers.

Chord

Chord is a distributed lookup protocol for P2P networks whose goal is to efficiently locate the node that stores a particular data item. Its only operation is to map a given key to a node in the network [10]; the meaning of that key depends on the application using Chord. Chord uses a variant of consistent hashing to assign keys to nodes for load balancing and acts as a distributed hash function, spreading keys evenly over the nodes. Each node is assigned the subset of keys that map to it. Each node needs to be aware only of its successor to be able to route any message around the ring of keys; additional routing information is stored for efficiency. In an N-node network, each node maintains information about only O(log N) other nodes, and a lookup requires only O(log N) messages.
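
The key-to-node mapping described above can be illustrated with the following simplified sketch (Python; real Chord nodes keep only O(log N) finger-table entries and route hop by hop, whereas this toy version assumes a global view of the ring).

    import hashlib

    M = 16  # toy identifier space of 2**M positions on the ring

    def ring_id(name):
        """Hash a node address or data key onto the identifier ring."""
        return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** M)

    def successor(key_id, node_ids):
        """Chord assigns a key to the first node clockwise from the key's identifier."""
        node_ids = sorted(node_ids)
        for nid in node_ids:
            if nid >= key_id:
                return nid
        return node_ids[0]  # wrap around the ring

    # Hypothetical SRE addresses and a hypothetical data key:
    nodes = [ring_id(a) for a in ["sre1.example.org", "sre2.example.org", "sre3.example.org"]]
    print(successor(ring_id("image:PR-2005-03-17"), nodes))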

Pastry

Pastry is very similar to Chord in terms of operation; given a numerical key, a Pastry node efficiently routes the message to the node with a node-Id that is numerically closest to the key, among all currently live Pastry nodes [9]. Pastry provides a general substrate for the construction of a spectrum of peer-to-peer applications like file sharing, file storage, group communication and naming systems. In an N-node network, the expected number of routing steps is O(log N). Each Pastry node keeps track of its immediate neighbors in the ring of keys, and notifies applications of new node arrivals, node failures and node recoveries within its neighbors. Pastry takes into account locality for proximity in the underlying Internet to minimize the distance messages travel. The Pastry protocol is completely decentralized, scalable, and self-organizing.

Can

Content-Addressable Network (CAN) is a distributed infrastructure that provides hash-table-like functionality on wide-area networks such as the Internet. The CAN architecture is scalable, fault-tolerant and completely self-organizing [8]. The CAN coordinate space is divided among the nodes in the network, in a spirit similar to Chord's partitioning of its key space. New nodes join the system by splitting an existing zone allocated to another node. This process has three steps: first, the new node must find a node already in the CAN network; then, using CAN routing mechanisms, it finds a node whose zone can be split; finally, after splitting the zone, the neighbors are notified of the change.

SCRIBE – Application-level Multicast

Scribe is a large-scale, decentralized application-level multicast infrastructure [5] built upon Pastry. It provides efficient application-level multicast and is capable of scaling to large numbers of groups, multicast sources, and members per group. Scribe builds a multicast tree by joining the Pastry routes from each group member to a rendezvous point associated with the group. Scribe's membership maintenance and message dissemination have the same robustness, self-organization, locality and reliability properties as Pastry.

3 Research Objectives

The main objective of the TerraScope project is to develop a peer-to-peer earth science data middleware system to facilitate collaboration among a set of data repositories that wish to provide their geospatial data through an integrated portal. The specific objectives of the TerraScope project are as follows:

• To develop a functional server with capabilities for search and recovery of data and/or metadata from different sources.

• To develop a peer-to-peer architecture for automatic, scalable and cooperative organization and communication among servers.

• To develop a graphical user interface for image retrieval and related information from the distributed database, and to deal with the effective presentation of this kind of data to the user.

The previous implementation of the TerraScope server system had many limitations in terms of scalability, load balancing and a practical peer-to-peer architecture, and the web-based graphical user interface lacked some functionality crucial to the inherently dynamic nature of the system. These issues are being addressed through the following objectives.

• Development of a scalable and self organized peer-to-peer communication architecture for the TerraScope's SRE.

• Improvement and development of a peer-to-peer communication infrastructure for the current implementation of the SRE servers.

• Design hierarchy organization of peer nodes based on real-time spatial and load information.

• Apply load balancing techniques as well as hierarchy based organization of peer nodes for scalability and performance improvement of the actual system.

• Improve TerraScope's web-based graphical user interface sources discovery and presentation algorithm.

• Improve TerraScope's web-based graphical user interface results digestion and presentation algorithm.

• To conduct a series of analysis on the system’s security, performance, scalability and stability/load capabilities.

4 Methodology

A simplified TerraScope architecture is depicted in Figure 1. The TIN, or client component, is the user interface to the system. It is implemented using Macromedia Flash technologies and communicates via HTTP requests with the SRE components, which are peer servers implemented using Java servlets. The goals of this research include modifications to internal algorithms and added functionality in the client component, as well as the design and reimplementation of the peer-to-peer infrastructure and query distribution scheme used by the SREs.

Terrascope Image Navigator

The TerraScope Image Navigator was developed using Macromedia Flash, XML and ActionScript as the development platform [11]. This image navigator presents the retrieved satellite images with their corresponding metadata, GIS characteristics, and other information to any web browser with a compatible Flash player installed.

Figure 2 shows the main screen of TIN. It displays a selectable and zoomable world map, as well as familiar GUI controls with which users can restrict the scope of their search to a specific data repository, geographical region, type of satellite sensor and data collection date. Part of this research focuses on the method used to populate the sources and sensors lists. In the original implementation these lists were hard-coded in the client; they are being replaced with an algorithm that retrieves the current sources and sensors from the SRE network and keeps a local repository of the gathered data for future connections.

As shown in Figure 3, TIN delivers a geo-representation of the images found in the database matching the selected parameters. A side panel shows all the results received from the SRE network, which can be grouped by parameters for a cleaner view. Users may easily search the database for images overlapping the selected image.

[pic]

Figure 1: TerraScope Architecture

[pic]

Figure 2: TIN default screen

TIN client queries are resolved by a single SRE, but in order to return the most complete set of results, the SRE must distribute the query to other SREs in the peer-to-peer network. This is the other aspect addressed by this research.

Peer-to-peer Search and Retrieval Engine (SRE)

The TerraScope SRE is the main server and search engine. It is composed of three servlets that interact to process each query or request sent by the client. Each servlet performs a different operation. Refer to [6] for more detail about the internal architecture and operation of the SRE.

The main task of this research is to design and implement a practical peer-to-peer communication scheme and infrastructure for the TerraScope SRE. It must provide many of the characteristics of traditional peer-to-peer systems, such as scalability, fault tolerance and self-organization. Various peer-to-peer algorithms that meet these requirements already exist, including Chord, Pastry, CAN and Gnutella. Since this research is not focused on developing a new generic peer-to-peer algorithm, we focus on selecting the most appropriate one and building the project-specific functionality at the application level.

There are various tools available for peer-to-peer development. We examined several of them in terms of stability, documentation and algorithm implementation, and our best fit was Rice University's FreePastry implementation of the Pastry algorithm. There are currently two Pastry implementations, FreePastry from Rice University and SimPastry/VisPastry from Microsoft Research [2,4]. FreePastry is an open-source Java implementation of the Pastry algorithm, while SimPastry is developed on the .NET platform using C#. FreePastry is stable, well documented and well supported; it is under active development and, at the time of this research, version 2.0 had been released. All of this was considered when selecting it over other tools, such as OpenChord, an open-source implementation of the Chord algorithm [3,4].

[pic]

Query Distribution and Load Balancing

Previously, the SRE used a simple scheme for query distribution among known peers: the client request was forwarded to a fixed list of known sources, and each response was encapsulated into the response sent back to the client. Each peer SRE in turn did exactly the same thing until repeated requests were encountered, with the initial SRE acting as the client for subsequent peers. A smarter distribution algorithm is being designed and developed based on the availability, load and potential results of peer SREs; with the right status information about neighboring peers, an SRE can decide whom to ask, and for what, to obtain better results and performance. Load balancing is another issue being addressed; it will work at both the client and server level, and an algorithm is being developed to help minimize the risk of overloading an SRE.
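
The original forwarding scheme can be sketched as follows (Python; a hypothetical toy structure, since the actual SREs are Java servlets exchanging HTTP requests), with a query identifier used to stop forwarding when a repeated request is encountered.

    # Toy model of the original query-distribution scheme: each SRE forwards a query
    # to its fixed list of known peers and stops when it sees a repeated request id.
    class ToySRE:
        def __init__(self, name, local_results):
            self.name = name
            self.peers = []            # fixed list of known peer SREs
            self.local = local_results
            self.seen = set()          # query ids already handled

        def handle_query(self, query_id, query):
            if query_id in self.seen:  # repeated request: do not process it again
                return []
            self.seen.add(query_id)
            results = [r for r in self.local if query in r]
            for peer in self.peers:    # forward to every known peer and merge responses
                results.extend(peer.handle_query(query_id, query))
            return results

    a = ToySRE("SRE-A", ["modis puerto rico 2005"])
    b = ToySRE("SRE-B", ["aster urals 2004"])
    a.peers, b.peers = [b], [a]        # peers forward back, but duplicates are suppressed
    print(a.handle_query("q1", "2005"))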

Implementation Status

This section presents the current status of the Terrascope project and details the advances made throughout this year of research. Most of the past semesters were spent researching peer-to-peer technologies and on the initial design and development of the TerraScope peer-to-peer infrastructure. During the previous semester, research focused on development, integration and testing. Various changes were needed in the TIN client, the SRE server and their communication protocol.

Several changes were made to the internal logic of the TIN client. The most important change is the way TIN loads the available sources in the network. The original implementation presented a static list of sources and sensors; this has been changed to load the lists dynamically. The client has a local file with possible live sources and their addresses, and this list is iterated until one of them answers with a current list of sources and sensors, which is then shown in the client. Other minor changes were made to fix bugs and restructure the code.
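
A sketch of this start-up logic is given below (Python; hypothetical file name and fetch function, since the real client is written in ActionScript/Flash and talks to the SREs over HTTP).

    # Sketch of TIN's dynamic source discovery: iterate over candidate SRE addresses
    # from a local file until one returns a current list of sources and sensors.
    def load_candidates(path="known_sres.txt"):
        with open(path) as f:
            return [line.strip() for line in f if line.strip()]

    def discover_sources(candidates, fetch):
        """fetch(addr) is assumed to return a (sources, sensors) tuple or raise on failure."""
        for addr in candidates:
            try:
                return fetch(addr)       # the first live SRE provides the fresh lists
            except OSError:
                continue                 # dead or unreachable source, try the next one
        return [], []                    # no live sources found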

The SREs were heavily modified. The most important change concerns the peer-to-peer infrastructure used by the SREs for query distribution. Originally there was no presence, dynamic discovery or aggregation of peers into the TerraScope network: the known peers of each SRE were listed in a local file, which was simply iterated over to distribute queries. We used the FreePastry API to build a real peer-to-peer network with dynamic discovery, aggregation to the network and presence in the network. It works in the following way. As the server starts up, a FreePastry node is created. This node reads a local file containing a list of potential bootstrap (live) nodes in the network. Once a live node is found, it is used to enter the network and a nodeId, or unique key, is assigned to the current SRE. If no live nodes are found, which means that there may be no network yet, a new network is created, in this case a "network" of one node. After joining, the current node broadcasts information about itself to nearby nodes and also sends a request for peer information. Nearby nodes return an XML file containing important information such as status, addresses and load. These kinds of messages are routed using FreePastry routing methods. Because of the dynamic nature of the network, messages can arrive at any moment or may not arrive at all. Every time a message containing peer information arrives, the local repository of known peers is refreshed with the new data. Once on the TerraScope network, SREs can send and receive messages to and from other nodes. Currently this messaging channel is used only for messages related to the peer-to-peer network itself; HTTP requests remain the messaging channel for query distribution and response, although it is also possible to distribute queries through the FreePastry API, which uses serialization.
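
The join-or-create behaviour described above is sketched below in simplified form (Python pseudostructure with a hypothetical probe function; the actual implementation uses the FreePastry Java API, whose calls are not reproduced here).

    # Simplified model of SRE start-up: try to join an existing TerraScope network
    # through any live bootstrap node listed locally; otherwise start a one-node network.
    import random

    def start_node(bootstrap_candidates, probe):
        """probe(addr) is assumed to return True when a bootstrap node is alive."""
        node_id = random.getrandbits(128)          # stand-in for the assigned nodeId
        for addr in bootstrap_candidates:
            if probe(addr):
                return {"node_id": node_id, "joined_via": addr, "new_network": False}
        # No live node found: assume no network exists yet and create a one-node network.
        return {"node_id": node_id, "joined_via": None, "new_network": True}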

Current and future work will include the remaining objectives of this research. A smarter algorithm for query distribution is being developed. It will be based on real time information shared among peer nodes about status, load and availability of geospatial data. Also load balancing techniques will be applied to improve performance and avoid the possibility of overloading a server.

The Terrascope project will continue to be supported by other grants through the 2006-2007 academic year.

8 References

[1] Chord web site.

[2] FreePastry web site.

[3] OpenChord web site.

[4] SimPastry/VisPastry web site.

[5] M. Castro, P. Druschel, A. Kermarrec, and A. Rowstron. Scribe: A large-scale and decentralized application-level multicast infrastructure. IEEE Journal on Selected Areas in communications (JSAC), 20(8):1489–1499, 2002.

[6] Enna Z. Coronado-Pacora. SRE: Search and Retrieval Engine of the TerraScope database middleware system. Master's thesis, University of Puerto Rico, Mayagüez Campus, 2003.

[7] Andy Oram, editor. Peer-To-Peer: Harnessing the Power of Disruptive Technologies. O'Reilly, 2001.

[8] Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard Karp, and Scott Schenker. A scalable content-addressable network. In SIGCOMM ’01: Proceedings of the 2001 conference on Applications, technologies, architectures, and protocols for computer communications, pages 161–172, New York, NY, USA, 2001. ACM Press.

[9] Antony Rowstron and Peter Druschel. Pastry: Scalable, distributed object location and routing for large-scale peer-to-peer systems. In IFIP/ACM International Conference on Distributed Systems Platforms (Middleware), pages 329–350, nov 2001.

[10] Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, and Hari Balakrishnan. Chord: A scalable peer-to-peer lookup service for internet applications. In SIGCOMM '01: Proceedings of the 2001 conference on Applications, technologies, architectures, and protocols for computer communications, pages 149–160, New York, NY, USA, 2001. ACM Press.

[11] Bienvenido Velez, Amaury Cabarcas, and Lizvette Malave. TIN: An interactive image navigator providing ubiquitous access to distributed geo-spatial data. In ITCC '04: Proceedings of the International Conference on Information Technology: Coding and Computing, pages 337–343, 2004.

3 Publications

1 Refereed Publications in Journals and Book Chapters (students underlined):

|Authors |Title |Name of Publication |
|Luis O. Jiménez-Rodríguez, Emmanuel Arzuaga-Cruz, and Miguel Vélez-Reyes |Unsupervised Feature Extraction Techniques for Hyperspectral Data and its Effects on Supervised and Unsupervised Classification |IEEE Transactions on Geoscience and Remote Sensing, Volume 45, Issue 2, Feb. 2007, pages 469–483. |
|Julio M. Duarte C., Paul Castillo and Miguel Velez-Reyes |Comparative Study of Semi-implicit Schemes for Nonlinear Diffusion in Hyperspectral Imagery |To appear in IEEE Transactions on Image Processing, 2007. |
|M. Vélez-Reyes, W. Rivera-Gallego, and L.O. Jiménez-Rodríguez |A Solutionware for Hyperspectral Image Processing and Analysis |To appear in A.J. Plaza and C.I. Chang, editors, High-Performance Computing in Remote Sensing, Chapman & Hall/CRC Press. |

2 Publications in Conference Proceedings:

|Authors |Title |Name of Publication |
|Carlos Benitez and Shawn Hunt |"Determining the Need for Dither when Re-Quantizing a 1-D Signal" |Proceedings of the AES 121st Convention, 2006. |
|J.A. Goodman, M. Vélez-Reyes, S. Hunt, and R. Armstrong |"Development of a field test environment for the validation of coastal remote sensing algorithms: Enrique Reef, Puerto Rico" |Proceedings of SPIE: Remote Sensing of the Ocean, Sea Ice, and Large Water Regions 2006, Vol. 6360, Oct. 6, 2006. |
|M. Vélez-Reyes, J.A. Goodman, A. Castrodad-Carrau, L.O. Jiménez-Rodriguez, S.D. Hunt, and Roy Armstrong |"Benthic habitat mapping using hyperspectral remote sensing" |Proceedings of SPIE: Remote Sensing of the Ocean, Sea Ice, and Large Water Regions 2006, Vol. 6360, Oct. 6, 2006. |
|Y.M. Masalmah and M. Vélez-Reyes |"Unsupervised Unmixing of Hyperspectral Imagery" |In Proceedings of the IEEE Midwest Symposium on Circuits and Systems, San Juan, Puerto Rico, 2006. |
|Y.M. Masalmah and M. Vélez-Reyes |"Unsupervised unmixing of hyperspectral imagery using the constrained positive matrix factorization" |In Proceedings of SPIE: Independent Component Analyses, Wavelets, Unsupervised Smart Sensors, and Neural Networks IV, Vol. 6247, April 2006. |
|J.M. Duarte-Carvajalino, M. Vélez-Reyes, and P. Castillo |"Scale-space in hyperspectral image analysis" |In Proceedings of SPIE: Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery XII, Vol. 6233, May 2006. |
|A. Umaña-Díaz and M. Vélez-Reyes |"Restoration of hyperspectral imagery" |In Proceedings of SPIE: Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery XII, Vol. 6233, May 2006. |
|V. Ortiz-Rivera, M. Velez-Reyes, and B. Roysam |"Change Detection in Hyperspectral Imagery using Temporal Principal Components" |In Proceedings of SPIE: Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery XII, Vol. 6233, May 2006. |
|Y.M. Masalmah and M. Vélez-Reyes |"A comparison of algorithms to compute the positive matrix factorization and their application to unsupervised unmixing" |In Proceedings of SPIE: Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery XII, Vol. 6233, May 2006. |
|A. Castrodad-Carrau, M. Vélez-Reyes, and J.A. Goodman |"An algorithm to retrieve coastal water optical properties, bathymetry, and bottom albedo from hyperspectral imagery" |In Proceedings of SPIE: Photonics for Port and Harbor Security II, Vol. 6204, May 2006. |
|V. Manian, L.O. Jimenez and M. Velez-Reyes |"A comparison of statistical and multiresolution texture features for improving hyperspectral image classification" |In Proceedings of SPIE: Image and Signal Processing for Remote Sensing XI, Vol. 5982, September 2005. |
|Hamed Parsiani, Enrico Mattei, Allen Lizarraga, Mairim Ramos |"Soil Moisture Determination based on MCFD/NN-GUI Algorithm using Wideband Radar Images of Land" |Proceedings of IASTED SIP 2005, Aug. 15-17, 2005, Hawaii. |
|Cruz Pol, S. L., José Maeso, Margarita Baquero |"DSD characterization and computations of expected reflectivity using data from a Two-Dimensional Video Disdrometer deployed in a Tropical Environment" |IEEE IGARSS 05, Korea, 2005. |
|Cruz Pol, S. L., Margarita Baquero, V. Bringi and V. Chandrasekar |"Rain-Rate Estimate Algorithm Evaluation and Rainfall Characterization in Tropical Environments Using 2DVD, Rain Gauges and TRMM data" |IEEE IGARSS 05, Korea, 2005. |
|Morales J., J. Trabal, S. L. Cruz-Pol, and S. M. Sekelsky |"Ice Water Content (IWC) Retrieval from Cirrus Clouds Using Millimeter-Wave Radar and In-Situ Ice Crystal Airborne Data" |IEEE IGARSS 05, Korea, 2005. |

3 Presentations (not in conference proceedings)

|Presenters (main author first; student names underlined) |Title |Date of presentation |Conference or symposium |Location |
|M. Vélez-Reyes, A. Castrodad, and J. Goodman |"Combined Unmixing of Benthic Bottom and Retrieval of Water Column Properties and Bathymetry for Aquatic Subsurface Remote Sensing" |May 2006 |International Symposium on Spectral Sensing Research |Bar Harbor, Maine |
|M. Velez-Reyes and V. Manian |Support Vector Classification of Land Cover and Benthic Habitat Classification from Hyperspectral Images |May 2006 |International Symposium on Spectral Sensing Research |Bar Harbor, Maine |
|M. Vélez-Reyes and J.M. Duarte Carvajalino |Scale-space in Hyperspectral Image Analysis |May 22-26, 2006 |Institute for Mathematics and its Applications |University of Minnesota, MN |
|M. Vélez-Reyes |Hyperspectral Image Processing |December 5-9, 2005 |Institute for Mathematics and its Applications |University of Minnesota, MN |

4 New Grants

|Principal Investigator |Title |Agency |Amount & Duration |
|M. Vélez-Reyes and V. Manian |A Geometrical Approach for the Analysis of Hyperspectral Imagery |NGA 2006 Historically Black College and University – Minority Institution Research Initiatives |Amount: $140k; Duration: August 1, 2006 to July 31, 2008 |
|M. Vélez-Reyes and V. Manian |Improving Algorithms for Target Detection in Hyperspectral Infrared Imagery |FY 2005 DoD Instrumentation and Research Support Program for Hispanic-Serving Institutions |Amount: $485k; Duration: November 15, 2005 to November 14, 2008 |

5 Students Supported

1 Graduate Students (fully or partially) Supported by TCESS

|Student Name |Project Title |Advisor |Graduation Date |Degree |If graduated, Plans |
|Enrico Mattei |Material type determination of Subsurface Large Objects or Layers Using Ground Penetrating Radar |Hamed Parsiani |July 2006 |MSEE |Industry, Northrop Grumman |
|Christian Nieves |Development of a Website for the SeaBED TestBED Facility |Miguel Velez-Reyes |May 2006 |MSCpE |Industry, IBM |
|Yahya Masalmah |Unsupervised unmixing of hyperspectral imagery |Miguel Velez-Reyes |May 2007 |Ph.D. | |

2 Graduate Students in Associated Research (use TCESS Facilities)

|Student Name |Project Title |Advisor |Funding Source |Graduation Date |Degree |If graduated, Plans |
|Julio M. Duarte |Object-Based Segmentation in Hyperspectral Imagery |Miguel Velez-Reyes |NSF CenSSIS |December 2007 |Ph.D. CISE | |
|Alejandra Umaña |Restoration and Resolution Enhancement in Hyperspectral Imagery |Miguel Velez-Reyes |NSF CenSSIS |May 2008 |Ph.D. CISE | |
|Alexey Castrodad-Carrau |Retrieval of Bottom Properties in Shallow Waters from Hyperspectral Imagery |Miguel Velez-Reyes |NSF CenSSIS |December 2005 |MSEE |Government, National Geospatial Agency |

3 Undergraduate students (partially or fully) supported by TCESS

|Student Name |Project Title |Advisor |Graduation Date |Degree |If graduated, plans |
|Allen Lizarraga |Material type determination of Subsurface Large Objects or Layers Using Ground Penetrating Radar |Hamed Parsiani |2007 |BSEE | |

4 Undergraduate Students in Associated Research (use TCESS Facilities)

|Student Name |Project Title |Advisor |Funding Source |Graduation Date |Degree |If graduated, plans |
|Enid M. Alvira |Comparison of Unmixing Algorithms for Hyperspectral Image Processing |Miguel Vélez-Reyes |NSF |May 2008 |BS | |
|Mairim Ramos |Material type determination of Subsurface Large Objects or Layers Using Ground Penetrating Radar |Hamed Parsiani |NOAA |Dec. 2007 |BS |Industry, Northrop Grumman |

6 Post-doctoral Fellows

|Fellow Name |Project Title |Advisor |
|James Goodman |Remote Sensing of Benthic Habitats using Hyperspectral Imaging |Miguel Vélez-Reyes (funded by CenSSIS) |

7 Service Activities

1 NASA

• Dr. Vélez-Reyes served in a NASA Proposal Review Panel in 2006.

2 Not NASA

• Dr. Sandra Cruz-Pol was a reviewer for the IEEE Transactions on Geoscience and Remote Sensing.

• Dr. Sandra Cruz is Associate Editor of the Newsletter of the IEEE Geoscience and Remote Sensing Society.

• Dr. Miguel Vélez-Reyes was a reviewer for the IEEE Transactions on Geoscience and Remote Sensing.

• Dr. Miguel Vélez-Reyes was a member of the program committee for Multispectral, Hyperspectral, and Ultraspectral Imagery XI, part of the SPIE Defense and Security Symposium held in Orlando, FL in April 2004.

• Dr. Miguel Vélez-Reyes was co-Chair of the SPIE Remote Sensing of the Ocean, Sea Ice, and Large Water Regions 2006, 11-16 September 2006, Stockholm, Sweden.

• Dr. Miguel Vélez-Reyes was inducted into the Academy of Arts and Sciences of Puerto Rico in November 2005.

-----------------------

[1] 2005 Mars Reconnaissance Orbiter mission

-----------------------

[pic]

Figure 3: TIN results screen

