A comparative study of machine learning methods for lung diseases diagnosis by computerized digital imaging*

Suk Ho Kang**, Youngjoo Lee***

Abstract
I. Introduction
II. Material and Methods
III. Results
IV. New Work to be Presented
V. Conclusion

Abstract

In this study, we tested and compared several state-of-the-art machine-learning methods for automated classification of obstructive lung diseases based on features from texture analysis of HRCT (High Resolution Computerized Tomography) images. HRCT can provide accurate images for the detection of various obstructive lung diseases, including centrilobular emphysema, panlobular emphysema and constrictive bronchiolitis. Features on the HRCT images, however, can be subtle, particularly in the early stages of disease, and image-based diagnosis is subject to inter-observer variation. In order to support clinical diagnosis and improve its accuracy, three different types of automated classification systems were developed and compared based on classification performance and clinical applicability. Not only a Bayesian classifier, a typical statistical method, but also an ANN (Artificial Neural Network) and an SVM (Support Vector Machine) were employed. We tested these three classifiers for the differentiation of normal lung and three types of obstructive lung diseases. The ANN showed the best performance, with 86.0% overall sensitivity, and differed significantly from the other classifiers (one-way ANOVA, p<0.01). In the discussion, we address which characteristics of each classifier account for the differences in performance and which classifier is more suitable for clinical applications, and we propose an appropriate way to choose the best classifier and determine its optimal parameters to discriminate the diseases better. This result can be applied to classifiers for the differentiation of other diseases.

* This work was supported by the Second Stage of Brain Korea 21 Project in 2007.

** Department of Industrial Engineering, Seoul National University. Email: shkang@snu.ac.kr

*** Department of Industrial Engineering, Seoul National University. Email: spicio3Bsnu.ac.kr

I. Introduction

HRCT (High Resolution Computerized Tomography) can provide accurate images for the detection of various obstructive lung diseases, including centrilobular emphysema, panlobular emphysema and bronchiolitis obliterans. Features on the thin-section HRCT images can be subtle, however, particularly during early stages of disease, and diagnosis is subject to inter-observer variation. The main characteristic of images used to detect obstructive lung diseases is the presence of areas of abnormally low attenuation in the lung parenchyma, which, in the case of emphysematous destruction of the lung parenchyma, can be detected automatically by means of attenuation thresholding. However, areas of decreased attenuation are also a feature of other obstructive lung diseases.1 Thus, accurate differentiation among these diseases may be difficult, even for expert thoracic radiologists.

Efforts have been made to develop computerized methods that could assist radiologists in improving diagnostic accuracy by differentiating among obstructive lung diseases. Using a computer-aided diagnosis (CAD) scheme, radiologists could incorporate the output from the CAD into their decisions.2 One automated computational scheme developed to classify obstructive lung diseases more accurately than radiologists made use of a naive Bayesian classifier, which was trained to predict the likelihood of obstructive lung diseases based on quantitative texture features automatically extracted from the ROI (region of interest) of HRCT images.3 Another classification system employs a Bayesian classifier and an SVM (Support Vector Machine) to assess 3D texture features of lung parenchyma with abnormally low attenuation areas (LAA).4

Several classification systems have been employed in medical CAD systems, and the selection of a classification scheme appropriate to the characteristics of the data has been shown to be important for improving performance. In addition, modification or optimization of parameters for feature extraction, including the size of the ROI, is important. The motivation of this paper is to achieve a better understanding of the machine classification process for differentiating obstructive lung diseases, to evaluate the classification in terms of sensitivity and specificity, and to analyze the strengths and weaknesses of well-known classifiers for clinical application.

II. Material and Methods

The images were selected from HRCT obtained in 17 healthy subjects (n=66), 26 patients with bronchiolitis obliterans (n=69), 28 patients with mild centrilobular emphysema (n=64), and 21 patients with panlobular emphysema or severe centrilobular emphysema (n=62).

To avoid redundancy among images, each of the 265 ROIs was the only one selected from its half lung. All patients were recruited at the Department of Radiology, Asan Medical Center. The scanned field of view covered the whole chest.

Automated segmentation of the lung was performed. The major pulmonary vessels of the lung parenchyma were removed by simple thresholding, keeping only pixels below -400 Hounsfield units (HU). This step is important in that macroscopic structures, such as the major pulmonary vessels and chest wall, are of a size approaching that of the ROI used, and their statistically significant features cannot be obtained by textural extractors. Automated anatomic segmentation allowed texture analysis of the finest structures of the lung parenchyma, and each feature value of texture analysis was normalized relative to the clipped pixels) for categorization into parenchymal ROI area.
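The clipping step can be illustrated with a short sketch. It assumes the ROI is a small 2D list of Hounsfield values and that parenchymal pixels are those from -1024 HU up to, but not including, -400 HU; the function names `parenchyma_mask` and `kept_fraction` and the sample values are hypothetical and are not taken from the study.

```python
def parenchyma_mask(roi_hu, low=-1024, high=-400):
    """Return a boolean mask: True where a pixel is kept as lung parenchyma.

    Assumption: pixels with low <= HU < high are kept; everything else
    (vessels, chest wall) is clipped.
    """
    return [[low <= v < high for v in row] for row in roi_hu]


def kept_fraction(mask):
    """Fraction of ROI pixels that survive clipping."""
    total = sum(len(row) for row in mask)
    kept = sum(sum(row) for row in mask)
    return kept / total


# Hypothetical 3x3 patch in HU: parenchyma-like and vessel-like values.
roi = [
    [-950, -870, -300],
    [-820, -60, -910],
    [-880, -450, 35],
]
mask = parenchyma_mask(roi)
print(kept_fraction(mask))
```

A fraction such as `kept_fraction` is one plausible quantity against which feature values could be normalized, though the exact normalization used in the study is not recoverable from the text.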

(a) Panlobular emphysema or severe centrilobular emphysema

(b) Mild centrilobular emphysema

(c) Bronchiolitis obliterans

(d) Normal lung parenchyma

(Figure 1) Cross-sectional thin-section CT scans of the chest (window level, 850 HU; width, 400 HU). On each image, rectangles of three different sizes (16x16, 32x32, and 64x64) highlight a region of interest (ROI) that is typical of a particular condition.

For each image, two experienced thoracic radiologists each selected three sizes of rectangular ROI (16x16, 32x32, and 64x64) typical of one of the three diseases or of normal lung tissue. Areas with HU between -400 and -1024 were segmented in order to clip the major pulmonary vessels and chest wall. Since the bin size could influence the performance of the classifier6, we tested various bin sizes (Q-bin size 16, 32, 64, 128, 144, 196, and 256) for run-length encoding and the co-occurrence matrix. Overall sensitivities of the system, using each combination of ROI size and bin size, were calculated and compared.
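The comparison across ROI and bin sizes can be sketched as a small grid evaluation. The text does not define "overall sensitivity", so this sketch assumes it means the number of correctly classified ROIs divided by the total number of ROIs across the four classes. The confusion matrices are made-up placeholders, not results from the study, and the function names are hypothetical.

```python
def overall_sensitivity(confusion):
    """Correctly classified ROIs divided by all ROIs.

    Rows are true classes, columns are predicted classes.
    Assumption: this is what the text calls "overall sensitivity".
    """
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


def per_class_sensitivity(confusion):
    """Sensitivity (recall) for each true class."""
    return [row[i] / sum(row) for i, row in enumerate(confusion)]


# Made-up confusion matrices keyed by (ROI size, bin size).
results = {
    (32, 64): [[50, 6, 5, 5], [7, 48, 9, 5], [4, 8, 47, 5], [3, 4, 6, 49]],
    (64, 64): [[58, 3, 3, 2], [4, 57, 5, 3], [2, 5, 54, 3], [1, 2, 4, 55]],
}
scores = {key: overall_sensitivity(cm) for key, cm in results.items()}
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

Reporting `per_class_sensitivity` alongside the overall figure helps show whether a configuration gains on one disease at the expense of another.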

(Table I) Summary of 13 textural features that represent each ROI

Descriptor             Dimension
Histogram              Mean, S.D., Skewness, Kurtosis
Gradient               Mean, S.D.
Run-length matrix      Short primitive emphasis (SPE), Long primitive emphasis (LPE)
Co-occurrence matrix   Angular second moment (ASM), Contrast, Correlation, Inverse difference moment (IDM), Entropy
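To make the co-occurrence descriptors concrete, the following sketch quantizes an ROI into Q gray levels and computes the five co-occurrence features named in Table I. It rests on several assumptions not stated in the text: a single horizontal neighbour offset, a symmetric normalized matrix, quantization over the range -1024 to -400 HU, and the usual Haralick-style definitions. All function names are hypothetical.

```python
import math


def quantize(roi_hu, n_bins, low=-1024, high=-400):
    """Map HU values in [low, high] to integer gray levels 0 .. n_bins-1."""
    width = (high - low) / n_bins
    return [[min(n_bins - 1, max(0, int((v - low) / width))) for v in row]
            for row in roi_hu]


def cooccurrence(levels, n_bins, dr=0, dc=1):
    """Symmetric, normalized co-occurrence matrix for offset (dr, dc)."""
    counts = [[0.0] * n_bins for _ in range(n_bins)]
    rows, cols = len(levels), len(levels[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = levels[r][c], levels[r2][c2]
                counts[a][b] += 1
                counts[b][a] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[x / total for x in row] for row in counts]


def glcm_features(p):
    """ASM, contrast, correlation, IDM, and entropy of a normalized matrix."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    # The matrix is symmetric, so row and column marginals share mean/variance.
    mu = sum(i * v for i, _, v in cells)
    var = sum((i - mu) ** 2 * v for i, _, v in cells)
    cov = sum((i - mu) * (j - mu) * v for i, j, v in cells)
    return {
        "asm": sum(v * v for _, _, v in cells),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "correlation": cov / var if var > 0 else 0.0,
        "idm": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "entropy": -sum(v * math.log(v) for _, _, v in cells if v > 0),
    }


# Tiny made-up ROI with two gray levels, for readability only.
roi = [[-1000, -1000, -500, -500],
       [-1000, -1000, -500, -500]]
features = glcm_features(cooccurrence(quantize(roi, 2), 2))
print(features)
```

The example uses two gray levels so the matrix can be checked by hand; the bin sizes tested in the study (16 to 256) would be passed as `n_bins` in the same way.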

The machine learning methods we considered were the Bayesian classifier, the artificial neural network7,8 (ANN), and the support vector machine9 (SVM). The features employed in this study are listed in Table I.
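As an illustration of the simplest of the three classifier families, a Gaussian naive Bayes rule can be written in a few lines. The text does not specify which form of Bayesian classifier was used in this study, so the Gaussian class-conditional densities and the feature-independence assumption here are illustrative choices only. The two-feature training vectors and the function names are hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(samples, labels):
    """Estimate per-class prior, feature means, and feature variances."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    model = {}
    for y, rows in grouped.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A small floor keeps the variance positive for constant features.
        variances = [max(sum((v - m) ** 2 for v in col) / n, 1e-9)
                     for col, m in zip(zip(*rows), means)]
        model[y] = (n / len(samples), means, variances)
    return model


def predict(model, x):
    """Return the class with the highest log posterior (up to a constant)."""
    def log_posterior(params):
        prior, means, variances = params
        log_lik = sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances)
        )
        return math.log(prior) + log_lik
    return max(model, key=lambda y: log_posterior(model[y]))


# Made-up two-feature vectors (for example, histogram mean and contrast).
train_x = [[-870.0, 1.2], [-860.0, 1.0], [-880.0, 1.4],
           [-940.0, 3.1], [-950.0, 2.8], [-930.0, 3.3]]
train_y = ["normal", "normal", "normal",
           "emphysema", "emphysema", "emphysema"]
model = fit_gaussian_nb(train_x, train_y)
print(predict(model, [-945.0, 3.0]))
```

In practice the input would be the 13-dimensional feature vector from Table I, and an ANN or SVM would be trained on the same vectors for comparison.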

III. Results

The overall sensitivity is presented in Figure 2. The ANN with a 64x64 ROI achieved an overall sensitivity of 86.0% in discriminating obstructive lung diseases, clearly better than any other case in this experiment. The performance of the ANN, however, is not stable and has a large variance, while the variance of the Bayesian classifier is very small. The ANN shows significantly better performance than

................
................
