


MODEL-BASED RECOGNITION

OF OCCLUDED CURVED OBJECTS

RACHID BENLAMRI

Department of Computer Engineering,

Etisalat College of Engineering

P.O. Box 980, Sharjah

UNITED ARAB EMIRATES



Abstract: - This paper describes a new model-based approach to recognizing occluded curved objects from single range images. Image segmentation is based on surface curvature detection. Object representation is expressed in terms of an edge-junction graph and a surface-adjacency graph, which are used to infer the surface patches in the scene and their spatial relationships respectively. The matching strategy is based on Local Geometrical Patterns (LGPs) extracted from adjacent surfaces. The idea is to generate, for each similar pair of object and model LGPs, a location and orientation transformation called a model-base, expressed in terms of a Translation Rotation Matrix (TRM), that allows a model (defined in its model coordinate system) to be aligned onto its corresponding object (defined in the viewer-centered coordinate system). Unlike existing methods, only one LGP is used to check for matching. Furthermore, once the two coordinate systems are aligned, no further geometrical transformations are involved in the matching process. This significantly prunes the matching search space and reduces its time complexity. The system has been tested on real range images and synthetic data. The experimental results have shown that the system can reliably recognize scenes with acceptable accuracy.

Keywords: Range image segmentation and representation, Curvature operators, Model-based recognition.

1 Introduction

Although much progress has been made in the field of 3D object recognition using range images, it is still widely accepted that recognizing real-world scenes from a single image view is a difficult task [1]. This is mainly due to the nature of the scene, which may include objects with irregular shapes that have no clearly defined geometry, objects viewed from arbitrary directions, and objects with self-occlusion or partially occluded by other objects [1]. Many model-based systems [2-4] addressing these recognition problems have been proposed. The main issue in the design of such systems is object representation, which often dictates the matching strategy; hence, any success in the recognition process depends on the quality of image segmentation and object representation. Moreover, most existing systems use time-consuming matching algorithms. In this paper, a new model-based approach investigating some of these recognition problems is proposed. The aim is to match arbitrarily viewed occluded curved objects extracted from a single range image against a set of CAD models in a database. A typical application would be the assembly or automated inspection of parts in a manufacturing environment.

The remainder of the paper is organized as follows: section 2 describes the proposed edge detection technique. In section 3, a detailed description of the object representation scheme and modeling is given. The matching algorithm is presented in section 4, and experimental results are discussed in section 5. Finally, conclusions from the work are drawn and further research work is suggested.

2 Edge Detection

Many approaches to range image segmentation have been developed [1, 5-9]. Most of them use local surface properties, such as surface curvature and surface normals, to describe 3D shape. Most existing methods [1, 7-9] proceed in several stages, searching mainly for jump and crease boundaries by computing, at every point, zero-crossings and extrema of the surface curvature along chosen directions. In contrast, this paper adopts a single-step process to localize all types of edges: two partial derivatives of the surface curvature along the x and y directions are applied to signal the presence of edge points in a simpler and computationally more efficient way.

In this study, to alleviate the effects of any noise present, the image is smoothed prior to the computation of surface-curvature properties. Smoothing is very important here since the calculation of curvature involves second-order partial derivatives and may therefore significantly magnify the effects of noise; moreover, curvature, being a local measure, is highly noise sensitive. Noise is removed by convolving the image with a rotationally invariant Gaussian mask of variance σ. In order to reduce noise while preserving significant features in the image, different values of σ in the range [0.5-2.5] were tried; the value 1.3 was found to be the most appropriate for the range images used in this study. Once smoothing is performed, two newly developed directional curvature operators are applied to determine the edge points forming surface contours. The following section summarizes the segmentation approach adopted in this study; more details can be found in [10].
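For concreteness, this smoothing step can be sketched as follows. The use of scipy's gaussian_filter, the synthetic test image, and the reading of 1.3 as the filter's spread parameter are assumptions of the sketch, not part of the original system.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def smooth_range_image(z, sigma=1.3):
    """Convolve a range image with a rotationally invariant Gaussian.
    sigma=1.3 is the value found most appropriate in this study."""
    return gaussian_filter(z, sigma=sigma)

# Example on a synthetic range image: a noisy spherical cap.
yy, xx = np.mgrid[0:256, 0:256]
cap = np.clip(120.0**2 - (xx - 128.0)**2 - (yy - 128.0)**2, 0.0, None)
z_noisy = np.sqrt(cap) + np.random.normal(0.0, 0.5, cap.shape)
z_smooth = smooth_range_image(z_noisy)
```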

Let k(p) be the analytical expression for the curvature of the surface function f at image coordinates (x,y):

k = f'' / (1 + f'^2)^(3/2) {1}

Based on {1}, Fan [11] derived the directional surface curvature along a direction θ, given by:

k(θ) = f''(θ) / (1 + f'(θ)^2)^(3/2) {2}

where:

f'(θ) : the first-order directional derivative of f, with f'(θ) = A cos θ + B sin θ

f''(θ) : the second-order directional derivative of f

A = f'(0°)

B = f'(90°)

Unlike existing methods, the proposed segmentation approach adopts a one-step process to determine whether an edge point, belonging to a jump edge, a smooth edge or a crease edge, exists at a point p(x,y,z). To this end, two partial derivatives, kx and ky, of the main curvature k are developed along the two main directions x and y respectively:

kx = ∂k/∂x {3}

ky = ∂k/∂y {4}

Since k is relatively constant over a small neighborhood of points belonging to the same surface, whether that surface is cylindrical or spherical, the derivative operators kx and ky will both be zero there. However, crossing a contour point between two neighboring surfaces or occluded surfaces results in a sudden change of the curvature k, leading to a non-null value of kx, ky, or both. There is likewise a change of curvature when passing a surface-orientation discontinuity (crease). Based on this property, the two derivatives are used to signal the presence of an edge point. For the discrete range image, kx and ky are approximated by finite differences:

[Difference approximations of kx and ky, together with the six auxiliary terms they involve; the full expressions are given in [10].]

In terms of processing, since the goal is only to check whether kx and ky are equal to zero, only the computation of their numerators is of interest. This significantly reduces the amount of computation in the convolution process. Fig.1 shows some edge-detection results on real range images. In order to give a more realistic view of the treated objects, range images are displayed using a computer-graphics rendering technique rather than encoding depth as gray-levels.
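The exact discrete operators and their numerator-only evaluation are given in [10]. The following minimal numpy sketch only mimics the test with generic finite differences; the averaged directional curvature, the gradient-based derivatives, and the threshold tau are assumptions of the sketch, not the operators of the paper.

```python
import numpy as np

def curvature_map(z):
    """Stand-in for the main curvature k: the directional curvature
    f''/(1 + f'^2)^(3/2) evaluated along x and y, then averaged."""
    fx, fy = np.gradient(z, axis=1), np.gradient(z, axis=0)
    fxx, fyy = np.gradient(fx, axis=1), np.gradient(fy, axis=0)
    return 0.5 * (fxx / (1.0 + fx**2)**1.5 + fyy / (1.0 + fy**2)**1.5)

def edge_points(z, tau=0.05):
    """Flag pixels where kx or ky (differences of k along x and y)
    depart from zero; tau is an assumed tolerance."""
    k = curvature_map(z)
    kx, ky = np.gradient(k, axis=1), np.gradient(k, axis=0)
    return (np.abs(kx) > tau) | (np.abs(ky) > tau)
```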

[Fig.1 Edge detection results of real range images: (a) real range image of scene 1; (b) edge detection results of scene 1; (c) real range image of scene 2; (d) edge detection results of scene 2.]

3 Object Representation & Matching

The most widely used features for building surface-based descriptions of complex 3D shapes are surface contours [12, 13]. In this study, these are determined by grouping edge points into three geometrical primitives: linear segments, cylindrical arcs, and spherical arcs. A recursive circular search scheme links edge points belonging to a linear segment, characterized by zero curvature and forming a connected part (fig.2.a). Cylindrical and spherical arcs, on the other hand, are characterized by a nearly constant curvature over a small neighborhood: starting from a point A, an arc is progressively constructed by approximating the curvature around a neighborhood of that point. For each determined arc, the direction and the radius of curvature are computed (fig.2.b). The next stage fuses/links small edge segments to form larger segments. Edge grouping is based on parallelism relations, approximate continuity in curvature, and collinearity criteria (fig.3). Experimentation shows that grouping based on parallelism is more reliable than grouping based on co-curvilinearity, owing to the presence of noisy curve segments. A good strategy is therefore to fuse edge segments using the parallelism criterion first and then apply co-curvilinearity linking, repeating until no more grouping is possible. Details of the edge grouping process can be found in [10]. Once all primitives have been determined, cylindrical arcs are separated from spherical arcs by computing the Z-value of the gravity center point G of the arc. If the computed Z-value of G corresponds to that in the image buffer, the arc is cylindrical; otherwise it is spherical. Spherical arcs are further classified as convex or concave according to whether the computed Z-value of G is less than or greater than its corresponding value in the image buffer. A minimal sketch of this classification test is given below.
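In this sketch, the gravity center G is read as the mean of the 3D arc points, and both the image indexing and the tolerance tol (set here to one resolution step) are assumptions:

```python
import numpy as np

def classify_arc(arc_pts, z_buffer, tol=0.7):
    """Separate cylindrical from spherical arcs via the Z-value of the
    arc's gravity center G, as described in the text.

    arc_pts  : (N, 3) array of (x, y, z) edge points forming the arc.
    z_buffer : 2-D range image giving the measured depth at (row, col).
    """
    g = arc_pts.mean(axis=0)  # gravity center G of the arc
    z_measured = z_buffer[int(round(g[1])), int(round(g[0]))]
    if abs(g[2] - z_measured) <= tol:
        return "cylindrical"
    return "spherical-convex" if g[2] < z_measured else "spherical-concave"
```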

[Fig.2 Linear-segment and arc construction: (a) search for linear segments; (b) arc construction.]

[Fig.3 Edge linking / fusion.]

Objects are defined using a surface-based approach reinforced by a boundary representation scheme, in which an object is represented by its constituent parts and their spatial relationships. This is achieved by constructing an edge-junction graph from the extracted primitives. Edge-junction characteristics are then used to infer the visible surfaces and to determine their types. The surface generation process is based on the following assumptions:

1. A planar surface can be bounded by a closed polygon, an open polygon (occluded or concave surface), cylindrical arcs (a cylinder's base), or a combination of open polygons and cylindrical arcs (fig.4.a). Based on this assumption, the edge-junction graph is used to generate such surfaces and to compute their equations.

2. To determine a cylindrical surface, a single cylindrical arc is first used to generate a cylinder of infinite height. Then, all cylindrical arcs and linear segments belonging to that cylinder are considered to construct the corresponding cylindrical surface(s). This process is repeated until all cylindrical surfaces in the scene have been determined (fig.4.b).

3. To generate a spherical surface, the preprocessed information (radius of curvature and convexity) associated with a single spherical arc is used to compute the radius, center, curvature and convexity of the spherical surface. Once the surface equation is computed, all arcs belonging to that sphere are considered to construct the corresponding spherical surface(s). As shown in fig.4.c, a generated spherical surface may be partitioned into two or more spherical surfaces based on an arc-connectivity criterion: the arcs (A, A', B) and (A, A'', B) are used to infer one part of the surface, whereas the other part is inferred by arc (C, C', D). This process is repeated until all spherical surfaces are determined.

[Fig.4 Surface construction and decomposition: (a) example of surface construction; (b) construction of cylindrical surfaces; (c) decomposition of spherical surfaces.]

Objects in the scene are then inferred using a surface-adjacency graph, in which a node represents a surface from the scene and an edge represents an adjacency relation between two surfaces. The objective is to determine the connected sub-graphs, which correspond either to parts of objects (in case of occlusion) or to full objects.
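A minimal sketch of this step, assuming the surface-adjacency graph is stored as a dict mapping each surface id to the set of its adjacent surface ids:

```python
def connected_subgraphs(adjacency):
    """Return the connected components of the surface-adjacency graph;
    each component is a full object or, under occlusion, a visible part."""
    seen, components = set(), []
    for s in adjacency:
        if s in seen:
            continue
        stack, comp = [s], set()
        while stack:
            u = stack.pop()
            if u not in comp:
                comp.add(u)
                stack.extend(adjacency[u] - comp)
        seen |= comp
        components.append(comp)
    return components
```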

A similar representation scheme is used to model the different objects in the database. A Computer Aided Design tool is used to generate the models as well as configurations of complex objects. A configuration consists of a set of instances of primitive models, one of which is taken as the reference model; several instances of the same model may appear in a single configuration. The spatial relationship between the models forming a configuration is defined in terms of a set of model-bases describing the geometrical transformations that align the primitive models (defined in the model coordinate system) onto their corresponding model-instances (defined in the configuration coordinate system).

Let C1 be the configuration formed by M1, M2, M3, M4, and let M1 be the reference model. As shown in fig.5, C1 is defined by:

C1 = {M1 , (M1,B1 ), (M2,B2 ), (M3,B3 ), (M4,B4 )}

where:

Bi : the TRM that transforms Mi in configuration C1
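For concreteness, such a configuration might be stored as follows; the dictionary layout and the identity placeholders standing in for the CAD-generated TRMs are assumptions of this sketch.

```python
import numpy as np

# Configuration C1: a reference model plus one (model, model-base) pair
# per instance; np.eye(4) placeholders stand in for the real 4x4 TRMs.
C1 = {
    "reference": "M1",
    "instances": [("M1", np.eye(4)),   # B1
                  ("M2", np.eye(4)),   # B2
                  ("M3", np.eye(4)),   # B3
                  ("M4", np.eye(4))],  # B4
}
```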

[Fig.5 Matching configurations.]

4 Object-Model Matching

It is now widely accepted that 3D recognition of real-world scenes with possible occlusion is a difficult problem [1] that cannot be solved without a consistent representation scheme whose invariant features are used to reduce the search space and to confirm object hypotheses. To achieve these goals, the proposed matching strategy uses just one LGP (a pair of adjacent surfaces) from the scene to match an object against the different models in the database. In particular, for each sub-graph in the surface-adjacency graph, only one LGP is chosen for matching against the similar LGPs of a candidate model. The idea is to generate, for each similar pair of object and model surfaces, a model-base (P,r,u,v) expressed in terms of a TRM that allows a model defined in its model coordinate system to be aligned onto its corresponding object defined in the viewer-centered coordinate system (fig.6). If the matching fails, no other LGP from that sub-graph is considered. Also, no successive geometrical transformations are applied while matching a model to the object: the model is simply projected into the generated model-base and the projection is compared with the visible part of the object in the scene. The matching process is repeated until all objects in the scene have been recognized; matching of possible configurations can then start.

Let Oi be an object (or part of an object) from the scene. An LGP associated with a surface pair (Si, Sj) of Oi consists of the following (sketched in code after the list):

1. Ni: the normal vector to Si

2. Nj: the normal vector to Sj

3. Aij: the angle formed by Ni and Nj

4. Gij: the gravity center point of the two surfaces

5. Pij: an arbitrarily chosen vertex common to the two surfaces Si and Sj

6. Nseg(Si): the number of linear segments and arcs bounding surface Si

7. Nseg(Sj): the number of linear segments and arcs bounding surface Sj
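As referenced above, a compact record of an LGP might look like this; the field names and types are assumptions mirroring the list:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class LGP:
    """Local Geometrical Pattern of an adjacent surface pair (Si, Sj)."""
    n_i: np.ndarray   # Ni: normal vector to Si
    n_j: np.ndarray   # Nj: normal vector to Sj
    a_ij: float       # Aij: angle formed by Ni and Nj
    g_ij: np.ndarray  # Gij: gravity center point of the two surfaces
    p_ij: np.ndarray  # Pij: a common vertex of Si and Sj
    nseg_i: int       # Nseg(Si): segments/arcs bounding Si
    nseg_j: int       # Nseg(Sj): segments/arcs bounding Sj
```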

[Fig.6 Aligning viewer and model coordinate systems.]

The pair (Si, Sj) is chosen so that it has the maximum Nseg value and at least one of the two surfaces is planar. The first condition significantly reduces the number of candidate model LGPs. A match between a scene LGP (Si, Sj) and a model LGP (Sk, Sl) is considered if and only if the following conditions are verified:

Akl = Aij

Nseg(Sk) ≥ Nseg(Si)

Nseg(Sl) ≥ Nseg(Sj)
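Using the LGP record sketched above, this screening test might read as follows; the angle tolerance is an assumption of the sketch (the text requires exact equality of Akl and Aij):

```python
import numpy as np

def lgp_compatible(scene, model, ang_tol=np.radians(2.0)):
    """Check a scene LGP against a model LGP before any pose is computed.
    Occlusion can only hide boundary segments of the scene surfaces,
    hence the model Nseg counts must dominate."""
    return (abs(model.a_ij - scene.a_ij) <= ang_tol
            and model.nseg_i >= scene.nseg_i
            and model.nseg_j >= scene.nseg_j)
```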

The matching is confirmed if there exists a model-base (P,r,u,v) in which the model appearance is similar to that of the object with reference to the viewer-centered coordinate system (S,i,j,k), as shown in fig.6. The base B = (P,r,u,v)T is found if the expressions of Nk, Nl and the vector product Nk × Nl defined in B are equal to those of Ni, Nj and Ni × Nj defined in the viewer-centered base (S,i,j,k)T. For this, let:

Ni = (Nix ,Niy ,Niz)T

Nj = (Njx ,Njy ,Njz)T

Ni × Nj = (Nijx, Nijy, Nijz)T

Nk = (Nkx ,Nky ,Nkz)T

Nl = (Nlx ,Nly ,Nlz)T

Nk × Nl = (Nklx, Nkly, Nklz)T

A matching is found if:

Nk (r u v)T = Ni

Nl (r u v)T = Nj

Nkl (r u v)T = Nij

The unit vectors r, u and v are determined by solving the above system of equations. For instance, for unit vector r we have:

Nk · r = Nix

Nl · r = Njx

Nkl · r = Nijx

which leads to:

M r = (Nix, Njx, Nijx)T, where M is the 3×3 matrix whose rows are Nk, Nl and Nkl.

r is obtained by solving this linear system. The unit vectors u and v are computed similarly, from the y- and z-components respectively, to obtain B:

[B] = (r u v)T, i.e. the 3×3 matrix whose rows are the unit vectors r, u and v.
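A minimal numpy sketch of this step; the shared coefficient matrix follows directly from the three systems above, and degenerate pairs (near-parallel normals) are not handled:

```python
import numpy as np

def solve_model_base_axes(n_k, n_l, n_i, n_j):
    """Solve the three 3x3 systems for the model-base axes r, u, v.
    n_k, n_l are the model LGP normals; n_i, n_j the scene LGP normals."""
    m = np.vstack([n_k, n_l, np.cross(n_k, n_l)])    # rows: Nk, Nl, Nkl
    rhs = np.vstack([n_i, n_j, np.cross(n_i, n_j)])  # rows: Ni, Nj, Nij
    # Column c of rhs stacks (Ni_c, Nj_c, Nij_c); solving against each
    # column yields r (c = x), u (c = y) and v (c = z).
    axes = np.linalg.solve(m, rhs)
    return axes[:, 0], axes[:, 1], axes[:, 2]        # r, u, v
```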

The next step is to determine the model-base origin P. This is done using a new geometrical feature P'ij, defined as the gravity center point Gij if both surfaces (Si, Sj) are fully visible (in which case P'kl = Gkl), or as Pij otherwise (in which case P'kl = Pkl). P is determined such that the coordinates of P'ij defined in (S,i,j,k) are equal to those of P'kl defined in (P,r,u,v). This can be written as:

PP’kl (r u v)T = SP’ij

which leads to solving the following system of equations:

PP'kl · r = P'ijx

PP'kl · u = P'ijy

PP'kl · v = P'ijz

The above system can be expanded in terms of Px, Py and Pz as follows:

(P'klx - Px) rx + (P'kly - Py) ry + (P'klz - Pz) rz = P'ijx

(P'klx - Px) ux + (P'kly - Py) uy + (P'klz - Pz) uz = P'ijy

(P'klx - Px) vx + (P'kly - Py) vy + (P'klz - Pz) vz = P'ijz

In matrix form: [B] (P'kl − P)T = (P'ijx, P'ijy, P'ijz)T.

Hence, the origin P is finally computed by solving the following system of equations:

[B] (Px Py Pz)T = [B] (P'klx P'kly P'klz)T − (P'ijx P'ijy P'ijz)T
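The origin can then be recovered with one further solve; a sketch under the same assumptions as above:

```python
import numpy as np

def solve_model_base_origin(r, u, v, p_kl, p_ij):
    """Recover P from [B] P = [B] P'kl - P'ij, where [B] has rows r, u, v.
    p_kl is P'kl (model side); p_ij is P'ij (viewer side)."""
    b = np.vstack([r, u, v])
    return np.linalg.solve(b, b @ p_kl - p_ij)
```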

For each generated model-base B, the model is projected into B and then compared with the visible part of the object for possible matching. Once an object-model matching hypothesis is confirmed, the object location and orientation with reference to that model are determined. This is defined in terms of a TRM matrix. The translation is simply defined by the vector (−SP) = (−Px, −Py, −Pz). The translation matrix [T], acting on homogeneous row vectors (consistent with the product notation used throughout), is hence defined as:

        | 1    0    0    0 |
[T] =   | 0    1    0    0 |
        | 0    0    1    0 |
        | -Px  -Py  -Pz  1 |

The rotation angle θx around the x-axis is determined so that the unit vector v of the translated model-base B lies in the plane (S,i,k). The rotation matrix [Rθx] is computed as shown below:

          | 1    0       0      0 |
[Rθx] =   | 0    cosθx   sinθx  0 |
          | 0   -sinθx   cosθx  0 |
          | 0    0       0      1 |

where:

θx = arctan(vz / vy).

Similarly, the rotation angle θy around the y-axis is determined so that the unit vector v of the translated and rotated model-base B lies along the z-axis, in the same direction as z. The rotation matrix [Rθy] is computed as shown below:

          | cosθy  0  -sinθy  0 |
[Rθy] =   | 0      1   0      0 |
          | sinθy  0   cosθy  0 |
          | 0      0   0      1 |

where:

θy = - arctan(vx / vz).

Finally, the rotation angle θz around the z-axis is determined so that the unit vector r of the transformed model-base B lies along the x-axis, in the same direction as x. The rotation matrix [Rθz] is computed as shown below:

          |  cosθz  sinθz  0  0 |
[Rθz] =   | -sinθz  cosθz  0  0 |
          |  0      0      1  0 |
          |  0      0      0  1 |

where:

θz = - arctan(ry / rx).

The global transformation is given by the matrix [TG], the product of the four matrices above:

[TG] = [T] [Rθx] [Rθy] [Rθz]
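The composition can be sketched as follows for homogeneous row vectors (points transform as p' = p [TG]); the matrix sign conventions and the use of arctan2 in place of arctan are assumptions of the sketch.

```python
import numpy as np

def rot_x(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[1, 0, 0, 0], [0, c, s, 0], [0, -s, c, 0], [0, 0, 0, 1.0]])

def rot_y(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0, -s, 0], [0, 1, 0, 0], [s, 0, c, 0], [0, 0, 0, 1.0]])

def rot_z(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, s, 0, 0], [-s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1.0]])

def global_trm(P, r, v):
    """Compose [TG] = [T][R_tx][R_ty][R_tz] from the model-base (P, r, u, v),
    following the angle definitions in the text."""
    T = np.eye(4)
    T[3, :3] = -np.asarray(P)                  # translation by -SP
    tx = np.arctan2(v[2], v[1])                # theta_x = arctan(vz/vy)
    v1 = (np.append(v, 0.0) @ rot_x(tx))[:3]   # v after the x-rotation
    ty = -np.arctan2(v1[0], v1[2])             # theta_y = -arctan(vx/vz)
    r1 = (np.append(r, 0.0) @ (rot_x(tx) @ rot_y(ty)))[:3]
    tz = -np.arctan2(r1[1], r1[0])             # theta_z = -arctan(ry/rx)
    return T @ rot_x(tx) @ rot_y(ty) @ rot_z(tz)
```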

The remaining step in matching individual objects is object fusion. Due to occlusion, an object might appear divided into several parts. This problem is solved by looking for parts that are matched to the same model and for which the same location and orientation are found; these are considered parts of a single object, as shown in fig.7.

[Fig.7 Object fusion.]

The proposed matching technique can also recognize configurations of objects. To show how matching of configurations is carried out, let us consider the example of configuration C1 shown in fig. 5.

C1 = {M1, (M1, B1), (M2, B2), (M3, B3), (M4, B4)}

While recognizing individual objects, objects O1, O2 and O3 were matched with models M2, M3 and M4 using the model-bases T1, T2 and T3 respectively. Let M'1, M'2, M'3 and M'4 be the instances of models M1, M2, M3 and M4. According to the configuration definition given in section 3, we have:

M1 = M’1 [B1]

M2 = M’2 [B2]

M3 = M’3 [B3] {5}

M4 = M’4 [B4]

The matching results for the individual objects (O1, O2 and O3) can be interpreted and formulated as:

O1 = M2 [T1]

O2= M3 [T2] {6}

O3 = M4 [T3]

Substituting M2, M3 and M4 in {6} by their respective expressions from {5}, we obtain:

O1 = M’2 [B2] [T1]

O2 = M’3 [B3] [T2] {7}

O3 = M’4 [B4] [T3]

Configuration C1 is recognized if the following conditions are met:

1. All totally or partially visible instances M'i must correspond to objects in the scene (in our case, M'2, M'3 and M'4 correspond to O1, O2 and O3 respectively).

2. For each object-model association (Oi, Mj), the corresponding model-base designated by [Bj] [Ti] must be equal to [Bl] [Tk] of the couple (Ok, Ml) used as the hypothesis for matching the configuration (in this case, k equals 1 and l equals 2).

In other words, C1 is recognized if and only if:

[B3] [T2] = [B2] [T1]

[B4] [T3] = [B2] [T1]
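In matrix terms, the check reduces to comparing products of 4×4 TRMs; a sketch, with the names and the tolerance as assumptions:

```python
import numpy as np

def configuration_recognized(associations, b_ref, t_ref, tol=1e-3):
    """Condition 2 above: every [Bj][Ti] must equal the hypothesis
    product [B_ref][T_ref]. `associations` is a list of (Bj, Ti) pairs
    of 4x4 arrays."""
    ref = b_ref @ t_ref
    return all(np.allclose(b @ t, ref, atol=tol) for b, t in associations)
```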

It should be noted that the concept of configuration can also be used to recognize parts of objects that are heavily occluded by other objects and for which it was not possible to generate a separate model-base. These parts can easily be inferred while matching configurations (the case of object Oc in fig.5).

5 Experimental Results

To assess the performance of the present system, several experiments were carried out on both noisy synthetic data and real range images. The real 256 × 256 range images were captured by a laser range finder using the active triangulation method, with a resolution of 0.7 mm for both the inter-pixel distance and the depth values. Fig.8 shows the segmentation and matching results for scene 1 of fig.1.a, which consists of two partially occluded polyhedral objects, while fig.9 shows the results for scene 2 of fig.1.c, which consists of a self-occluded curved object (track wheels/track body occlusion) that is also partially occluded by a polyhedral object (chair). The execution time of the entire algorithm was 178 seconds for the scene in fig.8 and 239 seconds for the more complex scene in fig.9. These figures show that the present algorithm is relatively fast compared with a number of existing similar methods, mainly because the edge-detection computation is simple and the matching requires only one LGP. Fig.8.a and fig.9.a show the final segmentation results with labeled surface boundaries, while fig.8.b and fig.9.b show the object-model matching results. It can be seen in fig.8.a that the polyhedral objects are cleanly partitioned into their planar surfaces. The segmented image in fig.9.a, however, presents a few missing edge-segments on curved edges, mainly because of shadow-smoothing effects. It can also be seen that some proximate parallel curved edge-sections were not fused, mainly because they share common vertices at both ends or at one end (the case of edge-sections 1-2, 2-4, 4-5, 5-6, 10-8 and 8-14 on the track wheels and 6-11, 6-12 on the track top in fig.9.a) and therefore do not meet the fusion criteria. This ambiguity is usually relatively easy to resolve if a model-based segmentation correction is applied.

As far as occlusion is concerned, boundaries of occluded surfaces may lead to over-segmentation due to the existence of jump edges (see fig.8.a and fig.9.a). However, these can easily be removed by fusing edge-segments based on parallelism relations (see segments 5-12 and 12-14 in fig.8.a and segments 2-5 and 5-25 in fig.9.a).

[Fig.8 Segmentation and matching results of scene 1: (a) segmentation results; (b) matching results.]

Despite a few segmentation errors, mainly due to smoothing, edge-fusion and occlusion effects, the experimental results of the matching algorithm are very promising. This can be seen in fig.8.b and fig.9.b, where each visible object in the scene is aligned onto its corresponding model. The matching results in fig.8.b show almost perfect object-model registration. The results in fig.9.b, which show the matching of a configuration consisting of a track-body, a track-top and two track-wheels, exhibit a minor misregistration of the front-left track-wheel. Overall, the experimental results have shown that the system can successfully recognize complex scenes with occluded objects with acceptable accuracy.

[Fig.9 Segmentation and matching results of scene 2: (a) segmentation results; (b) matching results.]

6 Conclusion

A new model-based approach to recognize occluded curved objects from single range images has been presented. Image segmentation is based on surface curvature detection and is achieved by applying directional surface-curvature operators in a simple and efficient way to detect all types of edges. The matching strategy is based on geometrical transformations applied to invariant local geometrical patterns derived from adjacent surfaces. The system can also recognize complex objects described by configurations. The experimental results have shown that the system can successfully recognize complex occluded objects. However, the system has some limitations; for example, it would fail if none of the LGPs in the scene contains a planar surface (a scene of entirely curved objects). A possible extension is to consider other curve primitives to deal with more complicated curved objects; further research is needed to address these problems.

References:

[1] F. Arman and J.K. Aggarwal, Model-based Object Recognition in Dense-range Images: A Review, ACM Computing Surveys, Vol.25, No.1, 1993, pp.5-43.

[2] I.D. Reid and J.M. Brady, Recognition of Object Classes from Range Data, Artificial Intelligence, Vol.78, No.1-2, 1995, pp.289-326.

[3] Y. Sato and S. Tamura, Detecting Planar and Curved Symmetries of 3D Shapes from a Range Image, Computer Vision & Image Understanding, Vol.64, No.1, 1996, pp.175-187.

[4] I. Weiss and M. Ray, Model-Based Recognition of 3D Objects from Single Images, IEEE Trans. on PAMI, Vol.23, No.2, 2001, pp.116-128.

[5] D. Marshall et al., Robust Segmentation of Primitives from Range Data in the Presence of Geometric Degeneracy, IEEE Trans. on PAMI, Vol.23, No.3, 2001, pp.304-314.

[6] W.Y. Ma and B.S. Manjunath, EdgeFlow: A Technique for Boundary Detection and Image Segmentation, IEEE Trans. on Image Processing, Vol.9, No.8, 2000, pp.1375-1388.

[7] O. Monga et al., Extraction of the Zero-crossings of the Curvature Derivative in Volumic 3D Medical Images, Proc. of the Int. Conf. on Computer Vision & Pattern Recognition, 1994, pp.852-855.

[8] S.M. Zhan and R. Mehrotra, A Zero Crossing-based Optimal 3D Edge Detector, Computer Vision, Graphics and Image Processing, Vol.9, No.2, 1994, pp.346-356.

[9] Y.J. Zhang, Evaluation and Comparison of Different Segmentation Algorithms, Pattern Recognition Letters, Vol.18, No.10, 1997, pp.963-974.

[10] R. Benlamri, Range Image Segmentation of Scenes with Occluded Curved Objects, Pattern Recognition Letters, Vol.21, No.12, 2000, pp.1050-1061.

[11] T.J. Fan, Describing and Recognizing 3-D Objects Using Surface Properties, Springer-Verlag, 1990.

[12] M.T. Figueiredo et al., Unsupervised Contour Representation and Estimation Using B-Splines and a Minimum Description Length Criterion, IEEE Trans. on Image Processing, Vol.9, No.6, 2000, pp.1075-1087.

[13] M.M. Blane et al., The 3L Algorithm for Fitting Implicit Polynomial Curves and Surfaces to Data, IEEE Trans. on PAMI, Vol.22, No.3, 2000, pp.298-313.
