


DETERMINING MEMBERSHIP FUNCTION VALUES TO OPTIMIZE RETRIEVAL IN A FUZZY RELATIONAL DATABASE

Directed Research Report

Directed By: Dr. Lorraine M. Parker

Shweta Sanghi

May 2005

Table of Contents

1. Introduction

1.1. Overview of Fuzzy Relational Databases

1.2. Overview of Previous Work

2. Machine Learning Methods to Adjust Weights

3. Methods of Constructing Membership Function

3.1. The Fuzzy Linguistic Approach to Fuzzy Set

3.2. Fuzzification

3.3. Membership Function Determination

4. Project Description

4.1. Testing

4.2. Results

4.3. Obstacles

5. Future Work

References

Appendix A. Stored Procedures

Appendix B. Source Code

Appendix C. Statistical Method to Construct Membership Weight for Images

Appendix D. Instructions to install the Fuzzy Database Research on the Local Machine

1. Introduction

Computers have made tremendous progress in the past few decades. They are now decidedly superior to human beings in the speed and precision of calculation. However, computers do not always satisfy users’ requirements. One reason for this dissatisfaction is that computers handle vagueness and imprecision poorly, whereas both are characteristic of human thinking and reasoning.

One response to this problem is the fuzzy database system, which can represent and manipulate imprecise information. In such systems, imprecise information is stored using fuzzy linguistic terms (e.g., young, big) that are frequently used in daily conversation. Although these words are ambiguous or uncertain, communities can agree on their meanings.

This paper presents an overview of fuzzy relational database theory and analyzes the previous work on a particular fuzzy database implementation [1] designed to retrieve images using fuzzy constructs whose common-language descriptions were defined by the consensus of a particular user community. Further, this paper describes research aimed at determining the best way of setting the membership values via feedback from the community. The desire is to expand the functioning level of the current prototype in order to implement “smart” database retrieval. It is proposed to determine membership values using the Direct Rating method. This method of determining membership function values (described in section 3.3) is then compared with the original prototype to determine which method is better.

1.1 Overview of Fuzzy Relational Databases

Conventional relational database systems are based on crisp data, which is precise. Fuzzy relational databases extend the conventional relational database model to allow for representation of imprecise data.

A fuzzy set is created to describe the linguistic variables in more detail. The linguistic variable “age,” for instance, may have overlapping categories (members) of “young,” “very young,” “middle age,” “old,” and “very old.” Once these categories or members are defined, the fuzzy set is obtained, and a membership function is then developed for each member in the set.

Fuzzy logic and fuzzy sets were first introduced by Lotfi A. Zadeh [2]. Fuzzy sets generalize classical set theory and can be thought of as an extension of classical sets. In a classical set (or crisp set), the objects in the set are called elements or members of the set. An element x belonging to a set A is written $x \in A$; an element that is not a member of A is written $x \notin A$. A characteristic function or membership function $\mu_A(x)$ assigns each element of the universe U a crisp value of 1 or 0. For every $x \in U$,

$$\mu_A(x) = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{if } x \notin A \end{cases}$$

This can also be expressed as $\mu_A : U \to \{0, 1\}$.

In crisp sets the membership function takes a value of 1 or 0. For fuzzy sets, the membership function takes values in the interval [0, 1]. The value assigned to an element is referred to as its membership grade or degree of membership [5].

A fuzzy set A is defined as:

$$A = \{\, (x, \mu_A(x)) \mid x \in U \,\}$$

where $\mu_A(x)$ is a membership function taking values in the interval [0, 1]. Fuzzy set theory has operations equivalent to those of crisp set theory, including equality, union and intersection [1].
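The operations named above can be written out using the standard definitions from Zadeh's original formulation [2]. This is a sketch of the textbook forms only; the report does not state which operators the prototype in [1] uses, and the second fuzzy set B is introduced here purely for illustration.

```latex
% Standard (Zadeh) fuzzy set operations for fuzzy sets A and B on a universe U
A = B \iff \mu_A(x) = \mu_B(x) \quad \text{for all } x \in U
\mu_{A \cup B}(x) = \max\bigl(\mu_A(x), \mu_B(x)\bigr)
\mu_{A \cap B}(x) = \min\bigl(\mu_A(x), \mu_B(x)\bigr)
% Worked example with assumed values \mu_A(x) = 0.7 and \mu_B(x) = 0.4:
% \mu_{A \cup B}(x) = 0.7, \qquad \mu_{A \cap B}(x) = 0.4
```

When every membership value is restricted to 0 or 1, these reduce to the usual crisp union and intersection.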

A fuzzy relational database extends a conventional relational database by adding fuzzy logic, fuzzy data and membership functions. A membership value can be read as the degree of truth of a proposition. For example, the predicate “John (x) is tall (A)” is represented by a number $\mu_A(x)$ in the unit interval; $\mu_A(x) = 0.7$ means that John is tall to the degree 0.7. This degree is not a probability.

The use of relations is one method of adding fuzzy selection criteria to a query. The rows contain data that is interpreted according to the elements in them and their fuzziness. For instance, consider the following height table.

|Name |Height |

|John |6’0’’ |

|Bill |5’0’’ |

|Janice |4’0’’ |

|Fred |5’8’’ |

|Meri |5’6’’ |

The elements in the height column are crisp values because they give the exact height of the individual. A range query could be created to find people within a certain range of heights, but it would be more convenient to define the range as a relation:

SELECT NAME FROM HEIGHTS

WHERE HEIGHT IS SHORT;

If SHORT were defined as exactly 5’0’’ in the database, only “Bill” would be selected, since he would be the only short person in the group. A query could also be formulated for other ranges, for example people who are VERY SHORT or of MEDIUM height. One way of approaching this is to create a relation that defines SHORT and its relationship to the other heights. In effect, we are defining the membership function for the fuzzy attribute SHORT.

|Height |Short Membership |

|4’0’’ |0.00 |

|4’4’’ |0.10 |

|4’8’’ |0.50 |

|5’0’’ |1.00 |

|5’6’’ |0.50 |

|5’8’’ |0.10 |

|6’0’’ |0.00 |

The query would then retrieve names from the list based on a threshold. For instance, if we set the threshold for SHORT to 0.50 or greater, the query would return both Bill (membership 1.00) and Meri (membership 0.50).
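The threshold lookup described here can be sketched in a few lines of Python. This is only an illustration of the idea, not the prototype's implementation: the names `HEIGHTS`, `SHORT_MEMBERSHIP` and `select_short` are hypothetical, and heights are converted to inches for convenience.

```python
# Crisp heights from the example table, converted to inches (6'0" = 72, etc.).
HEIGHTS = {"John": 72, "Bill": 60, "Janice": 48, "Fred": 68, "Meri": 66}

# Membership of each listed height in the fuzzy set SHORT,
# copied from the membership table above (keys are inches).
SHORT_MEMBERSHIP = {48: 0.00, 52: 0.10, 56: 0.50, 60: 1.00,
                    66: 0.50, 68: 0.10, 72: 0.00}


def select_short(threshold):
    """Return the names whose SHORT membership is at least `threshold`."""
    return [
        name
        for name, height in HEIGHTS.items()
        if SHORT_MEMBERSHIP[height] >= threshold
    ]


print(select_short(0.50))  # ['Bill', 'Meri']
```

Lowering the threshold to 0.10 would also admit Fred, which shows how the threshold controls how loosely SHORT is interpreted.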

A querying language called SQLf [3] has been developed to support a wide range of fuzzy queries (i.e., fuzzy predicates are introduced into the language wherever possible) while adhering to the conventions used in SQL. The basic principle of SQLf is the introduction of fuzziness into the SELECT-FROM-WHERE block of SQL. Fuzziness is introduced at two levels within the WHERE condition: in the predicates themselves and in the way they are combined (the connectors such as AND and OR). To illustrate, consider these relations:

Employee (Emp#, E-name, Salary, Job, Age, City, Dept#) and

Department (Dept#, D-name, Manager, Budget, Location)

The query “Find the best 5 employees from Chicago who are well-paid and who are working in a high budget department” may be expressed as:

1. SELECT E.Emp#, E.E-name

2. FROM Employee E, Department D

3. WHERE E.City = 'Chicago' AND E.Salary = 'well paid' AND D.Budget = 'high'

4. AND E.Dept# = D.Dept#

In the above SELECT statement, fuzziness is introduced in line 3 by the terms “well paid” and “high”.

1.2 Overview of previous work

A fuzzy relational database [1], which uses a natural language querying system to retrieve images whose common-language descriptions are defined by the consensus of a particular user community, was previously developed.

The architecture of this relational database system required two layers to process fuzzy queries. The first layer was an RDBMS and the second was a user interface layer of stored procedures in SQL Server that processed the fuzzy queries. SQLf was used to query the database. The syntax of a query was

SELECT (attribute list)

FROM (relation List)

WHERE (fuzzy conditions)

Fuzzy conditions are conditions that use fuzzy predicates such as EYE_COLOR = SLIGHTLY BLUE. For each image, each possible value of a fuzzy attribute and its corresponding membership weight were listed in a tuple as shown in Table 1. The features used in the implementation were “eye color” and “face width”. For eye color, the possible values were green, blue and brown. For face width, the values were broad, average and narrow. A list of fuzzy modifiers and their synonyms was developed. The modifiers selected (to be used with either attribute) were “very”, “medium” and “slightly”. Each modifier was assigned a corresponding range on the [0, 1] membership interval as shown in Table 2. Thus, using the two tables, a query requesting people with “slightly blue eyes” would return all of the images with EYE_COLOR = BLUE and a membership value between 0.0 and 0.29. To illustrate, an image of a person with eyes that are predominantly green with just a hint of blue and no trace of brown could be represented in the database as shown in Table 1.

|IMAGE_ID |EYE_COLOR |WEIGHT (µ) |

|1 |GREEN |0.8 |

|1 |BLUE |0.3 |

|1 |BROWN |0.0 |

Table 1: Using Membership Values (Weights) for Each Attribute

|Modifier |Range_From |Range_To |Midpoint |

|“Very” |0.75 |1.0 |0.87 |

|“Medium” |0.30 |0.74 |0.52 |

|“Slightly” |0.0 |0.29 |0.20 |

Table 2: Modifier Ranges and Midpoints
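How Table 1 and Table 2 combine to answer a fuzzy condition can be sketched as follows. This is a minimal Python illustration, not the stored procedures of the prototype (those are in Appendix A); the function name `fuzzy_select` is hypothetical, and the rows for image 2 are invented so that the example has more than one image.

```python
# Modifier ranges on the membership interval, copied from Table 2.
MODIFIER_RANGES = {
    "very": (0.75, 1.00),
    "medium": (0.30, 0.74),
    "slightly": (0.00, 0.29),
}

# (image_id, eye_color, weight) tuples in the layout of Table 1.
# Image 1 is the row set from Table 1; image 2 is made up for illustration.
IMAGE_WEIGHTS = [
    (1, "GREEN", 0.8), (1, "BLUE", 0.3), (1, "BROWN", 0.0),
    (2, "GREEN", 0.1), (2, "BLUE", 0.9), (2, "BROWN", 0.0),
]


def fuzzy_select(color, modifier):
    """Return image ids whose weight for `color` lies in the modifier's range."""
    low, high = MODIFIER_RANGES[modifier]
    return [
        image_id
        for image_id, eye_color, weight in IMAGE_WEIGHTS
        if eye_color == color and low <= weight <= high
    ]


print(fuzzy_select("GREEN", "very"))  # [1]
print(fuzzy_select("BLUE", "very"))   # [2]
```

The condition EYE_COLOR = VERY BLUE thus becomes an ordinary range test on the stored weight.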

When the program was initialized for a particular user community, random values in [0, 1] were assigned as weights (membership values) to each attribute of every image. Each modifier had a threshold value at the midpoint of the modifier’s range. The threshold was designed to steady each attribute’s membership value within the modifier range so that it accurately represented the community’s consensus. After viewing the result of a query, users indicated whether the image met their criteria or whether they believed the image would be better described by a stronger or a weaker modifier. According to the user’s response, the weight of the attribute was increased or decreased by 0.01. For example, after viewing the result of the query EYE_COLOR = VERY BLUE, if the user decided that the image should be described by a stronger modifier, the weight was adjusted towards the threshold for very blue; if the user decided that the image should be described by a weaker modifier, the weight was decreased by 0.01. Thus, concurring feedback strengthened the community opinion by moving the membership weight as deep into the range as possible (i.e., to the midpoint). This method of constructing membership values from subjective information is referred to as the Random Method.
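The update rule of the Random Method can be sketched as below. The step size of 0.01 and the thresholds come from the text and Table 2; everything else is an assumption made for illustration, including the feedback labels (`agree`, `stronger`, `weaker`), the function names, the clamping to [0, 1], and the reading that concurring feedback moves the weight one step toward the modifier's threshold.

```python
import random

STEP = 0.01  # size of one feedback adjustment, as stated in the text

# Threshold of each modifier range, copied from the Midpoint column of Table 2.
THRESHOLDS = {"very": 0.87, "medium": 0.52, "slightly": 0.20}


def initial_weight(rng=random):
    """Random starting weight in [0, 1), as in the Random Method."""
    return rng.random()


def adjust_weight(weight, modifier, feedback):
    """Apply one piece of user feedback and return the new weight.

    `feedback` is one of the hypothetical labels 'agree', 'stronger', 'weaker'.
    """
    if feedback == "stronger":
        weight += STEP
    elif feedback == "weaker":
        weight -= STEP
    elif feedback == "agree":
        # Assumption: concurring feedback moves the weight toward the threshold.
        target = THRESHOLDS[modifier]
        if abs(target - weight) <= STEP:
            weight = target
        elif weight < target:
            weight += STEP
        else:
            weight -= STEP
    else:
        raise ValueError(f"unknown feedback: {feedback!r}")
    # Keep the weight a valid membership value; round away float drift.
    return round(min(1.0, max(0.0, weight)), 2)


w = 0.80
w = adjust_weight(w, "very", "agree")   # one step toward 0.87
w = adjust_weight(w, "very", "weaker")  # one step back down
```

Because every response moves the weight by a single step, the number of responses needed grows with the distance between the random starting value and the community's view.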

The Fuzzy Relational Database was designed to have a natural query language interface where a user would enter a query such as, “List names of the persons with very brown eyes”. The natural query language interface parsed the user query into the actual query and the modifier/attribute pair. This data was stored into separate data files, which the fuzzy relational database took as input. It then created a fuzzy query, queried the database and produced the result. After getting user feedback, the membership weights were adjusted. This process went on until a community consensus was reached.

2. Machine Learning Methods to Adjust Weights

The field of machine learning is concerned with the question of how to construct computer programs that automatically improve with experience. Machine learning is said to occur in a program that can modify some aspect of itself, often referred to as its state, so that on a subsequent execution with the same input, a different (hopefully better) output is produced.

Machine learning is defined as follows: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.” [4]

Learning systems learn through some form of feedback, evolution or reaction to their responses during experience E. Broadly speaking, machine learning can be divided into two types:

Supervised learning: The learning algorithm is provided with a set of inputs along with the corresponding correct outputs. Learning involves comparing the algorithm’s actual output with the correct (target) output, so that the algorithm knows its error and can adjust itself accordingly. For example, supervised learning is used in the backpropagation algorithm to train neural networks.

Unsupervised learning: Unsupervised learning is where the system is not told the "right answer". It is not trained on pairs consisting of an input and the desired output. Unsupervised clustering algorithms can be used to train the system.

The purpose of the current fuzzy database implementation is to retrieve images by using fuzzy queries whose common-language descriptions are defined by the consensus of a particular user community. In everyday life, these language descriptions are subjective. A person may be described as having dark brown hair and a narrow face by one user, while another user may describe that person differently. Terms like “dark brown” and “narrow” are therefore not objective definitions but rather reflect a person’s (or a group’s) definition of those terms. These definitions depend on factors like culture and experience, and may change over time. Although it is difficult to design a system that satisfies each user’s perception of facial definitions, a consensus can be reached that satisfies most users of the community. Since learning is based on feedback provided by the user community and the system does not know the right answer, the system is using unsupervised learning.

According to the definition of machine learning, the system can be defined as follows:

• Task T: Retrieval of images according to the criteria defined

• Performance Measure P: Percent of community satisfaction

• Training Experience E: User feedback

3. Methods of Constructing Membership Function

A membership function (MF) is a curve that defines how each point in the input space is mapped to a membership value (or degree of membership) between 0 and 1. The input space is sometimes referred to as the universe of discourse, a fancy name for a simple concept.

Summary of Membership Functions

• Fuzzy sets describe vague concepts (fast runner, hot weather, weekend days).

• A fuzzy set admits the possibility of partial membership in it. (Friday is sort of a weekend day, the weather is rather hot).

• The degree an object belongs to a fuzzy set is denoted by a membership value between 0 and 1. (Friday is a weekend day to the degree 0.8).

• A membership function associated with a given fuzzy set maps an input value to its appropriate membership value.

3.1 The Fuzzy Linguistic Approach to Fuzzy Set

The linguistic approach is Zadeh’s original idea [2]. It is based on two main concepts: the linguistic variable and the linguistic term. A linguistic variable represents a concept that is measurable in some way, either objectively or subjectively, like temperature or will. Linguistic variables are characteristics of an object or situation. Linguistic terms rate the characteristic denoted by one linguistic variable. A linguistic term is a fuzzy set, and the linguistic variable defines its domain.

Every adequate representation of fuzzy sets involves the basic understanding of five related conceptual symbols:

• the set of elements θ ∈ Θ, as in “image” from “group of images”;

• the linguistic variable V, that is a label for one of the attributes of the elements θ ∈ Θ, as in “eye color” of “image”;

• the linguistic term A, which is an adjective or adverb describing the linguistic variable, which is a subjective measure of V, as in “blue” describing “eye color”;

• a referential set X ⊂ R, that is a measurable numerical interval, for the particular attribute V, as in “[0,100] blue” for “eye color”;

• a subjective numeric attribution µA(θ), of the membership value, i.e., the membership degree of the element θ, labeled by the linguistic variable V as described by A.

3.2 Fuzzification

The first step in every fuzzy system is to convert the inputs from the traditional crisp universe to the fuzzy universe. This step is known as fuzzification or fuzzy encoding, and it acknowledges that some uncertainty is attached to the input value. Every input value is associated with a linguistic variable. Each linguistic variable should be assigned a set of linguistic terms that subjectively describe the variable. Most of the time, linguistic terms are words that describe the magnitude of the linguistic variable, as in “hot” and “large”, or how far the value is from a goal value, as in “exact” or “far”. Each linguistic term is a fuzzy set and has its own membership function. For a linguistic variable to be useful, the union of the supports of its linguistic terms should cover its entire domain. It is also expected that there is some intersection between the supports of linguistic terms that describe similar concepts. So, when talking about temperature, for instance, one would expect some values to be described as both “cold” and “freezing”. Usually, adjacent linguistic terms overlap by 10 to 50%. Given the fuzzy sets that correspond to each linguistic variable, fuzzification means determining the membership value of each input value in each fuzzy set.
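Fuzzification of a temperature reading can be sketched as follows. The trapezoidal shape, the term “warm”, the domain of -20 to 40 degrees Celsius and all breakpoints are assumptions chosen for illustration; the report does not prescribe them. The names `trapezoid`, `TEMPERATURE_TERMS` and `fuzzify` are likewise hypothetical.

```python
def trapezoid(x, a, b, c, d):
    """Membership that rises from a to b, is 1 on [b, c], and falls from c to d.

    Setting a == b (or c == d) gives a shoulder that stays at 1 up to the
    edge of the domain.
    """
    if x < a or x > d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


# Linguistic terms for the variable "temperature" (degrees Celsius).
# Breakpoints are illustrative; adjacent terms overlap and together the
# supports cover the whole assumed domain [-20, 40].
TEMPERATURE_TERMS = {
    "freezing": (-20, -20, -5, 5),
    "cold": (-5, 5, 10, 18),
    "warm": (10, 18, 24, 30),
    "hot": (24, 30, 40, 40),
}


def fuzzify(value):
    """Return the membership of a crisp value in every linguistic term."""
    return {
        term: trapezoid(value, *points)
        for term, points in TEMPERATURE_TERMS.items()
    }


print(fuzzify(0))
# {'freezing': 0.5, 'cold': 0.5, 'warm': 0.0, 'hot': 0.0}
```

A reading of 0 degrees belongs partly to “freezing” and partly to “cold”, which is the kind of overlap between neighbouring terms that the paragraph above describes.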

3.3 Membership Function Determination

Mainly, there are six methods used in experiments with the aim of constructing membership functions [14]:

Polling: Do you agree that John is tall? (Yes/No)

Direct rating (point estimation): Classify John according to his tallness. In general, the question is: “How F is a?”

Reverse rating: Identify the person who is tall to the degree 0.6. In general, identify a who is F to the degree µF(a).

Interval estimation (set-valued statistics): Give an interval in which you think the height of John lies.

Membership function exemplification: What is the degree of belonging of John to the set of tall people? In general, “To what degree is a F?”

Pairwise comparison: Which person, John or Joe, is taller (and by how much)?

1: Polling

In polling one subscribes to the point of view that fuzziness arises from interpersonal disagreements. The question “Do you agree that a is F?” is asked to different individuals. The answers are polled and an average is taken to construct the membership function. Polling is also one of the natural ways of eliciting membership functions for the likelihood interpretation.
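The averaging step in polling amounts to taking the fraction of “yes” answers. A minimal sketch, with a hypothetical function name and made-up answers:

```python
def polling_membership(answers):
    """Fraction of respondents who answered yes to "Do you agree that a is F?".

    `answers` is a list of booleans, one per respondent.
    """
    if not answers:
        raise ValueError("at least one answer is required")
    return sum(1 for answer in answers if answer) / len(answers)


# Made-up poll: 7 of 10 respondents agree that John is tall.
answers = [True] * 7 + [False] * 3
print(polling_membership(answers))  # 0.7
```

Under this reading, a membership value of 0.7 records the share of the polled group that accepts the statement.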

2: Direct Rating

Direct rating seems to be the most straightforward way to come up with a membership function. This approach subscribes to the point of view that fuzziness arises from individual subjective vagueness. The subject is required to classify a with respect to F repeatedly over time. The experiment has to be carefully designed so that it is hard for the subject to remember past answers. The membership function is then constructed by assuming probabilistic errors in the answers and estimating a few key parameters, as is usual for this type of construction. Chameau & Santamarina (1987a) use several subjects and aggregate their answers, as opposed to asking a single subject the same question over and over.

3: Reverse Rating

In this method, the subject is given a membership degree and then asked to identify the object for which that degree corresponds to the fuzzy term in question. This method can be used for individuals by repeating the same question for the same membership function as well as for a group of individuals. Once the subject’s (or subjects’) responses are recorded, the conditional distributions can be taken to be normally distributed and the unknown parameters (mean and variance) can be estimated as usual. This method also requires evaluations to be made on at least interval scales to determine membership function.

4: Interval Estimation

The subject is asked to give an interval that describes the F-ness of a. Let $I_i$ be the set-valued observation (the interval) and $m_i$ the frequency with which $I_i$ is observed. Then $R = (I_i; m_i)$ defines a random set. That is, a share $m_i$ of the population gives $I_i$ as the interval in which the fuzzy term lies, and the rest of the population gives different intervals for the F-ness of a.
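One common way to turn such interval observations into a membership value is to take, for each point x, the share of respondents whose interval covers x. The text does not say which aggregation is intended, so the sketch below is an assumption; the function name and the interval data (heights in inches that respondents consider “tall”) are made up.

```python
def coverage_membership(x, observations):
    """Share of respondents whose reported interval contains x.

    `observations` is a list of ((low, high), count) pairs, where `count`
    is the number of respondents who reported that interval.
    """
    total = sum(count for _, count in observations)
    covered = sum(
        count for (low, high), count in observations if low <= x <= high
    )
    return covered / total


# Made-up intervals from 10 respondents.
observations = [((70, 78), 5), ((72, 80), 3), ((68, 76), 2)]

print(coverage_membership(74, observations))  # 1.0
print(coverage_membership(71, observations))  # 0.7
```

Points covered by every reported interval receive full membership, and points covered by none receive 0.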

5: Membership Exemplification

The subject is asked to write the degree that is appropriate for “large”, “very large” and “small” on a scale of 0 to 100. The same question is asked of a group of individuals. The results are analyzed to determine the membership function.

6: Pairwise Comparison

The subject is asked a question such as “Which is a better example of a bird: an eagle or a pelican?” Based on the answer (say, the eagle is chosen), another question is asked: “How much more of a bird is an eagle than a pelican?” Weights are then derived from the principal eigenvector of the square reciprocal matrix of pairwise comparisons between all contributing attributes.
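The eigenvector step can be sketched with plain power iteration. The comparison values below are invented (an eagle judged twice as bird-like as a pelican and four times as bird-like as a penguin), and the function name is hypothetical.

```python
def principal_eigenvector(matrix, iterations=100):
    """Approximate the principal eigenvector by power iteration.

    The result is normalized so that its entries sum to 1.
    """
    n = len(matrix)
    vector = [1.0 / n] * n
    for _ in range(iterations):
        product = [
            sum(matrix[i][j] * vector[j] for j in range(n)) for i in range(n)
        ]
        total = sum(product)
        vector = [value / total for value in product]
    return vector


# Reciprocal matrix: entry [i][j] says how much more of a bird item i is
# than item j, so entry [j][i] is its reciprocal. Order: eagle, pelican, penguin.
comparisons = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]

weights = principal_eigenvector(comparisons)
# Rescale so the best example has membership 1 (one possible convention).
memberships = [weight / max(weights) for weight in weights]
```

For these perfectly consistent judgments the weights come out close to 4/7, 2/7 and 1/7, so the rescaled memberships are close to 1, 0.5 and 0.25. Real judgments are rarely this consistent, which is why the eigenvector is used rather than any single row of the matrix.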

In all of the elicitation methods, Chameau & Santamarina [15] obtain the membership functions based on averaging or aggregation of the responses from several assessors. In that sense they do not subscribe to the individualistic interpretation of fuzziness. Chameau & Santamarina justify this approach by assuming that fuzziness is a property of the phenomenon rather than a property attributed by the observer. All of the above described approaches are called manual methods of determining membership function because expert/experts are required to obtain membership function. All the manual approaches suffer from the deficiency that they rely on very subjective interpretation of words, the foibles of human experts and generally all the knowledge acquisition problems that are well documented [6] with knowledge based systems.

4. Project Description

This research aims to determine the best way of setting the membership values using feedback from the community for the fuzzy relational database. The purpose of the fuzzy database implementation is to retrieve images by using fuzzy queries whose common-language descriptions are defined by the consensus of a particular user community. The fuzzy set, which represents the fuzzy attribute values of the images, is determined through a membership function.

The first question to answer is how best to determine the membership function. It is proposed to construct membership values by the Direct Rating method. This approach subscribes to the point of view that fuzziness arises from individual subjective vagueness. This method of constructing the membership function will then be compared with the present prototype, which uses the random method explained in section 1.2, to determine which method gives the most user satisfaction with the least feedback from the community.

4.1. Testing

The previous implementation was modified so that the membership value for the blue eye color attribute of each image was assigned using the direct rating method. User feedback for both prototypes was collected in two sessions, training and testing. The user interface of both prototypes was altered to make it more user friendly. The source code with changes can be referenced in Appendix B.

In the direct rating method, random weights were not assigned to the blue eye color attribute of the images as was done in the previous prototype [1]. In the training session, the question put to the community was “How blue are the eyes?” and they responded using a simple indicator on a sliding scale. The leftmost position on the sliding scale indicated that the eyes were not blue, whereas the rightmost position indicated that the eyes were 100% blue. The training session comprised 21 people. The values given for the blue eye color attribute were recorded in the database as shown in Figure 1.

[pic]

Figure 1: Scores and Distribution for Image 1

The first column in Figure 1 represents the community’s view of the degree to which image 1 had blue eyes. A frequency distribution was chosen to summarize the community feedback for each attribute in order to achieve maximum user satisfaction. A frequency distribution table was created that grouped scores into non-overlapping intervals called frequency ranges. The frequency ranges were chosen to match the ranges of the very (75-100), medium (30-74) and slightly (0-29) modifiers. The number of scores that fell into each frequency range was counted, and the counts, or frequencies, were listed against their respective ranges. The base of each rectangle in the histogram corresponds to a frequency range, and the height of each rectangle equals the number of scores in that range. The frequency histogram for each of the 40 images can be referenced in Appendix C. Each image was assigned the threshold membership value (Table 2) of the frequency range with the highest count. For example, 13 people out of 21 chose medium blue for image 1, while 6 people out of 21 said that the image had very blue eyes. Therefore, the membership value of 0.52 was assigned to the blue eye color attribute of image 1. Weights were similarly assigned for each of the 40 images as shown in Figure 2.

In the random method, random weights were initially assigned to the blue eye color attribute of the images. A training session with 21 people was conducted to learn the membership weights. Based on the community feedback, the membership weight of the blue eye color attribute of each image was adjusted as explained in section 1.2.
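The frequency-distribution step can be sketched as follows. The ranges and thresholds are taken from the text and Table 2; the function names, the assumption that scores are whole numbers from 0 to 100, and the individual scores are illustrative. Only the counts for image 1 (13 medium and 6 very, leaving 2 slightly out of 21) follow the text; the actual scores are in Figure 1. The text does not say how ties between ranges are resolved; here the range encountered first wins.

```python
from collections import Counter

# (modifier, low score, high score, threshold from Table 2)
MODIFIERS = [
    ("slightly", 0, 29, 0.20),
    ("medium", 30, 74, 0.52),
    ("very", 75, 100, 0.87),
]


def modifier_for(score):
    """Map a 0-100 rating to the modifier whose frequency range contains it."""
    for name, low, high, _ in MODIFIERS:
        if low <= score <= high:
            return name
    raise ValueError(f"score out of range: {score}")


def direct_rating_weight(scores):
    """Return (weight, counts): the threshold of the most frequent range."""
    counts = Counter(modifier_for(score) for score in scores)
    winner = counts.most_common(1)[0][0]
    thresholds = {name: threshold for name, _, _, threshold in MODIFIERS}
    return thresholds[winner], counts


# Made-up scores reproducing the counts reported for image 1.
image_1_scores = [50] * 13 + [80] * 6 + [10] * 2
weight, counts = direct_rating_weight(image_1_scores)
print(weight)  # 0.52
```

A single pass over the training scores therefore fixes the weight, with no further feedback rounds required.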

[pic]

Figure 2: Membership Weights for Direct Rating Method

A testing session was conducted to compare both methods of eliciting membership values and to determine which method gives better community satisfaction. In the testing session, the fuzzy relational database was queried for “slightly”, “medium” and “very” blue eye color based on the membership values assigned by the random method and by the direct rating method. The community was asked whether or not they were satisfied with the result, and the percentage of user satisfaction was calculated. The percentages of user satisfaction for the random method and the direct rating method can be referenced in Appendix C. The line graph in Figure 3 compares user satisfaction for the 40 images. For example, for image 1, user satisfaction was 89% for the direct rating method and 67% for the random method.

[pic]

Figure 3: Comparison of Percentage of user satisfaction for all images

4.2 Results

The proposed direct rating method was the better method to determine membership function values, compared to the random method. The reasons are:

• The direct rating method is more efficient because it reaches consensus in fewer iterations. For example, the random method can initially assign a membership value of 1.0 to an image whose eyes are not blue. It would then take at least 100 feedback responses to reach a weight of 0.0, provided each user’s perception is the same.

• It is based on the maximum vote. The threshold value is assigned based on the range with the highest count in the frequency distribution table.

• It achieved higher user satisfaction than the random method in 80% of cases.

• It is efficient even for smaller communities. For the random method to be efficient in all cases, the community should comprise at least 100 people, because of the extreme case described above, in which a weight initialized at 1.0 must be moved to 0.0.
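The figure of 100 feedback responses in the points above follows directly from the step size of 0.01 described in section 1.2. Written out, with $w_0$ for the initial weight, $w^{*}$ for the weight the community would agree on, and $\Delta$ for the step size (notation introduced here for illustration only):

```latex
n_{\min} = \frac{\lvert w_0 - w^{*} \rvert}{\Delta}
         = \frac{\lvert 1.0 - 0.0 \rvert}{0.01}
         = 100
```

With a community of 21 users, this worst case would require each user to respond about five times for a single image ($\lceil 100 / 21 \rceil = 5$), even if every response pointed in the same direction.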

However, there were many factors that affected the results.

• The images were described by the community, not the database designers. Thus, it was expected that different users could describe the same image differently. A term like “very blue eyes” is therefore not objective but rather reflects a person’s definition of that term. These definitions depend on factors like culture and experience, and may change over time.

• The images that received less than 25% user satisfaction were not very clear, and it was very hard to make out the eye color. It would have been better if the images had focused more on the eyes. This was true for both methods. In the real world, however, perfect pictures cannot be guaranteed in such a system.

• For some images, user satisfaction was between 35% and 70% for the direct rating method because the frequencies of two ranges were very close. Although the range with the most votes was picked, overall satisfaction was low. For example, see Figure 4. Nine users out of 21 said that the eyes were medium blue, and 10 out of 21 said that the eyes were slightly blue. The membership weight of 0.2 was assigned to the image to reach maximum satisfaction, but testing revealed only 44% user satisfaction.

[pic]

Figure 4

It is suggested that the frequency distribution be recalculated after removing scores of 0 for the attribute. For example, in the case of Figure 4, after removing the scores of 0, 4 out of 15 users said that the eyes were slightly blue and 9 out of 15 said that the eyes were medium blue. The membership weight of 0.52 could then be assigned to image 33. This requires further testing.

• The community that trained the prototypes and the community that tested them were not the same. 21 users trained both prototypes; 27 users tested the direct rating method and 15 users tested the random method. This variation may be one reason that community consensus did not exceed 80% for all images. There may be other reasons; for example, communities may never agree that often. This requires further testing.

• The community did not define the ranges of the modifiers; they were defined by the database designer. One user may describe the range between 25 and 35 as slightly, while another user may describe the same range as medium. One strategy could be to designate a certain user as an “expert” whose opinions would be weighted more heavily than those of the rest of the group. The expert would define the ranges of the modifiers and educate the rest of the community about them. Another strategy could be to learn the ranges of the modifiers from the community feedback. This requires further investigation.

• The slightly modifier includes those images that have no blue eye color. The results would have improved if images with no blue eye color had been excluded. For example, in Figure 4, only four people would have described the image as slightly blue if the six scores of 0 were ignored.
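The suggested recalculation for Figure 4 can be sketched as below. The counts follow the text (10 slightly, 9 medium and therefore 2 very out of 21; after dropping the scores of 0, 4 slightly and 9 medium out of 15), but the individual scores are invented and the function name and `drop_zero` flag are hypothetical.

```python
from collections import Counter

# Frequency ranges from the text and thresholds from Table 2.
RANGES = {"slightly": (0, 29), "medium": (30, 74), "very": (75, 100)}
THRESHOLDS = {"slightly": 0.20, "medium": 0.52, "very": 0.87}


def winning_weight(scores, drop_zero=False):
    """Return (weight, counts) for the most frequent range.

    With drop_zero=True, scores of exactly 0 are removed before counting.
    """
    if drop_zero:
        scores = [score for score in scores if score != 0]
    if not scores:
        raise ValueError("no scores left to count")
    counts = Counter(
        name
        for score in scores
        for name, (low, high) in RANGES.items()
        if low <= score <= high
    )
    winner = counts.most_common(1)[0][0]
    return THRESHOLDS[winner], dict(counts)


# Made-up scores matching the counts reported for Figure 4 (image 33).
image_33_scores = [0] * 6 + [15] * 4 + [50] * 9 + [80] * 2

print(winning_weight(image_33_scores)[0])                  # 0.2
print(winning_weight(image_33_scores, drop_zero=True)[0])  # 0.52
```

Dropping the zero scores changes the winning range from slightly to medium, which is the effect the recalculation is meant to achieve. Whether it improves satisfaction in testing remains open, as noted above.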

It is believed that if these factors were taken into account the direct rating method would reach higher satisfaction. But this requires more study.

4.3 Obstacles

There were many obstacles that had to be overcome to conduct this research. The main obstacle was learning and installing Microsoft .Net and SQL Server. Installation of the previous prototype on the local machine caused lots of problems because of inadequate documentation. Instructions to install this application locally can be referenced in Appendix D. Problems like changes in the user interface to use a slide bar, making the user interface friendly to conduct experiments, finding a way to display all images were other challenges that took time.

5. Future Work

The research raised some questions, which can be explored in future fuzzy relational database research.

One suggested area is the use of machine learning techniques, such as the Least Mean Squares algorithm or clustering, to adjust the weights in order to reach maximum satisfaction. Different strategies can be used to train the system. One strategy would be for the community to designate certain users as “experts” whose opinions are weighted more heavily than those of the rest of the group. Rather than starting from random values, these experts would initially be shown the images, and their descriptions would establish the initial membership values; the remaining users would then modify the weights from that point forward. Another strategy would be to initialize the weights with random values and learn until a certain percentage of satisfaction is reached, then stop learning and resume either after a certain period (for example, a month or a year) or once the level of dissatisfaction reaches a particular percentage.
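A minimal sketch of these training strategies follows. It assumes one stored weight per image and attribute on a 0 to 100 scale, treats each user's rating as the target of a Least Mean Squares (delta rule) step, and measures dissatisfaction as the share of recent ratings that lie more than a tolerance away from the stored weight. The function names, the learning rate, the tolerance and the 20% threshold are all hypothetical choices made for illustration, not values from the prototype.

```python
def initial_weight(expert_scores):
    """Start from the mean of the experts' ratings instead of a random value."""
    return sum(expert_scores) / len(expert_scores)

def lms_update(weight, target, rate=0.1):
    """One Least Mean Squares step: move the weight toward a user's rating."""
    return weight + rate * (target - weight)

def train(weight, feedback, rate=0.1):
    """Apply the ratings in order and return the adjusted weight."""
    for rating in feedback:
        weight = lms_update(weight, rating, rate)
    return weight

def needs_retraining(weight, recent_ratings, tolerance=10.0, max_dissatisfied=0.2):
    """True when too many recent ratings lie further than tolerance from the weight."""
    dissatisfied = sum(1 for r in recent_ratings if abs(r - weight) > tolerance)
    return dissatisfied / len(recent_ratings) > max_dissatisfied

# Experts establish the starting value; the community then adjusts it.
w = initial_weight([60, 70])             # 65.0
w = train(w, [80, 80, 80], rate=0.5)     # 72.5, then 76.25, then 78.125
print(w)                                 # 78.125
print(needs_retraining(w, [80, 80, 80]))      # False: everyone is within tolerance
print(needs_retraining(w, [20, 30, 80, 85]))  # True: half of the ratings are far away
```

Learning could be paused while `needs_retraining` returns False and resumed once it returns True, which corresponds to restarting when dissatisfaction reaches a chosen percentage.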

Another area of future interest is improving the user interface. Instead of showing the whole image, the interface could focus on the attribute under study. For example, when training images for eye color, the results might have been better if the display had focused on the eyes. The user interface should also be made more user-friendly.

Another area of interest is defining the ranges of the modifiers. One strategy would be to designate certain users as “experts” whose opinions are weighted more heavily than those of the rest of the group. The experts would define the ranges of the modifiers and educate the rest of the community about them. Another strategy would be to learn the ranges of the modifiers from community feedback. A further area of future interest is finding the best method for handling unclear pictures.
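Learning the modifier ranges from community feedback could look roughly like the sketch below. It assumes that feedback is available as (modifier, score) pairs, that is, the modifier a user chose together with the score the same user gave. It takes the range of each modifier to be the span of its scores after a fraction of the extreme values is trimmed from each end. Both the data layout and the trimming rule are assumptions made for illustration, and the feedback values are made up.

```python
from collections import defaultdict

def learn_ranges(feedback, trim=0.1):
    """Return {modifier: (low, high)} learned from (modifier, score) pairs.

    trim is the fraction of scores dropped at each end before taking the
    minimum and maximum, so that a few outlying raters do not stretch a range.
    """
    if not 0 <= trim < 0.5:
        raise ValueError("trim must be in [0, 0.5)")
    by_modifier = defaultdict(list)
    for modifier, score in feedback:
        by_modifier[modifier].append(score)
    ranges = {}
    for modifier, scores in by_modifier.items():
        scores.sort()
        k = int(len(scores) * trim)  # number of values dropped at each end
        kept = scores[k:len(scores) - k] if k else scores
        ranges[modifier] = (kept[0], kept[-1])
    return ranges

def overlaps(a, b):
    """True when two inclusive (low, high) ranges share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]

# Made-up feedback: one rater used "slightly" for a score of 90.
feedback = [("slightly", s) for s in [5, 20, 25, 28, 30, 30, 32, 35, 35, 90]]
feedback += [("medium", s) for s in [30, 40, 50, 60]]

ranges = learn_ranges(feedback)
print(ranges)  # {'slightly': (20, 35), 'medium': (30, 60)}
print(overlaps(ranges["slightly"], ranges["medium"]))  # True
```

The learned ranges overlap between 30 and 35, which mirrors the disagreement over whether scores around 25 to 35 mean slightly or medium. A real system would still need a rule for resolving such overlaps.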

Appendix A

Stored Procedures

A. Fetch_All_Images

/* Database Research Spring 2005
   Shweta Sanghi
   Fetch_All_Images, stored procedure, fetches all images from the database for training */
CREATE PROCEDURE [Fetch_All_Images]
AS
SET ANSI_NULLS ON
exec('select * from Person P')
GO

B. Fetch_Data

/* Database Research Spring 2005
   Shweta Sanghi
   Fetch_Data, stored procedure, fetches data from the database based on the old method of assigning weights. It interprets the fuzzy modifiers and translates them into SQL queries for our database */
CREATE PROCEDURE [Fetch_Data]
    @query as varchar(10),
    @s1 as varchar(50),
    @s2 as varchar(50)
AS
SET ANSI_NULLS ON
declare @f_query as varchar(500),
        @high as float, @low as float
print @high
if @s2 = 'Blue' or
   @s2 = 'Green' or
   @s2 = 'Brown'
begin
    select @high = high from Range where modifier = @s1
    select @low = low from Range where modifier = @s1
    print @s2
    exec('select * from Person P,Color C where
    P.ID=C.ID and C.weight <= '+@high+' and C.weight >= '+@low+' and C.Color='+'"'+@s2+'"')
end
if @s2 = 'Broad' or
   @s2 = 'Average' or
   @s2 = 'Narrow'
begin
    select @high = high from Range where modifier = @s1
    select @low = low from Range where modifier = @s1
    exec('select * from Person P,Face F where
    P.ID=F.ID and F.weight >= '+@low+' and F.weight …
    … >= '+@low+' and F.weight …

    … > 0 Then

        Count = ds.Tables(0).Compute("COUNT(ID)", "")
        'The following block of code creates a new ImageSet object for each
        'record in the result set; each object contains the image and the slide bar
        'associated with it. It also reads the image in binary format from the database
        'and displays it appropriately.
        For I = 0 To Count - 1
            Dim S As New ImageSet
            Dim bits As Byte() = CType(ds.Tables(0).Rows(I).Item(2), Byte())
            Dim memorybits As New MemoryStream(bits)
            Dim bitmap As New Bitmap(memorybits)
            S.picture.Image = bitmap
            al.Add(S)
            al(I).ID = ds.Tables(0).Rows(I).Item(0)
        Next I
        'variable indicating the first picture to be displayed on a page
        nextPix = 0
        'variable indicating the remaining pictures to be displayed
        remainingPix = al.Count
        'The following call places the components on the Group Box and
        'then places the group box on the panel to be displayed.
        placeComponents(nextPix)
        'This procedure is invoked so that the update button is disabled
        'until the form displays the last page of the query. On the last page, the more button is disabled
        updateButtons()
        'label on the Panel.
        Label1.Text() = "Please Answer the following question."
    Else
        MessageBox.Show("No images match your criteria")
    End If
End Sub

'The following block of code places the components on the Group Box and
'then places the group box on the panel to be displayed.
Private Sub placeComponents(ByVal Position1 As Integer)
    Dim X As Integer, Y As Integer
    Y = 8
    Dim I As Integer
    'limit the display to 6 images
    Dim lastPix As Integer
    'determining the last picture for the current page
    If remainingPix … > 6 Then
        remainingPix = remainingPix - 6
    Else
        remainingPix = 0
    End If
    'update first picture on next page
    nextPix = nextPix + 6
End Sub

'set up buttons depending on number of images in query result
'This procedure is invoked so that the update button is disabled
'until the form displays the last page of the query. On the last page, the more button is disabled
Private Sub updateButtons()
    If al.Count …

    … > 0 Then

        Count = ds.Tables(0).Compute("COUNT(ID)", "")
        'The following block of code creates a new ImageSet object for each
        'record in the result set; each object contains the image and the Check Boxes
        'associated with it. It also reads the image in binary format from the database
        'and displays it appropriately.
        For I = 0 To Count - 1
            Dim S As New ImageSet(color_face)
            Dim bits As Byte() = CType(ds.Tables(0).Rows(I).Item(2), Byte())
            Dim memorybits As New MemoryStream(bits)
            Dim bitmap As New Bitmap(memorybits)
            S.picture.Image = bitmap
            al.Add(S)
            al(I).ID = ds.Tables(0).Rows(I).Item(0)
        Next I
        'variable indicating the first picture to be displayed on a page
        nextPix = 0
        'variable indicating the remaining pictures to be displayed
        remainingPix = al.Count
        'The following call places the components on the Group Box and
        'then places the group box on the panel to be displayed.
        placeComponents(nextPix)
        'This procedure is invoked so that the update button is disabled
        'until the form displays the last page of the query. On the last page, the more button is disabled
        updateButtons()
        'Changes the labels on the Panel depending on the query.
        If color_face = "Green" Or color_face = "Blue" Or color_face = "Brown" Or color_face = "green" Or color_face = "blue" Or color_face = "brown" Then
            Label1.Text() = "People with " + modifier + " " + color_face + " Eyes"
        Else
            Label1.Text() = "People With " + modifier + " " + color_face + " Faces"
        End If
    Else
        MessageBox.Show("No images match your criteria")
    End If
End Sub

'The following block of code places the components on the Group Box and
'then places the group box on the panel to be displayed.
Private Sub placeComponents(ByVal Position1 As Integer)
    Dim X As Integer, Y As Integer
    Y = 8
    Dim I As Integer
    'limit the display to 6 images
    Dim lastPix As Integer
    'determining the last picture for the current page
    If remainingPix … > 6 Then
        remainingPix = remainingPix - 6
    Else
        remainingPix = 0
    End If
    'update first picture on next page
    nextPix = nextPix + 6
End Sub

'set up buttons depending on number of images in query result
'This procedure is invoked so that the update button is disabled
'until the form displays the last page of the query. On the last page, the more button is disabled
Private Sub updateButtons()
    If al.Count …