A Novel Concept of Uncertainty Optimization Based Multi-Granular Rough Set and Its Application

Data is being generated at an exponential pace with the advancement of information technology, and such data often contains uncertain and vague information. Rough set approximation is a way to find information in a dataset under uncertainty and to classify the objects of the dataset. This work presents a mathematical approach to evaluating the uncertainties of datasets and its application to data reduction. In this work, we extend the multi-granulation variable precision rough set in the context of uncertainty optimization. We develop an uncertainty optimization-based multi-granular rough set (UOMGRS) to minimize the uncertainties in the dataset more effectively. Using UOMGRS, we find the most informative attribute in the feature space. It is desirable to minimize the rough set boundary region using the attribute having the highest approximation quality. Thus we group the attributes whose relative quality of approximation is the maximum, to maximize the positive region and minimize the uncertain region. We compare the UOMGRS with the single granulation rough set (SGRS) and the multi-granular rough set (MGRS). With our proposed method, we require only an average of 62% of the attributes for approximation, whereas SGRS and MGRS need an average of at least 72% of the attributes in the dataset to approximate the concepts in the dataset. Our proposed method requires less data for the classification of objects in the dataset and helps minimize the uncertainties in the dataset in a more efficient way.


Introduction
Each object possesses information, and we can classify objects into distinct categories using this information. Object information stored using different attributes is termed a knowledge system. The attributes or features are granules of information, and we can label objects using them. Each label in the knowledge system is termed a concept in the knowledge system [1]. Rough set theory (RST), proposed by Pawlak [2], is a practical approach for the approximation of sets in the knowledge system. It provides many useful tools for data analysis using the information available in the knowledge system. The growing applications of RST have developed a great deal of interest among researchers of expert systems and artificial intelligence. These include, but are not limited to, data engineering, knowledge acquisition, and decision making [3,4,5,6,7]. RST is very advantageous as it handles the data at hand and requires no prior information [8]. One of the applications of RST is the classification of the objects of the universe into different categories. The subsets of the feature set that keep the classification of the objects intact and preserve the information of the dataset as a whole are termed reducts of the feature space [2]. Rough set reducts for feature selection constitute a wrapper method in which feature subsets are formed using their quality of approximation of the object classification. The quality of approximation is used for the classification of objects into distinct categories. New possibilities for the application of artificial intelligence methods to the high-precision solution of boundary value problems are studied in [9]. Reducts are mainly used for feature selection and data reduction. RST uses a single binary relation for defining the set approximations. The indiscernibility relation is an equivalence relation that partitions the universal set into equivalence classes.
The classes of this indiscernibility relation describe the classification of objects as certain classification and possible classification. This concept approximation approach is very rigid, as no room is provided for any misclassification of the objects. Exploratory spatial data analysis is studied in [10]. In practical situations we have to deal with uncertain information, and the main limitation of RST is that it is incapable of modelling uncertain information [8]. The variable precision rough set (VPRS) model by Ziarko is a generalization of classical RST in which a specified degree of misclassification error is allowed in the set approximations. The main advantage of VPRS is that it minimizes the boundary region of the rough set. The applications of the VPRS model show its usefulness in data mining under uncertainty [11,12,13,14]. The classical rough set uses a single binary relation in the definition of set approximations. The multi-granulation rough set (MGRS) is an extension of the classical rough set using multiple equivalence relations on the universe of discourse. Qian et al. [15] proposed the MGRS approach for the case of multi-granulation. The MGRS model under the variable precision rough set environment was proposed by Wei et al. [16]. A specified value of misclassification error is used to find the classes of the multiple equivalence relations over the universe. The main limitations of this model are as follows:
• There is no method described for obtaining the value of the misclassification error.
• The same value of misclassification error is used for all the equivalence relations.
In this work, we consider the value of the misclassification error as depending on the class inclusion values of the multiple equivalence relations. We formulate an uncertainty optimization problem for minimizing the uncertainty using multiple equivalence relations over the attribute sets. We define the set approximations for the variable precision multi-granulation rough set from the perspective of uncertainty optimization. The reducts obtained are used for data reduction and feature selection. We treat the attribute with the highest quality of approximation as the most informative attribute in the feature space. It is desirable to minimize the uncertain region of this attribute using the positive regions of the remaining attributes. Hence we group the attributes whose relative quality of approximation is the maximum for data reduction. This study's main contribution is the extension of the variable precision multi-granulation rough set model from an uncertainty minimization point of view. This work gives useful results and properties in terms of multiple equivalence relations and class inclusion values. The application of the UOMGRS to finding reducts of the feature space using the uncertainty optimization problem is discussed.

Pawlak's single granulation rough set approximations (SGRS)
Let U denote the non-empty finite set of objects, also called the universe of discourse, let Atr be the non-empty set of condition attributes, and let {D} be the decision attribute. For any A ∈ Atr and x ∈ U, let A(x) be the value of attribute A for x, and let V be the set of all attribute values; then (U, Atr ∪ {D}, V) is the representation of the knowledge system. For x, y ∈ U, the relation θ_Atr is defined as follows:

x θ_Atr y ⟺ A(x) = A(y) for every A ∈ Atr. (1)

It is clear that θ_Atr is an equivalence relation. The class of θ_Atr containing x is denoted Atr(x), so we have

Atr(x) = {y ∈ U : x θ_Atr y}.

For S ⊆ U and x ∈ S, we define the class of x given the set S by the relation θ_Atr as

Atr(x|S) = Atr(x) ∩ S.

Clearly Atr(x|S) ⊆ Atr(x). Ziarko [8] defines the misclassification error of the class Atr(x) in the concept χ as

e(Atr(x), χ) = 1 − |Atr(x) ∩ χ| / |Atr(x)|,

where |X| is the cardinality of X. For any S ⊆ U, we define the majority inclusion relation over the set S as follows:

χ ⊇^α Atr(x|S) ⟺ e(Atr(x|S), χ) ≤ α.

For the value α = 0, this reduces to Atr(x|S) ⊆ χ. The relation θ_Atr partitions U into equivalence classes. Let Atr also denote the resulting partition of U, and let χ be any concept in the knowledge system. The lower approximation of χ is denoted χ_Atr and is defined as

χ_Atr = {x ∈ U : Atr(x) ⊆ χ}.

The lower approximation χ_Atr is the set of x ∈ U that are certainly classified as χ. Another approximation of χ using Atr is the upper approximation of χ, denoted χ̄_Atr and defined as

χ̄_Atr = {x ∈ U : Atr(x) ∩ χ ≠ ∅}.

The upper approximation χ̄_Atr is the set of x ∈ U that possibly belong to χ. The pair ⟨χ_Atr, χ̄_Atr⟩ is termed a rough set, and the boundary region ∂_Atr(χ) of the rough set is given by

∂_Atr(χ) = χ̄_Atr − χ_Atr.

The boundary region, also termed the uncertain region, is the set of objects we cannot classify with certainty using Atr.
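The approximations above can be sketched in a few lines of Python. The toy table and helper names below are illustrative assumptions, not the paper's Table 1:

```python
from collections import defaultdict

# Hypothetical toy knowledge system: six objects, one attribute 'A'.
TABLE = {1: {"A": "a"}, 2: {"A": "a"}, 3: {"A": "b"},
         4: {"A": "b"}, 5: {"A": "c"}, 6: {"A": "c"}}
U = set(TABLE)

def partition(universe, table, attrs):
    # Equivalence classes of the indiscernibility relation theta_Atr:
    # x ~ y iff A(x) = A(y) for every attribute A in attrs.
    classes = defaultdict(set)
    for x in universe:
        classes[tuple(table[x][a] for a in attrs)].add(x)
    return list(classes.values())

def lower_approx(universe, table, attrs, concept):
    # Objects certainly in the concept: their whole class lies inside it.
    return {x for c in partition(universe, table, attrs) if c <= concept for x in c}

def upper_approx(universe, table, attrs, concept):
    # Objects possibly in the concept: their class intersects it.
    return {x for c in partition(universe, table, attrs) if c & concept for x in c}

concept = {1, 2, 3}
low = lower_approx(U, TABLE, ["A"], concept)   # -> {1, 2}
upp = upper_approx(U, TABLE, ["A"], concept)   # -> {1, 2, 3, 4}
boundary = upp - low                           # -> {3, 4}, the uncertain region
```

Here the class {3, 4} meets the concept but is not contained in it, so it falls into the boundary region.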

Variable precision rough set approximations [8]
Ziarko [8] generalized the notion of rough set approximations by allowing a specified degree of misclassification error in the set approximations. With the majority inclusion relation over U and a given fixed value of misclassification error α ∈ [0, 0.5), the generalized approximations are defined by:

χ^α_Atr = {x ∈ U : e(Atr(x), χ) ≤ α},

χ̄^α_Atr = {x ∈ U : e(Atr(x), χ) < 1 − α}.
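A minimal sketch of these α-approximations, assuming the equivalence classes have already been computed (the example classes are hypothetical):

```python
def mis_error(cls, concept):
    # Ziarko's misclassification error: e(C, chi) = 1 - |C ∩ chi| / |C|
    return 1 - len(cls & concept) / len(cls)

def vprs_lower(classes, concept, alpha):
    # alpha-lower approximation: classes whose error does not exceed alpha
    return {x for c in classes if mis_error(c, concept) <= alpha for x in c}

def vprs_upper(classes, concept, alpha):
    # alpha-upper approximation: classes whose error is strictly below 1 - alpha
    return {x for c in classes if mis_error(c, concept) < 1 - alpha for x in c}

classes = [{1, 2, 3, 4}, {5, 6}]
concept = {1, 2, 3, 5}
# e({1,2,3,4}) = 0.25 and e({5,6}) = 0.5, so with alpha = 0.3 only the
# first class enters the lower approximation; with alpha = 0 neither does.
low = vprs_lower(classes, concept, 0.3)   # -> {1, 2, 3, 4}
```

Note how α = 0 recovers the classical lower approximation, while a positive α shrinks the boundary region.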

Multi-granulation rough set approximations (MGRS) [17]
Let (U, Atr ∪ {D}, V) be the knowledge system, let At_1, At_2 ⊆ Atr, let At_1 and At_2 also denote the two partitions of U by the relations θ_At1 and θ_At2 respectively, and let χ be any concept in U. The (optimistic) multi-granulation approximations of χ are defined by:

χ_{At1+At2} = {x ∈ U : At_1(x) ⊆ χ or At_2(x) ⊆ χ},

χ̄_{At1+At2} = ∼((∼χ)_{At1+At2}).
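A sketch of the optimistic multi-granulation approximations over a hypothetical two-attribute table (names and data are ours):

```python
def classes_fn(universe, table, attrs):
    # Return a function mapping x to its equivalence class under theta_At.
    key = lambda x: tuple(table[x][a] for a in attrs)
    return lambda x: {y for y in universe if key(y) == key(x)}

def mgrs_lower(universe, grains, concept):
    # Optimistic MGRS: x is in the lower approximation if its class under
    # AT LEAST ONE granulation is contained in the concept.
    return {x for x in universe if any(g(x) <= concept for g in grains)}

def mgrs_upper(universe, grains, concept):
    # Dual definition: complement of the lower approximation of the complement.
    return universe - mgrs_lower(universe, grains, universe - concept)

TABLE = {1: {"P": 0, "Q": 0}, 2: {"P": 0, "Q": 1},
         3: {"P": 1, "Q": 1}, 4: {"P": 1, "Q": 2}}
U = set(TABLE)
grains = [classes_fn(U, TABLE, ["P"]), classes_fn(U, TABLE, ["Q"])]
low = mgrs_lower(U, grains, {1, 2, 3})   # -> {1, 2, 3}
```

Object 3 is excluded by the P-granulation (its P-class is {3, 4}) but admitted by the Q-granulation (its Q-class is {2, 3}), illustrating the "at least one" semantics.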

Multi-granulation variable precision rough set approximations [18]
Let δ be the fixed predefined value of misclassification error, with δ ∈ [0, 0.5). The approximations of χ are given by:

χ^δ_{At1+At2} = {x ∈ U : e(At_1(x), χ) ≤ δ or e(At_2(x), χ) ≤ δ},

χ̄^δ_{At1+At2} = ∼((∼χ)^δ_{At1+At2}).
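One common reading of the δ-lower approximation can be sketched as follows; the partitions below are hypothetical, and `grain` is our helper name:

```python
def grain(partition):
    # Map each object to its block in a fixed partition.
    return lambda x: next(c for c in partition if x in c)

def mgvprs_lower(universe, grains, concept, delta):
    # x belongs if, under at least one granulation, its class is included
    # in the concept up to misclassification error delta.
    def err(cls):
        return 1 - len(cls & concept) / len(cls)
    return {x for x in universe if any(err(g(x)) <= delta for g in grains)}

U = {1, 2, 3, 4, 5}
grains = [grain([{1, 2, 3}, {4, 5}]), grain([{1}, {2, 3, 4, 5}])]
concept = {1, 2, 3, 4}
# With delta = 0 only classes fully inside the concept count; relaxing
# delta to 0.26 also admits the class {2,3,4,5}, whose error is 0.25.
strict = mgvprs_lower(U, grains, concept, 0.0)    # -> {1, 2, 3}
relaxed = mgvprs_lower(U, grains, concept, 0.26)  # -> {1, 2, 3, 4, 5}
```

The same fixed δ is applied to every granulation here, which is exactly the limitation the paper's UOMGRS model addresses.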

Uncertainty optimization model
This section specifies the multi-granular variable precision rough set in the context of uncertainty optimization. We define the uncertainty optimization based multi-granulation rough set (UOMGRS) approximations and compare them with the single granulation rough set (SGRS) and multi-granulation rough set (MGRS) approximations.

Uncertainty minimization problem
Let Atr be any n-dimensional attribute set, let δ be the specified misclassification error value, and let D be the decision attribute. Let D also denote the corresponding partition of U; then each element of D is a concept in the knowledge system. Let χ_i ∈ D, i = 1, 2, ..., k; then each χ_i is a concept in the knowledge system. Let At_i ⊆ Atr, i = 1, 2, ..., m, denote subsets of the attribute set Atr, and let At_i(x) denote the equivalence class of x ∈ U by the relation θ_Ati as defined by (1). The positive region [12,16] of the knowledge system with respect to the decision attribute D using the attribute set Atr is defined as the union of the lower approximations of the decision classes:

POS_Atr(D) = ∪_{i=1}^{k} (χ_i)_Atr.

The positive region is the set of elements that are classified into a given concept with certainty. The complement of the positive region is the set of all elements that cannot be classified with certainty into the given concepts. We call this region the uncertain region Unc(Atr) of the attribute set Atr, i.e.

Unc(Atr) = ∼POS_Atr(D).

The attribute set At ⊆ Atr for which Unc(At) is minimal is the most informative feature subset. Thus we have to minimize the uncertain region Unc(At) of the attribute set At using the positive regions POS_Ati(D) of the attribute sets At_i, i = 1, 2, ..., m. Let POS_(Ati|Unc(At))(D) denote the positive region of At_i restricted to the uncertain region of At; then the uncertainty minimization problem is given as:

maximize Σ_{i=1}^{m} |POS_(Ati|Unc(At))(D)|

subject to the constraint that each class admitted to a positive region has misclassification error at most δ.

Uncertainty optimization based multi-granulation rough set approximations (UOMGRS)

Definition 3.2.1 (Tolerance error): Let P(x) denote the class of the equivalence relation θ_P for the attribute set P. We define the tolerance error over U for the given value of misclassification error δ as the average of the values e(P(x|U), χ) such that e(P(x|U), χ) ≤ δ.
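The positive and uncertain regions defined above can be sketched as follows; the toy table and decision classes are our own illustration, not the paper's data:

```python
from collections import defaultdict

def partition(universe, table, attrs):
    # Equivalence classes of the indiscernibility relation on attrs.
    classes = defaultdict(set)
    for x in universe:
        classes[tuple(table[x][a] for a in attrs)].add(x)
    return list(classes.values())

def positive_region(universe, table, attrs, decision_classes):
    # POS_Atr(D): union of the lower approximations of all decision classes,
    # i.e. objects classified into some concept with certainty.
    return {x for c in partition(universe, table, attrs)
            if any(c <= chi for chi in decision_classes) for x in c}

def uncertain_region(universe, table, attrs, decision_classes):
    # Unc(Atr) = complement of POS_Atr(D) in U.
    return set(universe) - positive_region(universe, table, attrs, decision_classes)

# Hypothetical table: attribute A alone cannot separate the decision
# classes, but the pair {A, B} can.
TABLE = {1: {"A": 0, "B": 0}, 2: {"A": 0, "B": 1},
         3: {"A": 1, "B": 0}, 4: {"A": 1, "B": 1}}
U = set(TABLE)
D = [{1, 3}, {2, 4}]   # decision classes

pos_a = positive_region(U, TABLE, ["A"], D)        # -> set()
pos_ab = positive_region(U, TABLE, ["A", "B"], D)  # -> {1, 2, 3, 4}
```

With A alone every class straddles both decisions, so the whole universe is uncertain; adding B empties the uncertain region.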
Let P and Q be two attribute sets, let χ be any concept in the knowledge system, and let δ be the specified value of misclassification error. The lower approximation of χ by Q given P for the misclassification error value δ, denoted χ^δ(Q|P), is given by

χ^δ(Q|P) = χ^α_P ∪ {x ∈ ∼χ^α_P : e(Q(x|∼χ^α_P), χ) ≤ β},

where α is the tolerance error over U and β is the tolerance error over ∼χ^α_P for the misclassification error value δ. Note that χ^α_P is the positive region of the decision D with respect to the attribute set P for the concept χ; thus ∼χ^α_P is the uncertain region of P with respect to the decision class χ. Hence, by this definition, we minimize the uncertain region of the attribute set P using the positive region of the attribute set Q for the decision concept χ.
The upper approximation of χ by Q given P for the misclassification error value δ, denoted χ̄^δ(Q|P), is given by

χ̄^δ(Q|P) = ∼((∼χ)^δ(Q|P)).

It is clear that when δ = 0, we have α = β = 0.

Example 3.2.1: Let us take the knowledge system represented in Table 1. It contains the information of eight patients using three attributes. Let us assume the misclassification error value is assigned as δ = 0; thus the tolerance errors are α = β = 0. The lower approximation of χ by Q given P is computed directly from the definition. To find the upper approximation of χ, we use ∼χ = (Decision, Negative), i.e. ∼χ = {p_1, p_4, p_5, p_6}; the upper approximation of χ by Q given P then follows by duality. For comparison, the lower and upper approximations of the multi-granulation rough set and of Pawlak's rough set are obtained on the same data. The above example illustrates the differences among Pawlak's rough set approximations, multi-granulation rough set approximations, and uncertainty optimization-based rough set approximations.

Here we give some results regarding the various rough set approximations.

Proposition 3.2.1. Let (U, Atr ∪ {D}, V) be the knowledge system, let P and Q denote the partitions of U by the relations θ_P and θ_Q respectively, and let χ be any concept in the knowledge system. Then χ^0_{P+Q} ⊆ χ^0(Q|P), with equality when P = Q.

Proof: If P = Q (P, Q ⊆ Atr), then the multi-granulation and UOMGRS lower approximations both reduce to the single-granulation lower approximation, so χ_{P∪Q} = χ^0(Q|P) = χ^0_{P+Q}. Let us now suppose that P ≠ Q.

(1a) If a class of θ_P or θ_Q is contained in χ, it contributes both to χ^0_{P+Q} and to χ^0(Q|P); thus we have χ^0_{P+Q} ⊆ χ^0(Q|P).

(2) From the first part we have χ^0_{P+Q} ⊆ χ^0(Q|P).

Proposition 3.2.2. Let (U, Atr ∪ {D}, V) be the knowledge system, let P and Q denote the partitions of U by the relations θ_P and θ_Q respectively, and let χ be any concept in the knowledge system. Then χ^0(Q|P) ⊆ χ^δ(Q|P) for all δ ≥ 0.

Proof: For δ ≥ 0 we have α, β ≥ 0, so every class admitted at tolerance error 0 is also admitted at tolerance errors α and β. Thus χ^0(Q|P) ⊆ χ^δ(Q|P). But from Proposition 3.2.1 we have χ^0_{P+Q} ⊆ χ^0(Q|P). Thus χ^0_{P+Q} ⊆ χ^δ(Q|P).

For the decision attribute D, the quality of approximation of D, also termed the degree of dependency [2], is given by

γ_Atr(D) = |POS_Atr(D)| / |U|.

Example 3.2.2: Let us take the knowledge system represented in Table 1. Thus we have Atr = {Dry cough (P), Tiredness (Q), Fever (R)}. Here we consider each attribute as a singleton attribute subset of Atr. Let the concepts in the knowledge system be denoted by χ_1 = (Decision, Positive) and χ_2 = (Decision, Negative); then we have χ_1 = {p_2, p_3, p_7, p_8} and χ_2 = {p_1, p_4, p_5, p_6}. For the misclassification error value δ = 0, the positive region of each singleton attribute set is obtained by using (15). The positive region of attribute P is the maximum among the attribute subsets of Atr, hence we assign attribute P rank 1. Next, by (16), the uncertain region of attribute P is Unc(P) = {p_6, p_7, p_8}. The positive regions of the remaining attribute subsets within Unc(P) satisfy |POS_(Q|Unc(P))| ≥ |POS_(R|Unc(P))|, hence we assign attribute Q rank 2. Now we have POS_{P,Q}(D) = U, and by (22), γ_{P,Q}(D) = 1. Hence the sorted attribute list of Atr according to rank is {P, Q}.
Here we give an algorithm to find the positive region of a given attribute set and sort the attributes according to their positive regions for data reduction. Initially, we consider each attribute as a singleton subset of the given attribute set and find each attribute's positive region. Let Pos(F) denote the positive region of the feature set F ⊆ Atr with respect to the decision attribute D. Next, we minimize the uncertain region of the attribute whose positive region is the maximum, using the positive regions of the remaining attributes. Algorithm 1 finds and sorts attributes according to their positive regions. We measure each attribute's quality of approximation to group attributes for uncertainty minimization, using the proposed uncertainty minimization problem. We have used three datasets from the UCI machine learning repository [19] in our study; their information is provided in Table 2. Here we give step-by-step information about the attributes selected by the proposed algorithm using their quality of approximation. We assigned the misclassification error value δ = 0.26 because the positive region of every attribute was empty for values below 0.26. Here we denote γ(POS_Ati(D)) by γ(At_i), and let ATT denote the iteration-wise sorted list of attributes. The iteration-wise chosen attributes are given in Table 3 (iteration-wise sorted attribute list according to quality of approximation for misclassification error value δ = 0.26). From Table 3, we have the sorted list of attributes for the Tic-Tac-Toe dataset according to their quality of approximation as {A_5, A_1, A_9, A_3, A_7, A_8, A_2}. Here, we have sorted the attributes according to their positive regions; the most informative attribute is the one whose positive region is the maximum. Hence we minimize the uncertain region of the most informative attribute using the positive regions of the remaining attributes.
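Since Algorithm 1 itself is not reproduced in this excerpt, the following is a speculative greedy sketch consistent with the description above; the table, decision classes, helper names, and tie-breaking order are all our assumptions:

```python
from collections import defaultdict

def partition(universe, table, attrs):
    classes = defaultdict(set)
    for x in universe:
        classes[tuple(table[x][a] for a in attrs)].add(x)
    return list(classes.values())

def positive_region(universe, table, attrs, decision_classes):
    return {x for c in partition(universe, table, attrs)
            if any(c <= chi for chi in decision_classes) for x in c}

def rank_attributes(universe, table, attrs, decision_classes):
    # Greedy ranking: pick the attribute with the largest positive region,
    # then repeatedly pick the attribute whose positive region resolves most
    # of the remaining uncertain region, until no attribute adds certainty.
    # The dependency degree gamma = |POS| / |U| reaches 1 when the loop
    # empties the uncertain region.
    ranked, remaining = [], list(attrs)
    uncertain = set(universe)
    while remaining and uncertain:
        restricted = [chi & uncertain for chi in decision_classes]
        def gain(a):
            return len(positive_region(uncertain, table, [a], restricted))
        best = max(remaining, key=gain)
        if gain(best) == 0:
            break
        uncertain -= positive_region(uncertain, table, [best], restricted)
        ranked.append(best)
        remaining.remove(best)
    return ranked

# Hypothetical table: A resolves {1, 2} with certainty; B then resolves
# the remaining uncertain region {3, 4}.
TABLE = {1: {"A": 0, "B": 0}, 2: {"A": 0, "B": 1},
         3: {"A": 1, "B": 0}, 4: {"A": 1, "B": 1}}
U = set(TABLE)
D = [{1, 2, 3}, {4}]

order = rank_attributes(U, TABLE, ["A", "B"], D)   # -> ["A", "B"]
```

Restricting the decision classes to the current uncertain region at each step mirrors the paper's idea of covering Unc(At) with the positive regions of the remaining attributes.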
Thus, by using the uncertainty optimization based multi-granulation rough set, we obtain a greater quality of approximation compared with the other approximation approaches. We have compared the quality of approximation of the attribute set by UOMGRS with that of SGRS and MGRS. The qualities of approximation of the attributes with the different approaches for the datasets are listed in Tables 4-6 and Figures 1-3. The comparison shows that the quality of approximation with UOMGRS is greater than that of SGRS and MGRS.
The SGRS requires 8 out of 9 attributes for approximation of the concepts in the Tic-Tac-Toe dataset. It also required 21 out of 33 attributes for approximation of the Dermatology dataset concepts and 6 out of 6 attributes for the Car dataset concepts. Thus it required 35 out of 48 attributes to approximate the concepts in the three datasets, i.e., approximately 72% of the attributes. MGRS requires 48 out of 48 attributes for approximation of the concepts in the datasets, i.e., it requires 100% of the attributes. In comparison, UOMGRS requires 30 out of 48 attributes across the three datasets to approximate the concepts, i.e., approximately 62% of the attributes.

Discussion and Conclusions
In this work, we have extended the multi-granulation variable precision rough set in the context of uncertainty optimization. As a result, we formulated the uncertainty optimization-based multi-granulation rough set approximation (UOMGRS). Using the proposed method, we find the attribute with the highest quality of approximation, as it is the most informative attribute in the feature space. It is desirable to minimize the rough set boundary region using the attribute having the highest approximation quality. Thus we group the attributes whose relative quality of approximation is the maximum, to maximize the positive region and minimize the uncertain region. The subsets whose quality of approximation is 100% are the reducts of the attribute set. An algorithm based on uncertainty optimization is given for sorting the attributes according to their approximation quality and for grouping features to maximize the quality of approximation. The UOMGRS is compared with the single granulation rough set (SGRS) and the multi-granulation rough set (MGRS). The quality of approximation using UOMGRS is found to be greater than that of SGRS and MGRS. The SGRS and MGRS required an average of at least 72% of the attributes for approximation of the concepts in the datasets. In comparison, UOMGRS needed only an average of 62% of the attributes. Thus, with our proposed method, object classification is possible with comparatively less data. The method helps minimize the uncertainties in the dataset in a more efficient way. The study provides a novel and more effective approach for minimizing boundary regions and its application to data reduction.