A GENERAL DATA MINING MODEL

Anita Wasilewska, Department of Computer Science, State University of New York, Stony Brook, NY, USA
Ernestina Menasalvas, Facultad de Informatica, Universidad Politecnica de Madrid, Madrid, Spain

ABSTRACT

We usually view Data Mining results, and present them to the user, in their syntactic, descriptive form, as it is the most natural form of communication. But the Data Mining process is deeply semantic in nature. The algorithms process records (semantics), finding similarities that are then presented in a descriptive, i.e. syntactic, form. Our General Model addresses this semantics-syntax duality, which is inherent to any Data Mining process. Moreover, our General Model formalizes the definition of Data Mining as the process of information generalization. In the model, Data Mining algorithms are defined as generalization operators. We use our framework to show that only three generalization operators (the classification, clustering, and association operators) are needed to express all Data Mining algorithms for classification, clustering, and association, respectively. Moreover, we formally prove that classification, clustering, and association analysis fall into three different generalization categories.

Finally, we examine, as a particular case, a General Classification Model. In particular, we define the notion of truthfulness, or degree of truthfulness, of syntactic descriptions obtained by any classification algorithm, represented within the Model by a classification operator. We use our framework to prove that, for any classification operator (method, algorithm), the set of all discriminant rules that are fully true (partially true) forms semantically the lower approximation (approximate lower approximation) of the class they describe. The set of characteristic rules semantically describes the upper approximation of that class.
The notion of the approximate lower approximation extends to any classification operator (method, algorithm) the ideas first expressed in 1986 by Wong, Ziarko, and Ye, and in the Variable Precision Rough Set (VPRS) model of Ziarko.
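
To make the three approximations concrete, the following is a minimal sketch of the standard rough-set construction on a toy table. It is not the operator-based formalization of the Model itself. The records, the attribute names, the helper functions `blocks` and `approximations`, and the parameter `min_inclusion` are all hypothetical and introduced only for illustration. Records with identical condition-attribute values are grouped into blocks. A block lies in the lower approximation of a class when all of its records belong to the class, in the approximate lower approximation when at least a fraction `min_inclusion` of them do (a simplified, assumed stand-in for the VPRS precision threshold), and in the upper approximation when at least one does.

```python
from collections import defaultdict


def blocks(records, condition_attrs):
    """Group record ids that are indiscernible on the condition attributes."""
    groups = defaultdict(set)
    for rid, rec in records.items():
        key = tuple(rec[a] for a in condition_attrs)
        groups[key].add(rid)
    return list(groups.values())


def approximations(records, condition_attrs, target, min_inclusion):
    """Return (lower, approximate lower, upper) approximations of `target`.

    `min_inclusion` is an assumed threshold in (0, 1]: the minimum fraction
    of a block that must fall inside `target` for the block to count toward
    the approximate lower approximation.
    """
    lower, approx_lower, upper = set(), set(), set()
    for block in blocks(records, condition_attrs):
        hits = len(block & target)
        if hits == len(block):                    # block fully inside the class
            lower |= block
        if hits / len(block) >= min_inclusion:    # block mostly inside the class
            approx_lower |= block
        if hits > 0:                              # block overlaps the class
            upper |= block
    return lower, approx_lower, upper


# Hypothetical toy data: two condition attributes and one decision attribute.
records = {
    1: {"outlook": "sunny", "windy": "no", "play": "yes"},
    2: {"outlook": "sunny", "windy": "no", "play": "yes"},
    3: {"outlook": "sunny", "windy": "no", "play": "yes"},
    4: {"outlook": "sunny", "windy": "no", "play": "no"},
    5: {"outlook": "rainy", "windy": "yes", "play": "no"},
    6: {"outlook": "rainy", "windy": "yes", "play": "no"},
    7: {"outlook": "overcast", "windy": "no", "play": "yes"},
    8: {"outlook": "overcast", "windy": "no", "play": "yes"},
    9: {"outlook": "rainy", "windy": "no", "play": "yes"},
    10: {"outlook": "rainy", "windy": "no", "play": "no"},
}

# The class to describe: all records with play = "yes".
target = {rid for rid, rec in records.items() if rec["play"] == "yes"}

lower, approx_lower, upper = approximations(
    records, ["outlook", "windy"], target, min_inclusion=0.75
)
print(sorted(lower))         # records covered by fully true discriminant rules
print(sorted(approx_lower))  # records covered by rules true to degree >= 0.75
print(sorted(upper))         # records covered by characteristic rules
```

In this toy table the rule "outlook = overcast and windy = no implies play = yes" is fully true, so its block belongs to the lower approximation. The analogous rule for "sunny, no" holds for three of four records, so its block enters only the approximate lower approximation at the assumed threshold of 0.75. The "rainy, no" block is split evenly and therefore contributes only to the upper approximation.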