Data Mining

[Name of the Institute]

Introduction

Data mining is the process of finding relevant information in data, as stated by Krivec and Matjaz in their book Data Mining Techniques. The writer, however, holds that this relevant information can take the form of frequent patterns, where a pattern can be a set of items, a subsequence, or a substructure. Put simply, data mining is the extraction of important or interesting patterns from large amounts of existing data, typically stored in databases. Often referred to as Knowledge Discovery in Databases (KDD), data mining involves collecting and using historical data to discover regularities, patterns, or relationships in large data sets (Michael, 1993).

For a pattern to be considered frequent, it must satisfy a minimum support condition; that is, it must occur at least a specified number of times in the data. Methods exist for mining static data sets, but data streams, such as financial data that arrive as a never-ending stream, call for different approaches. Because the whole history of the stream cannot be scanned, it is not easy to determine which patterns are frequent. There are two approaches to the difficulty of separating frequent from infrequent patterns when the frequency of occurrence can vary over time:

1. Keep track of only a predefined, limited set of items.

2. Derive an approximate set of answers that estimates the frequency of items within a user-defined error bound.

The first approach has very limited use because it requires the system to fix the scope of the patterns beforehand (Sholom, 1998).
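The second approach can be illustrated with a small sketch. The code below uses Lossy Counting (Manku and Motwani), one well-known algorithm of this kind; it is not necessarily the method the cited sources have in mind, and it handles single items rather than general patterns. The function names, the parameter names `epsilon` and `min_support`, and the sample stream are all assumptions made for illustration. An exact counter is included only so the approximation can be compared with the true counts on a small, finite sample.

```python
from collections import Counter
from math import ceil


def exact_counts(items):
    """Exact frequencies; only feasible when the whole data set can be scanned."""
    return Counter(items)


def lossy_count(stream, epsilon):
    """Approximate item counts over a stream (Lossy Counting sketch).

    epsilon is the user-defined error bound (assumed name): each stored
    count undercounts the true frequency by at most epsilon * n, where n
    is the number of items seen so far.
    """
    width = ceil(1 / epsilon)      # number of items per bucket
    entries = {}                   # item -> (count, maximum possible undercount)
    n = 0
    for item in stream:
        n += 1
        bucket = ceil(n / width)   # id of the current bucket
        if item in entries:
            count, delta = entries[item]
            entries[item] = (count + 1, delta)
        else:
            # A new item may have been seen and pruned in earlier buckets.
            entries[item] = (1, bucket - 1)
        if n % width == 0:
            # At each bucket boundary, drop items that are too rare to matter.
            entries = {k: (c, d) for k, (c, d) in entries.items()
                       if c + d > bucket}
    return entries, n


def frequent_items(entries, n, min_support, epsilon):
    """Items whose stored count reaches (min_support - epsilon) * n.

    Every item with true frequency >= min_support * n is returned, and no
    item with true frequency below (min_support - epsilon) * n is returned.
    """
    threshold = (min_support - epsilon) * n
    return {item for item, (count, _) in entries.items() if count >= threshold}


if __name__ == "__main__":
    # Hypothetical finite sample standing in for an unbounded stream.
    sample = ["a", "b", "a", "c", "a", "b", "a", "d", "a", "b"] * 10
    entries, n = lossy_count(sample, epsilon=0.05)
    print(frequent_items(entries, n, min_support=0.2, epsilon=0.05))
```

The trade-off is visible in the two parameters: a smaller `epsilon` tightens the error bound but keeps more entries in memory, while `min_support` plays the same role as the minimum support condition for static data.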

Educational Data Mining

According to the book, educational data mining is emerging as a research area with a suite of computational and psychological methods and research approaches for understanding how students learn. New computer-supported interactive learning methods and tools, such as intelligent tutoring systems, simulations, and games, have opened up opportunities to collect and analyze student data, to discover patterns and trends in those data, and to make new discoveries and test hypotheses about how students learn. Data collected from online learning systems can be aggregated over large numbers of students and can contain many variables that data mining algorithms can explore for model building.

Demanding Data Mining

While demand for data mining of scholarly content is mounting, the lack of standardization of search technologies, interfaces, and licensing terms hinders its use.

According to the authors of the book, publishers of scientific, technical, and medical (STM) content generally handle mining requests from third parties liberally. However, they have concerns when the mining results could replace or compete with the original content, or when the mining burdens their systems. Many publishers have publicly available mining policies, and most handle mining requests case by case.

According to the book, mining of open-access journals is generally allowed as part of the standard terms and conditions. For content published in a more restrictive way, however, nearly all publishers require information about the intent and purpose of the mining request. In addition, for many publishing organizations ...