Data Mining

Read Complete Research Material



Data Mining



Data Mining

Question 1

Part A

Essentially our challenge is that we have a collection of attributes, some discrete, some continuous, and we want to predict a continuous variable (a number). A typical customer profitability model would use customer demographics, channel, product requested, volume etc to predict future profitability (Abraham, 2004). Our model will use geographic location, client agent, resource etc to predict response time. There are thousands of ways that you could use a very similar structure to predict a continuous number for a different real-world application. These applications include predicting delivery time, project time, project revenue, employee tenure, lease duration, residual value etc. Therefore, once you have gone through this exercise, it should be very easy for you to build your own customer profitability model or other similar model using your organization's real data.

Part B

The classical example of a data mining problem is "market basket analysis". Stores gather information on what items are purchased by their customers. The hope is, by finding out what products are frequently purchased jointly (i.e. are associated with each other), being able to optimize the marketing of the products (e.g. the layout of the store) by better targeting certain groups of customers. A famous example was the discovery that people who buy diapers also frequently buy beers (probably exhausted fathers of small children) (Abraham, 2004). Therefore nowadays one finds frequently beer close to diapers (and of course also chips close to beer) in supermarkets. Similarly, Amazon exploits this type of associations in order to propose to their customer's books that are likely to match their interests.

Part C

The rapid e-commerce growth has made both business community and customers face a new situation. Due to intense competition on the one hand and the customer's option to choose from several alternatives, the business community has realized the necessity of intelligent marketing strategies and relationship management. Web usage mining attempts to discover useful knowledge from the secondary data obtained from the interactions of the users with the Web. Web usage mining has become very critical for effective Web site management, creating adaptive Web sites, business and support services, personalization, and network traffic flow analysis and so on (Rathipriya & Thangavel, 2012).

Part D

Given a set of data points, each having a set of attributes, and a similarity measure among them, find clusters such that

data points in one cluster are more similar to one another

Data points in separate clusters are less similar to one another.

Similarity measures

Euclidean distance if attributes are continuous

Problem specific measures

Question 2)

Realizing the importance of data mining to the field of reliability and risk, Professor Krishna B. Misra, Editor-in-Chief of IJPE requested me to bring out a special issue on Data Mining as applied to reliability and risk. Invitations were sent out to several researchers active in this field and the result of the exercise is that we received only four papers which relate to Data mining principles for this issue. However, it is hoped that these papers will act as catalyst to generate further interest of the ...
Related Ads