Running Head: DATA MINING
Assignment 4: Data Mining
Data mining, also called Knowledge Discovery in Databases (KDD), is a powerful technology with great potential to help companies focus on the most important information in their databases. As the use of technology has grown, interest in data mining has increased rapidly. Data mining can be used to predict future behavior rather than only to describe past events. It does so by analyzing existing information that may be stored in a company's data warehouse or information warehouse. Companies are now utilizing data mining techniques to assess their database for ...
Then, based on these assumptions, the owner may keep a full stock of liquor in order to make sure both of them are sold at full price, and may apply various other schemes to improve the business. Good data mining results depend on two important factors: the size of the data, since the more data is processed, the more accurate the result will be; and query complexity, since the more numerous and complex the queries are, the more powerful the system needs to be (Data Mining, 2011).
Web mining to discover business intelligence from Web customers
Web mining helps analyze patterns in data through content mining, structure mining, and usage mining. Web mining cannot function on its own. The data gathered through web mining is further categorized by a process called clustering. The resulting output is then accessed by data mining and, further, by the users. In this process, users' web history records are extracted. The biggest advantage of this technology is in the field of e-commerce. With its help, companies can build better customer relationships by giving customers exactly what they need, and can find, attract, and retain customers (Data Mining, 2011).
Clustering to find related customer information
Clustering is the second phase of Web mining. In this process, data objects that are similar to each other in some way are grouped together. The aim of this analysis is to form quality clusters such that the inter-cluster similarity is low and the intra-cluster similarity is high (Data Mining, 2011).
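As a minimal sketch of this idea, the Python snippet below groups a handful of hypothetical customers with a simple k-means procedure and then compares the average distance within clusters to the average distance between clusters. The customer values, the two features (visits per month and average spend), the starting centroids, and all function names are illustrative assumptions and do not come from the cited source. Distance is used here as the inverse of similarity, so a low intra-cluster distance corresponds to high intra-cluster similarity.

```python
import math
from itertools import combinations

# Hypothetical customers: (visits per month, average spend). Illustrative values only.
customers = [(2, 10), (3, 12), (2, 11), (20, 95), (22, 100), (21, 98)]


def distance(a, b):
    """Euclidean distance between two customers."""
    return math.dist(a, b)


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def k_means(points, centroids, iterations=10):
    """Assign each point to its nearest centroid, then recompute the centroids."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean_point(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return clusters


def avg_intra_distance(clusters):
    """Average distance between pairs of points in the same cluster."""
    pairs = [distance(a, b) for c in clusters for a, b in combinations(c, 2)]
    return sum(pairs) / len(pairs)


def avg_inter_distance(clusters):
    """Average distance between pairs of points in different clusters."""
    pairs = [distance(a, b)
             for c1, c2 in combinations(clusters, 2)
             for a in c1 for b in c2]
    return sum(pairs) / len(pairs)


# Starting centroids are picked by hand for this sketch.
clusters = k_means(customers, centroids=[customers[0], customers[3]])
intra = avg_intra_distance(clusters)
inter = avg_inter_distance(clusters)
print("clusters:", clusters)
print("average intra-cluster distance:", round(intra, 2))
print("average inter-cluster distance:", round(inter, 2))
```

In this toy data the occasional low spenders and the frequent high spenders fall into separate groups, and the distance within each group is much smaller than the distance between groups, which is the quality criterion described above.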
Reliability of the data mining algorithms
Although data mining is a very effective and well-regarded tool, it falls short in terms of reliability. It can easily be misused, and it can also unintentionally give outputs that seem very useful but do not actually predict future behavior. A data mining model is reliable only if it generates the same type of estimates, or finds the same general kinds of patterns, regardless of the test data that is supplied. For example, a model generated for a store that used the wrong accounting method will not generalize well to other stores, and therefore cannot be relied upon. In my opinion, these algorithms can be trusted, and the errors they are likely to produce can be predicted, but only if they are handled carefully. We cannot rely on these algorithms completely (Claudio M. Rocco, 2011).
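One rough way to check this kind of reliability is to run the same model on several different test sets and see how much its estimates vary. The Python sketch below does that under loudly stated assumptions: the "model" is just an average of past monthly sales, the store figures are invented for illustration, and the 10% tolerance is an arbitrary threshold chosen for this example, not a value from the cited source.

```python
import statistics


def estimate_spread(model, test_sets):
    """Run the model on each test set and return the estimates plus their
    relative spread: (max - min) divided by the mean estimate."""
    estimates = [model(ts) for ts in test_sets]
    mean = statistics.mean(estimates)
    spread = (max(estimates) - min(estimates)) / abs(mean)
    return estimates, spread


def average_sales_model(monthly_sales):
    """Hypothetical model: estimate next month's sales as the past average."""
    return statistics.mean(monthly_sales)


# Invented monthly sales for three stores that record sales the same way.
consistent_stores = [[100, 110, 105], [98, 107, 103], [102, 109, 104]]

# Same stores plus one hypothetical store that used a different accounting method.
mixed_stores = consistent_stores + [[10, 11, 10.5]]

TOLERANCE = 0.10  # arbitrary threshold assumed for this sketch

_, spread_consistent = estimate_spread(average_sales_model, consistent_stores)
_, spread_mixed = estimate_spread(average_sales_model, mixed_stores)

print("consistent data reliable:", spread_consistent < TOLERANCE)
print("mixed data reliable:", spread_mixed < TOLERANCE)
```

With consistent records the estimates stay close together, while adding the store with the different accounting method makes them diverge sharply, which mirrors the example in the paragraph above.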
Privacy concerns raised by the collection of personal data for mining purposes
The three concerns raised by consumers are:
* The first concern is intrusion: obtaining information about people affects their privacy when the data is collected. When that information is obtained for different purposes, it can consequently harm them in many ways.
* Another concern is that some mining algorithms might use ethically controversial attributes such as sex, race, religion, or sexual orientation to classify individuals, which might raise legal issues under anti-discrimination legislation.
* This has...