subject: Some of the most important techniques for extracting data
Data mining is the process of extracting relationships from large amounts of data. It is an area of computer science that has received significant commercial interest. This article details some of the most common methods of data analysis and data mining.
Association rule discovery: association rule discovery methods are used to extract groups of items that tend to occur together in a dataset. Traditionally, the technique was developed on supermarket purchase data. An association rule is a rule of the form X -> Y. An example might be: "When a customer buys milk, this implies (->) that the customer also buys bread." An association rule has an associated support and confidence value. Support is the percentage of all records (or transactions, in this case) in which all items in the rule are included; for example, the proportion of all transactions in which both milk and bread were bought. Confidence is the percentage of transactions satisfying the left side of the rule that also satisfy the right side of the rule; in this case, confidence is the share of milk purchases that also included bread. Association rule discovery extracts all possible association rules from a dataset that meet a minimum support and confidence specified by the user.
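The support and confidence definitions above can be illustrated with a small sketch. The transactions, the function names (support, confidence, find_rules) and the thresholds are made up for illustration, and the rule search is deliberately limited to rules with one item on each side; practical rule miners also handle larger item sets and prune the search.

```python
from itertools import permutations

def support(transactions, items):
    """Fraction of all transactions that contain every item in `items`."""
    items = set(items)
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)

def confidence(transactions, lhs, rhs):
    """Of the transactions containing `lhs`, the fraction that also contain `rhs`."""
    lhs, rhs = set(lhs), set(rhs)
    lhs_count = sum(1 for t in transactions if lhs <= set(t))
    if lhs_count == 0:
        return 0.0
    both_count = sum(1 for t in transactions if (lhs | rhs) <= set(t))
    return both_count / lhs_count

def find_rules(transactions, min_support, min_confidence):
    """Return single-item rules X -> Y that meet both user-specified minimums."""
    all_items = sorted(set().union(*transactions))
    rules = []
    for x, y in permutations(all_items, 2):
        s = support(transactions, {x, y})
        c = confidence(transactions, {x}, {y})
        if s >= min_support and c >= min_confidence:
            rules.append((x, y, s, c))
    return rules

# Hypothetical supermarket baskets, one set of items per transaction.
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"milk"},
    {"bread"},
    {"eggs"},
]

milk_bread_support = support(transactions, {"milk", "bread"})
milk_bread_confidence = confidence(transactions, {"milk"}, {"bread"})
rules = find_rules(transactions, min_support=0.4, min_confidence=0.6)

for x, y, s, c in rules:
    print(f"{x} -> {y}: support={s:.2f}, confidence={c:.2f}")
```

In this toy data, milk and bread appear together in two of five transactions, and two of the three milk purchases also include bread, which is what the two measures report for the rule milk -> bread.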
Cluster analysis: cluster analysis is the process of taking one or more numeric fields and assigning their values to groupings. These groupings represent groups of points that are close together. For example, if you watch a documentary about space, you will see that galaxies contain many stars and planets. There are many galaxies in space, and the stars and planets are all clustered within them. That is, the stars and planets are not randomly located in space, but are grouped into clumps called galaxies. A cluster analysis method is used to find such groups. If a cluster analysis method were applied to the stars in space, it could find that each galaxy is a cluster and assign a unique cluster identifier to all the stars in a galaxy. This cluster identifier then becomes another field in the record and can be used in further data mining analysis. For example, you could use the cluster ID field in association rules together with other fields in the data.
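As a rough illustration of the idea, the sketch below groups two-dimensional points with k-means, which is only one of many clustering methods and is not named by the article. The star coordinates, the starting centroids and the cluster_id field name are all assumptions chosen for the example.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Very small k-means: assign each point to its nearest centroid, then
    move each centroid to the mean of its members, until nothing changes."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: recompute each centroid as the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster in place
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical star positions forming two well-separated "galaxies".
stars = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]

# Fixed starting centroids keep the example deterministic.
labels, centroids = kmeans(stars, centroids=[stars[0], stars[3]])

# The cluster identifier becomes another field on each record.
records = [
    {"x": x, "y": y, "cluster_id": label}
    for (x, y), label in zip(stars, labels)
]
for record in records:
    print(record)
```

The resulting cluster_id field can then be treated like any other field, for example as an item in association rules.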
Decision trees: decision trees are used to build a tree of decisions that helps predict a value for a record. For example, if you were looking at a dataset used to predict whether an applicant for a loan would be a potential credit risk, a decision tree would be built based on factors in the record. The tree may include decisions such as whether the applicant has defaulted on a loan before, the age of the applicant, whether the applicant is employed or not, and the total income of the applicant relative to the repayments on the loan. You can then follow the decision tree to say, for example, that if an applicant has never before defaulted on a loan, the applicant is employed, their income is in the top 15 percent for the country and the amount of credit is relatively low, then there is a very low risk of default.
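Following a decision tree for one record can be sketched as below. The tree here is written by hand rather than learned from data, and the field names, the 20 percent loan-to-income threshold and every risk label other than "very low risk" are hypothetical values chosen only to mirror the loan example.

```python
from dataclasses import dataclass
from typing import Callable, Union

@dataclass
class Node:
    """One decision in the tree; a plain string is used as a leaf (the prediction)."""
    question: str
    test: Callable[[dict], bool]
    if_yes: Union["Node", str]
    if_no: Union["Node", str]

def predict(tree, record):
    """Walk from the root, answering each question, until a leaf is reached."""
    node = tree
    while isinstance(node, Node):
        node = node.if_yes if node.test(record) else node.if_no
    return node

# Hand-built tree mirroring the loan example (all thresholds are assumptions).
loan_tree = Node(
    "Has the applicant defaulted before?",
    lambda r: r["defaulted_before"],
    "high risk",
    Node(
        "Is the applicant employed?",
        lambda r: r["employed"],
        Node(
            "Is income in the top 15 percent for the country?",
            lambda r: r["income_percentile"] >= 85,
            Node(
                "Is the loan small relative to income?",
                lambda r: r["loan_amount"] <= 0.2 * r["annual_income"],
                "very low risk",
                "low risk",
            ),
            "medium risk",
        ),
        "medium risk",
    ),
)

# Hypothetical applicant record.
applicant = {
    "defaulted_before": False,
    "employed": True,
    "income_percentile": 90,
    "loan_amount": 5000,
    "annual_income": 80000,
}

print(predict(loan_tree, applicant))
```

In practice the questions and thresholds would be derived from historical records by a tree-learning algorithm; the walk from root to leaf stays the same.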