Data Mining
What is Data Mining?
- Data mining is the process of extracting useful information from large databases.
- It is the extraction of hidden predictive information.
- Data mining is the practice of automatically searching large stores of data to discover patterns.
- Data mining is a powerful technology with great potential that helps organizations focus on the most important information in their data warehouses.
- It uses mathematical algorithms to segment the data and evaluate the probability of future events.
- Data mining is a powerful tool for retrieving useful information from available data warehouses.
- Data mining can be applied to relational databases, object-oriented databases, data warehouses, and structured or unstructured databases, among others.
- Data mining is often treated as a synonym for Knowledge Discovery in Databases (KDD), although strictly speaking it is one step of the KDD process.
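The idea of segmenting data and evaluating the probability of future events can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the records are made-up (monthly spend, churned) pairs, the segment thresholds are arbitrary, and the "probability" is simply the observed churn frequency within each segment. Real data mining tools use far more sophisticated algorithms.

```python
from collections import defaultdict

# Hypothetical records: (monthly_spend, churned). Made-up illustration data.
records = [
    (12, True), (15, True), (18, False), (22, True),
    (55, False), (60, False), (64, True), (70, False),
    (120, False), (130, False), (145, False), (150, False),
]


def segment(spend):
    """Assign a record to a segment using fixed, assumed thresholds."""
    if spend < 50:
        return "low"
    if spend < 100:
        return "medium"
    return "high"


def churn_probability_by_segment(rows):
    """Estimate P(churn | segment) as the observed churn frequency per segment."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [churned, total]
    for spend, churned in rows:
        seg = segment(spend)
        counts[seg][0] += int(churned)
        counts[seg][1] += 1
    return {seg: churned / total for seg, (churned, total) in counts.items()}


probabilities = churn_probability_by_segment(records)
for seg in ("low", "medium", "high"):
    print(f"{seg}: {probabilities[seg]:.2f}")
```

With this sample data the low-spend segment shows the highest churn frequency, which is the kind of pattern an organization might then act on.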
The steps of KDD, as shown in the diagram above, are:
1. Data cleaning removes noisy, inconsistent, and irrelevant data from the database.
2. Data integration: Heterogeneous data sources are merged into a single data source.
3. Data selection retrieves the data relevant to the analysis from the database.
4. Data transformation: The selected data is transformed into forms that are suitable for data mining.
5. Data mining: Various techniques are applied to extract patterns from the data.
6. Pattern evaluation assesses the extracted patterns to identify the ones that are genuinely interesting.
7. Knowledge representation: This is the final step of KDD, in which the discovered knowledge is presented to the user.
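The seven steps above can be sketched as a small pipeline. This is an illustrative sketch only, not a standard implementation. The two data sources, their field names, the pair-counting "mining" step, and the 0.4 support threshold are all hypothetical choices made for the example. For brevity, cleaning is applied only to the store source.

```python
from collections import Counter
from itertools import combinations

# Two hypothetical sources with different record layouts (made-up data).
store_sales = [
    {"basket": 1, "items": ["bread", "milk"]},
    {"basket": 2, "items": ["bread", "butter", "milk"]},
    {"basket": 3, "items": None},                 # incomplete record
    {"basket": 4, "items": ["Bread", "butter"]},  # inconsistent capitalization
]
web_sales = [
    {"order_id": 5, "products": "milk;bread"},
    {"order_id": 6, "products": "butter;jam"},
]


def clean(rows):
    """1. Data cleaning: drop incomplete records, normalize inconsistent values."""
    return [
        {**r, "items": [i.lower() for i in r["items"]]}
        for r in rows if r["items"]
    ]


def integrate(store, web):
    """2. Data integration: merge both sources into one common schema."""
    merged = [{"id": r["basket"], "items": r["items"]} for r in store]
    merged += [{"id": r["order_id"], "items": r["products"].split(";")} for r in web]
    return merged


def select(rows):
    """3. Data selection: keep only the attribute relevant to the analysis."""
    return [r["items"] for r in rows]


def transform(baskets):
    """4. Data transformation: turn each basket into a sorted tuple of unique items."""
    return [tuple(sorted(set(b))) for b in baskets]


def mine(baskets):
    """5. Data mining: count how often each pair of items occurs together."""
    pairs = Counter()
    for b in baskets:
        pairs.update(combinations(b, 2))
    return pairs


def evaluate(pairs, n_baskets, min_support=0.4):
    """6. Pattern evaluation: keep pairs whose support meets the threshold."""
    return {p: c / n_baskets for p, c in pairs.items() if c / n_baskets >= min_support}


def present(patterns):
    """7. Knowledge representation: render the patterns as readable text."""
    return [f"{a} + {b}: support {s:.2f}" for (a, b), s in sorted(patterns.items())]


baskets = transform(select(integrate(clean(store_sales), web_sales)))
patterns = evaluate(mine(baskets), len(baskets))
report = present(patterns)
print("\n".join(report))
```

Each function corresponds to one numbered step, so the chained call on the `baskets` line reads in the same order as the list. In practice the steps are often iterative, and later steps may send the analyst back to earlier ones.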