What is Data Mining?
Data mining is the process of discovering useful correlations hidden among vast amounts of data, extracting actionable information for future use, and utilizing it for decision-making.
Doosan Encyclopedia
The term “Data Mining” does not accurately describe everything the process involves.
For this reason, some people also use the term “Knowledge Discovery from Data (KDD)”. KDD encompasses the entire process of discovering useful knowledge from data, including data preprocessing, data mining, and post-processing of the discovered patterns.
Although KDD conveys the meaning more clearly, we will continue to use the term “Data Mining” here, since it is already widely used and has a well-established meaning.
The Process of Data Mining
Data mining is an iterative process made up of the following steps, carried out in sequence:
- Data Cleaning: Removing noise and inconsistent data
- Data Integration: Combining data from various sources
- Data Selection: Retrieving relevant data for analysis tasks from the database
- Data Transformation: Transforming and consolidating data into a form suitable for mining, for example by summarizing or aggregating it
- Data Mining: Applying intelligent analysis methods to extract data patterns
- Pattern Evaluation: Identifying the patterns that actually represent knowledge, based on measures of how interesting they are
- Knowledge Presentation: Utilizing visualization and knowledge representation techniques to present mined knowledge to users
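
To make the steps above more concrete, here is a minimal sketch in Python that walks a tiny, made-up set of purchase records through each stage. Everything in it is an assumption for illustration only: the data (`STORE_A`, `STORE_B`), the function names, the choice of counting item pairs as the "mining" method, and the use of support (the fraction of transactions containing a pair) as the interestingness measure. Real projects would typically rely on dedicated libraries and far larger data.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: (transaction id, item) rows from two sources.
STORE_A = [(1, "bread"), (1, "milk"), (2, "bread"), (2, " Milk "),
           (3, "bread"), (3, None)]
STORE_B = [(4, "milk"), (4, "eggs"), (4, "bag"),
           (5, "bread"), (5, "milk"), (5, "eggs")]


def clean(rows):
    """Data cleaning: drop rows with a missing item, normalize item names."""
    return [(tid, item.strip().lower()) for tid, item in rows
            if item and item.strip()]


def integrate(*sources):
    """Data integration: combine rows from several sources."""
    return [row for source in sources for row in source]


def select(rows, items_of_interest):
    """Data selection: keep only rows relevant to the analysis task."""
    return [(tid, item) for tid, item in rows if item in items_of_interest]


def transform(rows):
    """Data transformation: aggregate rows into one item set per transaction."""
    baskets = defaultdict(set)
    for tid, item in rows:
        baskets[tid].add(item)
    return list(baskets.values())


def mine(baskets):
    """Data mining: count how often each pair of items is bought together."""
    counts = defaultdict(int)
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return dict(counts)


def evaluate(pair_counts, n_baskets, min_support):
    """Pattern evaluation: keep pairs whose support meets the threshold."""
    supports = {pair: count / n_baskets for pair, count in pair_counts.items()}
    return {pair: s for pair, s in supports.items() if s >= min_support}


def present(patterns):
    """Knowledge presentation: format patterns as readable lines."""
    ranked = sorted(patterns.items(), key=lambda kv: (-kv[1], kv[0]))
    return [f"{a} + {b}: support {s:.2f}" for (a, b), s in ranked]


def run_pipeline(min_support=0.4):
    """Run all seven steps in order on the toy data."""
    rows = integrate(clean(STORE_A), clean(STORE_B))
    rows = select(rows, {"bread", "milk", "eggs"})
    baskets = transform(rows)
    patterns = evaluate(mine(baskets), len(baskets), min_support)
    return present(patterns)


if __name__ == "__main__":
    for line in run_pipeline():
        print(line)
```

With this toy data, the pair bread and milk appears in 3 of 5 transactions and eggs and milk in 2 of 5, so both pass the 0.4 threshold. In practice the steps are rarely run just once: if the evaluation step yields nothing useful, one would go back and adjust the cleaning, selection, or threshold and run the sequence again.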