Why Data Mining?
1. Moving toward the Information Age
The explosively growing, widely available, and gigantic body of data makes our time truly the data age.
Powerful and versatile tools are badly needed to automatically uncover valuable information from the tremendous amounts of data and to transform such data into organized knowledge.
This necessity has led to the birth of data mining.
For example, Google Flu Trends uses specific search terms as indicators of flu activity.
It found a close relationship between the number of people who search for flu-related information and the number of people who actually have flu symptoms.
Using aggregated Google search data, Flu Trends can estimate flu activity up to two weeks faster than traditional systems can.
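The relationship described above can be illustrated with a simple correlation check. The following is a minimal sketch, not the method Flu Trends actually used: the weekly counts are invented purely for illustration, and the names `pearson`, `weekly_flu_searches`, and `weekly_reported_cases` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Co-variation of the two series around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical weekly counts, invented for illustration only.
weekly_flu_searches = [120, 150, 310, 480, 450, 260, 140]
weekly_reported_cases = [10, 14, 29, 47, 44, 25, 12]

r = pearson(weekly_flu_searches, weekly_reported_cases)
print(f"correlation between searches and cases: {r:.3f}")
```

A coefficient close to 1 would indicate that search volume rises and falls with reported cases, which is the kind of signal that makes search data usable as an early indicator. Correlation alone does not show that one series predicts the other ahead of time.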
2. Evolution of Information Technology
Since the 1970s, research and development in database systems has progressed from early hierarchical and network database systems to relational database systems.
One emerging data repository architecture is the data warehouse.
This is a repository of multiple heterogeneous data sources organized under a unified schema [plan] at a single site to facilitate management decision making.
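The idea of organizing heterogeneous sources under a unified schema can be sketched in a few lines. This is only an illustrative toy, assuming two made-up sources with different field names and units; the `Sale` schema, the source names, and the mapping functions are all hypothetical and do not represent any particular warehouse product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sale:
    """Hypothetical unified schema that every source is mapped onto."""
    store: str
    product: str
    amount_usd: float


# Two hypothetical sources that describe the same kind of fact differently.
web_orders = [{"shop": "Boston", "item": "laptop", "total_cents": 129900}]
store_sales = [("NYC", "phone", 799.0)]


def from_web_order(row):
    # Rename fields and convert cents to dollars.
    return Sale(row["shop"], row["item"], row["total_cents"] / 100)


def from_store_sale(row):
    # Unpack the positional record into named fields.
    store, product, amount = row
    return Sale(store, product, float(amount))


# The "warehouse" here is just one list holding records in the unified schema.
warehouse = [from_web_order(r) for r in web_orders]
warehouse += [from_store_sale(r) for r in store_sales]

total = sum(sale.amount_usd for sale in warehouse)
print(f"{len(warehouse)} sales, total {total:.2f} USD")
```

Once every record shares one schema, a single query (here, the sum over `amount_usd`) can span all sources, which is what makes the warehouse useful for management decision making.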
In summary, the abundance of data, coupled with the need for powerful data analysis tools, has been described as a data-rich but information-poor situation.
The fast-growing amount of data, collected and stored in large data repositories, has far exceeded our human ability for comprehension without powerful tools.
As a result, data collected in large data repositories become “data tombs”—data archives that are seldom visited.
Consequently, important decisions are often made based not on the information-rich data stored in data repositories but rather on a decision maker’s intuition, simply because the decision maker does not have the tools to extract the valuable knowledge embedded in the vast amounts of data.
Efforts have been made to develop expert system and knowledge-based technologies, which typically rely on users or domain experts to manually input knowledge into knowledge bases.
Unfortunately, this manual knowledge input procedure is prone to biases and errors and is extremely costly and time-consuming.
The widening gap between data and information calls for the systematic development of data mining tools that can turn data tombs into “golden nuggets” of knowledge.