CS1004 Data Warehousing & Mining B.E Question Bank : niceindia.com
Name of the College : Noorul Islam College of Engineering
University : Anna University
Degree : B.E
Department : Information Technology
Subject Code/Name : CS 1004 – Data Warehousing & Data Mining
Document Type : Question Bank
Website : niceindia.com
Download Model/Sample Question Paper :https://www.pdfquestion.in/uploads/niceindia.com/3044-CS_1004_-_DATA_WAREHOUSING_AND_DATA_MINING.pdf
NICE Data Warehousing & Mining Question Paper
1. What are the uses of statistics in data mining? :
Statistics is used to
** estimate the complexity of a data mining problem;
** suggest which data mining techniques are most likely to be successful; and
** identify data fields that contain the most “surface information”.
2. What is the main goal of statistics? :
The basic goal of statistics is to extend knowledge about a subset of a collection to the entire collection.
3. What are the factors to be considered while selecting the sample in statistics? :
The sample should be
** large enough to be representative of the population.
** small enough to be manageable.
** accessible to the sampler.
** free of bias.
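The sampling criteria above, together with the goal stated in question 2, can be illustrated with a minimal Python sketch. The population here is hypothetical (10,000 generated values, not data from this question bank), and a simple random sample is only one of several possible sampling schemes.

```python
import random
import statistics

# Hypothetical population of 10,000 numeric records (illustrative only).
rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = [rng.gauss(50, 10) for _ in range(10_000)]

def draw_sample(data, size, rng):
    """Simple random sample without replacement: every record is equally
    likely to be chosen, which avoids selection bias."""
    return rng.sample(data, size)

# 500 records: small enough to be manageable, large enough to be representative.
sample = draw_sample(population, 500, rng)

# Knowledge about the subset (its mean) is extended to the entire collection.
population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)
print(f"population mean: {population_mean:.2f}, sample mean: {sample_mean:.2f}")
```

The two printed means should be close; the gap shrinks as the sample size grows.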
4. Name some advanced database systems.
Object-oriented databases, object-relational databases.
5. Name some specific application oriented databases.
Spatial databases,
Time-series databases,
Text databases and multimedia databases.
6. Define Relational databases.
A relational database is a collection of tables, each of which is assigned a unique name. Each table consists of a set of attributes (columns or fields) and usually stores a large set of tuples (records or rows). Each tuple in a relational table represents an object identified by a unique key and described by a set of attribute values.
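As a small illustration of this definition, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table name `customer`, its columns, and the rows are assumptions made up for the example.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# A uniquely named table with three attributes (columns).
# cust_id is the unique key that identifies each tuple.
conn.execute(
    "CREATE TABLE customer ("
    "cust_id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT)"
)

# Each inserted row is one tuple (record) described by its attribute values.
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Asha", "Chennai"), (2, "Ravi", "Madurai")],
)

rows = conn.execute(
    "SELECT cust_id, name, city FROM customer ORDER BY cust_id"
).fetchall()
print(rows)
conn.close()
```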
7. Define Transactional Databases.
A transactional database consists of a file where each record represents a transaction. A transaction typically includes a unique transaction identity number (trans_ID) and a list of the items making up the transaction.
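One way to picture this structure is a mapping from trans_ID to the list of items in that transaction. The sketch below is an assumption for illustration only: the IDs and items are invented, and a real transactional database would live in a file rather than a Python dictionary.

```python
from collections import Counter

# Each entry is one transaction: trans_ID -> list of items (hypothetical data).
transactions = {
    "T100": ["bread", "milk"],
    "T200": ["bread", "butter", "eggs"],
    "T300": ["milk", "eggs"],
}

# Count how many transactions contain each item, a typical first step
# when mining such data.
item_counts = Counter(
    item for items in transactions.values() for item in items
)
print(item_counts.most_common())
```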
8. Define Spatial Databases.
Spatial databases contain spatial-related information. Such databases include geographic (map) databases, VLSI chip design databases, and medical and satellite image databases. Spatial data may be represented in raster format, consisting of n-dimensional bit maps or pixel maps.
9. What is a Temporal database? :
A temporal database stores time-related data. It usually stores relational data that include time-related attributes. These attributes may involve several time stamps, each having different semantics.
10. What is a Time-Series database? :
A time-series database stores sequences of values that change with time, such as data collected regarding the stock exchange.
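A time series can be sketched as a list of (date, value) pairs kept in time order. The prices below are hypothetical and stand in for stock exchange data; the day-to-day change is just one simple quantity that can be derived from such a sequence.

```python
from datetime import date

# Hypothetical closing prices, deliberately given out of order.
prices = [
    (date(2024, 1, 3), 101.5),
    (date(2024, 1, 1), 100.0),
    (date(2024, 1, 2), 102.0),
]

# A time series is ordered by time, so sort on the date first.
series = sorted(prices)

# Change from each day to the next.
changes = [
    (d2, round(p2 - p1, 2))
    for (d1, p1), (d2, p2) in zip(series, series[1:])
]
print(changes)
```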
11. What is a Legacy database? :
A legacy database is a group of heterogeneous databases that combines different kinds of data systems, such as relational or object-oriented databases, hierarchical databases, network databases, spreadsheets, multimedia databases, or file systems.
12. What is learning? :
Learning denotes changes in the system that enable the system to do the same task more efficiently the next time.
Learning is making useful changes or modifying what is being experienced.
13. Why is machine learning done? :
To understand and improve the efficiency of human learning.
To discover new things or structures that are unknown to human beings.
To fill in skeletal or incomplete specifications about a domain.
14. Give the components of a learning system.
1 Critic
2 Sensors
3 Learning Element
4 Performance Element
5 Effectors
6 Problem generators.
15. Give some of the factors for evaluating performance of a learning algorithm.
1 Predictive accuracy of a classifier
2 Speed of a learner
3 Speed of a classifier
4 Space requirements
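The first three factors can be measured directly, as in the sketch below. The learner is a deliberately trivial majority-class model and the labels are invented, so this is only an assumption-laden illustration of how accuracy and timing might be measured, not a realistic benchmark.

```python
import time
from collections import Counter

def train_majority(labels):
    """Toy learner: remembers the most frequent training label."""
    return Counter(labels).most_common(1)[0][0]

def classify(model, records):
    """Toy classifier: predicts the remembered label for every record."""
    return [model for _ in records]

def accuracy(predicted, actual):
    """Predictive accuracy: fraction of predictions that match the truth."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical training and test data.
train_labels = ["yes", "yes", "no", "yes"]
test_records = [1, 2, 3, 4, 5]
test_labels = ["yes", "no", "yes", "yes", "no"]

start = time.perf_counter()
model = train_majority(train_labels)          # speed of the learner
learn_seconds = time.perf_counter() - start

start = time.perf_counter()
predictions = classify(model, test_records)   # speed of the classifier
classify_seconds = time.perf_counter() - start

acc = accuracy(predictions, test_labels)
print(f"accuracy={acc:.2f} learn={learn_seconds:.6f}s classify={classify_seconds:.6f}s")
```

Space requirements are not measured here; they would need a separate memory profile of the model and the training process.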
16. What are the steps in the data mining process? :
a. Data cleaning
b. Data integration
c. Data selection
d. Data transformation
e. Data mining
f. Pattern evaluation
g. Knowledge representation
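The seven steps can be sketched as a chain of small functions. Everything below is hypothetical: the two data sources, the function names, and the frequency-count "mining" step are assumptions chosen to keep the example short, and real systems use far richer techniques at each stage.

```python
from collections import Counter

# Two hypothetical sources of (customer, item) purchase records.
store_a = [("c1", "milk"), ("c2", "Bread "), ("c1", None)]
store_b = [("c3", "bread"), ("c1", "milk"), ("c2", "MILK"), ("c3", "eggs")]

def clean(records):
    """a. Data cleaning: drop records whose item is missing."""
    return [(cust, item) for cust, item in records if item is not None]

def integrate(*sources):
    """b. Data integration: combine several sources into one list."""
    return [record for source in sources for record in source]

def select(records):
    """c. Data selection: keep only the attribute relevant to the task."""
    return [item for _, item in records]

def transform(items):
    """d. Data transformation: normalise spelling and case."""
    return [item.strip().lower() for item in items]

def mine(items):
    """e. Data mining: here, simply count item frequencies."""
    return Counter(items)

def evaluate(patterns, min_count=2):
    """f. Pattern evaluation: keep patterns that meet a threshold."""
    return {item: n for item, n in patterns.items() if n >= min_count}

def present(patterns):
    """g. Knowledge representation: format the result for the user."""
    return [f"{item}: bought {n} times" for item, n in sorted(patterns.items())]

records = integrate(clean(store_a), clean(store_b))
report = present(evaluate(mine(transform(select(records)))))
print(report)
```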
17. Define data cleaning
Data cleaning means removing inconsistent data or noise and collecting the necessary information.
18. Define data mining
Data mining is the process of extracting, or mining, knowledge from large amounts of data.
20. Define pattern evaluation
Pattern evaluation is used to identify the truly interesting patterns representing knowledge, based on some interestingness measures.
21. Define knowledge representation
Knowledge representation techniques are used to present the mined knowledge to the user.