07A70503 Data Warehousing & Data Mining B.Tech Question Paper : scce.ac.in
Name of the College : SREE CHAITANYA COLLEGE OF ENGINEERING
University : JNTUH
Department : Computer Science And Engineering
Subject Code/Name : 07A70503/DATA WAREHOUSING AND DATA MINING
Year/Sem : IV/I
Website : scce.ac.in
Document Type : Model Question Paper
Download Model/Sample Question Paper : https://www.pdfquestion.in/uploads/scce.ac.in/4954-07A70503-DATAWAREHOUSINGANDDATAMINING.pdf
SCCE Data Warehousing Question Paper
Code No: 07A70503
R07 Set No. 2
IV B.Tech I Semester Examinations, December 2011
Computer Science And Engineering
Time: 3 hours
Max Marks: 80
Answer any FIVE Questions
All Questions carry equal marks
1. (a) Discuss about mining frequent item sets without candidate generation.
(b) Explain about multidimensional Association rules in detail. [8+8]
2. Write the syntax for the following data mining primitives:
(a) The kind of knowledge to be mined.
(b) Measures of pattern interestingness. [16]
3. (a) Explain data mining as a step in the process of knowledge discovery.
(b) Differentiate operational database systems and data warehousing. [8+8]
4. (a) Attribute-oriented induction generates one or a set of generalized descriptions. How can these descriptions be visualized?
(b) Discuss about the methods of attribute relevance analysis. [8+8]
5. (a) Can any ideas from association rule mining be applied to classification? Explain.
(b) Explain training Bayesian belief networks.
(c) How does tree pruning work? What are some enhancements to basic decision tree induction? [6+5+5]
6. (a) What is a spatial data warehouse? What are the different types of dimensions in a spatial data cube? What are the different types of measures in a spatial data cube?
(b) What is keyword-based association analysis? How can automated document classification be performed?
(c) Briefly discuss about mining the World Wide Web. [2+2+2+2+2+6]
7. The following table contains the attributes name, gender, trait-1, trait-2, trait-3, and trait-4, where name is an object-id, gender is a symmetric attribute, and the remaining trait attributes are asymmetric, describing personal traits of individuals who desire a penpal. Suppose that a service exists that attempts to find pairs of compatible penpals.
Name      gender  trait-1  trait-2  trait-3  trait-4
Kevan     M       N        P        P        N
Caroline  F       N        P        P        N
Erik      M       P        N        N        P
...       ...     ...      ...      ...      ...
For asymmetric attribute values, let the value P be set to 1 and the value N be set to 0. Suppose that the distance between objects (potential penpals) is computed based only on the asymmetric variables.
(a) Show the contingency matrix for each pair given Kevan, Caroline, and Erik.
(b) Compute the simple matching coefficient for each pair.
(c) Compute the Jaccard coefficient for each pair.
(d) Who do you suggest would make the best pair of penpals? Which pair of individuals would be the least compatible? [4+4+4+4]
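For checking answers to Question 7, the following is a minimal Python sketch, not a model answer. It assumes the common textbook definitions: for a pair (i, j), q counts traits where both are 1, r where only i is 1, s where only j is 1, and t where both are 0; the simple matching coefficient is taken as the similarity (q + t)/(q + r + s + t) and the Jaccard coefficient as q/(q + r + s). The function and variable names are illustrative only.

```python
from itertools import combinations

# Trait values from the table in Question 7 (P = 1, N = 0).
traits = {
    "Kevan": "NPPN",
    "Caroline": "NPPN",
    "Erik": "PNNP",
}


def contingency(a, b):
    """Return (q, r, s, t) for two trait strings made of 'P' and 'N'."""
    x = [c == "P" for c in a]
    y = [c == "P" for c in b]
    q = sum(i and j for i, j in zip(x, y))          # both 1
    r = sum(i and not j for i, j in zip(x, y))      # first 1, second 0
    s = sum(not i and j for i, j in zip(x, y))      # first 0, second 1
    t = sum(not i and not j for i, j in zip(x, y))  # both 0
    return q, r, s, t


def simple_matching(q, r, s, t):
    """Similarity form: fraction of traits on which the pair agrees."""
    return (q + t) / (q + r + s + t)


def jaccard(q, r, s, t):
    """Ignores 0-0 matches; returns 0.0 if no trait is 1 (an assumed convention)."""
    denom = q + r + s
    return q / denom if denom else 0.0


results = {}
for a, b in combinations(traits, 2):
    cells = contingency(traits[a], traits[b])
    results[(a, b)] = (cells, simple_matching(*cells), jaccard(*cells))
    print(a, b, results[(a, b)])
```

Under these definitions a higher coefficient means a more compatible pair. If the distance form is wanted instead, subtract each coefficient from 1.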
8. (a) Briefly discuss the data smoothing techniques.
(b) Suppose that the data for analysis include the attribute age. The age values for the data tuples are (in increasing order):
13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70.
i. Use smoothing by bin means to smooth the above data, using a bin depth of 3. Illustrate your steps. Comment on the effect of the technique for the given data.
ii. How might you determine outliers in the data?
iii. What other methods are there for data smoothing? [16]
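The arithmetic in part (b)(i) can be checked with a short Python sketch. It assumes equal-depth (equal-frequency) partitioning: the sorted values are split into consecutive bins of 3, and every value in a bin is replaced by that bin's mean. The function name smooth_by_bin_means is illustrative only.

```python
# Age values from Question 8(b), already in increasing order.
ages = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30,
        33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]


def smooth_by_bin_means(values, depth):
    """Equal-depth binning: replace each value by the mean of its bin."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), depth):
        bin_values = ordered[start:start + depth]
        mean = sum(bin_values) / len(bin_values)
        smoothed.extend([mean] * len(bin_values))
    return smoothed


smoothed = smooth_by_bin_means(ages, 3)

# Print each bin next to its mean so the steps can be followed by hand.
for start in range(0, len(ages), 3):
    print(ages[start:start + 3], "->", round(smoothed[start], 2))
```

With 27 values and a depth of 3 this produces nine bins. The last bin (46, 52, 70) is the one most affected by the large value 70, which is relevant to the comment asked for in part (i) and to the outlier discussion in part (ii).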
IV B.Tech I Semester Examinations, December 2011
Data Warehousing And Data Mining
1. Briefly discuss the Discretization and concept hierarchy techniques. [16]
2. (a) What is Cluster Analysis? What are some typical applications of clustering? What are some typical requirements of clustering in data mining?
(b) Discuss about model-based clustering methods. [2+2+5+7]
3. (a) Explain the design and construction process of data warehouses.
(b) Explain the architecture of a typical data mining system. [8+8]
4. Write the FP-growth algorithm for discovering frequent item sets without candidate generation. Explain an example. [16]
5. (a) How scalable is decision tree induction? Explain.
(b) Explain about prediction. [8+8]
6. (a) How can object identifiers be generalized, if their role is to uniquely identify objects? Can inherited properties of objects be generalized?
(b) What kinds of association can be mined in multimedia data? Explain.
(c) Describe similarity search in time-series analysis. [4+6+6]
7. (a) List and describe any four primitives for specifying a data mining task.
(b) Write about semi-tight coupling and loose coupling. Differentiate them. [8+8]
8. Write short notes on the following in detail:
(a) Attribute-oriented induction.
(b) Efficient implementation of Attribute-oriented induction. [8+8]