Dataware Q&a Bank
Padappai
Department of Information Technology
Sub Code & Name : CS1004 – Datawarehousing and Mining
UNIT 1
2 mark
1.Define warehousing?
2.Distinguish between data warehouse and data mart?
3.List out the components of data warehouse?
4.Define data cube? Give an example?
5.What is fact table and dimension table?
6.Compare OLAP and OLTP?
7.What are meta data?
8.What is the need for OLAP?
9.Define star schema, snowflake schema and fact constellation?
10.What is starnet query model?
11.Write down the applications of data warehousing.
12. When is data mart appropriate?
13. What is concept hierarchy? Give an example.
14.What are the uses of statistics in data mining?
15.Name some advanced database systems?
16.Name some specific application oriented databases?
17.Define Relational Database?
18.Define Transactional Database?
19.Define Spatial database?
20.What is Temporal Database?
21.What is Time series database?
22.What is legacy database?
23.What is learning?
24.Why is machine learning done?
25.Give the Components of a learning system?
26.Give some factors for evaluating performance of learning system?
27.What are the steps in data mining process?
28.Define datamart?
29.List the merits of data modeling tool?
30.What is data warehouse performance issue?
31.What are the types of performance issue?
32.Why do you need data ware house life cycle process?
33.Merits of data ware house?
34.What are the steps in data ware house life cycle process?
35.What are the Characteristics of data ware house ?
36.List some of the data ware house tools?
37.What is end user data access tool?
38.Define MOLAP?
39.Define HOLAP?
40.Define ROLAP?
41.What is ad hoc query tool?
42.List few of the data mining applications?
43.Define Supervised learning scheme?
44.Define Unsupervised learning scheme?
45.What is the necessity of data mining?
46.Draw the flow chart of database Evolution?
47.Define OLTP?
48.Expand OLAP?
49. Expand OLTP?
50.What is data stream?
51.What is a data tomb?
52.What is data archaeology?
53.What is data dredging?
54.What is KDD?
55.What is data warehouse server?
56.Point out few Advanced databases?
57.What is an attribute?
58.Define Tuple?
59.What is SQL?
60.Define heterogeneous database?
61.Define WWW?
62.List the applications of WWW?
63.What is web log mining?
64.Define Outlier Analysis?
65.Define Evolution Analysis?
66.Define Technical meta data?
67. Define Business meta data?
68.What is a Distributive measure?
69. What is an Algebraic measure?
70. What is a Holistic measure?
71.What is Roll up data?
72.Define Drill down operation?
73.Define slice?
74. Define Dice?
75.What is pivot?
76. Define Drill within operation?
77. Define Drill across operation?
78. Define Drill through operation?
79.What is top down view?
80. What is Data source view?
81. What is Data ware house view?
82. What is Business query view?
83.Define Data Cube?
84.What is a cube operator?
85.What is No Materialization?
86. What is Full Materialization?
87. What is Partial Materialization?
88.What are the merits of bitmap indexing?
89.What are the merits of join indexing?
90.Define MDDB?
91.What is Query driven approach?
92.What is Update driven Approach?
93.What is a Dimension?
94.Define Fact?
95.Define Cuboid?
96.Define Aggregation?
97.What is a statistical database?
98.What is CRM?
99.What is the use of Load ?
100.Define Refresh?
4 mark
1.What is the difference between view and materialized view?
2. Explain the Difference between star and snowflake schema?
3. Mention the various tasks to be accomplished as part of data pre-processing.
4. Mention the advantages of Hierarchical clustering?
5.What is the difference between view and materialized view?
6. Explain the Difference between star and snowflake schema?
7. What is Data Warehouse Metadata?
8. What is Dimensionality Reduction?
9. What is Concept Description?
10.What is the difference between Supervised and Unsupervised learning schemes?
11.Discuss Join Indexing?
12. Discuss Bitmap Indexing?
13. Explain the steps in knowledge discovery?
14.Give short notes on database Evolution?
15. Give short notes on dataware house?
16.Explain data stream?
17.Describe KDD?
18.Explain Relational database?
19.Give the importance of the ER model?
20.What are the functions of relational database?
21.How can a customer be analyzed by a data mining system?
22.Differentiate data ware house and data mart?
23. Differentiate data ware house vs Heterogeneous DBMS?
24. Differentiate data ware house vs Operational DBMS?
25.Discuss Star schema?
26.Discuss Snowflake Schema?
27.Discuss Fact Constellation?
28.Discuss a Starnet query model?
29.Describe top down view?
30. Describe Data source view?
31. Describe Data ware house view?
32. Describe Business query view?
33.Explain Enterprise ware house?
34.Explain Data mart?
36.Describe Virtual ware house?
37.Explain ROLAP?
38. Explain MOLAP?
39. Explain HOLAP?
40. Explain Specialized SQL server?
41.Discuss efficient processing of OLAP queries?
42.Compare OLAM and OLAP?
43.Explain the Curse of Dimensionality?
44.Define Full and Ice berg cube?
45. Define closed and shell cube?
46.Explain Knowledge mining from data?
47.Explain Knowledge Extraction?
48.Describe data analysis?
49.Explain Data archaeology?
50.Discuss data dredging?
51.Describe Database?
52.Explain Information Repository?
53.Explain Knowledge base?
54.Discuss Data mining Engine?
55.Describe Pattern Evaluation Module?
56.Explain User Interface?
57.Discuss data discrimination?
58.Explain Mining different kinds of knowledge in DB?
59.Discuss Interactive mining of Knowledge at multiple levels of abstraction?
60.Describe Incorporation of background knowledge?
61. Discuss data mining Query language?
62.Explain ad hoc Data mining?
63. Discuss Presentation and visualization of data mining results?
64. Describe Handling Incomplete data?
65. Explain Scalability of data mining algorithms?
66. Describe Efficiency of data mining?
67.Explain Parallel mining algorithm?
68. Explain Distributed mining algorithm?
69. Explain Incremental mining algorithm?
70.Discuss Spreadsheets?
71.Describe Dimension table?
72.Explain Fact table?
73.Give the cube definition statement?
74.Interpret the measures for data cube?
75.Discuss Schema Hierarchy?
8 mark
1.What is over fitting and what can you do to prevent it?
2. In classification trees, what are surrogate splits, and how are they used?
3. What is the objective function of the K-Means algorithm?
4.What are the differences between the three main types of data usage: information processing,
analytical processing and data mining?
5.Discuss the motivation behind OLAP mining.
6. Discuss the various types of metadata?
7. Categorize OLAP tools?
8.Explain various data mining issues?
9.Describe Indexing technique of OLAP with example?
10. Give short notes on database Evolution with a neat flow chart?
11.Give the importance of data mining?
12.Give the functions of OLAP?
13. What are the major components of data mining?
14.Give the importance of pattern evaluation model?
15.How is decision making performed in a data warehouse?
16.Is a data warehouse suited for OLAP? Explain in brief.
17.Explain object relational database in detail?
18.What is raster format? Explain its use with an example?
19.Describe the role of DBMS in data mining?
20.Explain Information Delivery System?
21.Discuss Access Tools?
22.Explain Conceptual modeling of data ware house?
23.Explain Three categories of measures?
24.Explain Business Analysis Framework?
25.Discuss 4 views regarding the design of a data ware house?
25.Describe data ware house design process?
26.Explain 3 data ware house Models?
27.Describe Efficient Computation of Data cube?
28.Discuss data ware house Back end tools and utilities?
29.Explain Data ware house applications?
30.Describe the architecture of OLAM?
31.Discuss in detail about the Lattice of Cuboid?
32.How many cuboids are there in n dimensional data cube?
33. Describe Full cube?
34. Describe closed cube?
35. Describe Ice berg cube?
36. Describe shell cube?
37.Explain Significance Constraint?
38. Explain Probe Constraint?
39. Explain Gradient Constraint?
40.Discuss on Information System?
41.Describe Temporal Database?
42. Describe Time series Database?
43. Describe Sequence Database?
44. Describe Spatial Database?
45. Describe SpatioTemporal Database?
46. Describe Text Database?
47. Describe Multimedia Database?
48.Give short notes on Heterogeneous DBMS?
49. Give short notes on Legacy Database?
50.Explain Classification of Data mining Systems?
16 mark
1.Enumerate the building blocks of a data warehouse. Explain the
importance of metadata in a data warehouse environment. What are the
challenges in metadata management?
2. Distinguish between the entity-relationship modeling technique
and dimensional modeling. Why is the entity-relational modeling
technique not suitable for the data warehouse?
3. Create a star schema diagram that will enable FIT-WORLD GYM
INC. to analyze their revenue. The fact table will include – for every
instance of revenue taken – attribute(s) useful for analyzing
revenue. The star schema will include all dimensions that can be
useful for analyzing revenue. Formulate query: “Find the
percentage of revenue generated by members in the last year”.
How many cuboids are there in the complete data cube?
4.Briefly compare the following concepts. Explain your points with an example
(i) Snowflake schema, fact constellation, star net query model
5. Discuss the typical OLAP operations with an example.
6.Discuss how computations can be performed efficiently on data cubes.
(ii) Write short notes on data warehouse meta data.
7. Describe the multidimensional data model. How is it used in data warehousing?
8. Explain the architecture of data warehouse with a neat sketch?
9. Explain the operations performed on data warehouse with examples?
10. Distinguish between data mining and data warehousing?
11. Discuss various data mining issues with some examples?
12.Explain Data mining Functionalities?
13.Explain different types of data repositories on which mining can be performed?
14.What are the major components of data mining? Explain with a neat Flowchart?
15.Explain SQL in detail?
16.Discuss in detail OLAP server Architectures?
17.Explain data ware house Implementation?
18.Describe Selected Computation of Cuboids?
19.Explain Efficient methods for data cube computation?
20.Explain Optimization technique?
21.Discuss Multiway array aggregation for full cube computation?
22.Explain BUC Algorithm?
23.Discuss Star cubing?
24.Write an algorithm for Shell Fragment computation?
25.Discuss Constrained Gradient Analysis Data cube?
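Note on the cuboid-count questions in this unit: a commonly quoted counting formula is sketched below. The symbols n (number of dimensions) and L_i (number of hierarchy levels of dimension i, not counting the virtual top level "all") are notation assumed here, not taken from the question text.

```latex
% Total number of cuboids in an n-dimensional data cube,
% where dimension i has L_i hierarchy levels (excluding "all"):
\[
  T = \prod_{i=1}^{n} \left( L_i + 1 \right)
\]
% With no concept hierarchies, every L_i = 1, so
\[
  T = 2^{n}, \qquad \text{e.g. } n = 3 \;\Rightarrow\; T = 8 .
\]
```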
UNIT 2
2 mark
1.What is the need of data preprocessing?
2.Define smoothing and Binning?
3.What is the need for discretization in data mining?
4.What is concept hierarchy? Give an example?
5.Define DMQL?
6.What are functional components of GUI in data mining?
7.Define task relevant data?
8.What is meant by concept description?
9.What is data generalization?
10.How to perform class comparison?
11. Define Data Mining?
12.What is the main goal of statistics?
13.What are the factors to be considered while selecting samples in statistics?
14.Define data cleaning?
15.Define Data integration?
16.Define Data Selection?
17. Define Data Transformation?
18. What is pattern evaluation?
19.What is Knowledge Presentation?
20.List the steps in preprocessing?
21.What is visualization?
22.Name some conventional visualization techniques?
23.Give the features included in modern visualization techniques?
24.Define Conventional visualization ?
25.Define Spatial visualization ?
26.Define Descriptive Data mining?
27.What is Predictive data mining?
28.What is data generalization?
29.Define attribute oriented induction?
30.What is Jack knife?
31.What is Bootstrap?
32.Give the views of Statistical approach?
33.What are the assumptions of Statistical approach?
34.What is the use of Probabilistic graphical model?
35.Give the Importance of Probabilistic graphical model?
36.Define Deterministic model?
37.Define System?
38.Define Model?
39.How to choose the best model?
40.Principles of Qualitative Formulation?
41.What is linear regression?
42.State the types of linear model?
43.What is the use of linear model?
44.What are the goals of time series analysis?
45.What is smoothing?
46.What is lag?
47.What do you mean by concept hierarchy?
48.Define inconsistency cleaning?
49.What is Column level cleaning?
50.Define Descriptive data summarization?
51.What is a missing value?
52.Define Normalization?
53.What is attribute subset selection?
54.Define Dimensionality reduction?
55. Define Numerosity reduction?
56.What is a Central Tendency?
57.Define mean?
58.Define Mode?
59.Define Median?
60.What is a mid range?
61.What is Dispersion of data?
62.Define IQR?
63.What is variance?
64.Define range?
65. List the data transformation operations?
66.Define Quartiles?
67.What is weighted arithmetic mean?
68.What is Unimode?
69. What is Bimode?
70. What is Trimode?
71.Define Multimode?
72.Give the empirical relation for unimodal frequency?
73.What is Dispersion?
74.Define Standard Deviation?
75.What is 5 number summary?
76.What is a boxplot?
77.Define first Quartile?
78.Define Third Quartile?
79.What are Whiskers?
80.Give the formula for standard deviation?
81. Give the formula for variance?
82.What is discrepancy detection?
83.Define Unique rule?
84.Define Consecutive rule?
85.Define Null rule?
86.What is a data scrubbing tool?
87.What is data auditing tool?
88.Define data migration tool?
89.What is an ETL?
90.Define Redundancy?
91.What is correlation analysis?
92.Define Correlation coefficient?
93.Define attribute construction?
94.Define Discrete wavelet Transform?
95.Define Sampling?
96.What is comparison?
97.What is Discrimination?
98.Define attribute removal?
99.What is data focusing?
100.What is attribute generalization control?
4 mark
1.Distinguish between concept description and OLAP?
2.What is quantitative rule?
3.What is attribute relevance analysis?
4.What do you mean by attribute oriented induction?
5.List out the methods for implementing class comparison?
6. Write a short note on regression?
7. Write a short note on correlation?
8.Discuss Parametric methods?
9.Explain Non Parametric methods in detail?
10.Explain Data Generalization?
11.Describe Concept Hierarchy generation?
12.Explain Data mining Primitives?
13.Explain attribute oriented induction?
14.Discuss on Descriptive data summarization?
15.Explain Histogram?
16.Discuss Quantile plot?
17.Describe Q-Q plot?
18.Explain Scatter plot?
19.Discuss Loess curve?
20.Describe Missing values?
21.Explain Noisy data?
22.Describe Binning?
23.Explain Regression?
24.Discuss Clustering?
25.Describe the Mean,median,mode,mid range?
26.Explain IQR ,variance, quartiles?
27.Discuss Discrepancy detection?
28.Explain data scrubbing tools?
29.Discuss data Auditing tools?
30.Explain data migration tools?
31.Discuss Entity identification problem?
32.Explain Correlation analysis?
33.Explain smoothing?
34.Describe Aggregation?
35.Discuss Generalization?
36.Explain Normalization?
37.Describe Attribute Construction?
38.Discuss Min max Normalization?
39.Explain z-score Normalization?
40.Discuss Normalization by decimal scaling?
41.Describe Data cube aggregation?
42.Explain attribute subset selection?
43.Discuss on Dimensionality reduction?
44.Explain Numerosity reduction?
45.Explain discretization?
46.Explain concept hierarchy generation?
47.Describe stepwise forward selection?
48. Describe stepwise backward Elimination?
49. Describe the combination of forward selection and backward elimination?
50.Discuss on decision tree induction?
51.Explain DFT?
52.Explain Hierarchical pyramid algorithm?
53.Describe Orthonormal?
54.Give short notes on PCA?
55.Discuss Log Linear Models?
56.Describe Equal Width histogram?
57. Describe Equal Frequency histogram?
58.What is V-Optimal?
59.Describe MaxDiff Histogram?
60.What are the 3 data clusters?
61. Describe Multidimensional histogram?
62.Define Centroid distance?
63.Describe Multidimensional index trees?
64.Explain SRSWOR?
65.Describe SRSWR?
66.Explain Cluster sample?
67.Discuss Stratified Sample?
68.List the merits of Sampling?
69.Discuss Top down Discretization?
70.Explain Splitting?
71. Discuss bottom up Discretization?
72.Discuss Merging?
73.Draw a flow chart for stepwise forward selection?
74.Draw a flow chart for stepwise backward Elimination?
75.Draw a flow chart for the combination of forward selection and backward elimination?
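For the normalization questions in this list (min-max, z-score, decimal scaling), the following is a minimal Python sketch of the three usual formulas. The function names and the sample figures (an income of 73,600 with minimum 12,000, maximum 98,000, mean 54,000 and standard deviation 16,000) are illustrative assumptions, not data from this question bank.

```python
def min_max(v, lo, hi, new_lo=0.0, new_hi=1.0):
    """Min-max normalization: map v from [lo, hi] to [new_lo, new_hi]."""
    return (v - lo) / (hi - lo) * (new_hi - new_lo) + new_lo


def z_score(v, mean, std):
    """z-score normalization: distance from the mean in standard deviations."""
    return (v - mean) / std


def decimal_scaling(values):
    """Divide by the smallest power of 10 that brings every |v| below 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]


# Illustrative (assumed) figures, not taken from the question bank.
print(min_max(73600, 12000, 98000))      # roughly 0.716
print(z_score(73600, 54000, 16000))      # 1.225
print(decimal_scaling([-986, 917]))      # [-0.986, 0.917]
```

Passing new_lo=-1.0 and new_hi=1.0 to min_max gives the [-1, 1] variant of the same formula.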
8 mark
1.Mention the various tasks to be accomplished as part of data pre-processing.
2. What is over fitting and what can you do to prevent it?
3.Explain the 5 steps in the Knowledge Discovery in Databases (KDD)
process.
4.Discuss in brief the characterization of data mining algorithms.
5.Discuss in brief important implementation issues in data mining.
6. List and discuss the various data mining primitives?
7. Distinguish between statistical inference and exploratory data analysis.
8. Write a short note on machine learning. What is supervised and unsupervised learning?
9. Write a short note on regression and correlation?
10. Discuss on Descriptive data summarization with examples?
11.Explain Graphic display of basic descriptive data summaries?
12.Explain data cleaning?
13.Describe data cleaning as a process?
14.Explain the measures of Central tendency?
15.Describe the measures of Dispersion of data?
16.Explain about data integration?
17. Describe Attribute Construction with example?
18.Discuss Min max Normalization with example?
19.Explain z-score Normalization with example?
20.Discuss Normalization by decimal scaling with example?
21.Discuss about data transformation?
22.Explain About data reduction?
23.Discuss the basic heuristic methods of attribute subset selection?
24.Explain Wavelet Transforms?
25.Explain Principal Component Analysis?
26.Explain Histogram with examples?
27.Discuss Quantile plot with examples?
28.Describe Q-Q plot with examples?
29.Explain Scatter plot with examples?
30.Discuss Loess curve with examples?
31.Describe Missing values with examples?
32.Explain Noisy data with examples?
33.Describe Binning with examples?
34.Explain Regression with examples?
35.Discuss Clustering with examples?
36.Apply binning method for data smoothing for 4,8,15,21,21,24,25,28,34?
37.Discuss that data integration is the detection and resolution of data value conflicts?
38.How can we find the good subset of original attributes?
39.Justify Wavelet Transforms can be applied to multidimensional data?
40.How can we reduce the data volume by choosing alternative smaller forms of data
representation?
41.How are the buckets determined and the attribute values partitioned in Histogram?
42.Explain Discretization by Intuitive partitioning?
43.Explain χ² (chi-square) merging?
44.Explain 3-4-5 rule with an example?
45.Describe the specification of a partial ordering of attributes explicitly at schema level by users
or experts?
46.Explain the specification of a portion of a hierarchy by explicit data grouping?
47.Discuss the specification of a set of attributes, but not of their partial
ordering?
48.Discuss issues to consider during data integration?
49.Data quality can be assessed in terms of accuracy, completeness and consistency. Propose
other two dimensions of data quality?
50.Suppose a group of 12 sales price records has been sorted as follows
5,10,11,13,15,35,50,55,72,92,204,215
Partition them into 3 bins by
i)Equidepth partition
ii)Equal width partitioning
iii)Clustering
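The binning questions above (smoothing 4,8,15,21,21,24,25,28,34 and partitioning the 12 sales prices into 3 bins) can be checked with a short Python sketch. The function names are hypothetical, equal-depth bins are cut at positions i*n/k of the sorted data, and equal-width bins use half-open intervals with the maximum value placed in the last bin; other conventions exist. The clustering partition is not covered here.

```python
def equal_depth_bins(values, k):
    """Equal-depth (equal-frequency) partition: k bins with about n/k values each."""
    data = sorted(values)
    n = len(data)
    return [data[i * n // k:(i + 1) * n // k] for i in range(k)]


def equal_width_bins(values, k):
    """Equal-width partition: k intervals of width (max - min) / k."""
    data = sorted(values)
    lo, hi = data[0], data[-1]
    width = (hi - lo) / k
    bins = [[] for _ in range(k)]
    for v in data:
        # The maximum value would fall just past the last interval, so clamp it.
        idx = min(int((v - lo) / width), k - 1)
        bins[idx].append(v)
    return bins


def smooth_by_bin_means(bins):
    """Replace every value in a bin by the mean of that bin."""
    return [[sum(b) / len(b)] * len(b) for b in bins]


prices = [5, 10, 11, 13, 15, 35, 50, 55, 72, 92, 204, 215]
print(equal_depth_bins(prices, 3))
print(equal_width_bins(prices, 3))
print(smooth_by_bin_means(equal_depth_bins([4, 8, 15, 21, 21, 24, 25, 28, 34], 3)))
```

Under these conventions the equal-width partition is very uneven for the sales prices, because the two largest values stretch the range.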
16 mark
1.Explain the need and steps involved in data preprocessing?
2.List out the primitives for specifying a data mining task?
3.Describe how concept hierarchies are useful in data mining?
4.What are the various issues addressed during data integration?
5.Write in detail about attribute oriented induction with algorithm?
6.Describe the various descriptive statistical measures for data mining?
7.Explain various methods of data cleaning in detail?
8. Give an account on data mining Query language?
9. How is Attribute-Oriented Induction implemented? Explain in detail.
10. Write and explain the algorithm for mining frequent item sets without candidate generation.
Give relevant example.
11. With relevant examples discuss the role of statistics in data mining?
12. Enumerate and discuss various statistical techniques and methods for
data analysis?
13. For class characterization, what are the main differences between a data cube based
implementation and a relational implementation such as attribute-oriented induction?
14.Explain Smoothing Techniques?
15.Explain Data Transformation in detail?
16.Explain Normalization in Detail?
17.Discuss Data Reduction in detail?
18.Describe Parametric and Non Parametric methods in detail?
19. Explain Data Generalization and Concept Hierarchy generation?
20.Describe the Alternative method for Data generalization and Concept Description?
21.Given the one-dimensional data set X={-5,0,23.0,17.6,9.23,1.11}, normalize the data set using
i)Min-Max Normalization[0,1]
ii) Min-Max Normalization[-1,1]
iii)Standard Deviation Normalization
22.Explain Designing the GUI based On DMQL?
23.A data set for analysis includes X={7,12,5,18,9,13,12,19,7,12,12,13,3,4,5,13,8,7,6} Find
Mean, median, mode and Standard Deviation for X?
24.Give the Graphical summarization of the data set X using boxplot representation. Find
Outliers in X?
X={7,12,5,18,9,13,12,19,7,12,12,13,3,4,5,13,8,7,6}
25.Explain Entropy based Discretization?
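For the two questions on the data set X={7,12,5,18,9,13,12,19,7,12,12,13,3,4,5,13,8,7,6}, a minimal Python sketch using the standard library `statistics` module is given below. Two assumptions are made and should be checked against the convention used in class: quartiles follow the module's default "exclusive" method, and outliers are taken to be values more than 1.5 × IQR beyond the quartiles. Both the population and the sample standard deviation are printed, since the question does not say which is intended.

```python
import statistics

X = [7, 12, 5, 18, 9, 13, 12, 19, 7, 12, 12, 13, 3, 4, 5, 13, 8, 7, 6]

mean = statistics.mean(X)
median = statistics.median(X)
mode = statistics.mode(X)
pop_sd = statistics.pstdev(X)      # divides by n
sample_sd = statistics.stdev(X)    # divides by n - 1

# Quartiles with the default "exclusive" method (an assumed convention).
q1, q2, q3 = statistics.quantiles(X, n=4)
iqr = q3 - q1

# Boxplot fences under the common 1.5 * IQR rule (also an assumption).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in X if x < lower_fence or x > upper_fence]

print("mean", mean, "median", median, "mode", mode)
print("population sd", pop_sd, "sample sd", sample_sd)
print("five-number summary", min(X), q1, q2, q3, max(X))
print("outliers", outliers)
```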
UNIT 3
2 mark
1.What is market basket analysis?
2.Define frequent itemset?
3.Define Association rule?
4.Define FP-growth?
5.Write the use of conditional pattern base in FP-tree?
6.What is the use of Multi-level association rule?
7.List the techniques for improving the efficiency of Apriori?
8.Define Support and Confidence?
9.What is level cross filtering?
10.How to determine redundant association rule?
11.Give the General properties of Boolean networks?
12.What is support?
13.Define Confidence?
14.How are Association rule mined from large database?
15.List the merits of Dimensional modeling?
16.What comprises of a dimensional model?
17.What is Bottleneck Detection?
18.What is Back room metadata?
19.What is Front room metadata?
20.What is Active metadata?
21.What is meta data catalogue?
22.What is association mining?
23.Give examples for association mining?
24.Define Market Basket analysis?
25.Give the applications of association rule?
26. Define Minimum Confidence threshold ?
27.Define Minimum support Threshold?
28.What are the two step process?
29.Define Single level association rule?
30. Define Multi level association rule?
31. Define Single Dimensional association rule?
32.Define Multi Dimensional association rule?
33. Define Boolean association rule?
34.Define Quantitative association rule?
35.Define Frequent Itemset Mining?
36.Define Sequential pattern mining?
37.Define Structured pattern mining?
38.What is Apriori Principle?
39.Define Join step?
40.What is Prune step?
41.Why is Progressive Refinement used?
42.What is Superset coverage property?
43.Define two or multi step mining?
44.What is frequency?
45.Define Support count?
46.What is an itemset?
47.Define absolute support?
48.What is closed frequent itemset?
49.What is Maximal frequent itemset?
50.Define Antimonotone?
51.Define Conditional Database?
52.What is Horizontal data format?
53.State Item Merging?
54.What is Sub item pruning?
55. State Item skipping?
56.Define Uniform Support?
57.What is an Ancestor?
58.Define Intra Dimensional Association rule?
59.Define Inter Dimensional Association rule?
60.Define Hybrid Dimensional Association rule?
61.What are Categorical Attributes?
62. What are Nominal Attributes?
63. What are Quantitative Attributes?
64.What is Dynamic Quantitative Attributes?
65.Define Predicate set?
66.Define null transaction?
67.Define null invariant?
68.What is a knowledge type constraints?
69.Define data constraints?
70.State level constraints?
71.Define Dimension constraints?
72.What is an Interestingness constraints?
73.Define Rule constraints?
74.List the classifications of Rule constraints?
75.Define Antimonotonic?
76.What is monotonic?
77.Define succinct?
78.Define Convertible?
79. Define InConvertible?
80.Define ECLAT?
4 mark
1.In classification trees, what are surrogate splits, and how are they used?
2. What is the objective function of the K-Means algorithm?
3.List two interesting measures for association rules.
4. What are Iceberg queries?
5.Explain Market Basket analysis?
6.Describe the basic concepts of association rule?
7.Define Minimum Confidence threshold and Minimum support Threshold?
8.Discuss association rule mining can be viewed as two-step process?
9.Explain the levels of abstraction involved in the rule set?
10.Discuss the number of data dimensions involved in the rule?
11.Describe the types of values handled in the rule?
12.Explain the kinds of patterns to be mined?
13.Explain the various extensions to association mining?
14.Explain constraint-based association mining?
15. Discuss Apriori algorithm?
16.Describe the Apriori property as a two step process?
17.Explain the Association rule based on conditional probability ?
18.Explain Hash based itemset counting?
19.Describe Transaction reduction?
20.Explain Partitioning?
21.Discuss Sampling?
22.Describe Dynamic Itemset Counting?
23.What are the bottlenecks of the Apriori algorithm?
24.What are the benefits of FP tree structure?
25.What are the major steps to mine FP tree?
26.What is the principle of Frequent pattern growth?
27.Why is Frequent pattern growth fast?
28.Discuss Ice berg Queries?
29.Explain Progressive Deepening?
30.Discuss Progressive Refinement of data mining quality?
31.Explain Uniform Support?
32. Describe Reduced Support?
33.Explain Level by Level Independent?
34.Describe Level cross filtering by k-itemset?
35.Discuss Level cross filtering by single item?
36. Discuss Controlled Level cross filtering by single item?
37. Compare closed frequent itemset with that of Maximal frequent itemset?
38.How is the Apriori property used in the algorithm?
39.How can we mine closed frequent itemset?
40.Compare Intra Dimensional Association rule and
Inter Dimensional Association rule?
41.Discuss Correlation analysis using lift?
42.Compare null transaction and null invariant?
43.How are meta rules useful?
44.Specify the rule constraint types?
45. How can we use rule constraint to prune the search space?
46.Explain Antimonotonic constraint?
47.Discuss monotonic constraint?
48.Explain succinct constraint?
49.Describe Convertible constraint?
50.Explain InConvertible constraint?
8 mark
1.Find all the association rules that involve only B, C, H (in either left
or right hand side of the rule). The minimum confidence is 70%?
2. Discuss the approaches for mining multi level association rules from the transactional
databases. Give relevant example.
3.Explain the classification of association rule Mining?
4. With an algorithm explain constraint-based association mining?
5. Discuss Apriori algorithm with suitable example?
6.Design the Apriori Algorithm?
7.Illustrate the Working of Apriori Algorithm ?
8.Generate Association rule from Frequent Itemsets?
9.Suppose the data contains the Frequent Itemset l={I1,I2,I5}.What are the Association rules
that can be Generated from l?
10. Given a Sample Transactional database X:
X: TID Items
T01 A,B,C,D
T02 A,C,D,F
T03 C,D,E,G,A
T04 A,D,F,B
T05 B,C,G
T06 D,F,G
T07 A,B,G
T08 C,D,F,G
Using Threshold values support =25% and Confidence = 60%.Find
All large item sets in database X?
11. Given a Sample Transactional database X:
X: TID Items
T01 A,B,C,D
T02 A,C,D,F
T03 C,D,E,G,A
T04 A,D,F,B
T05 B,C,G
T06 D,F,G
T07 A,B,G
T08 C,D,F,G
Using Threshold values support =25% and Confidence = 60%.Find
Strong association rules for database X?
12. Given a Sample Transactional database X:
X: TID Items
T01 A,B,C,D
T02 A,C,D,F
T03 C,D,E,G,A
T04 A,D,F,B
T05 B,C,G
T06 D,F,G
T07 A,B,G
T08 C,D,F,G
Using Threshold values support =25% and Confidence = 60%.
Analyze misleading associations for the rule set obtained in question 11?
13.Explain the method Frequent pattern growth?
14.Describe the procedure for creating a conditional pattern base?
15. What are the approaches to mining multi level association rules?
16.Discuss the Strategies for mining multi level association rules using reduced support?
17.Explain multi level association rules: Redundancy Filtering?
18.Design a method that mines the complete set of frequent itemset without candidate
generation?
19.Discuss Constraint based Association mining?
20.Explain meta rules Guided mining of Association rule?
21.Explain how rule constrains can be used in mining process?
22.Prove that all nonempty subsets of a frequent itemset must also be frequent?
23.Prove that the support of any nonempty subset s' of itemset s must be at least as great as the
support of s?
24.Give an example to show that items in a strong association rule may actually be negatively
correlated?
25.Discuss effective methods that can be used to reduce the number of rules generated while still
preserving most of the interesting rules?
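The Apriori questions on the sample database X (support = 25%, confidence = 60%) can be explored with the Python sketch below. It is one possible level-wise implementation with a join step and a prune step, not a reference solution; the function names are hypothetical, and the 25% threshold is read as a minimum support count of 2 out of 8 transactions.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {frequent itemset: support count} by level-wise candidate generation."""
    transactions = [frozenset(t) for t in transactions]
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Count the support of every candidate k-itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent


def strong_rules(frequent, min_conf):
    """Return (lhs, rhs, confidence) for rules with confidence >= min_conf."""
    rules = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                conf = count / frequent[lhs]
                if conf >= min_conf:
                    rules.append((lhs, itemset - lhs, conf))
    return rules


X = ["ABCD", "ACDF", "CDEGA", "ADFB", "BCG", "DFG", "ABG", "CDFG"]
frequent = apriori(X, min_count=2)          # 25% of 8 transactions
for lhs, rhs, conf in strong_rules(frequent, 0.6):
    print(sorted(lhs), "->", sorted(rhs), round(conf, 2))
```

The rule generator relies on the Apriori property: every subset of a frequent itemset is frequent, so its count is always available in the dictionary.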
16 mark
1.Write the algorithm to discover frequent itemsets without candidate generation and explain it
with an example?
2.Discuss Apriori algorithm with suitable example and explain how its efficiency can be
improved?
3.Discuss mining of multi-level association rules from transactional databases?
4. Explain with an algorithm, how to mine single dimensional Boolean Association Rules from
transactional database. Give relevant example?
5. Describe the multi-dimensional association rule, giving a suitable example?
6. With an algorithm explain constraint-based association mining. Give relevant example
7.There are 9 transactions in this database |D|=9 and a minimum support count is taken as 2.Use
Apriori Algorithm for finding frequent itemsets in D?
TID ITEMS
T100 I1,I2,I5
T200 I2,I4
T300 I2,I3
T400 I1,I2,I4
T500 I1,I3
T600 I2,I3
T700 I1,I3
T800 I1,I2,I3,I5
T900 I1,I2,I3
8.There are 4 transactions in this database |D|=4 and a minimum support count is taken as 2.Use
Apriori Algorithm for finding frequent itemsets in D?
TID ITEMS
100 1,3,4
200 2,3,5
300 1,2,3,5
400 2,5
12.Build a decision tree classification model to classify bank loan applications by assigning
applications to one of 3 classes?
Owns Home  Married  Gender  Employed  Class
Yes        Yes      Male    Yes       B
No         No       Female  Yes       A
Yes        Yes      Female  Yes       C
Yes        No       Male    No        B
No         Yes      Female  Yes       C
No         No       Female  Yes       A
No         No       Male    No        B
Yes        No       Female  Yes       A
No         Yes      Female  Yes       C
Yes        Yes      Female  Yes       C
13.Classify a training sample using ID3 and construct a decision tree with your own example?
14.Given a training data set Y:
A B C Class
15 1 A C1
20 3 B C2
25 2 A C1
30 4 A C1
35 2 B C2
25 4 A C1
15 2 B C2
20 3 B C2
31.Discuss Sequential pattern mining algorithm Based on candidate generate and Test?
32.Explain SPADE?
33.Discuss PrefixSpan?
34.Describe mining closed sequential pattern?
35.Explain mining multidimensional, Multilevel sequential pattern?
36.Discuss Constraint based mining of sequential pattern?
37.Explain Periodicity Analysis for Time related sequence data?
38.Discuss mining sequential pattern in Biological data?
39.Describe Alignment of Biological Sequences?
40. Elaborate Markov Chain?
41.Design Forward Algorithm?
42. Design Viterbi Algorithm?
43.Design Baum-Welch Algorithm?
44.Discuss Graph Mining?
45. Describe mining closed Frequent Substructures?
46. Discuss Constraint based mining of Substructures pattern?
47.Explain Generalization of Class Composition Hierarchies?
48.Discuss Text Retrieval Methods?
49.Explain How to choose a Data Mining System?
50.Describe Data mining for Intrusion Detection?
16 mark
1.Explain the mining of spatial databases?
2.Discuss the mining of text databases?
3.What are the salient features of time series data mining?
4.What is web mining? Discuss the various web mining techniques?
5.Discuss in detail the Application of Data mining for financial data analysis?
6.Discuss the application of data mining for biomedical and DNA data analysis and
telecommunication industry?
7.Discuss the social impacts of data mining Systems?
8. Why is outlier mining important? Briefly describe the different approaches behind statistical
based outlier detection, distance-based outlier detection and deviation-based outlier detection?
9. What is multidimensional analysis? Discuss the same with an example?
10. What is time series analysis? Discuss the same with an example?
11.Describe BC weather pattern Analysis?
12.Explain Mining Spatial Association and Co Location Pattern?
13.Describe Multimedia data Mining?
14.Elaborate on Sequential pattern mining?
15.Is data mining merely a manager's business or everyone's business?
16. Is data mining a threat to privacy and data security?
17.Explain Mining data streams?
18.Discuss stream query processing?
19.Explain data reduction and transformation techniques?
20.Elaborate Indexing Methods for similarity search?
21.Describe Mining Sequence Pattern in Transactional Database?
22.Discuss Data mining in Telecommunication and retail Industry?
23.Explain few examples of Commercial data mining systems?
24.Discuss Statistical Data mining?
25.Explain the Trends in Data mining?