Model Question Paper Subject Code: BC 0058 Subject Name: Data Warehouse Credits: 4 Marks: 140 Part A (One mark questions) 1. When a large number of transactions are taken and stored to be dealt with at a later time without the presence of a user, the process is known as ……….. a. OLTP system b. batch processing. c. Data warehousing d. None of the above. 2. OLTP stands for a. Online Transaction Processing b. Offline Transaction Processing c. Online Transfer Process d. Offline Transfer Process 3. Who is the Father of Data Warehouse? a. Michael Reed b. Bill Inmon c. Ralph Kimball d. Dr.
Rakesh Agarwal. 4. Which is not a type of data warehouse. a. Federated Data warehouse b. Distributed Data warehouse c. Real time Data warehouse d. Subject oriented Data warehouse 5. Which is not a Key Issue during data warehouse construction. a. Values and expectations. b. Risk assessment. c. a & b both d. None of the above 6. The _____approach is to start by building individual data marts one by one. a. Top down b. Bottom up c. best-of-breed d. None of the above 7. _________ is known worldwide as an innovator, writer, educator, speaker and consultant in the field of Data Warehousing. a.
Michael Reed b. Bill Inmon c. Ralph Kimball d. Dr. Rakesh Agarwal 8. With the ___________ approach, compatibility among the tools from different vendors could become a serious problem a. Top down b. Bottom up c. best-of-breed d. None of the above 9. The structure that brings all the components of a Data Warehouse together is known as the………. a. Organization b. Architecture c. System d. structure 10. Which data is supported by different database systems and operating systems and is the data from many vertical applications. a. Production Data b. Internal Data c. External Data d. Archived Data 11.
Which applications are knowledge discovery systems where the algorithms help you discover trends and patterns from the usage of your data. a. Ad hoc b. MD analysis c. Data-mining d. EIS Feed 12. A __________ is a repository of data gathered from operational data and other sources that is designed to serve a particular community of knowledge workers . a. Data-Mining b. Data Mart c. Metadata d. Data warehouse 13. Which Data Model describes data from a high level. a. Logical b. Physical c. External d. Conceptual 14. Which technique is a discipline used to illuminate the microscopic relationships among data elements. . Dimensional Modeling b. Star schema c. E-R modeling d. Data warehousing 15. The main purpose of ER modeling is a. To improve analysis for decision making b. To remove redundancy c. To record historical data d. All the above 16. Dimension modeling is represented using a. Snow-flake schema b. Star schema c. Fact constellation d. Granularity schema 17. Data extraction, transformation, and loading encompass the areas of data acquisition and ______ a. Data loss b. Data regain c. data storage. d. Data transfer 18. ________ presupposes a selection process and select the needed data based on the user requirements. . Data extraction b. Data transformation c. Data loading d. data storage 19. _______ implies physical movement of the data from the computer storing the source database to that which will store the data warehouse database, assuming it is different. a. Extraction b. Transformation c. Loading d. Storing 20. Which of the following is not the type of data loading a. Initial Load b. Incremental Load c. Full Refresh d. decremental load 21. The state of completeness, validity, consistency, timeliness and accuracy makes data appropriate for a specific use. a. Data warehouse b. Data mining c.
Data quality d. Data mart 22. ____refers to the methods of automatically identifying objects, collecting data about them and entering data into the data warehouse a. Data capture b. Data Integration c. Data Profiling d. Data Cleansing 23. Which is not a popular OLAP tool a. Business Objects b. Cognos c. SAS/OLAP Studio d. None of the above 24. Data stored by operational systems, such as point-of-sales, are in types of databases called ____ a. ERP b. Supply chain management c. Online Transaction Processing. d. Customer relationship management 25. A data model consists of a.
Structural part b. Manipulative part c. Set of integrity rules d. All the above 26. ______is a process of modeling and formalizing data requirements is an intrinsic part of the database design process. a. Data modeling b. Data extraction c. Data integration d. Data distribution 27. RAID stands for a. Redundant Array of Inexpensive Disks b. Redundant Application of Integrated Data c. Reduced Application of Inexpensive Data d. Reduced Application of Integrated Disks 28. “According to Ralph Kimball, Back-room metadata guides: 1. Extraction 2. Cleaning 3. Loading processes” a. 1,2 only b. ,2 only c. 1,2,3 all d. 2,3 only 29. Which is not an example of Technical Metadata. a. hardware details b. software details c. Query response time details d. system configuration details 30. In ETL process all the records in the _________ that should be brought into the data warehouse actually are extracted into the data warehouse. a. Destination system b. source system c. Data system d. user system 31. _____ Testing ensure that the ETL Process functions well with other upstream and downstream processes. a. Unit b. Regression c. Performance d. Integration 32. UAT stands for a.
Unit Acceptance Technique b. User Acceptance Testing c. Unique Acceptance Testing d. Union Acceptance Technique 33. ______ is a technique of extracting the hidden predictive information from large databases. a. Data Mining b. Data warehousing c. Data extraction d. Data transformation 34. Which of the following is not a technique of data mining: a. Artificial Neural Networks b. Decision Trees c. Genetic Algorithms d. chinese postman problem algorithm 35. CART stands for a. Chisquare Automatic and Regression Testing b. Classification and Regression Trees c. Credit-card Automatic Route Transfer d.
Chisquare Algorithm Regression Testing 36. ______ provides tools for searching, sorting, filtering and drilling down enabling previously complex data models to be viewed intuitively through real-time 3-D graphical representation a. Intelligent Miner b. Enterprise Miner c. MineSet d. Clementine 37. Which of the following is not a tool of data mining a. MineSet b. Intelligent Miner c. Weka3 d. Redbrick 38. A ______ Web house is a single most important tool for identifying, prioritizing, and retaining e-commerce customers. a. click stream b. Data stream c. Design stream d. rag stream 39. When you bring your data warehouse to the web, from the point of view of the users,then which of the following is not the key requirement for the same: a. self-service data access b. interactive analysis c. low availability and performance d. zero-administration client 40. _______which provides low-cost transmission of information, and exchange information with anyone within or outside the company. a. Intranet b. Extranet c. Internet d. user net Part B (Two mark questions) 41. State whether the following statement is true or false for Data warehouse and OLTP 1.
A Data Warehouse (DW), is a database that is designed for facilitating querying and analysis. 2. OLAP system is designed to be read-optimized. a. 1-T , 2-F b. 1-F , 2-T c. 1-F , 2-F d. 1-T , 2-T 42. A Data Warehouse is a relational database that is designed for …… and ………. a. Process and transaction b. Query and analysis c. Searching and selection d. Searching and sorting 43. State whether the following statement is true or false for the differences between OLTP and Data Warehouse projects. 1. The Data Warehouse record transactions in real time and aims to automate the clerical data entry processes of a business entity. 2.
The OLTP database does not cater to real time operational requirements of the enterprise. a. 1-T , 2-F b. 1-F , 2-T c. 1-F , 2-F d. 1-T , 2-T 44. “What are the vital areas in data warehouse development life cycle “ 1. Warehouse management 2. Data management 3. Project management a. Only 1, 2 b. Only 1, 3 c. Only 2 ,3 d. All the above 45. What are the components of data warehouse architecture 1. Source Data Component 2. Data Staging Component 3. Data Storage Component a. Only 1, 2 b. Only 1, 3 c. Only 2 ,3 d. All the above 46. There are three staged archival methods, so arrange these methods in to the appropriate sequence. . Recent data is archived to a separate archival database that may still be online. 2. The oldest data is archived to tape cartridges or microfilm and even kept off-site. 3. The older data is archived to flat files on disk storage. a. 1,2,3 b. 2,3,1 c. 3,2,1 d. 1,3,2 47. State whether the following statements are true or false for fact table: 1. Fact Table is made up of two or more foreign keys. 2. Fact table always expresses a one-to-many relationship. a. 1-T,2-T b. 1-T,2-F c. 1-F,2-T d. 1-F,2-F 48. The process of turning ________ into _________ is called archiving. a. archived redo log files, redo log files b.
Flat files, dump files c. redo log files, archived redo log files d. dump files, Flat files 49. Arrange the functional steps of ETL into an appropriate sequence. 1. Filtering for refreshes and incremental loads 2. Triggering for incremental changes 3. Data extraction 4. Transformation 5. Integration 6. Cleansing 7. Applying to the Data Warehouse database. a. 1,2,3,4,5,6,7 b. 1,3,2,6,5,4,7 c. 2,1,5,4,3,6,7 d. 7,6,5,4,3,2,1 50. _____and ______tools are available to translate the data from one platform to another, and populate the Data Warehouse. a. Statistical Analysis System, Informatics b.
Oracle 10g Warehouse builder, MySAP c. Data Extraction, loading d. SAP BI, Oracle 10g 51. State whether the following statements are true or false for CMM levels. 1. At level 3, companies can begin to predict future IT implementation performance. 2. At level 4, IT best practices are documented and performed throughout the enterprise. a. 1-T,2-F b. 1-F,2-T c. 1-T,2-T d. 1-F,2-F 52. State whether the following statements are true or false: 1. ROLAP technology tends to have greater scalability than MOLAP technology 2. MOLAP technology tends to have greater scalability than ROLAP technology a. 1-T,2-F b. -T,2-T c. 1-F,2-T d. 1-F,2-F 53. State whether the following statements are true or false for characteristics of OLAP 1. An OLAP system typically adopts either a star or snowflake model and a subject oriented database design. 2. OLAP systems are not market-oriented. a. 1-T,2-F b. 1-T,2-T c. 1-F,2-T d. 1-F,2-F 54. State True(T) or False(F). 1. RAID 0 provides data stripping 2. RAID 1 provides block level stripping a. 1–F, 2–F b. 1–F, 2–T c. 1–T, 2–F d. 1–T, 2–T 55. Design metadata consists of 1. Schema definition 2. Source tables 3. ETL information” a. Only 1 and 2 b. Only 1 and 3 c. Only 2 and 3 d. Only 1 6. The tools of metadata management are 1. Erwin data modeler 2. Ablnitio EME 3. Pentaho Metadata 4. Microsoft repository a. Only 2 and 4 b. Only 1 and 2 c. Only 3 and 4 d. All of the above 57. ________ and _______ensures that data loads and queries perform within expected time frames and that the technical architecture is scalable. a. Performance, Scalability Testing b. Unit, Regression Testing c. Integration, Requirements Testing d. None of the above 58. State whether the following statement is true or false: 1. Regression testing is revalidation of existing functionality with each new release of code. . UAT typically focuses on data loaded to the Data Warehouse and any views that have been created on top of the tables. a. 1-T,2-F b. 1-T,2-T c. 1-F,2-T d. 1-F,2-F 59. State whether the following statements are true or false: 1. Data Mining is user driven approach not data driven approach. 2. Prediction is the other aspect of Data Mining. ” a. 1-T,2-F b. 1-T,2-T c. 1-F,2-T d. 1-F,2-F 60. Which of the following are the useful informations of Web house: 1. Site statistics 2. Visitor conversions 3. Referring partner links 4. Site navigation resulting in orders 5. Site navigation not resulting in orders a.
Only 1, 2, 3 b. .Only 1,2,3,4 c. All 1, 2,3,4,5 d. Only 1,2,3,5 Part C (Four mark questions) 61. Match the following for the functionality of data warehousing. Set – A A. Roll-up B. Drill-down C. Pivot D. Slice and Dice E. Selection Set – B 1. Data is summarized with increased generalization. 2. Cross tabulation that is, rotation is performed. 3. Increasing levels of detail are revealed. 4. Data is available by value or range. 5. Performing projection operations on the dimensions. a. A-1 , B-2 , C-3 , D-4, E-5 b. A-5 , B-4 , C-3 , D-2, E-1 c. A-1 , B-3 , C-2 , D-5, E-4 d.
A-2 , B-1 , C-4 , D-3, E-5 62. State whether the following statements are true or false for Data warehouse 1. Data Warehouse contains data for information purpose. 2. Data Warehouse is a store house of historical data. 3. In Data Warehouse, the requirements are gathered area wise. a. 1-T, 2-T, 3-T b. 1-T, 2-T, 3-T c. 1-F, 2-T, 3-F d. 1-T, 2-T, 3-T 63. Fill in the blank 1. The Data Warehouse functions as a Decision Support System and an…………… 2. The ……… tool will host a meta data repository. 3. A ……… DW architecture is a system that is composed of multiple architectures. . 1- Executive Information System, 2- ETL, 3 – Federated b. 1- ETL, 2- Executive Information System, 3 – Federated c. 1- Executive Information System, 2-Federated , 3 – ETL d. 1- Federated, 2- ETL, 3 – Executive Information System 64. What are the characteristics of fact table in star schema? 1. Concatenated key 2. Data grain 3. Fully additive measure 4. Factfull fact table 5. Semi additive measure 6. Table is wide a. 1, 2, 3, 6 only b. 1,2,3,4 only c. 1, 2, 4, 6 only d. 1, 2, 3, 5 only 65. Fill in the blanks. 1. ________ a type of application of data is similar to the initial load. . ________ is the application of ongoing changes from the source systems. 3. ________populating all the Data Warehouse tables for the very first time. a. 1- full refresh, 2- Incremental Load, 3-Initial Load b. 1- Incremental Load, 2-full refresh , 3-Initial Load c. 1- Initial Load, 2- Incremental Load, 3-full refresh d. 1- Incremental Load, 2- Initial Load, 3- full refresh 66. Match the following: 1. Level 0 – a. Performed Informally 2. Level 1 – b. Not Performed 3. Level 2 – c. Well-Defined 4. Level 3 – d. Planned and Tracked 5. Level 4 – e. Quantitatively Controlled 6. Level 5 – f.
Continuously Improving a. 1-b,2-a,3-d,4-c,5-e,6-f b. 1-a, 2-b,3-d,4-c,5-e,6-f c. 1-c,2-b,3-a,4-d,5-f,6-e d. 1-f,2-a,3-d,4-c,5-b,6-e 67. Rearrange the following steps in OLAP creation process 1. Transform and standardize data 2. Build cubes 3. Produce reports 4. Extract data 5. Import to OLAP database a. 2, 3, 4, 1, 5 b. 1, 4, 5, 2, 3 c. 4, 1, 5, 2, 3 d. 5, 3, 2, 1, 4 68. Match the following sets Set -1 1. RAID 0 2. RAID 1 3. RAID 0+1 4. RAID 5 Set-2 a. Data Stripping b. Shadowing/Mirroring / Duplexing c. Striping and Mirroring d. block-level striping a. 1-a,2-b,3-c,4-d b. 1-b,2-a,3-c,4-d c. -a,2-b,3-d,4-c d. 1-d,2-c,3-b,4-a 69. Match the following sets Set -1 1. Design Metadata 2. Population Metadata 3. Administrative Metadata Set-2 a. ETL information, sources and interface details b. Schema definition, source tables, and views. c. Access rights, protocols, physical location. a. 1-a,,2-b,3-c b. 1-b,2-a,3-c c. 1-c,2-b,3-a d. 1-b,2-c,3-a 70. Which of the following are not the parts of Unit testing: 1. Whether ETLs are accessing and picking up right data from right source. 2. Sequence of ETLs jobs in batch. 3. Testing the rejected records that don’t fulfill transformation rules. 4.
Checking the source system connectivity. 5. Dependency and sequencing. 6. Error log generation. a. 1,2,3 only b. 2,4,5 only c. 2,5,6 only d. 2,3,6 only 71. Arrange the decision support progress of data mining steps in the appropriate sequence. 1. Database systems 2. File based systems 3. OLAP systems 4. Data warehouses 5. Data mining applications a. 1, 2,3,4,5 b. 5, 4,3,2,1 c. 2, 1,4,3,5 d. 3, 2,4,5,1 72. The clickstream data enables analysis of different measures like:– 1. Customer demand 2. Demographic data collection 3. Site statistics 4. Feedback on website design 5. Customer buying patterns 6.
Effectiveness of marketing promotions a. 1, 2, 3, 4 only b. 1, 2,4,5,6 only c. 1, 2,3,4,5 only d. 4, 5, 6 only 73. State whether the following statements are true or false for features of the data Web house 1. It is a fully distributed system. as Kimball would say, there is no center to the data Web house. 2. It is a Web-enabled system; it is beyond a client/server system. 3. The Web does not support to the data types including textual, numeric, graphical, photographic, audio, video, and more. 4. The Web sleeps for some time. a. 1-F,2-T,3-F,4-F b. 1-T,2-T,3-F,4-F c. 1-F,2-F,3-F,4-T d. 1-F,2-F,3-T,4-T 4. Match the following two sets correctly Set–I a. MineSet b. Clementine c. DMMiner d. Intelligent Miner Set–II 1. DBMiner technology inc. 2. SPSS Inc. integral solutions 3. IBM Corp 4. Silicon Graphics Inc–SGI a. a–4, b–3, c–2, d–1 b. a–4, b–2, c–1, d–3 c. a–2, b–3, c–4, d–1 d. a–4, b–2, c–3, d–1 75. Match the following two sets correctly Set–I a. Disk mirroring b. Disk striping c. Parity checking d. Disk duplexing Set–II 1. Writing the same data to two disk drives connected to the same controller 2. Each disk has its own distinct controller 3. Ensures correct transmission of data 4.
Data spread across multiple disks by sectors a. a–3, b–1, c–2, d–4 b. a–1, b–2, c–3, d–4 c. a–1, b–4, c–3, d–2 d. a–2, b–4, c–3, d–1 Answer Keys Part – A Q. No. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Ans. Key B A B D D B C C B A C B D C B B C A C D Q. No. 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 Ans. Key C A D C D A A C C B D B A D B C D A C C Q. No. 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 Part – B Ans. Key D B C A D D B C C C D A A C A D A B C C Q. No. 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 Part – C Ans. Key C C A D A A C A B C C B B B C