Database General Exam Reading List
Basic Textbooks
Data Models
Systems
Query Optimization
Query Execution
Mathematical Foundations
Database Statistics and Indexing
Transaction Processing
- P. A. Bernstein, E. Newcomer. Principles of Transaction Processing, 2nd ed.,
Chapter 1 (Introduction), Chapter 9 (Two-Phase Commit),
Chapter 6 (Locking, the new version), Chapter 8 (Database System Recovery).
- Michael J. Franklin. Concurrency Control and Recovery. The Handbook of Computer Science and Engineering, A. Tucker, ed., CRC Press, Boca Raton, 1997.
Parallel and Distributed Databases
- P. A. Bernstein, E. Newcomer. Principles of Transaction Processing,
2nd ed., Chapter 10 (Replication).
- M. T. Özsu, P. Valduriez. Principles of Distributed Database Systems,
2nd ed. Chapter 4 (Distributed Database Systems), pp. 82-99; Chapter 5
(Distributed Database Design), pp. 107-154, skimming examples, algorithms,
and Section 5.4.3; Chapter 13 (Parallel Database Systems), pp. 420-452.
- D. Kossmann.
The State of the Art in Distributed Query Processing. ACM
Computing Surveys 32(4), 2000, pp. 418-469.
Data Integration
Semistructured and XML Data
- D. Chamberlin.
XQuery: an XML query language. IBM Systems Journal 41(4), 2002.
- Shankar Pal, Istvan Cseri, Gideon Schaller, Oliver
Seeliger, Leo Giakoumakis, Vasili Zolotov: Indexing XML Data Stored
in a Relational Database. VLDB 2004.
Data Warehousing and Mining
- S. Chaudhuri, U. Dayal.
An Overview of Data Warehousing and OLAP Technology. SIGMOD Record
26(1), 1997, pp. 65-74.
- R. Agrawal, R. Srikant.
Fast Algorithms for Mining Association Rules in Large Databases. VLDB
1994.
- J. Gray, S. Chaudhuri, A. Bosworth, A. Layman, D. Reichart, M. Venkatrao,
F. Pellow, H. Pirahesh.
Data Cube: A Relational Aggregation Operator Generalizing Group-By,
Cross-Tab, and Sub-Totals. Data Mining and Knowledge Discovery 1997.
Ranking and Information Retrieval
- Ronald Fagin, Amnon Lotem, Moni Naor: Optimal aggregation algorithms for
middleware. JCSS 66(4): 614-656 (2003)
- Sanjay Agrawal, Surajit Chaudhuri, Gautam Das, Aristides Gionis. Automated Ranking of Database Query Results. CIDR, 2003.
Optional:
- R. Baeza-Yates, B. Ribeiro-Neto. Modern Information Retrieval. 1999.
Chapter 2.4-2.5 (Modeling); Chapter 8.1-8.3 (Indexing and Searching).
Stream Processing
- Brian Babcock, Shivnath Babu, Mayur Datar, Rajeev Motwani, Jennifer Widom: Models and Issues in Data Stream Systems. PODS 2002: 1-16.
- D. Abadi, D. Carney, U. Çetintemel, M. Cherniack, C. Convey, S. Lee, M. Stonebraker, N. Tatbul, and S. Zdonik.
Aurora: A New Model and Architecture for Data Stream Management. VLDB Journal 12(2), 2003.
Web
More Optional Reading
- M.J. Carey, D.J. DeWitt, J. Naughton.
The OO7 Benchmark. SIGMOD 1993.
- J. Goldstein and P. Larson. Optimizing queries using materialized views:
a practical, scalable solution. SIGMOD 2001.
- P. Buneman, S. A. Naqvi, V. Tannen, L. Wong:
Principles of Programming with Complex Objects and Collection Types.
TCS 149(1): 3-48 (1995).
- J. Hellerstein, P. Haas, H. Wang. Online Aggregation. SIGMOD 1997.
- Z.G. Ives, A.Y. Levy, D.S. Weld, D. Florescu, M. Friedman. Adaptive Query Processing for Internet
Applications. IEEE Data Engineering Bulletin 23(2), 2000.
- Alon Y. Halevy, Zachary G. Ives, Peter Mork, Igor Tatarinov. Peer Data Management Systems: Infrastructure for the Semantic Web. WWW Conference, 2003.
- C. Zaniolo, S. Ceri, C. Faloutsos, R. T. Snodgrass, V.S. Subrahmanian,
R. Zicari. Advanced Database Systems. Chapter 5 (Overview of
Temporal Databases), pp. 99-121.
- S. Ceri, R. Cochrane, J. Widom.
Practical Applications of Triggers and Constraints: Successes and Lingering
Issues. VLDB 2000.
- A. Doan, P. Domingos, A. Halevy:
Learning to Match the Schemas of Data Sources: A Multistrategy Approach.
Machine Learning 50(3): 279-301 (2003).
The list was updated on Oct 1 2007 by YongChul Kwon, using advice from Dan Suciu and Magda Balazinska.
The student should read all of the required papers listed here, plus at least eight (8) of the papers listed as optional.