Explain how a database could be recovered via reprocessing. why is this generally not feasible?

What you Will Learn – Database Recovery

From Transaction Processing, we know that every transaction should reach a commit point (or abort). What happens when the database or DBMS fails before commit (or abort)? The Database Recovery subsystem supports recovery and the Durability (the “D” in ACID) property to ensure data can never be lost.


Textbook Resources for Database Recovery

Connolly, Begg, Holowczak BDSConnolly and Begg DSFundamentals of DB Systems
Elmasri and Navathe
Concepts of DB Management
Pratt and Last
Chapter 14 5th Ed: 22
6th Ed: 22
Chapter 22 Chapter 7
  • There are many situations in which a transaction may not reach a commit or abort point.
    1. An operating system crash can terminate the DBMS processes
    2. The DBMS can crash
    3. The system might lose power
    4. A disk may fail or other hardware may fail.
    5. Human error can result in deletion of critical data.
  • In any of these situations, data in the database may become inconsistent or lost.
  • For example, if a transaction has completed 30 out of 40 scheduled writes to the database when the DBMS crashes, then the database may be in an inconsistent state as only part of the transaction’s work was completed.
  • Database Recovery is the process of restoring the database and the data to a consistent state. This may include restoring lost data up to the point of the event (e.g. system crash).
  • Two approaches are discussed here: Manual Reprocessing and Automated Recovery.

Manual Reprocessing

  • In a Manual Reprocessing recovery approach, the database is periodically backed up (a database save) and all transactions applied since the last save are recorded
  • If the system crashes, the latest database backup set is restored and all of the transactions are re-applied (by users) to bring the database back up to the point just before the crash.
  • Several shortcomings to the Manual Reprocessing approach:
    1. Time required to re-apply transactions
    2. Transactions might have other (physical) consequences
    3. Re-applying concurrent transactions in the same original sequence may not be possible.

Automated Recovery with Rollback / Rollforward

  • As with the manual recovery approach we also make periodic backups of the database (time consuming operation).
  • In the Automated Recovery approach, we introduce a Log file – this is a file separate from the data that records all of the changes made to the database by transactions.
    Also referred to as a Journal.
  • This transaction log Includes information helpful to the recovery process such as: A transaction identifier, the date and time, the user running the transaction, before images and after images
  • Before Image: A copy of the table record (or data item)  before it was changed by the transaction.
  • After Image: A copy of the table record (or data item)  after it was changed by the transaction.
  • Rollback: Undo any partially completed transactions (ones in progress when the crash occurred) by applying the before images to the database.
  • Rollforward: Redo the transactions by applying the after images to the database. This is done for transactions that were committed before the crash.
  • The Automated Recovery process uses both rollback and rollforward to restore the database.
  • In the worst case, we would need to rollback to the last database backup point and then rollforward to the point just before the crash.
  • Checkpoints can also be taken (less time consuming) in between database saves.
    • The DBMS flushes all pending transactions and writes all data to disk and transaction log.
    • Database can be recovered from the last checkpoint in much less time.

Recovery Example

  • Consider our prior concurrency control example
  • Assume we have a physical backup of the data taken just before Transaction A begins.
    The backup contains the following data: Amy = 45 Bill = 38 Carl = 51 R_R = .05
  • Now the following transactions are running in the database

Time Operation TA TB TC R_R Amy Bill Carl 1 Begin Transaction TA Begin .05 45 38 51 2 Begin Transaction TB Begin .05 45 38 51 3 Begin Transaction TC Begin .05 45 38 51 4 TA Shared Lock R_R SL(R_R) .05 45 38 51 5 TA Read R_R Read(R_R) .05 45 38 51 6 TB Shared Lock R_R SL(R_R) .05 45 38 51 7 TC Excl. Lock R_R (wait) .05 45 38 51 8 TA Excl. Lock Amy XL(Amy) (wait) .05 45 38 51 9 TA Read Amy Read(Amy) (wait) .05 45 38 51 10 TB Read R_R Read(R_R) (wait) .05 45 38 51 11 TB Excl. Lock Bill XL(Bill) (wait) .05 45 38 51 12 TB Read Bill Read(Bill) (wait) .05 45 38 51 13 TB Write Bill Write(Bill) (wait) .05 45 39.9 51 14 TA Write Amy Write(Amy) (wait) .05 47.3 39.9 51 15 TA Release Lock Amy Release(Amy) (wait) .05 47.3 39.9 51 16 TB Release Lock R_R Release(R_R) (wait) .05 47.3 39.9 51 17 TB Release Lock Bill Release(Bill) (wait) .05 47.3 39.9 51 18 TB Commit Commit (wait) .05 47.3 39.9 51 19 TA Release Lock R_R Release(R_R) (wait) .05 47.3 39.9 51 20 TC Excl. Lock Granted XL(R_R) .05 47.3 39.9 51 21 TC Read R_R Read(R_R) .05 47.3 39.9 51 22 TC Write R_R Write(R_R) .03 47.3 39.9 51 SYSTEM CRASH

  • Log file (Journal) at the time of the system crash has the following entries:

    Transaction A Begin Transaction B Begin Transaction C Begin Transaction B Write: Bill Before: 38 After: 39.9 Transaction A Write: Amy Before: 45 After: 47.7 Transaction B Commit Transaction C Write: R_R Before .05 After: .03

  • IF the system crash was due to a process failure, the DBMS recovery mechanism would ROLL BACK all non-committed transactions (Transactions A and C in this case)
  • If the system crash was due to a media failure, the DBMS recovery mechanism would first restore the most recent data backup, and then ROLL FORWARD all of the committed transactions it sees in the log (Transaction B only in this case).