Engineering LibreTexts

3.5: Before-Or-After Atomicity I - Concepts


    The mechanisms developed in the previous sections of this chapter provide atomicity in the face of failure, so that other atomic actions that take place after the failure and subsequent recovery find that an interrupted atomic action apparently either executed all of its steps or none of them. This and the next section investigate how to also provide atomicity of concurrent actions, known as before-or-after atomicity. In this development we will provide both all-or-nothing atomicity and before-or-after atomicity, so we will now be able to call the resulting atomic actions transactions.

Before-or-after atomicity requires additional mechanism because when an atomic action installs data in cell storage, that data is immediately visible to all concurrent actions. Even though the version history mechanism can hide pending changes from concurrent atomic actions, those actions can still read other variables that the first atomic action plans to change. Thus, the composite nature of a multiple-step atomic action may still be discovered by a concurrent atomic action that happens to look at the value of a variable in the midst of execution of the first atomic action. This indicates that making a composite action atomic with respect to concurrent threads—that is, making it a before-or-after action—requires further effort.

    Recall that Section 3.2.6 defined the operation of concurrent actions to be correct if every result is guaranteed to be one that could have been obtained by some purely serial application of those same actions. Thus, we are looking for techniques that guarantee production of the same result as if concurrent actions had been applied serially yet maximize the performance that can be achieved by allowing concurrency.

    In this section, we explore three successively better before-or-after atomicity schemes, where "better" means that the scheme allows more concurrency. To illustrate the concepts we return to version histories, which allow a straightforward and compelling correctness argument for each scheme. Because version histories are rarely used in practice, in the following Section 3.6 we examine a somewhat different approach: locks, which are widely used because they can provide higher performance, but for which correctness arguments are more difficult.

    Achieving Before-or-After Atomicity: Simple Serialization

    A version history assigns a unique identifier to each atomic action so that it can link tentative versions of variables to the action’s outcome record. Suppose that we require that the unique identifiers be consecutive integers, which we interpret as serial numbers, and we modify the procedure BEGIN_TRANSACTION by adding enforcement of the following simple serialization rule: each newly created transaction \(n\) must, before reading or writing any data, wait until the preceding transaction \(n-1\) has either committed or aborted. (To ensure that there is always a transaction \(n-1\), assume that the system was initialized by creating a transaction number zero with an OUTCOME record in the committed state.) Figure \(\PageIndex{1}\) shows this version of BEGIN_TRANSACTION. The scheme forces all transactions to execute in the serial order in which threads happen to invoke BEGIN_TRANSACTION. Since that order is a possible serial order of the various transactions, by definition simple serialization will produce transactions that are serialized and thus are correct before-or-after actions. Simple serialization trivially provides before-or-after atomicity, and the transaction is still all-or-nothing, so the transaction is now atomic both in the case of failure and in the presence of concurrency.

Procedure BEGIN_TRANSACTION takes no arguments. It calls the procedure NEW_OUTCOME_RECORD with a starting state of PENDING and assigns the resulting serial number to the variable id. The procedure then creates a variable previous_id and assigns it the value id - 1. The procedure waits until the outcome record state of previous_id is no longer equal to PENDING, and then returns id.

Figure \(\PageIndex{1}\): BEGIN_TRANSACTION with the simple serialization discipline to achieve before-or-after atomicity. In order that there be an id - 1 for every value of id, startup of the system must include creating a dummy transaction with id = 0 and id.outcome_record.state set to COMMITTED. Pseudocode for the procedure NEW_OUTCOME_RECORD appears in Figure \(\PageIndex{6}\).
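For concreteness, the discipline of Figure \(\PageIndex{1}\) can be sketched in executable form. The following Python rendering is an illustrative toy, not the text's pseudocode: the function names, the dictionary of outcome records, and the condition-variable wait are all assumptions standing in for the book's TICKET and outcome-record machinery.

```python
import itertools
import threading

# Illustrative sketch of the simple serialization discipline (assumed
# names, not the text's exact pseudocode).  Transaction 0 is the dummy,
# pre-committed transaction that guarantees every transaction n has a
# predecessor n - 1 to wait for.
PENDING, COMMITTED, ABORTED = "PENDING", "COMMITTED", "ABORTED"

outcome = {0: COMMITTED}          # outcome-record states, keyed by serial number
_serial = itertools.count(1)      # TICKET-like source of consecutive serial numbers
_changed = threading.Condition()  # signaled whenever any outcome record changes

def begin_transaction():
    tid = next(_serial)
    with _changed:
        outcome[tid] = PENDING
        # Simple serialization rule: before reading or writing any data,
        # wait until transaction tid - 1 has either committed or aborted.
        while outcome.get(tid - 1, PENDING) == PENDING:
            _changed.wait()
    return tid

def _decide(tid, state):
    with _changed:
        outcome[tid] = state
        _changed.notify_all()     # wake any successor waiting on this outcome

def commit(tid):
    _decide(tid, COMMITTED)

def abort(tid):
    _decide(tid, ABORTED)
```

In a single-threaded run each `begin_transaction` call returns immediately because the predecessor has already been decided; with concurrent threads, the condition wait is what serializes the transactions.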

Simple serialization provides before-or-after atomicity by being too conservative: it prevents all concurrency among transactions, even if they would not interfere with one another. Nevertheless, this approach actually has some practical value—in some applications it may be just the right thing to do, on the basis of simplicity. Concurrent threads can do much of their work in parallel because simple serialization comes into play only during those times that threads are executing transactions, which they generally would be only at the moments they are working with shared variables. If such moments are infrequent or if the actions that need before-or-after atomicity all modify the same small set of shared variables, simple serialization is likely to be just about as effective as any other scheme. In addition, by looking carefully at why it works, we can discover less conservative approaches that allow more concurrency, yet still have compelling arguments that they preserve correctness. Put another way, the remainder of our study of before-or-after atomicity techniques is fundamentally nothing but invention and analysis of increasingly effective—and increasingly complex—performance improvement measures.

    The version history provides a useful representation for this analysis. Figure \(\PageIndex{2}\) illustrates in a single figure the version histories of a banking system consisting of four accounts named A, B, C, and D, during the execution of six transactions, with serial numbers 1 through 6. The first transaction initializes all the objects to contain the value 0 and the following transactions transfer various amounts back and forth between pairs of accounts.

Transaction 1 initializes the value of all four accounts, A, B, C, and D, to 0; the outcome record state is COMMITTED. Transaction 2 transfers 10 from account B to A, leaving A with a value of +10 and B with a value of -10; the outcome record state is COMMITTED. Transaction 3 transfers 4 from C to B, leaving C with a value of -4 and B with a value of -6; the outcome record state is COMMITTED. Transaction 4 transfers 2 from D to A, but the transaction is ABORTED. Transaction 5 transfers 6 from B to C, leaving B with a value of -12 and C with a value of +2; the outcome record state is COMMITTED. Transaction 6 transfers 10 from A to B, leaving A with a value of 0 and B with a value of -2; the outcome record state is COMMITTED.

    Figure \(\PageIndex{2}\): Version history of a banking system.

    This figure provides a straightforward interpretation of why simple serialization works correctly. Consider transaction 3, which must read and write objects B and C in order to transfer funds from one to the other. The way for transaction 3 to produce results as if it ran after transaction 2 is for all of 3's input objects to have values that include all the effects of transaction 2—if transaction 2 commits, then any objects that it changed and that 3 uses should have new values; if transaction 2 aborts, then any objects it tentatively changed and 3 uses should contain the values that they had when transaction 2 started. Since in this example transaction 3 reads B and transaction 2 creates a new version of B, it is clear that for transaction 3 to produce a correct result it must wait until transaction 2 either commits or aborts. Simple serialization requires that wait, and thus ensures correctness.

    Figure \(\PageIndex{2}\) also provides some clues about how to increase concurrency. Looking at transaction 4 (the example shows that transaction 4 will ultimately abort for some reason, but suppose we are just starting transaction 4 and don’t know that yet), it is apparent that simple serialization is too strict. Transaction 4 reads values only from A and D, yet transaction 3 has no interest in either object. Thus the values of A and D will be the same whether or not transaction 3 commits, and a discipline that forces 4 to wait for 3's completion delays 4 unnecessarily. On the other hand, transaction 4 does use an object that transaction 2 modifies, so transaction 4 must wait for transaction 2 to complete. Of course, simple serialization guarantees that, since transaction 4 can't begin till transaction 3 completes and transaction 3 couldn't have started until transaction 2 completed.

    These observations suggest that there may be other, more relaxed, disciplines that can still guarantee correct results. They also suggest that any such discipline will probably involve detailed examination of exactly which objects each transaction reads and writes.

    Figure \(\PageIndex{2}\) represents the state history of the entire system in serialization order, but the slightly different representation of Figure \(\PageIndex{3}\) makes that state history more explicit. In Figure \(\PageIndex{3}\) it appears that each transaction has perversely created a new version of every object, with unchanged values in dotted boxes for those objects it did not actually change. This representation emphasizes that the vertical slot for, say, transaction 3 is in effect a reservation in the state history for every object in the system; transaction 3 has an opportunity to propose a new value for any object, if it so wishes.

    A system state history of the banking system from Figure 2 above, with objects A through D and transactions 1 through 6. Here, for every transaction (aborted or not) the value that each object would have at the end of the transaction is given explicitly. Values that have changed in that transaction are surrounded by solid boxes, while values that remained unchanged are in dotted boxes.

    Figure \(\PageIndex{3}\): System state history with unchanged values shown.

    The reason that the system state history is helpful to the discussion is that as long as we eventually end up with a state history that has the values in the boxes as shown, the actual order in real time in which individual object values are placed in those boxes is unimportant. For example, in Figure \(\PageIndex{3}\), transaction 3 could create its new version of object C before transaction 2 creates its new version of B. We don't care when things happen, as long as the result is to fill in the history with the same set of values that would result from strictly following this serial ordering. Making the actual time sequence unimportant is exactly our goal, since that allows us to put concurrent threads to work on the various transactions. There are, of course, constraints on time ordering, but they become evident by examining the state history.

    Figure \(\PageIndex{3}\) allows us to see just what time constraints must be observed in order for the system state history to record this particular sequence of transactions. In order for a transaction to generate results appropriate for its position in the sequence, it should use as its input values the latest versions of all of its inputs. If Figure \(\PageIndex{3}\) were available, transaction 4 could scan back along the histories of its inputs A and D, to the most recent solid boxes (the ones created by transactions 2 and 1, respectively) and correctly conclude that if transactions 2 and 1 have committed then transaction 4 can proceed—even if transaction 3 hasn’t gotten around to filling in values for B and C and hasn't decided whether or not it should commit. 

    This observation suggests that any transaction has enough information to ensure before-or-after atomicity with respect to other transactions if it can discover the dotted-versus-solid status of those version history boxes to its left. The observation also leads to a specific before-or-after atomicity discipline that will ensure correctness. We call this discipline mark-point.

     

    The Mark-Point Discipline

Concurrent threads that invoke READ_CURRENT_VALUE as implemented in Figure \(3.3.10\) cannot see a pending version of any variable. That observation is useful in designing a before-or-after atomicity discipline because it allows a transaction to reveal all of its results at once simply by changing the value of its OUTCOME record to COMMITTED. But in addition to that we need a way for later transactions that need to read a pending version to wait for it to become committed. The way to do that is to modify READ_CURRENT_VALUE to wait for, rather than skip over, pending versions created by transactions that are earlier in the sequential ordering (that is, they have a smaller caller_id), as implemented in lines 4–9 of Figure \(\PageIndex{4}\). Because, with concurrency, a transaction later in the ordering may create a new version of the same variable before this transaction reads it, READ_CURRENT_VALUE still skips over any versions created by transactions that have a larger caller_id. Also, as before, it may be convenient to have a READ_MY_VALUE procedure (not shown) that returns pending values previously written by the running transaction.

Procedure READ_CURRENT_VALUE takes the arguments data_id and this_transaction_id. Starting at the end of data_id's version history and repeating until the beginning, the procedure assigns the previous version of data_id to the variable v and assigns the value of v.action_id to the variable last_modifier. If last_modifier is greater than or equal to this_transaction_id, the procedure skips v. Otherwise, the procedure waits until the outcome record state of last_modifier is no longer equal to PENDING, and then checks that outcome record state. If it is COMMITTED the procedure returns v.value; else it skips v.

Figure \(\PageIndex{4}\): READ_CURRENT_VALUE for the mark-point discipline. This form of the procedure skips all versions created by transactions later than the calling transaction, and it waits for a pending version created by an earlier transaction until that earlier transaction commits or aborts.
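The skip-or-wait behavior of Figure \(\PageIndex{4}\) can be sketched as a small Python toy. The version-list layout and names here are assumptions, and the busy-wait loop stands in for a real condition wait on a concurrently running transaction.

```python
# Illustrative sketch of READ_CURRENT_VALUE under the mark-point
# discipline (assumed data layout, not the text's exact pseudocode).
PENDING, COMMITTED, ABORTED = "PENDING", "COMMITTED", "ABORTED"

outcome = {}    # transaction id -> outcome-record state

class Version:
    def __init__(self, action_id, value=None):
        self.action_id = action_id   # serial number of the creating transaction
        self.value = value

def read_current_value(history, caller_id):
    """Scan a version history from newest to oldest."""
    for v in reversed(history):
        if v.action_id >= caller_id:
            continue                     # created later in the ordering: skip it
        while outcome[v.action_id] == PENDING:
            pass                         # earlier and pending: wait, don't skip
        if outcome[v.action_id] == COMMITTED:
            return v.value
        # ABORTED: skip this version and keep scanning backward
    raise LookupError("no committed version found")
```

The two branches mirror the text: versions from later transactions are skipped outright, while a pending version from an earlier transaction forces a wait until that transaction's fate is decided.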

Adding the ability to wait for pending versions in READ_CURRENT_VALUE is the first step; to ensure correct before-or-after atomicity we also need to arrange that all variables that are needed as inputs for a transaction, but that earlier, not-yet-committed transactions plan to modify, have pending versions. To do that we call on the application programmer (for example, the programmer of the TRANSFER transaction) to do a bit of extra work: each transaction should create new, pending versions of every variable it intends to modify, and announce when it is finished doing so. Creating a pending version has the effect of marking those variables that are not ready for reading by later transactions, so we will call the point at which a transaction has created them all the mark point of the transaction. The transaction announces that it has passed its mark point by calling a procedure named MARK_POINT_ANNOUNCE, which simply sets a flag in the outcome record for that transaction.

    The mark-point discipline then is that no transaction can begin reading its inputs until the preceding transaction has reached its mark point or is no longer pending. This discipline requires that each transaction identify which data it will update. If the transaction has to modify some data objects before it can discover the identity of others that require update, it could either delay setting its mark point until it does know all of the objects it will write (which would, of course, also delay all succeeding transactions) or use the more complex discipline described in the next section.

    For example, in Figure \(\PageIndex{3}\), the boxes under newly arrived transaction 7 are all dotted; transaction 7 should begin by marking the ones that it plans to make solid. For convenience in marking, we split the WRITE_NEW_VALUE procedure of Figure \(3.3.10\) into two parts, named NEW_VERSION and WRITE_VALUE, as in Figure \(\PageIndex{5}\). Marking then consists simply of a series of calls to NEW_VERSION. When finished marking, the transaction calls MARK_POINT_ANNOUNCE. It may then go about its business, reading and writing values as appropriate to its purpose.

Procedure NEW_VERSION references data_id and this_transaction_id. If the outcome record mark state of this_transaction_id is MARKED, the procedure signals "Tried to create new version after announcing mark point!". Otherwise, the procedure appends a new version v to data_id, assigns the value NULL to v.value, and assigns the value of this_transaction_id to v.action_id. Procedure WRITE_VALUE references data_id and takes the arguments new_value and this_transaction_id. Starting at the end of data_id's version history and repeating until the beginning, the procedure checks whether v.action_id is equal to this_transaction_id; if so, it assigns new_value to v.value and returns. If no such version is found, the procedure signals "Tried to write without creating new version!".

    Figure \(\PageIndex{5}\): Mark-point discipline versions of NEW_VERSION and WRITE_VALUE.
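The two halves of Figure \(\PageIndex{5}\), including both error reports, can be sketched as follows. This is an illustrative rendering under assumed names; the dictionary-based version records are not the text's representation.

```python
# Illustrative sketch of the mark-point versions of NEW_VERSION and
# WRITE_VALUE (assumed names and data layout).
MARKED = "MARKED"

mark_state = {}   # transaction id -> None or MARKED

class ProtocolError(Exception):
    pass

def new_version(history, this_transaction_id):
    # A transaction may not mark new variables after announcing its mark point.
    if mark_state.get(this_transaction_id) == MARKED:
        raise ProtocolError("Tried to create new version after announcing mark point!")
    history.append({"action_id": this_transaction_id, "value": None})

def write_value(history, new_value, this_transaction_id):
    # Fill in the pending version this transaction created earlier.
    for v in reversed(history):
        if v["action_id"] == this_transaction_id:
            v["value"] = new_value
            return
    raise ProtocolError("Tried to write without creating new version!")
```

Both protocol errors of the text surface naturally: marking after the mark point, and writing a variable that was never marked.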

    Finally, we enforce the mark point discipline by putting a test and, depending on its outcome, a wait in BEGIN_TRANSACTION, as in Figure \(\PageIndex{6}\), so that no transaction may begin execution until the preceding transaction either reports that it has reached its mark point or is no longer PENDING. Figure \(\PageIndex{6}\) also illustrates an implementation of MARK_POINT_ANNOUNCE. No changes are needed in procedures ABORT and COMMIT as shown in Figure \(3.3.8\), so they are not repeated here.

    Procedure BEGIN_TRANSACTION from Figure 1 above is repeated. Procedure NEW_OUTCOME_RECORD takes the argument starting_state. It acquires an outcome record lock, assigns TICKET(outcome_record_sequencer) to id, allocates id.outcome_record, assigns the value of starting_state to id's outcome record state, assigns the value of NULL to id's outcome record mark-state, releases the outcome record lock, and returns id. Procedure MARK_POINT_ANNOUNCE references this_transaction_id; the procedure assigns the state MARKED to the outcome record mark-state of this_transaction_id.

    Figure \(\PageIndex{6}\): The procedures BEGIN_TRANSACTION, NEW_OUTCOME_RECORD, and MARK_POINT_ANNOUNCE for the mark-point discipline. BEGIN_TRANSACTION presumes that there is always a preceding transaction, so the system should be initialized by calling NEW_OUTCOME_RECORD to create an empty initial transaction whose starting_state is COMMITTED and immediately calling MARK_POINT_ANNOUNCE for the empty transaction.
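The bootstrapped procedures of Figure \(\PageIndex{6}\) can be sketched in Python, with a `threading.Lock` standing in for ACQUIRE/RELEASE and an `itertools.count` standing in for TICKET. The names and the busy-wait are assumptions; a real implementation would block on a condition variable instead of spinning.

```python
import itertools
import threading

# Illustrative sketch of NEW_OUTCOME_RECORD, MARK_POINT_ANNOUNCE, and
# BEGIN_TRANSACTION under the mark-point discipline (assumed names).
PENDING, COMMITTED = "PENDING", "COMMITTED"

_lock = threading.Lock()        # stands in for ACQUIRE/RELEASE
_ticket = itertools.count(0)    # stands in for TICKET(outcome_record_sequencer)
records = {}                    # id -> outcome record

def new_outcome_record(starting_state):
    # Must itself be a before-or-after action: the lock makes serial-number
    # assignment and record initialization indivisible.
    with _lock:
        tid = next(_ticket)
        records[tid] = {"state": starting_state, "mark_state": None}
    return tid

def mark_point_announce(this_transaction_id):
    records[this_transaction_id]["mark_state"] = "MARKED"

def begin_transaction():
    tid = new_outcome_record(PENDING)
    prev = records[tid - 1]
    # Mark-point discipline: wait until the preceding transaction has
    # reached its mark point or is no longer PENDING.
    while prev["state"] == PENDING and prev["mark_state"] != "MARKED":
        pass
    return tid

# System startup: an empty, committed, already-marked transaction 0, so
# that every transaction has a predecessor to examine.
_t0 = new_outcome_record(COMMITTED)
mark_point_announce(_t0)
```

Note that a successor proceeds as soon as its predecessor is marked, even though the predecessor is still PENDING; that is precisely the concurrency that simple serialization forbids.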

    Because no transaction can start until the previous transaction reaches its mark point, all transactions earlier in the serial ordering must also have passed their mark points, so every transaction earlier in the serial ordering has already created all of the versions that it ever will. Since READ_CURRENT_VALUE now waits for earlier, pending values to become committed or aborted, it will always return to its client a value that represents the final outcome of all preceding transactions. All input values to a transaction thus contain the committed result of all transactions that appear earlier in the serial ordering, just as if it had followed the simple serialization discipline. The result is thus guaranteed to be exactly the same as one produced by a serial ordering, no matter in what real time order the various transactions actually write data values into their version slots. The particular serial ordering that results from this discipline is, as in the case of the simple serialization discipline, the ordering in which the transactions were assigned serial numbers by NEW_OUTCOME_RECORD.

    There is one potential interaction between all-or-nothing atomicity and before-or-after atomicity. If pending versions survive system crashes, at restart the system must track down all PENDING transaction records and mark them ABORTED to ensure that future invokers of READ_CURRENT_VALUE do not wait for the completion of transactions that have forever disappeared.
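That restart-time sweep might look like the following sketch, assuming the outcome records survive in a dict-like durable store (the record layout is an assumption for illustration).

```python
# Hypothetical restart-time sweep: mark every transaction left PENDING
# by the crash as ABORTED, so that future invocations of
# READ_CURRENT_VALUE never wait for a vanished transaction.
PENDING, ABORTED = "PENDING", "ABORTED"

def recover(outcome_records):
    for rec in outcome_records.values():
        if rec["state"] == PENDING:
            rec["state"] = ABORTED
```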

The mark-point discipline provides before-or-after atomicity by bootstrapping from a more primitive before-or-after atomicity mechanism. As usual in bootstrapping, the idea is to reduce some general problem—here, that problem is to provide before-or-after atomicity for arbitrary application programs—to a special case that is amenable to a special-case solution—here, the special case is construction and initialization of a new outcome record. The procedure NEW_OUTCOME_RECORD in Figure \(\PageIndex{6}\) must itself be a before-or-after action because it may be invoked concurrently by several different threads and it must be careful to give out a different serial number to each of them. It must also create completely initialized outcome records, with state and mark_state set to PENDING and NULL, respectively, because a concurrent thread may immediately need to look at one of those fields. To achieve before-or-after atomicity, NEW_OUTCOME_RECORD bootstraps from the TICKET procedure to obtain the next sequential serial number, and it uses ACQUIRE and RELEASE to make its initialization steps a before-or-after action. Those procedures in turn bootstrap from still lower-level before-or-after atomicity mechanisms, so we have three layers of bootstrapping.

    We can now reprogram the funds TRANSFER procedure of Figure \(3.3.10\) to be atomic under both failure and concurrent activity, as in Figure \(\PageIndex{7}\). The major change from the earlier version is addition of lines 4 through 6, in which TRANSFER calls NEW_VERSION to mark the two variables that it intends to modify and then calls MARK_POINT_ANNOUNCE. The interesting observation about this program is that most of the work of making actions before-or-after is actually carried out in the called procedures. The only effort or thought required of the application programmer is to identify and mark, by creating new versions, the variables that the transaction will modify.

Procedure TRANSFER references debit_account, credit_account, and amount. It assigns the result of BEGIN_TRANSACTION to my_id, and calls NEW_VERSION for both debit_account, my_id and credit_account, my_id. It calls MARK_POINT_ANNOUNCE for my_id. The procedure calls READ_CURRENT_VALUE for debit_account, my_id and assigns that value to xvalue, then subtracts amount from xvalue and assigns the difference to xvalue. It writes the value of xvalue to debit_account, my_id. The procedure then calls READ_CURRENT_VALUE for credit_account, my_id and assigns that value to yvalue, then adds amount to yvalue and assigns the sum to yvalue. It writes the value of yvalue to credit_account, my_id. If xvalue is greater than 0 the procedure commits the action identified by my_id; else it aborts the action and signals that negative transfers are not allowed.

    Figure \(\PageIndex{7}\): An implementation of the funds transfer procedure that uses the mark point discipline to ensure that it is atomic both with respect to failure and with respect to concurrent activity.
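Putting the pieces together, the TRANSFER of Figure \(\PageIndex{7}\) can be exercised end to end in a single-threaded Python toy. Every name here is an assumption; because there is no concurrency in the toy, the places where a real system would wait are reduced to assertions.

```python
# Single-threaded toy of the mark-point TRANSFER (assumed names and
# account layout; waits are replaced by assertions since nothing runs
# concurrently here).
PENDING, COMMITTED, ABORTED = "PENDING", "COMMITTED", "ABORTED"

outcome = {0: {"state": COMMITTED, "marked": True}}   # dummy transaction 0
_next = [1]
accounts = {"A": [[0, 100]], "B": [[0, 50]]}          # name -> [[action_id, value], ...]

def begin_transaction():
    tid = _next[0]; _next[0] += 1
    prev = outcome[tid - 1]
    assert prev["state"] != PENDING or prev["marked"]  # a real system would wait here
    outcome[tid] = {"state": PENDING, "marked": False}
    return tid

def new_version(name, tid):
    accounts[name].append([tid, None])

def mark_point_announce(tid):
    outcome[tid]["marked"] = True

def read_current_value(name, tid):
    for action_id, value in reversed(accounts[name]):
        if action_id >= tid:
            continue
        state = outcome[action_id]["state"]
        assert state != PENDING            # a real system would wait here
        if state == COMMITTED:
            return value
    raise LookupError("no committed version")

def write_value(name, value, tid):
    for v in reversed(accounts[name]):
        if v[0] == tid:
            v[1] = value
            return

def transfer(debit_account, credit_account, amount):
    my_id = begin_transaction()
    new_version(debit_account, my_id)      # mark everything we will modify...
    new_version(credit_account, my_id)
    mark_point_announce(my_id)             # ...then announce the mark point
    xvalue = read_current_value(debit_account, my_id) - amount
    write_value(debit_account, xvalue, my_id)
    yvalue = read_current_value(credit_account, my_id) + amount
    write_value(credit_account, yvalue, my_id)
    outcome[my_id]["state"] = COMMITTED if xvalue > 0 else ABORTED
    return my_id
```

As in the text, the application code's only before-or-after obligation is the three marking lines at the top of `transfer`; everything else is carried by the called procedures.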

The delays (which under the simple serialization discipline would all be concentrated in BEGIN_TRANSACTION) are distributed under the mark-point discipline. Some delays may still occur in BEGIN_TRANSACTION, waiting for the preceding transaction to reach its mark point. But if marking is done before any other calculations, transactions are likely to reach their mark points promptly, and thus this delay should not be as great as waiting for them to commit or abort. Delays can also occur at any invocation of READ_CURRENT_VALUE, but only if there is really something that the transaction must wait for, such as an uncommitted pending version of a necessary input variable. Thus the overall delay for any given transaction should never be more than that imposed by the simple serialization discipline, and one might anticipate that it will often be less.

    A useful property of the mark-point discipline is that it never creates deadlocks. Whenever a wait occurs it is a wait for some transaction earlier in the serialization. That transaction may in turn be waiting for a still earlier transaction, but since no one ever waits for a transaction later in the ordering, progress is guaranteed. The reason is that at all times there must be some earliest pending transaction. The ordering property guarantees that this earliest pending transaction will encounter no waits for other transactions to complete, so it, at least, can make progress. When it completes, some other transaction in the ordering becomes earliest, and it now can make progress. Eventually, by this argument, every transaction will be able to make progress. This kind of reasoning about progress is a helpful element of a before-or-after atomicity discipline. In Section 3.6 of this chapter we will encounter before-or-after atomicity disciplines that are correct in the sense that they guarantee the same result as a serial ordering, but they do not guarantee progress. Such disciplines require additional mechanisms to ensure that threads do not end up deadlocked, waiting for one another forever.

    Two other minor points are worth noting. First, if transactions wait to announce their mark point until they are ready to commit or abort, the mark-point discipline reduces to the simple serialization discipline. That observation confirms that one discipline is a relaxed version of the other. Second, there are at least two opportunities in the mark-point discipline to discover and report protocol errors to clients. A transaction should never call NEW_VERSION after announcing its mark point. Similarly, WRITE_VALUE can report an error if the client tries to write a value for which a new version was never created. Both of these error-reporting opportunities are implemented in the pseudocode of Figure \(\PageIndex{5}\).

     

    Optimistic Atomicity: Read-Capture (Advanced Topic)

    Both the simple serialization and mark-point disciplines are concurrency control methods that may be described as pessimistic. That means that they presume that interference between concurrent transactions is likely, and they actively prevent any possibility of interference by imposing waits at any point where interference might occur. In doing so, they also may prevent some concurrency that would have been harmless to correctness. An alternative scheme, called optimistic concurrency control, is to presume that interference between concurrent transactions is unlikely, and allow them to proceed without waiting. Then, watch for actual interference, and if it happens take some recovery action: for example, aborting an interfering transaction and making it restart. (There is a popular tongue-in-cheek characterization of the difference: pessimistic = "ask first", optimistic = "apologize later".) The goal of optimistic concurrency control is to increase concurrency in situations where actual interference is rare.

    The system state history of Figure \(\PageIndex{3}\) suggests an opportunity to be optimistic. We could allow transactions to write values into the system state history in any order and at any time, but with the risk that some attempts to write may be met with the response "Sorry, that write would interfere with another transaction. You must abort, abandon this serialization position in the system state history, obtain a later serialization, and rerun your transaction from the beginning."

    A specific example of this approach is the read-capture discipline. Under the read-capture discipline, there is an option, but not a requirement, of advance marking. Eliminating the requirement of advance marking has the advantage that a transaction does not need to predict the identity of every object it will update—it can discover the identity of those objects as it works. Instead of advance marking, whenever a transaction calls READ_CURRENT_VALUE, that procedure makes a mark at this thread’s position in the version history of the object it read. This mark tells potential version-inserters earlier in the serial ordering but arriving later in real time that they are no longer allowed to insert—they must abort and try again, using a later serial position in the version history. Had the prospective version inserter gotten there sooner, before the reader had left its mark, the new version would have been acceptable, and the reader would have instead waited for the version inserter to commit, and taken that new value instead of the earlier one. Read-capture gives the reader the power of extending validity of a version through intervening transactions, up to the reader’s own serialization position. This view of the situation is illustrated in Figure \(\PageIndex{8}\), which has the same version history as did Figure \(\PageIndex{3}\).

    The version history from Figure 3 above is represented with high-water marks (HWM) and the read-capture discipline. Changed values of objects are explicitly shown and surrounded with solid boxes; values that would change but would cause a conflict are surrounded with dotted boxes. Between transactions 1 and 2, the HWM for accounts A and B is 2. Between transactions 2 and 3, the HWM for account B is 3. Between transactions 1 and 3, the HWM for account C is 3. A dotted box surrounds the value of +12 for account A in transaction 4, showing that this transaction would create a conflict and needs to abort. Between transactions 1 and 4, the HWM for account D is 4. For account A, the HWM is 6 between transactions 2 and 6, and 7 between transactions 6 and 7. For account B, the HWM is 5 between transactions 3 and 5, and 6 between transactions 5 and 6. For account C, the HWM is 5 between transactions 3 and 5, and for account D the HWM is 7 between transactions 4 and 7.

Figure \(\PageIndex{8}\): Version history with high-water marks and the read-capture discipline. First, transaction 6, which is running concurrently with transaction 4, reads variable A, thus extending the high-water mark of A to 6. Then, transaction 4 (which intends to transfer 2 from D to A) encounters a conflict when it tries to create a new version of A and discovers that the high-water mark of A has already been set by transaction 6, so 4 aborts and restarts as transaction 7. Transaction 7 retries transaction 4, extending the high-water marks of A and D to 7.

    The key property of read-capture is illustrated by an example in Figure \(\PageIndex{8}\). Transaction 4 was late in creating a new version of object A; by the time it tried to do the insertion, transaction 6 had already read the old value (+10) and thereby extended the validity of that old value to the beginning of transaction 6. Therefore, transaction 4 had to be aborted; it has been reincarnated to try again as transaction 7. In its new position as transaction 7, its first act is to read object D, extending the validity of its most recent committed value (zero) to the beginning of transaction 7. When it tries to read object A, it discovers that the most recent version is still uncommitted, so it must wait for transaction 6 to either commit or abort. Note that if transaction 6 should now decide to create a new version of object C, it can do so without any problem, but if it should try to create a new version of object D, it would run into a conflict with the old, now extended version of D, and it would have to abort.

    Read-capture is relatively easy to implement in a version history system. We start, as shown in Figure \(\PageIndex{9}\), by adding a new step (at line 9) to READ_CURRENT_VALUE . This new step records with each data object a high-water mark—the serial number of the highest-numbered transaction that has ever read a value from this object’s version history. The high-water mark serves as a warning to other transactions that have earlier serial numbers but are late in creating new versions. The warning is that someone later in the serial ordering has already read a version of this object from earlier in the ordering, so it is too late to create a new version now. We guarantee that the warning is heeded by adding a step to NEW_VERSION (at line 14), which checks the high-water mark for the object to be written, to see if any transaction with a higher serial number has already read the current version of the object. If not, we can create a new version without concern. But if the transaction serial number in the high-water mark is greater than this transaction’s own serial number, this transaction must abort, obtain a new, higher serial number, and start over again.

    Procedure READ_CURRENT_VALUE (data_id, value, caller_id) scans the version history of data_id backward, from the most recent version to the earliest, examining each version v in turn. If v.action_id is greater than or equal to caller_id, the procedure skips v. Otherwise it examines the outcome record state of v.action_id: if the state is PENDING, it waits for that action to commit or abort. If the state is (or becomes) COMMITTED, it records caller_id as the high-water mark of data_id and returns the value of v; otherwise it skips v and continues the scan. Procedure NEW_VERSION (data_id, caller_id) aborts the transaction and terminates the thread if caller_id is less than the high-water mark of data_id, or if caller_id is less than the action_id of the latest version of data_id. Otherwise it appends a new version v to the history of data_id, assigning v a value of 0 and an action_id of caller_id. Procedure WRITE_VALUE (data_id, new_value, caller_id) locates the version v in the history of data_id such that v.action_id = caller_id; if no such version is found, it signals "Tried to write without creating a new version." It then assigns new_value to the value of v.

    Figure \(\PageIndex{9}\): Read-capture forms of READ_CURRENT_VALUE, NEW_VERSION, and WRITE_VALUE.
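    The procedures of Figure \(\PageIndex{9}\) can be sketched in executable form. The following Python is a single-threaded illustration, not the book's code: the class and exception names are invented, outcome record states live in a plain dictionary, and the WAIT on a PENDING outcome is replaced by an error, since there is no concurrent thread to wait for. The two tests in new_version correspond to the high-water-mark check at line 14 and the version-ordering check at line 15.

```python
COMMITTED, ABORTED, PENDING = "COMMITTED", "ABORTED", "PENDING"

class ConflictAbort(Exception):
    """The calling transaction must abort and retry with a higher serial number."""

class VersionHistory:
    def __init__(self, initial_value):
        # Each version is an (action_id, value) pair; transaction 0 is a
        # fictitious committed transaction that installs the initial value.
        self.versions = [(0, initial_value)]
        self.outcome = {0: COMMITTED}    # outcome record states, by serial number
        self.high_water_mark = 0         # highest serial number that has read here

    def read_current_value(self, caller_id):
        # Scan the history backward, most recent version first.
        for action_id, value in reversed(self.versions):
            if action_id >= caller_id:
                continue                 # version too new for this caller: skip it
            state = self.outcome[action_id]
            if state == PENDING:
                # A real implementation WAITs here for commit or abort.
                raise RuntimeError(f"must wait for pending transaction {action_id}")
            if state == COMMITTED:
                # The new step at line 9: leave a warning for latecomers.
                self.high_water_mark = max(self.high_water_mark, caller_id)
                return value
            # state is ABORTED: skip this version and keep scanning.
        raise LookupError("no committed version found")

    def new_version(self, caller_id):
        if caller_id < self.high_water_mark:    # line 14: a later reader got here first
            raise ConflictAbort(f"{caller_id} lost to reader {self.high_water_mark}")
        if caller_id < self.versions[-1][0]:    # line 15: keep versions in order
            raise ConflictAbort(f"{caller_id} behind writer {self.versions[-1][0]}")
        self.outcome.setdefault(caller_id, PENDING)
        self.versions.append((caller_id, None))

    def write_value(self, caller_id, new_value):
        for i, (action_id, _) in enumerate(self.versions):
            if action_id == caller_id:
                self.versions[i] = (action_id, new_value)
                return
        raise RuntimeError("tried to write without creating a new version")

# Replaying the conflict of Figure 8 on account A:
A = VersionHistory(10)
A.read_current_value(caller_id=6)   # transaction 6 reads A; high-water mark becomes 6
try:
    A.new_version(caller_id=4)      # late transaction 4 tries to create a version...
except ConflictAbort:
    pass                            # ...and must abort, to retry as transaction 7
```

    Committing a transaction in this sketch amounts to setting its entry in `outcome` to COMMITTED after its write_value calls; only then can later readers return the new version.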

    We have removed all constraints on the real-time sequence of the constituent steps of concurrent transactions, so there is a possibility that a high-numbered transaction will create a new version of some object, and then later a low-numbered transaction will try to create a new version of the same object. Since our NEW_VERSION procedure simply tacks new versions on the end of the object history, we could end up with a history in the wrong order. The simplest way to avoid that mistake is to put an additional test in NEW_VERSION (at line 15), to ensure that every new version has a transaction serial number that is larger than the serial number of the next previous version. If not, NEW_VERSION aborts the transaction, just as if a read-capture conflict had occurred. (This test aborts only those transactions that perform conflicting blind writes, which are uncommon. If either of the conflicting transactions reads the value before writing it, the setting and testing of high_water_mark will catch and prevent the conflict.) 

    The first question one must raise about this kind of algorithm is whether or not it actually works: is the result always the same as some serial ordering of the concurrent transactions? Because the read-capture discipline permits greater concurrency than does mark-point, the correctness argument is a bit more involved. The induction part of the argument goes as follows:

    1. The WAIT for PENDING values in READ_CURRENT_VALUE ensures that if any pending transaction \(k < n\) has modified any value that is later read by transaction \(n\), transaction \(n\) will wait for transaction \(k\) to commit or abort.
    2. The setting of the high-water mark when transaction \(n\) calls READ_CURRENT_VALUE, together with the test of the high-water mark in NEW_VERSION , ensures that if any transaction \(j < n\) tries to modify any value after transaction \(n\) has read that value, transaction \(j\) will abort and not modify that value.
    3. Therefore, every value that READ_CURRENT_VALUE returns to transaction \(n\) will include the final effect of all preceding transactions \(1 \ldots n-1\).
    4. Therefore, every transaction \(n\) will act as if it serially follows transaction \(n-1\).

    Optimistic coordination disciplines such as read-capture have the possibly surprising effect that something done by a transaction later in the serial ordering can cause a transaction earlier in the ordering to abort. This effect is the price of optimism; to be a good candidate for an optimistic discipline, an application probably should not have a lot of data interference. 

    A subtlety of read-capture is that it is necessary to implement bootstrapping before-or-after atomicity in the procedure NEW_VERSION, by adding a lock and calls to ACQUIRE and RELEASE , because NEW_VERSION can now be called by two concurrent threads that happen to add new versions to the same variable at about the same time. In addition, NEW_VERSION must be careful to keep versions of the same variable in transaction order, so that the backward search performed by READ_CURRENT_VALUE works correctly.
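    Both requirements of this paragraph can be sketched with a per-object lock around the check-and-append sequence. This is a minimal Python illustration with invented names; a conflicting caller gets back False rather than having its thread terminated:

```python
import threading

class VersionedObject:
    def __init__(self):
        self.versions = [(0, None)]    # (action_id, value), kept in transaction order
        self.high_water_mark = 0
        self.lock = threading.Lock()   # the bootstrapping ACQUIRE/RELEASE lock

    def new_version(self, caller_id):
        # Two threads may call NEW_VERSION on the same object at about the same
        # time; the lock makes the test-and-append sequence a before-or-after
        # action, so versions stay in transaction order and the backward scan
        # of READ_CURRENT_VALUE continues to work.
        with self.lock:
            if caller_id < self.high_water_mark or caller_id < self.versions[-1][0]:
                return False           # conflict: the caller must abort and retry
            self.versions.append((caller_id, None))
            return True
```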

    There is one final detail, an interaction with all-or-nothing recovery. High-water marks should be stored in volatile memory, so that following a crash (which has the effect of aborting all pending transactions) the high-water marks automatically disappear and thus don’t cause unnecessary aborts.

     

    Does Anyone Actually Use Version Histories for Before-or-After Atomicity?

    The answer is yes, but the most common use is in an application not likely to be encountered by a software specialist. Legacy processor architectures typically provide a limited number of registers (the "architectural registers") in which the programmer can hold temporary results, but modern large-scale integration technology allows space on a physical chip for many more physical registers than the architecture calls for. More registers generally allow better performance, especially in multiple-issue processor designs, which execute several sequential instructions concurrently whenever possible. To allow use of the many physical registers, a register mapping scheme known as register renaming implements a version history for the architectural registers. This version history allows instructions that would interfere with each other only because of a shortage of registers to execute concurrently.

    For example, Intel Pentium processors, which are based on an x86 instruction set architecture, have only eight architectural registers. The Pentium 4 has 128 physical registers, and a register renaming scheme based on a circular reorder buffer. A reorder buffer resembles a direct hardware implementation of the procedures NEW_VERSION and WRITE_VALUE of Figure \(\PageIndex{5}\). As each instruction issues (which corresponds to BEGIN_TRANSACTION), it is assigned the next sequential slot in the reorder buffer. The slot is a map that maintains a correspondence between two numbers: the number of the architectural register that the programmer specified to hold the output value of the instruction, and the number of one of the 128 physical registers, the one that will actually hold that output value. Since machine instructions have just one output value, assigning a slot in the reorder buffer implements in a single step the effect of both NEW_OUTCOME_RECORD and NEW_VERSION. Similarly, when the instruction commits, it places its output in that physical register, thereby implementing WRITE_VALUE and COMMIT as a single step.

    Figure \(\PageIndex{10}\) illustrates register renaming with a reorder buffer. In the program sequence of that example, instruction \(n\) uses architectural register five to hold an output value that instruction \(n + 1\) will use as an input. Instruction \(n + 2\) loads architectural register five from memory. Register renaming allows there to be two (or more) versions of register five simultaneously, one version (in physical register 42) containing a value for use by instructions \(n\) and \(n + 1\) and the second version (in physical register 29) to be used by instruction \(n + 2\). The performance benefit is that instruction \(n + 2\) (and any later instructions that write into architectural register 5) can proceed concurrently with instructions \(n\) and \(n + 1\). An instruction following instruction \(n + 2\) that requires the new value in architectural register five as an input uses a hardware implementation of READ_CURRENT_VALUE to locate the most recent preceding mapping of architectural register five in the reorder buffer. In this case that most recent mapping is to physical register 29. The later instruction then stalls, waiting for instruction \(n + 2\) to write a value into physical register 29. Later instructions that reuse architectural register five for some purpose that does not require that version can proceed concurrently.

    On the left of the diagram, a section of the reorder buffer shows that instruction n uses architectural register R5 to hold an output value and maps to physical register 42. Instruction n+1 uses architectural register R4 and maps to physical register 61. Instruction n+2 uses R5 and maps to physical register 29. The right side of the diagram shows a physical register file with 128 registers, represented as a vertical box with register 0 at the top and register 127 at the bottom; arrows point from the three listed physical registers in the reorder buffer to the corresponding locations in the register file.

    Figure \(\PageIndex{10}\): Example showing how a reorder buffer maps architectural register numbers to physical register numbers. The program sequence corresponding to the three entries is:
    n         R5 ← R4 × R2           // Write a result in register five.
    n + 1     R4 ← R5 + R1           // Use result in register five.
    n + 2     R5 ← READ (117492)    // Write content of a memory cell in register five.

    Instructions \(n\) and \(n+2\) both write into register R5, so R5 has two versions, with mappings to physical registers 42 and 29, respectively. Instruction \(n+2\) can thus execute concurrently with instructions \(n\) and \(n+1\).
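    The renaming mechanism can be modeled in a few lines. This toy Python model (not any real processor's design) allocates a fresh physical register to each issuing instruction that writes an architectural register, and resolves a read by searching the buffer backward for the most recent preceding mapping, just as READ_CURRENT_VALUE searches a version history:

```python
class ReorderBuffer:
    def __init__(self, n_physical=128):
        self.free = list(range(n_physical))   # free physical register numbers
        self.slots = []                       # (arch_reg, phys_reg), in issue order

    def issue(self, arch_reg):
        """An instruction that writes arch_reg issues: create a new version."""
        phys = self.free.pop(0)
        self.slots.append((arch_reg, phys))
        return phys

    def lookup(self, arch_reg):
        """Find the most recent pending version of arch_reg, if any."""
        for a, p in reversed(self.slots):
            if a == arch_reg:
                return p
        return None   # no pending version: read the committed register file

# The three instructions of Figure 10: R5 gets two simultaneous versions.
rob = ReorderBuffer()
v1 = rob.issue(5)    # instruction n writes R5
v2 = rob.issue(4)    # instruction n + 1 writes R4
v3 = rob.issue(5)    # instruction n + 2 writes R5 again
assert rob.lookup(5) == v3 and v3 != v1   # later readers see the newest version
```

    A real reorder buffer also frees a physical register and retires its slot when the corresponding instruction commits in program order; that bookkeeping is omitted here.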

     

    Although register renaming is conceptually straightforward, the mechanisms that prevent interference when there are dependencies between instructions tend to be more intricate than either the mark-point or the read-capture discipline, so this description has been oversimplified. For more detail, the reader should consult a textbook on processor architecture: for example, Computer Architecture: A Quantitative Approach, by Hennessy and Patterson\(^*\).

    The Oracle database management system offers several before-or-after atomicity methods, one of which it calls "serializable," though the label may be a bit misleading. This method uses a before-or-after atomicity scheme that the database literature calls snapshot isolation. The idea is that when a transaction begins, the system conceptually takes a snapshot of every committed value and the transaction reads all of its inputs from that snapshot. If two concurrent transactions (which might start with the same snapshot) modify the same variable, the first one to commit wins; the system aborts the other one with a "serialization error." This scheme effectively creates a limited variant of a version history that, in certain situations, does not always ensure that concurrent transactions are correctly coordinated.
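    A toy model makes the distinction concrete. The sketch below (Python, all names invented) compares values rather than version numbers at commit time, which is a simplification of what real snapshot-isolation implementations do. Two transactions that write the same variable trigger the first-committer-wins rule, but two that start from the same snapshot and write disjoint variables both commit, producing the classic "write skew" outcome that no serial ordering could produce:

```python
class SnapshotDB:
    def __init__(self, data):
        self.data = dict(data)          # the committed state

class Txn:
    def __init__(self, db):
        self.db = db
        self.snapshot = dict(db.data)   # conceptual snapshot taken at start
        self.writes = {}                # buffered writes, invisible until commit

    def read(self, key):
        return self.writes.get(key, self.snapshot[key])

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # First committer wins: abort if any variable this transaction wrote
        # was changed by a transaction that committed after our snapshot.
        # (Comparing values, not versions, is a simplification.)
        for key in self.writes:
            if self.db.data[key] != self.snapshot[key]:
                raise RuntimeError("serialization error")
        self.db.data.update(self.writes)

# Write skew: both transactions see the snapshot {"x": 1, "y": 1}, each
# checks the constraint x + y >= 2, and each gives up a different variable.
db = SnapshotDB({"x": 1, "y": 1})
t1, t2 = Txn(db), Txn(db)
if t1.read("x") + t1.read("y") >= 2:
    t1.write("x", 0)
if t2.read("x") + t2.read("y") >= 2:
    t2.write("y", 0)
t1.commit()
t2.commit()    # disjoint write sets, so both commit: db.data is now
               # {"x": 0, "y": 0}, a state no serial order could produce
```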

    Another specialized variant implementation of version histories, known as transactional memory, is a discipline for creating atomic actions from arbitrary instruction sequences that make multiple references to primary memory. Transactional memory was first suggested in 1993, and with widespread availability of multicore processors has become the subject of quite a bit of recent research interest because it allows the application programmer to use concurrent threads without having to deal with locks. The discipline is to mark the beginning of an instruction sequence that is to be atomic with a "begin transaction" instruction, direct all ensuing STORE instructions to a hidden copy of the data that concurrent threads cannot read, and at the end of the sequence check to see that nothing read or written during the sequence was modified by some other transaction that committed first. If the check finds no such earlier modifications, the system commits the transaction by exposing the hidden copies to concurrent threads; otherwise it discards the hidden copies and the transaction aborts. Because it defers all discovery of interference to the commit point, this discipline is even more optimistic than the read-capture discipline described in Section 3.5.3 above, so it is most useful in situations where interference between concurrent threads is possible but unlikely. Transactional memory has been experimentally implemented in both hardware and software. Hardware implementations typically involve tinkering with either a cache or a reorder buffer to make it defer writing hidden copies back to primary memory until commit time, while software implementations create hidden copies of changed variables somewhere else in primary memory. As with register renaming, this description of transactional memory is somewhat oversimplified, and the interested reader should consult the literature for fuller explanations. 
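    The discipline just described can be sketched as follows. This is a toy single-threaded Python model, not a real transactional memory: invented names throughout, hidden copies kept in a per-transaction dictionary, and per-location version numbers standing in for the commit-time check that nothing read or written was modified by another transaction that committed first.

```python
class TMemory:
    def __init__(self, data):
        self.data = dict(data)
        self.version = {k: 0 for k in data}   # bumped on every committed write

class TMTxn:
    """One atomic sequence: begin at construction, validate at commit."""
    def __init__(self, mem):
        self.mem = mem
        self.seen = {}     # location -> version number observed
        self.hidden = {}   # hidden copies that concurrent threads cannot read

    def load(self, key):
        if key in self.hidden:
            return self.hidden[key]
        self.seen.setdefault(key, self.mem.version[key])
        return self.mem.data[key]

    def store(self, key, value):
        self.seen.setdefault(key, self.mem.version[key])
        self.hidden[key] = value

    def commit(self):
        # Validate: abort if any location read or written has changed since
        # this transaction first touched it.
        for key, version in self.seen.items():
            if self.mem.version[key] != version:
                self.hidden.clear()    # abort: discard the hidden copies
                return False
        for key, value in self.hidden.items():
            self.mem.data[key] = value # commit: expose the hidden copies
            self.mem.version[key] += 1
        return True
```

    With this sketch, two transactions that both increment the same location illustrate the optimism: each runs to completion, and only at commit does the second discover the interference and abort.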

    Other software implementations of version histories for before-or-after atomicity have been explored primarily in research environments. Designers of database systems usually use locks rather than version histories because there is more experience in achieving high performance with locks. Achieving before-or-after atomicity by using locks systematically is the subject of the next section of this chapter.


    \(^*\) David A. Patterson and John L. Hennessy. Computer Architecture: A Quantitative Approach. Morgan Kaufman, fourth edition, 2007. ISBN: 978–0–12–370490–0. 704 + various pages (paperback). The cover gives the authors' names in the opposite order.

    This book provides a spectacular tour-de-force that explores much of the design space of current computer architecture. One of the best features is that each area includes a discussion of misguided ideas and their pitfalls. Even though the subject matter gets sophisticated, the book is always readable. The book is opinionated (with a strong bias toward RISC architecture), but nevertheless this is a definitive work on computer organization from the system perspective.


    This page titled 3.5: Before-Or-After Atomicity I - Concepts is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Jerome H. Saltzer & M. Frans Kaashoek (MIT OpenCourseWare) .