CPSC 416: Assignment 3

ASSIGNMENT 3: Distributed Transaction Management.

DUE 10:00 PM March 25

IMPORTANT: This assignment may be done in pairs and you are strongly encouraged to do that. Late assignments are penalized at 33.333% per day, pro-rated. Late assignments are not accepted after 2 days. In addition, your program must compile without warnings or errors, and you are not permitted to change compiler options or add directives in the code to disable warnings.

In this assignment you will be implementing a distributed transaction management system using two-phase commit. You will also use vector timestamps and logging so that you can visualize the key steps in a transaction. See assignment 2 for a description of how to handle vector timestamps and use ShiViz. You are allowed to reuse your vector clock implementation from A2.

You are required to write 3 programs as part of this assignment:

The transaction manager or coordinator: tmanager
A worker: tworker
A program to issue instructions to workers: cmd

Transaction Subsystem

Recall that both the workers and the transaction manager keep log files to record their decisions. In addition, the worker records transaction information for the objects it modifies. Consequently, one of the things you will want to do is build a transaction logging system. To simplify marking, you will implement a roll-back transaction system where the "real object" is modified and the old values are recorded in the log file. The sample code illustrates how to name and create a transaction log file. Although each worker and the transaction manager will write to their own log file, they must use a common transaction subsystem implementation. (i.e. you are to have only one transaction implementation regardless of whether it is used by a worker or transaction manager.) The precise information stored in the log file is a design decision, but keep in mind that a log file could have multiple sets of transaction-related records in it and that you must also deal with the case when a worker or transaction manager crashes and then comes back to life. In such a situation you have to ensure that the vector clock starts where the other one left off. (Hint: Store the vector clock at the very start of the log file. Each time you change the vector clock, seek to the start of the file and write the clocks out, or better yet, map the start of the file into memory using mmap and then update the memory and sync it. You might also want to open the file a second time in append mode so that you write the other transaction records to the end of the file.)
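The hint about keeping the vector clock at the start of the log file can be sketched as follows. This is only a minimal illustration under stated assumptions: the struct log_header layout and the function names log_header_open and log_header_tick are hypothetical and are not taken from the provided sample code.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_NODES 10  /* coordinator plus at most 9 workers */

/* Hypothetical header layout kept at offset 0 of the log file. */
struct log_header {
    uint32_t node_ids[MAX_NODES];  /* port numbers used as node IDs */
    uint32_t clocks[MAX_NODES];    /* one logical time per node */
};

/* Open (or create) the log file and map its header into memory.
 * Returns the mapped header, or NULL on failure. */
struct log_header *log_header_open(const char *path, int *fd_out)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return NULL;

    /* Grow a new file so the mapping is backed by real storage.
     * An existing log is left alone, so the clock survives a crash. */
    off_t size = lseek(fd, 0, SEEK_END);
    if (size < 0 ||
        (size < (off_t)sizeof(struct log_header) &&
         ftruncate(fd, (off_t)sizeof(struct log_header)) != 0)) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, sizeof(struct log_header),
                   PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return NULL;
    }
    *fd_out = fd;
    return p;
}

/* Increment one clock entry and force the header to disk. */
int log_header_tick(struct log_header *h, int slot)
{
    h->clocks[slot]++;
    return msync(h, sizeof(*h), MS_SYNC);
}
```

Because the header occupies the first bytes of the file, a second descriptor opened with O_APPEND would place transaction records after it, which matches the second half of the hint.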

Transaction Manager

The transaction manager is responsible for coordinating one or more simultaneous transactions. For this assignment you may assume that there are at most 4 transactions in progress at once and that the total number of workers, across all transactions, is no more than 9. (This implies that there will be no more than 10 vector clocks. As in assignment 2 you can use the port number as the node number. As described below, the worker will have two UDP ports; you should probably use the command port as the node ID for the vector clocks.) The transaction manager takes a single argument, the port it is to listen on and send from. As in assignment 2 you are to use UDP as the transport protocol and do not have to confirm if a packet is received.
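As a reminder of how a bounded vector clock keyed by port number might look, here is a minimal sketch. The names struct vclock, vclock_tick and vclock_receive are hypothetical, and your A2 implementation may be organized quite differently. The sketch assumes a node ID of 0 marks an unused slot, which is safe because 0 is never a usable port number.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_NODES 10  /* coordinator plus at most 9 workers */

struct vclock_entry {
    uint32_t node_id;  /* port number; 0 means the slot is unused */
    uint32_t time;
};

struct vclock {
    struct vclock_entry e[MAX_NODES];  /* used slots are contiguous from index 0 */
};

/* Find the slot for node_id, claiming the first free slot if needed.
 * Returns NULL when all MAX_NODES slots belong to other nodes. */
static struct vclock_entry *vclock_slot(struct vclock *vc, uint32_t node_id)
{
    for (size_t i = 0; i < MAX_NODES; i++) {
        if (vc->e[i].node_id == node_id)
            return &vc->e[i];
        if (vc->e[i].node_id == 0) {
            vc->e[i].node_id = node_id;
            return &vc->e[i];
        }
    }
    return NULL;
}

/* Local event or send: advance this node's own entry. */
int vclock_tick(struct vclock *vc, uint32_t self)
{
    struct vclock_entry *slot = vclock_slot(vc, self);
    if (slot == NULL)
        return -1;
    slot->time++;
    return 0;
}

/* Receive: take the element-wise maximum, then advance our own entry. */
int vclock_receive(struct vclock *local, const struct vclock *remote,
                   uint32_t self)
{
    for (size_t i = 0; i < MAX_NODES && remote->e[i].node_id != 0; i++) {
        struct vclock_entry *slot = vclock_slot(local, remote->e[i].node_id);
        if (slot == NULL)
            return -1;
        if (remote->e[i].time > slot->time)
            slot->time = remote->e[i].time;
    }
    return vclock_tick(local, self);
}
```

A fixed-size array is enough here because the assignment bounds the system at 10 nodes.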

Worker

To simulate the modification of objects involved in a transaction, the worker maintains some state information that simulates two integer objects, A and B, and a string identifier. Since the objects have to be durable, this state information is stored in a file and needs to be updated each time an object's value is changed. The format of the state information can be found in the tworker.h file. The IDstring can be anything you want. The fields A and B will be updated by commands, and the vectorClock values and lastUpdateTime fields are changed every time either A or B is updated. Keep this in mind when you record information in the transaction log, as both the vectorClock and lastUpdateTime will need to revert to the values they had at the start of the transaction. The type and format of the messages exchanged between the workers and transaction manager are left as a design decision. However, if a message requires a response, you are to use a timeout value of 10 seconds. After 10 seconds, the entity waiting for a response can assume the other side has crashed and perform whatever the appropriate action is at that point. If a worker is ever in the "uncertain" state and needs to get a decision from the coordinator, it should wait 30 seconds for a decision after it has sent the vote that it is prepared to commit and then ask once every 10 seconds after that until it gets a result.

The worker program takes a single argument, the UDP port number it will listen on for the commands described below. Note that you will need to use a 2nd UDP port to communicate with the transaction manager. You should probably let the system select this port number when appropriate. There is one important thing to remember: if the worker crashes, when it comes back up it must use this same port to interact with the transaction manager. This means that the worker will need to record the port number; it also means that the transaction manager needs to keep track of the contact information for each worker. A suggestion would be to record, at the start of the transaction log file along with the vector clocks, the port number to use for the transaction manager/worker interactions. When a worker is restarted after a crash it must be restarted with the same command port number specified on the command line that it was originally started with. Note that since the workers don't maintain the contact information for the other workers, when they restart and are in the uncertain state they will have to poll the transaction manager until a response is received. This could be a long time if the manager is down.
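Letting the system pick the second port, and reusing that port after a restart, can both be handled by one helper. This is a minimal sketch; the function name open_udp_socket is hypothetical. Binding to port 0 asks the system to choose a free port, and getsockname() reports which one was chosen so that it can be recorded.

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a UDP socket bound to `port` (host byte order). Passing 0 lets
 * the system choose a free port. On success returns the socket and
 * stores the port actually bound in *bound_port; returns -1 on failure. */
int open_udp_socket(uint16_t port, uint16_t *bound_port)
{
    int sock = socket(AF_INET, SOCK_DGRAM, 0);
    if (sock < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    socklen_t len = sizeof(addr);
    if (bind(sock, (struct sockaddr *)&addr, sizeof(addr)) != 0 ||
        getsockname(sock, (struct sockaddr *)&addr, &len) != 0) {
        close(sock);
        return -1;
    }

    *bound_port = ntohs(addr.sin_port);
    return sock;
}
```

On first start a worker might call this with 0 and save *bound_port in its log file; after a crash it would pass the saved value instead so the transaction manager can still reach it.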

cmd

A big challenge with testing the transaction management system is demonstrating that it works. The purpose of the cmd program is to help simplify this task. This program interprets the arguments supplied on the command line and sends a "command" to the identified worker. The worker then performs the action or actions specified by the command. These commands can be used to simulate various types of interactions between the transaction manager and the workers. Only your worker process needs to accept and respond to these commands. You can assume that the UDP packet is delivered and acted on by the worker. There is no response message from the worker to the cmd program. The commands are as follows:

begin WORKER_HOST WORKER_PORT TX_MANAGER_HOST TX_MANAGER_PORT TID
- Send a message to the worker process at WORKER_HOST WORKER_PORT instructing it to send a begin transaction command to the transaction manager at TX_MANAGER_HOST TX_MANAGER_PORT instructing it to start a new transaction with the transaction ID TID. (Normally we would let the transaction manager select the TID, but to provide more control options we are providing the TID. If the transaction ID already exists then a failure indication is to be returned to the worker and it should be logged as an event in the ShiViz log.)
join WORKER_HOST WORKER_PORT TX_MANAGER_HOST TX_MANAGER_PORT TID
- Send a message to the worker process at WORKER_HOST WORKER_PORT instructing it to send a join transaction command to the transaction manager at TX_MANAGER_HOST TX_MANAGER_PORT instructing it to add this worker to the transaction. A success or failure indication needs to be returned from the transaction manager to the worker.
newa WORKER_HOST WORKER_PORT NEWVALUE
- Send a message to the worker process at WORKER_HOST WORKER_PORT instructing it to change the value of the A object to NEWVALUE. If a transaction is not currently underway the change is simply to be made and then synced to disk.
newb WORKER_HOST WORKER_PORT NEWVALUE
- Send a message to the worker process at WORKER_HOST WORKER_PORT instructing it to change the value of the B object to NEWVALUE. If a transaction is not currently underway the change is simply to be made and then synced to disk.
newid WORKER_HOST WORKER_PORT newIDSTR
- Send a message to the worker process at WORKER_HOST WORKER_PORT instructing it to change the value of the ID string to newIDSTR.
crash WORKER_HOST WORKER_PORT
- Send a message to the worker process at WORKER_HOST WORKER_PORT instructing it to "crash". You are to simulate a crash by calling _exit() immediately.
delay WORKER_HOST WORKER_PORT DELAY
- Send a message to the worker process at WORKER_HOST WORKER_PORT instructing it to delay all responses to the transaction manager by DELAY seconds. If the value is 0 it means respond immediately. The default behaviour is to respond immediately. If the value is negative, the worker is to wait the absolute value of the DELAY, respond, and then crash immediately after responding. If the value is -1000 then the worker is to make its decision and perform all the actions required by that decision but crash just before responding to the coordinator.
commit WORKER_HOST WORKER_PORT
- Send a message to the worker process at WORKER_HOST WORKER_PORT instructing it to commit the transaction that is in progress. You may assume that a worker is only ever working with one transaction at a time. (However, the transaction manager may be dealing with more than one transaction.)
commitcrash WORKER_HOST WORKER_PORT
- Send a message to the worker process at WORKER_HOST WORKER_PORT instructing it to send a special commit message to the coordinator such that the coordinator will crash after logging the commit decision but before responding to any workers. From the worker's perspective this is treated as a normal commit situation. You may assume that a worker is only ever working with one transaction at a time. (However, the transaction manager may be dealing with more than one transaction.)
abort WORKER_HOST WORKER_PORT
- Send a message to the worker process at WORKER_HOST WORKER_PORT instructing it to abort the transaction that is in progress. You may assume that a worker is only ever working with one transaction at a time.
abortcrash WORKER_HOST WORKER_PORT
- Send a message to the worker process at WORKER_HOST WORKER_PORT instructing it to send a special abort message to the coordinator such that the coordinator will crash after logging the abort decision but before responding to any workers. From the worker's perspective this is treated as a normal abort message to the coordinator. You may assume that a worker is only ever working with one transaction at a time. (However, the transaction manager may be dealing with more than one transaction.)
voteabort WORKER_HOST WORKER_PORT
- Send a message to the worker process at WORKER_HOST WORKER_PORT instructing it to vote abort when the coordinator sends a prepare message. By default the worker will always vote commit to a prepare message. Note that a delay command will affect the timing of the response and whether or not the abort response is actually sent.
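The DELAY value of the delay command encodes three separate behaviours, and it may help to decode it in one place. The sketch below is a hypothetical helper (struct reply_plan and plan_reply are not part of the provided code). It assumes that the special value -1000 implies no waiting before the crash, which the description above does not state explicitly.

```c
#include <stdbool.h>

#define DELAY_CRASH_BEFORE_REPLY (-1000)  /* special value from the delay command */

/* What the worker should do with its next response to the coordinator. */
struct reply_plan {
    unsigned wait_seconds;  /* how long to sleep before acting */
    bool send_reply;        /* false: crash instead of replying */
    bool crash_afterwards;  /* true: call _exit() once done */
};

/* Translate the DELAY argument into a plan. */
struct reply_plan plan_reply(int delay)
{
    struct reply_plan plan = { 0, true, false };

    if (delay == DELAY_CRASH_BEFORE_REPLY) {
        /* Decide and act on the decision, then crash without replying. */
        plan.send_reply = false;
        plan.crash_afterwards = true;
    } else if (delay < 0) {
        /* Wait |delay| seconds, reply, then crash. */
        plan.wait_seconds = 0u - (unsigned)delay;
        plan.crash_afterwards = true;
    } else {
        /* Zero or positive: wait that long, reply, keep running. */
        plan.wait_seconds = (unsigned)delay;
    }
    return plan;
}
```

With this shape, the code that answers a prepare message only needs to sleep for wait_seconds, send the vote if send_reply is set, and call _exit() if crash_afterwards is set.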

General Implementation Notes

A definition of an object store is provided in the file tworker.h. The example code in tworker.c illustrates how to create the backing store for this object and how to modify it. You are not allowed to change this structure, how the name of the file it is stored in is determined, or where in the file it is stored. Every time you change one of the 3 objects you must update the lastUpdateTime field as part of the change, along with updating the vector clock. The program dumpObject has been provided to print the contents of one of these object stores.

You will probably want to develop some scripts for testing your implementation. You should add these scripts to your repo and commit them.

ShiViz

As indicated earlier you need to log events to a ShiViz log file for debugging and visualization purposes. Each "node" is to keep its own event log file such that they can be combined and used by ShiViz. Clearly you will want to log the sending and receiving of messages between transaction manager and workers. You will probably also want the workers to log events that result in changes to the objects. Key events like the starting, stopping (crashing), and/or restarting of the worker or transaction manager are also good events to log.

As part of this assignment you are to submit two example log files that you submitted to ShiViz. (They are to be named ShiViz-Log.dat1 ShiViz-log.dat2). One of the runs will show a normal transaction completion with no errors and the other must show a situation where a worker fails at some point but the transaction commits. In addition to the coordinator there must be at least 4 workers in the system. In the file ShiViz-Report.txt provide a brief description of each scenario highlighting key events. (e.g. when something crashed, when a recovery starts, when a recovery decision is reached, messages involved in the recovery etc.) Be clear as to which file the description applies. The write-up is to be in clear, grammatically correct English (make sure to run a spell-checker on the submission) of between 200 and 300 words. To give you a sense of size, this paragraph is about 90 words.

Testing

You are required to fully test your implementation with multiple workers. It is expected that your implementation will properly handle the arbitrary failure of the coordinator and zero or more workers simultaneously. From the worker's perspective you can assume that the transaction manager will always recover. Here are some ideas on what you should consider testing for: (This is not an exhaustive list. Note: TX stands for transaction)

A properly completed transaction.
Coordinator crashes before a request to commit - TX should abort
Coordinator logs the decision to commit and then crashes
Worker asks to commit while coordinator is crashed - TX should abort
Worker crashes and recovers before TX commits - TX should abort
Worker crashes after recording prepared, but before sending its response - TX should abort

What to hand in.

All work is to be handed in via stash. Do not, under any circumstances, hand in object code, executable files, or any other form of binary file, and that includes Word or PDF documents. Make sure you hand in:

A working Makefile that will compile your program on the department's Linux machines and produce the required executable programs.
All the .c and .h files required to compile your program. Your code is to be well commented and appropriately formatted.
The coverpage.txt file. See the contents of the file for an explanation of how to hand it in.
ShiVizLog.dat[12] - Your example log files of data for visualization
ShiVizReport.txt - Your report describing the log data.

Grading

Here is a rough grading guideline. This is meant to provide you with some guidance. I do not guarantee that the final grading rubric will match this mark distribution exactly, but it should be close. Implementation refers to both the correct functionality of the code and the design.

Implementation of transaction subsystem 30%
Implementation of the coordinator 15%
Implementation of the worker 15%
Implementation of the cmd program and handling of cmds in the worker 10%
ShiViz examples 10%
Interview with TA 20%