Object Replication
XSTM's transactional memory implementation makes each transaction keep track of its reads and writes. When a transaction commits, the reads are used to verify that the transaction is still valid and the writes are used to update the objects. These reads and writes can also be sent over a network and used on remote objects. If the remote objects are identical to the local objects, the writes can be applied in the same way on all instances and will have exactly the same effect on each machine. This gives a clean and efficient way to replicate updates to objects.
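The read and write tracking described above can be sketched as follows. This is a minimal illustration under stated assumptions, not XSTM's implementation: the class names Replica and TxLog, the per-field version counters, and all method names are hypothetical.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical replica: field values plus a version counter per field. */
class Replica {
    final Map<String, Object> values = new HashMap<>();
    final Map<String, Integer> versions = new HashMap<>();

    int versionOf(String field) {
        return versions.getOrDefault(field, 0);
    }
}

/** Hypothetical transaction log: versions seen by reads, values set by writes. */
class TxLog {
    final Map<String, Integer> reads = new LinkedHashMap<>();
    final Map<String, Object> writes = new LinkedHashMap<>();

    Object read(Replica replica, String field) {
        if (writes.containsKey(field)) {
            return writes.get(field); // the transaction sees its own write
        }
        reads.putIfAbsent(field, replica.versionOf(field)); // remember the version observed
        return replica.values.get(field);
    }

    void write(String field, Object value) {
        writes.put(field, value);
    }

    /** Valid if every field that was read still has the version that was observed. */
    boolean isValidOn(Replica replica) {
        for (Map.Entry<String, Integer> read : reads.entrySet()) {
            if (replica.versionOf(read.getKey()) != read.getValue()) {
                return false;
            }
        }
        return true;
    }

    /** Applies the writes; the same log has the same effect on identical replicas. */
    void applyTo(Replica replica) {
        for (Map.Entry<String, Object> write : writes.entrySet()) {
            replica.values.put(write.getKey(), write.getValue());
            replica.versions.merge(write.getKey(), 1, Integer::sum);
        }
    }
}
```

In this sketch, applying the same TxLog to two replicas that start in the same state leaves them in the same state, which is the property replication relies on.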
Our model is based on the notion of Object Shares. As with a file share, when an object is in a share, modifications made to it are replicated to all machines where the share is opened. When a new machine connects to a share, it retrieves all current objects and then stays synchronized with the other machines until it disconnects. The share itself is a transactional object, so if a transaction that tries to add or remove an object fails, the share is not modified and remains consistent across all machines.

Shares can be connected in cluster or client-server architectures. In a cluster, when a transaction commits, its reads and writes are broadcast to all other machines. One of the machines is automatically designated coordinator. It verifies that the transaction is valid and broadcasts the result. If the transaction is valid, all machines commit it; otherwise they discard it. In client-server mode, each client sends its reads and writes to the server, which replies whether the transaction could be committed. On success, the server also sends the writes to the other clients.
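The client-server exchange can be sketched in a single process as follows. ShareClient, ShareServer, submit and the version counters are hypothetical names chosen for illustration. The sketch leaves out networking and the initial retrieval of objects when a client connects.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical client replica that receives writes forwarded by the server. */
class ShareClient {
    final Map<String, Object> objects = new HashMap<>();

    void receive(Map<String, Object> writes) {
        objects.putAll(writes);
    }
}

/** Hypothetical server: validates reads, then forwards writes to the other clients. */
class ShareServer {
    private final Map<String, Integer> versions = new HashMap<>();
    private final List<ShareClient> clients = new ArrayList<>();

    synchronized void connect(ShareClient client) {
        clients.add(client);
    }

    synchronized int version(String key) {
        return versions.getOrDefault(key, 0);
    }

    /** Returns true if the transaction could be committed. */
    synchronized boolean submit(ShareClient sender,
                                Map<String, Integer> reads,
                                Map<String, Object> writes) {
        // Reject the transaction if anything it read has changed since.
        for (Map.Entry<String, Integer> read : reads.entrySet()) {
            if (version(read.getKey()) != read.getValue()) {
                return false;
            }
        }
        // Commit: bump versions, then send the writes to every other client.
        for (String key : writes.keySet()) {
            versions.merge(key, 1, Integer::sum);
        }
        for (ShareClient client : clients) {
            if (client != sender) {
                client.receive(writes);
            }
        }
        return true;
    }
}
```

The sender is expected to apply its own writes locally once submit returns true. A client whose reads are out of date gets false and has to discard its transaction.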
Objects Are Read-only
When using XSTM, it helps to think of replicated objects as read-only by default, and of a distributed application as a static cloud of objects spanning the machines. Any modification requires a transaction and, ideally, when a transaction commits the cloud is updated instantly and atomically. The cloud then returns to its static state.
Avoiding Conflicts
As described for transactional memory, a transaction that reads data can become invalid: if another transaction modifies the same data at the same time, the first one is aborted. If your application is likely to read and write the same objects concurrently, you can improve its performance by reducing the rate of conflicts.
XSTM provides transacted implementations of common Java collections. An easy way to reduce conflicts when using them is to call the methods that have no return value. For example, instead of using the usual "add" method to add an element to a Set, prefer the "addFast" method that we have added to TransactedSet. It does not return the boolean that tells whether the element was already in the set. This removes a read from the current transaction, so it will not be aborted if another transaction adds or removes the same element concurrently. By removing this read, you tell XSTM that your transaction does not need to know whether the element was in the set, and you reduce conflicts.
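The difference between the two calls can be illustrated with a small stand-in. SketchTx and SketchSet below are hypothetical classes written for this example. They are not TransactedSet, and only mimic the idea that "add" records a read while "addFast" does not.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical transaction that only tracks which elements it read and added. */
class SketchTx {
    final Set<Object> readElements = new HashSet<>();
    final Set<Object> addedElements = new HashSet<>();

    /** Conflicts only if a concurrent transaction changed something this one read. */
    boolean conflictsWith(Set<Object> changedByOthers) {
        for (Object element : changedByOthers) {
            if (readElements.contains(element)) {
                return true;
            }
        }
        return false;
    }
}

/** Hypothetical set with a reading add and a blind addFast. */
class SketchSet {
    private final Set<Object> committed = new HashSet<>();

    /** Must look the element up to compute its result, so it records a read. */
    boolean add(SketchTx tx, Object element) {
        tx.readElements.add(element);
        boolean isNew = !committed.contains(element) && !tx.addedElements.contains(element);
        tx.addedElements.add(element);
        return isNew;
    }

    /** Blind write: nothing is looked up, so no read is recorded. */
    void addFast(SketchTx tx, Object element) {
        tx.addedElements.add(element);
    }
}
```

If another transaction adds or removes the same element concurrently, the transaction that called add is in conflict, while the one that called addFast is not.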
Automatic Transactions
When you modify a replicated object without starting a transaction, XSTM will start one for you. For example, say you generated a User class with a field Name and you call setName("test") on an instance of it. The generated setName method contains a check which starts a transaction if there is not one already attached to the current thread. This transaction is committed at the end of the setName method so that the "test" string can be propagated to other machines. The transaction has not read any field, so it has no reason to be aborted. The new value of the field simply overrides the previous one, as you would expect for a setter.
If you need to update several fields of an object, it is preferable for performance to start a transaction yourself and do all the updates in its context. This way, XSTM has only one transaction to validate, and it can propagate all the changes over the network at once.
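The check inside a generated setter, and the effect of grouping updates, can be sketched like this. AutoTx and SketchUser are hypothetical stand-ins written for this example, not generated XSTM code, and the email field is invented to give a second field to update. The commit counter only makes the number of commits visible.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical minimal transaction holder; not the XSTM API. */
class AutoTx {
    static final ThreadLocal<AutoTx> CURRENT = new ThreadLocal<>();
    static final AtomicInteger COMMITS = new AtomicInteger();

    static AutoTx start() {
        AutoTx tx = new AutoTx();
        CURRENT.set(tx); // attach the transaction to the current thread
        return tx;
    }

    void commit() {
        COMMITS.incrementAndGet(); // a real commit would validate and propagate here
        CURRENT.remove();
    }
}

/** Hypothetical stand-in for a generated class with transactional setters. */
class SketchUser {
    private String name;
    private String email; // invented second field, for illustration only

    void setName(String value) {
        // Start a transaction only if none is attached to the current thread.
        AutoTx own = AutoTx.CURRENT.get() == null ? AutoTx.start() : null;
        name = value;
        if (own != null) {
            own.commit(); // automatic transaction: committed at the end of the setter
        }
    }

    void setEmail(String value) {
        AutoTx own = AutoTx.CURRENT.get() == null ? AutoTx.start() : null;
        email = value;
        if (own != null) {
            own.commit();
        }
    }

    String getName() { return name; }
    String getEmail() { return email; }
}
```

Calling both setters with no transaction attached produces two commits in this sketch. Wrapping them between AutoTx.start() and commit() produces one.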
Some methods do not behave exactly the same way when called in the context of a transaction as when they have to start their own. For example, the "add" method on TransactedSet from our previous example returns a boolean indicating whether the object was already present in the set. If the method is not called in the context of a transaction, it starts one to add the object and commits it. Whether the object was already in the set is known only once the commit has finished. This might take some time, in particular if the object is shared with remote machines, as the commit might have to be acknowledged by a remote coordinator. In the meantime, another transaction can add or remove the object from the set. It would be too slow to wait for the commit to complete on each method call, so the method returns immediately, without knowing whether the object is in the set. This is done by committing the transaction asynchronously and returning the default value false. That is why the current implementation of "add" always returns false if it is not called in the context of a transaction. All methods that return a value as a side effect of updating an object behave this way and return default values like false or null if you do not start a transaction first.
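The asynchronous commit with a default return value can be sketched with a standard CompletableFuture. AsyncAddSketch and its methods are hypothetical and only mimic the behavior described above. XSTM's actual mechanism may differ.

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical set whose add, outside a transaction, commits in the background. */
class AsyncAddSketch {
    private final Set<Object> committed = ConcurrentHashMap.newKeySet();

    /** Result of the most recent background commit. */
    volatile CompletableFuture<Boolean> lastCommit = CompletableFuture.completedFuture(false);

    /** Outside a transaction: commit asynchronously and return the default value. */
    boolean addOutsideTransaction(Object element) {
        lastCommit = CompletableFuture.supplyAsync(() -> committed.add(element));
        return false; // the real answer is only known once the commit has finished
    }

    boolean contains(Object element) {
        return committed.contains(element);
    }
}
```

The caller gets false immediately. The actual result of the insertion only becomes available later, through lastCommit in this sketch. Code that needs the returned value should start a transaction before calling the method.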
Chaining Transactions
For performance reasons, it is necessary to manage dependencies between transactions. When a client updates an object, there is a delay before this update is propagated. If another transaction tries to read the object during the delay, it will be invalidated because the object will have changed by the time it tries to commit. In most cases it would be more efficient to assume that the previous transaction will succeed and to use its modified version of the object.
Dependency management allows this. Thanks to it, an application can write continuously to an object without waiting for previous writes to be propagated. When a transaction starts, it stores the list of transactions waiting to be propagated. Each time it reads an object, it walks this list to see whether the object has been modified, and uses its latest version. The transaction then becomes dependent on the one that made the last update. If the latter is discarded, the object version used by our transaction no longer exists, and our transaction must be discarded as well.
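The dependency walk can be sketched as follows. PendingTx and ChainedTx are hypothetical classes written for this illustration. The sketch assumes the list of pending transactions is ordered oldest first, and does not reflect XSTM's internal data structures.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical transaction that is waiting to be propagated. */
class PendingTx {
    final Map<String, Object> writes = new HashMap<>();
    final List<PendingTx> dependencies = new ArrayList<>();
    boolean discarded;

    /** Must be discarded if it, or anything it depends on, was discarded. */
    boolean mustBeDiscarded() {
        if (discarded) {
            return true;
        }
        for (PendingTx dependency : dependencies) {
            if (dependency.mustBeDiscarded()) {
                return true;
            }
        }
        return false;
    }
}

/** Hypothetical transaction that reads through the pending list captured at start. */
class ChainedTx extends PendingTx {
    private final List<PendingTx> pendingAtStart; // assumed to be ordered oldest first
    private final Map<String, Object> acknowledged;

    ChainedTx(List<PendingTx> pending, Map<String, Object> acknowledged) {
        this.pendingAtStart = new ArrayList<>(pending); // snapshot taken when starting
        this.acknowledged = acknowledged;
    }

    Object read(String field) {
        if (writes.containsKey(field)) {
            return writes.get(field);
        }
        // Walk the pending list from newest to oldest to find the last version.
        for (int i = pendingAtStart.size() - 1; i >= 0; i--) {
            PendingTx pending = pendingAtStart.get(i);
            if (pending.writes.containsKey(field)) {
                if (!dependencies.contains(pending)) {
                    dependencies.add(pending); // we now depend on this transaction
                }
                return pending.writes.get(field);
            }
        }
        return acknowledged.get(field); // no pending write: use the propagated value
    }
}
```

A ChainedTx that reads a value written by a pending transaction sees the new value right away, and mustBeDiscarded() turns true for it as soon as that pending transaction is discarded.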
Network Usage
In terms of performance, when writing to one field of one object on standard desktop PCs, our current implementation allows around 30,000 commits per second for non-distributed objects, and a maximum of about 14,000 per second for shared objects over a 100Mb/s network.
Figure: Transactions per second and bandwidth used on a 100Mb/s Ethernet network for various transaction sizes in bytes.
Link
More information is available in our free eBook, which you can get on the XSTM web site.