PV168

Transactions


  • Transactions are a way of preserving consistency in data.
  • They ensure that a change to the data is either applied completely and correctly or not at all.


  • ACID principles:
    • Atomicity
    • Consistency
    • Isolation
    • Durability

Atomicity

  • A transaction is an indivisible unit of work
  • All operations in a transaction must complete successfully for it to commit
  • If any operation fails, the entire transaction is rolled back
  • From the outside, it either happened completely or not at all

Consistency

  • Transactions must transition the system from one valid state to another
  • Prevents the system from ending up in an invalid state

Isolation

  • Concurrent transactions must not interfere with each other; at full isolation the result is as if they ran one at a time
  • Different isolation levels:
    • Read Uncommitted - dirty reads allowed
    • Read Committed - no dirty reads, non-repeatable reads allowed
    • Repeatable Read - phantom reads allowed
    • Serializable - full isolation
  • What levels are default in various databases?
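The four levels can be summarised as a small lookup table. The following is only a sketch: the enum name `IsolationLevelSketch` and its fields are made up for illustration, the allowed anomalies follow the SQL standard, and the integer constants are the ones defined on `java.sql.Connection`. Real databases may behave more strictly than the standard requires at a given level.

```java
import java.sql.Connection;

// Illustrative table: which read anomalies each SQL-standard level still permits.
enum IsolationLevelSketch {
    READ_UNCOMMITTED(Connection.TRANSACTION_READ_UNCOMMITTED, true, true, true),
    READ_COMMITTED(Connection.TRANSACTION_READ_COMMITTED, false, true, true),
    REPEATABLE_READ(Connection.TRANSACTION_REPEATABLE_READ, false, false, true),
    SERIALIZABLE(Connection.TRANSACTION_SERIALIZABLE, false, false, false);

    final int jdbcConstant;           // value accepted by Connection.setTransactionIsolation
    final boolean dirtyReads;         // may read data another transaction has not committed
    final boolean nonRepeatableReads; // re-reading a row may return a different value
    final boolean phantomReads;       // re-running a query may return additional rows

    IsolationLevelSketch(int jdbcConstant, boolean dirtyReads,
                         boolean nonRepeatableReads, boolean phantomReads) {
        this.jdbcConstant = jdbcConstant;
        this.dirtyReads = dirtyReads;
        this.nonRepeatableReads = nonRepeatableReads;
        this.phantomReads = phantomReads;
    }
}
```

With an open JDBC connection, a level would be requested with `connection.setTransactionIsolation(IsolationLevelSketch.SERIALIZABLE.jdbcConstant)`.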

Durability

  • Once a transaction is committed, it will remain so, even in the event of power loss, crashes, or errors
  • Typically achieved through the use of transaction logs and backups

Transactions

  • The executeInTransaction operation takes a Runnable as an argument.
  • A Runnable object is an executable block of code (here a lambda), similar to a main method.
  • Either all calls in the block are executed, or none of them are.
@Override
public void importData(String filePath) {
    transactionExecutor.executeInTransaction(() -> {
        // delete data
        // validate import data
        // import all data
    });
}
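How such an executor might work on top of plain JDBC can be sketched as follows. This is an assumption for illustration, not the implementation used in the course project: the class name `SketchTransactionExecutor` is hypothetical, and only the standard `java.sql.Connection` calls `setAutoCommit`, `commit`, and `rollback` are used.

```java
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical executor; the real transactionExecutor may be built differently.
class SketchTransactionExecutor {
    private final Connection connection;

    SketchTransactionExecutor(Connection connection) {
        this.connection = connection;
    }

    void executeInTransaction(Runnable operation) {
        try {
            // Stop committing after every statement, so the block forms one transaction.
            connection.setAutoCommit(false);
            try {
                operation.run();
                connection.commit();      // every call in the block succeeded
            } catch (RuntimeException | SQLException e) {
                connection.rollback();    // undo everything the block did
                throw e;
            } finally {
                connection.setAutoCommit(true);
            }
        } catch (SQLException e) {
            throw new IllegalStateException("Transaction failed", e);
        }
    }
}
```

The operation must use the same connection as the executor, otherwise its statements are not part of the transaction.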

PV168

Parallelism

Retrospective

  • Did you understand what was done in the last seminar?
  • Was it intuitively understandable?
  • Are you able to describe how it works?
  • Did you find it difficult?

Parallelism In Modern Applications

  • Approaches:
    • explicit parallelism
    • thread pools
    • functional paradigm
  • Code example

Explicit Parallelism

  • Uses synchronization primitives
    • mutex, monitors, locks, atomics
  • Low-level approach
  • Easy to introduce bugs
    • Better avoided when a higher-level approach is available

Monitor

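class Counter {
    private int value = 0;

    // synchronized method: holds the monitor of this instance for the whole call;
    // returns the current value, then increments it
    public synchronized int getValue() {
        return value++;
    }

    // synchronized block on this: uses the same monitor as getValue()
    public int getSameValue() {
        synchronized(this) {
            return value++;
        }
    }
}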

Monitor

  • The goal is to prevent concurrent access to shared data by wrapping it in a critical section.
  • Guard the same data with a single monitor, not several.
  • Each object instance has its own monitor, and so does the class itself.
    • Do not mix them when guarding the same data.
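A minimal sketch of a monitor protecting shared state: several threads increment one counter, and because every access goes through the monitor of the same instance, no increment is lost. The class name `MonitorCounterDemo` and the `run` helper are made up for this example.

```java
class MonitorCounterDemo {
    private int value = 0;   // guarded by the monitor of this instance

    synchronized void increment() {
        value++;             // read-modify-write happens inside the critical section
    }

    synchronized int get() {
        return value;
    }

    // Starts the given number of threads, waits for them, and returns the final count.
    static int run(int threads, int incrementsPerThread) {
        MonitorCounterDemo counter = new MonitorCounterDemo();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }
}
```

Removing `synchronized` from `increment()` would make the result unpredictable, since `value++` is not an atomic operation.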

Monitor

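  • Where to place synchronized?
    • in the method declaration (locks the monitor of the instance)
    • in the static method declaration (locks the monitor of the class)
    • in its own code block (locks the monitor of the given object)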
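The three placements side by side, as a sketch. The class `SynchronizedPlacements` and its members are hypothetical; each piece of data is guarded by exactly one monitor.

```java
class SynchronizedPlacements {
    private static int created = 0;   // guarded by the monitor of the class
    private int hits = 0;             // guarded by the monitor of this instance
    private final Object logLock = new Object();
    private final StringBuilder log = new StringBuilder();   // guarded by logLock

    // Static method declaration: locks SynchronizedPlacements.class.
    static synchronized int registerInstance() {
        return ++created;
    }

    // Instance method declaration: locks this.
    synchronized int hit() {
        return ++hits;
    }

    // Own code block: locks only the object named in the parentheses.
    String append(String entry) {
        synchronized (logLock) {
            log.append(entry).append(';');
            return log.toString();
        }
    }
}
```

A block with a private lock object keeps the critical section as small as needed and prevents outside code from locking on the same monitor.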

Explicit Parallelism

  • What is the difference between a mutex and a monitor?
    • A mutex controls access to the critical section.
    • A monitor is a combination of a mutex and a condition variable.
  • A condition variable is another synchronization primitive.
    • It lets threads wait for and send notifications to each other.
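The mutex and the condition variable can be seen as separate objects in `java.util.concurrent.locks`. The sketch below is a one-slot mailbox: the `ReentrantLock` plays the mutex, and two `Condition` objects carry the notifications. The class name `OneSlotMailbox` and the `demo` helper are assumptions made for this example.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class OneSlotMailbox {
    private final ReentrantLock lock = new ReentrantLock();    // the mutex
    private final Condition notEmpty = lock.newCondition();   // signalled after put
    private final Condition notFull = lock.newCondition();    // signalled after take
    private String slot;                                       // guarded by lock

    void put(String message) throws InterruptedException {
        lock.lock();
        try {
            while (slot != null) {
                notFull.await();      // releases the lock while waiting
            }
            slot = message;
            notEmpty.signal();        // wake a thread waiting in take()
        } finally {
            lock.unlock();
        }
    }

    String take() throws InterruptedException {
        lock.lock();
        try {
            while (slot == null) {
                notEmpty.await();
            }
            String message = slot;
            slot = null;
            notFull.signal();         // wake a thread waiting in put()
            return message;
        } finally {
            lock.unlock();
        }
    }

    // A producer thread sends two messages; the calling thread receives them in order.
    static String demo() {
        OneSlotMailbox box = new OneSlotMailbox();
        Thread producer = new Thread(() -> {
            try {
                box.put("hello");
                box.put("world");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        try {
            String received = box.take() + " " + box.take();
            producer.join();
            return received;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

The waits sit in `while` loops because a thread can wake up without the condition actually holding.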

Atomic numbers

  • Useful for just-a-counter cases.
  • Lower overhead
    • Monitors are quite expensive.
    • Can you say why?

Atomic numbers

class Counter {
    private AtomicInteger value = new AtomicInteger(0);
    public int getValue() {
        return value.getAndIncrement();
    }
}

Thread Pools

  • Separates thread management from the job specification
  • Either run-and-forget
  • ... or retrieve the result via Future<T>
  • The synchronization is done via a thread-safe queue
    • and via the Future<T> class when used
  • Suitable for stateful classes
    • Resembles message passing in OOP
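Both styles can be sketched with the standard `ExecutorService`. The class name `ThreadPoolSketch` and the pool size of 4 are arbitrary choices for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ThreadPoolSketch {
    // Computes 1*1 + 2*2 + ... + n*n, one pool job per term.
    static int sumOfSquares(int n) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            // Run-and-forget: no result is retrieved.
            pool.execute(() -> System.out.println("jobs submitted by " + Thread.currentThread().getName()));

            // Jobs with a result: submit returns a Future<Integer> immediately.
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                final int x = i;
                futures.add(pool.submit(() -> x * x));
            }
            int total = 0;
            for (Future<Integer> future : futures) {
                total += future.get();   // blocks until that job has finished
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();             // let the worker threads terminate
        }
    }
}
```

The caller never creates or joins a thread; it only describes jobs and collects results.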

Thread Pools

  • Sounds easy, right?

Thread Pools

  • One must guarantee that at most one operation runs on a given instance at any time.
    • Synchronization is therefore also needed between the pool threads and the stateful object.
  • This can be achieved by giving each object its own queue.
  • The queue in the Thread Pool then becomes "the queue of queues."
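One way to sketch the per-object queue is a serial executor layered over a shared pool, following the pattern shown in the `java.util.concurrent.Executor` documentation. The names `SerialQueue` and `SerialQueueDemo` are made up here. Each `SerialQueue` hands at most one of its tasks to the pool at a time, so the state it owns needs no further locking.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Per-object queue: runs its tasks one at a time on a shared pool.
class SerialQueue implements Executor {
    private final Queue<Runnable> tasks = new ArrayDeque<>();
    private final Executor pool;
    private Runnable active;   // task currently handed to the pool, if any

    SerialQueue(Executor pool) {
        this.pool = pool;
    }

    @Override
    public synchronized void execute(Runnable task) {
        tasks.add(() -> {
            try {
                task.run();
            } finally {
                scheduleNext();   // only now may the next task of this object start
            }
        });
        if (active == null) {
            scheduleNext();
        }
    }

    private synchronized void scheduleNext() {
        active = tasks.poll();
        if (active != null) {
            pool.execute(active);
        }
    }
}

class SerialQueueDemo {
    // Applies the given number of increments to unsynchronized state via one SerialQueue.
    static int run(int operations) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        SerialQueue queue = new SerialQueue(pool);
        int[] balance = {0};   // plain state, only touched by tasks of `queue`
        CountDownLatch done = new CountDownLatch(operations);
        for (int i = 0; i < operations; i++) {
            queue.execute(() -> {
                balance[0]++;
                done.countDown();
            });
        }
        try {
            done.await();
            return balance[0];
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

With many such objects sharing one pool, the pool's own queue holds the currently active task of each object, which is the "queue of queues" arrangement.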

Functional Approach

  • Avoids manual thread management
  • Useful when data are independent of each other
    • stream-based data processing
    • without side-effects
    • ➡️ not useful for stateful objects
  • Read-only accesses to stateful objects are fine
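A minimal sketch with Java parallel streams: the elements are independent, the lambda has no side effects, and the runtime decides how to split the work across threads. The class name `ParallelStreamSketch` is chosen only for this example.

```java
import java.util.stream.IntStream;

class ParallelStreamSketch {
    // Computes 1*1 + 2*2 + ... + n*n without any explicit thread handling.
    static int sumOfSquares(int n) {
        return IntStream.rangeClosed(1, n)
                .parallel()          // the only change compared to the sequential version
                .map(x -> x * x)     // pure function of its input
                .sum();
    }
}
```

If the lambda modified a shared field instead, the result would depend on thread timing, which is why this style does not suit stateful objects.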

Other

  • fibres/coroutines
  • message passing
  • processes

Fibres/Coroutines

  • Asynchrony independent of threads
  • Each fibre has its own virtual call stack
  • E.g., a server may run each transaction in a separate fibre.
    • This would not scale with a 1:1 mapping of threads to transactions.
    • Remember - OS threads are managed by the kernel much like processes, hence relatively heavyweight.

Message Passing

  • Similar to the Thread Pool approach.
  • No data sharing between classes.
  • Everything passed via messages.
  • The motivation is to prepare your environment for the next step: running the components as separate processes.
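Within a single JVM, message passing can be sketched with two `BlockingQueue` instances: the worker shares no fields with its caller, and everything travels through the queues. The class name `MessagePassingSketch` and the `STOP` sentinel are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class MessagePassingSketch {
    private static final String STOP = "<stop>";   // sentinel message that ends the worker

    // Sends each input to a worker thread and collects one reply per input, in order.
    static List<String> run(List<String> inputs) {
        BlockingQueue<String> requests = new LinkedBlockingQueue<>();
        BlockingQueue<String> replies = new LinkedBlockingQueue<>();

        Thread worker = new Thread(() -> {
            try {
                for (String m = requests.take(); !m.equals(STOP); m = requests.take()) {
                    replies.put(m.toUpperCase(Locale.ROOT));   // the "work" done per message
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();

        try {
            for (String input : inputs) {
                requests.put(input);
            }
            requests.put(STOP);
            List<String> results = new ArrayList<>();
            for (int i = 0; i < inputs.size(); i++) {
                results.add(replies.take());
            }
            worker.join();
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

Because the two sides only exchange messages, the queues could later be replaced by a network channel without changing the worker's logic.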

Processes

  • Do not create threads; run multiple processes instead
    • In case of a crash, just one instance is down.
    • Better scaling - instances can be added or removed ad hoc
  • Multiprocessor execution
    • may run in different data centers
  • The protocol defines the correctness.