sauravomar: January 2019

Tuesday, 29 January 2019

G1 Garbage Collector in Java

G1 was introduced in Java 7. Since Java 9, the Oracle HotSpot JVM uses G1 as its default garbage collector.

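One of its useful properties is that you can configure a target for the maximum pause time using the flag:

-XX:MaxGCPauseMillis=n

where n is in milliseconds. This is a goal that G1 tries to meet, not a hard guarantee.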
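To check which collector a JVM is actually running with, a small program like the sketch below can help. The class name GcInfo is made up for this example, and the pause target of 100 ms in the launch command is only an illustrative value: java -XX:+UseG1GC -XX:MaxGCPauseMillis=100 GcInfo

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the JDK.
public class GcInfo {

    /** Returns the names of the collectors reported by the running JVM. */
    public static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Print each collector with its collection count and accumulated time.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time(ms)=" + gc.getCollectionTime());
        }
    }
}
```

When G1 is active, the reported names normally start with "G1"; the exact names depend on the JDK version.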

Many real-world studies show that most objects (around 90%, although it depends on the application) are garbage collected in the young generation, during their first minor GC. Objects that survive a couple of minor GCs are promoted to the old generation (old objects), and more than 95% of the time they continue to survive.

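Explanation of the G1 Garbage Collector:

It does most of its work concurrently.
It uses non-contiguous memory regions, which enables G1 to deal with very large heaps efficiently.
Instead of dividing the heap into 3 contiguous spaces (eden, survivor, old) like other garbage collectors such as CMS (Concurrent Mark Sweep) or Parallel, it divides heap memory into small chunks called regions. These regions are of a fixed size, a power of two starting at 1 MB that the JVM derives from the heap size (about 2 MB for a heap of a few gigabytes). The size can also be set with -XX:G1HeapRegionSize. Each region plays one of the following roles:

U: Unassigned, O: Old, S: Survivor, E: Eden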

Splitting the heap into small regions lets G1 do its work incrementally and finish each collection quickly.
While running GC on Eden regions, all surviving objects are copied to an unassigned region, which then becomes a survivor region.
If all the objects in an Eden region are garbage, the region can simply be declared Unassigned.
G1 does not run on the whole heap at once like other garbage collectors. Instead, it selects the regions that contain the most garbage (hence the name Garbage First), which minimizes the amount of work needed to free heap space.
During a marking cycle, G1 stops the application briefly at the beginning of the GC for bootstrapping; this phase is called Initial Mark.
While the application is executing, G1 follows all references and marks live objects; this phase is called Concurrent Mark.
When the Concurrent Mark phase is done, the application stops again for a final cleanup; this phase is called Final Mark (Remark).
Live objects are then moved and heap memory is reclaimed; this phase is fast and is called the Evacuation phase.
G1 is not a good fit for small heaps: a full GC might be performed, which can slow down overall execution. In that case, increase the heap size or use another garbage collector.
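The idea of collecting the most profitable regions first, within a pause budget, can be sketched with a toy model. This is not HotSpot's actual algorithm: the class ToyCollectionSet, the Region fields and the cost estimate (live bytes divided by an assumed copy rate) are all invented for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy model only; names and the cost formula are assumptions.
public class ToyCollectionSet {

    /** A heap region in this toy model. */
    public static final class Region {
        final int id;
        final long liveBytes;    // bytes that would have to be copied out
        final long garbageBytes; // bytes reclaimed if the region is collected

        public Region(int id, long liveBytes, long garbageBytes) {
            this.id = id;
            this.liveBytes = liveBytes;
            this.garbageBytes = garbageBytes;
        }
    }

    /**
     * Visits regions from most garbage to least and picks every region whose
     * estimated copy cost still fits into the remaining pause budget.
     */
    public static List<Integer> choose(List<Region> regions, long pauseBudgetMs, long bytesCopiedPerMs) {
        List<Region> sorted = new ArrayList<>(regions);
        sorted.sort(Comparator.comparingLong((Region r) -> r.garbageBytes).reversed());

        List<Integer> chosen = new ArrayList<>();
        long costMs = 0;
        for (Region r : sorted) {
            long regionCostMs = r.liveBytes / bytesCopiedPerMs;
            if (costMs + regionCostMs > pauseBudgetMs) {
                continue; // too expensive for this pause, try a cheaper region
            }
            costMs += regionCostMs;
            chosen.add(r.id);
        }
        return chosen;
    }
}
```

A region that is entirely garbage costs nothing to evacuate in this model, which matches the point above that such a region can simply be declared Unassigned.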

Many more properties and optimizations can be used with the G1 GC; they will be covered in an upcoming post.

Wednesday, 23 January 2019

Adder and Accumulator in Java 8

Java 8 introduced lots of improvements, such as StampedLock, parallel sorting, LongAdder and DoubleAdder, and LongAccumulator and DoubleAccumulator.

LongAdder and LongAccumulator are found in the java.util.concurrent.atomic package.

LongAdder and LongAccumulator are recommended instead of the Atomic classes when multiple threads update a value frequently and read it less frequently. They are designed so that they can grow dynamically under high contention.

Atomic classes (such as AtomicLong or AtomicInteger) internally use a single volatile variable, so every update from every thread has to go through the same memory location, which costs many CPU cycles. Under heavy contention a lot of CPU cycles are wasted.

So LongAdder and LongAccumulator are designed to spread updates across several internal values, so that different threads usually update different values, and to combine them only at the end. Internally they use an array of cell objects, which can grow on demand, to store the values. The more threads contend while calling increment(), the longer the array becomes. Each cell in the array can be updated separately.
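To make the cell idea concrete, here is a much simplified sketch of a striped counter. It is not the JDK implementation: the class StripedCounter is invented for this example, the number of cells is fixed instead of growing on demand, and the cells sit next to each other in memory, whereas the real classes take extra care to keep cells on separate cache lines.

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Simplified illustration of the cell idea; not how LongAdder is written.
public class StripedCounter {
    private final AtomicLongArray cells;

    public StripedCounter(int cellCount) {
        cells = new AtomicLongArray(cellCount);
    }

    /** Adds 1 to the cell selected by the calling thread's hash code. */
    public void increment() {
        int index = Math.floorMod(Thread.currentThread().hashCode(), cells.length());
        cells.incrementAndGet(index);
    }

    /** Sums all cells; the result is not an atomic snapshot under concurrent updates. */
    public long sum() {
        long total = 0;
        for (int i = 0; i < cells.length(); i++) {
            total += cells.get(i);
        }
        return total;
    }
}
```

Because threads mostly land on different cells, they rarely fight over the same memory location, and the price is paid only when sum() walks the array.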

The code below shows how you can use LongAdder to calculate the sum of several values:

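import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

LongAdder counter = new LongAdder();

ExecutorService executorService = Executors.newFixedThreadPool(4);

// each task adds 1 to the counter
Runnable incrementTask = () -> {
  counter.increment();
};

for (int i = 0; i < 4; i++) {
  executorService.execute(incrementTask);
}

// stop accepting tasks and wait for the submitted ones to finish
executorService.shutdown();
executorService.awaitTermination(1, TimeUnit.MINUTES); // may throw InterruptedException

// get the current sum
long sum = counter.sum();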

The result of the counter in the LongAdder is not available until we call the sum() method. This method iterates over the cell array and sums up all the values.

The Adder classes are used to sum up values, whereas the Accumulator classes are given a function to combine values. The function should be associative and commutative, because the order in which it is applied is not guaranteed.

The code below shows how you can use LongAccumulator to calculate the sum of several values:

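import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAccumulator;

// Long::sum combines values, 0 is the identity (initial value)
LongAccumulator acc = new LongAccumulator(Long::sum, 0);

ExecutorService executorService = Executors.newFixedThreadPool(4);

// each task adds 1 to the accumulated value
Runnable incrementTask = () -> {
  acc.accumulate(1);
};

for (int i = 0; i < 4; i++) {
  executorService.execute(incrementTask);
}

// stop accepting tasks and wait for the submitted ones to finish
executorService.shutdown();
executorService.awaitTermination(1, TimeUnit.MINUTES); // may throw InterruptedException

// get the current sum
long sum = acc.get();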

Here we have passed the sum function of the Long class; each call to accumulate() applies it to the current value and the supplied argument.
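An accumulator is not limited to sums. As a small sketch, the same class can track a running maximum by passing Long::max with Long.MIN_VALUE as the identity. The class name MaxTracker and the use of a parallel stream to produce concurrent updates are choices made only for this example.

```java
import java.util.concurrent.atomic.LongAccumulator;
import java.util.stream.LongStream;

// Hypothetical example class showing a non-sum accumulator function.
public class MaxTracker {

    /** Returns the largest of the given values, accumulated from parallel threads. */
    public static long maxOf(long... values) {
        // Long.MIN_VALUE is the identity for max: max(MIN_VALUE, x) == x
        LongAccumulator max = new LongAccumulator(Long::max, Long.MIN_VALUE);
        LongStream.of(values).parallel().forEach(max::accumulate);
        return max.get();
    }

    public static void main(String[] args) {
        System.out.println(maxOf(3, 42, 7, 19)); // prints 42
    }
}
```

Max is associative and commutative, so the result does not depend on the order in which the threads call accumulate().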

These classes are very clever additions to Java 8: they save a lot of CPU cycles under contention and increase the overall speed of execution of the process.

Wednesday, 16 January 2019

Zero Copy in Linux

Most people have already heard of Zero-Copy in Linux, but far fewer understand how it works, because underneath it relies on some operating system concepts.

Let's first understand how sending data from a file to the network works.

The traditional path involves the following steps:

The read system call causes a context switch from user mode to kernel mode. The DMA engine reads the file contents from the disk and stores them in a kernel address space buffer.
The data is copied from the kernel buffer into the user buffer, and the read system call returns, causing a context switch from kernel mode back to user mode.
To write the data to the socket, it is copied again from the user context to the kernel context (the socket buffer) and then sent to the network interface.
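In application code, the steps above correspond to the familiar read/write loop. The sketch below is a minimal Java version; the class name TraditionalCopy and the 8192-byte buffer size are arbitrary choices. The comments describe what happens when the input is a file and the output is a socket.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Minimal sketch of the traditional copy path through a user-space buffer.
public class TraditionalCopy {

    /** Copies everything from in to out and returns the number of bytes copied. */
    public static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192]; // user-space buffer (size is arbitrary)
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) { // kernel buffer -> user buffer
            out.write(buffer, 0, n);          // user buffer -> kernel socket buffer
            total += n;
        }
        return total;
    }
}
```

Every pass through the loop moves the same bytes into the application and straight back out again, without the application ever looking at them.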

As we can see, it is redundant to copy the data from the kernel context to the application context and back. Using Zero Copy, the data stays in the kernel context: it goes from one kernel buffer to another without passing through the application.

So with Zero Copy we can bypass user space entirely using the sendfile system call, which copies the data directly from the kernel buffer to the socket buffer. This turns out to be an important optimization that saves a lot of CPU cycles and memory bandwidth.
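From Java, the usual way to ask for this behaviour is FileChannel.transferTo, whose documentation notes that many operating systems can transfer bytes directly from the filesystem cache to the target channel without actually copying them. The sketch below assumes a blocking target channel; the class name ZeroCopySend is made up for this example, and whether sendfile is really used underneath depends on the platform and on the kind of target channel.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper; assumes the target channel is in blocking mode.
public class ZeroCopySend {

    /** Sends the whole file to the target channel and returns the number of bytes sent. */
    public static long send(Path file, WritableByteChannel target) throws IOException {
        try (FileChannel source = FileChannel.open(file, StandardOpenOption.READ)) {
            long position = 0;
            long size = source.size();
            // transferTo may move fewer bytes than requested, so keep going until done
            while (position < size) {
                position += source.transferTo(position, size - position, target);
            }
            return position;
        }
    }
}
```

In a real server the target would typically be a SocketChannel; no byte array appears anywhere in the application code.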

Apache Kafka uses Zero Copy for fast data transfers.
