Wednesday, December 23, 2015

Hazelcast basics and config issues

Very basics

Hazelcast is an open-source data-grid framework for Java. It makes it easy to provision distributed versions of common Java data structures and locks, and includes a simple gossip implementation.

To include it in a Maven-based project, simply add something like the following to your pom file:
<dependency>
    <groupId>com.hazelcast</groupId>
    <artifactId>hazelcast</artifactId>
    <version>3.5.4</version>
</dependency> 
Then all that's left is to create an instance. To create a simple distributed map, do the following:
Config config = new Config();
// Starting an instance joins (or forms) the cluster.
HazelcastInstance h = Hazelcast.newHazelcastInstance(config);
// Distributed maps are created lazily on first access, by name.
IMap<String, String> map = h.getMap("my-map");
Basic usage of the map is the same as a ConcurrentMap, with some extra functions mainly related to locking. In the case of maps, the hash range is split among members and data is divided based on keys. The default settings also keep a backup for each key, and require all writes to be acknowledged by the replica before the master commits the write.
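
For example, a minimal sketch of the locking extras, reusing the map from the snippet above (the key and values are just placeholders):

    // Lock the key cluster-wide while updating it (IMap locking API).
    map.lock("some-key");
    try {
        String current = map.get("some-key");
        map.put("some-key", current + "-updated");
    } finally {
        // Always release the lock, even if the update fails.
        map.unlock("some-key");
    }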

Config issues

When first trying to use Hazelcast, these are some issues I came across that could be better documented:

1. Multicast

If not specified otherwise, Hazelcast will use multicast for node discovery. While this isn't an issue for deployments on bare-metal servers, it causes problems when used on VMs, including AWS, where multicast is typically unavailable. To change service discovery to seed-based, add something like the following to the config:

    Config conf = new Config();
    JoinConfig joins = conf.getNetworkConfig().getJoin();
    joins.getMulticastConfig().setEnabled(false);
    // TCP/IP discovery is disabled by default and must be enabled explicitly;
    // seed is a host address, e.g. "10.0.0.1".
    joins.getTcpIpConfig().setEnabled(true).addMember(seed);
    HazelcastInstance instance = Hazelcast.newHazelcastInstance(conf);

Since Hazelcast uses a gossip implementation to manage nodes joining and leaving, the host passed as seed is only used when first joining the cluster; once in the cluster, members are handled by Hazelcast.
There is also a discovery mechanism specifically for AWS, but I have not tried it.

2. Listen interface

With the default config I couldn't get Hazelcast to listen on the network interface (on machines with both single and multiple interfaces). To get this working I explicitly set Hazelcast up to listen on 0.0.0.0, adding the following to the config:

    conf.getNetworkConfig()
        .getInterfaces()
        .setEnabled(true)
        .addInterface("0.0.0.0");

This will enable the interface config and set it to listen on 0.0.0.0. This can of course be changed to a specific IP to make Hazelcast listen on only one interface.

3. Shutdown hook capturing

By default Hazelcast captures SIGTERM and shuts down gracefully when possible. While this behaviour is probably preferred in most cases, my use case needed more fine-grained control, so I added the following setting:

    conf.setProperty("hazelcast.shutdownhook.enabled", "false");

When setting this, just make sure to call shutdown on the instance when shutting down, so as not to risk data loss.
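
For example, a minimal sketch of a custom hook, reusing the instance from the snippet above (the cleanup step is hypothetical):

    // Register our own shutdown hook so we control the shutdown order.
    Runtime.getRuntime().addShutdownHook(new Thread(() -> {
        // Application-specific cleanup would go here (hypothetical).
        // Then shut Hazelcast down ourselves to avoid losing data.
        instance.shutdown();
    }));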

Monday, September 7, 2015

Yarn resource request normalization

I ran into this issue last week, and it doesn't seem to be documented. This is written with the Fair Scheduler in mind, but it might be true for the Capacity Scheduler as well, since the setting itself is Yarn-global.
When a resource request is sent to Yarn, it is normalised before being processed, to make management easier. In the case of the Fair Scheduler, requests are normalised by passing them through the following algorithm:
min(((max(request, minimum_assign) + (normaliser - 1)) / normaliser) * normaliser, maximum_assign)
Since the minimum and maximum are pretty well known and mentioned on several sites, they are probably already set; however, the only place I could find anything about the normaliser was in the source itself.

The default value for the normaliser is 1024 MB, meaning that even a request for 1 MB will be normalised to 1024 MB. The same algorithm is also used for vcores, but since that normaliser is set to 1 it is not really an issue.
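
To make the algorithm concrete, here is a minimal sketch of the normalisation in plain Java (my own helper, not Yarn's actual code; all values in MB, integer arithmetic):

    static int normalize(int request, int minimum, int maximum, int increment) {
        int clamped = Math.max(request, minimum);
        // Integer division rounds the request up to the next multiple of the increment.
        int rounded = ((clamped + increment - 1) / increment) * increment;
        return Math.min(rounded, maximum);
    }
    // normalize(1, 1024, 8192, 1024)    -> 1024 (the 1 MB example above)
    // normalize(1500, 1024, 8192, 1024) -> 2048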

The normaliser is configurable via the yarn.scheduler.increment-allocation-mb setting, so it might be advisable to lower it, e.g. to 256, for smaller clusters.
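
In yarn-site.xml that would look something like this (256 being the suggestion above, not a default):

    <property>
        <name>yarn.scheduler.increment-allocation-mb</name>
        <value>256</value>
    </property>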

Sunday, August 23, 2015

Overriding Java classes

Overriding Java classes is a technique that I use a lot; however, after talking to other people, it doesn't seem to be widely known. So what do I mean by overriding Java classes? What I'm referring to is how the JVM deals with two classes that share the same package and class name: the JVM will only use the class appearing first on the classpath.
So to put this to use, one could apply patches without touching the original package, or use it during patch development for quick compiling and testing.

Using this for Hadoop development or maintenance, one method that I've used a lot is to add a patch dir to Hadoop, e.g. share/hadoop/patches, and make sure that $HADOOP_PREFIX/share/hadoop/patches/* is added first to your classpath. Doing this allows you to add jars with modified classes that override the original ones just by putting them in the patch dir.

To put this to use, let's say we are working on a patch for Yarn's ResourceManager. Every time we want to test our changes or add a new log line, instead of spending 10-20 minutes rebuilding the entire Hadoop package, we can generate a new jar containing only the modified classes and add it first on the classpath. This allows for sub-second compiles, greatly decreasing the cost of testing changes.
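
One quick way to verify that the patched class is actually the one being picked up is to print where the JVM loaded it from. A minimal sketch, using the ResourceManager class from the example above (any class name works):

    // Prints the jar (or directory) the class was loaded from, confirming
    // whether the patch jar shadows the original on the classpath.
    public class WhereFrom {
        public static void main(String[] args) throws Exception {
            Class<?> clazz = Class.forName(
                "org.apache.hadoop.yarn.server.resourcemanager.ResourceManager");
            System.out.println(clazz.getProtectionDomain().getCodeSource().getLocation());
        }
    }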

Friday, August 7, 2015

Joining Treasure Data

After spending 2.5 years working with distributed analytics systems at DeNA, I joined Treasure Data this week with the goal of challenging my weaknesses and broadening my skill set.

In my time at DeNA I worked mostly in the Hadoop infrastructure department, where I led the Hadoop upgrade from CDH3 to HDP2 and took part in the introduction of Storm, ElasticSearch, Kafka and Consul. However, due to the lack of a clear stance on open sourcing and sharing information outside the company, nothing ended up being shared.

With me joining TD this month, one of the major changes is the clear company stance on sharing with the community. Since the majority of systems at TD are built on open source, we also need to give back to the community. One part of that will be this blog, which I will use as a notepad for new discoveries and things I learn.

So while I'm thankful for the opportunities I've had and everything I've learned during my time at DeNA, I look forward to getting used to working in a more open way.