Easily simulating connection timeouts

Posted on July 30, 2015 by This data guy

I needed an easy way to simulate a timeout when connecting to a REST API. As part of the flow of an application I am working on, I need to send events to our data platform. Blocking the production flow ‘just’ to send an event when the platform times out is not ideal, so I needed a way to test this.

I know there are a few options:

Connecting to a ‘well known’ timing out url, such as google.com:81, but this is very antisocial
Adding my own firewall rule to DROP connection, but this is a lot of work (yes, I am very very lazy and I would need to look up the iptables syntax)
Connecting to a non routable IP, like 10.255.255.1 or 10.0.0.0

All those options are fine (except the first one, which although technically valid is very rude and not guaranteed to keep working), but they all give timeouts whose duration you cannot configure.

I thus wrote a small Python script, without dependencies, which just listens on a port and makes the connection wait a configurable number of seconds before either closing the connection or returning a valid HTTP response.

Its usage is very simple:

usage: timeout.py [-h] [--http] [--port PORT] [--timeout TIMEOUT]

Timeout Server.

optional arguments:
 -h, --help show this help message and exit
 --http, -w if true return a valid http 204 response.
 --port PORT, -p PORT Port to listen to. Default 7000.
 --timeout TIMEOUT, -t TIMEOUT
 Timeout in seconds before answering/closing. Default
 5.

For instance, to wait 2 seconds before giving an http answer:

./timeout.py -w -t2

This gives the following output when a client connects to it:

./timeout.py -w -t2
Listening, waiting for connection...
Connected! Timing out after 2 seconds...
Processing complete.
Returning http 204 response.
Closing connection.

Listening, waiting for connection...

This is the full script, which you can find on github as well:

#!/usr/bin/env python
import argparse
import socket
import time


# Make the TimeoutServer a bit more user friendly by giving 3 options:
# --http/-w to return a valid http response
# --port/-p to define the port to listen to (7000)
# --timeout/-t to define the timeout delay (5)

parser = argparse.ArgumentParser(description='Timeout Server.')
parser.add_argument('--http', '-w', default=False, dest='http', action='store_true',
                    help='if true return a valid http 204 response.')
parser.add_argument('--port', '-p', type=int, default=7000, dest='port',
                    help='Port to listen to. Default 7000.')
parser.add_argument('--timeout', '-t', type=int, default=5, dest='timeout',
                    help='Timeout in seconds before answering/closing. Default 5.')
args = parser.parse_args()


# Creates a standard socket and listen to incoming connections
# See https://docs.python.org/2/howto/sockets.html for more info
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(('127.0.0.1', args.port))
s.listen(5)  # See doc for the explanation of 5. This is a usual value.

while True:
    print("Listening, waiting for connection...")
    (clientsocket, address) = s.accept()
    print("Connected! Timing out after {} seconds...".format(args.timeout))
    time.sleep(args.timeout)
    print('Processing complete.')

    if args.http:
        print("Returning http 204 response.")
        # Bytes literals work on both Python 2 and 3; HTTP lines end with \r\n.
        clientsocket.sendall(
            b'HTTP/1.1 204 No Content\r\n'
            # (no Date header is sent)
            b'Server: Timeout-Server\r\n'
            b'Connection: close\r\n\r\n'  # blank line: end of headers, no body
        )

    print("Closing connection.\n")
    clientsocket.close()
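
On the client side, the check I am after looks roughly like the sketch below. It is only a minimal illustration, assuming the server above runs locally; the helper name responds_within is made up for this example and is not part of the script.

```python
import socket


def responds_within(host, port, timeout):
    """Return True if the server answers or closes within `timeout` seconds.

    Hypothetical helper for illustration; host and port are whatever
    timeout.py was started with (127.0.0.1:7000 by default).
    """
    client = socket.create_connection((host, port), timeout=timeout)
    try:
        client.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
        client.recv(1024)  # blocks until data, EOF, or the timeout fires
        return True
    except socket.timeout:
        return False
    finally:
        client.close()
```

With ./timeout.py -w -t5 running, responds_within("127.0.0.1", 7000, 2) should come back False, and the same call with a timeout of 10 should come back True.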

Puppet and virtual resources tutorial to manage user accounts

Posted on July 24, 2015 by This data guy

Virtual resources are a very powerful and not well understood feature of puppet. I will here explain what they are and why they are useful, using the management of users in puppet as an example.

By default, in puppet, a resource may be specified only once. The typical example where this hurts is when a user needs to be created on, for instance, both the database and web servers. This user can be defined only once, not once in the database class and once in the webserver class.

If you were to define this user as a virtual resource, then you can define them in multiple places without issue. The caveat is that as the name suggests this user is virtual only, and is not actually created on the server. Some extra work is needed to create (realize in puppet-speak) the user.

Data structure and definitions

Jump to the next section if you directly want to go to the meat of the post. I still want to detail the data structure for better visualisation.

The full example can be found on github. The goal is to be able to define users with the following criteria and assumptions:

User definition is centralised in one place (typically common.yaml). A user defined in hiera is not automatically created on any server; they must be explicitly required.
A user might be ‘normal’ or have sudo rights. Sudo rights mean that they can do whatever they wish, passwordless. There is no finer granularity.
A user might be normal on a server, sudo on another one, absent on others. This can be defined anywhere in the hiera hierarchy.

As good practice, all can be done via hiera. A user can be defined so, with simple basic properties:

accounts::config::users:
  name:
    # List of roles the user belongs to. Not necessarily matched to linux groups
    # They will be used in accounts::config::{normal,sudo} in node yaml files to
    # decide which users are present on a server, and which ones have sudo allowed.
    # Note that all users are part of the 'all' role
    roles: ['warrior', 'priest', 'orc']
    # default: bash
    shell: "/bin/zsh"
    # already hashed password.
    # https://thisdataguy.com/2014/06/10/understand-and-generate-unix-passwords
    # python -c 'import crypt; print(crypt.crypt("passwerd", "$6$some_random_salt"))'
    # empty/absent means no login via password allowed (other means possible)
    pass: '$6$pepper$P9Wt3.3Uqh9UZbvz5/6UPtHqa4KE/2aeyeXbKm0mpv36Z5aCBv0OQEZ1e.aKcPR6RBYvQIa/ToAfdUX6HjEOL1'
    # A PUBLIC rsa key.
    # Empty/absent means no key login allowed (other means possible)
    sshkey: 'a valid public ssh key string'

Roles here have no direct Linux counterpart; they have nothing to do with Linux groups. They are only an easy way to manage users inside hiera. You can for instance say that all system administrators belong to the role sysops, and grant sudo to the sysops role everywhere in one go.

Roles can be added at will, and are just a string tag. Role names will be used later to actually select and create users.

To then actually have users created on a server, roles must be added to 2 specific configuration arrays, depending on whether a role must have sudo rights or not. Note that all values added to these arrays are merged along the hierarchy, meaning that you can add users to specific servers in the node definition.

For instance, if in common.yaml we have:

accounts::config::sudo: ['sysadmin']
accounts::config::normal: ['data']

and in a specific node definition (say a mongo server) we have:

accounts::config::sudo: ['data']
accounts::config::normal: ['deployer']

– all sysadmin users will be everywhere, with sudo
– all data users will be everywhere, without sudo
– all data users will have the extra sudo rights on the mongo server
– all deployer users will be on the mongo server only, without sudo
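
The merge along the hierarchy can be pictured with a small Python model. This is only a sketch of the idea (arrays from each level are unioned, most specific level first); it is not how hiera is implemented, and the function name merged is made up for the illustration.

```python
def merged(key, *levels):
    """Union the arrays found under `key` at each hierarchy level, in order."""
    result = []
    for level in levels:
        for role in level.get(key, []):
            if role not in result:
                result.append(role)
    return result


# The two yaml snippets above, written as plain dictionaries.
common = {
    "accounts::config::sudo": ["sysadmin"],
    "accounts::config::normal": ["data"],
}
mongo_node = {
    "accounts::config::sudo": ["data"],
    "accounts::config::normal": ["deployer"],
}

# What the mongo server ends up with: node values first, then common ones.
sudo_on_mongo = merged("accounts::config::sudo", mongo_node, common)
normal_on_mongo = merged("accounts::config::normal", mongo_node, common)
```

On the mongo server the data role shows up in both arrays, which is exactly the double declaration problem discussed next.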

Very well, but to the point please!

So, why do we have a problem that cannot be resolved by usual resources?

I want the user definition to be done in one place (ie. one class) only
I would like to avoid manipulating data outside puppet (not in a ruby library)
If a user ends up being normal and sudo in a server, declaring them twice will not be possible

How does this work?

Look at the normal.pp manifest. Unfortunately, the sudo.pp manifest duplicates it almost exactly. The reasons are ordering and the duplicate definition of the roles resource. This is a detail.

Looking at the file, here are the interesting parts. First accounts::normal::virtual

class accounts::normal { 
  ...
  define virtual() {...}
  create_resources('@accounts::normal::virtual', $users)
  ...
}

This defines a virtual resource (note the @ in front of the resource name on the create_resources line), which is called for each and every element of $users. Note that as it is a virtual resource, users will not actually be created (yet).

The second parameter to create_resources() needs to be a hash. Keys will be resource titles, attributes will be resource parameters. Luckily, this is exactly how we defined users in hiera!

This resource actually does not do much, it just calls the actual user creating resource, called Accounts::Virtual. Accounts::Virtual is a virtual resource, used as you would call any other puppet resource:

resource_name {title: attributes_key => attribute_value}

This is how the resource is realised. As said above, creating a virtual resource (virtual users in our case) does not automatically create the user. By calling it directly, the user is finally created:

accounts::virtual{$title:
 pass   => $pass,
 shell  => $shell,
 sshkey => $sshkey,
 sudo   => false
}

Note the conditional statement just before:

unless defined (Accounts::Virtual[$title]) { ... }

In my design, there is no specific sudoer resource. The sudoer file is managed as part of the user resource. This means that if a user is found twice, once as normal and once as sudo, the same user resource could be declared twice. As the sudo users are managed before the normal users, we can check if the user has already been defined. If that’s the case, the resource will not be called a second time.

This is all well and good, but how is the accounts::normal::virtual resource called? Via another resource, of course! This is what roles (accounts::normal::roles) does:

define roles($type) { ... }
create_resources('accounts::normal::roles', $normal)

Notice the difference in create_resources? There is no @ prefix in the resource name. This means that this resource is directly called with $normal as parameter, and is not virtual.

Note the $normal parameter. It is just some fudge to translate an array (the list of roles to create as normal users) to a hash, which is what create_resources() requires.

Inside accounts::normal::roles, we find the nicely named spaceship operator. Its role is to realise a bunch of resources, but only a subset of them: you can give it a filter parameter. In our case (forgetting the ‘all’ conditional, which is just fudging to handle a non explicit group), you can see its use to filter on roles:

 Accounts::Normal::Virtual <| roles == $title |>

What this says is simply that we realise the resources Accounts::Normal::Virtual, but only for users having the value $title in their roles array.

To sum up, here is what happened in pseudo code

for each role as $role (done directly in a class)
- for each user as $user (done in the role resource)
  - apply the resource virtual user (done in the virtual user resource)
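
The same flow as a tiny runnable Python model, under the assumption stated earlier that sudo roles are processed before normal ones. It only mimics the logic (collector filter plus the "unless defined" guard); the function name realise and the sample users are made up for the illustration.

```python
def realise(users, sudo_roles, normal_roles):
    """Model of the flow: sudo roles first, then normal roles.

    Returns {user: has_sudo}. A user that is already declared is skipped,
    mirroring the `unless defined(...)` guard.
    """
    declared = {}
    for sudo, roles in ((True, sudo_roles), (False, normal_roles)):
        for role in roles:                     # one roles resource per role
            for name, attrs in users.items():  # collector: roles == $title
                if role in attrs["roles"] and name not in declared:
                    declared[name] = sudo
    return declared


# Hypothetical users: bob matches both a sudo and a normal role.
users = {"bob": {"roles": ["data"]}, "carol": {"roles": ["deployer"]}}
result = realise(users, ["data"], ["data", "deployer"])
```

Here bob is declared once, with sudo, and the second match through the normal roles is ignored.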

Easy, no?

Testing EventStore

Posted on June 25, 2015 by This data guy

I recently came across Event Store, which as its name might hint, is, well, a store for events. The doc says it better than me:

Event Store stores your data as a series of immutable events over time, making it easy to build event-sourced applications.

I wanted to see how useful it would be for us, how it could fit in a Hadoop based platform. This post describes my findings.

Principles

EventStore is thus a database to store events. How is that different from a standard RDBMS, say MySQL? The answer lies in the words Event Sourcing. Basically, a standard database would store the current status of an item or a concept. Think for instance about a shopping cart. If a user adds item A, then item B, then removes item A, the database would have a shopping cart with one element only, B, in it.

If you follow the principles of Event Sourcing, instead of updating the state of your cart, you would instead remember events. User added A. User added B. User removed A. That way, at any point in time you know all the history of your cart. This might help you in many ways: debugging, analysing why product A does not sell so well or even, when you have a new great idea, having a lot of relevant data to test it already. You never know which analysis you will want to do in the future. You can read a lot about this; I strongly recommend this post by Martin Kleppmann: Using logs to build a solid data infrastructure.
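
To make the cart example concrete, here is a minimal Python sketch of the idea. It has nothing to do with the Event Store API; the event tuples and the replay function are made up for the illustration.

```python
def replay(events):
    """Rebuild the cart content by applying every event in order."""
    cart = []
    for action, item in events:
        if action == "added":
            cart.append(item)
        elif action == "removed":
            cart.remove(item)
    return cart


events = [("added", "A"), ("added", "B"), ("removed", "A")]

current = replay(events)      # what a classic database would hold
earlier = replay(events[:2])  # the cart as it was before A was removed
```

The current state is always recoverable from the events, while the reverse is not true: the state alone cannot tell you that A was ever in the cart.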

Technology stack and installation

Note: I did use the Linux build, version 3.0.5. The Windows build might have fewer bugs.

EventStore is developed on .Net, and can be built under Mono for Mac or Linux. It is (partly) open source, with some extra tools requiring a licence. Installation is quite easy if you follow the getting started doc. It does look like quite a young project: the only way to install it (on Linux) is to download a .tgz and uncompress it; there are no deb or rpm packages, for instance. Inside the tarball, there is no init script, and there are some assumptions in the startup scripts (proper chdir before running) which make me feel that the project is built for Windows first, with Linux as an afterthought (but it is there), or that the project is not fully mature yet.

Of course, running under Mono is still a bit worrying. The full .Net framework is not ported and will not be, and the legal status of Mono is not fully clear. You never know what the future will bring.

Managing and monitoring

There is a nice web interface, which is good to have an instantaneous view of your cluster. A dashboard can give you some monitoring information, which can then be accessed via an (undocumented) call to /stats. This will give you a nice JSON object full of information.

Another bug is that the /stats page does need authentication, but will happily return an empty document with a 200 status code if you do not authenticate. This is another proof of lack of maturity.

Data loading

With the HTTP API, it was quite easy. You just need to post some JSON to an end point. That said, the doc to write events to a stream seems wrong or there is a bug in the version I am using (3.0.5), because EventStore requires a UUID and an event type for each event, which can be passed either as part of the JSON or as HTTP headers. The first example uses JSON, which did not work at all for me:

HTTP/1.1 400 Must include an event type with the request either in body or as ES-EventType header.

I did have to use an HTTP header. Not a big deal, but that feels like a bad start.
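
For reference, a request with the metadata in headers can be sketched as below. The ES-EventType header name comes from the error message above; the ES-EventId header name, the /streams/ path and port 2113 are assumptions on my side and should be checked against the doc of your version. The function name is made up for the illustration.

```python
import json
import uuid
import urllib.request


def build_event_request(base_url, stream, event_type, data):
    """Build (without sending) a POST for one event, metadata in headers."""
    return urllib.request.Request(
        url="{}/streams/{}".format(base_url, stream),
        data=json.dumps(data).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "ES-EventType": event_type,       # named in the 400 error above
            "ES-EventId": str(uuid.uuid4()),  # assumed header for the UUID
        },
    )


# Assumed local address; nothing is sent until urlopen() is called.
request = build_event_request(
    "http://127.0.0.1:2113", "cart", "ItemAdded", {"item": "A"}
)
```

Sending it is then a matter of calling urllib.request.urlopen(request) against a running server.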

The load was quite slow (8 hours for 1GB of JSON), but I cannot say where the time was spent as I only did some functional testing. I was running EventStore on a small virtual machine, with 1 core and 512MB of memory. I never went above 50% CPU usage or 350MB memory. That said, I did have to generate a UUID per event, and that might be slow.

The .Net (tcp) API is said to be much faster. I did not try it, as there are other issues with Event Store which make it a bad choice for us.

There is also a JVM client on github. It is referenced but less well described in the doc, and is said to work well up to an older version (3.0.1).

Data fetching

My feeling is that Event Store is mostly to be used as a queue. You have nice ways to subscribe to a stream of events (Atom feed), and add processing to it, via projections, which are javascript snippets. With those projections you can set up simple triggers on events, or build counters. The official documentation is not great, but you can get a list of blog posts going into more depth. Note that projections are considered beta, not to be used in production.

Simple processing (counters) is quite easy via projections. One place where Event Store shines, is the processing of temporal series. An example is given in some of the blog posts, to analyse the time difference between commit and push per language on github.

There are other APIs (.Net, JVM plus some not officially supported), but they all are about reading a stream of events programmatically, without the built-in ability to do more. Of course, from your language you can do whatever you want.

A big lack to me is that there is no SQL interface. If we want the data to be accessed, we do need some developer time, making it harder for the data analysts. Furthermore, doing joins does look quite tricky.

Oh, and I could not add projections at all, as the web interface does not let me, for some reason.

Summary

Event Store is not yet for us. The bad points for us are:

Mono does not feel safe to use for a major production brick
Project seems not mature: errors in documentation, which is as well hard to find. Web UI not fully functional.
Data fetching (projections) considered beta and not supposed to be used in production.
Other APIs are production ready, but will cost lots of developer time, instead of giving easy access to the data to analysts.
No SQL interface.
Loads of small bugs here and there.

Of course, I looked at it from the point of view of the guy who will have to maintain it, and develop against it. It has some pretty good points, though:

Although it is not well integrated in Linux environments, installation was fairly painless. It just worked.
The concepts behind Event Store are very neat
It is fairly active on github, I do expect some nice progression

Replacing a single mongoDB server

Posted on June 4, 2015 by This data guy

I am moving a single mongoDB server to new hardware, and I want to do that with the least possible production interruption, of course.

Well, it so happens that it is not possible if you did not plan it from the start. You can argue that if I have a single SPOF server in production I am doing my job badly, but this is beside the point for this post.

MongoDB has this neat replication feature, where you can build a cluster of servers, with one primary and a few slaves (secondaries), among other options. If you properly configured mongo to use this feature, then you can add a secondary, promote it to primary, and eventually switch off the initial primary. This is what I will describe here.

Note that there will be 2 (very short) downtimes. One to create a replica set (this is just a restart), and one where the primaries are switched (you need to redirect connections to your new primary).

A note about vagrant

If you are using vagrant, make sure that you use the plugin vagrant-hostmanager (vagrant plugin install vagrant-hostmanager) which helps managing /etc/hosts from inside vagrant boxes. Furthermore, make sure you set a different hostname to each of your VMs. By default, if you use the same basebox, they will probably end up having the same hostname (config.vm.hostname in your vagrant file, or the more specific version if you define a cluster inside your vagrantfile).

Configure the replication set

First of all, you need to tell mongoDB to use the replication feature. If not you will end up with messages like:

> rs.initiate()
{ "ok" : 0, "errmsg" : "server is not running with --replSet" }

You just need to update your /etc/mongodb.conf to add a line like so:

replSet=spof

This is the config option that enables the replication. All servers in your replica set will have the same option, with the same value.

On a side note, in the same file make sure you are not binding only to 127.0.0.1, or you will have trouble having your 2 mongo instances talking to each other.

The sad thing is that mongo cannot reload its config file:

root@debian-800-jessie:~# service mongodb reload
[warn] Reloading mongodb daemon: not implemented, as the daemon ... (warning).
[warn] cannot re-read the config file (use restart). ... (warning).

Right. So a restart is needed:

service mongodb restart

You can now connect to your mongo shell the usual way, and initialise the replica set. This follows part of the tutorial explaining how to convert a standalone server to a replica set. Just type in the mongo shell:

rs.initiate()

and the (1 machine) replica set is now operational.

You can check this easily:

> rs.conf()
{
  "_id" : "spof",
  "version" : 1,
  "members" : [
  {
  "_id" : 0,
  "host" : "debian-800-jessie:27017"
 }
 ]
}

On your (new with an empty mongo) server, make sure that you add the replSet line as well in mongodb.conf.

Note that the hostname must be resolvable on the other machine of the cluster. If mongoDB somehow picked the local hostname, your replica set will just not work. If the local hostname has been picked up, see option 2 (reconfig) below.

You are now ready to add the second server to the set.

spof:PRIMARY> rs.add("server2.example.com:27017")
{ "ok" : 1 }

We can check that all is fine:

spof:PRIMARY> rs.conf()
{
   "_id" : "spof",
   "version" : 3,
   "members" : [
   {
       "_id" : 0,
       "host" : "debian-800-jessie:27017"
   },
   {
       "_id" : 1,
       "host" : "server2.example.com:27017"
   },
   ]
}

If the local hostname was chosen, the easiest option is to fully reconfigure the replica set from within the mongo shell, based on your current configuration :

// Get current configuration object
cfg=rs.conf()
// update the current machine to use the non local name
cfg.members[0].host="server1.example.com"
// fully add server 2
cfg.members[1]={"_id":1, host:"server2.example.com"}
// use this new config
rs.reconfig(cfg)

Ok, we are all good, and the replica set is properly set up. What happened on our server2 which was empty when we started?

on server2:

spof:SECONDARY> use events
spof:SECONDARY> db.events.find()
error: { "$err" : "not master and slaveOk=false", "code" : 13435 }

Hum, what does that mean? In short, there is a replication delay between the primary and secondary, so by default mongo disables reads to the secondary to make sure you always read up to date data. You can read more about read preferences, but to tell mongo that yes, you know what you are doing, just issue the rs.slaveOk() command:

spof:SECONDARY> rs.slaveOk()
spof:SECONDARY> db.events.find()
{ "_id" : ObjectId("556d550b59a5fb8615044c72"), "name" : "relevant" }

Success! (In this vagrant example, there was only one document in the collection).

In real life, if the secondary needs to sync a lot of data, it will stay in state STARTUP2 for a long time, which you can see via rs.status(). In the log files of the new secondary, you can see progress per collection. It will then move to RECOVERING to finally become SECONDARY, which is when it will start accepting connections.

Switch primaries

We are all set, you waited long enough to have the secondary in sync with the primary. What now? We first need to switch primary and secondary roles. This can be done easily by changing the priorities:

spof:PRIMARY> cfg=rs.conf()
spof:PRIMARY> cfg.members[0].priority=0.5
0.5
spof:PRIMARY> cfg.members[1].priority=1
1
spof:PRIMARY> rs.reconfig(cfg)
spof:SECONDARY>

As you can see, your prompt changed from primary to secondary.

From this moment on, all connections to your now secondary should succeed but you will not be able to do much (a secondary cannot accept writes, and remember slaveOk() for reads). You must thus be sure that your clients connect to the new primary, or that you know that the connection is read-only, in which case you can use slaveOk(). This switchover will be your last downtime.

Clean up

You can tell your new master that the secondary is not needed anymore:

rs.remove('k1.wp:27017')

Note that if you switch the secondary off (service mongodb stop), then the primary will step down to secondary as well, as it cannot guarantee that it is in a coherent state. This is what you get from using a replica set with only 2 machines.
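
The arithmetic behind this: a member can only be (or stay) primary if it can see a strict majority of the voting members of the set. A minimal sketch of that rule, with a made-up function name:

```python
def can_have_primary(voting_members, reachable_members):
    """A primary needs a strict majority of all voting members."""
    majority = voting_members // 2 + 1
    return reachable_members >= majority


two_nodes_one_down = can_have_primary(2, 1)    # majority of 2 is 2
three_nodes_one_down = can_have_primary(3, 2)  # majority of 3 is 2
single_node = can_have_primary(1, 1)           # after rs.remove()
```

This is why removing the old server from the set (rather than just stopping it) leaves the new primary happy: a set of one is its own majority.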

You can now dispose of your old primary as you wish.

If you want to play around with your old primary, you will be out of luck to start with:

"not master or secondary; cannot currently read from this replSet member"

It will of course be obvious that you need to remove the replSet value from mongodb.conf and restart the server. Sadly, you will then be greeted by another, longer message when you connect:

Server has startup warnings: 
Wed Jun 3 13:10:44.435 [initandlisten] 
Wed Jun 3 13:10:44.435 [initandlisten] ** WARNING: mongod started without --replSet yet 1 documents are present in local.system.replset
Wed Jun 3 13:10:44.435 [initandlisten] ** Restart with --replSet unless you are doing maintenance and no other clients are connected.
Wed Jun 3 13:10:44.435 [initandlisten] ** The TTL collection monitor will not start because of this.
Wed Jun 3 13:10:44.435 [initandlisten] ** For more info see http://dochub.mongodb.org/core/ttlcollections
Wed Jun 3 13:10:44.435 [initandlisten]

Well, the solution is almost obvious from the error message. If there is a document in local.system.replset, let’s just remove it!

> use local
switched to db local
> db.system.replset.find()
{ "_id" : "spof", "version" : 4, "members" : [ { "_id" : 1, "host" : "server1.example.com:27017" } ] }
> db.system.replset.remove()
> db.system.replset.find()
>

Once you exit and reconnect to mongoDB, all will be fine, and you will have your nice standalone server back.

eg: examples for common command line tools

Posted on May 28, 2015 by This data guy

Are you tired of RTFM? Does this xkcd comic feel familiar to you?

Enter eg, which provides easy examples to common command line tools. Instead of having to find your way in the full manual of tar, you can just type:

eg tar

And you will have common usages, nicely formatted and even colored. For tar, you will have examples of basic usage, tarring, untarring and more.

Of course, if you then want more information, TFM is the place to go.

eg is dead easy to install. You have two options:

pip install eg
# or
git clone https://github.com/srsudar/eg .
ln -s /absolute/path/to/eg-repo/eg_exec.py /usr/local/bin/eg

Et voila, you can start using eg.

Eg itself can be easily extended, as the examples are just markdown files put in the right place. You can find all the documentation including formatting options and more in the eg repository.

Last but not least, the author suggests aliasing eg to woman for something that is like man but a little more practical:

alias woman=eg

Tutorial: Install CDH 5 for testing on one machine

Posted on May 21, 2015 by This data guy

This is a tutorial based on my own experience installing CDH 5.4 via the Cloudera Manager on one machine only, for test purposes. This is based on a Mint machine (based on Ubuntu/Debian). Commands will thus be given with apt-get; you can probably just replace apt-get by yum if you are trying to do this on a Redhat-based server.

Preparation
Installation
Problems/Tips

Preparation

ssh

Install ssh server on your machine:

apt-get install openssh-server

Make sure you can connect as root if you do not want everything to run under one user, which is a question which will be asked during the installation process (screen 3). Running all under one user is nice for a one-machine test, but I believe you might run into issues if you later want to extend your cluster. For this reason I chose the normal, multi user (hdfs, hadoop and so on) installation. Cloudera actually gives a warning for the single user installation:

The major benefit of this option is that the Agent does not run as root. However, this mode complicates installation, which is described fully in the documentation. Most notably, directories which in the regular mode are created automatically by the Agent, must be created manually on every host with appropriate permissions, and sudo (or equivalent) access must be set up for the configured user.

On my machine, I for instance needed to update /etc/ssh/sshd_config to have the line :

PermitRootLogin yes

Other packages

For the heartbeat, you need supervisor and the command ntpdc:

apt-get install supervisor ntp

Supported platforms

Officially, Cloudera can install on some versions of Debian or Ubuntu. If you use a derivative, it might work (YMMV), but Cloudera will refuse to install. You can fool the installer by changing the lsb-release file:

sudo mv /etc/lsb-release /etc/lsb-release.orig
sudo ln -s /etc/upstream-release/lsb-release /etc/lsb-release
# After installation you can revert with:
sudo rm /etc/lsb-release
sudo mv /etc/lsb-release.orig /etc/lsb-release

Installation

Follow the documentation from cloudera:

wget http://archive.cloudera.com/cm5/installer/latest/cloudera-manager-installer.bin
chmod u+x cloudera-manager-installer.bin
sudo ./cloudera-manager-installer.bin

Note that it will install the oracle JDK (1.7 for CDH 5.4.0), and postgres. At the end your browser should open and connect you to http://localhost:7180. Do not panic if the connection cannot be established at first. Try again in a minute or two, to give the servers enough time to properly start up. Note that if your machine is not very powerful, it can take 2 minutes. The username and password there are admin/admin.

Problems/Tips

IP address

Click a few times continue, and you will be asked to enter an IP address. As you are only testing on your machine, type yours, which you can find via hostname -I in your terminal. Make sure to use your real IP, not 127.0.0.1. The reason is that if later you extend your cluster with another node, and this node number 2 (n2) wants to access node number 1 (n1), it would try to access n1 via 127.0.0.1, which would of course point to n2 itself. This is a general good practice. As a host will be added to the cloudera manager if it heartbeats, a partial installation might make a ghost host (localhost) appear in ‘Currently Managed Host’. In that case, make sure they are not selected before carrying on.
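
If you want to script the choice of address, the idea is simply to skip loopback addresses. A minimal sketch, where the function name is made up and the input is assumed to be the space-separated output of hostname -I, already split into a list:

```python
import ipaddress


def first_non_loopback(addresses):
    """Return the first address that is not a loopback one, or None."""
    for text in addresses:
        if not ipaddress.ip_address(text).is_loopback:
            return text
    return None


# Hypothetical addresses, as if taken from `hostname -I`.
chosen = first_non_loopback(["127.0.0.1", "192.168.1.10"])
```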

Acquiring installation lock

If you are blocked on ‘Acquiring installation lock’, click ‘Abort’, then:

rm -rf /tmp/scm_prepare*
rm -f /tmp/.scm_prepare_node.lock
# if above is not enough:
service cloudera-scm-agent restart
service cloudera-scm-server-db restart
service cloudera-scm-server restart

and ‘retry failed host’

Full restart

If like me you screwed up everything, you can always uninstall everything (make sure to say yes when asked to delete the database files). Cloudera explains (parts of) what to do, but the violent and complete way is as follows, to be done as root:

/usr/share/cmf/uninstall-cloudera-manager.sh

# kill any PID listed by this ps below:
ps aux | grep cloudera
# this command does it automatically
kill $(ps ax --format pid,command | grep cloudera | sed -r 's/^\s*([0-9]+).*$/\1/')
# purge all cloudera packages
apt-get purge cloudera-manager-server-db-2 cloudera-manager-server cloudera-manager-daemons cloudera-manager-agent 
# I am not so sure when this one is installed or not:
apt-get purge cloudera-manager-repository
# your choice, would clean up orphaned packages (postgres)
apt-get autoremove
# purge all droppings
rm -rf /etc/cloudera*
rm -rf /tmp/scm_prepare*
rm -f /tmp/.scm_prepare_node.lock
rm -rf /var/lib/cloudera*
rm -rf /var/log/cloudera*
rm -rf /usr/share/cmf
rm -rf /var/cache/yum/cloudera*
rm -rf /usr/lib/cmf

Could not connect to host monitor

After all is done with success everywhere, you go back to the home page and you see a lot of sad empty graphs with ‘query error’. This means that the management services are not running.

You can easily fix this by clicking on the top left ‘Add Cloudera Management Service’, and following the wizard from there.

Vertica: some uses of sequences

Posted on January 5, 2015 by This data guy

When you want to have a unique identifier for a table, Vertica gives you 3 options. Create a numeric column either as AUTO_INCREMENT or IDENTITY, or use a sequence. The doc is comprehensive about the differences, but it can be summed up quite easily:

IDENTITY and AUTO_INCREMENT are part of a table, a SEQUENCE is a standalone object
AUTO_INCREMENT is simple and stupid and cannot be tweaked much, IDENTITY has a few more options (start value, increment), and a SEQUENCE has a lot of options.

I will talk a bit here about sequences, as they are the ones allowing the most freedom.

Forcing the value of an incrementing column

If you have a table with an ‘id’ column defined as IDENTITY or AUTO_INCREMENT, you cannot set a value for this field during data load:

CREATE TABLE tst_auto (id AUTO_INCREMENT, value varchar(10));
INSERT INTO tst_auto (id, value) VALUES (42, 'cheese');

ERROR 2444: Cannot insert into or update IDENTITY/AUTO_INCREMENT column "id"

If, on the other hand, you use a SEQUENCE, this is possible:

CREATE TABLE test_seq (id INT, value VARCHAR(10));
CREATE SEQUENCE seq;
ALTER TABLE test_seq ALTER COLUMN id set default NEXTVAL('seq');

You can then see that it does what you expect:

INSERT INTO test_seq (value) VALUES ('default');
INSERT INTO test_seq (id, value) VALUES (42, 'forced');
SELECT * FROM test_seq;

 id | value
----+---------
 1  | default
 42 | forced

If you use this, you must of course be careful that there is no duplication. In the example, you could for instance set the next value of the sequence to 43:

ALTER SEQUENCE seq RESTART WITH 43;
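
To avoid hardcoding 43, the restart value can be derived from the ids already in the table. The sketch below only builds the statement; restart_statement is a hypothetical helper, and it assumes you have fetched the existing ids yourself (for instance with SELECT MAX(id) FROM test_seq) and that the sequence name is trusted, as no quoting is done.

```python
def restart_statement(sequence_name, existing_ids):
    """Build an ALTER SEQUENCE statement restarting just past the highest id.

    Hypothetical helper: `sequence_name` is inserted as-is (no quoting),
    and `existing_ids` is any iterable of the ids already in the table.
    """
    ids = list(existing_ids)
    if not ids:
        raise ValueError("no existing ids, nothing to restart after")
    next_value = max(ids) + 1
    return "ALTER SEQUENCE {} RESTART WITH {};".format(sequence_name, next_value)
```

With the two rows of the example (ids 1 and 42), this yields the same statement as above.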

Using a sequence as a global identifier

The fun thing with a sequence is that it can be used on more than one table, thus giving you a global identifier, for instance:

-- 2 tables...
CREATE TABLE tst  (id INT, value varchar(10));
CREATE TABLE tst2 (id INT, value varchar(10));

-- 1 sequence...
CREATE SEQUENCE tst_seq;

-- ... said sequence is used by both tables
ALTER TABLE tst  ALTER COLUMN id set default NEXTVAL('tst_seq');
ALTER TABLE tst2 ALTER COLUMN id set default NEXTVAL('tst_seq');

-- testing...
INSERT INTO tst  (value) VALUES ('tst');
INSERT INTO tst2 (value) VALUES ('tst2');

-- success!
SELECT * FROM tst;
--  id |  value
-- ----+---------
--   1 | tst
-- (1 row)

SELECT * FROM tst2;
--  id |  value
-- ----+----------
--   2 | tst2
-- (1 row)
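
The idea of one counter shared by several tables can be modelled in a few lines of Python. This is a toy model only, not how Vertica implements sequences: the Table class and its insert method are made up for the illustration, and a real sequence may hand out values in cached blocks, so gaps are possible.

```python
import itertools


class Table:
    """Toy table whose id column defaults to the next value of a shared counter."""

    def __init__(self, id_source):
        self.id_source = id_source
        self.rows = []

    def insert(self, value, row_id=None):
        # Like a column DEFAULT: only draw from the counter when no id is given.
        if row_id is None:
            row_id = next(self.id_source)
        self.rows.append((row_id, value))


# One "sequence" (a plain counter starting at 1) shared by both tables.
tst_seq = itertools.count(1)
tst = Table(tst_seq)
tst2 = Table(tst_seq)

tst.insert("tst")
tst2.insert("tst2")
```

After these two inserts, tst.rows holds (1, 'tst') and tst2.rows holds (2, 'tst2'), mirroring the query results above.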

Easy import from Hive to Vertica

Posted on December 15, 2014 by This data guy

Properly set up, Vertica can connect to HCatalog, or read HDFS files. This does require some DBA work, though.

If you want to easily get data from Hive to Vertica, you can use the COPY statement with the LOCAL STDIN modifier and pipe the output of Hive into the input of Vertica. Once you add a dd in the middle to prevent the stream from simply stopping after a while, this works perfectly. I am not sure why dd is needed, but I suppose it buffers data and makes the magic happen.

hive -e "select whatever FROM wherever" | \
dd bs=1M | \
/opt/vertica/bin/vsql -U $V_USERNAME -w $V_PASSWORD -h $HOST $DB -c \
"COPY schema.table FROM LOCAL STDIN DELIMITER E'\t' NULL 'NULL' DIRECT"

Of course, the previous statement needs to be amended to use your own user, password and database.
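
If you would rather drive this from a script than from a shell one-liner, the same pipeline can be sketched with Python's subprocess module. This is a minimal sketch, not a tested production loader: build_commands and run_pipeline are hypothetical names, and the code assumes the same V_USERNAME, V_PASSWORD, HOST and DB environment variables as the shell version.

```python
import os
import subprocess


def build_commands(query, target_table, env=None):
    """Return the hive, dd and vsql argument lists for the pipeline."""
    env = os.environ if env is None else env
    hive_cmd = ["hive", "-e", query]
    dd_cmd = ["dd", "bs=1M"]
    # Same COPY statement as the shell version; \\t is a literal backslash-t.
    copy_sql = "COPY {} FROM LOCAL STDIN DELIMITER E'\\t' NULL 'NULL' DIRECT".format(
        target_table
    )
    vsql_cmd = [
        "/opt/vertica/bin/vsql",
        "-U", env["V_USERNAME"],
        "-w", env["V_PASSWORD"],
        "-h", env["HOST"],
        env["DB"],
        "-c", copy_sql,
    ]
    return hive_cmd, dd_cmd, vsql_cmd


def run_pipeline(query, target_table):
    """Run hive | dd | vsql and return the three exit codes."""
    hive_cmd, dd_cmd, vsql_cmd = build_commands(query, target_table)
    hive = subprocess.Popen(hive_cmd, stdout=subprocess.PIPE)
    dd = subprocess.Popen(dd_cmd, stdin=hive.stdout, stdout=subprocess.PIPE)
    vsql = subprocess.Popen(vsql_cmd, stdin=dd.stdout)
    # Close our copies of the pipe ends so an early exit downstream
    # is noticed by the upstream processes.
    hive.stdout.close()
    dd.stdout.close()
    vsql.wait()
    dd.wait()
    hive.wait()
    return hive.returncode, dd.returncode, vsql.returncode
```

Checking all three exit codes matters here: a failing hive query would otherwise look like a successful load of zero rows.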

Performance is quite good with this, although I cannot give a good benchmark, as in our case the hive statement was not trivial.

One thing to really take care of is where you run this statement. You can run it from anywhere as long as hive and Vertica are accessible, but be aware that data will flow from hive to your server to Vertica. Running this statement on a Vertica node or your hive server will reduce the network traffic and might speed things up.

This post is based on my answer to a question on stackoverflow.

A GUI for Vertica: DbVisualizer

Posted on December 8, 2014 by This data guy

vsql is very powerful, but it is always nice to have a good GUI tool for your database. As Vertica can be accessed via ODBC, most tools can at least provide some kind of GUI on top of Vertica.

DbVisualizer (“dbvis”) goes one step further. They teamed up with HP to make dbvis aware of Vertica specifics such as projections, sessions, load streams and more. You can find the list of Vertica features supported by dbvis on their website. Note that some of them are only available in the Pro version, unfortunately (note the little ¹ in the link above), but a lot of goodness is available in the free version.

A few screenshots below will show some nice Vertica-specific options.

Creating a table, with column encoding

Table with projections

DBA views (notice sessions, locks, tuple mover…)

Vertica: Panic – Data consistency problems

Posted on December 1, 2014 by This data guy

While we were replacing a node, the node rebooted during the recovery (human mistake). After the reboot, the recovery did not proceed and the node stayed in the DOWN state, even when we tried to restart it via admintools.

In vertica.log, we could see the following lines:

<PANIC> @v_dwh_node0003: VX001/2973: Data consistency problems found; startup aborted
 HINT: Check that all file systems are properly mounted. Also, the --force option can be used to delete corrupted data and recover from the cluster
 LOCATION: mainEntryPoint, /scratch_a/release/vbuild/vertica/Basics/vertica.cpp:1441

As the log nicely suggests, using the (undocumented) --force option can help. That said, this option cannot be used from the admintools curses interface, and must be used from the command line:

/opt/vertica/bin/admintools -t restart_node -d $db_name -s $host --force

That way the corrupted data was deleted, and the recovery could carry on nicely.
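
To spot this situation without reading the whole log by eye, a few lines of Python can pick the relevant messages out of vertica.log. This is a rough sketch: needs_force is a hypothetical helper, it only matches the exact wording of the panic message quoted above, and where your vertica.log lives is up to your installation.

```python
def needs_force(log_lines):
    """Return the PANIC lines that report data consistency problems.

    `log_lines` is any iterable of lines, for instance an open vertica.log.
    Only the message wording quoted in this post is matched.
    """
    return [
        line.rstrip("\n")
        for line in log_lines
        if "<PANIC>" in line and "Data consistency problems found" in line
    ]
```

An empty list means this particular panic was not found; it says nothing about other startup failures.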

This Data Guy

Journey in a world of big(ger) data
