Sublime Text is an amazing text editor: sleek, full of features, multi-platform, and very usable without a mouse. It has been my editor of choice for a few years already.

One of its great advantages is that it is extensible in Python, which makes it very easy to tweak.

I recently played with a Vagrant box in which I needed to update a file. The file was mounted inside Vagrant, but needed to be copied elsewhere inside the box, meaning it had to be copied manually every time I saved it. As I am very lazy (laziness drives progress, one of the favorite sayings of my director of studies), I wanted to do that automagically. This is a very simple job, ideal for a first plugin.

So, where to start? Well, it is quite easy: on the menu bar click Tools, then New Plugin, et voilà, you have your first plugin. Congratulations!

import sublime, sublime_plugin

class ExampleCommand(sublime_plugin.TextCommand):
    def run(self, edit):
        self.view.insert(edit, 0, "Hello, World")

This is a nice skeleton, but it does not go very far.

As I wanted to act on save, I needed an event listener plugin, in my case listening to a save event. I wanted to act after the event (use the file only after it was written). The API says that on_post_save_async is the best event for me, as it runs in a separate thread and is non-blocking.

import sublime
import sublime_plugin


class UpdateOnSave(sublime_plugin.EventListener):

    def on_post_save_async(self, view):
        filename = view.file_name()
        # do something with the filename

Good, this is getting somewhere! The foundation is in place; now I just had to do something with this file.

The something in that case was a kind of subprocess call, to use vagrant ssh. Sublime already has a wrapper around subprocess, named exec. Commands can be run in 2 contexts: view (basically the Sublime buffer you are editing) to run TextCommands, or window (Sublime itself) to run all types of commands. Finding in which context to run your command is a bit of trial and error, but once that is done, the last bit of the plugin is a trivial (once you know it) call to exec:

# please always use shlex with subprocess
import shlex
import sublime
import sublime_plugin
import os


class UpdateOnSave(sublime_plugin.EventListener):

    def on_post_save_async(self, view):
        filename = view.file_name()
        savedfile = os.path.basename(filename)
        saveddir = os.path.dirname(filename)

        # write in sublime status buffer
        sublime.status_message('Manually saving ' + filename)

        source_in_vagrant = '/vagrant/' + savedfile
        dest_in_vagrant = '/project/' + savedfile

        # the variables defined above are source_in_vagrant and dest_in_vagrant
        cmd_cp = "vagrant ssh -c 'sudo cp {0} {1}'".format(
            source_in_vagrant, dest_in_vagrant)

        view.window().run_command('exec', {
            'cmd': shlex.split(cmd_cp),
            'working_dir': saveddir,
        }
        )
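
To see what ends up in the 'cmd' list, the command building can be pulled out into a plain function that runs outside Sublime. This is only a sketch: the function name build_copy_command and the sample file names are hypothetical, and it assumes the same /vagrant and /project directories as above. It also quotes each path with shlex.quote, so that a file name containing spaces survives the shell inside the box.

```python
import os
import shlex


def build_copy_command(filename, source_dir="/vagrant", dest_dir="/project"):
    """Return the argv list that would be passed as 'cmd' to exec."""
    savedfile = os.path.basename(filename)
    # paths inside the box are POSIX paths, so join them with '/' explicitly
    source = source_dir + "/" + savedfile
    dest = dest_dir + "/" + savedfile
    # quote each path so that spaces survive the remote shell
    remote = "sudo cp {0} {1}".format(shlex.quote(source), shlex.quote(dest))
    return ["vagrant", "ssh", "-c", remote]


print(build_copy_command("/home/me/work/settings.conf"))
print(build_copy_command("/home/me/work/my file.txt"))
```

Building the list directly also removes the need for shlex.split, as each argument is already a separate item.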

Good, your plugin is ready! The last question is where to put it so that it is actually used. With Sublime Text 3 under Linux, it needs to be in $HOME/.config/sublime-text-3/Packages/User. Note that it must be named something.py, with the .py extension, not the .py3 extension, or it will not be found.
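
As a small illustration, the target path can be computed like this. It is a sketch for the Linux layout described above only; the helper names and the file name update_on_save are made up for the example.

```python
import os


def user_packages_dir(home=None):
    """Return the Packages/User directory of Sublime Text 3 under Linux."""
    if home is None:
        home = os.path.expanduser("~")
    return os.path.join(home, ".config", "sublime-text-3", "Packages", "User")


def plugin_path(name, home=None):
    """Return where a plugin should be saved, forcing the .py extension."""
    root, _ = os.path.splitext(name)
    return os.path.join(user_packages_dir(home), root + ".py")


print(plugin_path("update_on_save.py3", home="/home/me"))
```

From inside Sublime itself, sublime.packages_path() returns the Packages directory, which avoids hardcoding the path.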

You can find the plugin on GitHub.

I recently came across Event Store, which as its name might hint, is, well, a store for events. The doc says it better than me:

Event Store stores your data as a series of immutable events over time, making it easy to build event-sourced applications.

I wanted to see how useful it would be for us, how it could fit in a Hadoop based platform. This post describes my findings.

Principles

EventStore is thus a database to store events. How is that different from a standard RDBMS, say MySQL? The answer lies in the words Event Sourcing. Basically, a standard database would store the current status of an item or a concept. Think for instance about a shopping cart. If a user adds item A, then item B, then removes item A, the database would have a shopping cart with one element only, B, in it.

If you follow the principles of Event Sourcing, instead of updating the state of your cart, you would instead remember events. User added A. User added B. User removed A. That way, at any point in time you know all the history of your cart. This might help you in many ways: debugging, analysing why product A does not sell so well, or, when you have a great new idea, already having a lot of relevant data to test it. You never know which analysis you will want to do in the future. You can read a lot about this; I strongly recommend this post by Martin Kleppmann: Using logs to build a solid data infrastructure.
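
The shopping cart example can be sketched in a few lines of Python. This is only an illustration of the principle, not of Event Store itself; the event format (a tuple of action and item) and the function name replay are invented for the example.

```python
def replay(events):
    """Rebuild the cart contents by applying the events in order."""
    cart = []
    for action, item in events:
        if action == "added":
            cart.append(item)
        elif action == "removed":
            cart.remove(item)
    return cart


# the full history is kept, not just the final state
events = [("added", "A"), ("added", "B"), ("removed", "A")]

print(replay(events))      # current state: ['B']
print(replay(events[:2]))  # state before the removal: ['A', 'B']
```

The current state is only one of the things you can derive: replaying a prefix of the log gives the cart as it was at any earlier point.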

Technology stack and installation

Note: I used the Linux build, version 3.0.5. The Windows build might have fewer bugs.

EventStore is developed on .Net, and can be built under Mono for Mac or Linux. It is (partly) open source, with some extra tools requiring a licence. Installation is quite easy if you follow the getting started doc. It does look like quite a young project: the only way to install it (for Linux) is to download a .tgz and uncompress it; there are no deb or rpm packages, for instance. Inside the tarball, there is no init script, and there are some assumptions in the startup scripts (proper chdir before running) which make me feel that the project is built for Windows first, with Linux as an afterthought (but it is there), or that the project is not fully mature yet.

Of course, running under Mono is still a bit worrying. The full .Net framework is not and will not be ported, and the legal status of Mono is not fully clear. You never know what the future will bring.

Managing and monitoring

There is a nice web interface, which is good for getting an instantaneous view of your cluster. A dashboard gives you some monitoring information, which can also be accessed via an (undocumented) call to /stats. This will give you a nice JSON object full of information.

Another bug is that the /stats page does need authentication, but will happily return an empty document with a 200 status code if you do not authenticate. This is further evidence of a lack of maturity.
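
Because the status code cannot be trusted here, a monitoring script has to check the body itself. The sketch below assumes the response has already been fetched and only shows the check. The function name parse_stats is hypothetical, and whether the "empty document" is an empty body or an empty JSON object is an assumption, so both are rejected. The sample payload is made up and does not reflect the real /stats fields.

```python
import json


def parse_stats(status_code, body):
    """Return the stats dict, or raise if the answer looks unauthenticated."""
    if status_code != 200:
        raise RuntimeError("unexpected status {0}".format(status_code))
    text = body.decode("utf-8").strip()
    # an empty body and an empty JSON object are both treated as "no data"
    stats = json.loads(text) if text else {}
    if not stats:
        raise RuntimeError("empty /stats document, check the credentials")
    return stats


print(parse_stats(200, b'{"example": 1}'))
```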

Data loading

With the HTTP API, it was quite easy. You just need to post some JSON to an endpoint. That said, either the doc on writing events to a stream is wrong or there is a bug in the version I am using (3.0.5), because EventStore requires a UUID and an event type for each event, which can be passed either as part of the JSON or as part of the header. The first example uses JSON, which did not work at all for me:

HTTP/1.1 400 Must include an event type with the request either in body or as ES-EventType header.

I did have to use an HTTP header. Not a big deal, but that feels like a bad start.
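
For illustration, here is roughly what such a request looks like when built with Python's urllib. This is a sketch under assumptions, not the documented way: ES-EventType is the header named in the error above, while the ES-EventId header name, the /streams/<name> path, the port 2113, the stream name and the event payload are all assumptions or made-up values. The request is built but not sent.

```python
import json
import uuid
import urllib.request


def build_event_request(base_url, stream, event_type, data):
    """Build (but do not send) a POST writing one event to a stream."""
    headers = {
        "Content-Type": "application/json",
        "ES-EventType": event_type,       # header named in the 400 error
        "ES-EventId": str(uuid.uuid4()),  # assumed header name for the UUID
    }
    body = json.dumps(data).encode("utf-8")
    url = "{0}/streams/{1}".format(base_url.rstrip("/"), stream)
    return urllib.request.Request(
        url, data=body, headers=headers, method="POST")


request = build_event_request(
    "http://127.0.0.1:2113", "cart-42", "itemAdded", {"item": "A"})
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would send it to a running server
```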

The load was quite slow (8 hours for 1GB of JSON), but I cannot say where the time was spent as I only did some functional testing. I was running EventStore on a small virtual machine, with 1 core and 512MB of memory. I never went above 50% CPU usage or 350MB of memory. That said, I did have to generate a UUID per event, and that might be slow.

The .Net (TCP) API is said to be much faster. I did not try it, as there are other issues with Event Store which make it a bad choice for us.

There is also a JVM client on GitHub. This one is referenced but less thoroughly described in the doc, and is said to work well only with older versions (up to 3.0.1).

Data fetching

My feeling is that Event Store is mostly to be used as a queue. You have nice ways to subscribe to a stream of events (Atom feed), and add processing to it, via projections, which are JavaScript snippets. With those projections you can set up simple triggers on events, or build counters. The official documentation is not great, but you can get a list of blog posts going more in depth. Note that projections are considered beta, not to be used in production.

Simple processing (counters) is quite easy via projections. One place where Event Store shines is the processing of time series. An example is given in some of the blog posts, analysing the time difference between commit and push per language on GitHub.
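
To give an idea of what a counter projection does, here is the same idea sketched in Python. Real projections are JavaScript snippets running inside Event Store; this is only an analogy, and every name in it (run_projection, the handlers, the event types) is invented for the example.

```python
def run_projection(events, handlers, initial_state):
    """Fold a stream of events into a state, one handler per event type."""
    state = dict(initial_state)
    for event in events:
        handler = handlers.get(event["type"])
        if handler is not None:
            # each handler returns the new state; unknown types are skipped
            state = handler(state, event)
    return state


def on_added(state, event):
    return dict(state, added=state["added"] + 1)


def on_removed(state, event):
    return dict(state, removed=state["removed"] + 1)


stream = [
    {"type": "itemAdded", "item": "A"},
    {"type": "itemAdded", "item": "B"},
    {"type": "itemRemoved", "item": "A"},
]
counters = run_projection(
    stream,
    {"itemAdded": on_added, "itemRemoved": on_removed},
    {"added": 0, "removed": 0},
)
print(counters)
```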

There are other APIs (.Net, JVM plus some not officially supported), but they are all about reading a stream of events programmatically, without the built-in ability to do more. Of course, from your language you can do whatever you want.

A big gap for me is that there is no SQL interface. If we want the data to be accessed, we need some developer time, which makes it harder for the data analysts. Furthermore, doing joins looks quite tricky.

Oh, and I could not add projections at all, as the web interface does not let me, for some reason.

Summary

Event Store is not yet for us. The bad points for us are:

Mono does not feel safe to use for a major production component.
Project does not seem mature: errors in the documentation, which is also hard to find. Web UI not fully functional.
Data fetching (projections) considered beta and not supposed to be used in production.
Other APIs are production ready, but will cost lots of developer time, instead of giving easy access to the data to analysts.
No SQL interface.
Loads of small bugs here and there.

Of course, I looked at it from the point of view of the guy who will have to maintain it, and develop against it. It has some pretty good points, though:

Although it is not well integrated in Linux environments, installation was fairly painless. It just worked.
The concepts behind Event Store are very neat.
It is fairly active on GitHub; I do expect some nice progress.

This Data Guy

Journey in a world of big(ger) data

Writing your first Sublime Text 3 plugin

Testing EventStore