Build an rpm for Pentaho PDI (Kettle)

To deploy PDI cleanly at $work, I wanted to have it in our yum repositories. For this I used the fantastic fpm, the Effing Package Manager, which lets you build an rpm without having to deal with complex spec files. In short, you tell it to build an rpm from a directory and it just works; no other options are mandatory, although a few are worth tweaking.
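As a rough sketch of what such a bare-bones call looks like, the snippet below wraps it in a small function. The function name `minimal_fpm` and the `DRY_RUN` switch are made up for this illustration; the directory mapping `./pentaho_pdi=/opt` is the one used in the full command further down. I include `--name` and `--version` because they end up in the rpm file name, not because I have checked exactly which options fpm insists on.

```shell
#!/bin/bash
# Minimal sketch: package a directory as an rpm with fpm.
# -s dir : the source is a plain directory
# -t rpm : the target is an rpm
minimal_fpm() {
  local version="$1"
  local cmd=(fpm -s dir -t rpm --name pentaho-pdi --version "$version" ./pentaho_pdi=/opt)
  if [ -n "$DRY_RUN" ]; then
    # Hypothetical switch for this example: only show what would be run.
    echo "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}

# Show the command without building anything.
DRY_RUN=1 minimal_fpm 4.3.0
```

Run without `DRY_RUN` from the directory that contains `pentaho_pdi`, it should leave an rpm in the current directory.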

If you use my startup script for Carte, you can even include it in the rpm. The final command is something like:

fpm -s dir -t rpm \
  --name pentaho-pdi \
  --version 4.3.0 \
  --depends jdk \
  --vendor 'me@thisdataguy.com' \
  --url 'https://github.com/pentaho/pentaho-kettle' \
  --description 'Pentaho pdi kettle' \
  --maintainer 'Me <me@thisdataguy.com>' \
  --license 'Apache 2.0' \
  --epoch 1 \
  --directories /opt/pentaho_pdi \
  --rpm-user pentaho \
  --rpm-group pentaho \
  --architecture all \
  --after-install=after-install.sh \
  ./pentaho_pdi=/opt \
  ./carte=/etc/init.d/carte

It will probably not be exactly what you want regarding paths and user. Furthermore, the after-install script has to be generated first (it just sets the ownership and permissions of /etc/init.d/carte).
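The after-install script mentioned above could be generated along these lines. This is only a sketch: the helper name `write_after_install` is invented here, and the ownership and mode are the ones I use for Carte, so adapt them to your setup. fpm runs the script passed to --after-install on the target machine once the files are in place.

```shell
#!/bin/bash
# Sketch: write the script that fpm will run after the rpm is installed.
# --rpm-user/--rpm-group apply to every packaged file, so the init script
# has to be handed back to root afterwards.
write_after_install() {
  local target="${1:-after-install.sh}"
  # Quoted delimiter: nothing inside the heredoc is expanded at generation time.
  cat > "$target" << 'EOC'
#!/bin/sh
chown root:root /etc/init.d/carte
chmod 744 /etc/init.d/carte
EOC
}

write_after_install ./after-install.sh
```

The generated file is only needed while fpm runs, so it can be deleted once the rpm is built.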

To make this easier, I wrote a small bash script with a few configuration variables and a few extra checks (MySQL and Vertica jars). Just get the script, remove the checks that are irrelevant to you, and you should be good to go. The script will even install fpm for you if needed.

#!/bin/bash

if [ "x$1" == "x" ]; then
  echo "Need one parameter: the pentaho version string (eg 5.2.0.1)"
  exit 1;
else
  PDIVERSION=$1
fi

# name of the directory where pdi will be installed
PDIDIR=pentaho_pdi
# user to own the pdi files
PDIUSER=pentaho
# root of where pdi will be installed
PDIROOT=/opt


if ! which fpm 1>/dev/null 2>/dev/null; then
    echo "fpm is not installed. I will try to do it myself"
    echo "Installing relevant rpms..."
    sudo yum install -y ruby-devel gcc
    echo "Installing the fpm gem..."
    sudo gem install fpm
    if ! which fpm 1>/dev/null 2>/dev/null; then
        echo "failed installing fpm, please do it yourself: https://github.com/jordansissel/fpm"
        exit 1
    fi
else
    echo "fpm installed, good."
fi

if [ ! -d "$PDIDIR" ]; then
    echo "I expect a directory called $PDIDIR."
    echo "It is the 'dist' directory built from source renamed as $PDIDIR."
    echo "Look at https://github.com/pentaho/pentaho-kettle"
    exit 1
else
    echo "$PDIDIR directory exists, good."
fi

ERRORS=0

find $PDIDIR -name \*mysql\*.jar | grep -qE '.*'
if [[ $? -ne 0 ]]; then
    echo  "Download the mysql jar from http://dev.mysql.com/downloads/connector/j/ and put it in the libext/JDBC (<5.0) or lib (>= 5.0) subdirectory of $PDIDIR."
    ERRORS=1
else
    echo "Mysql jar present in $PDIDIR, good."
fi

find $PDIDIR -name \*vertica\*.jar | grep -qE '.*'
if [[ $? -ne 0 ]]; then
    echo  "Get the vertica jar from /opt/vertica and put it in the libext/JDBC (<5.0) or lib (>= 5.0) subdirectory of $PDIDIR."
    ERRORS=1
else
    echo "Vertica jar present in $PDIDIR, good."
fi

if [[ $ERRORS -eq 1 ]]; then
    exit 1
fi

# Check whether the carte init script exists; if so, add it to the fpm options.
# fpm packages it as owned by $PDIUSER, whereas it should belong to root,
# so the generated after-install script fixes ownership and permissions.
if [ -f ./carte ]; then
(cat << EOC
#!/usr/bin/env sh
chown root:root /etc/init.d/carte
chmod 744 /etc/init.d/carte
chkconfig --add carte
EOC
) > ./after-install.sh
    echo "After install script for carte generated at after-install.sh"
    CARTEOPTIONS="--after-install=after-install.sh ./carte=/etc/init.d/carte"
else
    CARTEOPTIONS=""
    echo "No Carte init.d script present."
fi


# All good, let's build
echo "Build the effing rpm, removing existing rpms first..."
rm -f pentaho-pdi*rpm
fpm -s dir -t rpm \
  --name pentaho-pdi \
  --version $PDIVERSION \
  --depends jdk \
  --vendor 'me@thisdataguy.com' \
  --url 'https://github.com/pentaho/pentaho-kettle' \
  --description 'Pentaho pdi kettle' \
  --maintainer 'me@thisdataguy.com' \
  --license 'Apache 2.0' \
  --epoch 1 \
  --directories $PDIROOT/$PDIDIR \
  --rpm-user $PDIUSER \
  --rpm-group $PDIUSER \
  --architecture all $CARTEOPTIONS \
  ./$PDIDIR=$PDIROOT

rm -f after-install.sh

This will create a pentaho-pdi-${PDIVERSION}-1.noarch.rpm (fpm appends a default iteration of 1 to the file name), which you can install directly with yum or add to your yum repositories.
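Before installing or publishing, it does not hurt to look inside the package. The helper below is a sketch with an invented name, `check_rpm`; it only relies on the standard rpm query options. The repository path in the usage comments is a placeholder.

```shell
#!/bin/bash
# Sketch: sanity-check a freshly built rpm before installing or publishing it.
check_rpm() {
  local rpm_file="$1"
  if [ ! -f "$rpm_file" ]; then
    echo "No such rpm: $rpm_file" >&2
    return 1
  fi
  rpm -qpi "$rpm_file"   # package metadata: name, version, license, ...
  rpm -qpl "$rpm_file"   # list of files the package will install
}

# Usage (paths are placeholders):
#   check_rpm pentaho-pdi-*.rpm && sudo yum install ./pentaho-pdi-*.rpm
#   cp pentaho-pdi-*.rpm /path/to/your/repo/ && createrepo --update /path/to/your/repo/
```

The file list is the quickest way to confirm that everything landed under the directory you expected and that the init script is included.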
