To deploy pdi cleanly at $work, I wanted to have it in our yum repositories. For this I used the fantastic fpm, the Effing Package Manager, which lets you build an rpm without having to deal with complex spec files. In short, you tell it that you want to build an rpm from a directory, next to nothing else is mandatory, and it just works (but a few options are nice to tweak).
If you use my startup script, you can even include it in the rpm. The final command is something like:
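To give an idea of how small the call can be, here is a minimal sketch. The function name minimal_fpm and the DRY_RUN switch are made up for illustration, and the package name and directory are just examples. As far as I know fpm does ask for a package name when the source is a directory, so the sketch passes one. By default it only prints the command, so it is safe to run even without fpm installed.

```shell
#!/bin/bash
# minimal_fpm NAME SRC: print (default) or run the smallest useful fpm call.
# minimal_fpm and DRY_RUN are hypothetical names, not part of fpm itself.
minimal_fpm() {
    name=$1
    src=$2
    if [ "${DRY_RUN:-1}" = 1 ]; then
        # Dry run: only show what would be executed
        echo "fpm -s dir -t rpm --name $name $src"
    else
        # -s: source type (a directory), -t: target type (an rpm)
        fpm -s dir -t rpm --name "$name" "$src"
    fi
}

minimal_fpm pentaho-pdi ./pentaho_pdi
```

Set DRY_RUN=0 to actually build. Everything else (version, dependencies, owner, install path) then falls back to fpm defaults, which is why a few more options are worth setting.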
fpm -s dir -t rpm \
    --name pentaho-pdi \
    --version 4.3.0 \
    --depends jdk \
    --vendor 'me@thisdwhguy.com' \
    --url 'https://github.com/pentaho/pentaho-kettle' \
    --description 'Pentaho pdi kettle' \
    --maintainer 'Me <me@thisdataguy.com>' \
    --license 'Apache 2.0' \
    --epoch 1 \
    --directories /opt/pentaho_pdi \
    --rpm-user pentaho \
    --rpm-group pentaho \
    --architecture all \
    --after-install=after-install.sh \
    ./pentaho_pdi=/opt \
    ./carte=/etc/init.d/carte
It will probably not be exactly what you want regarding paths and user. Furthermore, the after-install script needs to be generated (it just sets the ownership and permissions of /etc/init.d/carte).
To make it easier, I created a small bash script with a few configuration variables and a few extra checks (mysql and vertica jars) which makes building very easy. You can just get this script, remove the checks if they are irrelevant to you, and you should be good to go. The script will even install fpm for you if needed.
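If you want to adapt the jar checks to other drivers, they can be factored into a small helper. This is only a sketch: has_jar is a hypothetical name, and the demo runs against a throwaway directory created with mktemp rather than a real pdi tree.

```shell
#!/bin/bash
# has_jar DIR PATTERN: succeed if DIR contains at least one regular file
# whose name matches the glob PATTERN (for instance '*mysql*.jar').
# has_jar is a hypothetical helper name used for illustration.
has_jar() {
    # grep -q . succeeds as soon as find prints at least one path
    find "$1" -type f -name "$2" 2>/dev/null | grep -q .
}

# Demo on a temporary directory so nothing real is touched
demo=$(mktemp -d)
mkdir -p "$demo/lib"
touch "$demo/lib/mysql-connector-java.jar"

missing=0
for driver in mysql vertica; do
    if has_jar "$demo" "*${driver}*.jar"; then
        echo "$driver jar present, good."
    else
        echo "$driver jar missing."
        missing=1
    fi
done
rm -rf "$demo"
```

Adding or removing a driver then only means editing the list in the for loop.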
#!/bin/bash

if [ "x$1" == "x" ]; then
    echo "Need one parameter: the pentaho version string (eg 5.2.0.1)"
    exit 1
else
    PDIVERSION=$1
fi

# name of the directory where pdi will be installed
PDIDIR=pentaho_pdi
# user to own the pdi files
PDIUSER=pentaho
# root of where pdi will be installed
PDIROOT=/opt

# Install fpm (and what it needs to build) if it is not there yet
if ! which fpm 1>/dev/null 2>/dev/null; then
    echo "fpm is not installed. I will try to do it myself"
    echo "Installing relevant rpms..."
    sudo yum install -y ruby-devel gcc
    echo "Installing the fpm gem..."
    sudo gem install fpm
    if ! which fpm 1>/dev/null 2>/dev/null; then
        echo "failed installing fpm, please do it yourself: https://github.com/jordansissel/fpm"
        exit 1
    fi
else
    echo "fpm installed, good."
fi

# The pdi build must be present in the current directory
if [ ! -d "$PDIDIR" ]; then
    echo "I expect a directory called $PDIDIR."
    echo "It is the 'dist' directory built from source renamed as $PDIDIR."
    echo "Look at https://github.com/pentaho/pentaho-kettle"
    exit 1
else
    echo "$PDIDIR directory exists, good."
fi

# Check that the jdbc jars we need are bundled
ERRORS=0

find "$PDIDIR" -name \*mysql\*.jar | grep -qE '.*'
if [[ $? -ne 0 ]]; then
    echo "Download the mysql jar from http://dev.mysql.com/downloads/connector/j/ and put it in the libext/JDBC (<5.0) or lib (>= 5.0) subdirectory of $PDIDIR."
    ERRORS=1
else
    echo "Mysql jar present in $PDIDIR, good."
fi

find "$PDIDIR" -name \*vertica\*.jar | grep -qE '.*'
if [[ $? -ne 0 ]]; then
    echo "Get the vertica jar from /opt/vertica and put it in the libext/JDBC (<5.0) or lib (>= 5.0) subdirectory of $PDIDIR."
    ERRORS=1
else
    echo "Vertica jar present in $PDIDIR, good."
fi

if [[ $ERRORS -eq 1 ]]; then
    exit 1
fi

# the init.d script will be installed as $PDIUSER, whereas it should be root
# Check that carte init script exists, if yes add it to the options
if [ -f ./carte ]; then
    cat > ./after-install.sh << 'EOC'
#!/usr/bin/env sh
chown root:root /etc/init.d/carte
chmod 744 /etc/init.d/carte
chkconfig --add carte
EOC
    echo "After install script for carte generated at after-install.sh"
    CARTEOPTIONS="--after-install=after-install.sh ./carte=/etc/init.d/carte"
else
    CARTEOPTIONS=""
    echo "No Carte init.d script present."
fi

# All good, let's build
echo "Build the effing rpm, removing existing rpms first..."
rm -f pentaho-pdi*rpm

# $CARTEOPTIONS is left unquoted on purpose, it must split into several arguments
fpm -s dir -t rpm \
    --name pentaho-pdi \
    --version "$PDIVERSION" \
    --depends jdk \
    --vendor 'me@thisdataguy.com' \
    --url 'https://github.com/pentaho/pentaho-kettle' \
    --description 'Pentaho pdi kettle' \
    --maintainer 'me@thisdataguy.com' \
    --license 'Apache 2.0' \
    --epoch 1 \
    --directories "$PDIROOT/$PDIDIR" \
    --rpm-user "$PDIUSER" \
    --rpm-group "$PDIUSER" \
    --architecture all $CARTEOPTIONS \
    "./$PDIDIR=$PDIROOT"

# Clean up the generated after-install script
rm -f after-install.sh
This will create an rpm named along the lines of pentaho-pdi-${PDIVERSION}-1.noarch.rpm (fpm should append an iteration, 1 by default), which you can just yum install or put in your yum repositories.