Wednesday, December 26, 2012

Pentaho Data Integration Repositories & Commands to run transformations and Jobs


We can save Kettle jobs and transformations in three types of repositories:
 1. File Repository
 2. Database Repository
 3. Enterprise Repository (this repository can be created only in Pentaho Enterprise Edition)

          Below are the commands to run transformations and jobs saved in each of the three repository types, for Windows as well as Linux.
 
    To run jobs, use kitchen.bat (Windows) or kitchen.sh (Linux).
    To run transformations, use pan.bat (Windows) or pan.sh (Linux).

 Follow these steps to run the commands:

 1. Open the command prompt.
 2. Go to the tool's home directory.
   $> cd <data-integration-home>
      For me, it is c:\pentaho\design-tools\data-integration.
      ex : cd c:\pentaho\design-tools\data-integration
 3. Now run the command.

Windows :

 File Repository: 
      pan.bat /file:"C:\Users\Suribabu\Desktop\Pentaho\repository\Trans\Example.ktr" /level:Basic 
      kitchen.bat /file:"C:\Users\Suribabu\Desktop\Pentaho\repository\Jobs\Example.kjb" /level:Basic 

 Database Repository :
        pan.bat /rep:"DB_Rep" /trans:"Example" /dir:/Trans/ /user:admin /pass:admin /level:Basic 
        kitchen.bat /rep:"DB_Rep" /job:"Example" /dir:/Jobs/ /user:admin /pass:admin /level:Basic 

 Enterprise Repository:
        pan.bat /rep:"DB_Rep" /trans:"Example" /dir:/public/Trans/ /user:joe /pass:password /level:Basic
        kitchen.bat /rep:"DB_Rep" /job:"Example" /dir:/public/Jobs/ /user:joe /pass:password /level:Basic


 Linux : 
File Repository:
     pan.sh -file="/home/suri/repository/Trans/Example.ktr" -level=Minimal
     kitchen.sh -file="/home/suri/repository/Jobs/Example.kjb" -level=Minimal

 Database Repository : 
       pan.sh -rep="DB_Rep" -trans="Example" -dir=/Trans/ -user=joe -pass=password -level=Basic
       kitchen.sh -rep="DB_Rep" -job="Example" -dir=/Jobs/ -user=joe -pass=password -level=Basic

 Enterprise Repository:
       pan.sh -rep="DB_Rep" -trans="Example" -dir=/public/Trans/ -user=joe -pass=password -level=Basic
       kitchen.sh -rep="DB_Rep" -job="Example" -dir=/public/Jobs/ -user=joe -pass=password -level=Basic
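
 The Linux repository commands all share the same shape, so a small helper can assemble the options. This is only a sketch: pdi_repo_args is a hypothetical helper name, not part of PDI, and it assumes the -option=value form that pan.sh and kitchen.sh accept on Linux. The repository name, directories and credentials used with it are the example values from this post.

```shell
#!/bin/sh
# Hypothetical helper (not part of PDI): prints the options for running a
# transformation or job stored in a repository, in -option=value form.
# Usage: pdi_repo_args <trans|job> <repository> <name> <directory> <user> <password> [level]
# Note: values must not contain spaces, because the caller word-splits the result.
pdi_repo_args() {
    kind=$1; rep=$2; name=$3; dir=$4; user=$5; pass=$6
    level=${7:-Basic}   # default to the log level used in this post
    case "$kind" in
        trans|job) ;;
        *) echo "kind must be 'trans' or 'job'" >&2; return 1 ;;
    esac
    printf '%s\n' "-rep=$rep -$kind=$name -dir=$dir -user=$user -pass=$pass -level=$level"
}
```

 From the data-integration directory it could be used as ./pan.sh $(pdi_repo_args trans DB_Rep Example /Trans/ joe password), or with kitchen.sh and job for a job.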

 You can schedule the above commands directly, or you can write a wrapper script around them as follows and schedule that instead:

example.bat (for Linux, write the equivalent example.sh using pan.sh)
REM cd <data-integration-home>
cd /d c:\pentaho\design-tools\data-integration
pan.bat /file:"C:\Users\Suribabu\Desktop\Pentaho\repository\Trans\Example.ktr" /level:Basic


 Now you can run and schedule the example.bat or example.sh file for your ETL transformation.
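
 For Linux, a wrapper in the same spirit might look like the sketch below. It is an illustration rather than a drop-in script: PDI_HOME and KTR_FILE are assumed variable names whose defaults are placeholders (the install path in particular is made up), and run_trans and describe_exit are hypothetical function names. The exit-status meanings are the ones commonly documented for Pan; verify them against your PDI version before relying on them.

```shell
#!/bin/sh
# Assumed locations (placeholders); override them in the environment.
PDI_HOME=${PDI_HOME:-/opt/pentaho/design-tools/data-integration}
KTR_FILE=${KTR_FILE:-/home/suri/repository/Trans/Example.ktr}

# Map a Pan exit status to a short message. The meanings are an assumption
# based on the documented Pan status codes; check them for your version.
describe_exit() {
    case "$1" in
        0) echo "success" ;;
        1) echo "errors occurred during processing" ;;
        2) echo "unexpected error during loading or running" ;;
        3) echo "unable to prepare and initialize the transformation" ;;
        7) echo "could not load the transformation from XML or the repository" ;;
        8) echo "error loading steps or plugins" ;;
        9) echo "command line usage printed" ;;
        *) echo "unknown exit status $1" ;;
    esac
}

# Run the transformation from the PDI home directory and report the result.
run_trans() {
    if [ ! -d "$PDI_HOME" ]; then
        echo "PDI home not found: $PDI_HOME" >&2
        return 127
    fi
    # The subshell keeps the directory change local to this call.
    ( cd "$PDI_HOME" && ./pan.sh -file="$KTR_FILE" -level=Basic )
    status=$?
    echo "pan.sh finished: $(describe_exit "$status")"
    return "$status"
}
```

 Save it as example.sh and add a call to run_trans as the last line. A cron entry such as 0 2 * * * /path/to/example.sh (the path is a placeholder) would then run it every night at 02:00.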

4 comments:

  1. Hi Suribabu,

    If I want to execute this through the UI, how should I pass the parameters?

    At the moment I am passing the path below, but it is not working.
    If I first go to the data-integration directory and then run the rest of the command, it works properly.

    C:\data-integration\pan.bat /rep:"BEAT" /trans:Trans_Aggrinput_RespondentDetails_Part1 /dir:/home/datamatics/ /user:datamatics /pass:datamatics /level:Basic

  2. Thanks a lot, Suribabu. It really helped and saved me time.
    Please keep posting more about PDI.

  3. Thank you for writing such a good article on Pentaho Data Integration. Get some more details on Pentaho Data Integration Pentaho Consulting

  4. Thanks a lot for the information. When I try to execute, I get the error below.
    RepositoriesMeta - Reading repositories XML file: /root/.kettle/repositories.xml
    Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.pentaho.commons.launcher.Launcher.main(Launcher.java:92)
    Caused by: java.lang.NullPointerException
    at org.pentaho.di.repository.kdr.KettleDatabaseRepository.disconnect(KettleDatabaseRepository.java:1723)
    at org.pentaho.di.pan.Pan.main(Pan.java:452)


    Any idea about this error?
    Is there any documentation on executing jobs from a database repository?
