A Pentaho DI Project Readme
Install Java
Make sure Java is installed before installing Pentaho.
Install Pentaho Community Edition
Download a released build and unzip it.
Download page: https://wiki.pentaho.com/display/COM/Community+Edition+Downloads
Direct download link: https://jaist.dl.sourceforge.net/project/pentaho/Pentaho 8.0/client-tools/pdi-ce-8.0.0.0-28.zip
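Note:
To connect to a MySQL database, you need to download the MySQL Java driver (Connector/J) and place it in Pentaho's lib directory.
Download it from https://dev.mysql.com/downloads/connector/j/
Direct download link: https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.46.zip
Unzip it and put the driver JAR into Pentaho's lib directory.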
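The driver installation step can be sketched as a small shell function. This is only a sketch under assumptions: the helper name `install_mysql_driver` is hypothetical, the Connector/J archive is assumed to be already downloaded and unzipped, and the PDI install directory is assumed to contain a `lib` subdirectory.

```shell
# Copy MySQL Connector/J jar(s) from an unzipped archive into PDI's lib directory.
# install_mysql_driver is a hypothetical helper name, not part of Pentaho.
install_mysql_driver() {
  driver_dir="$1"   # directory produced by unzipping the Connector/J archive
  pdi_dir="$2"      # PDI install directory (assumed to contain lib/)

  if [ ! -d "$pdi_dir/lib" ]; then
    echo "lib directory not found under $pdi_dir" >&2
    return 1
  fi

  # Look for connector jars anywhere under the unzipped directory.
  jars=$(find "$driver_dir" -type f -name 'mysql-connector-java-*.jar')
  if [ -z "$jars" ]; then
    echo "no mysql-connector-java jar found under $driver_dir" >&2
    return 1
  fi

  # Copy every matching jar into lib/.
  find "$driver_dir" -type f -name 'mysql-connector-java-*.jar' \
    -exec cp {} "$pdi_dir/lib/" \;
}
```

It could be called as, for example, `install_mysql_driver ./mysql-connector-java-5.1.46 ./data-integration`, where both paths depend on where the archives were unzipped.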
Configuration
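Pentaho environment configuration
Create a .kettle directory in the project, and create the following files in it:
kettle.properties: defines environment variables;
repositories.xml: defines the repository information; in particular, change the repository path to the project directory.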
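The directory setup above might be scripted roughly as follows. This is a sketch, not the project's actual setup: the helper name `init_kettle_home` and the commented-out `EXAMPLE_DB_HOST` variable are hypothetical, and it assumes that Spoon has already saved repository definitions to `~/.kettle/repositories.xml`, which it copies as a starting point.

```shell
# Scaffold a project-local .kettle directory.
# init_kettle_home is a hypothetical helper name.
init_kettle_home() {
  project_dir="$1"
  kettle_dir="$project_dir/.kettle"
  mkdir -p "$kettle_dir" || return 1

  # Create kettle.properties only if it does not exist yet.
  if [ ! -f "$kettle_dir/kettle.properties" ]; then
    cat > "$kettle_dir/kettle.properties" <<'EOF'
# Project-level Kettle variables (the name below is a placeholder)
# EXAMPLE_DB_HOST=localhost
EOF
  fi

  # Reuse the repository definitions saved in the user's home, if any.
  # The repository path inside the copy still has to be edited by hand
  # so that it points at the project directory.
  if [ ! -f "$kettle_dir/repositories.xml" ] && [ -f "$HOME/.kettle/repositories.xml" ]; then
    cp "$HOME/.kettle/repositories.xml" "$kettle_dir/repositories.xml"
  fi
}
```

Existing files are left untouched, so running the function again does not overwrite edited configuration.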
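Running jobs from the command line with Kitchen
First set the OS environment variable KETTLE_HOME to the project directory, then run the corresponding command.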
Windows
kitchen.bat /file:C:\Users\shawnwang\Downloads\pentaho_project1\crm_init_job.kjb /level:Basic /param "p_use_local_tmp=0"
With a repository
kitchen.bat /rep:test_repo /job:crm_init_job /param "p_use_local_tmp=0"
Linux
kitchen.sh -file=/PRD/updateWarehouse.kjb -level=Minimal -param:p_use_local_tmp=0
With a repository
kitchen.sh -rep:test_repo -job:job_test1 -param:p_use_local_tmp=0
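On Linux, setting KETTLE_HOME and calling Kitchen against a repository can be wrapped in one function. The sketch below assumes `kitchen.sh` is on the PATH; the wrapper name `run_kitchen_job` and the `KITCHEN` override variable are hypothetical and exist only for illustration.

```shell
# Run a repository job with KETTLE_HOME pointing at the project directory.
# run_kitchen_job and the KITCHEN variable are hypothetical names.
run_kitchen_job() {
  if [ "$#" -lt 3 ]; then
    echo "usage: run_kitchen_job PROJECT_DIR REPO JOB [NAME=VALUE ...]" >&2
    return 2
  fi
  project_dir="$1"   # directory that contains .kettle
  repo="$2"          # repository name, e.g. test_repo
  job="$3"           # job name inside the repository
  shift 3            # remaining arguments are NAME=VALUE job parameters

  # Rewrite each NAME=VALUE argument as -param:NAME=VALUE.
  n=$#
  while [ "$n" -gt 0 ]; do
    set -- "$@" "-param:$1"
    shift
    n=$((n - 1))
  done

  # The subshell keeps KETTLE_HOME from leaking into the calling shell.
  (
    export KETTLE_HOME="$project_dir"
    "${KITCHEN:-kitchen.sh}" "-rep:$repo" "-job:$job" "$@"
  )
}
```

For example, `run_kitchen_job "$PWD" test_repo job_test1 p_use_local_tmp=0` corresponds to the repository command shown above, with KETTLE_HOME set to the current directory.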
Jobs in our case
J_exp_salesforce_files_data
Only pulls the Salesforce file data into a local temporary table (whether files are also generated is configurable).
kitchen.bat /rep:test_repo /job:J_exp_salesforce_files_data /param "p_gen_file=1"
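Parameters:
1. p_gen_file:
   1: generate files while importing the Files data
   0 (default): do not generate files; only import the Files data into the temporary table
2. p_inc_mode:
   1 (default): incremental mode; looks up the latest timestamp in the temporary table and starts from there
   0: full mode; starts from a default, very old timestamp
Example: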
./kitchen.sh -rep:test_repo -job:J_exp_salesforce_files_data -level=Basic
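To make the two flags harder to mistype, the parameter arguments for this job could be generated by a small helper. This is a sketch only: the function name `salesforce_export_params` is hypothetical, and the defaults simply mirror the ones documented above (p_gen_file=0, p_inc_mode=1).

```shell
# Print Kitchen -param arguments for J_exp_salesforce_files_data.
# salesforce_export_params is a hypothetical helper; it does not run Kitchen.
salesforce_export_params() {
  gen_file="${1:-0}"   # p_gen_file: 0 is the documented default
  inc_mode="${2:-1}"   # p_inc_mode: 1 is the documented default

  # Both parameters only accept 0 or 1.
  for v in "$gen_file" "$inc_mode"; do
    case "$v" in
      0|1) ;;
      *) echo "parameter values must be 0 or 1, got: $v" >&2; return 1 ;;
    esac
  done

  printf '%s\n' "-param:p_gen_file=$gen_file" "-param:p_inc_mode=$inc_mode"
}
```

A full-mode run that also generates files would then look like `./kitchen.sh -rep:test_repo -job:J_exp_salesforce_files_data $(salesforce_export_params 1 0)`.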
J_gen_files_from_salesforce_tmp_table
Generates files from the data in the local Salesforce temporary table.
Parameters:
1. p_begin_id: the minimum id to start processing from (default: 0)
kitchen.bat /rep:test_repo /job:J_gen_files_from_salesforce_tmp_table
crm_init_job
Pulls all Salesforce data into the local CRM system (full load).
Parameters:
p_use_local_tmp:
   1 (default): use the existing local Salesforce tmp file data
   0: first pull all Salesforce file data
kitchen.bat /rep:test_repo /job:crm_init_job /param "p_use_local_tmp=0"