In this post I’d like to demonstrate a technique for creating dynamic Kettle transformations. What do I mean by dynamic in this case? Imagine a transformation dynamically creating or changing another transformation before it is executed. Why would anyone want to do this? In this post I’ll take up the example of crating crosstabs in Kettle, . . . → Read More: Write ETL that writes ETL – Creating Crosstabs with Kettle
In this post I would like to announce the release of the Ruby Scripting Step v1.0 for the community edition of Kettle 4.x. Grab it from the downloads section of the github page.
The plugin allows to easily put ruby into your ETL. It is bundled with many samples so it should be easy to get . . . → Read More: Releasing the Ruby Scripting Step for Kettle
This article shows how to quickly load a simple Mondrian cube into LucidDB and how to view the cube in several OLAP front-ends. If you would like to try the column-oriented DB for your OLAP, this post may help to get up and running.
Get your copy of LucidDB from luciddb.org and set it up . . . → Read More: Taking LucidDB for an OLAP Test-Drive
When doing ETL work every now and then the exact SQL query you want to execute depends on some input parameters determined at runtime. This requirement comes up most frequently when SELECTing data. This article shows the techniques you can employ with the “Table Input” step in PDI to make it execute dynamic or parametrized queries. . . . → Read More: Dynamic SQL Queries in PDI a.k.a. Kettle
Anybody who finds themselves working on a client’s environment will usually face the fact that access to databases is restricted to what’s absolutely required to get the job done. The source files and target systems will be available, but creating helper tables or databases may be completely out of the question, or it may involve overcoming . . . → Read More: Using an on-demand in-memory SQL database in PDI
ETL processes sometimes need to generate data, even if there’s no input. This may be a bit puzzling at times, since an usual ETL row stream produces nothing if there’s no input. In some scenarios an ETL process is supposed to generate some sort of aggregation, which implies it should report a value of 0 even . . . → Read More: Processing the void – detecting and handling empty row streams in PDI
Tracking the progress of a transformation in PDI Spoon usually involves closely observing the numbers displayed in the step metrics tab. The step metrics tab dispays information about each step’s processed rows and input/output buffers, which makes it an important tool for understanding step performance at a glance as well as understanding the progress of a . . . → Read More: Tracking Transformation Progress becomes easier in PDI
Relational applications often model generic hierarchies of variable depth (tree-like structures) by maintaining a parent id that points to the immediate parent of the record. This approach is called the “Adjacency List” model. This article covers how to effectively analyze such hierarchies using a bridge table and how to create a bridge table using Kettle a.k.a. . . . → Read More: Analyzing Hierarchical Data Using Bridge Tables
The Excel Writer plugin offers support for Excel template files that can be filled in a variety of ways using Kettle a.k.a. PDI. In this post I would like to show how to fill an Excel report template file that has pre-styled cells, formulas and charts on different sheets. The entire report is filled within a . . . → Read More: Using the Excel Writer Step