Sometimes an ETL process needs to generate files in very specific, non-row-based formats. These can be standard files such as EDIFACT record files, or files in some ancient format that you need to feed to a legacy system. In this post I would like to show some techniques to create those files using Pentaho Kettle . . . → Read More: Writing custom output formats in Pentaho Kettle
Rich date dimensions in a data warehouse often enable sophisticated, business-relevant analytical queries. This post shows a way to generate a detailed date dimension table that includes fixed-date and variable-date holidays, working days, special events and week-of-year information using the Kettle ETL tool, also known as Pentaho Data Integration (PDI). . . . → Read More: Building a detailed Date Dimension with Pentaho Kettle
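To give a rough idea of what such a table contains, here is a minimal sketch in plain Python. It is not the Kettle transformation from the post; the function name, the column names and the way holidays are passed in are all assumptions made for illustration.

```python
from datetime import date, timedelta


def build_date_dimension(start, end, fixed_holidays=(), variable_holidays=()):
    """Yield one row (a dict) per calendar day from start to end, inclusive.

    fixed_holidays: (month, day) pairs that recur every year (hypothetical input).
    variable_holidays: full date objects for holidays that move (hypothetical input).
    """
    fixed = set(fixed_holidays)
    variable = set(variable_holidays)
    day = start
    while day <= end:
        is_holiday = (day.month, day.day) in fixed or day in variable
        is_weekend = day.isoweekday() >= 6  # 6 = Saturday, 7 = Sunday
        yield {
            "date": day,
            "year": day.year,
            "month": day.month,
            "day_of_week": day.isoweekday(),
            "iso_week": day.isocalendar()[1],
            "is_holiday": is_holiday,
            "is_working_day": not (is_weekend or is_holiday),
        }
        day += timedelta(days=1)


rows = list(
    build_date_dimension(
        date(2024, 12, 23),
        date(2025, 1, 1),
        fixed_holidays=[(12, 25), (1, 1)],
    )
)
```

Treating Saturday and Sunday as non-working days is itself an assumption; a real dimension would take the working week from business rules.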
Reliable location information is a valuable asset when looking at internet traffic. Among other uses, it can help with fraud prevention or with estimating foreign market potential. This article explains how you can look up location information for an IP address using Kettle and MaxMind’s free GeoIP database.
Edit: As Daniel Einspanjer points out, there’s a . . . → Read More: GeoIP lookup using MaxMind’s Country Database and Kettle
Kettle’s named parameters often enable very elegant solutions for ETL requirements. This post gives an introduction to the named parameters feature of Kettle.
It is safe to think of named parameters as variables with their initial values assigned before the ETL process starts. In fact, they are available as variables when the ETL process executes. Their values can be specified in the launch dialog inside Spoon or on the command line when running Kitchen or Pan. Continue reading Using Named Parameters in Kettle
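As a small illustration of the command-line route, the sketch below assembles a Kitchen invocation in Python. The job file name and the parameter names are hypothetical, and the `-param:NAME=value` option form is the one I understand Kitchen and Pan to accept; check it against the documentation of your PDI version before relying on it.

```python
def kitchen_command(job_file, params, kitchen="kitchen.sh"):
    """Build an argument list for launching a job with named parameters.

    job_file and the keys of params are placeholders for this example;
    kitchen is the launcher script name, assumed to be on the PATH.
    """
    args = [kitchen, f"-file={job_file}"]
    for name, value in params.items():
        # Each named parameter becomes one -param:NAME=value argument.
        args.append(f"-param:{name}={value}")
    return args


cmd = kitchen_command(
    "load_sales.kjb",
    {"START_DATE": "2024-01-01", "TARGET_SCHEMA": "staging"},
)
print(" ".join(cmd))
```

The resulting list can be handed to `subprocess.run`. Inside the job, the values would then be available as ordinary variables, for example `${START_DATE}`.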
Every now and then a BLOB field pops up during ETL work, mostly in migration work, I suppose. Dealing with BLOBs is usually vendor-specific, and your mileage may vary as to how well Kettle supports the BLOB types of your RDBMS. But if you are lucky and Kettle recognizes your binary type . . . → Read More: Writing BLOBs to Files