The Excel Writer plugin offers support for Excel template files that can be filled in a variety of ways using Kettle a.k.a. PDI. In this post I would like to show how to fill an Excel report template file that has pre-styled cells, formulas and charts on different sheets. The entire report is filled within a single transformation.
In order to follow along the tutorial you’ll need the Excel Writer plugin version 1.2. You will also need Kettle 4.1 or above since I will be using the “Block until steps finish” step introduced in Kettle 4.1. If you are using Kettle 3.2. or 4.0, don’t worry. There is an example transformation for those too. The “Block until steps finish” step has been replaced with a custom javascript step in this case. Get the example files to follow along.
The example report
The example report contains summary data for six months of sales and expenses. It contains three sheets.
- The “SourceData” sheet is where the numbers go. It contains the input fields, some calculated values and a pie chart.

- The “Metadata” sheet holds information about the report. It contains the generation date, the name of the transformation that generated it and the host it was generated on.

The template file
The template file for the report has been cleared of all input. The “SourceData” sheet needs to be filled in the highlighted spots now. The rest of the fields are calculated cells.
The “Metadata” sheet also needs some information filled in.
Filling the template file with Kettle
Looking at the data entry spots above it becomes clear, that there are 4 places to insert data.
- the year cell on the “SourceData” sheet
- the sales cells on the “SourceData” sheet
- the expenses cells on the “SourceData” sheet
- the cells on the “Metadata” sheet
There are four places to output data to, so there’s going to be four Excel Writer steps in the transformation. Now there’s a problem. Kettle transformations execute concurrently, so if all four steps started writing to the same Excel file, things would not work. Therefore it is necessary to ensure an order of execution. This can be done by creating four separate transformations and calling them one after another in a job, but this is not always convenient. More often than not related data that needs to go into different parts of an Excel document is produced naturally within a single transformation. Therefore in this tutorial the order is ensured by different means. Check out the transformation graph below.
Ensuring sequential order
The transformation writes to the Excel file in four steps, in the following order: year cell, sales cells, expenses cells, metadata cells. To make this work the Excel writer steps need to be configured to write to the same file, starting with a template file. To achieve this, the year writer step is configured to use the template file and create a new output file. The other Excel writer steps are configured to directly write into the (now existing) output file. In order for this to work it is necessary to also enable the “wait for first row before creating file” option on all the Excel writer steps. This simply makes sure that they start looking for the output file only after they see the first row to process. The idea is to delay that first row until all preceding writing operations on the output file are completed. This is where the “Block until steps finish” steps come in. The year writing step is first in line, it just takes the template file, creates the output file and writes into the year cell. The sales writer waits for the year writer to finish, the expense writer waits for the sales writer, and the metadata writer waits for the expense writer. This results in a nice chain of steps each waiting for their turn to do their work on the Excel report file. The result is a nice report with all the numbers filled in
If you are using Kettle 3.2 or 4.0, the “Block until steps finish” step is not available. In the example files I replaced this step with a custom javascript implementation of the same functionality. Other than that the transformations are identical.
It’s a question of style
Another aspect of filling an existing sheet with data is to leave the styles of existing cells unchanged. If anybody already did style the cells in the template, they should be left alone. So make sure the corresponding option is enabled on the Excel writer steps
Another task is to recalculate all formula cells of the workbook. Excel (depending on version, file type, and probably some other things) will not update the values of calculated cells automatically when opening the generated file. The Excel writer step can recalculate the formula cells too. The option is available on the content tab. But beware: some functions or naming types may not be supported by the underlying POI library yet, so you might run into errors. In these cases it might help to rephrase the formula, switch to the xls file type, or live with the fact that you have to hit Ctrl+Shift+Alt+F9 after you open the generated file for the first time to have Excel recalculate everything.
Note: if you’re missing these features on the Excel writer step dialog you’ll likely need to upgrade to version 1.2 of the Excel Writer step.
Downloads
Get the example files package. It contains the transformation for Kettle 4.1 and 4.0/3.2 along with all input and template files.
PS: Special thanks go to Dan Keeley for suggesting the “leave existing styles unchanged” and “recalculate formulas” features, and Dan Peacock for an elaborate version of the custom JS implementation of the “Wait for other step to finish” step.
PPS: dizzy brought up a common use case: print a “totals” line at the end of an output. The following sample demonstrates how to do it. It works on Kettle 4.1.x
Enjoy
Slawo







[...] This post was mentioned on Twitter by Matt Casters and André Simões, Slawomir Chodnicki. Slawomir Chodnicki said: Create Excel reports by filling data into templates using PDI | http://t.co/cDyauJp #pentaho #excel #kettle [...]
Hej Slawo,
Great post.
This kind of fits nicely together with the reporting step that Matt delivered some weeks ago. (After a lot of nagging from various people.)
KFF already includes a very simple Excel report based on the Excel writer step (v1). I guess this allows me to spice up things a bit using Excel.
I’ll be nice to have some visual reports on the behaviour of your job/trf after every run.
Thanks,
Jan
Hey Jan,
nice to hear the plugin finds its uses, especially with KFF
Keep up the good work. The plugin found its way into the 4.2 codebase btw, so from next release on it’s going to be included by default (yay)
Cheers
Slawo
Slawo,
One of the most practical plugin, it made me stick to Open Source. The Pentaho guys should buy to dinner.
Please keep on the good work.
Thomas
Hi Thomas,
thanks for the comment. Happy to see the plugin is used
As for the dinner, Pentaho guys *do* buy you dinner from time to time, just visit a pentaho community gathering like this one
Cheers
Slawo
What a great upgrade to the old existing excel writer, this totally rocks!
Great work Slawo, keep it up
Glad it works for you ITXpander
Can you add the feature “accept filename from field” as in the text output step?
Hi Marco,
since the Plugin went into Kettle trunk and will be available in Kettle 4.2 by default, it’s best to file feature requests in PDI Jira. Create an account if you don’t have one already, and file an improvement, specifying how the improvement should behave.
Cheers
Slawo
Slawo, thanks for the plugin, I’m happy it goes into the official package. Although I have one problem i cannot solve:
I have a list of products, where number of rows is always different. And I need to put some cells with excel’s SUM() function in the last row. It should sum all the values above. I think it’s the common task to put totals at the end of the list, so maybe you could think of a way to add this feature? I think the best option would be to change the footer in selected rows with excel’s aggregate functions. What do you think?
Hey dizzy,
Thanks for bringing that up, as it is indeed a common use case and it may not be obvious how it can be done. The step is designed with this in mind actually, but in a more generic way then being able to select a few aggregators. The idea is to write in two passes to the file. One pass writes the data, the second pass writes the formulas in a nice designated place that you’d usually want to design in the template. The technique is an extension of the “preserving custom footers” sample.
Get the demo transformation here: adding totals It requires Kettle 4.1 because it uses the blocker step again. You can work around using this step as described in the article, if you’re on an earlier version of Kettle
Cheers
Slawo
Oh, it’s much easier than I thought. Thanks!
dizzy
Hi Slawo,
thanks a lot for this aweasome plugin! I’ve been using it for about a month and it solves me a lot of little problem.
I’ve got only one problem, it seems from the POI library:
2011/03/24 10:06:54 – ViaResNullo.0 – ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : Error opening new file
2011/03/24 10:06:54 – ViaResNullo.0 – ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : java.io.IOException: Unable to read entire header; 0 bytes read; expected 32 bytes
2011/03/24 10:06:54 – ViaResNullo.0 – ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : org.apache.poi.poifs.storage.HeaderBlockReader.alertShortRead(HeaderBlockReader.java:176)
2011/03/24 10:06:54 – ViaResNullo.0 – ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : org.apache.poi.poifs.storage.HeaderBlockReader.(HeaderBlockReader.java:103)
2011/03/24 10:06:54 – ViaResNullo.0 – ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : org.apache.poi.poifs.filesystem.POIFSFileSystem.(POIFSFileSystem.java:153)
2011/03/24 10:06:54 – ViaResNullo.0 – ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : org.apache.poi.hssf.usermodel.HSSFWorkbook.(HSSFWorkbook.java:305)
2011/03/24 10:06:54 – ViaResNullo.0 – ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : org.apache.poi.hssf.usermodel.HSSFWorkbook.(HSSFWorkbook.java:286)
2011/03/24 10:06:54 – ViaResNullo.0 – ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : org.typeexit.plugin.excel.writer.ExcelWriterStep.prepareNextOutputFile(ExcelWriterStep.java:648)
2011/03/24 10:06:54 – ViaResNullo.0 – ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : org.typeexit.plugin.excel.writer.ExcelWriterStep.processRow(ExcelWriterStep.java:88)
2011/03/24 10:06:54 – ViaResNullo.0 – ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : org.pentaho.di.trans.step.RunThread.run(RunThread.java:40)
2011/03/24 10:06:54 – ViaResNullo.0 – ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : java.lang.Thread.run(Unknown Source)
Can you help me solve this?
Thanks,
Andrea
P.s. Sorry for my bad english
Hey Andrea,
sure, I’ll look into it. Can you prepare a short transformation that shows the problem and send it to slawomir dot chodnicki at googlemail dot com ?
Cheers
Slawo
Hey,
Thanks to a workmate, i fix the problem: the .xls output file was probably corrupted, so I tried to delete it and it works.
It would be nice if you have the possibility to delete a specific sheet with this plugin, so there might be an excel writer 1.3 release?
I think it would be a good improvement.
Thanks anyway to your helpfulness,
Andrea
Hi,
is there a way to set the sheet name from a field?
/Alexander
Hi,
We are running transformations through jboss by including the supporting jars of version 3.2 in lib. Now have added the Excelwriter jars in lib.But getting the following error
Error reading object from XML file
Unable to load step info from XML step nodeorg.pentaho.di.core.exception.KettleStepLoaderException:
Unable to load class for step/plugin with id [TypeExitExcelWriterStep]. Check if the plugin is available in the plugins subdirectory of the Kettle distribution.
Kindly help.
Hi Alexander,
no, it is not possible to specify a sheet name from a field. The complexity of making all other features of the plugin work with unpredictable sheet switching is somewhat frightening
Cheers
Slawo
Hi Thilian,
Kettle 3.2. looks for step plugins in the following folders:
plugins/steps (may be relative to your jboss app?)
$HOME/.kettle/plugins/steps
if $KETTLE_HOME is set:
$KETTLE_HOME/.kettle/plugins/steps
You can also do your plugin loading yourself. Check how org.pentaho.di.trans.StepLoader is used.
Cheers
Slawo
Dear Slawo,
Thanks for your quick reply.
Even though ve included the path of the excelwriter.jar in kettle-plugins.xml,Still I m facing same error..:( Am I missing something w.r.to configuration.If so,Kindly guide me.
regards,
Thilan
[...] Release notes and usability post. [...]
I’m sorry Thilian,
I don’t have any more information about plugin loading than I already gave. Best to put the plugin folder into a place it is expected or do the plugin loading manually.
Cheers
Slawo
Hello Slawomir,
am trying to put a formula in a cell, this is the formula content:
‘here_path_to_another_excel_book(by the way very long path)\[here_another_excel_book.xls]sheet’!$here_column$here_row
For some reason that i can not understand POI seems to be parsing wrongly the URI. I think that the problem is de ‘ character. If i say to the step that the field dont have a formula it fill the cell with the desired string but not like a formula, just like string. Any idea?
Hey Rolando,
sadly I don’t recall anything off the top of my head. I’d suggest opening up a bug with a reproduction path over at http://jira.pentaho.com/browse/PDI and we’ll have a look.
Best
Slawo