Finding Duplicate Rows


Since Kettle 3.2, the Analytic Query transformation step has been able to peek forward and backward in the row stream. If you want to filter duplicate rows, you can do so by

1. sorting the rows on their key
2. fetching the previous row's key using the Analytic Query step
3. filtering out the duplicates where the current and previous keys are equal
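
The three steps above can be sketched outside of Kettle to show the logic. This is a minimal illustration in plain Python, not Kettle's API: the function name `filter_duplicates`, the dictionary-based rows, and the `id` key field are all assumptions made for the example. It sorts the rows, remembers the previous row's key (the role the Analytic Query step plays in the transformation), and keeps a row only when its key differs from the previous one.

```python
from typing import Any, Dict, Iterable, Iterator

# Sentinel meaning "no previous row yet" (hypothetical helper for this sketch).
_NO_PREVIOUS = object()


def filter_duplicates(rows: Iterable[Dict[str, Any]], key: str) -> Iterator[Dict[str, Any]]:
    """Yield the first row for each distinct key value, in key order."""
    # Step 1: sort the rows on their key so duplicates become adjacent.
    sorted_rows = sorted(rows, key=lambda row: row[key])

    # Step 2: carry the previous row's key along while walking the stream.
    previous_key = _NO_PREVIOUS
    for row in sorted_rows:
        current_key = row[key]
        # Step 3: drop the row if its key equals the previous row's key.
        if previous_key is _NO_PREVIOUS or current_key != previous_key:
            yield row
        previous_key = current_key


if __name__ == "__main__":
    sample = [
        {"id": 2, "name": "b"},
        {"id": 1, "name": "a"},
        {"id": 2, "name": "b2"},
        {"id": 1, "name": "a2"},
    ]
    for row in filter_duplicates(sample, "id"):
        print(row)
```

Because Python's sort is stable, the row that survives for each key is the one that appeared first in the input. Whether the same holds in a Kettle transformation depends on how the sort step is configured.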
