The Problem

How do you output multiple files with dynamic contents and unique names in KNIME without hardcoding filters or building multiple branches? Loops. You need loops.

Coming from Alteryx, which doesn’t have loops (you have to use macros to achieve the same effect), it took me a while to understand KNIME loops. But once mastered, they give you lots of flexibility and are much easier to understand and debug than macros, since they follow the normal design rules and are visible on the canvas at all times.

An Example Dataset

Say you have a dataset as shown below and you want to output the data to separate files, with one Collection No_ per file. The dataset also changes, so next time there will be new values in Collection No_, but you still want the export to work.

How can you do this using KNIME?

The Loop Solution

The solution is to use a Group Loop, which runs once per Collection No_ group. This means that on each run of the loop we only see the records for one Collection No_ inside the loop. On the next iteration we’ll see the next Collection No_, and so on until the full dataset is processed.

In the loop we do the following steps:

  1. Use Group Loop Start set on Collection No_
  2. Get a single row with the Collection No_ using a GroupBy node (to be used for the output file name)
  3. Create a dynamic string to use as the destination path for the file using the Collection No_ value and a String Manipulation node
  4. Convert the string to a Path using String to Path
  5. Make the Path column into a variable using Table Column to Variable
  6. Feed the loop data into a CSV Writer node and use the Path variable to change the filename dynamically
  7. Connect the Loop End to the CSV Writer and Group Loop Start
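For readers who think in code, the steps above can be sketched in Python. This is only an analogy for what the loop does, not how KNIME implements it. The function name `write_groups`, the `Item` column and the sample values are hypothetical, and the file name is assumed to be just the group value plus `.csv`.

```python
import csv
from collections import defaultdict
from pathlib import Path


def write_groups(rows, group_column, out_dir):
    """Write one CSV per distinct value of group_column; return the paths written."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Rough equivalent of Group Loop Start: bucket the rows by the grouping column.
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_column]].append(row)

    written = []
    for value, group_rows in groups.items():  # one iteration per group
        # Steps 2-5: build the destination path from the group value.
        path = out_dir / f"{value}.csv"
        # Step 6: write only this group's rows to that path.
        with path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(group_rows[0].keys()))
            writer.writeheader()
            writer.writerows(group_rows)
        written.append(path)
    return written


if __name__ == "__main__":
    # Hypothetical sample data; the real dataset will have different columns.
    sample = [
        {"Collection No_": "C001", "Item": "A"},
        {"Collection No_": "C002", "Item": "B"},
        {"Collection No_": "C001", "Item": "C"},
    ]
    for written_path in write_groups(sample, "Collection No_", "output"):
        print(written_path)
```

As in the workflow, nothing is hardcoded per group: new values in Collection No_ simply produce new files. The sketch assumes the group values are safe to use as file names; values containing characters such as slashes would need cleaning first.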

The final workflow looks like this.

KNIME File Output Loop

In plain English, the workflow does this on each iteration:

  • Uses a Group Loop to work on only one group of records at a time
  • Extracts the grouping item to be used as the file name. This isn’t necessary if you don’t need the grouped value in the filename; you could create a dynamic timestamp instead.
  • Writes the records to CSV using the dynamic name fed in via a variable
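The timestamp alternative mentioned above could look something like the following Python sketch. The helper name `timestamped_path`, the `export` prefix and the timestamp format are all assumptions chosen for illustration.

```python
from datetime import datetime
from pathlib import Path


def timestamped_path(out_dir, prefix="export", now=None):
    """Return out_dir/prefix_YYYYMMDD_HHMMSS.csv for the given (or current) time."""
    now = now or datetime.now()
    return Path(out_dir) / f"{prefix}_{now:%Y%m%d_%H%M%S}.csv"


if __name__ == "__main__":
    print(timestamped_path("output"))
```

One caveat with this approach: a timestamp with one-second resolution is probably not unique inside a fast loop, so two iterations finishing in the same second would write to the same file. Adding the iteration number or the group value to the name avoids that.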

You could, of course, add much more data manipulation between the loop start and end, but this gives you the basic template for dynamically outputting files.
