
Talend - generating n rows from 1 row

Tags: java, row, etl, talend

Background: I'm using Talend to do something that is (I guess) pretty common: generating multiple rows from one. For example:

ID | Name  | DateFrom   | DateTo
01 | Marco | 01/01/2014 | 04/01/2014

...could be split into:

new_ID | ID | Name  | DateFrom   | DateTo
01     | 01 | Marco | 01/01/2014 | 02/01/2014
02     | 01 | Marco | 02/01/2014 | 03/01/2014
03     | 01 | Marco | 03/01/2014 | 04/01/2014

The number of output rows is dynamic: it depends on the date period in the original row.

Question: how can I do this? Maybe using tSplitRow? I am going to check those periods with tJavaRow. Any suggestions?

asked Oct 14 '14 15:10 by abierto



1 Answer

Expanding on the answer given by Balazs Gunics

The first part is to calculate how many rows each input row will become. This is easy enough with a date diff function on the to and from dates.

[Screenshot: overall view of the job]

[Screenshot: calculating the number of rows per row]
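As a rough illustration of the date diff step: inside Talend this would typically be an expression along the lines of TalendDate.diffDate(row1.dateTo, row1.dateFrom, "dd") in a tMap (the row name row1 is an assumption, since the original screenshots are not available). The standalone sketch below shows the same arithmetic with java.time; the class and method names are hypothetical and exist only for this example.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

// Hypothetical helper, not part of Talend: counts the one-day rows
// that a single input row would be split into.
class IterationCount {
    // dd/MM/yyyy matches the date layout used in the question
    private static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("dd/MM/yyyy");

    static long iterations(String dateFrom, String dateTo) {
        LocalDate from = LocalDate.parse(dateFrom, FMT);
        LocalDate to = LocalDate.parse(dateTo, FMT);
        // Whole days from dateFrom (inclusive) to dateTo (exclusive)
        return ChronoUnit.DAYS.between(from, to);
    }

    public static void main(String[] args) {
        // Marco's row from the question: 01/01/2014 to 04/01/2014
        System.out.println(iterations("01/01/2014", "04/01/2014")); // prints 3
    }
}
```

The result is a long, which lines up with the cast to Long used when the value is read back from globalMap.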


Part 2 is to pass that value to a tFlowToIterate, and pick it up with a tJavaFlex that will use it in its start code to control a for loop:

tJavaFlex start:

// Values that tFlowToIterate stored in globalMap for the current input row
int currentId = (Integer) globalMap.get("out1.id");
String currentName = (String) globalMap.get("out1.name");
Long iterations = (Long) globalMap.get("out1.iterations");
java.util.Date dateFrom = (java.util.Date) globalMap.get("out1.dateFrom");

// Emit one output row per day in the period
for (int i = 0; i < iterations; i++) {

Main

  row2.id = currentId;
  row2.name = currentName;
  // Each generated row covers one day: [dateFrom + i, dateFrom + i + 1]
  row2.dateFrom = TalendDate.addDate(dateFrom, i, "dd");
  row2.dateTo = TalendDate.addDate(dateFrom, i + 1, "dd");

End

}

Sample output:

1|Marco|01-01-2014|02-01-2014
1|Marco|02-01-2014|03-01-2014
1|Marco|03-01-2014|04-01-2014
2|Polo|01-01-2014|02-01-2014
2|Polo|02-01-2014|03-01-2014
2|Polo|03-01-2014|04-01-2014
2|Polo|04-01-2014|05-01-2014
2|Polo|05-01-2014|06-01-2014
2|Polo|06-01-2014|07-01-2014
2|Polo|07-01-2014|08-01-2014
2|Polo|08-01-2014|09-01-2014
2|Polo|09-01-2014|10-01-2014
2|Polo|10-01-2014|11-01-2014
2|Polo|11-01-2014|12-01-2014
2|Polo|12-01-2014|13-01-2014
2|Polo|13-01-2014|14-01-2014
2|Polo|14-01-2014|15-01-2014
2|Polo|15-01-2014|16-01-2014
2|Polo|16-01-2014|17-01-2014
2|Polo|17-01-2014|18-01-2014
2|Polo|18-01-2014|19-01-2014
2|Polo|19-01-2014|20-01-2014
2|Polo|20-01-2014|21-01-2014
2|Polo|21-01-2014|22-01-2014
2|Polo|22-01-2014|23-01-2014
2|Polo|23-01-2014|24-01-2014
2|Polo|24-01-2014|25-01-2014
2|Polo|25-01-2014|26-01-2014
2|Polo|26-01-2014|27-01-2014
2|Polo|27-01-2014|28-01-2014
2|Polo|28-01-2014|29-01-2014
2|Polo|29-01-2014|30-01-2014
2|Polo|30-01-2014|31-01-2014
2|Polo|31-01-2014|01-02-2014
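
For anyone who wants to check the loop logic outside Talend, here is a minimal standalone sketch of the same expansion in plain Java. It is not the Talend job itself: the class name RowExpander and the method expand are made up for this example, and java.time stands in for TalendDate.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the tFlowToIterate + tJavaFlex pair
class RowExpander {
    // dd-MM-yyyy matches the layout of the sample output
    private static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("dd-MM-yyyy");

    // Returns one "id|name|dateFrom|dateTo" line per day between dateFrom and dateTo
    static List<String> expand(int id, String name, LocalDate dateFrom, LocalDate dateTo) {
        long iterations = ChronoUnit.DAYS.between(dateFrom, dateTo);
        List<String> rows = new ArrayList<>();
        for (long i = 0; i < iterations; i++) {
            LocalDate from = dateFrom.plusDays(i);
            LocalDate to = dateFrom.plusDays(i + 1);
            rows.add(id + "|" + name + "|" + from.format(FMT) + "|" + to.format(FMT));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Prints the three Marco lines shown in the sample output
        expand(1, "Marco", LocalDate.of(2014, 1, 1), LocalDate.of(2014, 1, 4))
                .forEach(System.out::println);
    }
}
```

If the new_ID column from the question is also needed, the loop counter plus one would be a natural candidate, although the job above does not produce it.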
answered Oct 14 '22 10:10 by Garlando