I am looking to convert an ISO timestamp to the format yyyy-MM-dd HH:mm:ss.SSS, but I have not been able to achieve the conversion. I am new to Pig and am trying to write a UDF to handle the conversion.
Kindly guide me. I tried Pig's built-in functions (FORMAT, DATE_FORMAT) but was not able to convert the data to the needed format.
Current data format: 2013-08-22T13:23:18.226220+01:00
Required Data format: 2013-08-22 13:23:18.226
import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;
import org.joda.time.DateTime;
import org.joda.time.format.DateTimeFormat;
import org.joda.time.format.DateTimeFormatter;
import org.joda.time.format.ISODateTimeFormat;

public class test extends EvalFunc<String> {
    // Parses ISO 8601 input and keeps the offset found in the string (e.g. +01:00).
    private static final DateTimeFormatter ISO_PARSER =
            ISODateTimeFormat.dateTimeParser().withOffsetParsed();
    // MM = month and HH = hour of day (0-23); lowercase mm means minutes.
    private static final DateTimeFormatter OUTPUT =
            DateTimeFormat.forPattern("yyyy-MM-dd HH:mm:ss.SSS");

    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;
        }
        try {
            String time = (String) input.get(0);
            return getTimedt(ISO_PARSER.parseDateTime(time));
        } catch (IllegalArgumentException e) {
            // Joda-Time signals an unparseable string with IllegalArgumentException.
            return null;
        }
    }

    // Formats the parsed value, truncating the fraction to milliseconds.
    private String getTimedt(DateTime d_t) {
        return OUTPUT.print(d_t);
    }
}
How can I deal with date conversions in Pig?
With Pig 0.11.1, a UDF is not required to convert from ISO 8601 format to yyyy-MM-dd HH:mm:ss.SSS format. The following example shows how to convert a column of ISO 8601 dates into yyyy-MM-dd HH:mm:ss.SSS strings (the date field is assumed to be of type datetime):
converted_dates = FOREACH input_dates GENERATE ToString(date,'yyyy-MM-dd HH:mm:ss.SSS') as date:chararray;
NOTE:
I don't think the ToString function is documented. I guessed at this usage from this Google Summer of Code proposal:
http://www.google-melange.com/gsoc/proposal/review/google/gsoc2012/zjshen/21002
where the following function is mentioned as needing to be converted from a piggybank UDF into a built-in.
String ToString(DateTime d, String format)
My guess is that it was converted but hasn't made its way into the main documentation yet. Here is the class documentation for the ToString built-in:
http://pig.apache.org/docs/r0.11.1/api/org/apache/pig/builtin/ToString.html
But the ToString function is missing from Apache's Pig function documentation here:
http://pig.apache.org/docs/r0.11.1/func.html
2013-08-22T13:23:18.226220+01:00 is in XSD dateTime format, and it should be parsed this way:
XMLGregorianCalendar xc = DatatypeFactory.newInstance().newXMLGregorianCalendar("2013-08-22T13:23:18.226220+01:00");
From the XMLGregorianCalendar you can get a GregorianCalendar and then a java.util.Date:
GregorianCalendar gc = xc.toGregorianCalendar();
Date date = gc.getTime();
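Putting the steps above together, here is a minimal sketch of the full conversion to the required output format. The class name XsdDateTimeConverter and the method name convert are hypothetical, chosen only for illustration. Printing in the offset carried by the input (rather than the JVM's default time zone) is an assumption based on the expected output in the question.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;
import javax.xml.datatype.DatatypeConfigurationException;
import javax.xml.datatype.DatatypeFactory;
import javax.xml.datatype.XMLGregorianCalendar;

// Hypothetical helper class; the name is illustrative only.
final class XsdDateTimeConverter {

    // Converts an XSD dateTime string to "yyyy-MM-dd HH:mm:ss.SSS",
    // keeping the wall-clock time of the offset given in the input.
    static String convert(String xsdDateTime) {
        XMLGregorianCalendar xc;
        try {
            xc = DatatypeFactory.newInstance().newXMLGregorianCalendar(xsdDateTime);
        } catch (DatatypeConfigurationException e) {
            throw new IllegalStateException("No DatatypeFactory available", e);
        }
        // The fractional second is truncated to milliseconds at this step.
        GregorianCalendar gc = xc.toGregorianCalendar();
        Date date = gc.getTime();

        SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS", Locale.ROOT);
        // Assumption: print in the input's own offset, not the JVM default zone.
        out.setTimeZone(gc.getTimeZone());
        return out.format(date);
    }

    public static void main(String[] args) {
        System.out.println(convert("2013-08-22T13:23:18.226220+01:00"));
        // prints 2013-08-22 13:23:18.226
    }
}
```

Inside a Pig UDF, the body of exec could delegate to a method like this one. SimpleDateFormat is only used for output here, so the fractional-second parsing problem does not arise.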
Note that 226220 is a fractional second. If you try to parse it with SimpleDateFormat as SSS, it will be read as 226220 milliseconds, which is 226 seconds and 220 ms instead of 0.226220 seconds.
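The effect is easy to reproduce. The sketch below is only a demonstration of the pitfall, not a recommended approach: it parses the timestamp (without its offset, and with both formatters pinned to UTC to keep the result independent of the local zone) using an SSS pattern and prints the result. The class name FractionPitfall and the method name reformat are hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class; the name is illustrative only.
final class FractionPitfall {
    private static final TimeZone UTC = TimeZone.getTimeZone("UTC");

    // Parses with an SSS pattern and prints the result in the target format.
    static String reformat(String text) {
        SimpleDateFormat in = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSS", Locale.ROOT);
        SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS", Locale.ROOT);
        in.setTimeZone(UTC);
        out.setTimeZone(UTC);
        try {
            // All six digits are read as a millisecond count; the lenient
            // calendar then rolls the excess over into seconds and minutes.
            Date parsed = in.parse(text);
            return out.format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable timestamp: " + text, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(reformat("2013-08-22T13:23:18.226220"));
        // prints 2013-08-22 13:27:04.220 (226.220 s were added to 13:23:18)
        System.out.println(reformat("2013-08-22T13:23:18.226"));
        // prints 2013-08-22 13:23:18.226
    }
}
```

With exactly three fractional digits the round trip is correct, which is why the problem only shows up on microsecond-precision input.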