
Access side input in a non-anonymous DoFn

How do I access the elements of a side input when my class extends DoFn?

For example, say I have a ParDo transform like this:

PCollection<String> data = myData.apply("Get data",
    ParDo.of(new MyClass()).withSideInputs(myDataView));

And I have a class:

static class MyClass extends DoFn<String,String>
{
    //How to access side input here
}

c.sideInput() isn't working in this case.

Thanks.

rish0097 asked Dec 13 '22 21:12


2 Answers

In this case, the problem is that the processElement method in your DoFn does not have access to the PCollectionView instance in your main method.

You can pass the PCollectionView to the DoFn in the constructor:

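// Imports assume the Apache Beam Java SDK (suggested by the @ProcessElement annotation).
import java.io.IOException;
import java.util.List;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.values.PCollectionView;

class MyClass extends DoFn<String,String>
{
    // Handle to the side input; <..> stands for the view's type (List, Map, or anything else).
    private final PCollectionView<..> mySideInput;

    public MyClass(PCollectionView<..> mySideInput) {
        // Keep the view so processElement can look it up later.
        this.mySideInput = mySideInput;
    }

    @ProcessElement
    public void processElement(ProcessContext c) throws IOException
    {
        // Read the side input through the stored view (List, Map, or whatever type you need):
        List<..> sideInputList = c.sideInput(mySideInput);
    }
}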

You then pass the side input to the class when you instantiate it, and also register it as a side input with withSideInputs, like so:

p.apply(ParDo.of(new MyClass(mySideInput)).withSideInputs(mySideInput));

The reason is that an anonymous DoFn acts as a closure: its process method can access every object in the scope that encloses the DoFn, including the PCollectionView. A DoFn that is not anonymous has no such closure, so you need another way to pass the PCollectionView in.
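The closure point above can be shown without any pipeline code. The following is only a minimal sketch in plain Java, not Beam code: `ElementFn`, `NamedFn`, `runAnonymous`, `runNamed`, and `applyAll` are hypothetical names made up for illustration, and a plain `Map` stands in for the PCollectionView. The anonymous class reads the enclosing method's variable directly, while the named class has to receive it through its constructor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class ClosureVsConstructor {

    // Hypothetical stand-in for a DoFn: one method applied to each element.
    interface ElementFn {
        String process(String element);
    }

    // Named class: it cannot see the locals of the method that creates it,
    // so the lookup handle is passed in through the constructor and stored.
    static class NamedFn implements ElementFn {
        private final Map<String, String> lookup;

        NamedFn(Map<String, String> lookup) {
            this.lookup = lookup;
        }

        @Override
        public String process(String element) {
            return lookup.getOrDefault(element, "?");
        }
    }

    // Anonymous class: it captures the (effectively final) parameter `lookup`
    // from the enclosing method, so no constructor argument is needed.
    static List<String> runAnonymous(List<String> input, Map<String, String> lookup) {
        ElementFn fn = new ElementFn() {
            @Override
            public String process(String element) {
                return lookup.getOrDefault(element, "?");
            }
        };
        return applyAll(input, fn);
    }

    // Same behaviour, but the handle is handed over explicitly.
    static List<String> runNamed(List<String> input, Map<String, String> lookup) {
        return applyAll(input, new NamedFn(lookup));
    }

    // Applies the function to every element, in order.
    static List<String> applyAll(List<String> input, ElementFn fn) {
        List<String> out = new ArrayList<>();
        for (String element : input) {
            out.add(fn.process(element));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> lookup = Map.of("a", "1", "b", "2");
        List<String> input = List.of("a", "b", "c");
        System.out.println(runAnonymous(input, lookup)); // prints [1, 2, ?]
        System.out.println(runNamed(input, lookup));     // prints [1, 2, ?]
    }
}
```

Both variants produce the same result; the only difference is how the handle reaches the processing method, which mirrors passing the PCollectionView to MyClass through its constructor.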

Pablo answered Mar 03 '23 08:03


Although the answer above is correct, it is slightly incomplete.

Once you have implemented it, you also need to run your pipeline like this:

    p.apply(ParDo.of(new MyClass(mySideInput)).withSideInputs(mySideInput));

Haris Nadeem answered Mar 03 '23 09:03