
Comparing objects using PAssert containsInAnyOrder in Apache Beam

While writing unit tests for my Beam pipeline using PAssert, the pipeline outputs objects fine, but the test fails during comparison with the following assertion error:

java.lang.AssertionError: Decode pubsub message/ParMultiDo(DecodePubSubMessage).output: 
Expected: iterable with items [<PubsubMessage{message=[123, 34, 104...], attributes={messageId=2be485e4-3e53-4468-a482-a49842b87ed5, dataPipelineId=bc957aa3-17e7-46d6-bc73-0924fa5674fa, region=us-west1, ingestionTimestamp=2020-02-02T12:34:56.789Z}, messageId=null}>] in any order
     but: not matched: <PubsubMessage{message=[123, 34, 104...], attributes={messageId=2be485e4-3e53-4468-a482-a49842b87ed5, dataPipelineId=bc957aa3-17e7-46d6-bc73-0924fa5674fa, region=us-west1, ingestionTimestamp=2020-02-02T12:34:56.789Z}, messageId=null}>

I also tried wrapping expectedOutputPubSubMessage in a list (the original output is apparently an array), to no avail. All the PAssert examples in the documentation do a simple string or key-value comparison.

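import java.io.Serializable;
import java.util.Collections;
import java.util.HashMap;
import org.apache.beam.sdk.io.gcp.pubsub.PubsubMessage;
import org.apache.beam.sdk.testing.PAssert;
import org.apache.beam.sdk.testing.TestPipeline;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.PCollection;
import org.junit.Rule;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.powermock.modules.junit4.PowerMockRunner;

// The TEST_* and *_NAME constants are not shown in the question.
@RunWith(PowerMockRunner.class)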
public class DataDecodePipelineTest implements Serializable {

  @Rule
  public TestPipeline p = TestPipeline.create();

  @Test
  public void testPipeline(){
      PubsubMessage inputPubSubMessage =
              new PubsubMessage(
                      TEST_ENCODED_PAYLOAD.getBytes(),
                      new HashMap<String, String>() {
                          {
                              put(MESSAGE_ID_NAME, TEST_MESSAGE_ID);
                              put(DATA_PIPELINE_ID_NAME, TEST_DATA_PIPELINE_ID);
                              put(INGESTION_TIMESTAMP_NAME, TEST_INGESTION_TIMESTAMP);
                              put(REGION_NAME, TEST_REGION);
                          }
                      });

      PubsubMessage expectedOutputPubSubMessage =
              new PubsubMessage(
                      TEST_DECODED_PAYLOAD.getBytes(),
                      new HashMap<String, String>() {
                          {
                              put(MESSAGE_ID_NAME, TEST_MESSAGE_ID);
                              put(DATA_PIPELINE_ID_NAME, TEST_DATA_PIPELINE_ID);
                              put(INGESTION_TIMESTAMP_NAME, TEST_INGESTION_TIMESTAMP);
                              put(REGION_NAME, TEST_REGION);
                          }
                      });

      PCollection<PubsubMessage> input =
              p.apply(Create.of(Collections.singletonList(inputPubSubMessage)));

      PCollection<PubsubMessage> output =
              input.apply("Decode pubsub message",
                      ParDo.of(new DataDecodePipeline.DecodePubSubMessage()));

      PAssert.that(output).containsInAnyOrder(expectedOutputPubSubMessage);
      
      p.run().waitUntilFinish();
  }
}

Someone apparently faced the exact same issue years ago, and it remains unresolved: "Test pipeline comparing objects using PAssert containsInAnyOrder()".
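
The assertion error prints the expected and actual messages identically, which suggests the mismatch is in equals() rather than in the visible fields. Note also that the printed payload is truncated ("[123, 34, 104..."), so a difference further into the bytes would not show up in the message. The plain-Java sketch below illustrates both possibilities with a hypothetical stand-in class (`Message` is not Beam's PubsubMessage, and the payload strings are made up); whether either applies to this pipeline is an assumption, not something confirmed here.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Map;

class EqualityPitfall {
    // Hypothetical stand-in for a message type; this is NOT Beam's PubsubMessage.
    static final class Message {
        final byte[] payload;
        final Map<String, String> attributes;

        Message(String payload, Map<String, String> attributes) {
            this.payload = payload.getBytes(StandardCharsets.UTF_8);
            this.attributes = attributes;
        }

        // Deliberately no equals()/hashCode(): Object.equals compares references.

        @Override
        public String toString() {
            // Show only the first three payload bytes, like "[123, 34, 104...]".
            String head = Arrays.toString(Arrays.copyOf(payload, 3)).replace("]", "...]");
            return "Message{message=" + head + ", attributes=" + attributes + "}";
        }
    }

    public static void main(String[] args) {
        Map<String, String> attrs = Map.of("region", "us-west1");
        Message expected = new Message("{\"hello\":1}", attrs);
        Message sameBytes = new Message("{\"hello\":1}", attrs);
        Message otherBytes = new Message("{\"help\":1}", attrs);

        // Both print the same text, because only the first bytes are shown.
        System.out.println(expected.toString().equals(otherBytes.toString())); // true

        // Identity-based equals() rejects a distinct instance with the same content.
        System.out.println(expected.equals(sameBytes)); // false

        // A content comparison separates the two cases.
        System.out.println(Arrays.equals(expected.payload, sameBytes.payload));  // true
        System.out.println(Arrays.equals(expected.payload, otherBytes.payload)); // false
    }
}
```

If the first case applies, the matcher needs a type with content-based equals(); if the second applies, the decoded payload itself differs from the expected one past the bytes that are printed.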

asked Nov 15 '22 06:11

Zain Qasmi


1 Answer

Just pass the expectedOutputPubSubMessage inside an array:

PAssert
    .that(output)
    .containsInAnyOrder(new PubsubMessage[] { expectedOutputPubSubMessage });

answered May 10 '23 13:05

Deniz Acay
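
If the assertion still fails, one option is to compare by content instead of relying on equals(), for example inside the function passed to PAssert's satisfies(...), reading the bytes with getPayload() and the attributes with getAttributeMap(). This is an assumption about what might help, not something the answer above claims. The sketch below shows only the comparison logic in plain Java; `ContentCheck`, `sameContent`, and `firstDifference` are hypothetical names, the sample payloads are made up, and the Beam wiring is left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Map;

class ContentCheck {
    /** True when the payload bytes and the attribute maps match by content. */
    static boolean sameContent(byte[] expectedPayload, Map<String, String> expectedAttributes,
                               byte[] actualPayload, Map<String, String> actualAttributes) {
        return Arrays.equals(expectedPayload, actualPayload)
                && expectedAttributes.equals(actualAttributes);
    }

    /** Index of the first differing byte, or -1 when the arrays are equal (Java 9+). */
    static int firstDifference(byte[] expectedPayload, byte[] actualPayload) {
        return Arrays.mismatch(expectedPayload, actualPayload);
    }

    public static void main(String[] args) {
        byte[] expected = "{\"hello\":1}".getBytes(StandardCharsets.UTF_8);
        byte[] actual = "{\"help\":1}".getBytes(StandardCharsets.UTF_8);
        Map<String, String> attrs = Map.of("region", "us-west1");

        // A copy with identical bytes matches; a payload that differs later does not.
        System.out.println(sameContent(expected, attrs, expected.clone(), attrs)); // true
        System.out.println(sameContent(expected, attrs, actual, attrs));           // false

        // Locates the difference that a truncated toString() would hide.
        System.out.println(firstDifference(expected, actual)); // 5
    }
}
```

Reporting the first differing index in the failure message makes it easier to see whether the decoded payload really matches the expected one beyond the first few bytes.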