
Parent Shard Exists but not the Child Shard

I am setting up a Spark Streaming project with Kinesis and when I try to connect to my Kinesis stream I am getting the following error from Spark:

ERROR ShardSyncTask: Caught exception while sync'ing Kinesis shards and leases
com.amazonaws.services.kinesis.clientlibrary.exceptions.internal.KinesisClientLibIOException: Parent shard shardId-000000000000 exists but not the child shard shardId-000000000002

When I post test data to this stream or read data from it using the base Amazon libraries, I get no errors; this only occurs when I try to connect with Spark.

Below is the code that I am using for my tests:

val conf = new SparkConf().setMaster("local[2]").setAppName("KinesisCounter")
val ssc = new StreamingContext(conf, Seconds(1))
val rawStream = KinesisUtils.createStream(
  ssc, "dev-test", "kinesis.us-east-1.amazonaws.com", Duration(1000),
  InitialPositionInStream.TRIM_HORIZON, StorageLevel.MEMORY_ONLY)
rawStream.map(msg => new String(msg)).count().print()
egerhard asked Sep 28 '22 23:09

1 Answer

How many shards do you have on the Kinesis stream?
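The error names a parent shard and a child shard, which suggests the stream was resharded (split or merged) at some point. One way to see what Kinesis itself reports is to list the shards with the AWS SDK for Java (v1). This is only a sketch: the stream name "dev-test" and the region "us-east-1" are taken from the question, the object name ListShards is made up for illustration, and it assumes the aws-java-sdk Kinesis module and working credentials are available.

```scala
import com.amazonaws.services.kinesis.AmazonKinesisClientBuilder
import scala.collection.JavaConverters._

// Hypothetical helper object, not part of Spark or the KCL.
object ListShards {
  def main(args: Array[String]): Unit = {
    // Assumption: the stream lives in us-east-1, as the endpoint in the question implies.
    val kinesis = AmazonKinesisClientBuilder.standard().withRegion("us-east-1").build()
    try {
      // Note: for streams with many shards a single call may return only part of the list.
      val shards = kinesis.describeStream("dev-test").getStreamDescription.getShards.asScala
      println(s"${shards.size} shard(s) returned")
      shards.foreach { s =>
        // A shard with an ending sequence number is closed (it was split or merged).
        val closed = s.getSequenceNumberRange.getEndingSequenceNumber != null
        val parent = Option(s.getParentShardId).getOrElse("-")
        println(s"${s.getShardId} parent=$parent closed=$closed")
      }
    } finally {
      kinesis.shutdown()
    }
  }
}
```

If shardId-000000000002 shows up here with shardId-000000000000 as its parent, the stream itself is probably fine and the stale state is more likely on the lease side.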

What I would do is:

  1. Check the Kinesis region and make sure your application settings and the stream are in the same region.
  2. Delete the DynamoDB table that stores the Kinesis shard leases and start all over again. The official documentation says:

Changing the application name or stream name can lead to Kinesis errors in some cases. If you see errors, you may need to manually delete the DynamoDB table.

  3. Check your application code to see whether any settings are being set or overridden at runtime.
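For the step about deleting the table, here is a minimal sketch using the AWS SDK for Java (v1) from Scala. It rests on assumptions that you should verify in the DynamoDB console first: that the lease table is named after the Spark application name from the question ("KinesisCounter") and that it was created in us-east-1. The object name ResetKinesisLeases is hypothetical. Deleting the table throws away the stored checkpoints, so the application will start reading again from its initial position.

```scala
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder
import com.amazonaws.services.dynamodbv2.model.ResourceNotFoundException

// Hypothetical helper object, not part of Spark or the KCL.
object ResetKinesisLeases {
  def main(args: Array[String]): Unit = {
    // Assumptions: table name equals the Spark app name, and the table is in us-east-1.
    val leaseTable = "KinesisCounter"
    val region = "us-east-1"

    val dynamo = AmazonDynamoDBClientBuilder.standard().withRegion(region).build()
    try {
      // Stop the Spark application before doing this, or the table may be recreated.
      dynamo.deleteTable(leaseTable)
      println(s"Requested deletion of lease table $leaseTable in $region")
    } catch {
      case _: ResourceNotFoundException =>
        // Either the name is different or the table lives in another region.
        println(s"No table named $leaseTable found in $region")
    } finally {
      dynamo.shutdown()
    }
  }
}
```

If the table is not found, that is a hint for the region step: the leases may have been written to a different region than the one you expected.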

Hope it helps.

keypoint answered Oct 13 '22 00:10
