I am using Apache Avro.
My schema has a map type:
{"name": "MyData", 
  "type" :  {"type": "map", 
              "values":{
                   "type": "record",
                   "name": "Person",
                   "fields":[
                      {"name": "name", "type": "string"},
                      {"name": "age", "type": "int"}
                ]
                }
               }
}
After compiling the schema, the generated Java class uses CharSequence as the key type for the MyData map.
Using CharSequence as a map key is very inconvenient. Is there a way to make Apache Avro generate String keys for maps?
P.S.
The problem is that, for example, dataMap.containsKey("SOME_KEY") returns false even though that key is present, just because the key is stored as a CharSequence. Besides, putting a map entry with an existing key doesn't replace the old one. That's why I say it is inconvenient to use CharSequence as a key.
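A minimal sketch of the behaviour described above, using only the JDK. As an assumption made so that it runs without Avro on the classpath, java.nio.CharBuffer stands in for Avro's Utf8: like Utf8, it implements CharSequence but never compares equal to a String with the same characters. The class and method names are hypothetical.

```java
import java.nio.CharBuffer;
import java.util.HashMap;
import java.util.Map;

public class CharSequenceKeyDemo {

    // Builds a map whose only key is a non-String CharSequence,
    // standing in for the keys of a decoded Avro map (assumption).
    public static Map<CharSequence, Integer> decodedLikeMap() {
        Map<CharSequence, Integer> map = new HashMap<>();
        map.put(CharBuffer.wrap("SOME_KEY"), 1);
        return map;
    }

    // Copies the entries into a new map keyed by String,
    // so that lookups with String literals behave as expected.
    public static <V> Map<String, V> withStringKeys(Map<? extends CharSequence, V> source) {
        Map<String, V> copy = new HashMap<>();
        for (Map.Entry<? extends CharSequence, V> entry : source.entrySet()) {
            copy.put(entry.getKey().toString(), entry.getValue());
        }
        return copy;
    }

    public static void main(String[] args) {
        Map<CharSequence, Integer> map = decodedLikeMap();

        // String.equals(...) is false for any non-String, so the lookup misses.
        System.out.println(map.containsKey("SOME_KEY")); // false

        // The String key is treated as a new key; the old entry is not replaced.
        map.put("SOME_KEY", 2);
        System.out.println(map.size()); // 2

        // After normalizing the keys, the String lookup succeeds.
        System.out.println(withStringKeys(decodedLikeMap()).containsKey("SOME_KEY")); // true
    }
}
```

Copying the keys with toString() is only a stopgap on the consuming side; it does not change what Avro generates.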
This JIRA discussion is relevant. The main reason CharSequence is still used is backwards compatibility.
And as Charles Forsythe pointed out, a workaround has been added for when String is necessary: set the string property in the schema.
 { "type": "string", "avro.java.string": "String" }
The default type here is their own Utf8 class. In addition to manual specification and the pom.xml setting, there is even an avro-tools compile option for it, the -string option:
java -jar avro-tools.1.7.5.jar compile -string schema /path/to/schema .
Apparently, there is a workaround for this problem in Avro 1.6. You specify the string type in your project's POM file:
  <stringType>String</stringType>
This is mentioned in issue AVRO-803 ... though the plugin's web documentation doesn't reflect this.
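For context, a sketch of where that element would typically sit in a pom.xml, assuming the standard avro-maven-plugin with its schema goal. The version is a placeholder borrowed from the avro-tools example in this thread and the execution layout is an assumption; adjust both to your build.

```xml
<plugin>
  <groupId>org.apache.avro</groupId>
  <artifactId>avro-maven-plugin</artifactId>
  <!-- placeholder version (assumption); use the Avro version of your project -->
  <version>1.7.5</version>
  <executions>
    <execution>
      <phase>generate-sources</phase>
      <goals>
        <goal>schema</goal>
      </goals>
      <configuration>
        <!-- generate java.lang.String instead of CharSequence -->
        <stringType>String</stringType>
      </configuration>
    </execution>
  </executions>
</plugin>
```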
Apparently, by default, Avro uses CharSequence. I found a way to configure it to convert to String:
From Avro 1.6.0 onward, there is an option to have Avro always perform the conversion to String. There are a couple of ways to achieve this. The first is to set the avro.java.string property in the schema to String:
         { "type": "string", "avro.java.string": "String" }
I have not tested this.
I think explicitly converting the String to Utf8 will work: "some_key" -> new Utf8("some_key"), and then use this as your key for the map.
Regardless of whether it's possible to force Avro to use a String, using CharSequence directly is a bad implementation because CharSequence isn't Comparable<CharSequence> and doesn't even specify equality of two identical sequences. I suggest filing this as a bug against Avro.
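To illustrate the equality point with JDK classes only: two CharSequence instances holding the same characters are not equal unless their concrete classes say so. The sketch below compares a String with a StringBuilder; the class and the sameContent helper are hypothetical names.

```java
public class CharSequenceEquality {

    // Compares two sequences by their characters rather than by equals().
    public static boolean sameContent(CharSequence a, CharSequence b) {
        return a.toString().contentEquals(b);
    }

    public static void main(String[] args) {
        CharSequence a = "abc";
        CharSequence b = new StringBuilder("abc");

        // String.equals only accepts another String.
        System.out.println(a.equals(b)); // false

        // Character-by-character comparison ignores the concrete type.
        System.out.println(sameContent(a, b)); // true
    }
}
```

This is why a hash-based map keyed by CharSequence only behaves predictably when every key is of the same concrete type.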