I saw that the spark-avro data source is implemented on top of the FileFormat interface. Is there any documentation on how to write a custom Spark data source based on FileFormat? So far I can't find any (apart from the spark-avro source code).
Thank you!
Saves the content of the DataFrame as the specified table. In the case the table already exists, behavior of this function depends on the save mode, specified by the mode function (default to throwing an exception).
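That description reads like the one for DataFrameWriter.saveAsTable. In case it helps, here is a minimal sketch of how the save modes behave, assuming a local SparkSession and the default in-memory catalog. The table name demo_events, the object name SaveAsTableModes, and the temporary warehouse directory are made up for illustration and do not come from this thread.

```scala
import java.nio.file.Files

import org.apache.spark.sql.{AnalysisException, SaveMode, SparkSession}

object SaveAsTableModes {
  // Returns the table's row count after Append, Overwrite and Ignore, in that order.
  def demo(): Seq[Long] = {
    // Hypothetical setup: a throwaway warehouse dir so repeated runs start clean.
    val warehouse = Files.createTempDirectory("spark-warehouse").toString
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("save-as-table-modes")
      .config("spark.sql.warehouse.dir", warehouse)
      .getOrCreate()
    import spark.implicits._

    try {
      val df = Seq((1, "a"), (2, "b")).toDF("id", "value")

      // The table does not exist yet, so the default mode simply creates it.
      df.write.saveAsTable("demo_events")

      // Default mode is ErrorIfExists: a second write to the same table throws.
      try df.write.saveAsTable("demo_events")
      catch {
        case e: AnalysisException => println(s"Expected failure: ${e.getMessage}")
      }

      // Append adds the rows to the existing table.
      df.write.mode(SaveMode.Append).saveAsTable("demo_events")
      val afterAppend = spark.table("demo_events").count()

      // Overwrite replaces the existing table contents.
      df.write.mode(SaveMode.Overwrite).saveAsTable("demo_events")
      val afterOverwrite = spark.table("demo_events").count()

      // Ignore leaves the existing table untouched.
      df.write.mode(SaveMode.Ignore).saveAsTable("demo_events")
      val afterIgnore = spark.table("demo_events").count()

      Seq(afterAppend, afterOverwrite, afterIgnore)
    } finally {
      spark.stop()
    }
  }

  def main(args: Array[String]): Unit = println(demo())
}
```

With a custom data source, the same calls should apply once you add .format(...) with your source's name before saveAsTable, provided the source supports writing.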
Here is an example of a simple file-based Spark data source: https://hackernoon.com/extending-our-spark-sql-query-engine-5f4a088de986
Here are a couple of examples that implement the Data Sources API as well:
* https://github.com/databricks/spark-csv
* https://github.com/databricks/spark-avro