Looking to convert some R code to Sparklyr, functions such as lmtest::coeftest() and sandwich::sandwich(). Trying to get started with Sparklyr extensions but pretty new to the Spark API and having issues :(
Running Spark 2.1.1 and sparklyr 0.5.5-9002
Feel the first step would be to make a DenseMatrix object using the linalg library:
library(sparklyr)
library(dplyr)
sc <- spark_connect("local")
rows <- as.integer(2)
cols <- as.integer(2)
array <- c(1,2,3,4)
mat <- invoke_new(sc, "org.apache.spark.mllib.linalg.DenseMatrix",
rows, cols, array)
This results in the error:
Error: java.lang.Exception: No matched constructor found for class org.apache.spark.mllib.linalg.DenseMatrix
Okay, so I got a java lang exception, I'm pretty sure the rows and cols args were fine in the constructor, but not sure sure about the last one, which is supposed to be a java Array. So I tried a few permutations of:
array <- invoke_new(sc, "java.util.Arrays", c(1,2,3,4))
but end up with a similar error message...
Error: java.lang.Exception: No matched constructor found for class java.util.Arrays
I feel like I'm missing something pretty basic. Anyone know what's up?
R counterpart of the Java Array is list:
invoke_new(
sc, "org.apache.spark.ml.linalg.DenseMatrix",
2L, 2L, list(1, 2, 3, 4))
## <jobj[17]>
## class org.apache.spark.ml.linalg.DenseMatrix
## 1.0 3.0
## 2.0 4.0
or
invoke_static(
sc, "org.apache.spark.ml.linalg.Matrices", "dense",
2L, 2L, list(1, 2, 3, 4))
## <jobj[19]>
## class org.apache.spark.ml.linalg.DenseMatrix
## 1.0 3.0
## 2.0 4.0
Please note I am using o.a.s.ml.linalg instead of o.a.s.mllib.linalg. While mllib would work in isolation, as of Spark 2.x o.a.s.ml algorithms no longer accept local o.a.s.mllib.
At the same time R vector types (numeric, integer, character) are used as scalars.
Note:
Personally I believe this is not the way to go. Spark linalg packages are quite limited, and internally depend on the libraries, which won't be usable via sparklyr. Moreover sparklyr API is not suitable for complex logic.
In practice it makes more sense to implement Java or Scala extension, with a thin, R friendly wrapper.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With