I am trying to install TensorFlow on Spark on our server, as my boss asked, because he thought it would be easy to use. I have also read about the default distributed TensorFlow on the TensorFlow website. Can any expert tell me the difference between these two choices of distribution? Will Spark automatically assign the parameter servers and workers?
Thanks in advance.
TensorFlow is an open-source machine-learning library from Google that builds models as data-flow graphs. Apache Spark is a distributed data-processing engine with support for diverse data sources and programming styles, and it also provides a framework for machine learning.
Advantages: it can train large models with millions or billions of parameters, such as GPT-2, GPT-3, and BERT; latency across the workers is potentially low; and TensorFlow has good community support.
Since TensorFlow can use all the cores on each worker, we run only one task at a time on each worker and batch the work together to limit contention. The TensorFlow library can be installed on a Spark cluster as a regular Python library, following the instructions on the TensorFlow website.
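The "one task at a time on each worker" part is usually achieved through Spark configuration: set spark.task.cpus equal to the executor's core count, so the scheduler can only fit a single task on each executor. A minimal sketch of the relevant settings (the 16-core executor is an assumption for illustration; tune the values to your cluster):

```python
# Spark settings that make the scheduler place exactly one task per executor.
# executor_cores is an assumed example value, not a recommendation.
executor_cores = 16

conf = {
    "spark.executor.cores": str(executor_cores),
    # Each task "costs" as many CPUs as the executor has, so only one fits.
    "spark.task.cpus": str(executor_cores),
}

tasks_per_executor = executor_cores // int(conf["spark.task.cpus"])
print(tasks_per_executor)  # 1
```

These keys would normally be passed via spark-submit --conf flags or a SparkConf object rather than a plain dict; the dict just keeps the example self-contained.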
tf.distribute.Strategy is a TensorFlow API to distribute training across multiple GPUs, multiple machines, or TPUs. Using this API, you can distribute your existing models and training code with minimal code changes.
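On the "who assigns the parameter servers" part of the question: with plain distributed TensorFlow, nothing is automatic. You describe the cluster yourself, typically through the TF_CONFIG environment variable that the multi-machine tf.distribute strategies read. A minimal sketch (hostnames and ports are made up for illustration):

```python
import json
import os

# Hypothetical cluster: two workers and one parameter server.
# With plain distributed TensorFlow, YOU write this spec on every
# machine and set that machine's own role and index under "task".
cluster = {
    "worker": ["worker0.example.com:2222", "worker1.example.com:2222"],
    "ps": ["ps0.example.com:2222"],
}

# On worker 0, TF_CONFIG would look like this:
os.environ["TF_CONFIG"] = json.dumps({
    "cluster": cluster,
    "task": {"type": "worker", "index": 0},
})

spec = json.loads(os.environ["TF_CONFIG"])
print(spec["task"]["type"])  # worker
```

Every machine gets the same "cluster" block but a different "task" block, and keeping those consistent by hand is exactly the bookkeeping that TensorFlowOnSpark automates.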
I finally installed TensorFlowOnSpark (TFoS) on the server and compared it with the default distributed TensorFlow (TF). My conclusions:
Pros:
Cons:
If I am wrong somewhere, please correct me.
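To illustrate the difference in role assignment: TensorFlowOnSpark launches the TensorFlow cluster from the Spark executors and hands out the ps/worker roles for you, instead of you writing a TF_CONFIG spec on each machine. The snippet below is a toy simulation of that idea, not TFoS's actual code; the hostnames and the one-parameter-server split are assumptions for the example:

```python
def assign_roles(executors, num_ps):
    """Toy version of automatic role assignment: the first num_ps
    executors become parameter servers, the rest become workers."""
    roles = {}
    for i, host in enumerate(executors):
        if i < num_ps:
            roles[host] = ("ps", i)
        else:
            roles[host] = ("worker", i - num_ps)
    return roles

# Four Spark executors (hostnames are made up), one parameter server.
executors = ["exec1:2222", "exec2:2222", "exec3:2222", "exec4:2222"]
roles = assign_roles(executors, num_ps=1)
print(roles["exec1:2222"])  # ('ps', 0)
print(roles["exec4:2222"])  # ('worker', 2)
```

The point is only that the assignment is derived mechanically from the set of executors, which is why you do not set up parameter servers by hand under TFoS.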