Programming Languages
Scala, Python, Java
Also used: Clojure, R
Distributed Data & Streaming Technologies
Apache Spark, Hadoop, Apache Kafka, Apache Ignite, Apache Flink, Koalas,
Apache Druid, Apache Pinot, Apache Pulsar
Distributed Databases & Technologies
Cassandra, HBase, RocksDB, Elasticsearch, Neo4j
Distributed Analytics
Presto, Metabase, Apache Superset (Preset), Databricks
Data Discovery/Catalog & Metadata Management
Amundsen, Marquez, Apache Atlas, Metacat
Orchestration & Workflow Technologies
Docker, Terraform, Kubernetes, Alluxio, Apache Airflow,
Prefect, Dagster, Argo, Kubeflow
Machine Learning
Spark ML, PyTorch, TensorFlow, SciPy, scikit-learn, H2O
MLOps & Infrastructure
MLflow, ZenML, DVC, lakeFS, Feast, Hopsworks
GitHub: https://github.com/abhishek-ch