mirekphd

mirekphd has asked 1 question and answered 61.

Stats

EtPoint: 883
Vote count: 169
Questions: 1
Answers: 61

About

In my Data Scientist / ML Engineer hat I work in Python on fully automated ML modeling pipelines (Papermill+Scrapbook, MLflow, git), covering data munging, feature engineering and selection (varimp, SHAP), model training, distributed hyperparameter tuning (Optuna), profit margin optimization (Ipopt), reproducibility and validation (MLflow), and automated periodic monitoring of post-production model performance (MLflow, CronJobs).

I've been responsible for building and productionizing the first successful ML models in both major areas of our business (demand and risk models), and for a paradigm shift away from the decades-old linear models that still dominate the conservative insurance industry.

In my MLOps hat I develop in-house Docker containers (allowing for self-service package installations, automated updates, and security scans) for data scientists working on ML model development (GPU-enabled, Python, R, H2O). The containers provide familiar interfaces such as Jupyter, RStudio Server, and VS Code, specialized MLOps frameworks such as MLflow, and generic data and file management tools such as S3/MinIO, Cloud Commander, and SQL/NoSQL databases, notably MariaDB and Redis. I also develop and maintain in production custom apps with REST APIs for the production deployment of ML models and their features, using Python (for GBDTs) or Java (for H2O models), Flask/FastAPI, gunicorn, Redis, MinIO, git, and bash.

In my DevOps hat I orchestrate two types of ML containers (stateful for ML model development and stateless for production deployment) in a constantly changing array of on-prem OpenShift clusters. I automate multi-layer container builds, updates of packages, libraries, and extensions, security scans, and staging/production deployments using Jenkins pipelines, bash, Python, and Groovy scripts, and OpenShift build/deployment configs, all integrated using webhooks. I also perform the Linux system admin role for the CI/CD build servers (CentOS, Docker, docker-compose, MicroK8s, Jenkins, Postgres, Clair) and fulfill an OpenShift business admin role (using the OpenShift CLI, YAML configs, and bash scripts) in both the data science development clusters and the ML model production clusters, serving dozens of company data scientists and tens of ML models in production.