Persistent storage for Apache Mesos

Tags:

Recently I've discovered such a thing as a Apache Mesos.

It all looks amazingly in all that demos and examples. I could easily imagine how one would run for stateless jobs - that fits to the whole idea naturally.

Bot how to deal with long running jobs that are stateful?

Say, I have a cluster that consists of N machines (and that is scheduled via Marathon). And I want to run a postgresql server there.

That's it - at first I don't even want it to be highly available, but just simply a single job (actually Dockerized) that hosts a postgresql server.

1- How would one organize it? Constraint a server to a particular cluster node? Use some distributed FS?

2- DRBD, MooseFS, GlusterFS, NFS, CephFS, which one of those play well with Mesos and services like postgres? (I'm thinking here on the possibility that Mesos/marathon could relocate the service if goes down)

3- Please tell if my approach is wrong in terms of philosophy (DFS for data servers and some kind of switchover for servers like postgres on the top of Mesos)

_{^{Question largely copied from Persistent storage for Apache Mesos, asked by zerkms on Programmers Stack Exchange.}}

284

asked Feb 06 '15 15:02

Carlos Castellanos

1 Answers

Excellent question. Here are a few upcoming features in Mesos to improve support for stateful services, and corresponding current workarounds.

Persistent volumes (0.23): When launching a task, you can create a volume that exists outside of the task's sandbox and will persist on the node even after the task dies/completes. When the task exits, its resources -- including the persistent volume -- can be offered back to the framework, so that the framework can launch the same task again, launch a recovery task, or launch a new task that consumes the previous task's output as its input.
- Current workaround: Persist your state in some known location outside the sandbox, and have your tasks try to recover it manually. Maybe persist it in a distributed filesystem/database, so that it can be accessed from any node.
Disk Isolation (0.22): Enforce disk quota limits on sandboxes as well as persistent volumes. This ensures that your storage-heavy framework won't be able to clog up the disk and prevent other tasks from running.
- Current workaround: Monitor disk usage out of band, and run periodic cleanup jobs.
Dynamic Reservations (0.23): Upon launching a task, you can reserve the resources your task uses (including persistent volumes) to guarantee that they are offered back to you upon task exit, instead of going to whichever framework is furthest below its fair share.
- Current workaround: Use the slave's --resources flag to statically reserve resources for your framework upon slave startup.

As for your specific use case and questions:

1a) How would one organize it? You could do this with Marathon, perhaps creating a separate Marathon instance for your stateful services, so that you can create static reservations for the 'stateful' role, such that only the stateful Marathon will be guaranteed those resources.

1b) Constraint a server to a particular cluster node? You can do this easily in Marathon, constraining an application to a specific hostname, or any node with a specific attribute value (e.g. NFS_Access=true). See Marathon Constraints. If you only wanted to run your tasks on a specific set of nodes, you would only need to create the static reservations on those nodes. And if you need discoverability of those nodes, you should check out Mesos-DNS and/or Marathon's HAProxy integration.

1c) Use some distributed FS? The data replication provided by many distributed filesystems would guarantee that your data can survive the failure of any single node. Persisting to a DFS would also provide more flexibility in where you can schedule your tasks, although at the cost of the difference in latency between network and local disk. Mesos has built-in support for fetching binaries from HDFS uris, and many customers use HDFS for passing executor binaries, config files, and input data to the slaves where their tasks will run.

2) DRBD, MooseFS, GlusterFS, NFS, CephFS? I've heard of customers using CephFS, HDFS, and MapRFS with Mesos. NFS would seem an easy fit too. It really doesn't matter to Mesos what you use as long as your task knows how to access it from whatever node where it's placed.

Hope that helps!

144

answered Sep 21 '22 13:09

Adam

Related questions
                            
                                How to fix pg_dump version mismatch errors?
                            
                                Two SQL LEFT JOINS produce incorrect result
                            
                                Different timezone_types on DateTime object
                            
                                pgAdmin won't start (eternal loading)
                            
                                NOTICES for sequence after running migration in rails on postgresql Application
                            
                                How to check the replication delay in PostgreSQL?
                            
                                DBeaver, export from database Postgres, table structure (properties) into file .txt
                            
                                Find Parent Recursively using Query
                            
                                Postgres Alter table to convert column type from char to bigint [duplicate]
                            
                                How do I add PostGIS to PostgreSQL pgAdmin?
                            
                                How do I install/update to Postgres 9.4?
                            
                                Can't install pg gem on Windows
                            
                                how to properly specify database schema in spring boot?
                            
                                Postgres enum in TypeORM
                            
                                PHP not loading php_pgsql.dll on Windows
                            
                                Correct JPA Annotation for PostgreSQL's text type without Hibernate Annotations
                            
                                How to map postgresql "timestamp with time zone" in a JPA 2 entity
                            
                                PSQLException: ERROR: relation "TABLE_NAME" does not exist
                            
                                Sync postgreSql data with ElasticSearch
                            
                                Text compression in PostgreSQL

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Persistent storage for Apache Mesos

Tags:

postgresql

mesosphere

distributed-computing

mesos

Carlos Castellanos

People also ask

1 Answers

Adam

Recent Activity

Donate For Us