 

Ubuntu cluster management

I am trying to figure out a solution for managing a set of Linux machines (OS: Ubuntu, ~40 nodes, same hardware). These machines are supposed to be images of each other: software installed on one needs to be installed on all the others. My software requirements are Hadoop, R and ServiceMix. R packages also need to be synchronized across machines (a package installed on one needs to be available on all the others).

The solution I am using right now relies on NFS and pssh. I am hoping there is a better/easier solution out there that would make my life a bit easier. Any suggestion is appreciated.
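For reference, the current pssh-based workflow can be sketched roughly like this. The hosts file, login user, and package names below are illustrative placeholders, not taken from the question:

```shell
# hosts.txt lists one node per line, e.g. node01 .. node40 (hypothetical names)

# Install an Ubuntu package on every node in parallel
# (Ubuntu's pssh package installs the binary as parallel-ssh):
parallel-ssh -h hosts.txt -l admin -i "sudo apt-get -y install openjdk-8-jdk"

# Install an R package everywhere; each node writes into its own library,
# or into the shared NFS library if R_LIBS_SITE points there:
parallel-ssh -h hosts.txt -l admin -i \
  "Rscript -e 'install.packages(\"data.table\", repos=\"https://cloud.r-project.org\")'"
```

This keeps the nodes in sync only as long as every change goes through the same script, which is exactly the maintenance burden the question is trying to escape.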

asked Apr 05 '11 by smschauhan


People also ask

Does Linux support clustering?

Some Linux operating system vendors offer clustering software, such as SUSE Linux HAE, Red Hat Pacemaker, and Oracle Real Application Clusters (RAC). While they allow you to create a failover cluster, they present a variety of challenges.

What are cluster services in Linux?

A cluster is a group of computers (nodes) which work together to provide a shared solution. At a high level, a cluster can be viewed as having three parts (often defined as the cluster stack).

What is an advantage of using a Linux cluster?

Cluster computing provides a number of benefits: high availability through fault tolerance and resilience, load balancing and scaling capabilities, and performance improvements.


2 Answers

Two popular choices are Puppet from Puppet Labs and Chef from OpsCode.

Another potential mechanism is creating a metapackage that Depends: on the packages you want installed on all machines. When you modify your metapackage, an apt-get update && apt-get -u dist-upgrade would install the new packages on all your systems.
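As a sketch, the metapackage route can be done with Debian's equivs tool; the package name and dependency list below are illustrative, not from the answer:

```shell
# equivs builds trivial .deb packages from a plain control file
sudo apt-get install equivs

# Minimal control file for a hypothetical "cluster-base" metapackage
cat > cluster-base.ctl <<'EOF'
Section: metapackages
Priority: optional
Package: cluster-base
Version: 1.0
Depends: openjdk-8-jdk, r-base, ssh
Description: packages required on every cluster node
 Bump Version and rebuild whenever the node package set changes.
EOF

# Build the .deb and install it; serving it from a local apt
# repository lets every node pick up changes via dist-upgrade:
equivs-build cluster-base.ctl
sudo dpkg -i cluster-base_1.0_all.deb
```

Pushing the built package into a local apt repository is what makes the "upgrade everywhere at once" step work; installing the .deb by hand on each node would just recreate the pssh problem.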

The metapackage approach might be less work to configure and use initially, but Puppet or Chef might provide better returns on investment in the long run, as they can manage far more than just package installs.

answered Oct 09 '22 by sarnold


I have used a low-tech approach for this in the past: simply sharing (at least parts of) /usr/local/ to keep a common R library in /usr/local/lib/R/site-library/. I guess that could work for your Hadoop installation too.

I tried to keep the rest in Debian / Ubuntu packages and kept all nodes current. Local R and Ubuntu package repositories (for locally created packages) can also help, but are a bit more work.
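Assuming one node acts as the NFS server, the shared site-library setup might look like this; the hostname, subnet, and export options are assumptions for illustration:

```shell
# On the NFS server (say, node01), export the shared R library:
echo "/usr/local/lib/R/site-library  10.0.0.0/24(rw,sync,no_subtree_check)" \
  | sudo tee -a /etc/exports
sudo exportfs -ra

# On every other node, mount the share and point R at it;
# /etc/R/Renviron.site is read by R on Debian/Ubuntu systems:
sudo mount -t nfs node01:/usr/local/lib/R/site-library /usr/local/lib/R/site-library
echo 'R_LIBS_SITE=/usr/local/lib/R/site-library' | sudo tee -a /etc/R/Renviron.site
```

With this in place, install.packages() run on any node (with write access to the share) makes the package visible to all of them, which is the effect the question asks for.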

answered Oct 09 '22 by Dirk Eddelbuettel