
Is there any distributed file system which runs on Windows except Hadoop? [closed]

I'm desperate to find a DFS that supports Windows. The only such DFS is Hadoop HDFS, but it is very hard to deploy on a large number of Windows machines because it requires Cygwin + SSH.

Almost all distributed file systems work only on Linux, and only one (HDFS) runs on Windows.

I would be very grateful if somebody could point me to another DFS with Windows support.

From the DFS I need the ability to load-balance files across DFS nodes, compression, and a multi-language API for working with the DFS (I don't need to mount it).

asked Jun 25 '10 11:06 by sha1dy



3 Answers

Microsoft has its own DFS, which is included in Windows Server (whether it's good or bad, I don't know).

answered Nov 15 '22 18:11 by Redlab


GPFS is worth considering. It is proprietary to IBM, but it scales very well, is a full-fledged network file system, and has decent Windows support. NTFS ACLs are preserved by mapping them to NFSv4 ACLs, which works quite well (so long as you don't shoot yourself in the foot by trying to use POSIX permissions as well; chmod will blow away your NFSv4 ACLs).

Lustre is worth a mention, but its Windows support is generally considered very poor and immature.

like image 21
ckg Avatar answered Nov 15 '22 20:11

ckg


You might want to check out CloudIQ Storage from Appistry.
(They have closed shop.)

It allows you to take the drives in commodity machines (Linux or Windows) and have them appear as a single namespace accessible via a REST-based API. When you write files to the system, you can define the number of copies you want saved. So if you had 5 machines in your distributed system, you could specify that a file be saved on 2 or 3 (or N) machines for redundancy. If a machine or hard drive crashes, it's not an issue, because other machines hold copies of those files.
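To make the "N copies across 5 machines" idea concrete, here is a minimal sketch in Python. It is not CloudIQ Storage's API or its actual placement algorithm, neither of which is described here. It assumes rendezvous (highest-random-weight) hashing as one plausible way to pick replica machines, and the node names, file name, and function names are all hypothetical.

```python
import hashlib


def pick_replica_nodes(filename, nodes, copies):
    """Choose `copies` distinct nodes for `filename` using rendezvous hashing.

    Hypothetical helper: illustrates one possible placement scheme only.
    """
    if not 1 <= copies <= len(nodes):
        raise ValueError("copies must be between 1 and the number of nodes")

    def score(node):
        # Hash the (node, file) pair; the highest scores win the file.
        digest = hashlib.sha256(f"{node}:{filename}".encode("utf-8")).hexdigest()
        return int(digest, 16)

    return sorted(nodes, key=score, reverse=True)[:copies]


def surviving_copies(replicas, failed_nodes):
    """Return the replica nodes that still hold the file after failures."""
    return [node for node in replicas if node not in failed_nodes]


if __name__ == "__main__":
    # Assumed cluster of 5 machines, storing 3 copies of one file.
    cluster = ["node1", "node2", "node3", "node4", "node5"]
    replicas = pick_replica_nodes("reports/q2.csv", cluster, copies=3)
    print("stored on:", replicas)
    # Simulate the first replica machine crashing.
    print("still available on:", surviving_copies(replicas, {replicas[0]}))
```

Because each file's placement depends only on the file name and the node list, any client can compute where the copies live without a central lookup table. A real system would also need to re-replicate files after a failure, which this sketch leaves out.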

Check out the Downloads and Community links for a trial version as well as documentation.

answered Nov 15 '22 20:11 by Brett McCann