 

rsync files to hadoop

Tags: rsync, hadoop

I have 6 servers, each holding a lot of logs, and I'd like to copy these logs into the Hadoop file system (HDFS) with rsync. Currently I mount HDFS through FUSE and rsync writes directly to the mounted file system at /mnt/hdfs. There is a big problem, though: after about a day the FUSE daemon occupies 5 GB of RAM and the mounted file system becomes unusable. Remounting fixes it, but only for a while. The rsync command is:

rsync --port=3360 -az --timeout=10 --contimeout=30 server_name::ap-rsync/archive /mnt/hdfs/logs

After some time, rsync fails with the following errors:

rsync error: timeout in data send/receive (code 30) at io.c(137) [sender=3.0.7]
rsync: connection unexpectedly closed (498784 bytes received so far) [receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c(601) [receiver=3.0.7]
rsync: connection unexpectedly closed (498658 bytes received so far) [generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(601) [generator=3.0.7]
Michal asked Jun 23 '11 06:06


1 Answer

fuse-hdfs does not support the O_RDWR and O_EXCL open flags, so rsync gets an EIO error. To use rsync with fuse-hdfs, you need to patch one of the two programs. Either of the following approaches works; I recommend the second.

  1. Patch fuse-hdfs. The patch can be found in the Hadoop issue tracker:

    https://issues.apache.org/jira/browse/HDFS-861

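  2. Patch rsync (version 3.0.8). In the diff below, rsync-3.0.8.no_excl is the patched tree, so the `<` line is the replacement code and the `>` lines are the original code it replaces.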

    diff -r rsync-3.0.8.no_excl/syscall.c rsync-3.0.8/syscall.c
    
    234a235,252
    > #if defined HAVE_SECURE_MKSTEMP && defined HAVE_FCHMOD && (!defined HAVE_OPEN64 || defined HAVE_MKSTEMP64)
    >   {
    >       int fd = mkstemp(template);
    >       if (fd == -1)
    >           return -1;
    >       if (fchmod(fd, perms) != 0 && preserve_perms) {
    >           int errno_save = errno;
    >           close(fd);
    >           unlink(template);
    >           errno = errno_save;
    >           return -1;
    >       }
    > #if defined HAVE_SETMODE && O_BINARY
    >       setmode(fd, O_BINARY);
    > #endif
    >       return fd;
    >   }
    > #else
    237c255,256
    <   return do_open(template, O_WRONLY|O_CREAT, perms);
    ---
    >   return do_open(template, O_RDWR|O_EXCL|O_CREAT, perms);
    > #endif
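
To check whether a particular mount is affected, the two flag combinations from the diff can be tried directly. The following is only a minimal sketch, assuming a POSIX system; `try_open_flags` is a hypothetical helper written for illustration and is not part of rsync or fuse-hdfs.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not from rsync or fuse-hdfs): tries to create a
 * scratch file inside `dir` using the given open(2) flags.
 * Returns 0 on success, or the errno value reported on failure. */
int try_open_flags(const char *dir, int flags)
{
    char path[4096];
    int n = snprintf(path, sizeof path, "%s/.open-probe-%ld",
                     dir, (long)getpid());
    if (n < 0 || (size_t)n >= sizeof path)
        return ENAMETOOLONG;

    /* Remove any stale probe file so O_EXCL cannot fail with EEXIST. */
    unlink(path);

    int fd = open(path, flags, 0600);
    if (fd == -1)
        return errno;   /* e.g. EIO, if the file system rejects the flags */

    close(fd);
    unlink(path);       /* clean up the scratch file */
    return 0;
}
```

Calling `try_open_flags("/mnt/hdfs/logs", O_WRONLY | O_CREAT)` and then `try_open_flags("/mnt/hdfs/logs", O_RDWR | O_CREAT | O_EXCL)` from a small `main` compares the patched and the original rsync behaviour. If the explanation above applies to your mount, the second call should return a non-zero errno while the first returns 0.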
    
fun.shao answered Nov 17 '22 06:11