 

Unable to connect to Docker container from outside the Docker host

Tags:

docker

hadoop

I have two Docker containers running on Ubuntu: one for the Hadoop namenode, and another for the Hadoop datanode.

Now I have Java code running on Windows that uses the Hadoop FileSystem API to copy a file from my Windows filesystem to the remote HDFS running in Docker.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;
import org.junit.Test;

import java.io.File;

public class HadoopTest {

    @Test
    public void testCopyFileToHDFS() throws Exception {
        // Load the HDFS client configuration from the classpath
        Configuration configuration = new Configuration();
        configuration.addResource(getClass().getClassLoader().getResourceAsStream("hadoop/core-site.xml"));
        configuration.addResource(getClass().getClassLoader().getResourceAsStream("hadoop/yarn-site.xml"));
        FileSystem fileSystem = FileSystem.get(configuration);
        // Copy a local Windows file to the remote HDFS path /testsa
        FileUtil.copy(new File("c:\\windows-version.txt"), fileSystem, new Path("/testsa"), false, configuration);
    }
}
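
For reference, the hadoop/core-site.xml loaded above has to point fs.defaultFS at the namenode through the Docker host; a minimal sketch (the IP address and port are taken from the log output below, the rest of the file is an assumption):

<configuration>
    <property>
        <!-- Namenode address: the Docker host's IP and the mapped port -->
        <name>fs.defaultFS</name>
        <value>hdfs://192.168.56.102:9000</value>
    </property>
</configuration>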

But I got the following error:

16:57:05.669 [Thread-4] DEBUG org.apache.hadoop.hdfs.DFSClient - Connecting to datanode 172.18.0.2:50010
16:57:15.654 [IPC Client (547201549) connection to /192.168.56.102:9000 from ignis] DEBUG org.apache.hadoop.ipc.Client - IPC Client (547201549) connection to /192.168.56.102:9000 from ignis: closed
16:57:15.655 [IPC Client (547201549) connection to /192.168.56.102:9000 from ignis] DEBUG org.apache.hadoop.ipc.Client - IPC Client (547201549) connection to /192.168.56.102:9000 from ignis: stopped, remaining connections 0
16:57:26.670 [Thread-4] INFO org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream
java.net.ConnectException: Connection timed out: no further information
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
    at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1533)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1309)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1262)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:448)
16:57:26.673 [Thread-4] INFO org.apache.hadoop.hdfs.DFSClient - Abandoning BP-53577818-172.18.0.2-1500882061263:blk_1073741827_1003

You can see from the first log line, which says "Connecting to datanode 172.18.0.2:50010", that the client is trying to reach a Docker-internal IP address.

My Java code runs on a real Windows machine, outside of the Docker host.

I have mapped the Hadoop HDFS ports (e.g. 9000 and 50010) to my Docker host (Ubuntu), so I can access the HDFS namenode through the Docker host's IP address and the mapped ports.
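
For context, the mappings amount to something like the following docker-compose port section (a sketch; only the port numbers come from my setup):

ports:
  - "9000:9000"     # namenode RPC port
  - "50010:50010"   # datanode data transfer port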

Below is the logic of my Java code:

1) The Java code runs on a Windows machine.

2) The Java code uses the FileSystem API to copy a file from Windows to the remote HDFS.

3) The client can connect to the HDFS namenode using the Docker host's IP address and the port mapped from the Docker container (e.g. 9000).

4) The HDFS namenode handles the request sent from the client and returns the datanode's IP address to the client.

5) The client tries to copy the file using the datanode's IP address.

6) The client gets an error saying the datanode's IP address cannot be reached, because it is an IP address internal to the Docker container.


asked Jan 29 '23 by jason zhang


1 Answer

Finally, I found a solution: introduce a hostname for the datanode and make the HDFS client use that hostname instead of the IP address when connecting to the datanode. My client also needs to map the datanode hostname to the Docker host's IP address. The detailed steps are below:

  1. Add a hostname for the Docker datanode container in docker-compose.yml:

    hostname: datanode.company.com

  2. Enable HDFS (server & client) to use hostnames instead of IP addresses by setting the following properties in hdfs-site.xml:

<configuration>
    <property>
        <name>dfs.client.use.datanode.hostname</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.datanode.use.datanode.hostname</name>
        <value>true</value>
    </property>
</configuration>
  3. Map the datanode hostname to the Docker host's IP address by adding an entry to the client's hosts file (/etc/hosts on Linux; C:\Windows\System32\drivers\etc\hosts on the Windows client):

    192.168.1.25 datanode.company.com
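
Equivalently, the client-side property from step 2 can be set programmatically; a minimal sketch based on the test from the question (only the setBoolean call is new, the property name is the same one configured above):

Configuration configuration = new Configuration();
configuration.addResource(getClass().getClassLoader().getResourceAsStream("hadoop/core-site.xml"));
// Make the HDFS client connect to datanodes by hostname instead of the
// (Docker-internal) IP address returned by the namenode
configuration.setBoolean("dfs.client.use.datanode.hostname", true);
FileSystem fileSystem = FileSystem.get(configuration);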

answered Feb 05 '23 by jason zhang