Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Tarantool does not start due to disk write error

I am trying to launch Tarantool in Docker from scratch (no existing data). I use the Docker command they suggest in the Tutorial and run it under Docker Desktop 2.4.0.0 on MacOS 10.15.6 (Catalina):

docker run \
  --name mytarantool \
  -d -p 3301:3301 \
  -v /data/dir/on/host:/var/lib/tarantool \
  tarantool/tarantool:2.5.1

(/data/dir/on/host is replaced with my local directory on the laptop). I also tried it with the latest version 2.6.0.

The container terminates soon after start. docker logs shows this:

2020-10-02 20:51:10.331 [1] main/103/tarantool-entrypoint.lua C> Tarantool 2.6.0-0-g47aa4e01e
2020-10-02 20:51:10.331 [1] main/103/tarantool-entrypoint.lua C> log level 5
2020-10-02 20:51:10.332 [1] main/103/tarantool-entrypoint.lua I> mapping 268435456 bytes for memtx tuple arena...
2020-10-02 20:51:10.332 [1] main/103/tarantool-entrypoint.lua I> mapping 134217728 bytes for vinyl tuple arena...
2020-10-02 20:51:10.335 [1] main/103/tarantool-entrypoint.lua I> instance uuid 1811ff01-13d1-45c8-9878-0974bf27ee40
2020-10-02 20:51:10.335 [1] iproto/101/main I> binary: bound to 0.0.0.0:3301
2020-10-02 20:51:10.335 [1] main/103/tarantool-entrypoint.lua I> initializing an empty data directory
2020-10-02 20:51:10.351 [1] main/103/tarantool-entrypoint.lua I> assigned id 1 to replica 1811ff01-13d1-45c8-9878-0974bf27ee40
2020-10-02 20:51:10.351 [1] main/103/tarantool-entrypoint.lua I> cluster uuid 12ca546b-29ea-4af3-a407-f24e91c0e636
2020-10-02 20:51:10.357 [1] snapshot/101/main I> saving snapshot `/var/lib/tarantool/00000000000000000000.snap.inprogress'
2020-10-02 20:51:10.361 [1] snapshot/101/main I> done
2020-10-02 20:51:10.364 [1] main/103/tarantool-entrypoint.lua I> ready to accept requests
2020-10-02 20:51:10.365 [1] main/103/tarantool-entrypoint.lua I> set 'log_level' configuration option to 5
2020-10-02 20:51:10.365 [1] main/105/checkpoint_daemon I> scheduled next checkpoint for Fri Oct  2 22:10:11 2020
2020-10-02 20:51:10.367 [1] main/103/tarantool-entrypoint.lua I> set 'listen' configuration option to "3301"
2020-10-02 20:51:10.367 [1] main/103/tarantool-entrypoint.lua I> set 'log_format' configuration option to "plain"
2020-10-02 20:51:10.384 [1] wal/101/main xlog.c:1026 !> SystemError /var/lib/tarantool/00000000000000000000.xlog: can't allocate disk space: Invalid argument
2020-10-02 20:51:10.384 [1] main/103/tarantool-entrypoint.lua txn.c:876 E> ER_WAL_IO: Failed to write to disk
2020-10-02 20:51:10.391 [1] main txn.c:876 E> ER_WAL_IO: Failed to write to disk
2020-10-02 20:51:10.391 [1] main F> fatal error, exiting the event loop

At the same time the container manages to create 5.9K 00000000000000000000.snap and 97B 00000000000000000000.xlog files.

$ ls -hal
total 24
drwxr-xr-x@ 4 user  staff   128B  2 Oct 13:51 .
drwxr-xr-x  3 user  staff    96B  2 Oct 12:56 ..
-rw-r--r--  1 user  staff   5.9K  2 Oct 13:51 00000000000000000000.snap
-rw-r--r--  1 user  staff    97B  2 Oct 13:51 00000000000000000000.xlog

If I start the container without mounting the local directory, it succeeds.

I assume there is a problem with my local file system (or with the way as it is visible from the container) or probably a permission issue, but I can't figure out what exactly.

If I exec as shell in a successfully started container, I see that the xlog file is bigger and the owner of the files is tarantool:tarantool.

$ docker exec -it 016 sh
/opt/tarantool # ls -hal /var/lib/tarantool/
total 1044
drwxr-xr-x    2 tarantoo tarantoo    4.0K Oct  2 20:40 .
drwxr-xr-x    1 root     root        4.0K Aug  2 16:31 ..
-rw-r--r--    1 tarantoo tarantoo    5.9K Oct  2 20:40 00000000000000000000.snap
-rw-r--r--    1 tarantoo tarantoo     273 Oct  2 20:40 00000000000000000000.xlog

But with the directory bound, it looks different:

$ docker run -it -p 3031:3031 -v /Users/user/project/storage:/var/lib/tarantool tarantool/tarantool:2.6.0 sh
/opt/tarantool # ls -hal /var/lib/tarantool/
total 16
drwxr-xr-x    4 tarantoo root         128 Oct  2 21:18 .
drwxr-xr-x    1 root     root        4.0K Aug  2 16:31 ..
-rw-r--r--    1 root     root        5.9K Oct  2 21:18 00000000000000000000.snap
-rw-r--r--    1 root     root          97 Oct  2 21:18 00000000000000000000.xlog

I tried to change owner for the directory and files:

$ docker run -it -p 3031:3031 -v /Users/user/project/storage:/var/lib/tarantool tarantool/tarantool:2.6.0 sh
/opt/tarantool # chown tarantool:tarantool -R /var/lib/tarantool/

And checked that the changes persist after container restart:

$ docker run -it -p 3031:3031 -v /Users/user/project/storage:/var/lib/tarantool tarantool/tarantool:2.6.0 sh
/opt/tarantool # ls -hal /var/lib/tarantool/
total 16
drwxr-xr-x    4 tarantoo tarantoo     128 Oct  2 21:18 .
drwxr-xr-x    1 root     root        4.0K Aug  2 16:31 ..
-rw-r--r--    1 tarantoo tarantoo    5.9K Oct  2 21:18 00000000000000000000.snap
-rw-r--r--    1 tarantoo tarantoo      97 Oct  2 21:18 00000000000000000000.xlog

Now the permissions look the same as in the working container. But starting the container normally eventually leads to the same problem:

$ docker run -it -p 3031:3031 -v /Users/user/project/storage:/var/lib/tarantool tarantool/tarantool:2.6.0
Creating configuration file: /etc/tarantool/config.yml
Config:
---
force_recovery: false
memtx_dir: /var/lib/tarantool
listen: 3301
pid_file: /var/run/tarantool/tarantool.pid
vinyl_dir: /var/lib/tarantool
wal_dir: /var/lib/tarantool
```
2020-10-02 21:22:29.680 [1] main/103/tarantool-entrypoint.lua C> Tarantool 2.6.0-0-g47aa4e01e
2020-10-02 21:22:29.681 [1] main/103/tarantool-entrypoint.lua C> log level 5
2020-10-02 21:22:29.685 [1] main/103/tarantool-entrypoint.lua I> mapping 268435456 bytes for memtx tuple arena...
2020-10-02 21:22:29.685 [1] main/103/tarantool-entrypoint.lua I> mapping 134217728 bytes for vinyl tuple arena...
2020-10-02 21:22:29.687 [1] main/103/tarantool-entrypoint.lua I> instance uuid 74d33452-7f39-4ebf-a2f7-c1da6cb8c54b
2020-10-02 21:22:29.691 [1] main/103/tarantool-entrypoint.lua I> instance vclock {}
2020-10-02 21:22:29.691 [1] iproto/101/main I> binary: bound to 0.0.0.0:3301
2020-10-02 21:22:29.693 [1] main/103/tarantool-entrypoint.lua I> recovery start
2020-10-02 21:22:29.694 [1] main/103/tarantool-entrypoint.lua I> recovering from `/var/lib/tarantool/00000000000000000000.snap'
2020-10-02 21:22:29.695 [1] main/103/tarantool-entrypoint.lua I> cluster uuid 258099e8-803c-4a10-a3e0-57f6cd796f18
2020-10-02 21:22:29.708 [1] main/103/tarantool-entrypoint.lua I> assigned id 1 to replica 74d33452-7f39-4ebf-a2f7-c1da6cb8c54b
2020-10-02 21:22:29.709 [1] main/103/tarantool-entrypoint.lua I> recover from `/var/lib/tarantool/00000000000000000000.xlog'
2020-10-02 21:22:29.710 [1] main/103/tarantool-entrypoint.lua recovery.cc:156 W> file `/var/lib/tarantool/00000000000000000000.xlog` wasn't correctly closed
2020-10-02 21:22:29.713 [1] main/103/tarantool-entrypoint.lua I> ready to accept requests
2020-10-02 21:22:29.713 [1] main/103/tarantool-entrypoint.lua C> leaving orphan mode
2020-10-02 21:22:29.713 [1] main/103/tarantool-entrypoint.lua I> set 'log_level' configuration option to 5
2020-10-02 21:22:29.713 [1] main/105/checkpoint_daemon I> scheduled next checkpoint for Fri Oct  2 23:01:09 2020
2020-10-02 21:22:29.720 [1] main/103/tarantool-entrypoint.lua I> set 'listen' configuration option to "3301"
2020-10-02 21:22:29.720 [1] main/103/tarantool-entrypoint.lua I> set 'log_format' configuration option to "plain"
2020-10-02 21:22:29.723 [1] wal/101/main xlog.c:1026 !> SystemError /var/lib/tarantool/00000000000000000000.xlog: can't allocate disk space: Invalid argument
2020-10-02 21:22:29.723 [1] main/103/tarantool-entrypoint.lua txn.c:876 E> ER_WAL_IO: Failed to write to disk
2020-10-02 21:22:29.726 [1] main txn.c:876 E> ER_WAL_IO: Failed to write to disk
2020-10-02 21:22:29.726 [1] main F> fatal error, exiting the event loop

Strangely, permissions have not changed at all if I check it locally on my laptop. I assume it's a Docker magic, but I am not sure how it works, taking into account that permission changes persist between container restarts.

But maybe the issue is not related to permissions... What then? How to fix the issue?

like image 711
greatvovan Avatar asked Oct 02 '20 21:10

greatvovan


1 Answers

Answering my own question.

Apparently the issue had nothing to do with permissions, the problem is on the file system virtualization level.

The issue can be fixed by switching off "gRPC FUSE" feature in Docker Desktop preferences. Presumably (as it says can't allocate disk space: Invalid argument) the problem is that this implementation does not support fallocate() with some specific parameters (see https://github.com/docker/for-mac/issues/4964#issuecomment-702748937).

UPDATE: if you want to use gRPC FUSE feature for file sharing, consider updating to version 2.4.2.0 where the issue was fixed.

like image 187
greatvovan Avatar answered Nov 16 '22 04:11

greatvovan