Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Bug with writing to file in linux /sys/class/gpio

I'm having the strangest bug I've ever seen with a linux system right now and there seem to be only two possible explanations for it -

  • Either appending sudo makes file writes instant
  • Or appending sudo produces a short delay in executing statements
  • Or I've got no clue what's happening with my program

Well let me give you some background. I'm currently writing a c++ program for raspberry pi gpio manipulation. There are no visible error in the program as far as I know & since it works with sudo successfully and with delays successfully too. So here's how rpi's gpio work -

  • First you've to export one, to reserve it for manipulation, it will create a new directory as gpio+number with several files in it.

    echo 17 > /sys/class/gpio/export

  • Then set it's direction(in means read and out means write)

    echo "out" > /sys/class/gpio/gpio17/direction

  • Then write the value (0 or 1 for off and on)

    echo 1 > /sys/class/gpio/gpio17/value

  • At the end, unexport it back, the directory will get deleted.

    echo 17 > /sys/class/gpio/unexport

It doesn't matter whether you do this through bash commands or through c/c++ or any other language IO, since in unix these are just files and you just need to read/write to them. Everything works fine till now. I've tested this manually and it works, so my manual test passes.


Now I've a simple test written for my program which looks like this -

TEST(LEDWrites, LedDevice)
{
    Led led1(17, "MyLED");
    // auto b = sleep(1);
    EXPECT_EQ(true, led1.on());
}

The Led class constructor does the export part - echo 17 > /sys/class/gpio/export, while the .on() call sets the direction - echo "write" > /sys/class/gpio/gpio17/direction and outputs the value as well - echo 1 > /sys/class/gpio/gpio17/value. Forget about unexport here since it is handled by destructor and plays no role here.

If you're curious, these functions handle I/O like this -

{
    const std::string direction = _dir ? "out" : "in";

    const std::string path = GPIO_PATH + "/gpio" + std::to_string(powerPin) + "/direction";

    std::ofstream dirStream(path.c_str(), std::ofstream::trunc);
    if (dirStream) {
        dirStream << direction;
    } else {
        // LOG error here
        return false;
    }
    return true;
}

means basic c++ file/io. Now let me explain the bug.


First, here are 3 runs of same test -

Normal run FAILS

[isaac@alarmpi build]$ ./test/testexe
Running main() from gtest_main.cc
[==========] Running 2 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 1 test from LEDConstruction
[ RUN      ] LEDConstruction.LedDevice
[       OK ] LEDConstruction.LedDevice (1 ms)
[----------] 1 test from LEDConstruction (1 ms total)

[----------] 1 test from LEDWrites
[ RUN      ] LEDWrites.LedDevice
../test/test.cpp:20: Failure
Value of: led1.on()
  Actual: false
Expected: true
[  FAILED  ] LEDWrites.LedDevice (2 ms)
[----------] 1 test from LEDWrites (3 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 2 test cases ran. (6 ms total)
[  PASSED  ] 1 test.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] LEDWrites.LedDevice

 1 FAILED TEST

run with sudo PASSES

[isaac@alarmpi build]$ sudo ./test/testexe
[sudo] password for isaac: 
Running main() from gtest_main.cc
[==========] Running 2 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 1 test from LEDConstruction
[ RUN      ] LEDConstruction.LedDevice
[       OK ] LEDConstruction.LedDevice (1 ms)
[----------] 1 test from LEDConstruction (2 ms total)

[----------] 1 test from LEDWrites
[ RUN      ] LEDWrites.LedDevice
[       OK ] LEDWrites.LedDevice (2 ms)
[----------] 1 test from LEDWrites (2 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 2 test cases ran. (5 ms total)
[  PASSED  ] 2 tests.

wtf delay run PASSES has uncommented // auto b = sleep(1);

[isaac@alarmpi build]$ ./test/testexe
Running main() from gtest_main.cc
[==========] Running 2 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 1 test from LEDConstruction
[ RUN      ] LEDConstruction.LedDevice
[       OK ] LEDConstruction.LedDevice (1 ms)
[----------] 1 test from LEDConstruction (2 ms total)

[----------] 1 test from LEDWrites
[ RUN      ] LEDWrites.LedDevice
[       OK ] LEDWrites.LedDevice (1001 ms)
[----------] 1 test from LEDWrites (1003 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 2 test cases ran. (1005 ms total)
[  PASSED  ] 2 tests.

The only difference b/w delay and normal run is of single uncommented line - // auto b = sleep(1); Everything is same including device, directory structure, build conf and everything. The only things that explains this is linux might be creating that file and its friends sometimes later or it takes some time? and I call .on() before that. Well that could explain it...

But then why does sudo invocation with no delay passes? Does it makes those writes faster/instant or does it puts the delay statement by itself? Is this the cause of some kind of buffering? Please say no :/

If it matters, I'm using following dev rule for getting non-sudo access to gpio directory -

SUBSYSTEM=="bcm2835-gpiomem", KERNEL=="gpiomem", GROUP="gpio", MODE="0660"
SUBSYSTEM=="gpio", KERNEL=="gpiochip*", ACTION=="add", PROGRAM="/bin/sh -c 'chown root:gpio /sys/class/gpio/export /sys/class/gpio/unexport ; chmod 220 /sys/class/gpio/export /sys/class/gpio/unexport'"
SUBSYSTEM=="gpio", KERNEL=="gpio*", ACTION=="add", PROGRAM="/bin/sh -c 'chown root:gpio /sys%p/active_low /sys%p/direction /sys%p/edge /sys%p/value ; chmod 660 /sys%p/active_low /sys%p/direction /sys%p/edge /sys%p/value'"

EDIT - As @charles mentioned, I used std::flush after every write I made on I/O operations. Still failing.


Strace to the rescue


Let's see the execution of the failing build command -

open("/sys/class/gpio/export", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3
open("/sys/class/gpio/unexport", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3
open("/sys/class/gpio/export", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3
open("/sys/class/gpio/gpio17/value", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = -1 EACCES (Permission denied)
open("/sys/class/gpio/gpio17/direction", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = -1 EACCES (Permission denied)
open("/sys/class/gpio/unexport", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3

..., 0666) = -1 EACCES (Permission denied)

Okaaay, here's something, that explains why it is passing with sudo. But why is it passing with delay? Let's check that too,

open("/sys/class/gpio/export", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3
open("/sys/class/gpio/unexport", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3
open("/sys/class/gpio/export", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3
open("/sys/class/gpio/gpio17/value", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3
open("/sys/class/gpio/gpio17/direction", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 4
open("/sys/class/gpio/unexport", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3

No wait, wtf? This means the permission denied must be for if files aren't created at that time. But how does using sudo solves that?

Here's relevant output for sudo -

open("/sys/class/gpio/export", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3
open("/sys/class/gpio/unexport", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3
open("/sys/class/gpio/export", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3
open("/sys/class/gpio/gpio17/value", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3
open("/sys/class/gpio/gpio17/direction", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 4
open("/sys/class/gpio/unexport", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3
like image 498
Abhinav Gauniyal Avatar asked Sep 16 '16 05:09

Abhinav Gauniyal


1 Answers

There is a race between udev and your program. When you write to /sys/class/gpio/export, the write will not return until the GPIO is fully created. However, once it has been created, you have two processes that simultaneously take action on the new device:

  • A hotplug/uevent triggers udev to evaluate its rules. As part of these rules, it will change the ownership and permissions of /sys/class/gpio/gpio17/value.
  • Your program continues. It will immediately try to open /sys/class/gpio/gpio17/value.

So there is some chance that your program will open the value file before udev has changed its ownership and permissions. This is in fact very likely, because your udev handler does an execve of a shell which then execve's chown and chmod. But even without that, the scheduler will normally give priority to the task that was already running when returning from a syscall, so your program will usually open the value file before udev has even woken up.

By inserting a sleep, you allow udev to do its thing. So to make it robust, you could poll the file with access() before opening it.

It would also help by giving udev higher priority. E.g. chrt -f -p $(pidof systemd-udevd) 3. This gives udev real-time priority, which means it will always run before your program. It can also make your system unresponsive so take care.

like image 117
Arnout Avatar answered Oct 16 '22 20:10

Arnout