Priority of kernel modules and SCHED_RR threads

Tags: c++, c, linux, real-time, scheduling

I have an embedded Linux platform (the Beagleboard, running Angstrom Linux) with two devices connected:

- a laser range finder (Hokuyo UTM 30) connected via USB
- a custom external board connected via SPI

We have written a Linux kernel module which is responsible for the SPI data transfer. It has an IRQ handler in which spi_async is called, which in turn causes an async callback method to be called.

My C++ application consists of three threads:

- a main thread for data processing
- a laser polling thread
- an SPI polling thread

I am experiencing problems which seem to be caused by how the modules described above interact.

- When I switch off the USB device (laser range finder), I receive all SPI messages correctly (1 message every 3 ms; message length divided by data rate is <1 ms), independent of thread scheduling.
- When I switch on the USB device and run my program with normal thread scheduling (SCHED_OTHER, priority 0, no nice level set), about 1% of the messages are "lost" because the callback method of spi_async is still running when the next IRQ occurs. (I could handle this case differently in order not to lose the messages, so this is not a big issue.)
- With the USB device turned on, when I run the program with SCHED_RR and
  - priority = 10 for main thread
  - priority = 10 for SPI reading thread
  - priority = 4 for USB/Laser polling thread

  then I am losing 40% of the messages because the IRQ is triggered again before the spi-callback method is called! (I could maybe still find a workaround, but the problem is that I need fast response times, which can no longer be reached in this case.) I need to use the thread scheduling and the laser device, so I am looking for a way to solve this case.
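For reference, the SCHED_RR setup described in the last item could be sketched roughly as follows. This is a minimal sketch, not the original application code: the helper names `set_current_thread_rr` and `current_thread_policy` are hypothetical, and it assumes each thread calls the helper on itself at startup.

```cpp
#include <pthread.h>
#include <sched.h>
#include <cassert>
#include <cerrno>

// Hypothetical helper (not from the original application): switch the calling
// thread to SCHED_RR at the given priority.
// Returns 0 on success or an errno-style code, e.g. EPERM when the process
// lacks the required privilege (CAP_SYS_NICE or a suitable RLIMIT_RTPRIO).
int set_current_thread_rr(int priority)
{
    const int lo = sched_get_priority_min(SCHED_RR);
    const int hi = sched_get_priority_max(SCHED_RR);
    if (priority < lo || priority > hi)
        return EINVAL;

    sched_param param{};
    param.sched_priority = priority;
    // pthread_setschedparam reports errors through its return value, not errno.
    return pthread_setschedparam(pthread_self(), SCHED_RR, &param);
}

// Hypothetical helper: read back the scheduling policy of the calling thread.
int current_thread_policy()
{
    int policy = 0;
    sched_param param{};
    const int rc = pthread_getschedparam(pthread_self(), &policy, &param);
    assert(rc == 0);  // querying the calling thread is not expected to fail
    (void)rc;
    return policy;
}
```

With this sketch, the main thread and the SPI reading thread would each call `set_current_thread_rr(10)`, and the laser polling thread `set_current_thread_rr(4)`. Checking the return value matters, because without the necessary privilege the call fails and the thread silently stays on SCHED_OTHER.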

Question 1:

My assumption was that IRQ handlers and the callbacks triggered by spi_async in kernel space have higher priority than any thread running in user space (no matter whether SCHED_RR or SCHED_OTHER). This would mean that switching to SCHED_RR in my application shouldn't slow down the SPI transfer, but this assumption seems to be wrong. Is it?

Question 2:

How can I determine what happens here? Which debugging aids exist? (Or maybe you don't need any further information?) The main question for me is: why do I experience the problems only when the laser device is turned on? Could the USB driver consume so much time?

----- EDIT:

I have made the following observation:

The spi_async's callback calls wake_up_interruptible(&mydata->readq); (with wait_queue_head_t readq;). From the user space (my app) I call a function which results in poll_wait(file, &mydata->readq, wait); When the poll returns the user space calls read().

- When my application runs with SCHED_OTHER, I can see that the callback method finishes before the read() method in my kernel module is entered.
- When my application runs with SCHED_RR, read() is entered before the callback exits.

This seems to prove that the priority of the user space threads is higher than the priority of the context in which the callback runs. Is there any way to change this behaviour and still have SCHED_RR for my application's threads?
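The user-space side of the poll/read sequence described in the edit could look roughly like the sketch below. It is an illustration under assumptions, not the original code: the function name `wait_and_read` is hypothetical, and `fd` is assumed to be an open descriptor for the module's device node.

```cpp
#include <poll.h>
#include <unistd.h>
#include <sys/types.h>
#include <cassert>
#include <cerrno>
#include <cstddef>

// Hypothetical helper: block until fd becomes readable (in the driver, this
// is the point where wake_up_interruptible() on the wait queue wakes the
// poller), then read one message.
// Returns the number of bytes read, or -1 with errno set
// (ETIMEDOUT if nothing arrived within timeout_ms).
ssize_t wait_and_read(int fd, void* buf, std::size_t len, int timeout_ms)
{
    assert(buf != nullptr);

    pollfd pfd{};
    pfd.fd = fd;
    pfd.events = POLLIN;

    int rc;
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);  // retry if interrupted by a signal

    if (rc < 0)
        return -1;                       // poll failed, errno already set
    if (rc == 0) {
        errno = ETIMEDOUT;               // no data within the timeout
        return -1;
    }
    if (!(pfd.revents & POLLIN)) {
        errno = EIO;                     // POLLERR/POLLHUP without data
        return -1;
    }
    return read(fd, buf, len);
}
```

Under SCHED_RR, the thread blocked in `poll()` becomes runnable as soon as the wake-up is issued, so it can enter `read()` before the lower-priority context that issued the wake-up has finished, which matches the observation above.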

asked Oct 06 '11 15:10

Philipp

1 Answer

Not all kernel threads have an RT priority. Imagine a kernel thread that wakes up periodically to do some background work. You don't want this thread to preempt your RT thread. So I guess your first assumption is wrong.

Based on your other questions:

- your main processing loop receives SPI data through a queue
- the SPI processing thread feeds the main processing queue

It seems your main processing thread gets in the way of the SPI driver thread responsible for the SPI data transfer.

Here is what happens:

- An IRQ is fired.
- spi_async is called, which means a data transfer is queued; it will be picked up by a thread created by the SPI master driver.
- The SPI master thread competes with your main processing thread and the laser thread, but this kernel thread has no RT priority, so it loses whenever one of the RR threads is runnable.

What you can do is go back to normal scheduling while experimenting with the various CONFIG_PREEMPT_ options. Or modify the SPI master driver to ensure that any delayed work is queued with enough priority, or even not queued at all.
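One way to check this diagnosis from user space is to inspect the scheduling policy of the kernel thread in question. The sketch below is only an illustration under assumptions: it presumes the SPI master driver runs its transfers in a dedicated kernel thread whose PID you have already found (for example with `ps`), which depends on the kernel version and driver. The names `SchedInfo`, `query_sched`, `policy_name` and `raise_to_fifo` are hypothetical.

```cpp
#include <sched.h>
#include <sys/types.h>
#include <cassert>
#include <cerrno>

// Hypothetical result type for the query below.
struct SchedInfo {
    int policy;    // SCHED_OTHER, SCHED_FIFO, SCHED_RR, ...
    int priority;  // 0 for non-real-time policies
};

// Query policy and RT priority of a task; pid 0 means the calling thread.
// Returns false (with errno set) if the task cannot be queried.
bool query_sched(pid_t pid, SchedInfo& info)
{
    assert(pid >= 0);
    const int policy = sched_getscheduler(pid);
    if (policy < 0)
        return false;
    sched_param param{};
    if (sched_getparam(pid, &param) != 0)
        return false;
    info.policy = policy;
    info.priority = param.sched_priority;
    return true;
}

// Human-readable name for the common policies.
const char* policy_name(int policy)
{
    switch (policy) {
    case SCHED_OTHER: return "SCHED_OTHER";
    case SCHED_FIFO:  return "SCHED_FIFO";
    case SCHED_RR:    return "SCHED_RR";
    default:          return "other";
    }
}

// Try to move a task to SCHED_FIFO at the given priority (needs root or
// CAP_SYS_NICE). Returns 0 on success, otherwise the errno value.
int raise_to_fifo(pid_t pid, int priority)
{
    assert(pid >= 0);
    sched_param param{};
    param.sched_priority = priority;
    return sched_setscheduler(pid, SCHED_FIFO, &param) == 0 ? 0 : errno;
}
```

If `query_sched` reports SCHED_OTHER for the SPI thread while the application threads run at SCHED_RR priority 10, the starvation described above is plausible. Calling `raise_to_fifo` with a priority above 10 is then a quick experiment, assuming the transfer really runs in that thread; if the driver uses a shared workqueue instead, changing a single thread's priority may have no useful effect.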

answered Sep 24 '22 08:09

shodanex
