Problem
I am creating a Windows 7 based C# WPF application using .NET 4.5, and one of its major features is to call certain functions that interface with custom hardware at user-defined cycle times. For example, the user might choose two functions to be called every 10 or 20 milliseconds and another every 500 milliseconds. The smallest cycle time the user can choose is 1 millisecond.
At first it seemed that the timings were accurate and the functions were called every 1 millisecond as required. But we later noticed that about 1-2% of the timings were not accurate: some functions were called just 5 milliseconds late, and others could be up to 100 milliseconds late. Even with cycle times greater than 1 ms, we faced the problem that the thread slept at the time it should have called the external function (a 20 ms function could be called 50 ms late because the thread was sleeping and didn't call the function).
After analysis we concluded that these delays were sporadic, with no noticeable pattern, and that the most likely reason behind them was OS scheduling and thread context switching. In other words, our thread wasn't awake all the time like we need it to be.
As Windows 7 is not an RTOS, we need to find out whether we can work around this problem somehow. But we do know for sure that this problem is fixable on Windows, as we use other tools with similar functionality that can meet those timing constraints with a maximum error of 0.7 ms.
Our application is multithreaded, with a maximum of about 30 threads running at the same time. Its current peak CPU usage is about 13%.
Attempted Solutions
We tried many different things. Timing was mainly measured using the Stopwatch class, and IsHighResolution was true (other timers were used, but we did not notice much difference):
1. Creating a separate thread and giving it high priority
Result: Ineffective (both with the terrible Thread.Sleep() and without it, using continuous polling)
2. Using a C# Task (thread pool)
Result: Very little improvement
3. Using a multimedia timer with 1 ms periodicity
Result: Ineffective or worse. Multimedia timers are accurate at waking up the OS, but the OS may choose to run another thread, so there is no 1 ms guarantee, and occasionally the delays could be much bigger
4. Creating a separate standalone C# project that just contained a while loop and a Stopwatch
Result: Most of the time the accuracy was great, even at microsecond level, but occasionally the thread sleeps
5. Repeating point 4, but with the process priority set to Realtime/High
Result: Very good numbers; almost no message had a significant delay
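For readers who want to reproduce the measurement in points 4 and 5, a minimal sketch of such a standalone probe might look like the following. The names JitterProbe, Measure and CountLate are hypothetical and not taken from our tool; the loop simply busy-waits on a Stopwatch and records how late each deadline was reached.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not part of the original tool: busy-waits on a
// Stopwatch and records how late each periodic deadline was reached.
public static class JitterProbe
{
    // Returns the lateness of each cycle in milliseconds (always >= 0).
    public static double[] Measure(int cycles, double periodMs)
    {
        double ticksPerMs = Stopwatch.Frequency / 1000.0;
        long periodTicks = (long)(periodMs * ticksPerMs);
        var lateness = new double[cycles];

        var sw = Stopwatch.StartNew();
        long deadline = periodTicks;
        for (int i = 0; i < cycles; i++)
        {
            // Continuous polling: no Thread.Sleep, so the only delays come
            // from the scheduler taking the CPU away from this thread.
            while (sw.ElapsedTicks < deadline) { }

            lateness[i] = (sw.ElapsedTicks - deadline) / ticksPerMs;
            // Absolute deadlines avoid accumulating drift across cycles.
            deadline += periodTicks;
        }
        return lateness;
    }

    // Counts the cycles that were later than the given threshold.
    public static int CountLate(double[] lateness, double thresholdMs)
    {
        int count = 0;
        foreach (double l in lateness)
        {
            if (l > thresholdMs) count++;
        }
        return count;
    }
}
```

Calling JitterProbe.Measure(100000, 1.0) and then CountLate(result, 0.7) gives the number of cycles that missed a 0.7 ms tolerance, which can be compared with and without a raised process priority.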
Conclusion:
From the above we found that we have 5 possible courses of action, but we need someone knowledgeable with experience in such problems to point us in the right direction:
1. Our tool can be optimized and the threads managed somehow to ensure the 1 ms real-time requirement. Maybe part of the optimization is setting the process priority of the tool to High or Realtime, but that does not seem like a wise decision, as users could be using several other tools at the same time.
2. We divide our tool into two processes: one that contains the GUI and all the non-time-critical operations, and another that contains the minimal amount of time-critical operations and is set to High/Realtime priority. The processes would communicate using IPC (like WCF). This could benefit us in two ways:
- Less probability of starvation for other processes, as far fewer operations are happening.
- The process would have fewer threads, so there is much less (or no) probability of the thread sleeping.
Note: The next two points deal with kernel space. Please note that I have little knowledge of kernel space and writing drivers, so I might be making some wrong assumptions about how it could be used.
3. Creating a driver in kernel space that uses lower-level interrupts every 1 ms to fire an event that forces the thread to perform its designated task in the process.
4. Moving the time-critical components to kernel space; any interfacing with the main body of the program could be done through APIs and callbacks.
5. Perhaps none of these are valid, and we might need to use a Windows RTOS extension like the IntervalZero RTOS platform?
The Question Itself
There are two answers I am looking for, and I hope they are backed with good sources.
1. Is this truly a threading and context switching problem? Or have we been missing something all this time?
2. Which of the 5 options is guaranteed to fix this problem, and if several are, which is the easiest? If none of these options can fix it, what can? Please remember that other tools we have benchmarked do indeed reach the required timing accuracy on Windows, and when the CPU is under heavy load, one or two timings out of 100,000 could be off by less than 2 milliseconds, which is very acceptable.
In real-time scheduling, the first boundary of thread scheduling goes beyond specifying the scheduling policy and the priority. It requires two controls to be specified for user-level threads: contention scope and allocation domain. Each LWP is attached to a separate kernel-level thread.
Which of the 5 options is guaranteed to fix this problem?
This depends on what accuracy you are trying to achieve. If you're aiming for, say, +/- 1 ms, you have a reasonable chance of getting it done without points 3) to 5). The combination of points 1) and 2) is the way to go.
The C# ThreadPriority enumeration only offers Highest, which corresponds to THREAD_PRIORITY_HIGHEST (2), as the maximum priority. Therefore you'd have to look into the SetThreadPriority function, which allows access to THREAD_PRIORITY_TIME_CRITICAL (15). The Process::PriorityClass property allows access to REALTIME_PRIORITY_CLASS (24).
Note: Code running at such priorities will push all other code out of the way. You'd have to keep that code very light on computation and very safe.
General remarks: It all depends on load. Windows can do pretty well despite the fact that it is not a "realtime OS". However, realtime systems also rely on low load. Nothing is guaranteed, not even on an RTOS, when it is heavily loaded.
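As a rough illustration of the priority settings mentioned above, the following sketch raises the process priority class from managed code and calls SetThreadPriority through P/Invoke. The class name TimeCriticalPriority and the method Raise are hypothetical; the kernel32 functions are real Win32 APIs. It is Windows-only, and it assumes the caller has the privilege to increase scheduling priority (without it, Windows typically grants High instead of Realtime).

```csharp
using System;
using System.ComponentModel;
using System.Diagnostics;
using System.Runtime.InteropServices;
using System.Threading;

// Hypothetical helper: raises the calling thread of the time-critical
// process to the highest priorities discussed above. Windows only.
public static class TimeCriticalPriority
{
    // Win32 value for THREAD_PRIORITY_TIME_CRITICAL, which the managed
    // ThreadPriority enumeration does not expose.
    public const int ThreadPriorityTimeCritical = 15;

    [DllImport("kernel32.dll")]
    private static extern IntPtr GetCurrentThread();

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool SetThreadPriority(IntPtr hThread, int nPriority);

    public static void Raise()
    {
        // REALTIME_PRIORITY_CLASS via the managed Process.PriorityClass property.
        Process.GetCurrentProcess().PriorityClass = ProcessPriorityClass.RealTime;

        // Ask the runtime to keep this managed thread on its current OS
        // thread, since the Win32 priority applies to the OS thread.
        Thread.BeginThreadAffinity();

        if (!SetThreadPriority(GetCurrentThread(), ThreadPriorityTimeCritical))
        {
            throw new Win32Exception(Marshal.GetLastWin32Error());
        }
    }
}
```

Raise() would be called once at the start of the time-critical thread in the small, separate process. Everything that thread does afterwards should be short and should never block indefinitely, because a busy thread at this priority can starve the rest of the system.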