I have a driver which requires microsecond delays. To create this delay, my driver is using the kernel's udelay function. Specifically, there is one call to udelay(90):
iowrite32(data, addr + DATA_OFFSET);
iowrite32(trig, addr + CONTROL_OFFSET);
udelay(30);
trig |= 1;
iowrite32(trig, addr + CONTROL_OFFSET);
udelay(90); // This is the problematic call
We had reliability issues with the device. After a lot of debugging, we traced the problem to the driver resuming before 90us has passed. (See "proof" below.)
I am running kernel version 2.6.38-11-generic SMP (Kubuntu 11.04, x86_64) on an Intel Pentium Dual Core (E5700).
As far as I know, the documentation states that udelay will delay execution for at least the specified time and is uninterruptible. Is there a bug in this version of the kernel, or did I misunderstand something about the use of udelay?
To convince ourselves that the problem was caused by udelay returning too early, we fed a 100kHz clock to one of the I/O ports and implemented our own delay as follows:
// Wait until n falling edges of the 100 kHz clock are observed
void clk100_delay(void __iomem *addr, u32 n) {
    u32 i;  // unsigned, to match n
    for (i = 0; i < n; i++) {
        u32 prev_clk = ioread32(addr);
        while (1) {
            u32 clk = ioread32(addr);
            if (prev_clk && !clk) {
                break;  // high-to-low transition seen
            } else {
                prev_clk = clk;
            }
        }
    }
}
...and the driver now works flawlessly.
As a final note, I found a discussion indicating that frequency scaling could cause the *delay() family of functions to misbehave, but this was on an ARM mailing list. I am assuming such problems would not exist on an x86-based Linux PC.
I don't know of any bug in that version of the kernel (but that doesn't mean that there isn't one).
udelay() isn't "uninterruptible": it does not disable preemption, so your task can be preempted by an RT task during the delay. However, the same is true of your alternate delay implementation, so that is unlikely to be the problem.
Could your actual problem be a DMA coherency / memory ordering issue? Your alternate delay implementation accesses the bus, so this might be hiding the real problem as a side-effect.
The E5700 has X86_FEATURE_CONSTANT_TSC but not X86_FEATURE_NONSTOP_TSC. The TSC is the likely clock source for udelay. Unless it is bound to one of the cores with an affinity mask, your task may have been preempted and rescheduled onto another CPU during the udelay. Or the TSC might not be stable during lower-power CPU modes.
Can you try disabling interrupts or disabling preemption during the udelay? Also, try reading the TSC before and after.
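As a rough sketch of that TSC check: the helpers below only do the arithmetic of turning two TSC readings into elapsed microseconds. The names tsc_delta_to_us and delay_was_short are hypothetical, not kernel APIs, and the comment shows one way the readings might be taken in the driver, assuming get_cycles() and the x86 tsc_khz variable are available in your kernel build.

```c
#include <stdint.h>

/*
 * Hypothetical helpers (not kernel APIs): convert a pair of TSC readings
 * into elapsed microseconds, given the TSC frequency in kHz.
 *
 * In the driver, the readings might be taken roughly like this:
 *
 *     unsigned long flags;
 *     local_irq_save(flags);      // no interrupts or migration on this CPU
 *     t0 = get_cycles();
 *     udelay(90);
 *     t1 = get_cycles();
 *     local_irq_restore(flags);
 *
 * with the frequency taken from the x86 kernel's tsc_khz variable.
 */
static uint64_t tsc_delta_to_us(uint64_t t0, uint64_t t1, uint32_t tsc_khz)
{
    /* cycles / (kHz * 1000) seconds == cycles * 1000 / kHz microseconds */
    return (t1 - t0) * 1000ULL / tsc_khz;
}

/* Nonzero if the measured delay was shorter than the requested one. */
static int delay_was_short(uint64_t t0, uint64_t t1,
                           uint32_t tsc_khz, uint32_t requested_us)
{
    return tsc_delta_to_us(t0, t1, tsc_khz) < requested_us;
}
```

For example, at a nominal 3.0 GHz (tsc_khz of 3000000, an assumed value for the E5700), a delta of 270000 cycles corresponds to 90 us. If both readings are taken on the same CPU with interrupts off and the result is still below 90, that points at udelay or the TSC; if it is 90 or more, the cause is probably elsewhere.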