I am writing a device driver on linux-2.6.26. I want to have a DMA buffer mapped into userspace for sending data from the driver to a userspace application. Please suggest a good tutorial on this.
Thanks
Here is what I have used, in brief...
- get_user_pages to pin the user page(s) and give you an array of struct page * pointers.
- dma_map_page on each struct page * to get the DMA address (a.k.a. "I/O address") for the page. This also creates an IOMMU mapping (if needed on your platform).
- Now tell your device to perform the DMA into the memory using those DMA addresses. Obviously they can be non-contiguous; memory is only guaranteed to be contiguous in multiples of the page size.
- dma_sync_single_for_cpu to do any necessary cache flushes or bounce buffer blitting or whatever. This call guarantees that the CPU can actually see the result of the DMA, since on many systems modifying physical RAM behind the CPU's back results in stale caches.
- dma_unmap_page to free the IOMMU mapping (if it was needed on your platform).
- put_page to un-pin the user page(s).
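Put together, the sequence might look roughly like this. Everything here — the function name, the struct device pointer, the compressed error handling — is just an illustrative sketch against the 2.6.26 API, not a drop-in implementation:

    /*
     * Hypothetical sketch only.  For brevity it maps whole pages; a real
     * driver would honour the sub-page offset and length of the first and
     * last pages.
     */
    #include <linux/mm.h>
    #include <linux/pagemap.h>
    #include <linux/sched.h>
    #include <linux/slab.h>
    #include <linux/dma-mapping.h>

    static int my_dma_from_user(struct device *dev, unsigned long uaddr, size_t len)
    {
        int nr_pages = (offset_in_page(uaddr) + len + PAGE_SIZE - 1) >> PAGE_SHIFT;
        struct page **pages;
        dma_addr_t *dma_addrs;
        int i, pinned, ret = 0;

        pages = kcalloc(nr_pages, sizeof(*pages), GFP_KERNEL);
        dma_addrs = kcalloc(nr_pages, sizeof(*dma_addrs), GFP_KERNEL);
        if (!pages || !dma_addrs) {
            ret = -ENOMEM;
            goto out_free;
        }

        /* Pin the user pages; mmap_sem must be held across get_user_pages(). */
        down_read(&current->mm->mmap_sem);
        pinned = get_user_pages(current, current->mm, uaddr & PAGE_MASK,
                                nr_pages, 1 /* write */, 0 /* force */,
                                pages, NULL);
        up_read(&current->mm->mmap_sem);
        if (pinned < 0) {
            ret = pinned;
            goto out_free;
        }

        /* Map each pinned page for DMA (device writes into memory). */
        for (i = 0; i < pinned; i++) {
            dma_addrs[i] = dma_map_page(dev, pages[i], 0, PAGE_SIZE,
                                        DMA_FROM_DEVICE);
            /* In 2.6.26 dma_mapping_error() takes only the address;
             * newer kernels also take the struct device. */
            if (dma_mapping_error(dma_addrs[i])) {
                ret = -EIO;
                goto out_unmap;
            }
        }

        /* ... program the device with dma_addrs[] and wait for the DMA ... */

        /* Make the DMA'd data visible to the CPU before anyone reads it. */
        for (i = 0; i < pinned; i++)
            dma_sync_single_for_cpu(dev, dma_addrs[i], PAGE_SIZE,
                                    DMA_FROM_DEVICE);

    out_unmap:
        /* Unmap whatever got mapped (i is one past the last mapped page). */
        while (--i >= 0)
            dma_unmap_page(dev, dma_addrs[i], PAGE_SIZE, DMA_FROM_DEVICE);
        /* Un-pin everything get_user_pages() actually pinned. */
        for (i = 0; i < pinned; i++)
            put_page(pages[i]);
    out_free:
        kfree(dma_addrs);
        kfree(pages);
        return ret;
    }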
Note that you must check for errors all the way through here, because there are limited resources all over the place. get_user_pages returns a negative number for an outright error (-errno), but it can return a positive number to tell you how many pages it actually managed to pin (physical memory is not limitless). If this is less than you requested, you still must loop through all of the pages it did pin in order to call put_page on them. (Otherwise you are leaking kernel memory; very bad.)
dma_map_page can also fail (check the returned address with dma_mapping_error()), because IOMMU mappings are another limited resource.
dma_unmap_page and put_page return void, as usual for Linux "freeing" functions. (Linux kernel resource management routines only return errors because something actually went wrong, not because you screwed up and passed a bad pointer or something. The basic assumption is that you are never screwing up, because this is kernel code. Although get_user_pages does check the validity of the user addresses and will return an error if the user handed you a bad pointer.)
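If you would rather treat a short count from get_user_pages as a failure instead of carrying on with fewer pages (the sketch above just uses whatever it got), the cleanup might look like this, reusing the same hypothetical variable names:

    /* The "pinned" pages really were pinned, so every one of them must be
     * released even though the request as a whole is being failed. */
    if (pinned >= 0 && pinned < nr_pages) {
        for (i = 0; i < pinned; i++)
            put_page(pages[i]);   /* otherwise kernel memory is leaked */
        ret = -EFAULT;            /* or retry with just "pinned" pages */
        goto out_free;
    }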
You can also consider using the _sg functions if you want a friendly interface to scatter/gather. Then you would call dma_map_sg instead of dma_map_page, dma_sync_sg_for_cpu instead of dma_sync_single_for_cpu, and so on.
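For example, the pin-then-map portion of the earlier sketch might turn into something like the following; the function name is again made up, and it assumes the pages were already pinned as before:

    /*
     * Hypothetical scatter/gather variant; assumes <linux/scatterlist.h>
     * and <linux/slab.h> are included.
     */
    static int my_dma_from_user_sg(struct device *dev, struct page **pages,
                                   int pinned)
    {
        struct scatterlist *sgl, *sg;
        int i, mapped;

        sgl = kcalloc(pinned, sizeof(*sgl), GFP_KERNEL);
        if (!sgl)
            return -ENOMEM;

        sg_init_table(sgl, pinned);
        for (i = 0; i < pinned; i++)
            sg_set_page(&sgl[i], pages[i], PAGE_SIZE, 0);

        /* One call maps the whole list; the IOMMU may coalesce entries,
         * which is why the returned count can be smaller than "pinned". */
        mapped = dma_map_sg(dev, sgl, pinned, DMA_FROM_DEVICE);
        if (!mapped) {
            kfree(sgl);
            return -EIO;
        }

        for_each_sg(sgl, sg, mapped, i) {
            /* program the device with sg_dma_address(sg) / sg_dma_len(sg) */
        }

        /* ... wait for the DMA to complete ... */

        dma_sync_sg_for_cpu(dev, sgl, mapped, DMA_FROM_DEVICE);
        dma_unmap_sg(dev, sgl, pinned, DMA_FROM_DEVICE); /* original nents */
        kfree(sgl);
        return 0;
    }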
Also note that many of these functions may be more-or-less no-ops on your platform, so you can often get away with being sloppy. (In particular, dma_sync_... and dma_unmap_... do nothing on my x86_64 system.) But on those platforms, the calls themselves get compiled into nothing, so there is no excuse for being sloppy.
OK, this is what I did.
Disclaimer: I'm a hacker in the pure sense of the word and my code ain't the prettiest.
I read LDD3 and the InfiniBand source code and other predecessor stuff and decided that get_user_pages and pinning them and all that other rigmarole was just too painful to contemplate while hungover. Also, I was working with the other person across the PCIe bus, and I was also responsible for "designing" the userspace application.
I wrote the driver such that at load time it preallocates as many buffers as it can with the largest size, by calling myAddr[i] = pci_alloc_consistent(blah, size, &pci_addr[i]) until it fails. (On failure myAddr[i] is NULL, I think; I forget.) I was able to allocate around 2.5GB of buffers, each 4MiB in size, on my meagre machine which only has 4GiB of memory. The total number of buffers varies depending on when the kernel module is loaded, of course; load the driver at boot time and the most buffers are allocated. Each individual buffer's size maxed out at 4MiB on my system, not sure why. I cat'ted /proc/buddyinfo to make sure I wasn't doing anything stupid, which is of course my usual starting pattern.
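In outline, the load-time loop might look something like this; MAX_BUFS, BUF_SIZE, the static arrays and the function names are all made up for the example (a real driver would keep them in its private device structure):

    #include <linux/pci.h>

    #define MAX_BUFS 1024
    #define BUF_SIZE (4 * 1024 * 1024)          /* 4 MiB per buffer */

    static void       *myAddr[MAX_BUFS];
    static dma_addr_t  pci_addr[MAX_BUFS];
    static int         nbufs;

    static void my_prealloc(struct pci_dev *pdev)
    {
        int i;

        for (i = 0; i < MAX_BUFS; i++) {
            myAddr[i] = pci_alloc_consistent(pdev, BUF_SIZE, &pci_addr[i]);
            if (!myAddr[i])                     /* NULL means allocation failed */
                break;
        }
        nbufs = i;                              /* how many buffers we got */
    }

    static void my_free_all(struct pci_dev *pdev)   /* at unload time */
    {
        int i;

        for (i = 0; i < nbufs; i++)
            pci_free_consistent(pdev, BUF_SIZE, myAddr[i], pci_addr[i]);
        nbufs = 0;
    }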
The driver then proceeds to give the array of pci_addr to the PCIe device along with their sizes. The driver then just sits there waiting for the interrupt storm to begin. Meanwhile, in userspace, the application opens the driver, queries the number of allocated buffers (n) and their sizes (using ioctls or reads, etc.) and then proceeds to call mmap() multiple (n) times. Of course mmap() must be properly implemented in the driver, and LDD3 pages 422-423 were handy.
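A hypothetical mmap() in that LDD3 style, with the mmap offset choosing which of the buffers above to map, might look roughly like this:

    /*
     * Sketch only: virt_to_phys() on coherent memory happens to work on
     * x86 but is not portable to every architecture.  Needs <linux/mm.h>
     * and <asm/io.h>; BUF_SIZE, myAddr and nbufs are from the example
     * above.
     */
    static int my_mmap(struct file *filp, struct vm_area_struct *vma)
    {
        unsigned long size = vma->vm_end - vma->vm_start;
        int i = vma->vm_pgoff / (BUF_SIZE >> PAGE_SHIFT);

        if (i >= nbufs || size > BUF_SIZE)
            return -EINVAL;

        vma->vm_flags |= VM_IO | VM_RESERVED;   /* 2.6.26-era flags */

        if (remap_pfn_range(vma, vma->vm_start,
                            virt_to_phys(myAddr[i]) >> PAGE_SHIFT,
                            size, vma->vm_page_prot))
            return -EAGAIN;

        return 0;                /* hooked up via file_operations .mmap */
    }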
Userspace now has n pointers to n areas of driver memory. As the driver is interrupted by the PCIe device, it's told which buffers are "full" or "available" to be sucked dry. The application in turn is pending on a read() or ioctl() to be told which buffers are full of useful data.
The tricky part was managing the userspace-to-kernel-space synchronization such that buffers which are being DMA'd into by the PCIe device are not also being modified by userspace, but that's what we get paid for. I hope this makes sense, and I'd be more than happy to be told I'm an idiot, but please tell me why.
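For what it's worth, the userspace side could be sketched like this; the device node name, the two ioctl numbers and the "read() returns the index of a full buffer" convention are all invented for the example, and handing a consumed buffer back to the driver is left out:

    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define MYDRV_GET_NBUFS   _IOR('M', 1, unsigned long)   /* invented */
    #define MYDRV_GET_BUFSIZE _IOR('M', 2, unsigned long)   /* invented */

    int main(void)
    {
        int fd = open("/dev/mydrv", O_RDWR);
        unsigned long nbufs, bufsize, i, full;
        void **buf;

        if (fd < 0 || ioctl(fd, MYDRV_GET_NBUFS, &nbufs) < 0 ||
            ioctl(fd, MYDRV_GET_BUFSIZE, &bufsize) < 0) {
            perror("setup");
            return 1;
        }

        buf = calloc(nbufs, sizeof(*buf));
        if (!buf)
            return 1;

        for (i = 0; i < nbufs; i++) {
            /* One mmap() per buffer; the offset tells the driver which one. */
            buf[i] = mmap(NULL, bufsize, PROT_READ, MAP_SHARED, fd,
                          (off_t)(i * bufsize));
            if (buf[i] == MAP_FAILED) {
                perror("mmap");
                return 1;
            }
        }

        for (;;) {
            /* Block until the driver reports which buffer holds fresh data. */
            if (read(fd, &full, sizeof(full)) != sizeof(full))
                break;
            /* ... consume buf[full] here ... */
        }
        return 0;
    }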
I recommend this book as well by the way: http://www.amazon.com/Linux-Programming-Interface-System-Handbook/dp/1593272200 . I wish I had that book seven years ago when I wrote my first Linux driver.
There is another type of trickery possible by adding even more memory and not letting the kernel use it, and mmapping on both sides of the userspace/kernelspace divide, but the PCI device must also support higher than 32-bit DMA addressing. I haven't tried it, but I wouldn't be surprised if I'll eventually be forced to.