I want to get data from a DMA enabled, PCIe hardware device into user-space as quickly as possible.
Q: How do I combine "direct I/O to user-space with/and/via a DMA transfer"
Reading through LDD3, it seems that I need to perform a few different types of IO operations!?
dma_alloc_coherent gives me the physical address that I can pass to the hardware device. But I would then need to set up get_user_pages and perform a copy_to_user type call when the transfer completes. This seems a waste: asking the device to DMA into kernel memory (acting as a buffer) and then transferring it again to user-space. LDD3 p453: /* Only now is it safe to access the buffer, copy to user, etc. */
What I ideally want is some memory that:
Do I need single-page streaming mappings, setup mapping and user-space buffers mapped with get_user_pages and dma_map_page?
My code so far sets up get_user_pages at the given address from user-space (I call this the Direct I/O part). Then I call dma_map_page with a page from get_user_pages. I give the device the return value from dma_map_page as the DMA physical transfer address.
I am using some kernel modules as reference: drivers/scsi/st.c and drivers/net/sh_eth.c. I would look at the infiniband code, but can't find which one is the most basic!
Many thanks in advance.
I'm actually working on exactly the same thing right now and I'm going the ioctl() route. The general idea is for user space to allocate the buffer which will be used for the DMA transfer, and an ioctl() will be used to pass the size and address of this buffer to the device driver. The driver will then use scatter-gather lists along with the streaming DMA API to transfer data directly to and from the device and user-space buffer.
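As a minimal user-space sketch of the request such an ioctl() might carry: the struct name dma_xfer_req, the command DMA_IOC_XFER, the magic character 'd' and the helper make_dma_req() are all hypothetical, and a real driver would define its own equivalents in a header shared with user space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Hypothetical request layout shared by the driver and user space. */
struct dma_xfer_req {
    uint64_t user_addr; /* user-space virtual address of the buffer */
    uint64_t size;      /* buffer size in bytes */
};

/* Hypothetical command: magic 'd', number 1, data copied into the kernel. */
#define DMA_IOC_XFER _IOW('d', 1, struct dma_xfer_req)

/* Fill in a request describing an already allocated buffer. */
static struct dma_xfer_req make_dma_req(void *buf, size_t size)
{
    struct dma_xfer_req req;

    assert(buf != NULL);
    req.user_addr = (uint64_t)(uintptr_t)buf;
    req.size = (uint64_t)size;
    return req;
}
```

User space would then call ioctl(fd, DMA_IOC_XFER, &req) on the opened device node. Fixed-width fields keep the layout the same for 32-bit and 64-bit callers.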
The implementation strategy I'm using is that the ioctl() in the driver enters a loop that DMAs the userspace buffer in chunks of 256 KB (the limit imposed by how many scatter/gather entries the hardware can handle). This is isolated inside a function that blocks until each transfer is complete (see below). When all bytes are transferred or the incremental transfer function returns an error, the ioctl() exits and returns to userspace.
Pseudo code for the ioctl():
    /* Serialize all DMA transfers to/from the device */
    if (mutex_lock_interruptible(&device_ptr->mtx))
        return -EINTR;

    ret = 0;
    chunk_data = (unsigned long)user_space_addr;
    while (*transferred < total_bytes && !ret) {
        chunk_bytes = total_bytes - *transferred;
        if (chunk_bytes > HW_DMA_MAX)
            chunk_bytes = HW_DMA_MAX; /* 256 KB limit imposed by my device */

        /* Blocks until this chunk is done; updates *transferred on success */
        ret = transfer_chunk(device_ptr, chunk_data, chunk_bytes, transferred);
        chunk_data += chunk_bytes;
    }

    mutex_unlock(&device_ptr->mtx);
    return ret;
Pseudo code for incremental transfer function:
    /* Assuming the userspace pointer is passed as an unsigned long, */
    /* calculate the first, last, and number of pages being transferred */
    first_page = udata >> PAGE_SHIFT;
    last_page = (udata + nbytes - 1) >> PAGE_SHIFT;
    fp_offset = udata & ~PAGE_MASK; /* offset within the first page */
    npages = last_page - first_page + 1;

    /* Ensure that all userspace pages are locked in memory for the */
    /* duration of the DMA transfer. This is the get_user_pages() signature */
    /* of older kernels; newer kernels take different arguments and use a */
    /* different mm lock. pages_array is an array of struct page pointers. */
    down_read(&current->mm->mmap_sem);
    ret = get_user_pages(current, current->mm, udata, npages,
                         is_writing_to_userspace, 0, pages_array, NULL);
    up_read(&current->mm->mmap_sem);
    /* ret is the number of pages actually pinned; treat ret < npages as an error */

    /* Map a scatter-gather list to point at the userspace pages */
    sg_init_table(sglist, npages);

    /* first (may also be the only page, so never map more than nbytes) */
    sg_set_page(&sglist[0], pages_array[0],
                min_t(size_t, nbytes, PAGE_SIZE - fp_offset), fp_offset);

    /* middle */
    for (i = 1; i < npages - 1; i++)
        sg_set_page(&sglist[i], pages_array[i], PAGE_SIZE, 0);

    /* last */
    if (npages > 1) {
        sg_set_page(&sglist[npages - 1], pages_array[npages - 1],
                    nbytes - (PAGE_SIZE - fp_offset) - ((npages - 2) * PAGE_SIZE), 0);
    }

    /* Do the hardware specific thing to give it the scatter-gather list
       and tell it to start the DMA transfer */

    /* Wait for the DMA transfer to complete. The macro takes the wait */
    /* queue head and the condition themselves, not pointers to them. */
    ret = wait_event_interruptible_timeout(device_ptr->dma_wait,
                                           device_ptr->flag_dma_done, HZ * 2);

    if (ret == 0) {
        /* DMA operation timed out */
        ret = -ETIMEDOUT;
    } else if (ret == -ERESTARTSYS) {
        /* DMA operation interrupted by signal */
    } else {
        /* DMA success */
        *transferred += nbytes;
        ret = 0;
    }
    return ret;
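The first/middle/last arithmetic is easy to get wrong, so here is a small user-space sketch of the same split that can be checked outside the kernel. It assumes a 4096-byte page, and the names SKETCH_PAGE_SIZE, struct seg and split_into_pages() are hypothetical; each resulting entry corresponds to the offset and length that one sg_set_page() call would receive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumed page size for this sketch */

struct seg {
    size_t offset; /* offset into the page */
    size_t len;    /* bytes of the buffer that fall in this page */
};

/*
 * Split [addr, addr + nbytes) into per-page segments.
 * Returns the number of pages touched, or 0 if nbytes is 0
 * or out[] has fewer than the required number of entries.
 */
static size_t split_into_pages(uintptr_t addr, size_t nbytes,
                               struct seg *out, size_t max_segs)
{
    size_t n = 0;
    size_t offset = addr & (SKETCH_PAGE_SIZE - 1);

    assert(out != NULL);
    while (nbytes > 0) {
        size_t len = SKETCH_PAGE_SIZE - offset;

        if (len > nbytes)
            len = nbytes; /* buffer ends inside this page */
        if (n == max_segs)
            return 0;
        out[n].offset = offset;
        out[n].len = len;
        n++;
        nbytes -= len;
        offset = 0; /* every page after the first starts at offset 0 */
    }
    return n;
}
```

Handling the pages in one loop rather than as separate first, middle and last cases means the single-page case needs no special treatment.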
The interrupt handler is exceptionally brief:
    /* Do hardware specific thing to make the device happy */

    /* Wake the thread waiting for this DMA operation to complete */
    device_ptr->flag_dma_done = 1;
    wake_up_interruptible(&device_ptr->dma_wait);
Please note that this is just a general approach, I've been working on this driver for the last few weeks and have yet to actually test it... So please, don't treat this pseudo code as gospel and be sure to double check all logic and parameters ;-).
You basically have the right idea: in 2.1, you can just have userspace allocate any old memory. You do want it page-aligned, so posix_memalign() is a handy API to use.
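A minimal sketch of the user-space allocation, assuming the page size is queried with sysconf(_SC_PAGESIZE) and that rounding the size up to whole pages is acceptable. The helper name alloc_page_aligned() is hypothetical.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Allocate a page-aligned buffer whose size is rounded up to a whole
 * number of pages. Returns NULL on failure; release it with free().
 */
static void *alloc_page_aligned(size_t size, size_t *rounded_size)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t page_size, total;
    void *buf = NULL;

    if (page <= 0 || size == 0)
        return NULL;
    page_size = (size_t)page;
    /* posix_memalign() requires a power-of-two alignment */
    assert((page_size & (page_size - 1)) == 0);
    total = (size + page_size - 1) / page_size * page_size;

    if (posix_memalign(&buf, page_size, total) != 0)
        return NULL;
    if (rounded_size)
        *rounded_size = total;
    return buf;
}
```

The returned pointer and the rounded size are what would be handed to the driver.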
Then have userspace pass in the userspace virtual address and size of this buffer somehow; ioctl() is a good quick and dirty way to do this. In the kernel, allocate an appropriately sized array of struct page* (user_buf_size/PAGE_SIZE entries) and use get_user_pages() to get a list of struct page* for the userspace buffer.
Once you have that, you can allocate an array of struct scatterlist that is the same size as your page array and loop through the list of pages doing sg_set_page(). After the sg list is set up, you do dma_map_sg() on the array of scatterlist and then you can get the sg_dma_address and sg_dma_len for each entry in the scatterlist (note you have to use the return value of dma_map_sg() because you may end up with fewer mapped entries, since things might get merged by the DMA mapping code).
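To illustrate why the count returned by dma_map_sg() can be smaller than the number of entries passed in, here is a user-space toy model. It only merges entries that happen to be adjacent in bus address space; the real DMA mapping code makes that decision itself, and this sketch does not claim to reproduce it. The names struct bus_seg and merge_adjacent() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct bus_seg {
    uint64_t addr; /* bus address of the segment */
    uint32_t len;  /* length in bytes */
};

/*
 * Merge neighbouring entries in place when one ends exactly where
 * the next begins. Returns the new number of entries.
 */
static size_t merge_adjacent(struct bus_seg *segs, size_t n)
{
    size_t out = 0;
    size_t i;

    assert(segs != NULL || n == 0);
    if (n == 0)
        return 0;
    for (i = 1; i < n; i++) {
        if (segs[out].addr + segs[out].len == segs[i].addr)
            segs[out].len += segs[i].len; /* contiguous: extend the current entry */
        else
            segs[++out] = segs[i];        /* gap: start a new entry */
    }
    return out + 1;
}
```

Only the first entries, up to the returned count, are meaningful afterwards, which mirrors how a driver should walk only as many entries as dma_map_sg() returned when reading sg_dma_address and sg_dma_len.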
That gives you all the bus addresses to pass to your device, and then you can trigger the DMA and wait for it however you want. The read()-based scheme you have is probably fine.
You can refer to drivers/infiniband/core/umem.c, specifically ib_umem_get(), for some code that builds up this mapping, although the generality that that code needs to deal with may make it a bit confusing.
Alternatively, if your device doesn't handle scatter/gather lists too well and you want contiguous memory, you could use __get_free_pages() to allocate a physically contiguous buffer and use dma_map_page() on that. To give userspace access to that memory, your driver just needs to implement an mmap method instead of the ioctl as described above.