I am working with a high-speed serial card for high-rate data transfers from an external source to a Linux box with a PCIe card. The PCIe card came with some 3rd-party drivers that use dma_alloc_coherent to allocate the DMA buffers that receive the data. Due to Linux limitations, however, this approach limits data transfers to 4MB. I have been reading about and trying multiple methods for allocating a large DMA buffer and haven't been able to get one to work.
This system has 32GB of memory and is running Red Hat with a 3.10 kernel, and I would like to make 4GB of that memory available as a contiguous DMA buffer. I know the preferred method is scatter/gather, but this is not possible in my situation: a hardware chip beyond my control translates the serial protocol into DMA, and the only thing I can control is an offset added to the incoming addresses (i.e., address zero as seen from the external system can be mapped to address 0x700000000 on the local bus).
Since this is a one-off lab machine, I think the fastest and easiest approach would be to use the mem=28GB boot parameter. I have this working fine, but the next step, accessing that memory from virtual address space, is where I am having problems. Here is my code, condensed to the relevant parts:
In the kernel module:
size_t len = 0x100000000ULL; // 4GB
size_t phys = 0x700000000ULL; // 28GB
size_t virt = ioremap_nocache( phys, len ); // address not usable via direct reference
size_t bus = (size_t)virt_to_bus( (void*)virt ); // this should be the same as phys for x86-64, shouldn't it?
// OLD WAY
/*size_t len = 0x400000; // 4MB
size_t bus;
size_t virt = dma_alloc_coherent( devHandle, len, &bus, GFP_ATOMIC );
size_t phys = (size_t)virt_to_phys( (void*)virt );*/
In the application:
// Attempt to make a usable virtual pointer
u32 pSize = sysconf(_SC_PAGESIZE);
void* mapAddr = mmap(0, len+(phys%pSize), PROT_READ|PROT_WRITE, MAP_SHARED, devHandle, phys-(phys%pSize));
virt = (size_t)mapAddr + (phys%pSize);
// do DMA to 0x700000000 bus address
printf("Value %x\n", *((u32*)virt)); // this is returning zero
Another interesting thing: before I made any of these changes, the physical address returned from dma_alloc_coherent (0x83d000000) was greater than the amount of RAM in the system. I thought that on x86 the RAM always occupies the lowest addresses, so I would have expected an address below 32GB.
Any help would be appreciated.
Instead of limiting the amount of system memory via mem, try using CMA: https://lwn.net/Articles/486301/
Using the CMA kernel command line argument allows you to reserve a certain amount of memory for DMA operations that is guaranteed to be contiguous. The kernel will allow non-DMA processes to use that memory, but as soon as a DMA operation needs it, the non-DMA users will be evicted. So I would advise not changing your mem parameter, but adding cma=4G to your kernel command line instead. dma_alloc_coherent should then automatically pull from that reserved space, but you can enable CMA debugging in your kernel config to make sure.
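As a quick sanity check from user space, you can confirm that the cma= argument actually made it onto the running kernel's command line. The following is only a minimal sketch: parse_cma_bytes and running_kernel_cma_bytes are hypothetical helper names, only the K/M/G size suffixes are handled, and any "@base" part of the argument is ignored. It tells you what was requested, not whether the kernel (which must be built with CMA support) managed to reserve it.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: find a "cma=<size>" token in a kernel command line
 * and return the requested size in bytes, or 0 if the token is absent or
 * has no number. Only K/M/G suffixes are handled. */
unsigned long long parse_cma_bytes(const char *cmdline)
{
    const char *p = cmdline;

    while ((p = strstr(p, "cma=")) != NULL) {
        /* Accept the match only at the start of a token, so that
         * something like "foocma=1G" is not mistaken for it. */
        if (p == cmdline || isspace((unsigned char)p[-1])) {
            char *end;
            unsigned long long n = strtoull(p + 4, &end, 0);

            if (end == p + 4)
                return 0; /* no digits after "cma=" */
            switch (*end) {
            case 'G': case 'g': n <<= 10; /* fall through */
            case 'M': case 'm': n <<= 10; /* fall through */
            case 'K': case 'k': n <<= 10; break;
            default: break;
            }
            return n;
        }
        p += 4; /* skip this non-token match and keep searching */
    }
    return 0;
}

/* Read /proc/cmdline and return the requested CMA size, 0 on any failure. */
unsigned long long running_kernel_cma_bytes(void)
{
    char buf[4096];
    FILE *f = fopen("/proc/cmdline", "r");
    unsigned long long n = 0;

    if (f == NULL)
        return 0;
    if (fgets(buf, sizeof buf, f) != NULL)
        n = parse_cma_bytes(buf);
    fclose(f);
    return n;
}
```

Calling running_kernel_cma_bytes() after a reboot with cma=4G should give 0x100000000, the same length the question uses for its buffer. A result of 0 means the argument never reached the kernel, which is worth ruling out before debugging the driver.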