 

Writing out DMA buffers into memory mapped file

In embedded Linux (2.6.37) I need to write incoming DMA buffers as fast as possible to an HD partition, used as the raw device /dev/sda1. The buffers are aligned as required and are all 512KB long. The process may run for a very long time and fill as much as, for example, 256GB of data. I need to use the memory-mapped file technique (O_DIRECT is not applicable), but I can't work out exactly how to do this. So, in pseudo code, the "normal" writing is:

fd=open("/dev/sda1",O_WRONLY);
while(1) {
    p = GetVirtualPointerToNewBuffer();
    if (InputStopped())
        break;
    write(fd, p, BLOCK512KB);
}

I would be very thankful for a similar pseudo/real code example of how to use the memory-mapped technique for this writing.

UPDATE2: Thanks to kestasx, the latest working test code looks like the following:

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define KB      1024
#define TSIZE   (64*KB)
void* TBuf;
int main(int argc, char **argv) {
    int fdi = open("input.dat", O_RDONLY);
    //int fdo = open("/dev/sdb2", O_RDWR);
    int fdo = open("output.dat", O_RDWR);
    int i;
    off_t offs = 0;
    void* addr;
    // Page-aligned buffer that stands in for the DMA buffer
    i = posix_memalign(&TBuf, TSIZE, TSIZE);
    if ((i != 0) || (fdo < 0) || (fdi < 0)) {   // open() returns -1 on error
        printf("Error in files\n");
        return -1; }
    while (1) {
        // Without MAP_FIXED the kernel treats TBuf as a hint only
        addr = mmap((void*)TBuf, TSIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fdo, offs);
        if (addr == MAP_FAILED) {
            printf("Error MMAP=%d, %s\n", errno, strerror(errno));
            return -1; }
        i = read(fdi, TBuf, TSIZE);
        if (i != TSIZE) {
            printf("End of data\n");
            return 0; }
        i = munmap(addr, TSIZE);
        offs += TSIZE;
        sleep(1);
    }
}

UPDATE3: 1. To imitate the DMA work precisely, I need to move the read() call before mmap(), because when the DMA finishes it gives me the address where it has put the data. So, in pseudo code:

while(1) {
    read(fdi, TBuf, TSIZE);
    addr = mmap((void*)TBuf, TSIZE, PROT_READ|PROT_WRITE, MAP_FIXED|MAP_SHARED, fdo, offs);
    munmap(addr, TSIZE);
    offs += TSIZE;
}

This variant fails after(!) the first loop: read() reports BAD ADDRESS on TBuf. Without understanding exactly what I was doing, I replaced munmap() with msync(). This worked perfectly.
So, the question here: why did unmapping addr affect TBuf?

2. With the previous example working, I moved to the real system with the DMA. It is the same loop, except that instead of the read() call there is a call which waits for a DMA buffer to be ready and provides its virtual address.
There are no errors and the code runs, BUT nothing is recorded (!). My thought was that Linux does not see that the area was updated and therefore does not sync() anything.
To test this, I removed the read() call from the working example, and yes, nothing was recorded there either.

So, the question here: how can I tell Linux that the mapped region contains new data and ask it to flush it?

Thanks a lot!!!

leonp asked Nov 09 '22 21:11


1 Answer

If I understand correctly, this makes sense if You mmap() a file (I'm not sure whether You can mmap() a raw partition/block device) and the data is written via DMA directly into this memory region.

For this to work You need to be able to control p (where the new buffer is placed) or the address where the file is mapped. If You can't, You'll have to copy the memory contents (and will lose some of the benefits of mmap).

So the pseudo code would be:

truncate("data.bin", 256GB);
fd = open( "data.bin", O_RDWR );
p = GetVirtualPointerToNewBuffer();
adr = mmap( p, 1GB, PROT_READ | PROT_WRITE, MAP_SHARED, fd, offset_in_file );
startDMA();
waitDMAfinish();
munmap( adr, 1GB );

This is only the first step, and I'm not completely sure it will work with DMA (I have no such experience).

I assume it is a 32-bit system, but even then a 1GB mapped file size may be too big (if Your RAM is smaller, You'll be swapping).

If this setup works, the next step would be to make a loop that maps regions of the file at different offsets and unmaps the ones that are already filled.

Most likely You'll need to align addr to a 4KB boundary.

When You unmap a region, its data will be synced to disk (msync() can be used if You need to control when). So You'll need some testing to select an appropriate mapped region size (while the next region is being filled by DMA, there must be enough time to unmap/write the previous one).
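The map/fill/unmap loop described above could look roughly like the sketch below. It is an illustration under assumptions, not tested against DMA: write_through_windows is a hypothetical helper, memcpy() stands in for the DMA transfer, the target is a regular file (ftruncate() would not apply to a block device), and msync(MS_SYNC) is called explicitly because the mmap manual names msync() as the way to control when updates reach the file.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy `total` bytes from `src` into `fd`, one
 * `win`-byte mapped window at a time. `win` must be a multiple of the
 * page size and `total` a multiple of `win`. Returns 0 on success. */
static int write_through_windows(int fd, const uint8_t *src,
                                 size_t total, size_t win)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0 && win % (size_t)page == 0 && total % win == 0);

    /* The file must be at least as long as the regions we map. */
    if (ftruncate(fd, (off_t)total) != 0)
        return -1;

    for (size_t offs = 0; offs < total; offs += win) {
        /* Map the next window of the file; the kernel picks the address. */
        void *addr = mmap(NULL, win, PROT_READ | PROT_WRITE,
                          MAP_SHARED, fd, (off_t)offs);
        if (addr == MAP_FAILED)
            return -1;

        /* Stand-in for the DMA transfer filling this window. */
        memcpy(addr, src + offs, win);

        /* Ask for the dirty pages to be written before unmapping. */
        if (msync(addr, win, MS_SYNC) != 0) {
            munmap(addr, win);
            return -1;
        }
        if (munmap(addr, win) != 0)
            return -1;
    }
    return 0;
}
```

A caller would open the output file with O_RDWR and pass a window size that is a multiple of the page size, for example 64KB. In the real setup the memcpy() line is the part that the DMA engine would replace, which is exactly where the open questions in this answer begin.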

UPDATE:

I simply don't know what exactly happens when You fill an mmap'ed region via DMA (I'm not sure how exactly dirty pages are detected: what is done by hardware, and what must be done by software).

UPDATE2: To the best of my knowledge:

DMA works the following way:

  • the CPU arranges the DMA transfer (the address in RAM where the transferred data is to be written);
  • the DMA controller does the actual work, while the CPU can do its own work in parallel;
  • once the DMA transfer is complete, the DMA controller signals the CPU via an IRQ line (interrupt), so the CPU can handle the result.

This seems simple as long as virtual memory is not involved: DMA should work independently of the running process (the actual VM table in use by the CPU). Yet there should be some mechanism to invalidate the CPU cache for physical RAM pages modified by DMA (I don't know whether the CPU needs to do something or whether it is done automatically by hardware).

mmap() works the following way:

  • after a successful call of mmap(), the file on disk is attached to a process memory range (most likely some data structure is filled in the OS kernel to hold this info);
  • I/O (reading or writing) on the mmaped range triggers a page fault, which is handled by the kernel loading the appropriate blocks from the attached file;
  • writes to the mmaped range are handled by hardware (I don't know how exactly: maybe writes to previously unmodified pages trigger some fault, which is handled by the kernel marking these pages dirty; or maybe this marking is done entirely in hardware and this info is available to the kernel when it needs to flush modified pages to disk);
  • modified (dirty) pages are written to disk by the OS (as it sees appropriate) or can be forced via msync() or munmap().

In theory it should be possible to do DMA transfers to an mmaped range, but You need to find out how exactly pages are marked dirty (whether You need to do something to inform the kernel which pages need to be written to disk).

UPDATE3:

Even if pages modified by DMA are not marked dirty, You should be able to trigger the marking by rewriting (reading and then writing back the same value) at least one value in each transferred page (most likely every 4KB). Just make sure this rewriting is not removed (optimised out) by the compiler.
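A minimal sketch of that per-page rewrite, assuming the mapping already exists and the DMA transfer into it has finished. touch_pages is a hypothetical helper name. The volatile pointer is what keeps the compiler from dropping the read and the write-back. Whether the value the CPU reads is the one the DMA engine wrote depends on cache coherency on the platform, which this sketch does not address.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: store to one byte in every page of
 * [addr, addr + len) without changing its value, so that the CPU itself
 * writes to each page. Returns the number of pages touched. */
static size_t touch_pages(void *addr, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    volatile unsigned char *p = addr;
    size_t touched = 0;
    for (size_t off = 0; off < len; off += (size_t)page) {
        p[off] = p[off];   /* volatile read, then volatile write-back */
        touched++;
    }
    return touched;
}
```

It would be called on the mapped window after the DMA completion is signalled and before msync() or munmap().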

UPDATE4:

It seems a file opened O_WRONLY can't be mmap'ed (see the question comments; my experiments confirm this too). It is a logical conclusion of the mmap() workings described above. The same is confirmed here (with reference to the POSIX standard requirement to ensure the file is readable regardless of the mapping protection flags).

Unless there is some way around this, it actually means that by using mmap() You can't avoid reading the results file (an unnecessary step in Your case).

Regarding DMA transfers to a mapped range, I think it will be a requirement to ensure the mapped pages are preallocated before DMA starts (so there is real memory assigned to both the DMA and the mapped region). On Linux there is the MAP_POPULATE mmap flag, but from the manual it seems it works with MAP_PRIVATE mappings only (changes are not written to disk), so most likely it is unsuitable. Likely You'll have to trigger page faults manually by accessing each mapped page. This should trigger reading of the results file.

If You still wish to use mmap and DMA together, but avoid reading the results file, You'll have to modify kernel internals to allow mmap to use O_WRONLY files (for example by zero-filling the triggered pages instead of reading them from disk).
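The O_WRONLY restriction is easy to check with a small experiment. The sketch below assumes Linux and a regular file; try_shared_map is a hypothetical helper that reports the errno from mmap(). The Linux mmap manual lists EACCES for a file mapping whose descriptor is not open for reading.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: try a shared, writable mapping of `path` opened
 * with `open_flags`. Returns 0 if mmap() succeeded, otherwise the errno
 * reported by the call that failed. */
static int try_shared_map(const char *path, int open_flags, size_t len)
{
    assert(len > 0);

    int fd = open(path, open_flags | O_CREAT, 0600);
    if (fd < 0)
        return errno;

    int err = 0;
    if (ftruncate(fd, (off_t)len) != 0) {
        /* Could not size the file, so there is nothing to map. */
        err = errno;
    } else {
        void *addr = mmap(NULL, len, PROT_WRITE, MAP_SHARED, fd, 0);
        if (addr == MAP_FAILED)
            err = errno;
        else
            munmap(addr, len);
    }
    close(fd);
    return err;
}
```

Calling it once with O_WRONLY and once with O_RDWR on the same path shows the difference: only the O_RDWR descriptor can be mapped.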
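Triggering the page faults by hand could look like the sketch below, again only as an illustration. prefault_pages and resident_pages are hypothetical helpers; the second one uses the Linux-specific mincore() call to ask the kernel which pages it currently reports as resident. Neither touching a page nor mincore() pins it, so the kernel is still free to evict the page afterwards.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: read one byte from every page so the kernel has
 * to fault each page in. Returns the sum of the bytes read. */
static unsigned prefault_pages(const void *addr, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    const volatile unsigned char *p = addr;
    unsigned sum = 0;
    for (size_t off = 0; off < len; off += page)
        sum += p[off];
    return sum;
}

/* Hypothetical helper: count the pages of [addr, addr + len) that
 * mincore() reports as resident, or return -1 on error. */
static long resident_pages(void *addr, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    assert((uintptr_t)addr % page == 0);   /* mincore() needs alignment */

    unsigned char vec[64];                 /* one status byte per page */
    long resident = 0;
    for (size_t off = 0; off < len; off += 64 * page) {
        size_t chunk = len - off < 64 * page ? len - off : 64 * page;
        if (mincore((char *)addr + off, chunk, vec) != 0)
            return -1;
        size_t n = (chunk + page - 1) / page;
        for (size_t i = 0; i < n; i++)
            resident += vec[i] & 1;        /* bit 0 set: page is resident */
    }
    return resident;
}
```

On a file mapping, prefault_pages() is the step that would cause the reads of the results file mentioned above.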

kestasx answered Nov 25 '22 08:11