Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Asynchronous io in c using windows API: which method to use and why does my code execute synchronous?

I have a C application which generates a lot of output and for which speed is critical. The program is basically a loop over a large (8-12GB) binary input file which must be read sequentially. In each iteration the read bytes are processed and output is generated and written to multiple files, but never to multiple files at the same time. So if you are at the point where output is generated and there are 4 output files you write to either file 0 or 1 or 2, or 3. At the end of the iteration I now write the output using fwrite(), thereby waiting for the write operation to finish. The total number of output operations is large, up to 4 million per file, and output size of files ranges from 100mb to 3.5GB. The program runs on a basic multicore processor.

I want to write output in a separate thread and I know this can be done with

  1. Asyncronous I/O
  2. Creating threads
  3. I/O completion ports

I have 2 type of questions, namely conceptual and code specific.

Conceptual Question

What would be the best approach. Note that the application should be portable to Linux, however, I don't see how that would be very important for my choice for 1-3, since I would write a wrapper around anything kernel/API specific. For me the most important criteria is speed. I have read that option 1 is not that likely to increase the performance of the program and that the kernel in any case creates new threads for the i/o operation, so then why not use option (2) immediately with the advantage that it seems easier to program (also since I did not succeed with option (1), see code issues below).

Note that I read https://stackoverflow.com/questions/3689759/how-can-i-run-a-specific-function-of-thread-asynchronously-in-c-c, but I dont see a motivation on what to use based on the nature of the application. So I hope somebody could provide me with some advice what would be best in my situation. Also from the book "Windows System Programming" by Johnson M. Hart, I know that the recommendation is using threads, mainly because of the simplicity. However, will it also be fastest?

Code Question

This question involves the attempts I made so far to make asynchronous I/O work. I understand that its a big piece of code so that its not that easy to look into. In any case I would really appreciate any attempt.

To decrease execution time I try to write the output by means of a new thread using WINAPI via CreateFile() with FILE_FLAGGED_OVERLAP with an overlapped structure. I have created a sample program in which I try to get this to work. However, I encountered 2 problems:

  1. The file is only opened in overlapped mode when I delete an already existing file (I have tried using CreateFile in different modes (CREATE_ALWAYS, CREATE_NEW, OPEN_EXISTING), but this does not help).

  2. Only the first WriteFile is executed asynchronously. The remainder of WriteFile commands is synchronous. For this problem I already consulted http://support.microsoft.com/kb/156932. It seems that the problem I have is related to the fact that "any write operation to a file that extends its length will be synchronous". I've already tried to solve this by increasing file size/valid data size (commented region in code). However, I still do not get it to work. I'm aware of the fact that it could be the case that to get most out of asynchronous io i should CreateFile with FILE_FLAG_NO_BUFFERING, however I cannot get this to work as well.

Please note that the program creates a file of about 120mb in the path of execution. Also note that print statements "not ok" are not desireable, I would like to see "can do work in background" appear on my screen... What goes wrong here?

#include <windows.h>
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define ASYNC  // remove this definition to run synchronously (i.e. using fwrite)

#ifdef ASYNC
    struct _OVERLAPPED *pOverlapped;
    HANDLE *pEventH;
    HANDLE *pFile;
#else
    FILE *pFile;
#endif

#define DIM_X   100
#define DIM_Y   150000

#define _PRINTERROR(msgs)\
 {printf("file: %s, line: %d, %s",__FILE__,__LINE__,msgs);\
 fflush(stdout);\
 return 0;}    \

#define _PRINTF(msgs)\
 {printf(msgs);\
  fflush(stdout);}      \

#define _START_TIMER       \
 time_t time1,time2;       \
 clock_t clock1;        \
 time(&time1);        \
 printf("start time: %s",ctime(&time1));  \
 fflush(stdout);

#define _END_TIMER\
 time(&time2);\
 clock1 = clock();\
 printf("end time: %s",ctime(&time2));\
 printf("elapsed processor time: %.2f\n",(((float)clock1)/CLOCKS_PER_SEC));\
 fflush(stdout);

double  aio_dat[DIM_Y] = {0};

double do_compute(double A,double B, int arr_len);

int main()
{
 _START_TIMER;

 const char *pName = "test1.bin";

    DWORD dwBytesToWrite;
    BOOL bErrorFlag = FALSE;

 int j=0;
 int i=0;
 int fOverlapped=0;

    #ifdef ASYNC
     // create / open the file
        pFile=CreateFile(pName,
            GENERIC_WRITE,   // open for writing
            0,               // share write access
            NULL,            // default security
            CREATE_ALWAYS,   // create new/overwrite existing
         FILE_FLAG_OVERLAPPED,  // | FILE_FLAG_NO_BUFFERING,   // overlapped file
         NULL);           // no attr. template

  // check whether file opening was ok
     if(pFile==INVALID_HANDLE_VALUE){
   printf("%x\n",GetLastError());
   _PRINTERROR("file not opened properly\n");
  }

  // make the overlapped structure
     pOverlapped = calloc(1,sizeof(struct _OVERLAPPED)); 
  pOverlapped->Offset = 0;
  pOverlapped->OffsetHigh = 0;

   // put event handle in overlapped structure
  if(!(pOverlapped->hEvent = CreateEvent(NULL,TRUE,FALSE,NULL))){
   printf("%x\n",GetLastError());
   _PRINTERROR("error in createevent\n");
  }
 #else
  pFile = fopen(pName,"wb");
 #endif 

 // create some output 
 for(j=0;j<DIM_Y;j++){
     aio_dat[j] = do_compute(i, j, DIM_X);
 }

 // determine how many bytes should be written
  dwBytesToWrite = (DWORD)sizeof(aio_dat);

 for(i=0;i<DIM_X;i++){ // do this DIM_X times

        #ifdef ASYNC
         //if(i>0){
      //SetFilePointer(pFile,dwBytesToWrite,NULL,FILE_CURRENT);
      //if(!(SetEndOfFile(pFile))){
      // printf("%i\n",pFile);
      // _PRINTERROR("error in set end of file\n");
      //}
      //SetFilePointer(pFile,-dwBytesToWrite,NULL,FILE_CURRENT);
      //}

      // write the bytes
      if(!(bErrorFlag = WriteFile(pFile,aio_dat,dwBytesToWrite,NULL,pOverlapped))){
       // check whether io pending or some other error
       if(GetLastError()!=ERROR_IO_PENDING){
    printf("lasterror: %x\n",GetLastError());
    _PRINTERROR("error while writing file\n");
       }
       else{
        fOverlapped=1;
       }
      }
          else{
       // if you get here output got immediately written; bad!
       fOverlapped=0;
      }

      if(fOverlapped){   
       // do background, this msgs is what I want to see
       for(j=0;j<DIM_Y;j++){
        aio_dat[j] = do_compute(i, j, DIM_X);
       }
       for(j=0;j<DIM_Y;j++){
        aio_dat[j] = do_compute(i, j, DIM_X);
       }

       _PRINTF("can do work in background\n");
      }
      else{
       // not overlapped, this message is bad
       _PRINTF("not ok\n");
      } 

                // wait to continue
      if((WaitForSingleObject(pOverlapped->hEvent,INFINITE))!=WAIT_OBJECT_0){
       _PRINTERROR("waiting did not succeed\n");
      }

      // reset event structure
      if(!(ResetEvent(pOverlapped->hEvent))){
       printf("%x\n",GetLastError());
       _PRINTERROR("error in resetevent\n");
      }

      pOverlapped->Offset+=dwBytesToWrite;

  #else
      fwrite(aio_dat,sizeof(double),DIM_Y,pFile);
   for(j=0;j<DIM_Y;j++){
    aio_dat[j] = do_compute(i, j, DIM_X);
   }
   for(j=0;j<DIM_Y;j++){
    aio_dat[j] = do_compute(i, j, DIM_X);
   }
  #endif
 }

 #ifdef ASYNC
     CloseHandle(pFile);
  free(pOverlapped);
 #else
     fclose(pFile);
 #endif

 _END_TIMER;

 return 1;
} 

double do_compute(double A,double B, int arr_len)
{
  int i;
  double   res = 0;
  double  *xA = malloc(arr_len * sizeof(double));
  double  *xB = malloc(arr_len * sizeof(double));

  if ( !xA || !xB )
   abort();

  for (i = 0; i < arr_len; i++) {
   xA[i] = sin(A);
   xB[i] = cos(B);
   res = res + xA[i]*xA[i];
  }

  free(xA);
  free(xB);

  return res;
}

Useful links

  • http://software.intel.com/sites/products/documentation/studio/composer/en-us/2011/compiler_c/cref_cls/common/cppref_asynchioC_aio_read_write_eg.htm

  • http://www.ibm.com/developerworks/linux/library/l-async/?ca=dgr-lnxw02aUsingPOISIXAIOAPI

  • http://www.flounder.com/asynchexplorer.htm#Asynchronous%20I/O

I know this is a big question and I would like to thank everybody in advance who takes the trouble reading it and perhaps even respond!

like image 773
Martin Avatar asked Dec 15 '10 14:12

Martin


People also ask

What is asynchronous I O and how would it be used in a network application?

Unsourced material may be challenged and removed. In computer science, asynchronous I/O (also non-sequential I/O) is a form of input/output processing that permits other processing to continue before the transmission has finished. A name used for asynchronous I/O in the Windows API is overlapped I/O.

What is the difference between synchronous and asynchronous I O?

Synchronous I/O mean that some flow of execution (such as a process or thread) is waiting for the operation to complete. Asynchronous I/O means that nothing is waiting for the operation to complete and the completion of the operation itself causes something to happen.

What is synchronous and asynchronous in C#?

In synchronous operations tasks are performed one at a time and only when one is completed, the following is unblocked. In other words, you need to wait for a task to finish to move to the next one. In asynchronous operations, on the other hand, you can move to another task before the previous one finishes.

What are the advantages of having asynchronous I O calls available on through system calls?

The Asynchronous I/O feature enhances performance by allowing applications to overlap processing with I/O operations. Using Asynchronous I/O enables an application to have more CPU time available to perform other processing while the I/O is taking place.


1 Answers

You should be able to get this to work using the OVERLAPPED structure.

You're on the right track: the system is preventing you from writing asynchronously because every WriteFile extends the size of the file. However, you're doing the file size extension wrong. Simply calling SetFileSize will not actually reserve space in the MFT. Use the SetFileValidData function. This will allocate clusters for your file (note that they will contain whatever garbage the disk had there) and you should be able to execute WriteFile and your computation in parallel.

I would stay away from FILE_FLAG_NO_BUFFERING. You're after more performance with parallelism I presume? Don't prevent the cache from doing its job.

like image 122
martona Avatar answered Nov 08 '22 22:11

martona