dispatch_sync vs. dispatch_async on main queue

Bear with me, this is going to take some explaining. I have a function that looks like the one below.

Context: "aProject" is a Core Data entity named LPProject with an array named 'memberFiles' that contains instances of another Core Data entity called LPFile. Each LPFile represents a file on disk and what we want to do is open each of those files and parse its text, looking for @import statements that point to OTHER files. If we find @import statements, we want to locate the file they point to and then 'link' that file to this one by adding a relationship to the core data entity that represents the first file. Since all of that can take some time on large files, we'll do it off the main thread using GCD.

- (void) establishImportLinksForFilesInProject:(LPProject *)aProject {
    dispatch_queue_t taskQ = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);

    for (LPFile *fileToCheck in aProject.memberFiles) {
        if (/* Some condition is met */) {
            dispatch_async(taskQ, ^{
                // Here, we do the scanning for @import statements.
                // When we find a valid one, we put the whole path to the imported file
                // into an array called 'verifiedImports'.

                // go back to the main thread and update the model (Core Data is not thread-safe.)
                dispatch_sync(dispatch_get_main_queue(), ^{
                    NSLog(@"Got to main thread.");

                    for (NSString *import in verifiedImports) {
                        // Add the relationship to Core Data LPFile entity.
                    }
                });//end block
            });//end block
        }
    }
}

Now, here's where things get weird:

This code works, but I'm seeing an odd problem. If I run it on an LPProject that has a few files (about 20), it runs perfectly. However, if I run it on an LPProject that has more files (say, 60-70), it does NOT run correctly: we never get back to the main thread, the NSLog(@"Got to main thread."); output never appears, and the app hangs. BUT (and this is where things get REALLY weird) if I run the code on the small project FIRST and THEN run it on the large project, everything works perfectly. It's ONLY when I run the code on the large project first that the trouble shows up.

And here's the kicker: if I change the second dispatch line to this:

dispatch_async(dispatch_get_main_queue(), ^{

(That is, use async instead of sync to dispatch the block to the main queue), everything works all the time. Perfectly. Regardless of the number of files in a project!

I'm at a loss to explain this behavior. Any help or tips on what to test next would be appreciated.

asked Jun 30 '11 17:06

Bryan

2 Answers

This is a common issue related to disk I/O and GCD. Basically, GCD is probably spawning one thread for each file, and at a certain point you've got too many threads for the system to service in a reasonable amount of time.

Every time you call dispatch_async() and attempt to do any I/O in that block (for example, it looks like you're reading some files here), it's likely that the thread executing that block will block (get paused by the OS) while it waits for the data to be read from the filesystem. When GCD sees that one of its worker threads is blocked on I/O and you're still asking it to do more work concurrently, it'll just spawn a new worker thread. Thus if you try to open 50 files on a concurrent queue, it's likely that you'll end up causing GCD to spawn ~50 threads.

This is too many threads for the system to meaningfully service, and you end up starving your main thread for CPU.
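One way to see this for yourself is to count how many blocks are in flight at once. The following is only a minimal sketch, not code from the question: PeakConcurrency is a hypothetical helper, the sleep stands in for a blocking file read, and ARC is assumed. Passing a global concurrent queue typically gives a peak that grows with the number of blocked tasks, while a serial queue never runs more than one block at a time.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the question): submits `taskCount` blocking
// tasks to `queue` and returns the largest number that ran at the same time.
static NSInteger PeakConcurrency(dispatch_queue_t queue, NSInteger taskCount) {
    NSLock *lock = [[NSLock alloc] init];
    __block NSInteger inFlight = 0;
    __block NSInteger peak = 0;
    dispatch_group_t group = dispatch_group_create();

    for (NSInteger i = 0; i < taskCount; i++) {
        dispatch_group_async(group, queue, ^{
            [lock lock];
            inFlight += 1;
            if (inFlight > peak) {
                peak = inFlight;
            }
            [lock unlock];

            // Assumption: a short sleep stands in for a blocking file read.
            [NSThread sleepForTimeInterval:0.05];

            [lock lock];
            inFlight -= 1;
            [lock unlock];
        });
    }

    // Block the caller until every task has finished.
    dispatch_group_wait(group, DISPATCH_TIME_FOREVER);
    return peak;
}
```

Call it once with dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0) and once with a queue created via DISPATCH_QUEUE_SERIAL, and log the two results to compare them. The exact number for the concurrent queue depends on the machine and OS version.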

The way to fix this is to use a serial queue instead of a concurrent queue to do your file-based operations. It's easy to do. You'll want to create a serial queue and store it as an ivar in your object so you don't end up creating multiple serial queues. So remove this call:

dispatch_queue_t taskQ = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);

Add this in your init method:

taskQ = dispatch_queue_create("com.yourcompany.yourMeaningfulLabel", DISPATCH_QUEUE_SERIAL);

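Add this in your dealloc method (only under manual reference counting; when compiling with ARC for OS X 10.8 / iOS 6 or later, dispatch objects are managed automatically and this call must be left out):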

dispatch_release(taskQ);

And add this as an ivar in your class declaration:

dispatch_queue_t taskQ;
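Put together, the pieces might look something like the sketch below. It assumes ARC (so there is no dispatch_release), and the class name LPImportScanner, the method name, the queue label, and the crude "line starts with @import" check are all made up for illustration; none of them come from the question.

```objc
#import <Foundation/Foundation.h>

// Hypothetical class: the question never names the class that owns the method.
@interface LPImportScanner : NSObject
// Counts lines starting with "@import" in the file at `path` on a private
// serial queue, then delivers the count on `replyQueue`.
- (void)countImportsInFileAtPath:(NSString *)path
                      replyQueue:(dispatch_queue_t)replyQueue
                      completion:(void (^)(NSUInteger importCount))completion;
@end

@implementation LPImportScanner {
    dispatch_queue_t _taskQ; // ivar: one serial queue per scanner instance
}

- (instancetype)init {
    self = [super init];
    if (self) {
        // Created once here so repeated calls share the same serial queue.
        _taskQ = dispatch_queue_create("com.example.importScanner", DISPATCH_QUEUE_SERIAL);
    }
    return self;
}

- (void)countImportsInFileAtPath:(NSString *)path
                      replyQueue:(dispatch_queue_t)replyQueue
                      completion:(void (^)(NSUInteger importCount))completion {
    dispatch_async(_taskQ, ^{
        // An unreadable file yields nil text, which simply counts as zero imports.
        NSString *text = [NSString stringWithContentsOfFile:path
                                                   encoding:NSUTF8StringEncoding
                                                      error:NULL];
        NSCharacterSet *blank = [NSCharacterSet whitespaceCharacterSet];
        NSUInteger count = 0;
        for (NSString *line in [text componentsSeparatedByString:@"\n"]) {
            if ([[line stringByTrimmingCharactersInSet:blank] hasPrefix:@"@import"]) {
                count += 1;
            }
        }
        // Hand the result back without making the serial queue wait.
        dispatch_async(replyQueue, ^{
            completion(count);
        });
    });
}

@end
```

In an app you would pass dispatch_get_main_queue() as replyQueue. Because every file goes through the same serial queue, only one read is in progress at a time no matter how many files you submit.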

answered Sep 29 '22 16:09

Ryan

I believe Ryan is on the right path: there are simply too many threads being spawned when a project has 1,500 files (the number I decided to test with).

So, I refactored the code above to work like this:

- (void) establishImportLinksForFilesInProject:(LPProject *)aProject {
    dispatch_queue_t taskQ = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);

    dispatch_async(taskQ,
    ^{
        // Create a new Core Data Context on this thread using the same persistent data store
        // as the main thread. Pass the objectID of aProject to access the managedObject
        // for that project on this thread's context:
        NSManagedObjectID *projectID = [aProject objectID];

        for (LPFile *fileToCheck in [[backgroundContext objectWithID:projectID] memberFiles])
        {
            if (/* Some condition is met */)
            {
                // Here, we do the scanning for @import statements.
                // When we find a valid one, we put the whole path to the
                // imported file into an array called 'verifiedImports'.

                // Pass this ID to main thread in dispatch call below to access the same
                // file in the main thread's context
                NSManagedObjectID *fileID = [fileToCheck objectID];

                // go back to the main thread and update the model
                // (Core Data is not thread-safe.)
                dispatch_async(dispatch_get_main_queue(),
                ^{
                    for (NSString *import in verifiedImports)
                    {
                        LPFile *targetFile = [mainContext objectWithID:fileID];
                        // Add the relationship to targetFile.
                    }
                });//end block
            }
        }

        // Easy way to tell when we're done processing all files.
        // Could add a dispatch_async(main_queue) call here to do something like UI updates, etc
    });//end block
}

So, basically, we're now spawning one thread that reads all the files instead of one-thread-per-file. Also, it turns out that calling dispatch_async() on the main_queue is the correct approach: the worker thread will dispatch that block to the main thread and NOT wait for it to return before proceeding to scan the next file.

This implementation essentially sets up a "serial" queue as Ryan suggested (the for loop is the serial part of it), but with one advantage: when the for loop ends, we're done processing all the files and we can just stick a dispatch_async(main_queue) block there to do whatever we want. It's a very nice way to tell when the background processing is finished, and my old version had no equivalent.
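Stripped of Core Data, the shape of that pattern is roughly the sketch below. The function name, its parameters, and the "line starts with @import" check are placeholders made up for illustration, and ARC is assumed; updateQueue would be dispatch_get_main_queue() in the app. The main queue, like any serial queue, runs blocks in the order they were submitted, so the final block only runs after every per-file update.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not the method from the question): scans every file on
// ONE background block, posts each result to `updateQueue` without waiting,
// and posts `finished` last.
static void ScanFilesForImports(NSArray<NSString *> *paths,
                                dispatch_queue_t updateQueue,
                                void (^update)(NSString *path, NSArray<NSString *> *imports),
                                void (^finished)(void)) {
    dispatch_queue_t taskQ = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);

    dispatch_async(taskQ, ^{
        NSCharacterSet *blank = [NSCharacterSet whitespaceCharacterSet];

        // The loop itself is the "serial" part: one file at a time.
        for (NSString *path in paths) {
            NSString *text = [NSString stringWithContentsOfFile:path
                                                       encoding:NSUTF8StringEncoding
                                                          error:NULL];
            NSMutableArray<NSString *> *verifiedImports = [NSMutableArray array];
            for (NSString *line in [text componentsSeparatedByString:@"\n"]) {
                NSString *trimmed = [line stringByTrimmingCharactersInSet:blank];
                if ([trimmed hasPrefix:@"@import"]) {
                    [verifiedImports addObject:trimmed];
                }
            }

            // Immutable copy so the update block never sees later mutations.
            NSArray<NSString *> *snapshot = [verifiedImports copy];

            // Fire and forget: the loop moves on to the next file immediately.
            dispatch_async(updateQueue, ^{
                update(path, snapshot);
            });
        }

        // Submitted after all updates, so a serial updateQueue runs it last.
        dispatch_async(updateQueue, finished);
    });
}
```

The ordering guarantee relies on updateQueue being serial; with a concurrent queue the finished block could run before some updates.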

The disadvantage here is that it's a bit more complicated to work with Core Data on multiple threads. But this approach seems to be bulletproof for projects with 5,000 files (which is the highest I've tested.)
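To give an idea of what that extra Core Data work involves, here is a minimal sketch of passing an NSManagedObjectID between two contexts that share one persistent store coordinator. It is not the code from my project: it uses the queue-based concurrency types (NSPrivateQueueConcurrencyType, OS X 10.7 / iOS 5 and later) rather than a context created by hand on the worker thread, and the in-memory model with a single "path" attribute is a stand-in assumed purely for illustration. ARC is assumed.

```objc
#import <Foundation/Foundation.h>
#import <CoreData/CoreData.h>

// Assumption: a throwaway in-memory stack with one entity, "LPFile", that has
// a single string attribute "path". The real model is not shown in the post.
static NSPersistentStoreCoordinator *MakeDemoCoordinator(void) {
    NSAttributeDescription *path = [[NSAttributeDescription alloc] init];
    path.name = @"path";
    path.attributeType = NSStringAttributeType;

    NSEntityDescription *file = [[NSEntityDescription alloc] init];
    file.name = @"LPFile";
    file.managedObjectClassName = @"NSManagedObject";
    file.properties = @[path];

    NSManagedObjectModel *model = [[NSManagedObjectModel alloc] init];
    model.entities = @[file];

    NSPersistentStoreCoordinator *psc =
        [[NSPersistentStoreCoordinator alloc] initWithManagedObjectModel:model];
    [psc addPersistentStoreWithType:NSInMemoryStoreType
                      configuration:nil
                                URL:nil
                            options:nil
                              error:NULL];
    return psc;
}

// Hypothetical demo; call it from the main thread, because the main-queue
// context is used directly. Returns the path as read via a second context.
static NSString *PathSeenFromBackgroundContext(void) {
    NSPersistentStoreCoordinator *psc = MakeDemoCoordinator();

    NSManagedObjectContext *mainContext =
        [[NSManagedObjectContext alloc] initWithConcurrencyType:NSMainQueueConcurrencyType];
    mainContext.persistentStoreCoordinator = psc;

    NSManagedObject *file =
        [NSEntityDescription insertNewObjectForEntityForName:@"LPFile"
                                      inManagedObjectContext:mainContext];
    [file setValue:@"/tmp/example.j" forKey:@"path"];
    [mainContext save:NULL]; // saving also makes the object ID permanent

    // Managed objects must stay with their context; object IDs may be passed around.
    NSManagedObjectID *fileID = file.objectID;

    NSManagedObjectContext *backgroundContext =
        [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
    backgroundContext.persistentStoreCoordinator = psc;

    __block NSString *result = nil;
    [backgroundContext performBlockAndWait:^{
        // Same record, but a separate instance owned by backgroundContext.
        NSManagedObject *sameFile = [backgroundContext objectWithID:fileID];
        result = [sameFile valueForKey:@"path"];
    }];
    return result;
}
```

The going-back direction works the same way: take the objectID on the background side and look it up with objectWithID: on the main context inside the block dispatched to the main queue.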

answered Sep 29 '22 15:09

Bryan

Tags:

objective-c

cocoa

grand-central-dispatch

objective-c-blocks

core-data
