Capture screenshot of macOS window

Note: this question is intentionally very general (e.g. both Objective-C and Swift code examples are requested), as it is intended to document how to capture a window screenshot on macOS as accessibly as possible.

I want to capture a screenshot of a macOS window in Objective-C/Swift code. I know this is possible because of the multitude of ways to take a screenshot on macOS (⇧⌘4, the Grab utility, screencapture on the command line, …), but I’m not sure how to do it in my own code. Ideally, I’d be able to specify a window of a particular application, and then capture it in an NSImage or CGImage that I could then process and display to the user or store in a file.

ThatsJustCheesy asked Dec 30 '17 01:12



1 Answer

Screen capture on macOS is possible through Quartz Window Services, a facility of the Core Graphics framework. Our key function here is CGWindowListCreateImage, which “returns a composite image based on a dynamically generated list of windows,” or, in other words, finds windows matching the specified criteria and composites their contents into a single image. Perfect! Its declaration is as follows:

CGImageRef CGWindowListCreateImage(CGRect screenBounds, 
                                   CGWindowListOption listOption, 
                                   CGWindowID windowID, 
                                   CGWindowImageOption imageOption);

So, in order to capture one specific window on the screen, we’ll need its window ID (CGWindowID). To go about retrieving that, we’ll first need a list of all of the windows available on the system. We get that through CGWindowListCopyWindowInfo, which takes CGWindowListOptions and a corresponding CGWindowID that, together, select which windows to include in the resulting list. To get ALL the windows, we specify kCGWindowListOptionAll, and kCGNullWindowID, respectively. Also, if you haven’t figured it out already, this is a C API, so we’ll use a bridging cast to work with the friendlier Objective-C containers rather than the Core Foundation ones.

Objective-C:

NSArray<NSDictionary*> *windowInfoList = (__bridge_transfer id)
    CGWindowListCopyWindowInfo(kCGWindowListOptionAll, kCGNullWindowID);

Swift:

let windowInfoList = CGWindowListCopyWindowInfo(.optionAll, kCGNullWindowID)!
    as NSArray

From here, we need to filter our windowInfoList down to the specific window that we want. Chances are we want to filter first by application. To do that, we’ll need the process ID of our application of choice. We can use NSRunningApplication to accomplish this:

Objective-C:

NSArray<NSRunningApplication*> *apps = 
    [NSRunningApplication runningApplicationsWithBundleIdentifier:
        /* Bundle ID of the application, e.g.: */ @"com.apple.Safari"];
if (apps.count == 0) {
    // Application is not currently running
    puts("The application is not running");
    return; // Or whatever
}
pid_t appPID = apps[0].processIdentifier;

Swift:

let apps = NSRunningApplication.runningApplications(withBundleIdentifier:
    /* Bundle ID of the application, e.g.: */ "com.apple.Safari")
if apps.isEmpty {
    // Application is not currently running
    print("The application is not running")
    return // Or whatever
}
let appPID = apps[0].processIdentifier

With appPID in hand, we can now go ahead and filter down our window info list to only windows with a matching owner PID:

Objective-C:

NSMutableArray<NSDictionary*> *appWindowsInfoList = [NSMutableArray new];
for (NSDictionary *info in windowInfoList) {
    if ([info[(__bridge NSString *)kCGWindowOwnerPID] integerValue] == appPID) {
        [appWindowsInfoList addObject:info];
    }
}

Swift:

var appWindowsInfoList = [NSDictionary]()
for info_ in windowInfoList {
    let info = info_ as! NSDictionary
    if (info[kCGWindowOwnerPID as NSString] as! NSNumber).intValue == appPID {
        appWindowsInfoList.append(info)
    }
}

We could have done additional filtering above by testing other keys of the info dictionary—for example, by name (kCGWindowName), or by whether the window is on-screen (kCGWindowIsOnscreen)—but for now, we’ll just take the first window in the list:

Objective-C:

if (appWindowsInfoList.count == 0) return; // The application has no windows
NSDictionary *appWindowInfo = appWindowsInfoList[0];
CGWindowID windowID = [appWindowInfo[(__bridge NSString *)kCGWindowNumber] unsignedIntValue];

Swift:

guard let appWindowInfo = appWindowsInfoList.first else { return } // The application has no windows
let windowID: CGWindowID = (appWindowInfo[kCGWindowNumber as NSString] as! NSNumber).uint32Value
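The extra filtering mentioned above (by title, or by whether the window is on-screen) could look roughly like the following sketch. The helper name pickWindow is hypothetical, not part of Core Graphics, and the check for layer 0 is an assumption that the window you want sits at the normal window level rather than in a menu or overlay layer.

```swift
import CoreGraphics
import Foundation

/// Hypothetical helper (not a Core Graphics API): returns the ID of the first
/// window owned by `pid` that is on-screen, sits on layer 0, and, if `title`
/// is given, has exactly that title.
func pickWindow(from windowInfoList: [[String: Any]],
                ownedBy pid: pid_t,
                titled title: String? = nil) -> CGWindowID? {
    let match = windowInfoList.first { info in
        // Owner PID, layer, and on-screen flag are stored as numbers.
        // kCGWindowIsOnscreen is an optional key, so a missing value
        // is treated as "not on-screen".
        guard (info[kCGWindowOwnerPID as String] as? NSNumber)?.int32Value == pid,
              (info[kCGWindowLayer as String] as? NSNumber)?.intValue == 0,
              (info[kCGWindowIsOnscreen as String] as? NSNumber)?.boolValue == true
        else { return false }
        // kCGWindowName is also optional and may be absent.
        if let title = title {
            return (info[kCGWindowName as String] as? String) == title
        }
        return true
    }
    return (match?[kCGWindowNumber as String] as? NSNumber)?.uint32Value
}
```

To use it, cast the list to Swift dictionaries first, e.g. `CGWindowListCopyWindowInfo(.optionAll, kCGNullWindowID) as? [[String: Any]] ?? []`, and pass the result along with appPID. Window titles may be missing for other applications' windows, so matching on a title is best treated as optional.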

And we have our window ID! Now, what else did we need for that call again?

CGImageRef CGWindowListCreateImage(CGRect screenBounds, 
                                   CGWindowListOption listOption, 
                                   CGWindowID windowID, 
                                   CGWindowImageOption imageOption);

First, we need a screenBounds to capture. According to the documentation, we can specify CGRectNull for this parameter to enclose all specified windows as tightly as possible. Works for me.

Second, we have to specify how we want to select our windows with listOption. We actually used one of these earlier, in our call to CGWindowListCopyWindowInfo, but there we wanted all the windows on the system; here, we only want one, so we’ll specify kCGWindowListOptionIncludingWindow, which, contrary to its documentation page, is meaningful on its own for CGWindowListCreateImage in that it specifies the window we pass, and only the window we pass.

Third, we pass our windowID as the window we want captured.

Fourth and finally, we can specify CGWindowImageOptions with the imageOption parameter. These affect the appearance of the resulting image, and you can combine them with bitwise OR. The full list is in the documentation, but the common ones come in two pairs. For framing, kCGWindowImageDefault captures the window along with its framing effects, such as its shadow, while kCGWindowImageBoundsIgnoreFraming captures only the window itself, without those effects. For resolution, kCGWindowImageBestResolution captures the window at the best resolution available, regardless of its actual size (so the image may be considerably large, depending on the window), while kCGWindowImageNominalResolution captures the window at its current size on the screen. Here, I've gone with kCGWindowImageBoundsIgnoreFraming and kCGWindowImageNominalResolution to capture only the window itself at the same size as on the screen.

Aaand, drumroll please:

Objective-C:

CGImageRef windowImage =
    CGWindowListCreateImage(CGRectNull, kCGWindowListOptionIncludingWindow,
                            windowID, kCGWindowImageBoundsIgnoreFraming|
                            kCGWindowImageNominalResolution);
// NOTE: windowImage may be NULL if the capture failed

Swift:

let windowImage: CGImage? =
    CGWindowListCreateImage(.null, .optionIncludingWindow, windowID,
                            [.boundsIgnoreFraming, .nominalResolution])
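Since the question also asks about displaying the result or storing it in a file, here is a minimal sketch of both, using AppKit. The helper names pngData and makeNSImage are hypothetical, and the output path in the usage note is only an example.

```swift
import AppKit

/// Hypothetical helper: encodes a CGImage as PNG data, or returns nil
/// if the encoding fails.
func pngData(from image: CGImage) -> Data? {
    let rep = NSBitmapImageRep(cgImage: image)
    return rep.representation(using: .png, properties: [:])
}

/// Hypothetical helper: wraps a CGImage in an NSImage, e.g. for an NSImageView.
func makeNSImage(from image: CGImage) -> NSImage {
    // A zero size tells NSImage to take its size from the CGImage's pixel dimensions.
    return NSImage(cgImage: image, size: .zero)
}
```

With the windowImage from above, something like `if let image = windowImage, let data = pngData(from: image) { try? data.write(to: URL(fileURLWithPath: "/tmp/window.png")) }` would write the capture to disk, and `makeNSImage(from: image)` gives you something to hand to the UI.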
ThatsJustCheesy answered Oct 07 '22 13:10
