I am trying to put a ripple simulation on top of an already busy app. Right now the CPU takes about 11 ms per frame on the slowest devices, and all of the code so far runs on the main thread.
I am hoping that it is possible to put the ripple simulation entirely on another thread.
The simulation is based on Apple's GLCameraRipple project. Basically it creates a tessellated rectangle and calculates the texture coordinates. So in an ideal world the texture coordinates and the ripple simulation arrays would all live on a different thread.
The update function I am working with right now looks like this. It does sort of leverage GCD, but it gains no speed from doing so because of the sync. Without the sync, however, the app would crash, because Swift arrays are not thread safe.
var rippleTexCoords: [GLfloat] = []
var rippleSource: [GLfloat] = []
var rippleDest: [GLfloat] = []

func runSimulation()
{
    if firstUpdate
    {
        firstUpdate = false
        Whirl.crashLog("First update")
    }

    let queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0)

    // Pass 1: propagate ripple heights from rippleSource into rippleDest.
    let block1: (y: size_t) -> Void = { (y: size_t) -> Void in
        objc_sync_enter(self)
        defer { objc_sync_exit(self) } // runs when the closure returns
        let pw = self.poolWidthi
        for x in 1..<(pw - 1)
        {
            let ai: Int = (y    ) * (pw + 2) + x + 1
            let bi: Int = (y + 2) * (pw + 2) + x + 1
            let ci: Int = (y + 1) * (pw + 2) + x
            let di: Int = (y + 1) * (pw + 2) + x + 2
            let me: Int = (y + 1) * (pw + 2) + x + 1
            let a = self.rippleSource[ai]
            let b = self.rippleSource[bi]
            let c = self.rippleSource[ci]
            let d = self.rippleSource[di]
            var result = self.rippleDest[me]
            result = (a + b + c + d) / 2.0 - result
            result -= result / 32.0 // damping
            self.rippleDest[me] = result
        }
    }
    dispatch_apply(Int(poolHeighti), queue, block1)
    // Serial equivalent:
    // for y in 0..<poolHeighti { block1(y: y) }

    let hm1 = GLfloat(poolHeight - 1)
    let wm1 = GLfloat(poolWidth - 1)

    // Pass 2: derive perturbed texture coordinates from the new heights.
    let block2: (y: size_t) -> Void = { (y: size_t) -> Void in
        objc_sync_enter(self)
        defer { objc_sync_exit(self) }
        let yy = GLfloat(y)
        let pw = self.poolWidthi
        for x in 1..<(pw - 1)
        {
            let xx = GLfloat(x)
            let ai: Int = (y    ) * (pw + 2) + x + 1
            let bi: Int = (y + 2) * (pw + 2) + x + 1
            let ci: Int = (y + 1) * (pw + 2) + x
            let di: Int = (y + 1) * (pw + 2) + x + 2
            let a = self.rippleDest[ai]
            let b = self.rippleDest[bi]
            let c = self.rippleDest[ci]
            let d = self.rippleDest[di]
            var s_offset = (b - a) / 2048
            var t_offset = (c - d) / 2048
            // Clamp the offsets to [-0.5, 0.5].
            s_offset = (s_offset < -0.5) ? -0.5 : s_offset
            t_offset = (t_offset < -0.5) ? -0.5 : t_offset
            s_offset = (s_offset > 0.5) ? 0.5 : s_offset
            t_offset = (t_offset > 0.5) ? 0.5 : t_offset
            let s_tc = yy / hm1
            let t_tc = xx / wm1
            let me = (y * pw + x) * 2
            self.rippleTexCoords[me + 0] = s_tc + s_offset
            self.rippleTexCoords[me + 1] = t_tc + t_offset
        }
    }
    dispatch_apply(poolHeighti, queue, block2)
    // Serial equivalent:
    // for y in 0..<poolHeighti { block2(y: y) }

    // Swap source and destination buffers for the next frame.
    let pTmp = rippleDest
    rippleDest = rippleSource
    rippleSource = pTmp
}
Is there any way to force this code to constantly run on a different thread? Or somehow get it to go faster?
I don't know if it is possible, but if it is, I would put these arrays on the following threads:
Main:
Secondary: (These are never read or written on the main thread)
On both threads:
If these conditions are followed, then the runSimulation method could be run on the second thread without issue.
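One way that ownership split could look is sketched below, written against the current DispatchQueue API rather than the older dispatch_* calls above; `RippleSimulator`, `step`, and `deliverOn` are made-up names, not part of the original code. All mutable arrays live on a private serial queue, and other threads only ever receive a value-copy of the finished texture coordinates.

```swift
import Dispatch

// Hypothetical sketch: the simulation owns its arrays on one serial queue,
// so no locking is needed inside the update itself.
final class RippleSimulator {
    private let simQueue = DispatchQueue(label: "ripple.simulation")

    // Touched only on simQueue — never on the main thread.
    private var rippleSource: [Float]
    private var rippleDest: [Float]
    private var rippleTexCoords: [Float]

    init(count: Int) {
        rippleSource    = [Float](repeating: 0, count: count)
        rippleDest      = [Float](repeating: 0, count: count)
        rippleTexCoords = [Float](repeating: 0, count: count * 2)
    }

    /// Runs one simulation step off the calling thread, then hands a
    /// value-copy of the texture coordinates to `deliverOn`.
    func step(deliverOn: DispatchQueue = .main,
              completion: @escaping ([Float]) -> Void) {
        simQueue.async {
            // ... the ripple update against rippleSource/rippleDest goes here ...
            swap(&self.rippleSource, &self.rippleDest)
            let snapshot = self.rippleTexCoords // value copy, safe to hand out
            deliverOn.async { completion(snapshot) }
        }
    }
}
```

Because Swift arrays are value types with copy-on-write storage, `snapshot` becomes an independent buffer the moment the simulation queue next mutates its own copy, so the consumer never races the simulation.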
Memory is dirt cheap these days. Why not save the result in an extra work array? Read-only access to rippleDest and rippleSource won't need sync. You'll only need to use the lock when copying the computed results to rippleDest, thus reducing locking time to the bare minimum.
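A minimal sketch of that idea, with plain `Float` arrays and `NSLock` standing in for the real types, and a simple averaging line as a placeholder for the actual ripple kernel: compute into a private scratch buffer without touching the lock, then take the lock only for the final publish.

```swift
import Foundation

var rippleSource: [Float] = [0, 1, 2, 3]
var rippleDest: [Float]   = [0, 0, 0, 0]
var scratch = [Float](repeating: 0, count: rippleDest.count)
let lock = NSLock()

// Phase 1: read-only access to the shared arrays, writing to private
// scratch. No lock needed as long as nothing mutates the shared arrays
// during this phase.
for i in 0..<scratch.count {
    scratch[i] = (rippleSource[i] + rippleDest[i]) / 2 // placeholder kernel
}

// Phase 2: the only critical section — publishing the finished result.
lock.lock()
rippleDest = scratch
lock.unlock()
```

The lock is now held only for a single array assignment per frame instead of for the whole traversal.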
For other speed gains, I'd start by moving the initialisation of the indices ai, bi, ci, di, and me out of the loop, since they are each only incremented by 1 per iteration. That saves at least half a dozen operations per node even after compiler optimisation - as many operations as the useful work the procedure does. You probably won't get a 50% improvement from that, but something closer to 10-15%, which is not bad.
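Concretely, all of those indices advance by exactly 1 per `x` step, so they can be initialised once per row and bumped with an add. A sketch, where `rowSums` and its parameters are made-up names and the grid uses the same `(pw + 2)`-wide padded layout as the question's code:

```swift
// Sums the four neighbour samples for row y of a (pw + 2)-wide padded grid.
// The indices are computed once for x = 1, then incremented each iteration
// instead of being recomputed from y and x every time.
func rowSums(source: [Float], y: Int, pw: Int) -> [Float] {
    var ai = (y    ) * (pw + 2) + 2 // x = 1, so "+ x + 1" is "+ 2"
    var bi = (y + 2) * (pw + 2) + 2
    var ci = (y + 1) * (pw + 2) + 1 // "+ x" is "+ 1"
    var di = (y + 1) * (pw + 2) + 3 // "+ x + 2" is "+ 3"
    var sums: [Float] = []
    for _ in 1..<(pw - 1) {
        sums.append(source[ai] + source[bi] + source[ci] + source[di])
        ai += 1; bi += 1; ci += 1; di += 1
    }
    return sums
}
```

The same transformation applies to `me` in both passes; the compiler may already perform part of this strength reduction, so it is worth profiling before and after.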