Heyo, this is my first time asking a question here so do forgive me if I mess somethin up >~<
I'm working on a program similar to openCanvas, the earlier ones that allowed multiple people to draw on the same canvas in real time over the internet. OC's really buggy and has a lot of limitations, which is why I wanted to write this.
I have it set up so the canvas extends "indefinitely" in all directions and is made up of 512x512 pixel blocks that don't become active until they're drawn on. That part should be pretty easy to build. I was also thinking about using Direct3D for hardware acceleration, which is why I went with the 512-square blocks.
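To show what I mean by blocks that only become active once they're drawn on, here's a rough sketch of the sparse block map I have in mind (the names and layout are placeholders, not code from the actual program):

    #include <cstdint>
    #include <unordered_map>
    #include <vector>

    // Hypothetical sketch: one lazily allocated 512x512 block of the "infinite" canvas.
    struct CanvasBlock {
        static const int kSize = 512;
        std::vector<uint32_t> pixels;                  // ARGB
        CanvasBlock() : pixels(kSize * kSize, 0) {}
    };

    // Key = block coordinates (canvas coordinates divided by 512, rounded toward -infinity).
    struct BlockKey {
        int bx, by;
        bool operator==(const BlockKey& o) const { return bx == o.bx && by == o.by; }
    };
    struct BlockKeyHash {
        std::size_t operator()(const BlockKey& k) const {
            return std::hash<long long>()((static_cast<long long>(k.bx) << 32) ^ k.by);
        }
    };

    // Integer division that rounds toward -infinity, so negative canvas coords work too.
    static int FloorDiv(int a, int b) { return a >= 0 ? a / b : -((-a + b - 1) / b); }

    // Blocks exist in the map only once somebody has drawn on them.
    std::unordered_map<BlockKey, CanvasBlock, BlockKeyHash> g_blocks;

    CanvasBlock& GetOrCreateBlock(int canvasX, int canvasY) {
        BlockKey key = { FloorDiv(canvasX, CanvasBlock::kSize),
                         FloorDiv(canvasY, CanvasBlock::kSize) };
        return g_blocks[key];                          // operator[] creates the block on first touch
    }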
My problem comes in when I want to use layers. I'm not quite sure how to compose layers quickly and without using a ton of memory, since my target is DirectX 9 compatible video cards with 128 MB of memory, on a system with roughly 3.2 GHz of CPU power and between 2 and 8 GB of RAM. I had a few different approaches I was thinking of using and was wondering which would probably be the best, and whether there's anything I could look into to make it run better.
My first idea was to make the graphics hardware do as much work as possible: every layer of every block lives on the card as a texture, and they'd be updated by locking the changed area, updating it on the CPU, and unlocking it. Blocks that aren't currently being changed get flattened into a single texture, with their individual layers kept in system memory. That would reduce the graphics memory used, but could significantly increase the bandwidth between system and graphics memory, and I can see the constant locking and unlocking slowing the system down pretty badly as well. Another possible issue is that I've heard of people using up to 200 layers, and I can't think of any good way to optimize that given the above.
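To be concrete, this is roughly the lock/update/unlock pattern I had in mind for this approach (just a sketch against the Direct3D 9 API; the dirty-rect bookkeeping around it is omitted):

    #include <d3d9.h>
    #include <cstring>

    // Sketch: push the changed region of one layer's system-memory copy into its GPU
    // texture. Assumes the texture is lockable (D3DPOOL_MANAGED, or D3DPOOL_DEFAULT
    // created with D3DUSAGE_DYNAMIC) and uses the D3DFMT_A8R8G8B8 format.
    void UploadDirtyRect(IDirect3DTexture9* tex, const DWORD* sysPixels,
                         int sysPitchInPixels, const RECT& dirty)
    {
        D3DLOCKED_RECT lr;
        if (FAILED(tex->LockRect(0, &lr, &dirty, 0)))   // lock only the changed rectangle
            return;

        const int width  = dirty.right  - dirty.left;
        const int height = dirty.bottom - dirty.top;
        BYTE* dst = static_cast<BYTE*>(lr.pBits);

        for (int y = 0; y < height; ++y) {
            const DWORD* srcRow = sysPixels + (dirty.top + y) * sysPitchInPixels + dirty.left;
            std::memcpy(dst + y * lr.Pitch, srcRow, width * sizeof(DWORD));
        }
        tex->UnlockRect(0);
    }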
My other idea was to compose the layers entirely in system memory, write the result into a texture, and copy that texture into graphics memory to be rendered, one per block. This seems to eliminate a lot of the issues with the other method, but at the same time it moves all of the work onto the CPU instead of balancing it. That isn't a big deal as long as it still runs quickly. Again, though, there's the issue of having a couple hundred layers. In that case, I could probably re-composite only the final pixels that are actually changing, which I think is what the bigger-name programs like SAI and Photoshop do.
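For the "only update the pixels that actually change" part, this is roughly the kind of dirty-rectangle re-composite I'm picturing (a plain premultiplied "over" blend over made-up layer structures, just for illustration):

    #include <cstdint>
    #include <vector>

    // Sketch: re-composite only a dirty rectangle of one block, bottom layer first.
    // Each layer is a blockSize*blockSize premultiplied-ARGB buffer in system memory.
    struct Layer { std::vector<uint32_t> pixels; };

    void CompositeDirtyRect(const std::vector<Layer>& layers, uint32_t* out,
                            int blockSize, int x0, int y0, int x1, int y1)
    {
        for (int y = y0; y < y1; ++y) {
            for (int x = x0; x < x1; ++x) {
                uint32_t result = 0;                       // start transparent
                for (const Layer& layer : layers) {        // bottom -> top
                    uint32_t src = layer.pixels[y * blockSize + x];
                    uint32_t sa  = src >> 24;
                    if (sa == 255) { result = src; continue; }   // opaque: replaces everything below
                    if (sa == 0)   continue;                     // fully transparent: skip
                    // Premultiplied "over": result = src + result * (1 - srcAlpha)
                    uint32_t inv = 255 - sa;
                    uint32_t ra = ((result >> 24) & 0xFF) * inv / 255 + sa;
                    uint32_t rr = ((result >> 16) & 0xFF) * inv / 255 + ((src >> 16) & 0xFF);
                    uint32_t rg = ((result >>  8) & 0xFF) * inv / 255 + ((src >>  8) & 0xFF);
                    uint32_t rb = ( result        & 0xFF) * inv / 255 + ( src        & 0xFF);
                    result = (ra << 24) | (rr << 16) | (rg << 8) | rb;
                }
                out[y * blockSize + x] = result;
            }
        }
    }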
I'm mostly looking for recommendations: suggestions that might improve the above, better methods, or links to articles that might be related to a project like this. I'm writing it in C++, but I have no trouble translating from other languages. Thanks for your time~
Data Structure
You should definitely use a quadtree (or another hierarchical data structure) to store your canvas, and its nodes should contain much smaller blocks than 512x512 pixels. Maybe not as small as 1x1 pixels, because then the hierarchical overhead would kill you; you'll find a good balance through testing.
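For illustration, a minimal sketch of what such a node could look like (the 32x32 tile size and the field names are placeholders; tune them through the testing mentioned above):

    #include <cstdint>
    #include <memory>
    #include <vector>

    // Hypothetical quadtree node. Leaves hold a small pixel tile (e.g. 32x32);
    // inner nodes hold a downsampled version of their children for LOD rendering.
    struct QuadNode {
        static const int kTile = 32;

        int x, y, size;                        // region covered, in canvas coordinates
        std::unique_ptr<QuadNode> child[4];    // null until that quadrant is drawn on
        std::vector<uint32_t> pixels;          // kTile*kTile ARGB: leaf data, or mip of children
        bool dirty = false;                    // needs re-upload / re-downsample

        bool IsLeaf() const { return size == kTile; }
    };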
Drawing
Let your users draw only at one (the highest) resolution. Imagine an infinitely large uniform grid (a two-dimensional array). Since you know the position of the mouse and the amount your users have scrolled from the origin, you can derive absolute coordinates. Traverse the quadtree into that region (adding new nodes as needed) and insert the blocks (for example 32x32) into the quadtree as the user draws them. I would buffer what the user draws in a 2D array (for example as big as their screen resolution) and use a separate thread to traverse/alter the quadtree and copy the data from the buffer, to avoid any delays.
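A rough sketch of that coordinate math and the lazy insertion, using the hypothetical node type from the sketch above (growing the root for a truly unbounded canvas and the buffer/worker-thread handoff are left out):

    // Sketch: convert a mouse position to absolute canvas coordinates, then walk/build
    // the quadtree down to the leaf that contains the point the user is drawing on.
    struct View { int scrollX, scrollY; };     // how far the user has scrolled from the origin

    void PlotPixel(QuadNode& root, const View& view, int mouseX, int mouseY, uint32_t color)
    {
        int cx = view.scrollX + mouseX;        // absolute canvas coordinates
        int cy = view.scrollY + mouseY;

        QuadNode* node = &root;                // root is assumed to cover (cx, cy)
        while (!node->IsLeaf()) {
            int half = node->size / 2;
            int qx = (cx >= node->x + half) ? 1 : 0;
            int qy = (cy >= node->y + half) ? 1 : 0;
            int idx = qy * 2 + qx;
            if (!node->child[idx]) {           // create quadrants lazily, only when drawn on
                node->child[idx].reset(new QuadNode());
                node->child[idx]->x = node->x + qx * half;
                node->child[idx]->y = node->y + qy * half;
                node->child[idx]->size = half;
            }
            node->dirty = true;                // ancestors need their LOD mips refreshed
            node = node->child[idx].get();
        }
        if (node->pixels.empty())
            node->pixels.assign(QuadNode::kTile * QuadNode::kTile, 0);
        node->pixels[(cy - node->y) * QuadNode::kTile + (cx - node->x)] = color;
        node->dirty = true;
    }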
Rendering
Traversing the quadtree, copying all tiles into one texture and sending it to the GPU? No! You see, sending over one texture that is as big as the screen resolution is not the problem (bandwidth-wise). But traversing the quadtree and assembling the final image is (at least if you want many fps). The answer is to store the quadtree in system memory and stream it to the GPU. That means: asynchronously, another thread does the traversal and copies the currently viewed data to the GPU in chunks, as fast as it can. If your user doesn't view the canvas at full resolution you don't have to traverse the tree to leaf level, which gives you automatic level of detail (LOD).
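A sketch of what that streaming thread could look like (the queue, dirty flags and LOD cutoff are placeholder design, not a finished API; the actual texture uploads would still happen on the thread that owns the D3D device):

    #include <atomic>
    #include <chrono>
    #include <mutex>
    #include <queue>
    #include <thread>

    // Sketch: a worker thread walks the visible part of the quadtree and queues dirty
    // tiles; the render thread pops the queue and uploads them to the GPU each frame.
    struct ViewRect { int left, top, right, bottom; };

    std::mutex g_uploadMutex;
    std::queue<QuadNode*> g_uploadQueue;       // tiles waiting to be uploaded
    std::atomic<bool> g_running(true);

    // 'canvasPixelsPerScreenPixel' grows as the user zooms out. Once a node's downsampled
    // tile is detailed enough for the current zoom we stop descending: that is the free LOD.
    void CollectVisible(QuadNode* node, const ViewRect& view, int canvasPixelsPerScreenPixel)
    {
        if (!node) return;
        // (Intersection test of the node's region against 'view' omitted for brevity.)
        if (node->IsLeaf() || node->size / QuadNode::kTile <= canvasPixelsPerScreenPixel) {
            if (node->dirty) {
                std::lock_guard<std::mutex> lock(g_uploadMutex);
                g_uploadQueue.push(node);
                node->dirty = false;
            }
            return;
        }
        for (int i = 0; i < 4; ++i)
            CollectVisible(node->child[i].get(), view, canvasPixelsPerScreenPixel);
    }

    void StreamingThread(QuadNode* root, const ViewRect* currentView)
    {
        // 'currentView' is read from shared state; proper synchronization omitted here.
        while (g_running) {
            CollectVisible(root, *currentView, /*canvasPixelsPerScreenPixel=*/1);
            std::this_thread::sleep_for(std::chrono::milliseconds(5));
        }
    }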
Those are some thoughts regarding the proposed strategy, off the top of my head. If you have further questions, let me know!