Compression library using Nvidia's CUDA [closed]

2 Answers

We have finished first phase of research to increase performance of lossless data compression algorithms. Bzip2 was chosen for the prototype, our team optimized only one operation - Burrows–Wheeler transformation, and we got some results: 2x-4x speed up on good compressible files. The code works faster on all our tests.

We are going to complete bzip2, support deflate and LZMA for some real life tasks like: HTTP traffic and backups compression.

blog link: http://www.wave-access.com/public_en/blog/2011/april/22/breakthrough-in-cuda-data-compression.aspx

answered Nov 07 '22 05:11

Alexander Azarov

Not aware of anyone having done that and made it public. Just IMHO, it doesn't sound very promising.

As Martinus points out, some compression algorithms are highly serial. Block compression algorithms like LZW can be parallelized by coding each block independently. Ziping a large tree of files can be parallelized at the file level.

However, none of these is really SIMD-style parallelism (Single Instruction Multiple Data), and they're not massively parallel.

GPUs are basically vector processors, where you can be doing hundreds or thousands of ADD instructions all in lock step, and executing programs where there are very few data-dependent branches.

Compression algorithms in general sound more like an SPMD (Single Program Multiple Data) or MIMD (Multiple Instruction Multiple Data) programming model, which is better suited to multicore cpus.

Video compression algorithms can be accellerated by GPGPU processing like CUDA only to the extent that there is a very large number of pixel blocks that are being cosine-transformed or convolved (for motion detection) in parallel, and the IDCT or convolution subroutines can be expressed with branchless code.

GPUs also like algorithms that have high numeric intensity (the ratio of math operations to memory accesses.) Algorithms with low numeric intensity (like adding two vectors) can be massively parallel and SIMD, but still run slower on the gpu than the cpu because they're memory bound.

answered Nov 07 '22 07:11

Die in Sente

Related questions
                            
                                CUDA: How to use -arch and -code and SM vs COMPUTE
                            
                                Can I use __syncthreads() after having dropped threads?
                            
                                Running more than one CUDA applications on one GPU
                            
                                How to install CUDA in Google Colab GPU's
                            
                                High level GPU programming in C++ [closed]
                            
                                Can/Should I run this code of a statistical application on a GPU?
                            
                                Using std::vector in CUDA device code
                            
                                CUDA Driver API vs. CUDA runtime
                            
                                Why does cudaMalloc() use pointer to pointer?
                            
                                ImportError: libcublas.so.9.0: cannot open shared object file
                            
                                CUDA: How many concurrent threads in total?
                            
                                Fortran vs C++, does Fortran still hold any advantage in numerical analysis these days? [closed]
                            
                                How to let cmake find CUDA
                            
                                CUDA model - what is warp size?
                            
                                Where did CUDA get installed in my computer?
                            
                                Use of cudamalloc(). Why the double pointer?
                            
                                How can I compile CUDA code then link it to a C++ project?
                            
                                Structure of Arrays vs Array of Structures
                            
                                Python GPU programming [closed]
                            
                                What is the difference between cuda vs tensor cores?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Compression library using Nvidia's CUDA [closed]

Tags:

compression

cuda

gpgpu

Xn0vv3r

People also ask

2 Answers

Alexander Azarov

Die in Sente

Recent Activity

Donate For Us