I need to improve the performance of a piece of software that parses XML files and adds their contents to a large SQL database. I have been trying to find out whether it is possible to implement this on a GPU. My research into both CUDA and OpenCL has left me without any clear answers beyond the fact that software can be developed in C/C++, Fortran, and many other languages using compiler directives to enable GPU processing. This leads me to ask: do I actually need an API or library written for GPU acceleration, or would a program written in C/C++ using a standard XML parsing library and compiled with the compiler directives for CUDA/OpenCL automatically run the XML library functions on the GPU?
In general, GPUs are not suited to accelerating XML processing. GPUs are only great when the intended task has massive parallelism that can exploit their large number of processing units. XML processing, on the other hand, is largely a single-threaded job built around state-machine transitions.
I actually don't see any sense in parsing XML on a GPU. GPU architecture is focused on massive floating-point calculations, not on operations like text processing. I think it is much better to use the CPU and split the XML parsing between threads to make use of multiple cores. Using a GPU in such an application is, in my opinion, overkill.
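To make the CPU suggestion concrete, here is a minimal sketch that parses several XML documents in a worker pool and then inserts all rows into a database in one batch. The element names (`item`, `name`), the `items` table, and the helper names are made up for illustration, and an in-memory SQLite database stands in for the real SQL database. A thread pool keeps the sketch short; in CPython a `ProcessPoolExecutor` is typically the way to spread CPU-bound work across cores.

```python
import sqlite3
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor


def parse_document(xml_text):
    """Parse one XML document and return its <item> elements as (id, name) rows."""
    root = ET.fromstring(xml_text)
    return [(int(item.get("id")), item.findtext("name")) for item in root.iter("item")]


def load_documents(documents, connection, workers=4):
    """Parse documents in a worker pool, then insert every row in one batch."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parsed = list(pool.map(parse_document, documents))
    # Flatten the per-document row lists; the database is only touched from this thread.
    rows = [row for document_rows in parsed for row in document_rows]
    connection.execute("CREATE TABLE IF NOT EXISTS items (id INTEGER, name TEXT)")
    connection.executemany("INSERT INTO items VALUES (?, ?)", rows)
    connection.commit()
    return len(rows)


# Hypothetical sample input: two small documents with three items in total.
documents = [
    "<items><item id='1'><name>alpha</name></item></items>",
    "<items><item id='2'><name>beta</name></item>"
    "<item id='3'><name>gamma</name></item></items>",
]
connection = sqlite3.connect(":memory:")
inserted = load_documents(documents, connection)
```

Parsing is done per document, so this only helps when the input is many files or can be cut into independent chunks. Batching the inserts with `executemany` is a separate choice that often matters as much as the parsing itself.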
First, look at the structure of your XML. Criteria for an XML structure suitable for parallel processing are described in "Parallel XML Parsing in Java".
If your XML structure can be processed in parallel, here are several ideas:
As far as I know, XML parsing needs a stack to remember the current position in the tree and to verify that nodes are opened and closed properly.
A stack can be represented as a one-dimensional array with a stack pointer. The stack pointer holds the array position of the top element of the stack.
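The array-plus-pointer stack can be sketched as follows. This is only an illustration on the CPU, not GPU code: the fixed capacity `MAX_DEPTH`, the simplified tag regex, and the function name are my own assumptions, and comments, CDATA sections, and namespaced names are not handled.

```python
import re

MAX_DEPTH = 64  # assumed fixed capacity, as a preallocated array would require

# Simplified tag pattern: optional "/", a name, optional attributes, optional "/".
TAG = re.compile(r"<(/?)([A-Za-z_][\w.-]*)[^<>]*?(/?)>")


def is_well_nested(xml_text):
    """Check that every opening tag is closed in the right order."""
    stack = [None] * MAX_DEPTH  # the one-dimensional array
    sp = 0  # stack pointer: index of the next free slot, top element is at sp - 1
    for match in TAG.finditer(xml_text):
        closing, name, self_closing = match.groups()
        if self_closing:
            continue  # <name/> opens and closes at once, the stack is unchanged
        if closing:
            if sp == 0 or stack[sp - 1] != name:
                return False  # closing tag does not match the top of the stack
            sp -= 1  # pop
        else:
            if sp == MAX_DEPTH:
                raise OverflowError("document is nested deeper than MAX_DEPTH")
            stack[sp] = name  # push
            sp += 1
    return sp == 0  # every opened element must have been closed
```

For example, `is_well_nested("<a><b/><c>text</c></a>")` returns `True`, while `is_well_nested("<a><b></a></b>")` returns `False` because `</a>` arrives while `b` is on top of the stack.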
According to GPU Gems 2 (chapter 33), you can store arrays in 1D textures (max. 4,096 elements) or in 2D textures (max. 16,777,216 = 4,096 x 4,096 elements). See the following link for more: https://developer.nvidia.com/gpugems/GPUGems2/gpugems2_chapter33.html
If you assign a separate floating-point number to each unique element name, you can store elements as numbers.
If you treat the input text as an array of ASCII/UTF-8 codes, why not store them as an array of floating-point numbers?
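Both encodings are easy to try out on the CPU first. The sketch below uses Python's standard `array` module with 32-bit floats as a stand-in for texture data; the function names are hypothetical, and whether this layout is efficient on a real GPU is untested.

```python
from array import array


def encode_text(text):
    """Return the UTF-8 bytes of text as an array of 32-bit floats."""
    # Byte values 0..255 are exactly representable as 32-bit floats.
    return array("f", (float(byte) for byte in text.encode("utf-8")))


def decode_text(codes):
    """Turn an array of float byte codes back into the original string."""
    return bytes(int(code) for code in codes).decode("utf-8")


def number_element_names(names):
    """Assign each distinct element name a float code in first-seen order."""
    codes = {}
    for name in names:
        if name not in codes:
            codes[name] = float(len(codes))
    return codes
```

The text round-trips exactly because every stored value is a small integer, so `decode_text(encode_text(s))` gives back `s` for any string `s`.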
The last important thing to consider when using a GPU is the structure of the output.
If you need, e.g., table rows with fixed-length columns, then it is only a matter of how to represent such a structure in a 1D or 2D array of floating-point numbers.
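One possible representation, sketched here under the assumption that every row has the same number of numeric columns (`COLUMNS` and the helper names are made up), is a flat 1D array in row-major order, where the cell at (row, column) lives at index `row * COLUMNS + column`.

```python
from array import array

COLUMNS = 3  # assumed fixed number of columns per row


def pack_rows(rows):
    """Store fixed-length rows back to back in one flat array of 32-bit floats."""
    flat = array("f")
    for row in rows:
        if len(row) != COLUMNS:
            raise ValueError("every row must have exactly COLUMNS values")
        flat.extend(row)
    return flat


def cell(flat, row, column):
    """Read one cell using row-major indexing."""
    return flat[row * COLUMNS + column]


def unpack_rows(flat):
    """Rebuild the list of row tuples from the flat array."""
    return [tuple(flat[i:i + COLUMNS]) for i in range(0, len(flat), COLUMNS)]
```

The same index arithmetic carries over to a 2D layout by treating the row number as one coordinate and the column number as the other.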
When you are sure about the previous points and a GPU is the right choice for you, just write functions to convert your data to textures and the textures back to your data.
And then, of course, the whole XML parser...
I have never tried GPU programming myself, but it seems too early to me to say that this is impossible.
Someone has to be the first to build the whole algorithm and test whether using a GPU is efficient or not.