Recently I have been trying to understand whether Hadoop clusters can be used for genetic algorithm/genetic programming jobs. I've been reading about Hadoop, and I understand that it can parallelize the processing of large datasets. In my case I wouldn't have large datasets, but what I would find really useful are Hadoop's parallelizing capabilities. So my question is whether a framework like Hadoop can be used for evaluating or processing genetic algorithms/programming, which I think will be more processing-oriented than I/O-oriented?
A genetic algorithm-based clustering technique, called GA-clustering, is proposed in this article. The searching capability of genetic algorithms is exploited in order to search for appropriate cluster centres in the feature space such that a similarity metric of the resulting clusters is optimized.
Clustering is an unsupervised method: the input is not labeled, and problem solving is based on the experience the algorithm gains from solving similar problems as a training schedule.
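The GA-clustering idea described above can be illustrated with a small Python sketch. It is not the article's implementation. It assumes each chromosome is a list of k cluster centres, and that the metric to optimize is the sum of distances from every point to its nearest centre (lower is better). For brevity it uses only truncation selection and Gaussian mutation and leaves out crossover. The names `clustering_metric`, `mutate` and `ga_clustering` are hypothetical.

```python
import math
import random

def clustering_metric(centres, points):
    # Sum of distances from each point to its nearest centre (lower is better).
    return sum(min(math.dist(p, c) for c in centres) for p in points)

def mutate(centres, rng, sigma=0.5):
    # Perturb every coordinate of every centre with Gaussian noise.
    return [tuple(x + rng.gauss(0, sigma) for x in c) for c in centres]

def ga_clustering(points, k, generations=50, pop_size=20, seed=0):
    rng = random.Random(seed)
    # Each chromosome is k centres, initialised from randomly chosen data points.
    population = [rng.sample(points, k) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda c: clustering_metric(c, points))
        survivors = population[: pop_size // 2]  # truncation selection
        children = [mutate(rng.choice(survivors), rng) for _ in survivors]
        population = survivors + children
    return min(population, key=lambda c: clustering_metric(c, points))

if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    best = ga_clustering(data, k=2)
    print(best, clustering_metric(best, data))
```

Because every chromosome's metric is computed independently of the others, the evaluation step is the part that could be spread across workers.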
As you know, most bioinformatics algorithms are based on combination, cutting, splicing, edit distance, neural networks, etc., and also on backtracking such as DFS (for partial digest). You could distribute them as map-reduce jobs, one per instance or length, for example:
For length 1 ..... map-reduce job 1
For length 2 ..... map-reduce job 2
...
For length n ..... map-reduce job n
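The one-job-per-length scheme above can be sketched in plain Python. This only simulates the map, shuffle and reduce steps in a single process and does not use the Hadoop API. The fitness function (counting 1-bits in a bit string) is a stand-in chosen for illustration, and all function names are hypothetical.

```python
import random
from collections import defaultdict

def fitness(bits):
    # Hypothetical fitness: number of 1-bits (the "OneMax" toy problem).
    return sum(bits)

def mapper(individual):
    # Map step: emit (key, value) with the individual's length as the key.
    return len(individual), (fitness(individual), individual)

def reducer(key, values):
    # Reduce step: keep the fittest individual seen for this key.
    return key, max(values)

def run_job(population):
    # Shuffle step: group mapper output by key, as a MapReduce framework would.
    groups = defaultdict(list)
    for key, value in map(mapper, population):
        groups[key].append(value)
    return dict(reducer(k, v) for k, v in groups.items())

def random_population(length, size, rng):
    # Build `size` random bit strings of the given length.
    return [tuple(rng.randint(0, 1) for _ in range(length)) for _ in range(size)]

if __name__ == "__main__":
    rng = random.Random(0)
    for n in range(1, 5):  # one job per length
        best = run_job(random_population(n, 8, rng))
        print(n, best[n])
```

The mapper does all the CPU-bound work (fitness evaluation) and needs no shared state, so it is the part a cluster would run in parallel.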
Or, if you want to compare bioinformatics algorithms with the architecture of Hadoop, you can find simple algorithms at this link: http://matrixsust.blogspot.com/2011/01/introduction-to-bioinformatics.html
Hope it helps.