I am looking for a library that, ideally, has the following features:
I would like this to be in C++, as I am most comfortable with that language, but I will also use any other language if the library is worth it. I have googled and found some, but I do not really have the time to try them all out, so I want to hear what experiences other people have had. Please only answer if you have some experience with the library you recommend.
P.S.: I could also use different libraries for the clustering and the SVM.
Scikit-learn is perhaps the most popular library for machine learning. It provides almost every popular model: Linear Regression, Lasso and Ridge Regression, Logistic Regression, Decision Trees, SVMs and a lot more.
PyTorch. PyTorch is an open-source machine learning library for Python that grew out of Torch, an earlier framework scripted in Lua with a C backend. PyTorch qualifies as a data science library and can integrate with other similar Python libraries such as NumPy.
TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries and community resources that lets researchers push the state-of-the-art in ML and developers easily build and deploy ML powered applications.
OpenCV. OpenCV is one of the most famous and widely used open-source libraries for computer vision tasks such as image processing, object detection, face detection, image segmentation, face recognition, and many more.
There are only a few ML libraries that I have used enough to be comfortable recommending them; dlib ml is certainly one of them.
Sourceforge download here; and bleeding-edge check-out:
hg clone http://hg.code.sf.net/p/dclib/code dclib-code
The original library creator and current maintainer is Davis King.
Your wishlist versus the relevant dlib features:
good documentation: for free, open-source libraries directed at a relatively small group of users/developers, this is probably as good as it gets; aside from the usual docs, refined during the five-year dev history, there's a frequently updated Intro to dlib, a (low-traffic) forum; and a large set of excellent examples (including at least one for SVM).
C++: 100% in C++ as far as I know.
Support-Vector Machine algorithm: yep; in fact, the SVM modules have been the focus of the most recent updates to this Library.
Hierarchical Clustering algorithm: not out of the box; there is, however, packaged code for k-means clustering. Obviously the results from each technique are very different, but calculation of the similarity metric and the subsequent recursive/iterative partitioning step are at the heart of both. In other words, the computation engine for hierarchical clustering is all there. Adapting the existing clustering module for HC will take more than a couple of lines of code, but it's also not a major endeavor, given that you're working almost at the data-presentation level.
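To give a feel for how little machinery hierarchical clustering needs once you have a distance function, here is a minimal sketch of agglomerative (bottom-up) clustering with single linkage in plain standard C++. It does not use dlib at all; `Point`, `euclidean`, `linkage` and `agglomerate` are names I made up for illustration, and the naive search for the closest pair makes it suitable only for small data sets.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical 2-D sample type; a real feature vector would have more dimensions.
struct Point { double x, y; };

// Euclidean distance between two points (the similarity metric).
double euclidean(const Point& a, const Point& b) {
    return std::hypot(a.x - b.x, a.y - b.y);
}

// Single linkage: the smallest pairwise distance between members of two clusters.
double linkage(const std::vector<Point>& pts,
               const std::vector<std::size_t>& a,
               const std::vector<std::size_t>& b) {
    double best = std::numeric_limits<double>::infinity();
    for (std::size_t i : a)
        for (std::size_t j : b)
            best = std::min(best, euclidean(pts[i], pts[j]));
    return best;
}

// Start with one cluster per point and repeatedly merge the two closest
// clusters until only k remain. Returns one cluster label per input point.
std::vector<int> agglomerate(const std::vector<Point>& pts, std::size_t k) {
    assert(k >= 1 && k <= pts.size());

    std::vector<std::vector<std::size_t>> clusters;
    for (std::size_t i = 0; i < pts.size(); ++i)
        clusters.push_back({i});

    while (clusters.size() > k) {
        // Naive search over all cluster pairs for the closest one.
        std::size_t bi = 0, bj = 1;
        double best = std::numeric_limits<double>::infinity();
        for (std::size_t i = 0; i < clusters.size(); ++i) {
            for (std::size_t j = i + 1; j < clusters.size(); ++j) {
                double d = linkage(pts, clusters[i], clusters[j]);
                if (d < best) { best = d; bi = i; bj = j; }
            }
        }
        // Merge cluster bj into cluster bi, then drop bj.
        clusters[bi].insert(clusters[bi].end(),
                            clusters[bj].begin(), clusters[bj].end());
        clusters.erase(clusters.begin() + static_cast<std::ptrdiff_t>(bj));
    }

    std::vector<int> labels(pts.size());
    for (std::size_t c = 0; c < clusters.size(); ++c)
        for (std::size_t i : clusters[c])
            labels[i] = static_cast<int>(c);
    return labels;
}
```

Calling `agglomerate(points, 2)` on two well-separated groups of points should give each group its own label. Recording the sequence of merges instead of stopping at `k` would give you the full dendrogram, and swapping `std::min` for `std::max` in `linkage` turns this into complete linkage.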
dlib ml has a few additional points to recommend it. It's a mature library (it's at version 17.x now; version 1.x was released sometime in late 2005, I believe), yet it also remains under active development, as evidenced by the repo logs (the last update, 17.27, was 17 May 2010) and the last commit (23 May 2010). In addition, it includes quite a few other ML techniques (e.g., Bayesian Networks, Kernel Methods, etc.). And third, dlib ml has excellent "support" libraries for matrix computation and optimization, both of which are fundamental building blocks of many ML techniques.
In the source, I've noticed that dlib ml is licensed under the BSL (the Boost Software License), which is an open-source license, though I don't know anything else about this type of license.