
Which machine learning library to use [closed]

I am looking for a library that, ideally, has the following features:

  • implements hierarchical clustering of multidimensional data (ideally on a similarity or distance matrix)
  • implements support vector machines
  • is in C++
  • is somewhat documented (this one seems to be hardest)

I would like this to be in C++, as I am most comfortable with that language, but I will use another language if the library is worth it. I have googled and found some candidates, but I do not have the time to try them all out, so I want to hear what experiences other people have had. Please only answer if you have some experience with the library you recommend.

P.S.: I could also use different libraries for the clustering and the SVM.

Björn Pollex asked May 26 '10 17:05


1 Answer

There are only a few ML libraries that I have used enough to be comfortable recommending them; dlib ml is certainly one of them.

It can be downloaded from SourceForge; for a bleeding-edge checkout:

hg clone http://hg.code.sf.net/p/dclib/code dclib-code 

The original library creator and current maintainer is Davis King.

Your wishlist versus the relevant dlib features:

  • Good documentation: for a free, open-source library aimed at a relatively small group of users and developers, this is probably as good as it gets. Aside from the usual docs, refined over the five-year development history, there is a frequently updated Intro to dlib, a (low-traffic) forum, and a large set of excellent examples (including at least one for SVM).

  • C++: 100% C++ as far as I know.

  • Support vector machines: yes; in fact, the SVM modules have been the focus of the most recent updates to the library.

  • Hierarchical clustering: not out of the box; there is, however, packaged code for k-means clustering. Obviously the results of the two techniques are very different, but computing the similarity metric and the subsequent recursive/iterative partitioning step are at the heart of both; in other words, the computation engine for hierarchical clustering is all there. Adapting the existing clustering module for HC will take more than a couple of lines of code, but it is not a major endeavor either, given that you're working almost at the data-presentation level.
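To give a rough sense of the scale of that adaptation, here is a minimal sketch of agglomerative clustering on a precomputed distance matrix. It uses only the C++ standard library and makes no dlib calls; `single_linkage` is a hypothetical helper written for illustration, not a dlib function, and single linkage is just one of several possible merge criteria. The implementation is deliberately naive (roughly cubic in the number of points).

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper (NOT part of dlib): naive single-linkage agglomerative
// clustering on a precomputed, symmetric distance matrix.
// Repeatedly merges the two closest clusters until at most `target` clusters
// remain, then returns one cluster label per point.
std::vector<int> single_linkage(const std::vector<std::vector<double>>& dist,
                                std::size_t target)
{
    const std::size_t n = dist.size();
    for (const auto& row : dist) {
        assert(row.size() == n);  // the matrix must be square
    }

    // Start with every point in its own cluster.
    std::vector<std::vector<std::size_t>> clusters(n);
    for (std::size_t i = 0; i < n; ++i) clusters[i] = {i};

    while (clusters.size() > target && clusters.size() > 1) {
        double best = std::numeric_limits<double>::infinity();
        std::size_t a = 0, b = 1;
        // Single linkage: the distance between two clusters is the smallest
        // distance between any pair of their members.
        for (std::size_t i = 0; i < clusters.size(); ++i)
            for (std::size_t j = i + 1; j < clusters.size(); ++j)
                for (std::size_t p : clusters[i])
                    for (std::size_t q : clusters[j])
                        if (dist[p][q] < best) {
                            best = dist[p][q];
                            a = i;
                            b = j;
                        }
        // Merge cluster b into cluster a (a < b always holds here).
        clusters[a].insert(clusters[a].end(),
                           clusters[b].begin(), clusters[b].end());
        clusters.erase(clusters.begin() + static_cast<std::ptrdiff_t>(b));
    }

    // Flatten the remaining clusters into per-point labels.
    std::vector<int> labels(n, -1);
    for (std::size_t c = 0; c < clusters.size(); ++c)
        for (std::size_t p : clusters[c]) labels[p] = static_cast<int>(c);
    return labels;
}
```

This returns only a flat cut of the hierarchy. To get the full dendrogram, one would additionally record each merge (the two clusters and the distance at which they were joined) inside the loop.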

dlib ml has a few additional points to recommend it. First, it's a mature library (it's at version 17.x now; version 1.x was released sometime in late 2005, I believe), yet it remains under active development, as evidenced by the repo logs (the last update, 17.27, was on 17 May 2010) and the last commit (23 May 2010). Second, it includes quite a few other ML techniques (e.g., Bayesian networks and kernel methods). And third, dlib ml has excellent "support" libraries for matrix computation and optimization, both of which are fundamental building blocks of many ML techniques.

In the source, I've noticed that dlib ml is licensed under the BSL (Boost Software License), which is an open source license, though I don't know anything else about this type of license.

doug answered Oct 05 '22 18:10