 

What is the Search/Prediction Time Complexity of Logistic Regression?

I am looking into the time complexities of machine learning algorithms and I cannot find the time complexity of logistic regression for predicting a new input. I have read that for classification it is O(c*d), with c being the number of classes and d being the number of dimensions, and I know that for linear regression the search/prediction time complexity is O(d). Could you explain what the search/prediction time complexity of logistic regression is? Thank you in advance.

Example for other machine learning algorithms: https://www.thekerneltrip.com/machine/learning/computational-complexity-learning-algorithms/

asked Jan 17 '19 by Ana Smile



1 Answer

Complexity of training for logistic regression with gradient-based optimization: O((f+1)csE), where:

  • f - number of features (+1 for the bias). Multiplying each feature by its weight takes f operations, plus one for the bias. Summing all of them to obtain the prediction takes roughly another f + 1 operations. Using a gradient step to update the weights costs the same order of operations, so in total we get about 4*(f+1) (two for the forward pass, two for the backward), which is simply O(f+1).
  • c - number of classes (possible outputs) in your logistic regression. For binary classification it is one, so this term cancels out. Each class has its own corresponding set of weights.
  • s - number of samples in your dataset; this one is quite intuitive.
  • E - number of epochs you are willing to run gradient descent for (whole passes through the dataset).

Note: this complexity can change based on things like regularization (another c operations), but the core idea stays the same.
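To make the O((f+1)csE) count concrete, here is a minimal sketch of gradient-descent training for multiclass logistic regression. All sizes and data here are made-up placeholders; each epoch touches all s samples, and each sample costs on the order of (f+1)*c multiply-adds forward plus the same order backward.

```python
import numpy as np

f, c, s, E = 4, 3, 100, 10            # features, classes, samples, epochs (made-up)
rng = np.random.default_rng(0)
X = rng.normal(size=(s, f))           # s samples, f features
y = rng.integers(0, c, size=s)
Y = np.eye(c)[y]                      # one-hot targets

W = np.zeros((f, c))                  # f weights per class
b = np.zeros(c)                       # +1 bias per class
lr = 0.1

for _ in range(E):                    # E epochs over the whole dataset
    logits = X @ W + b                # forward pass: (f+1)*c ops per sample
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True) # softmax over the c classes
    grad = (p - Y) / s                # backward pass: same order of operations
    W -= lr * (X.T @ grad)            # gradient step on weights
    b -= lr * grad.sum(axis=0)        # gradient step on biases
```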

Complexity of predictions for one sample: O((f+1)c)

  • f + 1 - you simply multiply each weight by the value of its feature, add the bias, and sum everything at the end.
  • c - you do this for every class; it is 1 for binary predictions.
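The per-sample count can be sketched with explicit loops (the weights and input below are made-up numbers): per class there are f feature-weight products, one bias add, and a running sum, giving O((f+1)*c) overall.

```python
import numpy as np

def predict_one(x, W, b):
    """Score each class with (f+1) ops, then take the argmax: O((f+1)*c)."""
    scores = []
    for k in range(W.shape[1]):       # c classes
        total = b[k]                  # the +1 bias term
        for j in range(x.shape[0]):   # f feature-weight multiplications
            total += W[j, k] * x[j]
        scores.append(total)
    return int(np.argmax(scores))     # predicted class index

x = np.array([1.0, 2.0])              # one sample with f = 2 features
W = np.array([[0.5, -0.5],
              [0.1,  0.3]])           # f x c weight matrix, c = 2 classes
b = np.array([0.0, 0.1])
pred = predict_one(x, W, b)
```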

Complexity of predictions for many samples: O((f+1)cs)

  • (f+1)c - see complexity for one sample
  • s - number of samples

Difference between logistic and linear regression in terms of complexity: the activation function.

For multiclass logistic regression it will be softmax, while linear regression, as the name suggests, has a linear activation (effectively no activation). This does not change the complexity in big-O notation, but it is another c*f operations during training (omitted above to avoid cluttering the picture), multiplied by 2 for backprop.
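A minimal sketch of that activation difference (the scores below are made-up): linear regression would return the raw scores as-is, while multiclass logistic regression passes them through softmax, an extra O(c) exponentials and a normalization per sample that leaves the overall big-O unchanged.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())           # shift by the max for numerical stability
    return e / e.sum()                # normalize so the c outputs sum to 1

scores = np.array([2.0, 1.0, 0.1])    # raw linear outputs for c = 3 classes
probs = softmax(scores)               # logistic regression's extra step
```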

answered Oct 23 '22 by Szymon Maszke