I am a newbie to Lisp. I am following Andrew Ng's machine learning course on Coursera (still in the first week). I wanted to try doing linear regression in Lisp, so I wrote code for single-variable linear regression, and it seems to work fine. Now I want to generalize this to multi-variable linear functions, and I would like to know how to start. I want to end up with something like:
(defun run-linear-regression (alpha iterations training-set number-of-variables) ...)
which would in turn create a hypothesis-generator function parameterized by the number of variables, the partial-derivative functions for those hypotheses, and so on.
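For the hypothesis part, I imagine something like the following (just an untested sketch of mine; MAKE-MULTI-HYPOTHESIS is a name I made up): the maker would take a list of thetas, and the resulting hypothesis would take a list of feature values, with theta0 as the constant term.

```lisp
;; Sketch of a generalized hypothesis maker: THETAS is (theta0 theta1 ... thetaN),
;; XS is the corresponding list of feature values (x1 ... xN).
(defun make-multi-hypothesis (thetas)
  (lambda (xs)
    ;; h(x) = theta0 + theta1*x1 + ... + thetaN*xN
    (+ (first thetas)
       (reduce #'+ (mapcar #'* (rest thetas) xs)))))

;; (funcall (make-multi-hypothesis '(1 2 3)) '(10 100)) ; => 1 + 2*10 + 3*100 = 321
```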
Following is the code I have so far. I do not need anyone to code this for me, but some guidance on how to go about doing what I want would be appreciated. Any general comments on how to improve the code I have so far (performance, style, etc.) are also welcome.
(defun make-hypothesis (theta1 theta2)
  (lambda (x)
    (+ theta1 (* x theta2))))
(defun make-cost-function (hypothesis)
  (lambda (training-data)
    (let* ((x (car training-data))
           (y (cadr training-data))
           (val (- (funcall hypothesis x) y)))
      (* val val))))
(defun make-J-1 (cost-function)
  (lambda (training-set)
    (float (/ (reduce #'+ (mapcar cost-function training-set))
              (* 2 (length training-set))))))

(defun make-J (theta1 theta2)
  (make-J-1 (make-cost-function (make-hypothesis theta1 theta2))))
(defun make-part-deriv-1 (hypothesis)
  (lambda (test-set)
    (let ((m (length test-set)))
      (float (/ (reduce #'+
                        (mapcar (lambda (elem)
                                  (- (funcall hypothesis (car elem)) (cadr elem)))
                                test-set))
                m)))))
(defun make-part-deriv-2 (hypothesis)
  (lambda (test-set)
    (let ((m (length test-set)))
      (float (/ (reduce #'+
                        (mapcar (lambda (elem)
                                  (* (- (funcall hypothesis (car elem)) (cadr elem))
                                     (funcall hypothesis (car elem))))
                                test-set))
                m)))))
(defun make-learn-fn (alpha theta1 theta2 make-part-deriv)
  (lambda (test-set)
    (let* ((hypothesis (make-hypothesis theta1 theta2))
           (pdv (funcall make-part-deriv hypothesis)))
      (* alpha (funcall pdv test-set)))))

(defun make-learners (alpha)
  (list
   (lambda (theta1 theta2 test-set)
     (- theta1 (funcall (make-learn-fn alpha theta1 theta2 #'make-part-deriv-1)
                        test-set)))
   (lambda (theta1 theta2 test-set)
     (- theta2 (funcall (make-learn-fn alpha theta1 theta2 #'make-part-deriv-2)
                        test-set)))))
(defun run-linear-regression (alpha iterations training-set
                              &optional (theta1 0) (theta2 0) (printer nil))
  (let ((t1 theta1) (t2 theta2))
    (dotimes (i iterations)
      (when printer
        (funcall printer t1 t2))
      (let* ((funcs (make-learners alpha))
             (nt1 (funcall (car funcs) t1 t2 training-set))
             (nt2 (funcall (cadr funcs) t1 t2 training-set)))
        (setq t1 nt1)
        (setq t2 nt2)))
    (list t1 t2)))
In the end, I would call it like this:
(defvar *training-set* '((15 20) (700 6) (23 15) (19 19) (204 15) (60 150) (87 98) (17 35) (523 29)))
(run-linear-regression 0.0001 1000000 *training-set*)
I'm not familiar with the math here, but since no one else has written better answers, here's some general advice.
You should change RUN-LINEAR-REGRESSION to take a list of variables, as well as a list of learner-functions. For example:
(defun run-linear-regression (iterations training-set variables learners)
  (let ((vars variables))
    (dotimes (i iterations)
      (setf vars (mapcar (lambda (function)
                           (funcall function vars training-set))
                         learners)))
    vars))
That takes the learners as an argument instead of making them in the function. Your original code makes the learners inside the loop, which isn't necessary: MAKE-LEARNERS takes only ALPHA as an argument, and ALPHA never changes, so the resulting learners are always the same.
We also need to change MAKE-LEARNERS so that the lambda-functions will take a list of variables:
(defun make-learners (alpha)
  (list (lambda (variables test-set)
          (destructuring-bind (theta1 theta2) variables
            (- theta1 (funcall (make-learn-fn alpha theta1 theta2
                                              #'make-part-deriv-1)
                               test-set))))
        (lambda (variables test-set)
          (destructuring-bind (theta1 theta2) variables
            (- theta2 (funcall (make-learn-fn alpha theta1 theta2
                                              #'make-part-deriv-2)
                               test-set))))))
That's pretty much the same as what you had, but it uses DESTRUCTURING-BIND to extract THETA1 and THETA2 from the list VARIABLES. Now we can call RUN-LINEAR-REGRESSION like this:
(run-linear-regression 1000000 *training-set* '(0 0) (make-learners 0.0001))
;=> (42.93504 2.5061023e-4)
To add more variables, you would write a suitable version of MAKE-LEARNERS. Since I don't know the math, I can't really make an example for that.
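That said, if we assume the standard multivariable gradient from the course, where each theta_j is updated by alpha * (1/m) * the sum over the training set of (h(x) - y) * x_j, with x_0 being the constant 1, an untested sketch might look like the following. MAKE-MULTI-LEARNERS, PREDICT, and FEATURE are names I made up, and each training example is assumed to be a list of feature values followed by y.

```lisp
(defun make-multi-learners (alpha n-features)
  ;; Returns one update closure per theta (theta0 .. thetaN),
  ;; suitable for the list-based RUN-LINEAR-REGRESSION above.
  (flet ((predict (thetas xs)
           ;; h(x) = theta0 + theta1*x1 + ... + thetaN*xN
           (+ (first thetas)
              (reduce #'+ (mapcar #'* (rest thetas) xs))))
         (feature (j xs)
           ;; x_j, with x_0 being the constant 1 for the intercept term.
           (if (zerop j) 1 (nth (1- j) xs))))
    (loop for j from 0 to n-features
          collect (let ((j j)) ; fresh binding so each closure keeps its own J
                    (lambda (thetas training-set)
                      (- (nth j thetas)
                         (* alpha
                            (/ (reduce #'+
                                       (mapcar (lambda (example)
                                                 (let ((xs (butlast example))
                                                       (y (car (last example))))
                                                   (* (- (predict thetas xs) y)
                                                      (feature j xs))))
                                               training-set))
                               (float (length training-set))))))))))
```

With a training set of (x y) pairs, (make-multi-learners 0.0001 1) should plug straight into the list-based RUN-LINEAR-REGRESSION, e.g. (run-linear-regression 1000000 *training-set* '(0 0) (make-multi-learners 0.0001 1)); note it uses x_j in the partial derivative, as in the course formula.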