Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Benefits of TDD in machine learning

As far as I know typical workflow of TDD is based on black box testing. First we define interface then write one or set of test and then we implement code that pass all tests. So look at the example below:

from abc import ABCMeta


class InterfaceCalculator:
    __metaclass__ = ABCMeta

    @abstractmethod
    def calculate_mean(self):
        pass

Exemplary test case

from unittest import TestCase


class TestInterfaceCalculator(TestCase):

    def test_should_correctly_calcluate_mean(self):
        X=[1,1]
        expected_mean = 1
        calcluator =Calculator()
        self.assertAlmostEqual(calculator.calculate_mean(X), expected_mean) 

I skip implementation of the class Calculator(InterfaceCalculator) because it is trivial.

The following idea is pretty easy to understand. How about Machine Learning? Let consider the following example. We would like to implement cat, dog photo classifier. Start from the interface.

from abc import ABCMeta


class InterfaceClassifier:
    __metaclass__ = ABCMeta

    @abstractmethod
    def train_model(self, data):
        pass

    @abstractmethod
    def predict(self, data):
        pass

I prepared very sill set of the unittests

from unittest import TestCase


class TestInterfaceCalculator(TestCase):
    def __init__(self):
        self.model = CatDogClassifier()

    def test_should_correctly_train_model(self, data):
        """
        How can be implemented?
        """
        self.model.train_model(data)

    def test_should_correctly_calcluate_mean(self):
        input ="cat.jpg"
        expected_result = "cat"
        calcluator =.assertAlmostEqual(self.model.preditct(input), expected_result)

Is it the way to use TDD to help work on machine learning model? Or In this case TDD is useless. It, only can help us to verify correctness of input data and add very high level test of the trained model? How can I create good automatic tests?

like image 572
user1877600 Avatar asked Jul 18 '16 13:07

user1877600


1 Answers

With TDD, you describe the expected behavior in the form of a test and then create the code to satisfy the test. While this can work well for some components of your machine learning model, it usually doesn't work well for the high-level behavior of a machine learning model, because the expected behavior is not precisely known in advance. The process of developing a machine learning model often involves trying different approaches to see which one is most effective. The behavior is likely to be measured in terms of percentages, e,g, recognition is 95% accurate, rather than absolutes.

like image 55
Mike Stockdale Avatar answered Oct 02 '22 11:10

Mike Stockdale