
Get audience insights using Keras and TensorFlow

I recently discovered Keras and TensorFlow and I'm trying to get into ML. I have manually labeled training and test data from my users DB as follows:

Each row has 9 features and a label. The features are events in my system, such as "user added a profile picture" or "user paid X for a service", and the label is positive or negative R.O.I (1 or 0).

Sample: [image not included]

I have used the following code to classify the users:

import numpy as np
from keras.layers import Dense
from keras.models import Sequential

# Load the CSV files, skipping the header row
train_data = np.loadtxt("train.csv", delimiter=",", skiprows=1)
test_data = np.loadtxt("test.csv", delimiter=",", skiprows=1)

# Columns 0-8 are the 9 features, column 9 is the label (1 or 0)
X_train = train_data[:, 0:9]
Y_train = train_data[:, 9]

X_test = test_data[:, 0:9]
Y_test = test_data[:, 9]

model = Sequential()
model.add(Dense(8, input_dim=9, activation='relu'))
model.add(Dense(6, activation='relu'))
model.add(Dense(3, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

# Compile model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# Fit the model
model.fit(X_train, Y_train, epochs=12000, batch_size=10)

# evaluate the model
scores = model.evaluate(X_test, Y_test)
print("\n\n\nResults: %s: %.2f%%" % (model.metrics_names[1], scores[1]*100))

This gave 89% accuracy, which works well for labeling a user as a valued customer.

Q: How can I extract the features that contributed to the positive R.O.I, so I can give them more focus in the UX?

Or: What is the approach for finding the best combined audience segment?

Roni Gadot asked Jun 05 '17 08:06


1 Answer

As others have said, there is no easy answer, and this is not meant to be the definitive one, but you could try something like the following.

The "take a look" approach:

  • Predict the result for all your clients
  • Filter the good clients and plot their features
  • Filter the bad clients and plot their features
  • Can you see an evident pattern? For example, most failures lack feature x.
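
Before plotting, the two groups can also be compared numerically. This is a minimal sketch, not a definitive method: `predict_fn` is a hypothetical stand-in for something like `lambda X: model.predict(X).ravel()`, and the toy data, the stand-in predictor and the 0.5 threshold are assumptions made only so the example runs on its own.

```python
import numpy as np

def compare_groups(X, predict_fn, threshold=0.5):
    """Return per-feature means for predicted-good and predicted-bad clients."""
    probs = np.asarray(predict_fn(X)).ravel()
    good = probs >= threshold
    # Mean of each feature within each group; a large gap hints at a pattern.
    # Note: if one group is empty, its mean is NaN.
    good_mean = X[good].mean(axis=0)
    bad_mean = X[~good].mean(axis=0)
    return good_mean, bad_mean, good_mean - bad_mean

# Toy data (made up): 4 clients, 3 binary features.
X_demo = np.array([[1, 0, 1],
                   [1, 1, 0],
                   [0, 0, 1],
                   [0, 1, 0]], dtype=float)

# Hypothetical stand-in for the trained model: feature 0 drives the score.
def demo_predict(X):
    return 0.1 + 0.8 * X[:, 0]

good_mean, bad_mean, gap = compare_groups(X_demo, demo_predict)
print("gap per feature:", gap)
```

With the real model you would pass your own feature matrix and predictor, and the per-group means could then be drawn as bar charts.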

Create fake clients with combined features:

  • First, create fake customers that have only one feature, from 1 to 9. (Client 1 has only feature 1, client 2 has only feature 2 and so on)

  • Predict the results for these clients

  • Check whether any single feature gives a good prediction (this may not be the final conclusion yet, but take note of it)
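
The single-feature clients can be sketched as the rows of an identity matrix. This assumes every feature is a 0/1 flag; a numeric feature such as an amount paid would need a representative value instead of 1. `demo_predict` and `demo_weights` are hypothetical stand-ins for the trained model (for example `model.predict(single_feature_clients).ravel()`), used only so the sketch runs on its own.

```python
import numpy as np

n_features = 9

# Row i has a 1 in column i and 0 elsewhere: "client i+1 has only feature i+1".
# Assumption: each feature is a 0/1 flag.
single_feature_clients = np.eye(n_features)

# Hypothetical stand-in for the trained model, used only so the sketch runs.
demo_weights = np.linspace(-1.0, 1.0, n_features)

def demo_predict(X):
    return 1.0 / (1.0 + np.exp(-(X @ demo_weights)))

scores = demo_predict(single_feature_clients)

# Rank features from highest to lowest predicted probability.
ranking = np.argsort(scores)[::-1]
for idx in ranking:
    print(f"feature {idx + 1}: predicted probability {scores[idx]:.2f}")
```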

The steps above show the result of each feature on its own, but a real client with only one feature is not very likely, is it?

  • Now create all combinations of two features. There are 36 such combinations (9 x 8 / 2): F1/F2, F1/F3, F1/F4, and so on.
  • Predict and look at the good clients (keep note of the best combinations)

Keep going with combinations of 3 features (84 combinations), then combinations of 4 features (126 combinations).
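
The fake clients for each step can be generated with `itertools.combinations`. This is a sketch that again assumes 0/1 features; `fake_clients` is a hypothetical helper name. The row counts match the numbers quoted above.

```python
from itertools import combinations
from math import comb

n_features = 9

def fake_clients(k, n=n_features):
    """Return one 0/1 feature row per combination of k features out of n."""
    rows = []
    for combo in combinations(range(n), k):
        row = [0] * n
        for i in combo:
            row[i] = 1
        rows.append(row)
    return rows

for k in (2, 3, 4):
    clients = fake_clients(k)
    # The number of rows equals the binomial coefficient C(9, k).
    print(k, len(clients), comb(n_features, k))
```

Each list can be wrapped in `np.array(...)` and passed to `model.predict` to get the prediction for every combination.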

Compare the results between each step above:

  • Take all failed clients with 4 features and compare them with the successful clients with 1 feature: is that one success feature present in any of the failed clients? If not, you have very probably found an independent success feature.

  • Is any feature missing from all failed clients across all tests? That is another independent success feature.

In the same way, you can compare the 4-feature failures with the 2-feature successes to see whether any pair of features is successful on its own.

And so on.
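
These comparisons reduce to set operations if each fake client is stored as the set of features it has. The sketch below uses made-up prediction results and hypothetical helper names; feature indices are 0-based.

```python
def features_missing_from_all(failures, n_features=9):
    """Features (0-based) that appear in none of the failed fake clients."""
    present = set().union(*failures) if failures else set()
    return set(range(n_features)) - present

def independent_successes(successes, failures):
    """Successful feature sets that are not contained in any failed client."""
    return [s for s in successes if not any(s <= f for f in failures)]

# Made-up results: each client is the set of features it has.
failed_4 = [{1, 2, 3, 4}, {2, 3, 5, 6}, {1, 4, 6, 7}]
success_1 = [{0}, {2}]
success_2 = [{0, 8}, {1, 2}]

print(features_missing_from_all(failed_4))
print(independent_successes(success_1, failed_4))
print(independent_successes(success_2, failed_4))
```

Here feature 2 succeeds alone but also appears in failed clients, so it is not treated as an independent success feature, while feature 0 is.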

Take the real clients and filter them considering the above results:

  • From the real clients, take all those that have the single feature you identified as a success feature. Confirm that they are actually successes.
  • Do the same with the real clients that have the feature pairs you identified as successful. Confirm whether they are.
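
A minimal sketch of this check on labeled data, assuming a feature counts as present when its value is greater than 0 and that labels are 1 or 0 as in the question. The data and the `real_success_rate` helper are made up for illustration; with the real data you would pass arrays such as `X_train` and `Y_train`.

```python
import numpy as np

def real_success_rate(X, y, feature_idx):
    """Share of real clients with all given features whose label is 1."""
    # Assumption: a feature is "present" when its value is greater than 0.
    has_all = np.all(X[:, feature_idx] > 0, axis=1)
    n = int(has_all.sum())
    rate = float(y[has_all].mean()) if n else float("nan")
    return rate, n

# Made-up data: 5 real clients, 3 features, labels 1 or 0.
X_real = np.array([[1, 0, 1],
                   [1, 1, 0],
                   [1, 0, 0],
                   [0, 1, 1],
                   [0, 0, 1]], dtype=float)
y_real = np.array([1, 1, 0, 0, 0])

rate_single, n_single = real_success_rate(X_real, y_real, [0])
rate_pair, n_pair = real_success_rate(X_real, y_real, [0, 1])
print(rate_single, n_single)
print(rate_pair, n_pair)
```

Returning the group size alongside the rate helps avoid reading too much into a rate computed from only a handful of clients.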

You can apply the same approach assuming that certain features lead to failure rather than success. Or, instead of looking at the features that are present, look at the ones that are missing, and so on.

Daniel Möller answered Nov 08 '22 20:11