PLS-DA Loading Plot in Python

How can I make a loading plot for a PLS-DA model with Matplotlib, similar to the loading plot commonly made for PCA?

This answer explains how it can be done with PCA: Plot PCA loadings and loading in biplot in sklearn (like R's autoplot)

However, there are some significant differences between the two methods, which make the implementation different as well. (Some of the relevant differences are explained here: https://learnche.org/pid/latent-variable-modelling/projection-to-latent-structures/interpreting-pls-scores-and-loadings )

To make the PLS-DA plot I use the following code:

from sklearn.preprocessing import StandardScaler
from sklearn.cross_decomposition import PLSRegression
import numpy as np
import pandas as pd

targets = [0, 1]

x_vals = StandardScaler().fit_transform(df.values)

y = [g == targets[0] for g in sample_description]
y = np.array(y, dtype=int)

plsr = PLSRegression(n_components=2, scale=False)
plsr.fit(x_vals, y)

colormap = {
    targets[0]: '#ff0000',  # Red
    targets[1]: '#0000ff',  # Blue
}

colorlist = [colormap[c] for c in sample_description]

scores = pd.DataFrame(plsr.x_scores_)
scores.index = df.index  # keep the sample labels from the original DataFrame

x_loadings = plsr.x_loadings_
y_loadings = plsr.y_loadings_

# get_default_fig_ax is a custom helper (not shown); title is defined elsewhere
fig1, ax = get_default_fig_ax('Scores on LV 1', 'Scores on LV 2', title)
ax = scores.plot(x=0, y=1, kind='scatter', s=50, alpha=0.7,
                 c=colorlist, ax=ax)
Simen Russnes asked Jun 06 '19 11:06

1 Answer

I took your code and extended it. The biplot is obtained by simply overlaying the score plot and the loading plot. Other, more rigorous plots could be made with truly shared axes, as described in https://blogs.sas.com/content/iml/2019/11/06/what-are-biplots.html#:~:text=A%20biplot%20is%20an%20overlay,them%20on%20a%20single%20plot.

The code below generates a biplot for a dataset with ~200 features (so ~200 red arrows are shown). Image description: biplot with overlaid axes; the axes of the loading plot are hidden and do not scale with the axes of the score plot.

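import matplotlib.pyplot as plt
import pandas as pd
from sklearn.cross_decomposition import PLSRegression

# X_train: samples x features; Y_train: DataFrame with the class labels in its first column
pls2 = PLSRegression(n_components=2)
pls2.fit(X_train, Y_train)

x_loadings = pls2.x_loadings_  # shape (n_features, n_components)
y_loadings = pls2.y_loadings_

fig, ax = plt.subplots(constrained_layout=True)

# Score plot: one point per sample, coloured by class
scores = pd.DataFrame(pls2.x_scores_)
scores.plot(x=0, y=1, kind='scatter', s=50, alpha=0.7,
            c=Y_train.values[:, 0], ax=ax)

# Loading plot: a second, frameless axes placed on top of the score axes
newax = fig.add_axes(ax.get_position(), frameon=False)
comp_1_idx = 0
comp_2_idx = 1
print(x_loadings.shape)
for feature_i in range(x_loadings.shape[0]):
    # One arrow per feature, from the origin to its loadings on the two components
    newax.arrow(0, 0, x_loadings[feature_i, comp_1_idx],
                x_loadings[feature_i, comp_2_idx], color='r', alpha=0.5)
newax.get_xaxis().set_visible(False)
newax.get_yaxis().set_visible(False)

plt.show()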
Ggjj11 answered Nov 23 '22 23:11