Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

R ginv and Matlab pinv produce different results

ginv() function from MASS package in R produce totally different values compared to MATLAB pinv() function. They both claim to produce Moore-Penrose generalized inverse of a matrix.

I tried to set the same tolerance for the R implementation but the difference persists.

  • MATLAB default tol : max(size(A)) * norm(A) * eps(class(A))
  • R default tol : sqrt(.Machine$double.eps)

Reproduction:

R:

library(MASS)
A <- matrix(c(47,94032,149, 94032, 217179406,313679,149,313679,499),3,3)
ginv(A)

outputs:

              [,1]          [,2]          [,3]
[1,]  1.675667e-03 -8.735203e-06  5.545605e-03
[2,] -8.735203e-06  5.014084e-08 -2.890907e-05
[3,]  5.545605e-03 -2.890907e-05  1.835313e-02

svd(A)

outputs:

$d
[1] 2.171799e+08 4.992800e+01 2.302544e+00

$u
              [,1]         [,2]          [,3]
[1,] -0.0004329688  0.289245088 -9.572550e-01
[2,] -0.9999988632 -0.001507826 -3.304234e-06
[3,] -0.0014443299  0.957253888  2.892454e-01

$v
              [,1]         [,2]          [,3]
[1,] -0.0004329688  0.289245088 -9.572550e-01
[2,] -0.9999988632 -0.001507826 -3.304234e-06
[3,] -0.0014443299  0.957253888  2.892454e-01

MATLAB:

A = [47 94032 149; 94032 217179406 313679; 149 313679 499]
pinv(A)

outputs:

ans =

    0.3996   -0.0000   -0.1147
   -0.0000    0.0000   -0.0000
   -0.1147   -0.0000    0.0547

svd:

[U, S, V] = svd(A)

U =

   -0.0004    0.2892   -0.9573
   -1.0000   -0.0015   -0.0000
   -0.0014    0.9573    0.2892


S =

  1.0e+008 *

    2.1718         0         0
         0    0.0000         0
         0         0    0.0000


V =

   -0.0004    0.2892   -0.9573
   -1.0000   -0.0015   -0.0000
   -0.0014    0.9573    0.2892

Solution: to make R ginv like MATLAB pinv use this function:

#' Pseudo-Inverse of Matrix
#' @description
#' This is the modified version of ginv function in MASS package.
#' It produces MATLAB like pseudo-inverse of a matrix
#' @param X The matrix to compute the pseudo-inverse
#' @param tol The default is the same as MATLAB pinv function
#'
#' @return The pseudo inverse of the matrix
#' @export
#'
#' @examples
#' A <- matrix(1:6,3,2)
#' pinv(A)
pinv <- function (X, tol = max(dim(X)) * max(X) * .Machine$double.eps)
{
  if (length(dim(X)) > 2L || !(is.numeric(X) || is.complex(X)))
    stop("'X' must be a numeric or complex matrix")
  if (!is.matrix(X))
    X <- as.matrix(X)
  Xsvd <- svd(X)
  if (is.complex(X))
    Xsvd$u <- Conj(Xsvd$u)
  Positive <- any(Xsvd$d > max(tol * Xsvd$d[1L], 0))
  if (Positive)
    Xsvd$v %*% (1 / Xsvd$d * t(Xsvd$u))
  else
    array(0, dim(X)[2L:1L])
}
like image 216
Farid Cheraghi Avatar asked Apr 03 '16 21:04

Farid Cheraghi


People also ask

How to make R ginv like MATLAB pinv?

Solution: to make R ginv like MATLAB pinv use this function: #' Pseudo-Inverse of Matrix #' @description #' This is the modified version of ginv function in MASS package.

What is the difference between pinv () and INV () function in Octave/MATLAB?

The pinv () function in OCTAVE/MATLAB returns the Moore-Penrose pseudo inverse of a matrix using Singular value. The inv () function returns the inverse of the matrix. The pinv () function is useful when your matrix is non-invertible (singular matrix) or Determinant of that Matrix =0.

What is the difference between pinv () and INV () function in R?

The pinv () function involves the use of floating-point arithmetic. It is used to handle Non-Singular Matrices, it refers to inverse of a matrix. The inv () function doesn’t involve use of floating-point arithmetic.

How to find the inverse of a matrix in MATLAB?

But using the same Matrix, the inverse can be calculated using the pinv () function. Both pinv () and inv () are used to find the inverse of matrices in MATLAB. However, the difference is that pinv refers to pseudo inverse and inv refers to inverse. Below are some key differences between both the functions:


1 Answers

Running debugonce(MASS::ginv), we see that the difference lies in what is done with the singular value decomposition.

Specifically, R checks the following:

Xsvd <- svd(A)
Positive <- Xsvd$d > max(tol * Xsvd$d[1L], 0)
Positive
# [1]  TRUE  TRUE FALSE

If the third element were true (which we can force by setting tol = 0, as suggested by @nicola), MASS::ginv would return:

Xsvd$v %*% (1/Xsvd$d * t(Xsvd$u))
#               [,1]          [,2]          [,3]
# [1,]  3.996430e-01 -7.361507e-06 -1.147047e-01
# [2,] -7.361507e-06  5.014558e-08 -2.932415e-05
# [3,] -1.147047e-01 -2.932415e-05  5.468812e-02

(i.e., the same as MATLAB).

Instead, it returns:

Xsvd$v[, Positive, drop = FALSE] %*% ((1/Xsvd$d[Positive]) * 
  t(Xsvd$u[, Positive, drop = FALSE]))
#               [,1]          [,2]          [,3]
# [1,]  1.675667e-03 -8.735203e-06  5.545605e-03
# [2,] -8.735203e-06  5.014084e-08 -2.890907e-05
# [3,]  5.545605e-03 -2.890907e-05  1.835313e-02

Thanks to @FaridCher for pointing out the source code of pinv.

I'm not sure I 100% understand the MATLAB code, but I think it comes down to a difference in how tol is used. The MATLAB correspondence to Positive in R is:

`r = sum(s>tol)`

Where tol is what's supplied by the user; if none is supplied, we get:

m = 0;
% I don't get the point of this for loop -- why not just `m = max(size(A))`?
for i = 1:n 
   m = max(m,length(A(:,i)));
end
% contrast with simply `tol * Xsvd$d[1L]` in R
%   (note: i believe the elements of d are sorted largest to smallest)
tol = m*eps(max(s)); 
like image 199
MichaelChirico Avatar answered Oct 03 '22 01:10

MichaelChirico