
Python baseline correction library

I am currently working with some Raman spectra, and I am trying to correct the baseline skew that fluorescence causes in my data. Take a look at the graph below:

enter image description here

I am pretty close to achieving what I want. As you can see, I am fitting a polynomial to all of my data, whereas I should really be fitting a polynomial only to the local minima.

Ideally, I would want a polynomial fit that, when subtracted from my original data, would result in something like this:

enter image description here

Are there any built-in libraries that do this already?

If not, can anyone recommend a simple algorithm?

Tinker asked Mar 19 '15 23:03



2 Answers

I found an answer to my question and am sharing it for anyone who stumbles upon this.

There is an algorithm called "Asymmetric Least Squares Smoothing" by P. Eilers and H. Boelens (2005). The paper is freely available and can be found via Google.

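    import numpy as np
    from scipy import sparse
    from scipy.sparse.linalg import spsolve

    def baseline_als(y, lam, p, niter=10):
        L = len(y)
        # Second-order difference matrix, shape (L, L-2)
        D = sparse.csc_matrix(np.diff(np.eye(L), 2))
        w = np.ones(L)
        for i in xrange(niter):  # Python 2; use range() on Python 3
            W = sparse.spdiags(w, 0, L, L)
            Z = W + lam * D.dot(D.transpose())
            z = spsolve(Z, w*y)
            # Asymmetric weights: p where y is above z, 1-p where it is below
            w = p * (y > z) + (1-p) * (y < z)
        return z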
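For reference, here is my reading of what the code above minimizes. This is a sketch inferred from the code, not quoted from the paper, and the notation ($W$ for the diagonal weight matrix, $\Delta^2$ for the second difference) is my own choice. The baseline $z$ is found by iterating between a weighted, smoothness-penalized least-squares fit and a weight update:

```latex
% Penalized weighted least squares (one iteration, weights w_i held fixed):
S(z) = \sum_{i} w_i \,(y_i - z_i)^2 \;+\; \lambda \sum_{i} \left(\Delta^2 z_i\right)^2,
\qquad \Delta^2 z_i = z_i - 2 z_{i+1} + z_{i+2}

% Setting the gradient to zero gives the linear system solved by spsolve,
% with W = diag(w) and D the L x (L-2) matrix such that D^T z = \Delta^2 z:
\left(W + \lambda\, D D^{\mathsf{T}}\right) z = W y

% Asymmetric weight update, as written in the code:
w_i =
\begin{cases}
p     & \text{if } y_i > z_i \\
1 - p & \text{if } y_i < z_i
\end{cases}
```

With a small `p`, points above the current estimate (the peaks) barely influence the fit, so `z` settles along the lower envelope of the spectrum, while `lam` controls how stiff that envelope is.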
Tinker answered Oct 13 '22 22:10


The following code works on Python 3.6.

This is adapted from the accepted answer to avoid the dense matrix diff computation (which can easily cause memory issues), and it uses range instead of xrange.

    import numpy as np
    from scipy import sparse
    from scipy.sparse.linalg import spsolve

    def baseline_als(y, lam, p, niter=10):
        L = len(y)
        D = sparse.diags([1,-2,1],[0,-1,-2], shape=(L,L-2))
        w = np.ones(L)
        for i in range(niter):
            W = sparse.spdiags(w, 0, L, L)
            Z = W + lam * D.dot(D.transpose())
            z = spsolve(Z, w*y)
            w = p * (y > z) + (1-p) * (y < z)
        return z
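A possible way to try this out, as a minimal sketch on made-up data: the synthetic spectrum, the helper name `make_synthetic_spectrum`, and the values `lam=1e5` and `p=0.01` are all assumptions for illustration, not tuned for real Raman data. The function is repeated so the snippet runs on its own; the only changes are hoisting the penalty term out of the loop and converting to CSC before solving. As far as I recall, the paper suggests starting somewhere in 0.001 ≤ p ≤ 0.1 and 10^2 ≤ lam ≤ 10^9 for signals with positive peaks, then adjusting by eye.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve


def baseline_als(y, lam, p, niter=10):
    """Asymmetric least squares baseline, same algorithm as above."""
    L = len(y)
    D = sparse.diags([1, -2, 1], [0, -1, -2], shape=(L, L - 2))
    penalty = lam * D.dot(D.transpose())  # does not change between iterations
    w = np.ones(L)
    for _ in range(niter):
        W = sparse.spdiags(w, 0, L, L)
        Z = sparse.csc_matrix(W + penalty)  # CSC is the format spsolve prefers
        z = spsolve(Z, w * y)
        w = p * (y > z) + (1 - p) * (y < z)
    return z


def make_synthetic_spectrum(n=1000):
    """Hypothetical test signal: a smooth curved background plus two peaks."""
    x = np.linspace(0.0, 1.0, n)
    background = 5.0 + 10.0 * x - 8.0 * x**2
    peaks = (20.0 * np.exp(-((x - 0.3) / 0.01) ** 2)
             + 12.0 * np.exp(-((x - 0.7) / 0.015) ** 2))
    return x, background + peaks, background


x, y, true_background = make_synthetic_spectrum()

# lam and p are illustrative starting values; adjust them for your own data.
z = baseline_als(y, lam=1e5, p=0.01)

# Subtracting the estimated baseline leaves the peaks on a roughly flat floor.
corrected = y - z
```

Plotting `y`, `z`, and `corrected` against `x` (for example with matplotlib) is the quickest way to judge whether `lam` is too stiff or too flexible for a given spectrum.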
jpantina answered Oct 13 '22 22:10