I have a DataFrame which consists of 30 rows and 9 columns. I want to remove outliers that lie more than 2 sigma from the column mean.
Currently I do it like this:
import numpy as np
from scipy import stats

df[(np.abs(stats.zscore(df)) < 2).all(axis=1)]
But it removes the whole row if there is an outlier in a single column. I just want that single value deleted. How can I do this? Also, the first column contains the time and should never be touched. How can I exclude this single column?
This is what the data looks like:
Trace for Mass: 60Ni    61Ni    62Ni    63Cu    64Ni    65Cu    66Zn
Resolution: High    High    High    High    High    High    High
                            
Time    Intensity   Intensity   Intensity   Intensity   Intensity   Intensity   Intensity
[sec]   [cps]   [cps]   [cps]   [cps]   [cps]   [cps]   [cps]
0.  4.246875178068876e-003  4.550645244307816e-004  8.364085806533694e-004  3.21496045216918e-003   3.215973265469074e-003  1.595904817804694e-003  1.983924303203821e-003  
1.051999807357788   4.264393821358681e-003  5.171436932869256e-004  8.292743586935103e-004  3.154967911541462e-003  3.216561861336231e-003  1.622977200895548e-003  1.874359208159149e-003  
2.102999925613403   4.27544629201293e-003   4.796394787263125e-004  8.318902109749615e-004  3.211528761312366e-003  3.147452371194959e-003  1.622740761376917e-003  1.879810937680304e-003  
3.154999971389771   4.278738517314196e-003  4.829006502404809e-004  7.972901221364737e-004  3.218628698959947e-003  3.22998408228159e-003   1.604416524060071e-003  1.938240835443139e-003  
4.206999778747559   4.211603198200464e-003  4.424861108418554e-004  8.007381693460047e-004  3.2428870908916e-003    3.166524693369865e-003  1.590821426361799e-003  1.903632888570428e-003  
5.257999897003174   4.267803858965635e-003  5.1306706154719e-004    8.309389813803136e-004  3.144200425595045e-003  3.117314074188471e-003  1.603707205504179e-003  1.815222087316215e-003  
6.309999942779541   4.182798787951469e-003  5.052632768638432e-004  7.896805764175952e-004  3.130593337118626e-003  3.10095027089119e-003   1.570251770317555e-003  1.817710697650909e-003  
7.361000061035156   4.296375438570976e-003  4.910536226816475e-004  8.9122453937307e-004    3.204192267730832e-003  3.028199542313814e-003  1.533132861368358e-003  1.788084045983851e-003  
8.413000106811523   4.335530567914248e-003  6.025235052220523e-004  8.631621603854001e-004  3.268211148679256e-003  2.987353131175041e-003  1.608435995876789e-003  1.796260941773653e-003  
9.463999748229981   4.290143493562937e-003  4.839488829020411e-004  8.525795419700444e-004  3.222533734515309e-003  3.005951410159469e-003  1.583610195666552e-003  1.700276043266058e-003  
10.51599979400635   4.287909716367722e-003  5.497571546584368e-004  9.083477198146284e-004  3.219338599592447e-003  2.950039459392428e-003  1.682562520727515e-003  1.783343963325024e-003  
11.56699943542481   4.260278772562742e-003  4.665948799811304e-004  7.738673011772335e-004  3.193542594090104e-003  2.853760728612542e-003  1.568833249621093e-003  1.736654434353113e-003  
12.61899948120117   4.26474679261446e-003   5.00720867421478e-004   8.611407829448581e-004  3.217800287529826e-003  2.865647897124291e-003  1.595077337697148e-003  1.658685388974845e-003  
13.67099952697754   4.222772549837828e-003  4.647313617169857e-004  8.633999968878925e-004  3.159464336931706e-003  2.801976399496198e-003  1.629361184313893e-003  1.673259655945003e-003  
14.72200012207031   4.23405971378088e-003   4.880253691226244e-004  8.320091292262077e-004  3.10550956055522e-003   2.766199875622988e-003  1.57923623919487e-003   1.671363832429051e-003  
15.77400016784668   4.263806156814098e-003  5.268111126497388e-004  8.335548918694258e-004  3.150589996948838e-003  2.747958991676569e-003  1.52225757483393e-003   1.638660905882716e-003  
16.82500076293945   4.173276014626026e-003  5.153965321369469e-004  7.848058012314141e-004  3.132368205115199e-003  2.736426191404462e-003  1.501098275184631e-003  1.646955031901598e-003  
17.87699890136719   4.209604579955339e-003  4.582091642078012e-004  7.977656787261367e-004  3.183129709213972e-003  2.714420203119516e-003  1.604771241545677e-003  1.606788486242294e-003  
18.92900085449219   4.214542452245951e-003  4.919854109175503e-004  8.5032032802701e-004    3.177686594426632e-003  2.588512841612101e-003  1.560558215714991e-003  1.607973361387849e-003  
19.97999954223633   4.171629901975393e-003  4.438837058842182e-004  8.449696470052004e-004  3.142070723697543e-003  2.649111207574606e-003  1.58833886962384e-003   1.547667197883129e-003  
21.0310001373291    4.234999883919954e-003  5.094563821330667e-004  8.215457201004028e-004  3.189756069332361e-003  2.645698608830571e-003  1.556538976728916e-003  1.515797688625753e-003  
22.08300018310547   4.159520845860243e-003  5.21336798556149e-004   7.7945546945557e-004    3.093914361670613e-003  2.504269825294614e-003  1.597914495505393e-003  1.550629152916372e-003  
23.13399887084961   4.095097538083792e-003  5.284418002702296e-004  8.160762954503298e-004  3.164552384987474e-003  2.605574205517769e-003  1.5143376076594e-003    1.545534702017903e-003  
24.18600082397461   4.190911073237658e-003  4.741653683595359e-004  8.253505802713335e-004  3.078178269788623e-003  2.457562601193786e-003  1.61718437448144e-003   1.502647297456861e-003  
25.23799896240234   4.155758768320084e-003  4.477270995266736e-004  8.012137841433287e-004  3.119352972134948e-003  2.549331868067384e-003  1.551455701701343e-003  1.538307638838887e-003  
26.28899955749512   4.055834375321865e-003  4.267746699042618e-004  8.247561054304242e-004  3.050019731745124e-003  2.364743268117309e-003  1.565523212775588e-003  1.418655156157911e-003  
27.34099960327148   4.160813987255096e-003  4.637996316887438e-004  8.405701955780387e-004  3.15011665225029e-003   2.621341263875365e-003  1.558548538014293e-003  1.534871873445809e-003  
28.39200019836426   4.123781807720661e-003  5.418366636149585e-004  8.308201213367283e-004  3.128936979919672e-003  2.427210099995136e-003  1.607372076250613e-003  1.475754892453551e-003  
29.44400024414063   4.185620695352554e-003  4.987408174201846e-004  7.421225891448557e-004  3.080426249653101e-003  2.371448557823896e-003  1.567532890476286e-003  1.444243011064827e-003  
30.49600028991699   4.092158749699593e-003  5.319360643625259e-004  8.368841372430325e-004  3.113200422376394e-003  2.385094529017806e-003  1.580300158821046e-003  1.433581346645951e-003  
The file is read with:
import pandas as pd

pd.options.display.float_format = '{:.4f}'.format
data = pd.read_csv(dateiname, sep='\t', names=['Time', '60Ni', '61Ni', '62Ni', '63Cu', '64Ni', '65Cu', '66Zn'], skiprows=6, nrows=30, index_col=False, dtype=float)
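If you need to replace outliers with missing values, use DataFrame.mask. Note that mask replaces the values where the condition is True, so the condition has to select the outliers, not the values you want to keep:
df = df.mask(np.abs(stats.zscore(df)) >= 2)
# equivalent, with the replacement value given explicitly
#df = df.mask(np.abs(stats.zscore(df)) >= 2, np.nan)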
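The question also asks that the Time column never be touched. A minimal sketch of one way to handle that, assuming the time column is named 'Time' as in the read_csv call: compute the z-scores only on the remaining columns and write the masked result back. The small frame and the columns 'A' and 'B' are made up for illustration, and the z-score is computed with plain pandas (std with ddof=0, which should match the default of stats.zscore) rather than with SciPy.

```python
import pandas as pd

# Hypothetical stand-in for the real data: a time axis plus two intensity columns.
df = pd.DataFrame({
    'Time': [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
    'A': [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 10.0],  # last value is an obvious outlier
    'B': [2.0, 2.1, 1.9, 2.0, 2.05, 1.95, 2.0, 2.0],   # no outliers
})

# Every column except the time axis.
value_cols = [c for c in df.columns if c != 'Time']

# Column-wise population z-score (ddof=0).
values = df[value_cols]
z = (values - values.mean()) / values.std(ddof=0)

# mask() replaces entries where the condition is True, so select the outliers.
cleaned = df.copy()
cleaned[value_cols] = values.mask(z.abs() >= 2)

print(cleaned)
```

Only the single outlying cell becomes NaN here; the row count and the Time column stay as they were.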
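"I just want to get this single value deleted."
Deleting a single value is not possible, because a DataFrame is rectangular: every column must have the same number of rows. You can only replace the value with a missing value (NaN), or remove whole rows as your solution does.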
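To illustrate the difference, here is a small sketch with made-up data (the frame and the column names 'A' and 'B' are hypothetical). Once an outlier is stored as NaN, pandas reductions such as mean skip it by default, a single column can be extracted without its gap, and only DataFrame.dropna discards the entire row.

```python
import numpy as np
import pandas as pd

# Hypothetical frame in which one outlier has already been replaced by NaN.
masked = pd.DataFrame({
    'Time': [0.0, 1.0, 2.0, 3.0],
    'A': [1.0, np.nan, 3.0, 2.0],
    'B': [4.0, 5.0, 6.0, 7.0],
})

col_means = masked[['A', 'B']].mean()  # NaN is skipped by default (skipna=True)
a_only = masked['A'].dropna()          # one column with its missing entry removed
full_rows = masked.dropna()            # drops every row that contains any NaN

print(col_means)
print(len(a_only), len(full_rows))
```

Keeping the NaN in place is usually the closest thing to "deleting" one value, since the other columns of that row remain available.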