I'm a beginner at Python coding. I'm working with structural coordinates. I have a PDB structure that has xyz coordinate information (the last three columns):
ATOM 1 N SER A 1 27.130 7.770 34.390
ATOM 2 1H SER A 1 27.990 7.760 34.930
ATOM 3 2H SER A 1 27.160 6.960 33.790
ATOM 4 3H SER A 1 27.170 8.580 33.790
ATOM 5 CA SER A 1 25.940 7.780 35.250
ATOM 6 CB SER A 1 25.980 9.090 36.020
ATOM 7 OG SER A 1 26.740 10.100 35.320
ATOM 8 HG SER A 1 26.750 10.940 35.860
ATOM 9 C SER A 1 24.640 7.790 34.460
ATOM 10 O SER A 1 24.530 8.510 33.500
ATOM 11 N CYS A 2 23.590 7.070 34.760
ATOM 12 H CYS A 2 23.590 6.550 35.610
ATOM 13 CA CYS A 2 22.420 7.010 33.900
ATOM 14 CB CYS A 2 21.620 5.760 34.270
ATOM 15 SG CYS A 2 22.480 4.210 33.970
ATOM 16 C CYS A 2 21.590 8.220 34.040
ATOM 17 O CYS A 2 21.370 8.690 35.160
How can I calculate the centroid of the structure from the xyz coordinates?
From the centroid I want to draw a sphere of radius 20cm.
I tried this:
from __future__ import division
import math as mean
import numpy as nx
from string import*
infile = open('file.pdb', 'r') #open my file
text1 = infile.read().split('\n')
infile.close()
text = []
for i in text1:
if i != '':
text.append(i)
for j in text:
x1 = eval(replace(j[30:38], ' ', '')) #extract x-coordinate
y1 = eval(replace(j[38:46], ' ', '')) #extract y-coordinate
z1 = eval(replace(j[46:54], ' ', '')) #extract z-coordinate
idcord = []
idcord.append(x1); idcord.append(y1); idcord.append(z1)
centroid = nx.mean(idcord)
print centroid
It gives the centroid of each atom (xyz), but I need a single central point. How?
First of all, an easier way to read your file is with numpy's genfromtxt function. You don't need to import string, and you don't need to loop through all the lines and append text or count the characters.
from __future__ import division
import numpy as nx
data = nx.genfromtxt('file.pdb')
Then, the last three columns can be accessed as:
data[:, -3:]
Where the first : means "all rows", and -3: means from the third-to-last column to the last column.
So, you can average them as such:
nx.mean(data[:,-3:], axis=0)
where the axis=0 argument tells nx.mean to take the average along the first (0th) axis. It looks like this:
In : data[:,-3:]
Out:
array([[ 27.13, 7.77, 34.39],
[ 27.99, 7.76, 34.93],
[ 27.16, 6.96, 33.79],
[ 27.17, 8.58, 33.79],
[ 25.94, 7.78, 35.25],
[ 25.98, 9.09, 36.02],
[ 26.74, 10.1 , 35.32],
[ 26.75, 10.94, 35.86],
[ 24.64, 7.79, 34.46],
[ 24.53, 8.51, 33.5 ],
[ 23.59, 7.07, 34.76],
[ 23.59, 6.55, 35.61],
[ 22.42, 7.01, 33.9 ],
[ 21.62, 5.76, 34.27],
[ 22.48, 4.21, 33.97],
[ 21.59, 8.22, 34.04],
[ 21.37, 8.69, 35.16]])
In : nx.mean(data[:,-3:], axis=0)
Out: array([ 24.74647059, 7.81117647, 34.64823529])
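To see why axis=0 matters, here is a minimal self-contained sketch, assuming Python 3 and whitespace-separated lines like the sample in the question. It reads three of those lines from an in-memory string (a stand-in for file.pdb) and uses the conventional np alias instead of nx. The names sample, coords, centroid and per_atom are only illustrative.

```python
import io
import numpy as np

# Three atoms from the question, held in memory instead of file.pdb
sample = """ATOM 1 N SER A 1 27.130 7.770 34.390
ATOM 5 CA SER A 1 25.940 7.780 35.250
ATOM 9 C SER A 1 24.640 7.790 34.460
"""

# Text fields (ATOM, N, SER, A) come back as nan; the last three columns parse as floats
data = np.genfromtxt(io.StringIO(sample))
coords = data[:, -3:]            # shape (3, 3): one row per atom, columns x, y, z

centroid = coords.mean(axis=0)   # average over atoms: a single (x, y, z) point
per_atom = coords.mean(axis=1)   # average of x, y, z within each atom, like the loop in the question

print(centroid)
print(per_atom)
```

Averaging along axis 0 collapses the rows (atoms) and leaves one value per column, which is the centroid. Averaging along axis 1 gives one number per atom, which is what the question's loop was printing.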
Some other things:
1) Remove this line: import math as mean, which imports the entire math module and renames it mean. You probably intended from math import mean, but the math module has no mean function. In any case, your code ends up using the mean function from the numpy (nx) module, so the math import is never used.
2) your loop is not indented, which means you either pasted incorrectly into StackOverflow, or your loop is incorrectly indented. Possibly, this is what your code actually looks like:
for j in text:
    x1 = eval(replace(j[30:38], ' ', '')) #extract x-coordinate
    y1 = eval(replace(j[38:46], ' ', '')) #extract y-coordinate
    z1 = eval(replace(j[46:54], ' ', '')) #extract z-coordinate
    idcord = []
    idcord.append(x1); idcord.append(y1); idcord.append(z1)
    centroid = nx.mean(idcord)
    print centroid
But the problem is that idcord gets set to an empty list every time the loop goes through, and a new centroid is calculated for each particle. You don't even need the loop at all if you import the data file all at once as above. In fact, your entire code can be:
from __future__ import division
import numpy as nx
data = nx.genfromtxt('file.pdb')
nx.mean(data[:,-3:], axis=0)
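If you would rather keep an explicit loop, for example to skip lines that are not ATOM records, the fix is to create the list once before the loop and take the mean once after it. This is a minimal sketch, assuming Python 3 and whitespace-separated lines whose last three fields are x, y, z as in the sample above. read_coords and centroid_of are hypothetical helper names, not part of any library.

```python
import numpy as np

def read_coords(lines):
    """Collect (x, y, z) from every ATOM line; assumes the last three fields are coordinates."""
    coords = []                      # created once, outside the loop
    for line in lines:
        fields = line.split()
        if not fields or fields[0] != "ATOM":
            continue                 # skip blank lines and other record types
        coords.append([float(v) for v in fields[-3:]])
    return np.array(coords)

def centroid_of(coords):
    """Mean over atoms (axis 0) gives a single (x, y, z) point."""
    return coords.mean(axis=0)

# Two lines from the question plus a blank line, standing in for the file contents
lines = [
    "ATOM 13 CA CYS A 2 22.420 7.010 33.900",
    "ATOM 14 CB CYS A 2 21.620 5.760 34.270",
    "",
]
print(centroid_of(read_coords(lines)))
```

With a real file you could pass the open file object in place of lines. Note that a standard fixed-width PDB file carries occupancy and B-factor columns after z, so there the column slices from the question (line[30:38], line[38:46], line[46:54]) would be a safer way to pick out the coordinates than fields[-3:].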