How to Calculate Centroid in Python

Q: How do you calculate a centroid?

Add all the x values of the points and divide the sum by n, the number of points. Add all the y values and divide the sum by n. That's it! The two results are the x- and y-coordinates of the centroid (in 3D, do the same for the z values).
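
Written out for n points in three dimensions, the rule reads as follows (the symbols are notation introduced here for illustration, not taken from the original page):

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i, \qquad
\bar{z} = \frac{1}{n}\sum_{i=1}^{n} z_i
```

The centroid is then the point with coordinates (x̄, ȳ, z̄).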

Q: How do you find the centroid of a set of data?

To calculate the centroid of a cluster, take the positions of all points in that cluster, sum them up, and divide by the number of points.

Tags:

python

math

numpy

shape

I'm a beginner at Python coding. I'm working with structural coordinates. I have a PDB structure which has xyz coordinate information (last three columns):

ATOM      1  N   SER A   1      27.130   7.770  34.390    
ATOM      2  1H  SER A   1      27.990   7.760  34.930     
ATOM      3  2H  SER A   1      27.160   6.960  33.790    
ATOM      4  3H  SER A   1      27.170   8.580  33.790    
ATOM      5  CA  SER A   1      25.940   7.780  35.250    
ATOM      6  CB  SER A   1      25.980   9.090  36.020    
ATOM      7  OG  SER A   1      26.740  10.100  35.320    
ATOM      8  HG  SER A   1      26.750  10.940  35.860    
ATOM      9  C   SER A   1      24.640   7.790  34.460    
ATOM     10  O   SER A   1      24.530   8.510  33.500    
ATOM     11  N   CYS A   2      23.590   7.070  34.760    
ATOM     12  H   CYS A   2      23.590   6.550  35.610    
ATOM     13  CA  CYS A   2      22.420   7.010  33.900    
ATOM     14  CB  CYS A   2      21.620   5.760  34.270    
ATOM     15  SG  CYS A   2      22.480   4.210  33.970    
ATOM     16  C   CYS A   2      21.590   8.220  34.040    
ATOM     17  O   CYS A   2      21.370   8.690  35.160

I have 1000 atoms in my structure.
I have two queries.

How can I calculate the centroid of the structure from the xyz coordinates?
From the centroid, I want to draw a sphere of radius 20cm.

I tried this:


from __future__ import division
import math as mean
import numpy as nx
from string import*


infile = open('file.pdb', 'r')           #open my file
text1 = infile.read().split('\n')
infile.close()

text = []
for i in text1:
if i != '':
    text.append(i)

for j in text:
x1 = eval(replace(j[30:38], ' ', ''))         #extract x-coordinate
y1 = eval(replace(j[38:46], ' ', ''))         #extract y-coordinate
z1 = eval(replace(j[46:54], ' ', ''))         #extract z-coordinate

idcord = []
idcord.append(x1); idcord.append(y1); idcord.append(z1)

centroid = nx.mean(idcord)
print centroid

It gives the centroid of each atom (xyz), but I need a single central point. How?


asked Sep 10 '13 08:09

awanit

1 Answer

First of all, an easier way to read your file is with numpy's genfromtxt function. You don't need to import string, and you don't need to loop through all the lines and append text or count the characters.

from __future__ import division
import numpy as nx

data = nx.genfromtxt('file.pdb')

Then, the last three columns can be accessed as:

data[:, -3:]

Where the first : means "all rows", and -3: means from the third-to-last column to the last column.

So, you can average them as such:

nx.mean(data[:,-3:], axis=0)

where the axis=0 argument tells nx.mean to take the average along the first (0th) axis. It looks like this:

In : data[:,-3:]
Out: 
array([[ 27.13,   7.77,  34.39],
       [ 27.99,   7.76,  34.93],
       [ 27.16,   6.96,  33.79],
       [ 27.17,   8.58,  33.79],
       [ 25.94,   7.78,  35.25],
       [ 25.98,   9.09,  36.02],
       [ 26.74,  10.1 ,  35.32],
       [ 26.75,  10.94,  35.86],
       [ 24.64,   7.79,  34.46],
       [ 24.53,   8.51,  33.5 ],
       [ 23.59,   7.07,  34.76],
       [ 23.59,   6.55,  35.61],
       [ 22.42,   7.01,  33.9 ],
       [ 21.62,   5.76,  34.27],
       [ 22.48,   4.21,  33.97],
       [ 21.59,   8.22,  34.04],
       [ 21.37,   8.69,  35.16]])

In : nx.mean(data[:,-3:], axis=0)
Out: array([ 24.74647059,   7.81117647,  34.64823529])
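
To see what the axis argument changes, here is a small sketch with three made-up points (not taken from the PDB file). It uses the same nx alias for numpy as the code above. Without axis=0, the mean mixes x, y and z together, which is not a centroid.

```python
import numpy as nx

# Three made-up points, one per row, with columns x, y, z.
points = nx.array([[0.0, 0.0, 0.0],
                   [2.0, 4.0, 6.0],
                   [4.0, 8.0, 12.0]])

centroid = nx.mean(points, axis=0)   # one mean per column: the centroid
per_point = nx.mean(points, axis=1)  # one mean per row: mixes x, y and z
overall = nx.mean(points)            # a single number over all nine entries

print(centroid)
print(per_point)
print(overall)
```

Only the axis=0 result has one value each for x, y and z, which is what a central point needs.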

Some other things:

1) Remove this line: import math as mean, which imports the entire math module and renames it mean. You probably intended something like from math import mean, but the math module has no mean function, so that import would fail. In any case, your code ends up using the mean function from the numpy (nx) module, so the math import is never used.

2) your loop is not indented, which means you either pasted incorrectly into StackOverflow, or your loop is incorrectly indented. Possibly, this is what your code actually looks like:

for j in text:
    x1 = eval(replace(j[30:38], ' ', ''))         #extract x-coordinate
    y1 = eval(replace(j[38:46], ' ', ''))         #extract y-coordinate
    z1 = eval(replace(j[46:54], ' ', ''))         #extract z-coordinate

    idcord = []
    idcord.append(x1); idcord.append(y1); idcord.append(z1)

    centroid = nx.mean(idcord)
    print centroid

But the problem is that idcord gets reset to an empty list on every pass through the loop, so a new "centroid" is calculated for each atom. Since nx.mean(idcord) is called on just three numbers, that value is only the average of one atom's x, y and z. You don't even need the loop at all if you import the data file all at once as above. In fact, your entire code can be:

from __future__ import division
import numpy as nx

data = nx.genfromtxt('file.pdb')
centroid = nx.mean(data[:,-3:], axis=0)
print(centroid)
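
If genfromtxt has trouble with a particular file (for example when fixed-width columns run together), a loop can still work as long as the list is created once, outside the loop. The sketch below is only one possible approach. The helper names read_xyz and atoms_within are hypothetical, the character slices are the ones from the question's code, and the radius is assumed to be in the same units as the coordinates (PDB files store them in ångströms). For the sphere, it only selects the atoms that fall inside it; plotting is not covered.

```python
import numpy as nx

def read_xyz(lines):
    """Return an (n_atoms, 3) array of x, y, z from ATOM/HETATM lines.

    Uses the same fixed character slices as the question's code
    (30:38, 38:46, 46:54), with float() instead of eval().
    """
    coords = []  # created once, outside the loop, so it keeps every atom
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            coords.append([float(line[30:38]),
                           float(line[38:46]),
                           float(line[46:54])])
    return nx.array(coords)

def atoms_within(coords, center, radius):
    """Boolean mask: True for rows of coords inside the sphere."""
    distances = nx.linalg.norm(coords - center, axis=1)
    return distances <= radius

# Usage with made-up coordinates (not taken from the question's file):
coords = nx.array([[3.0, 0.0, 0.0],
                   [9.0, 0.0, 0.0],
                   [48.0, 0.0, 0.0]])
centroid = coords.mean(axis=0)
inside = atoms_within(coords, centroid, 20.0)
print(centroid)
print(inside)
```

With a real file, the coordinates would come from something like `with open('file.pdb') as f: coords = read_xyz(f)`, and `coords[inside]` would give the atoms inside the sphere.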


answered Sep 28 '22 01:09

askewchan
