I appreciate that this task is probably a bit ambitious given my level (or lack) of knowledge, but still.
I have a list of about 3000 16-character strings, where each character stands for a list of numbers. In case that isn't clear: it is actually a list of peptides, each 16 amino acids long, where each amino acid (1 of 20) can be represented by 5 numbers.
I want to iterate through that list (of peptides), and then for each character (amino acid) add the relevant 5 numbers (Atchley factors, if you're interested) to an array, making a 3 dimensional array, where my axes are: instance of peptide (3000) x amino acid within that peptide (16) x factors (5).
I'm incredibly out of my depth, so I'm not sure if what I've got is helpful, but here it is (using numpy):
array = np.empty(shape=(len(peptides), 16, 5))
for i in peptides:
    for j in str(i):
(and at this point I tried a bunch of different things as I trawled the forums, ending with something a little like this, but I'm sure I've missed even what I was aiming for here)
        if j == 'A':
            L16Afctrs = np.append([-0.59145974, -1.30209266, -0.7330651, 1.5703918, -0.14550842], axis=1)
        elif j == 'C':
            L16Afctrs = np.append([-1.34267179, 0.46542300, -0.8620345, -1.0200786, -0.25516894], axis=1)
        ...
        elif j == 'Y':
            L16Afctrs = np.append([0.25999617, 0.82992312, 3.0973596, -0.8380164, 1.51150958], axis=1)
As I say, I'm honestly struggling; any help would be much appreciated.
Edit: clarification (hopefully)
I have a list of around 3000 different 16 character strings, where each character in those strings denotes a further 5 numbers.
I want to generate a 3 dimensional array or structure, so that I can (eventually) plot those 5 numbers for a given position across all 3000 strings by looking across a given plane in the 3 dimensional array (where the dimensions I envisage are: original string x 16 characters x 5 factors).
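A minimal sketch of what "looking across a plane" could look like once such an array exists. It assumes the layout described above (peptide x position x factor); the name `factors` is hypothetical and the array is zero-filled as a placeholder, not real Atchley data.

```python
import numpy as np

# Placeholder standing in for the real data:
# 3000 peptides x 16 positions x 5 factors (all zeros, for shape only).
n_peptides = 3000
factors = np.zeros((n_peptides, 16, 5))

# Fix the middle index: all 5 factors at position 3 (0-based) for every peptide.
position_plane = factors[:, 3, :]
print(position_plane.shape)  # (3000, 5)

# Fix the last index: factor 0 at every position of every peptide.
factor_plane = factors[:, :, 0]
print(factor_plane.shape)  # (3000, 16)
```

Each slice is a 2D array, so it can be handed to a plotting routine one column or row at a time.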
I'm currently making a dictionary of the different characters, following the answer from @Winston, and then trying to fold that into a 3D array.
Edit 2: Success!
Winston's fix works beautifully!
Store your data in a dictionary:
DATA = {
    'A' : numpy.array([-0.59145974, -1.30209266, -0.7330651, 1.5703918, -0.14550842]),
    'C' : numpy.array([-1.34267179, 0.46542300, -0.8620345, -1.0200786, -0.25516894]),
    'D' : numpy.array([1.05015062, 0.30242411, -3.6559147, -0.2590236, -3.24176791]),
    ...
}
Use a Python list comprehension to look up the factors for every letter of every peptide, then have numpy convert that list into a numpy array:
counters = numpy.array([DATA[letter] for peptide in peptides for letter in peptide])
Reshape the array into your three dimensions, since the previous step produces a 2D array of shape (len(peptides) * 16, 5):
counters = counters.reshape(len(peptides), 16, 5)
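The steps above can be tried end to end on a toy input. This is only a sketch under stated assumptions: the dictionary holds just three amino acids (the 'A' and 'D' values from this answer, the 'C' values as given in the question), and the peptides are hypothetical 4-residue strings rather than real 16-residue ones.

```python
import numpy

# Only three amino acids are included here; the real table needs all 20.
DATA = {
    'A': numpy.array([-0.59145974, -1.30209266, -0.7330651, 1.5703918, -0.14550842]),
    'C': numpy.array([-1.34267179, 0.46542300, -0.8620345, -1.0200786, -0.25516894]),
    'D': numpy.array([1.05015062, 0.30242411, -3.6559147, -0.2590236, -3.24176791]),
}

# Hypothetical toy peptides (4 residues each instead of 16), for illustration only.
peptides = ['ACDA', 'DDCA', 'CACD']
length = len(peptides[0])

# One row of 5 factors per letter, peptides laid end to end.
flat = numpy.array([DATA[letter] for peptide in peptides for letter in peptide])
print(flat.shape)  # (12, 5)

# Regroup the rows so that axis 0 is the peptide and axis 1 the position.
counters = flat.reshape(len(peptides), length, 5)
print(counters.shape)  # (3, 4, 5)

# Spot check: peptide 1 ('DDCA'), position 2 is 'C'.
assert numpy.array_equal(counters[1, 2], DATA['C'])
```

The reshape relies on every peptide having the same length. A stray peptide of a different length would either make reshape raise a ValueError or silently misalign rows, so checking the lengths first is worthwhile. A letter missing from the dictionary raises a KeyError during the lookup.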