Intrinsic Dimension
Computes the intrinsic dimension (ID) of the system over the entire molecular dynamics trajectory.
If id_method is local, the function returns:
the mean of the instantaneous ID over the entire trajectory;
the mean of the instantaneous ID over the final frames of the trajectory (the number of frames is set by last);
the instantaneous ID computed frame by frame over the entire trajectory.
mean_all, mean_last, local_id = intrinsic_dimension(topology = 'examples/villin/2f4k.pdb', trajectory = 'examples/villin/2f4k_f0.xtc', projection_method='Dihedral', id_method='local', verbose=False)
print('Mean instantaneous ID of the entire trajectory:', mean_all)
print('Mean instantaneous ID of the last 100 frames:', mean_last)
print('Instantaneous ID of the entire trajectory: \n', local_id[5:])
Mean instantaneous ID of the entire trajectory: 25.05909587663137
Mean instantaneous ID of the last 100 frames: 25.603595848008236
Instantaneous ID of the entire trajectory:
[25.29562521 25.3016539 25.58546274 ... 25.56638913 25.2186398
25.43423956]
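How the two means relate to the per-frame series can be sketched in plain Python. This is only an illustration: it assumes both values are simple arithmetic averages of the instantaneous ID and that the tail length is a frame count. The helper summarize_local_id and the toy numbers are hypothetical and are not part of the package.

```python
from statistics import fmean


def summarize_local_id(local_id, last=100):
    """Return (mean over all frames, mean over the final `last` frames).

    Hypothetical helper: assumes plain arithmetic means of the per-frame ID.
    """
    values = list(local_id)
    if not values:
        raise ValueError("local_id must contain at least one frame")
    if last < 1:
        raise ValueError("last must be a positive number of frames")
    # If `last` exceeds the trajectory length, the slice is the whole series.
    tail = values[-last:]
    return fmean(values), fmean(tail)


# Toy per-frame series (made-up values, not real output of the package).
toy_series = [25.0, 25.5, 26.0, 26.5]
mean_all, mean_last = summarize_local_id(toy_series, last=2)
print(mean_all, mean_last)  # 25.75 26.25
```

If the package averages differently (for example, weighting frames), the numbers it returns will not match this sketch.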
If id_method is global, the function returns:
the global ID computed over the entire trajectory;
the global ID computed over the final frames of the trajectory (the number of frames is set by last).
global_all, global_last = intrinsic_dimension(topology = 'examples/villin/2f4k.pdb', trajectory = 'examples/villin/2f4k_f0.xtc', projection_method='Dihedral', id_method='global', verbose = False)
print('Global ID of the entire trajectory:', global_all)
print('Global ID of the last 100 frames:', global_last)
Global ID of the entire trajectory: 25.89533641150811
Global ID of the last 100 frames: 27.14144547061944
Section ID
This function computes ID over sliding windows of a protein sequence.
Additional Parameters
window_size (int): window length in residues (default = 10)
stride (int): number of residues between the starts of two consecutive windows (default = 1)
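Returns a DataFrame with one row per window and the columns start, end, entire simulation, last simulation and instantaneous.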
results = section_id(topology='examples/villin/2f4k.pdb', trajectory='examples/villin/2f4k_f0.xtc',
                     window_size=15, stride=5, projection_method='Dihedral', verbose=False)
print(f'ID table: \n {results.head()}')
ID table:
start end entire simulation last simulation \
0 42 56 15.020170 15.107589
1 47 61 15.256653 15.226222
2 52 66 15.899548 15.847965
3 57 71 16.098079 16.012590
4 62 76 15.407374 15.417718
instantaneous
0 [15.16317145884061, 14.100304302156, 15.245204...
1 [15.318657037027966, 14.668348024880796, 14.71...
2 [15.828122242566746, 15.889626441326188, 15.31...
3 [16.425784249177884, 16.798194425179883, 15.09...
4 [16.13297162330056, 15.402657321288679, 15.740...
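The start and end columns above follow from window_size and stride. The sketch below reproduces them for residues 42 to 76, the range covered by the rows shown. It assumes inclusive residue ranges and full windows only; how the package treats a trailing partial window is not stated here, and sliding_windows is a hypothetical helper, not a function of the package.

```python
def sliding_windows(first_resid, last_resid, window_size=10, stride=1):
    """Yield inclusive (start, end) residue ranges for full windows only."""
    if window_size < 1 or stride < 1:
        raise ValueError("window_size and stride must be positive integers")
    start = first_resid
    # Stop as soon as a window would extend past the last residue.
    while start + window_size - 1 <= last_resid:
        yield start, start + window_size - 1
        start += stride


windows = list(sliding_windows(42, 76, window_size=15, stride=5))
print(windows)
# [(42, 56), (47, 61), (52, 66), (57, 71), (62, 76)]
```

With stride smaller than window_size, consecutive windows overlap, so each residue can contribute to several rows of the table.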
Secondary Structure ID
This function computes ID over secondary structure elements.
Additional Parameters
simplified (bool): if True (default), uses the simplified DSSP codes: coil (C), strand (S) or helix (H). If False, uses the full DSSP codes: helix (H), beta bridge (B), extended strand (E), three helix (G), hydrogen bonded turn (T), bend (S), loop or irregular element ( ).
Returns
A DataFrame with ID values per secondary structure
A DataFrame with DSSP assignment per residue
mol_ref = Molecule("examples/villin/2f4k.pdb")
results, secStr = secondary_structure_id(topology='examples/villin/2f4k.pdb', trajectory='examples/villin/2f4k_f0.xtc',
                                         mol_ref=mol_ref,
                                         simplified=True, projection_method='Dihedral', id_method='local', verbose=False)
print(f'ID table:\n {results.head(5)}')
print(f'\n Secondary structure assignments:\n {secStr.head(5)}')
ID table:
start end sec str type window entire simulation \
0 42 43 C [42, 43] 1.963018
1 44 50 H [44, 45, 46, 47, 48, 49, 50] 9.274704
2 51 54 C [51, 52, 53, 54] 5.384093
3 55 58 H [55, 56, 57, 58] 5.146533
4 59 62 C [59, 60, 61, 62] 5.439168
last simulation instantaneous
0 1.959972 [2.0501716461112105, 2.054921447896053, 2.0109...
1 9.354203 [9.473939433431966, 9.384160489840434, 8.75161...
2 5.381422 [5.132274328254565, 5.269852589635324, 5.23606...
3 5.138285 [5.117460673233659, 5.350647768246517, 4.86918...
4 5.440862 [5.807002533132328, 5.903334979361287, 5.04612...
Secondary structure assignments:
resid index resname sec str type
0 42 LEU C
1 43 SER C
2 44 ASP H
3 45 GLU H
4 46 ASP H
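The link between the per-residue assignments and the rows of the ID table can be sketched as a grouping of consecutive residues that share a code. This is an illustration, not the package's implementation: contiguous_segments is a hypothetical helper, and the input codes are rebuilt from the start, end and sec str type columns printed above. It assumes residue indices are consecutive and does not split segments at chain breaks.

```python
from itertools import groupby


def contiguous_segments(resids, codes):
    """Group consecutive residues with the same code.

    Returns a list of (start, end, code, window) tuples, where window is
    the list of residue indices in the segment.
    """
    if len(resids) != len(codes):
        raise ValueError("resids and codes must have the same length")
    segments = []
    for code, group in groupby(zip(resids, codes), key=lambda pair: pair[1]):
        window = [resid for resid, _ in group]
        segments.append((window[0], window[-1], code, window))
    return segments


# Residues 42 to 62 with the simplified codes implied by the table above.
resids = list(range(42, 63))
codes = "CC" + "H" * 7 + "C" * 4 + "H" * 4 + "C" * 4

for start, end, code, window in contiguous_segments(resids, codes):
    print(start, end, code, window)
```

Each resulting window would then be passed to the ID estimator separately, which is why short segments such as [42, 43] yield much lower ID values than longer ones.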