Intrinsic Dimension

Computes the intrinsic dimension (ID) of the system over a molecular dynamics trajectory.
If id_method is 'local', the function returns:

  • the averaged value of instantaneous ID computed on the entire trajectory.

  • the averaged value of instantaneous ID computed over the last frames of the trajectory (the last 100 frames in the example below).

  • the instantaneous ID computed frame by frame on the entire trajectory.

mean_all, mean_last, local_id = intrinsic_dimension(topology='examples/villin/2f4k.pdb',
                                                    trajectory='examples/villin/2f4k_f0.xtc',
                                                    projection_method='Dihedral', id_method='local', verbose=False)

print('Mean instantaneous ID of the entire trajectory:', mean_all)
print('Mean instantaneous ID of the last 100 frames:', mean_last)
print('Instantaneous ID of the entire trajectory: \n', local_id[5:])
Mean instantaneous ID of the entire trajectory: 25.05909587663137
Mean instantaneous ID of the last 100 frames: 25.603595848008236
Instantaneous ID of the entire trajectory: 
 [25.29562521 25.3016539  25.58546274 ... 25.56638913 25.2186398
 25.43423956]
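
How the two averages relate to the per-frame values can be sketched in plain Python. This is only an illustration on made-up numbers, assuming the "last frames" average is taken over the final 100 entries of the instantaneous ID array, as the printed label above suggests. The names summarize_local_id and n_last are hypothetical and are not part of the package.

```python
from statistics import fmean

def summarize_local_id(local_id, n_last=100):
    """Return (mean over all frames, mean over the last n_last frames).

    n_last=100 is an assumption taken from the printed label in the example,
    not a documented default of the package.
    """
    values = list(local_id)
    return fmean(values), fmean(values[-n_last:])

# Made-up per-frame values standing in for a real instantaneous ID array.
fake_local_id = [1.0, 2.0, 3.0, 4.0]
mean_all, mean_last = summarize_local_id(fake_local_id, n_last=2)
print(mean_all, mean_last)
```

Comparing the two averages gives a quick check of whether the ID has drifted towards the end of the simulation.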

If id_method is 'global', the function returns:

  • the value of global ID computed on the entire trajectory.

  • the value of global ID computed on the last frames of the trajectory (the last 100 frames in the example below).

global_all, global_last = intrinsic_dimension(topology='examples/villin/2f4k.pdb',
                                              trajectory='examples/villin/2f4k_f0.xtc',
                                              projection_method='Dihedral', id_method='global', verbose=False)

print('Global ID of the entire trajectory:', global_all)
print('Global ID of the last 100 frames:', global_last)
Global ID of the entire trajectory: 25.89533641150811
Global ID of the last 100 frames: 27.14144547061944

Section ID

This function computes ID over sliding windows of a protein sequence.

Additional Parameters

  • window_size (int): window length in residues (default = 10)

  • stride (int): number of residues between the starts of two consecutive windows (default = 1)

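Returns a DataFrame with one row per window. The columns are the window bounds (start, end), the ID over the entire simulation, the ID over the last frames (last simulation), and the frame-by-frame ID values (instantaneous).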

results = section_id(topology='examples/villin/2f4k.pdb', trajectory='examples/villin/2f4k_f0.xtc',
                     window_size=15, stride=5, projection_method='Dihedral', verbose=False)
print(f'ID table: \n {results.head()}')
ID table: 
    start  end  entire simulation  last simulation  \
0     42   56          15.020170        15.107589   
1     47   61          15.256653        15.226222   
2     52   66          15.899548        15.847965   
3     57   71          16.098079        16.012590   
4     62   76          15.407374        15.417718   

                                       instantaneous  
0  [15.16317145884061, 14.100304302156, 15.245204...  
1  [15.318657037027966, 14.668348024880796, 14.71...  
2  [15.828122242566746, 15.889626441326188, 15.31...  
3  [16.425784249177884, 16.798194425179883, 15.09...  
4  [16.13297162330056, 15.402657321288679, 15.740...  
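
The start and end columns above are consistent with inclusive windows of window_size residues whose starts advance by stride. A minimal sketch of that windowing follows. The helper sliding_windows is hypothetical, and both the residue range 42-76 and the dropping of incomplete trailing windows are assumptions made for illustration, not documented behaviour of section_id.

```python
def sliding_windows(first_resid, last_resid, window_size=10, stride=1):
    """Return inclusive (start, end) residue pairs for each complete window."""
    windows = []
    start = first_resid
    # Assumption: a window that would run past last_resid is dropped.
    while start + window_size - 1 <= last_resid:
        windows.append((start, start + window_size - 1))
        start += stride
    return windows

# Residue range 42-76 is assumed here for illustration only.
print(sliding_windows(42, 76, window_size=15, stride=5))
# -> [(42, 56), (47, 61), (52, 66), (57, 71), (62, 76)]
```

With stride smaller than window_size the windows overlap, so neighbouring rows of the table share residues.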

Secondary Structure ID

This function computes ID over secondary structure elements.

Additional Parameters

  • simplified (bool): if True (default), uses the simplified DSSP codes: coil (C), strand (S) or helix (H). If False, uses the full DSSP codes: helix (H), beta bridge (B), extended strand (E), three helix (G), hydrogen bonded turn (T), bend (S), loop or irregular element ( ).

Returns

  • A DataFrame with ID values per secondary structure

  • A DataFrame with DSSP assignment per residue

mol_ref = Molecule('examples/villin/2f4k.pdb')
results, secStr = secondary_structure_id(topology='examples/villin/2f4k.pdb', trajectory='examples/villin/2f4k_f0.xtc',
        mol_ref=mol_ref,
        simplified=True, projection_method='Dihedral', id_method='local', verbose=False)

print(f'ID table:\n {results.head(5)}')
print(f'\n Secondary structure assignments:\n {secStr.head(5)}')
ID table:
    start  end sec str type                        window  entire simulation  \
0     42   43            C                      [42, 43]           1.963018   
1     44   50            H  [44, 45, 46, 47, 48, 49, 50]           9.274704   
2     51   54            C              [51, 52, 53, 54]           5.384093   
3     55   58            H              [55, 56, 57, 58]           5.146533   
4     59   62            C              [59, 60, 61, 62]           5.439168   

   last simulation                                      instantaneous  
0         1.959972  [2.0501716461112105, 2.054921447896053, 2.0109...  
1         9.354203  [9.473939433431966, 9.384160489840434, 8.75161...  
2         5.381422  [5.132274328254565, 5.269852589635324, 5.23606...  
3         5.138285  [5.117460673233659, 5.350647768246517, 4.86918...  
4         5.440862  [5.807002533132328, 5.903334979361287, 5.04612...  

 Secondary structure assignments:
    resid index resname sec str type
0           42     LEU            C
1           43     SER            C
2           44     ASP            H
3           45     GLU            H
4           46     ASP            H
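
The two tables are related: each row of the ID table covers a run of consecutive residues that share the same secondary structure code. One way to derive such runs from a per-residue assignment is sketched below with itertools.groupby. The helper group_segments is hypothetical and may differ from what secondary_structure_id does internally. The input codes are typed in by hand to mirror the first three rows shown above, and the sketch assumes residues are listed in sequence order.

```python
from itertools import groupby

def group_segments(resids, codes):
    """Collapse per-residue codes into runs of identical consecutive codes."""
    segments = []
    for code, run in groupby(zip(resids, codes), key=lambda pair: pair[1]):
        window = [resid for resid, _ in run]
        segments.append({
            "start": window[0],
            "end": window[-1],
            "sec str type": code,
            "window": window,
        })
    return segments

# Hand-typed codes mirroring the example: 42-43 coil, 44-50 helix, 51-54 coil.
resids = list(range(42, 55))
codes = list("CCHHHHHHHCCCC")

for segment in group_segments(resids, codes):
    print(segment["start"], segment["end"], segment["sec str type"])
```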