@Nitin-Prata

@Nitin-Prata Nitin-Prata commented Dec 12, 2025

This PR adds a new analysis module RMSFResidue that computes
Root Mean Square Fluctuations (RMSF) on a per-residue basis.

Motivation

MDAnalysis currently provides atom-level RMSF via RMSF. This PR adds a
small convenience analysis that aggregates RMSF values at the residue
level, following the existing RMSF API. This does not introduce a new
method, but provides a commonly requested way to summarize RMSF results
at the residue level.

Implementation

  • Introduces RMSFResidue in MDAnalysis.analysis.rmsf_residue.
  • Computes residue-level mean positions frame-by-frame.
  • Accumulates statistics, then computes the final RMSF in _conclude().
  • Stores results in results.residue_rmsf.
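
The steps listed above can be sketched with plain NumPy. This is a minimal illustration under stated assumptions, not the code in this PR: the function name `residue_rmsf`, the array layout, the unweighted (non-mass-weighted) residue mean, and the Welford-style running update are all choices made here for the example.

```python
import numpy as np


def residue_rmsf(positions, resindices):
    """RMSF of per-residue mean positions (illustrative sketch only).

    positions:  (n_frames, n_atoms, 3) array of coordinates
    resindices: (n_atoms,) integer residue index per atom, assumed to
                cover 0..n_res-1 without gaps
    """
    positions = np.asarray(positions, dtype=float)
    resindices = np.asarray(resindices)
    n_res = int(resindices.max()) + 1
    atoms_per_res = np.bincount(resindices, minlength=n_res)

    mean = np.zeros((n_res, 3))   # running mean position per residue
    sumsq = np.zeros((n_res, 3))  # running sum of squared deviations

    for k, frame in enumerate(positions, start=1):
        # Per-frame step: unweighted mean position of each residue.
        centers = np.zeros((n_res, 3))
        np.add.at(centers, resindices, frame)
        centers /= atoms_per_res[:, None]

        # Welford update of the running mean and squared deviations.
        delta = centers - mean
        mean += delta / k
        sumsq += delta * (centers - mean)

    # Final step: sum the x, y, z variances and take the square root.
    return np.sqrt(sumsq.sum(axis=1) / len(positions))
```

Note that the RMSF of a residue's mean position is generally not the same number as the average of the atom-level RMSF values within that residue, so the definition chosen should be stated in the documentation.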

Tests

Added:

  • testsuite/MDAnalysisTests/analysis/test_rmsf_residue.py, which ensures that:
      • the analysis runs without error
      • the output length matches the number of residues
      • RMSF values are non-negative
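
As a rough sketch of what the listed checks amount to, the helper below validates a result array. The name `check_residue_rmsf` is hypothetical and is not part of the test file in this PR; the finiteness check is an extra assumption added here, not one of the listed tests.

```python
import numpy as np


def check_residue_rmsf(values, n_residues):
    """Sanity checks for a per-residue RMSF array (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    # One value per residue.
    assert values.shape == (n_residues,)
    # Extra check assumed here: no NaN or inf from empty residues.
    assert np.all(np.isfinite(values))
    # An RMSF is a square root, so it can never be negative.
    assert np.all(values >= 0.0)
    return True
```

Checks like these confirm the shape and sign of the output but not its values; comparing against a small synthetic trajectory with a known answer would be a stronger test.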

Notes

The implementation avoids groupby("residues") due to current API limitations, and instead manually groups atoms by resid.
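
Manual grouping by resid could look roughly like the sketch below. It is an illustration under assumptions, not the PR's code: `group_atoms_by_resid` is a hypothetical name, and the input is assumed to be a flat array with one resid per atom.

```python
import numpy as np


def group_atoms_by_resid(resids):
    """Return the unique resids and, for each, the indices of its atoms."""
    resids = np.asarray(resids)
    # inverse[i] is the position of atom i's resid in unique_resids.
    unique_resids, inverse = np.unique(resids, return_inverse=True)
    # A stable sort keeps the original atom order inside each group.
    order = np.argsort(inverse, kind="stable")
    counts = np.bincount(inverse)
    groups = np.split(order, np.cumsum(counts)[:-1])
    return unique_resids, groups
```

One caveat with this design: resid values come from the topology file and can repeat across chains or segments, so grouping on resid alone may merge distinct residues. Grouping on a universe-wide unique index such as `resindices` avoids that.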

Status

All tests pass locally.


📚 Documentation preview 📚: https://mdanalysis--5176.org.readthedocs.build/en/5176/


@github-actions github-actions bot left a comment

Hello there first time contributor! Welcome to the MDAnalysis community! We ask that all contributors abide by our Code of Conduct and that first time contributors introduce themselves on GitHub Discussions so we can get to know you. You can learn more about participating here. Please also add yourself to package/AUTHORS as part of this PR.

@codecov

codecov bot commented Dec 12, 2025

Codecov Report

❌ Patch coverage is 96.87500% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 92.73%. Comparing base (bbcef1b) to head (4e30f8c).

Files with missing lines Patch % Lines
package/MDAnalysis/analysis/rmsf_residue.py 96.87% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #5176   +/-   ##
========================================
  Coverage    92.72%   92.73%           
========================================
  Files          180      181    +1     
  Lines        22472    22504   +32     
  Branches      3188     3191    +3     
========================================
+ Hits         20837    20868   +31     
  Misses        1177     1177           
- Partials       458      459    +1     


@Nitin-Prata
Author

Hi!
All CI checks except mypy have passed.
It looks like the mypy job was cancelled due to a timeout and not due to an error in the code.
The module and tests run correctly locally.

Whenever you get a moment, I’d appreciate a review.
Thanks!

Member

@orbeckst orbeckst left a comment

Can you please cite a handful of papers (~5) that require this kind of analysis?

@orbeckst orbeckst mentioned this pull request Dec 12, 2025
@Nitin-Prata
Author

Thanks for the question.

Per-residue flexibility analysis (often reported as per-residue RMSF or equivalent fluctuation profiles) is widely used in MD studies to interpret protein dynamics and function. A few representative examples:

Karplus & McCammon (2002), Nature Structural Biology – foundational review establishing the importance of internal motions and residue-level flexibility in biomolecular simulations.

Hollingsworth & Dror (2018), Neuron – discusses analysis of MD trajectories using residue-wise flexibility profiles to relate dynamics to function.

Hospital et al. (2015), Advances and Applications in Bioinformatics and Chemistry – highlights ensemble-based analysis and residue-level dynamic properties derived from MD simulations.

Grant et al. (2006), Bioinformatics – Bio3D toolkit explicitly performs per-residue fluctuation analyses, demonstrating this as a standard and useful abstraction.

These works illustrate that residue-level RMSF (or closely related metrics) is a common and meaningful analysis output. The goal of this PR is to make this standard analysis directly accessible within MDAnalysis, complementing the existing atom-level RMSF.

@IAlibay
Member

IAlibay commented Dec 13, 2025

@Nitin-Prata please take no offense in my question, but the way your response is written makes me think that you are using an LLM. Can you please confirm if you are using one?

@Nitin-Prata
Author

@IAlibay I have performed a literature search independently and chose these papers based on my understanding of the use of per-residue RMSF in MD analysis. I do use tooling to help with wording the response, but the references and technical content are mine.

@IAlibay
Member

IAlibay commented Dec 13, 2025

@Nitin-Prata

The first 3 papers you cite are review papers that broadly describe molecular dynamics but, as far as I can tell, offer no details about this specific per residue RMSF method and its applications.

Re Bio3D, whilst I am aware that Bio3D offers a groupby feature, the paper you link to also does not describe such an approach.

@IAlibay
Member

IAlibay commented Dec 13, 2025

At this moment, before we proceed, I would ask you to try to provide specific example application cases.

Member

@IAlibay IAlibay left a comment

Blocking until more info is available.

@Nitin-Prata
Author

Amadei et al., "Essential dynamics of proteins", Proteins (1993).

In the Methods (page 3), the authors define the covariance matrix of atomic positions as the time average of the products of deviations from the mean coordinates (squared deviations on the diagonal), where x_i(t) are the position coordinates and ⟨x_i⟩ is the average of coordinate i over all configurations.

This is precisely the fluctuation in atomic positions from which the RMSF is calculated: RMSF = sqrt(⟨(x − ⟨x⟩)²⟩).
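
For reference, the relation between the two quantities can be written out as follows. The notation is assumed here for illustration and is not copied from the paper: C is the covariance matrix over Cartesian coordinates, and the RMSF of atom a is the square root of the sum of the three diagonal elements belonging to that atom.

```latex
C_{ij} = \left\langle \left(x_i - \langle x_i \rangle\right)
                      \left(x_j - \langle x_j \rangle\right) \right\rangle,
\qquad
\mathrm{RMSF}_a = \sqrt{\sum_{i \in \{a_x,\, a_y,\, a_z\}} C_{ii}}
                = \sqrt{\left\langle \left\lVert \mathbf{r}_a
                  - \langle \mathbf{r}_a \rangle \right\rVert^{2} \right\rangle}
```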

These calculated fluctuations are then interpreted at the level of specific residues in the Results (page 8):

Whereas the catalytic site residues Glu-35 and Asp-52 are rigid, residues involved in substrate binding, namely 59, 62, 63, 101, and 107, show extensive flexibility. This is a direct example of residue-level position fluctuation analysis derived from MD trajectories.

@orbeckst
Member

I am replying to a comment that is not present anymore

Nitin-Prata left a comment (MDAnalysis/mdanalysis#5176)
I agree that a literature justification is not necessary here. This is because my intention here is not to assert a formally defined, literature-driven approach but merely a convenience analysis that sums up the existing RMSF at a residue level, following the existing atom-level RMSF API.

to make very clear why I asked my original question and why it is important: Maintaining MDAnalysis is difficult and very labor-intensive. Any piece of code we add makes future maintenance more difficult; this is called technical debt. Therefore, we have to carefully weigh the advantages to our users against the technical debt that we are incurring. If a newly proposed feature is not something that our users would likely want to see, then we will not include it.

We appreciate that you showed the initiative and started with code to discuss this feature. It is, however, more common that people first raise an issue for a desired feature where we can discuss the need for it. If we then come to the conclusion that we do not want the feature implemented then nobody has spent time coding.

With this PR we are at the stage where we are trying to assess if this is a feature that we want to include and we are asking you to convince us that there is enough interest (e.g., because it's a widely used method).

@Nitin-Prata
Author

Nitin-Prata commented Dec 15, 2025

@orbeckst Thank you for clarifying; that puts everything in perspective.

I see your point concerning technical debt and agree with prioritizing new features for which there is demand from users.

To clarify my intent: RMSFResidue is not intended to offer another way of doing the analysis, but rather a small convenience wrapper around a pattern that some users already appear to apply on top of atom-level RMSF.

I think this topic would have been better discussed in an issue or a discussion before code was written. If you would prefer that I first gather feedback from the community before returning to this PR, I am happy to follow whatever process the project prefers.
