-
Notifications
You must be signed in to change notification settings - Fork 35
Add network info cleanup and missing file retry logic #706
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- Clean up network info files older than 1 hour when monitoring starts - Search for new SSH process after 5 consecutive file read failures
b43d8f1 to
dba73e1
Compare
| if (!file.endsWith(".json")) { | ||
| continue; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If each filename corresponds to a PID, we should also validate that the PID is no longer active.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oooh I like this actually, would mean that we might not need the 1 hour threshold 🤔 ?
I suppose the OS can re-use a PID though so it's not entirely accurate. We can do something like isPidAlive(pid) || isAnHourOld so it's more aggressive though not sure if it's super worth it for a PID check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. I guess in the worst case scenario the process would still have the file handle (although deleted)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the process has a file lock then we might error as EBUSY (or maybe PERM depending on the OS) but in any case that wouldn't cause any issues + we only delete it if it's been untouched in an hour (this is usually updated every ~3 seconds). IMO I do not think a PID check here is worthwhile
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly (have not reviewed the code, just based off the description) this would delete logs from active connections? Would make debugging any issue that happens after an hour difficult to diagnose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The log file issue can be handled here #377 (rotation with a maximum number of files or maybe date-based etc)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahaha this is what I get for not fully reading. But would this break the connection info on the bottom bar for active connections?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ohhh wait are you checking last modified time, not created?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah you are lol next time I will just read the code haha. All good from me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hahaha understandable, I have another PR you can read as well 🙈
Summary
<pid>.jsonfiles in the network info directory on monitor start (files older than 1 hour)Details
Network info files (
<basePath>/net/<pid>.json) are created by the Coder CLI and persist after processes terminate. This adds automatic cleanup of files not modified in the last hour.Previously, if the network info file was missing (e.g., deleted or never created), the monitor would log errors indefinitely without attempting to find a new process. Now it behaves consistently with stale file detection - after 5 failures (with delays), it searches for a new SSH process.
Closes #51