-
Notifications
You must be signed in to change notification settings - Fork 131
Add run_scoped method to run closures and conditionally perform partial resets
#291
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
fitzgen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR!
I do want to add a similar API to bumpalo, but I think that the exact details here are not what we want. I think the API as proposed here is not quite powerful enough to satisfy most uses cases that people have for checkpoints, particularly the ability to do stuff like the following:
let mut bump = Bump::new();
let x = bump.alloc(5);
let checkpoint = bump.checkpoint();
let y = bump.alloc(10);
foo(x, y); // okay
drop(checkpoint); // or `bump.reset_to_checkpoint(checkpoint)` or whatever...
bar(x); // still okay
// bar(y); // NOT OKAY, should cause borrow checker errorsThe API in this PR will not allow x to continue to be used across sub-arenas/checkpoints. And to be fair, I don't the exact pseudocode above can be implemented by any safe+sound Rust API because without changing some API details we have the conflicting requirements that (a) the bump has live shared borrows from earlier allocations and (b) we must have exclusive access at reset points to act as a barrier and ensure that none of the allocations within the nested scope/checkpoint remain live. The only way I am aware of resolving this is to separate the underlying storage and the allocator/checkpoint/scope so that the allocator/checkpoint/scope has a lifetime and borrow of the storage and then you can make sub-allocators/checkpoints/scopes with sub-lifetimes. And each allocation would have the allocator's/checkpoint's/scope's lifetime, not the lifetime of the underlying storage.
There is some existing discussion of this feature in #105, and a proposed API, although it would be a breaking change. It might be possible to come up with something that is semver compatible though, even if not quite as nice API-wise. Anyways, I suggest reading up on that thread and grokking the API sketch in there before continuing any further down this line of investigation.
…fetimes for temporary allocations that may be reset
|
Thanks for the feedback! I made checkpoint and the associated methods private, then added a The API could definitely be polished up a bit, maybe adding a As for functionality, the |
run_scoped method to run closures and conditonally perform partial resets
run_scoped method to run closures and conditonally perform partial resetsrun_scoped method to run closures and conditionally perform partial resets
fitzgen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your patience, I've been traveling a bunch lately and haven't had time to sit down with this PR until just now.
I think the maybe-reset-maybe-not behavior is fairly niche, and not an API I want to support long-term. That said, I think it would be a failing of this crate if that functionality were impossible to build on top of bumpalo's public APIs.
So, what do you think of the following plan?
- Add a
pub struct RawCheckpoint { ... }type - Add
pub fn Bump::raw_checkpoint(&self) -> RawCheckpointandpub unsafe fn Bump::reset_to_raw_checkpoint(&self, checkpoint: RawCheckpoint)methods - Add
pub fn Bump::with_scope<F, T>(&self, f: F) -> T where F: for<'a> FnOnce(&'a Bump) -> T
The first two items allow you to implement run_scoped with its dynamic-reset behavior on top of this crate, the third item is the typical scoped usage described in #105 and would be implemented on top of the first two as well.
Aside: I think you might be able to improve the ergonomics of your run_scoped method a little bit by defining a custom type with a default type parameter, instead of using Option, so that you don't ever have to type out Option::<()>::None:
pub enum Scope<T = ()> {
Reset,
Advance(T),
}And then Scope::Reset shouldn't need an explicit turbofish.
|
No problem, I think that approach makes sense, sounds like the first 2 steps are pretty close to my original PR but with the reset function marked as unsafe - is that right? If so I'll update the PR to match, I won't look at implementing Thanks for the tip on |
Yes, and with the reset method taking |
fitzgen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Some comments inline below.
Can you add a quickcheck test that uses this new API? See tests/all/quickchecks.rs. I'm thinking it should look something like this:
#[derive(Arbitrary)]
#[cfg(feature = "allocator_api")]
enum RawCheckpointOp {
TakeCheckpoint,
ResetToCheckpoint,
DropCheckpoint,
Alloc(u8, usize),
Realloc(u8, usize),
Dealloc,
Reset,
}
#[cfg(feature = "allocator_api")]
fn stress_raw_checkpoints(ops: Vec<RawCheckpointOp>) {
use std::alloc::{Allocator, Layout};
const MAX_SIZE: usize = 100;
const MAX_OPS: usize = 100;
let mut bump = Bump::new();
let mut live_allocs: Vec<(u8, NonNull<[u8]>, Layout)> = Vec::new();
let mut checkpoints: Vec<(usize, RawCheckpoint)> = Vec::new();
let check_alloc = |expected_byte: u8, alloc: NonNull<[u8]>| {
let alloc = unsafe { alloc.as_ref() };
for actual_byte in alloc {
assert_eq!(actual_byte, expected_byte);
}
};
ops.truncate(MAX_OPS);
for op in ops {
use RawCheckpointOp::*;
match op {
TakeCheckpoint => {
checkpoints.push(bump.raw_checkpoint());
}
ResetToCheckpoint => {
if let Some((len, checkpoint)) = checkpoints.pop() {
for (byte, alloc, _layout) in live_allocs.get(len..) {
check_alloc(byte, alloc);
}
live_allocs.truncate(len);
unsafe {
bump.reset_to_raw_checkpoint(checkpoint);
}
}
}
DropCheckpoint => {
let _ = checkpoints.pop();
}
Alloc(byte, size) => {
let size = std::cmp::min(size, MAX_SIZE);
let layout = Layout::array::<u8>(size).unwrap();
let ptr = bump.allocate(layout).unwrap();
unsafe {
ptr.cast::<u8>().write_bytes(byte, ptr.len());
}
live_allocs.push((byte, ptr, layout));
}
Realloc(new_byte, new_size) => {
if let Some((old_byte, old_ptr, old_layout)) = live_allocs.pop() {
check_alloc(old_byte, old_ptr);
let new_size = std::cmp::min(new_size, MAX_SIZE);
let new_layout = Layout::array::<u8>(new_size).unwrap();
let new_ptr = if new_size > old_layout.size() {
unsafe { bump.grow(old_ptr, old_layout, new_layout).unwrap() }
} else {
unsafe { bump.shrink(old_ptr, old_layout, new_layout).unwrap() }
};
unsafe {
new_ptr.cast::<u8>().write_bytes(new_byte, new_ptr.len());
}
live_allocs.push((new_byte, new_ptr, new_layout));
}
}
Dealloc => {
if let Some((byte, ptr, layout)) = live_allocs.pop() {
check_alloc(byte, ptr);
unsafe {
bump.deallocate(ptr, layout);
}
}
}
Reset => {
for (byte, ptr, _layout) in live_allocs.drain(..) {
check_alloc(byte, ptr);
}
checkpoints.clear();
bump.reset();
}
}
}
// Reset any extant checkpoints, checking its in-scope allocations.
for (len, checkpoint) in checkpoints {
for (byte, alloc, _layout) in live_allocs.get(len..) {
check_alloc(byte, alloc);
}
live_allocs.truncate(len);
unsafe {
bump.reset_to_raw_checkpoint(checkpoint);
}
}
// Check any extant allocations.
for (byte, ptr, _layout) in live_allocs {
check_alloc(byte, ptr);
}
}This should cover the realloc-to-before-the-checkpoint scenario I mention inline below, FWIW.
Thanks again!
| pub struct RawCheckpoint { | ||
| chunk_data: NonNull<u8>, | ||
| size: usize, | ||
| ptr: NonNull<u8>, | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a benefit to storing these chunk_data and size fields instead of just a chunk: NonNull<ChunkFooter> field? The latter will result in smaller RawCheckpoints and less data to juggle/shepherd around and less things to check in RawCheckpoint::matches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense - I'll update it
src/lib.rs
Outdated
| let chunk = unsafe { self.current_chunk_footer.get().as_ref() }; | ||
| (chunk.data, chunk.layout.size(), chunk.ptr.get()) | ||
| }; | ||
| /// This will never free chunks, it will only change the pointer position of the current chunk. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method should free any chunks beyond the checkpoint's chunk, or else the following usage pattern could "leak" when it really should not:
let mut bump = Bump::new();
loop {
let checkpoint = bump.raw_checkpoint();
foo(&bump);
unsafe {
bump.reset_to_raw_checkpoint(checkpoint);
}
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's some nuance for this one - if checkpoints are used in a hot loop and trigger a new chunk allocation, then future iterations are also likely to trigger an allocation as well if the new chunk is freed.
Maybe a middle ground would be appropriate? Only keep the largest new chunk, if there have been multiple growths then free the intermediate ones since we know they won't be used for anything.
A pathological example of this would be creating a checkpoint for a Bump with no capacity since that would always trigger a new allocation after resetting to the checkpoint
src/lib.rs
Outdated
| /// # Safety | ||
| /// **This will invalidate references in a way that cannot be validated by the borrow checker!** | ||
| /// | ||
| /// Any allocation made between checkpoint creation and this function being called may be overwritten | ||
| /// by future allocations to this `Bump`. It is your responsibility to ensure that these allocations | ||
| /// are never reused. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can avoid talking about how existing allocations made after the checkpoint might be overwritten, that is an implementation detail, and instead just focus on what the user must to do maintain safety (which is to never to access any allocations made between the checkpoint's creation and this call to reset_to_raw_checkpoint again).
Also we should document that you must not have called Bump::reset between creating the checkpoint and calling reset_to_raw_checkpoint.
Also we should document that you must reset to checkpoints in a LIFO-esque manner (LIFO but you're allowed to drop checkpoints rather than reset to them: that is, once you reset to a checkpoint A you can never reset to another checkpoint B that was created after A but before reset_to_raw_checkpoint(A)).
Also we should note that the given checkpoint must be associated with this Bump (i.e. returned by a call to raw_checkpoint on this Bump). And we should also debug_assert! that property by walking the chunk linked list and checking that we find this checkpoint's chunk. And we should also debug assert that this checkpoint's ptr is (a) within its chunk and (b) is greater than or equal to the current ptr.
Hmm, but now that I think about it, I think (b) might not actually hold 100% of the time. Consider the sequence:
- create
Bump - allocate a "relatively big" allocation
- take a checkpoint
- realloc the allocation such that it shrinks and we actually back up our bump's pointer
- reset to the checkpoint
In this case, the bump's pointer could actually now be greater than the checkpoint's pointer. So I think we still want to debug-assert (a) but instead of (b) we want to only update the bump's pointer to the checkpoint's pointer when the checkpoint's pointer is greater than the bump's pointer.
Let's add a test that covers this case as well.
Finally, can you add an example to this doc comment? Something like this:
let bump = Bump::new();
let before_a = bump.raw_checkpoint();
let a = bump.alloc('a');
assert_eq!(*a, 'a'); // okay
{
let before_b = bump.raw_checkpoint();
let b = bump.alloc('b');
assert_eq!(*a, 'a'); // okay
assert_eq!(*b, 'b'); // okay
{
let before_c = bump.raw_checkpoint();
let c = bump.alloc('c');
assert_eq!(*a, 'a'); // okay
assert_eq!(*b, 'b'); // okay
assert_eq!(*c, 'c'); // okay
unsafe {
bump.reset_to_raw_checkpoint(before_c);
}
assert_eq!(*a, 'a'); // still okay
assert_eq!(*b, 'b'); // still okay
// Not okay! This would be a use-after-free bug!
//
// assert_eq!(*c, 'c');
}
unsafe {
bump.reset_to_raw_checkpoint(before_b);
}
assert_eq!(*a, 'a'); // still okay
// Not okay! This would be a use-after-free bug!
//
// assert_eq!(*b, 'b');
}
unsafe {
bump.reset_to_raw_checkpoint(before_a);
}
// Not okay! This would be a use-after-free bug!
//
// assert_eq!(*a, 'a');…ng internal fields to compare
…_list` to fit checkpoint use case, update docs
|
Have updated to address some of the feedback, still need to add tests though - not familiar with quickcheck so will need to figure that out. Probably won't have time to look at it until the weekend |
This PR adds a "CheckPoint" struct that can be used to save a position for a Bump that can be later reset to, allowing for partial resets of the allocator.
I have a use case where I build a collection from many complex structs that use a bump allocator, however these structs must pass a filter before being added to the collection, In cases where very few events make it through the filter I found that the bump allocator would grow to be massive due to leaked allocations from rejected structs.
After testing using a second bump allocator as scratch space that resolved the memory leak issue however caused a ~20% performance hit due to the extra memory copies, this PR solves the problem by taking a checkpoint before building the complex struct, then performing a partial reset if the struct is rejected.
Not sure if this kind of conditional logic is used elsewhere but figured I'd make a PR anyway in case it's useful, happy to update docs etc if needed.