Using S3 for this project is shockingly expensive: so far my bill has been something like $50/month in S3 costs, and I think that's because my workload isn't what S3 is designed for. The actual serving of the website is somewhat secondary.
The most common thing I do is search the logs for ICEs:

```
rg "thread 'rustc' panicked|internal compiler error"
```

or for missing LLVM intrinsic shims:

```
rg -o "\`llvm\..*?\`" | cut -d: -f2 | counts
```
Of course S3 is hopeless at supporting these, so I've been running `aws s3 sync` on `miri/raw` whenever I want to search. Even if I have most of the files locally, it's still a lot of requests. And if I update the rendering code, then I have to download every raw file and upload a new HTML file for every version of every crate, so each sync is something like $20.
The reason I've been putting this in S3 is that I just assumed this would be about free (it's not) and that there wasn't a better option (there is).
S3 storage is very cheap per GB but costly per request. I think EBS is a better deal: if I just feed the raw data through xz, the whole backend currently fits in about 2.2 GB. So we could do about 12 GB of gp3 storage for roughly $1/month (gp3 runs around $0.08/GB-month), which would leave a lot of room to spare.
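As a rough sketch of what compressing the raw output could look like (this is an assumption, not the project's actual code: it uses the `xz2` crate and made-up paths):

```rust
// Hypothetical sketch: compress a raw log with xz before storing/uploading it.
// Assumes the `xz2` crate (Rust bindings to liblzma); the paths are made up.
use std::fs::File;
use std::io;

use xz2::write::XzEncoder;

fn compress_raw_log(input: &str, output: &str) -> io::Result<u64> {
    let mut reader = File::open(input)?;
    // Level 6 is a middle-of-the-road default; the raw logs are text and
    // compress very well, which is why everything fits in a few GB.
    let mut encoder = XzEncoder::new(File::create(output)?, 6);
    let bytes_in = io::copy(&mut reader, &mut encoder)?;
    encoder.finish()?; // flush the xz stream trailer
    Ok(bytes_in)
}

fn main() -> io::Result<()> {
    let copied = compress_raw_log("raw/some-crate-1.0.0", "raw/some-crate-1.0.0.xz")?;
    println!("compressed {copied} bytes");
    Ok(())
}
```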
- Rebuild rendering code so it doesn't OOM a t4g.nano instance
- Put together a little server that can serve requests and render the HTML on the fly (a rough sketch of this, including the upload/render locking, follows the list)
- Update the `sync` command to not render the HTML anymore, and just update the crate list and landing page
- Change `Client` to upload to an instance over SCP instead of to a bucket (see the upload sketch after the list)
- Write Terraform for provisioning the instance, storage, and hooking up the API gateway and CloudFront distribution
- Switch to uploading compressed raw output
- Prevent uploads from colliding with an ongoing rendering (the server sketch below shows one simple way, via a read/write lock)
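For the on-the-fly rendering server and the collision point, here's a minimal std-only sketch. Everything in it is an assumption: the real thing would presumably use a proper HTTP library, and `render_crate_page` is a hypothetical stand-in for whatever the rendering code actually exposes.

```rust
// Minimal sketch of an on-the-fly rendering server, standard library only.
// An RwLock guards the raw-data directory: rendering takes a read lock,
// uploads would take the write lock, so the two never overlap.
use std::io::{Read, Write};
use std::net::TcpListener;
use std::sync::RwLock;

static DATA_LOCK: RwLock<()> = RwLock::new(());

// Hypothetical stand-in for the real rendering code.
fn render_crate_page(path: &str) -> String {
    format!("<html><body>rendered logs for {path}</body></html>")
}

fn main() -> std::io::Result<()> {
    let listener = TcpListener::bind("0.0.0.0:8080")?;
    for stream in listener.incoming() {
        let mut stream = stream?;
        // Read just enough of the request to pull out the path.
        let mut buf = [0u8; 4096];
        let n = stream.read(&mut buf)?;
        let request = String::from_utf8_lossy(&buf[..n]);
        let path = request
            .lines()
            .next()
            .and_then(|line| line.split_whitespace().nth(1))
            .unwrap_or("/")
            .to_owned();

        // Hold the read lock only while rendering; an upload holding the
        // write lock blocks us until it finishes, and vice versa.
        let body = {
            let _guard = DATA_LOCK.read().unwrap();
            render_crate_page(&path)
        };

        let response = format!(
            "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nContent-Length: {}\r\n\r\n{}",
            body.len(),
            body
        );
        stream.write_all(response.as_bytes())?;
    }
    Ok(())
}
```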
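And for the `Client` change, one way to do the SCP upload is just to shell out to the `scp` binary; the host, destination path, and the shell-out approach itself are all assumptions, not how the project actually does it.

```rust
// Hypothetical sketch: upload a compressed raw log to the instance over SCP
// by invoking `scp`. The destination is a placeholder, not a real host.
use std::io::{Error, ErrorKind};
use std::process::Command;

fn upload_raw_log(local_path: &str) -> std::io::Result<()> {
    let status = Command::new("scp")
        .arg(local_path)
        // Placeholder destination; the real target would come from config.
        .arg("uploader@logs.example.com:/data/raw/")
        .status()?;
    if status.success() {
        Ok(())
    } else {
        Err(Error::new(ErrorKind::Other, format!("scp exited with {status}")))
    }
}
```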