I like the filesystem, but Ike brought up a good point: catsoop would be easier to set up (particularly with modern tools) if content lived in a database (or could be cached to a database) instead.
This would allow multiple small VPSes to serve the web pages by talking to a DB server to fetch cached page information as well as log information.
If the checker's database were also stored on that machine, it would be relatively easy to connect multiple workers to it that could handle the checking (and to scale up/down in response to load).
6.145 has been running with a PostgreSQL backend for a while now (using the code from this branch), and it is mostly going smoothly. But there are a couple of pain points that make me want to investigate other solutions before going all-in on that one; specifically:
- Allowing either Postgres or filesystem storage complicates the codebase quite a lot: a lot of functionality has to be implemented for both PSQL and the filesystem, and the existing codebase assumes filesystem access just about everywhere.
- There is a significant performance difference between the two, particularly when loading pages that read from a lot of logs (PSQL is way slower, even on localhost).
- Full functionality would require rewrites of course material for anyone using cs_local_python_import or opening files, and opening files would be limited to files in the course tree.
These concerns are big enough that I think it's worth exploring other options. I'm going to spend some time working on an alternative implementation that uses the filesystem, in the hope that the issues above go away while the key features of this one are preserved. The important thing, I think, is the ability to have checkers run on separate machines from the web server (and the ability to dynamically add/remove checkers in response to load).
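To make the filesystem idea a bit more concrete, here is a minimal sketch of one way several checkers could share a queue directory without a database. It is not catsoop's code: the directory layout and the names `make_queue`, `enqueue`, and `claim` are hypothetical. It assumes all checkers can see the same directory (for checkers on separate machines that means some shared or network filesystem) and that renames within that filesystem are atomic, which holds on a local POSIX filesystem but would need to be verified for whatever shared storage is actually used.

```python
import json
import os
import uuid


def make_queue(queue_dir):
    """Create the (hypothetical) queue layout: tmp/, queued/, running/."""
    for sub in ("tmp", "queued", "running"):
        os.makedirs(os.path.join(queue_dir, sub), exist_ok=True)


def enqueue(queue_dir, payload):
    """Add a job by writing it to tmp/ and then renaming it into queued/."""
    job_id = uuid.uuid4().hex
    tmp_path = os.path.join(queue_dir, "tmp", job_id)
    with open(tmp_path, "w") as f:
        json.dump(payload, f)
    # The rename publishes the job in one step, so a checker never sees
    # a partially written file in queued/.
    os.rename(tmp_path, os.path.join(queue_dir, "queued", job_id))
    return job_id


def claim(queue_dir, worker_name):
    """Try to claim one queued job; return (job_id, payload) or None."""
    queued = os.path.join(queue_dir, "queued")
    for job_id in os.listdir(queued):
        src = os.path.join(queued, job_id)
        dst = os.path.join(queue_dir, "running", f"{worker_name}.{job_id}")
        try:
            # Only one rename of a given source can succeed, so at most
            # one worker ends up owning each job.
            os.rename(src, dst)
        except FileNotFoundError:
            continue  # another worker claimed this job first
        with open(dst) as f:
            return job_id, json.load(f)
    return None
```

With something like this, adding or removing a checker is just a matter of starting or stopping a process that calls `claim` in a loop. The sketch leaves out a lot (job ordering, recovering jobs from a checker that died mid-run, writing results back), so it only illustrates the claiming step.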