why3server incorrect when client disconnects early
This is an issue that we encountered after the recent why3 merge in SPARK. Matteo wanted to open a ticket about this but I haven't seen it, so I am opening it myself.
The why3server identifies clients via the file descriptor (just an integer) of the unix domain socket that is used to communicate between the client and server. The integer is potentially reused when a client disconnects and another one connects later. This can lead to messages "leaking" from one client to the next.
The problematic scenario is where a client disconnects while there are still pending requests in the server for this client. Then a new client connects and receives notifications for the requests of the old client. This usually results in a crash in the new client (as it receives notifications for requests it didn't make).
I was half aware of this problem and took care to program why3 in such a way that it always reads all requests from the server before disconnecting. Claude wasn't aware of this and broke this behavior with patch 8f6b6c6b. (At least that's my current understanding).
What do people think of moving away from why3server? The code is very complex and error-prone. To replace its functionality, three things are required:
- a "limiting" wrapper similar to previous "why3cpulimit"
- a way to limit number of provers across multiple instances of why3
- a way to spawn processes from why3 that doesn't introduce small delays (this was the original reason for the why3server, together with (2). When forking from a process that takes a lot of memory, just marking the memory pages as COW takes some small amount of time, which adds up if hundreds of provers are spawned).
For (1), ideally we could reuse a lot of code from the why3server. And from the old why3cpulimit. This is the most complex part IMO, as
For (2), we already have code for Windows and Linux in SPARK that uses semaphores.
For (3), we were hoping that the ocaml spawn library doesn't have this issue (uses vfork).
If that's not an option, we should probably improve the why3server. It's a bit tedious but doable to drop pending requests when a client disconnects.
Yet another option is to live with this and make sure why3 reads the answers for all requests before disconnecting.
Any opinions?