batsim issues
https://gitlab.inria.fr/batsim/batsim/-/issues
2018-12-26T12:42:10+01:00

https://gitlab.inria.fr/batsim/batsim/-/issues/80
Making tests run in parallel
2018-12-26T12:42:10+01:00
MERCIER Michael
Add a function to select a free port automatically so the tests can run in parallel.
It would speed up the tests.
It would also ensure that multiple Batsim instances can run on the same machine without collisions.
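A common way to pick a free port is to bind a socket to port 0 and let the OS choose. This is a minimal sketch in Python, assuming a TCP endpoint on localhost; `find_free_port` and the `endpoint` string are illustrative names, not existing Batsim code.

```python
import socket


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a TCP port that is currently unused on `host`."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Binding to port 0 makes the OS pick any free port.
        sock.bind((host, 0))
        return sock.getsockname()[1]


# Hypothetical usage: give each test instance its own endpoint.
port = find_free_port()
endpoint = f"tcp://localhost:{port}"
```

The port is released before the caller uses it, so another process could still grab it in between. For tests this window is usually small enough, but a retry on bind failure would make it more robust.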
I once wrote something to do this, which we can use:
https://github.com/oar-team/kameleon/blob/380ca697bad28b120e7df65c6262402f073d8107/contrib/kameleon_bashrc.sh#L208

https://gitlab.inria.fr/batsim/batsim/-/issues/79
Better ci
2018-11-09T16:13:40+01:00
MERCIER Michael
- [x] Use a lighter docker Nix based image for the CI.
This needed some hacks on the base Nix image, see https://github.com/LnL7/nix-docker/pull/24
This branch can be used to build the `oarteam/nix` and `oarteam/batsim_ci` images:
https://github.com/mickours/nix-docker/tree/mickours
- [ ] Use the cache of gitlab CI for the store.
**NOT POSSIBLE** because /nix is out of the project scope
- [X] Use the scripts in ./ci inside the Nix expressions for tests.
**Done** in kapack with e74bdf3 and in batsim with 9ed97d4
- [ ] use the scheduling capabilities of the gitlab CI to test with upstream Simgrid regularly
**DEFERRED** This is discussed in https://gitlab.inria.fr/batsim/batsim/issues/90
- [x] check that Batsim compiles without warning on both clang and gcc.

https://gitlab.inria.fr/batsim/batsim/-/issues/78
Remove NOP
2018-10-18T18:20:35+02:00
MERCIER Michael
It does not exist anymore because it was replaced by `CALL_ME_LATER`/`REQUESTED_CALL`, if I understand correctly.
So we need to remove it from the doc.
Also, interestingly, we have kept the old protocol letter mapping (not used anymore), which also contains NOP:
https://gitlab.inria.fr/batsim/batsim/blob/master/src/network.hpp
We need to remove it too!
Batsim 3.0

https://gitlab.inria.fr/batsim/batsim/-/issues/77
Sphinx documentation?
2018-10-16T18:12:38+02:00
Millian Poquet
What about transforming our documentation to a readthedocs-like one?
I didn't like rst a lot at first sight but it has very interesting features.
1. Include the content of files.
This is amazing to create up-to-date and CI-proof tutorials.
2. Clear navigation between parts.
Current architecture is okay, but moving from one file to another is hard.
It is also quite easy to miss a documentation part with the current markdown doc.

https://gitlab.inria.fr/batsim/batsim/-/issues/75
Separate Simgrid process tracing and simgrid resource usage tracing in two options
2018-11-17T14:35:43+01:00
MERCIER Michael
Both are currently under the same option; they should be two separate options.
Also, the SimGrid process tracing is buggy when a process is killed, see https://github.com/simgrid/simgrid/issues/285

https://gitlab.inria.fr/batsim/batsim/-/issues/74
add a "no more jobs in workload" event
2018-08-22T17:19:13+02:00
MERCIER Michael
The schedulers that are doing dynamic submission have to notify batsim that the submission is finished.
However, without knowing whether there are still jobs in the workload(s) that will be submitted in the future, and whether there is a pause in the submissions, the scheduler is unable to make a decision. That's why a "no more jobs in workload" event is required.

https://gitlab.inria.fr/batsim/batsim/-/issues/73
Remove external code from batsim repo
2018-08-29T22:17:40+02:00
Millian Poquet
pugixml and docopt-cpp are already nixed, we should use these dependencies instead of including their code in Batsim.

https://gitlab.inria.fr/batsim/batsim/-/issues/72
msg_par_hg_tot jobs with only one resource and communication segfault
2018-08-31T14:09:39+02:00
MERCIER Michael
To reproduce:
```json
{
  "nb_res": 4,
  "jobs": [
    {"id": 1, "subtime": 0, "res": 1, "profile": "msg_tot_test"}
  ],
  "profiles": {
    "msg_tot_test": {
      "type": "msg_par_hg_tot",
      "cpu": 1e10,
      "com": 1e7
    }
  }
}
```

https://gitlab.inria.fr/batsim/batsim/-/issues/71
allocating different nb of resources than in "res" of a job does not crash for msg_par_hg
2018-07-30T16:04:48+02:00
MOMMESSIN Clement
Batsim permits a scheduler to allocate fewer resources to a job than requested at submission (the 'res' field of the job description).
This appears when the job profile is "msg_par_hg" (maybe also for "msg_par_hg_pfs" depending on what Mickours is doing with this profile).
Expected behavior: when the scheduler allocates fewer resources than requested in the job description, Batsim should complain upon handling the "EXECUTE_JOB" event.
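A minimal sketch of such a check, written in Python for brevity (Batsim itself is C++). `check_allocation`, its parameters and the job identifier used below are hypothetical and only illustrate the expected behavior; they are not Batsim's actual code.

```python
from typing import Sequence


def check_allocation(job_id: str, requested: int, allocated: Sequence[int]) -> None:
    """Reject an allocation whose size differs from the job's 'res' field."""
    if len(allocated) != requested:
        raise ValueError(
            f"Job {job_id} requested {requested} resources "
            f"but the scheduler allocated {len(allocated)}"
        )


# A matching allocation passes silently.
check_allocation("w0!1", requested=2, allocated=[0, 1])
```

With the modified fillerSched described in this issue, a call such as `check_allocation("w0!1", requested=4, allocated=[0])` would raise instead of letting the job run.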
You can try it with the master branch of batsim, small_platform.xml and the workload issue32.json.
Scheduler: fillerSched of pybatsim, with line 40
``res = self.availableResources[:nb_res_req]`` replaced by ``res = self.availableResources[:1]`` to force the allocation on only one resource.

https://gitlab.inria.fr/batsim/batsim/-/issues/70
Batexec do not manage properly jobs that are exeeding the number of available resources
2022-01-19T20:16:28+01:00
MERCIER Michael
When a job in the workload exceeds the number of available resources, batexec tries to allocate non-existent machines here:
https://gitlab.inria.fr/batsim/batsim/blob/master/src/job_submitter.cpp#L407
Leading to an inconsistent message:
```Cannot get machine 4: it does not exist```
Adding a proper assert to check that each job fits in the total number of available machines would be great.

https://gitlab.inria.fr/batsim/batsim/-/issues/69
Old Time idependent traces format cause segfault in Simgrid
2018-11-30T08:45:19+01:00
MERCIER Michael
Maybe document this in simgrid

https://gitlab.inria.fr/batsim/batsim/-/issues/68
path resolution in SMPI trace lead segfault when batsim is not run alongside the traces
2018-06-19T10:41:10+02:00
MERCIER Michael
To reproduce:
Run batsim with SMPI profiles from a location other than that of the .txt file that links to the traces.
Millian Poquet

https://gitlab.inria.fr/batsim/batsim/-/issues/67
Simplify the Batsim protocol
2018-11-23T10:38:22+01:00
MERCIER Michael
Currently some messages of the batsim protocol have optional fields:
1) `SUBMIT_JOB` allows the scheduler to give an optional profile, which is redundant with the `SUBMIT_PROFILE` message. Maybe just removing the optional profile is enough.
2) `JOB_SUBMITTED` has 3 different ways to give information to the scheduler.
Also, I have already reached the limits of forwarding information in messages with the `JOB_KILLED` message: the progress given for a composed profile is quite simple if the scheduler knows the profiles, but if it does not, the Batsim code that manages this becomes more and more complex.
So, I suggest using the `SIMULATION_BEGINS` message to give as much information as possible (it already does that with the config, the resources, the workloads, and I just added the profiles). Now we can add the jobs, so that we can simplify the `JOB_SUBMITTED` message.
It implies storing this information in the scheduler from the beginning, so the memory consumption would be higher of course. But I think that we kept redis exactly for this purpose: for when we reach scaling limitations.
Batsim 3.0

https://gitlab.inria.fr/batsim/batsim/-/issues/66
Crash when "submission_finished" not sent by the scheduler
2018-05-11T11:30:00+02:00
MOMMESSIN Clement
(Outdated)
Flow of events:
* Scheduler sends "call_me_later" for the time 100
* All jobs have finished at time 50
* The "call_me_later" event is kind of forgotten by Batsim and the simulation_ends event is sent to the scheduler, causing the simulation to stop.
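The flow above can be replayed against a toy termination check that also counts pending calls. This is only a sketch in Python: `ServerState` and its counters are hypothetical names, and Batsim's real server state (in C++) is organized differently.

```python
from dataclasses import dataclass


@dataclass
class ServerState:
    """Hypothetical counters used to decide whether the simulation is over."""
    nb_running_jobs: int = 0
    nb_active_submitters: int = 0
    nb_pending_calls: int = 0  # call_me_later received, requested_call not yet sent


def is_simulation_finished(state: ServerState) -> bool:
    """Finished only when nothing can produce a future event."""
    return (
        state.nb_running_jobs == 0
        and state.nb_active_submitters == 0
        and state.nb_pending_calls == 0
    )


# Time 0: one job is running and the scheduler sends call_me_later for time 100.
state = ServerState(nb_running_jobs=1, nb_pending_calls=1)
# Time 50: all jobs have finished, but the call is still pending.
state.nb_running_jobs = 0
finished_at_50 = is_simulation_finished(state)
# Time 100: requested_call is sent to the scheduler.
state.nb_pending_calls -= 1
finished_at_100 = is_simulation_finished(state)
```

In this sketch the simulation is not considered finished at time 50, only once the requested call has been delivered at time 100.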
I think that it simply lacks a condition in is_simulation_finished (server.cpp) with a count of the number of "call_me_later" requests and "requested_call" (or just a count of the number of "sleeper" processes).

https://gitlab.inria.fr/batsim/batsim/-/issues/65
Missing test: dynamic submissions without submission_finished
2021-08-20T14:07:16+02:00
Millian Poquet
According to @cmommess, batsim may terminate the simulation while submission_finished has not been sent by the scheduler.
Can you provide a MWE @cmommess?

https://gitlab.inria.fr/batsim/batsim/-/issues/64
Add parallel job composition
2022-01-20T06:26:32+01:00
MERCIER Michael
Only the sequence composition (a list of tasks that are executed one after the other) is implemented, but we lack the possibility to compose tasks in parallel.
Making the composed profile support this would be great.
5.0.0
Millian Poquet

https://gitlab.inria.fr/batsim/batsim/-/issues/63
Tests: use robin rather than exec*
2018-11-30T00:37:25+01:00
Millian Poquet
[Robin](https://gitlab.inria.fr/batsim/batexpe) is now used to run experiments.
We should think about using it to run Batsim tests.
This would imply to:
- [x] call robin, either by:
  - [x] generating many robin input files
  - [ ] calling robin from its CLI from scripts with parameters
- [x] write check scripts in specific files (they are currently exec* postcommands)
- [x] write wrapper scripts (that call robin then check the result)
## Flatten tests?
It could be the occasion to flatten Batsim tests.
Currently, a test runs (a lot of) simulation instances. For each instance, it makes sure the simulation ends correctly [and checks simulation output or not depending on the test].
In a flattened architecture, each simulation instance [+ check] would be a specific test.
Flattened tests can still remain modular, as in CMake a test is just a command to launch.
For example, we could create a script for each current test, and create myriads of tests by calling such scripts multiple times with different parameters.
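As a rough illustration, flattened test instances could be enumerated from a parameter grid, deriving a distinct name for each instance from its parameters. This is a sketch in Python; `flatten_tests`, the `energy` test name and the parameter values are all made up for the example.

```python
from itertools import product
from typing import Dict, Iterator, List, Tuple


def flatten_tests(
    test_name: str, parameters: Dict[str, List[str]]
) -> Iterator[Tuple[str, Dict[str, str]]]:
    """Yield one (unique name, parameter set) pair per simulation instance."""
    keys = sorted(parameters)  # stable order, so names are reproducible
    for values in product(*(parameters[key] for key in keys)):
        suffix = "-".join(f"{key}_{value}" for key, value in zip(keys, values))
        yield f"{test_name}-{suffix}", dict(zip(keys, values))


# Hypothetical test with two parameters, hence four flattened instances.
instances = list(
    flatten_tests(
        "energy",
        {"platform": ["small", "cluster"], "workload": ["one_job", "two_jobs"]},
    )
)
names = [name for name, _ in instances]
```

Each generated name could then be registered as a separate test whose command calls the per-test script with the matching parameters.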
Flattening pros:
- Very easy to determine which simulation instance fails.
Currently we have to find it in the execN log (e.g., ``read_csv('instances_info.csv') %>% filter(status=='skipped')``).
- Easier to debug, as determining which instance fails and reexecuting it is easier.
Flattening cons:
- Each instance would require a unique name, so we need some caution when generating them.
- Will generate some CMake noise (but we could use CMake loops to avoid most of it)

Millian Poquet

https://gitlab.inria.fr/batsim/batsim/-/issues/62
CMake : update SimGrid-related modules
2018-07-04T23:30:06+02:00
Millian Poquet
It looks like Batsim uses a non-standard ``FindSimGrid.cmake`` for the moment, which is not desired in the long run.
We should use the standard one (distributed with SimGrid), or push our additions upstream.
Millian Poquet

https://gitlab.inria.fr/batsim/batsim/-/issues/61
Add documentation on profiles
2018-04-10T17:36:14+02:00
MERCIER Michael
We need real documentation on that.
The file to fill in is doc/profiles.md
Batsim 3.0

https://gitlab.inria.fr/batsim/batsim/-/issues/60
Improve forward_profiles for composed profiles
2018-11-30T08:50:14+01:00
Millian Poquet
Currently (as of dc9abe9), forwarded profiles or composed profiles only contain the root profile and not the whole graph.
It would be better to have access to the whole graph.