Fix incorrect CPU micro-architecture metadata [#1]

Closed DIDIER Guillaume requested to merge gdidier/reference-repository:gdidier-fixes into master
2 unresolved threads

This PR contains two sets of changes.

The first commit corrects the micro-architecture recorded for Xeon Ex-... v4 and Xeon Ex-... v2 CPUs. All v4 parts are Broadwell, but some were incorrectly classified as Sandy Bridge or Haswell; all v2 parts are Ivy Bridge, but one cluster was marked as Sandy Bridge instead.
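The naming rule above can be turned into a quick sanity check. The following is only a sketch: the helper name `expected_microarchitecture`, the lookup table and the model-name pattern are hypothetical and are not part of the repository. The v3 entry (Haswell) is added from general knowledge of the Xeon E3/E5/E7 line rather than from this PR.

```ruby
# Hypothetical lookup: "vN" suffix of a Xeon E3/E5/E7 model => micro-architecture.
# Only the generations relevant to this discussion are listed.
XEON_E_GENERATIONS = {
  2 => 'Ivy Bridge',
  3 => 'Haswell',
  4 => 'Broadwell'
}.freeze

# Returns the expected micro-architecture for a model name such as
# "Intel Xeon E5-2630 v4", or nil when the name has no known "vN" suffix.
def expected_microarchitecture(model_name)
  match = model_name.match(/\bE[357]-\d{4}[A-Z]*\s*v(\d)\b/)
  return nil unless match

  XEON_E_GENERATIONS[match[1].to_i]
end

puts expected_microarchitecture('Intel Xeon E5-2630 v4').inspect
puts expected_microarchitecture('Intel Xeon E5-2620 v2').inspect
puts expected_microarchitecture('Intel Xeon E5-2660').inspect
```

A model without a suffix deliberately yields `nil` rather than a guess, since the unsuffixed generation differs between the E5 and E7 lines.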

The second commit tweaks the Ice Lake and Skylake metadata. Grid5k only uses the Scalable Processor (i.e. server) derivative of these micro-architectures. The data contained a mix of Ice Lake and Ice Lake-SP, which is now consistently Ice Lake-SP, and every Skylake entry becomes Skylake-SP for consistency.

The hardware.rb file is updated with the correct release dates for those two micro-architectures, and the client ones are commented out.

It resolves issue #1, which I opened when I first noticed these problems.

Activity

  • Note: I am apparently unable to run the CI pipeline, as some of the required templates are not public.

  • JACQUOT Pierre requested review from @pijacquo


  • Hello,

    I took the time to review your merge request and noticed that the /data/grid5000/sites/rennes/clusters/roazhon9/nodes/roazhon9-1.json file is modified to have a Skylake-SP micro-architecture, but its corresponding YAML file (input/grid5000/sites/rennes/clusters/roazhon9/roazhon9.yaml) still has a Skylake architecture. This is strange, since other clusters (such as roazhon2) have both their JSON and YAML files edited as they should be. I also noticed that the new micro-architectures (Skylake-SP and Ivy Bridge) were not added to the lib/refrepo/input_loader.rb script. As a result, generating the JSON files from the YAML input files does not work (I tried locally on my machine to be sure).

    This leads me to think that you edited the files inside data/ and input/ manually (or with a tool like sed), whereas our workflow is to edit the files in input/ and then generate the JSON files in data/ with the rake reference-api task. Could you confirm that this is what you did? If so, we'll have to edit your commits to synchronise the data/ and input/ directories.

    I also have a question regarding the Skylake-SP and Ivy Bridge architectures. In the lib/refrepo/input_loader.rb script, there is a method whose purpose is to compute the FLOPS of a given CPU from its micro-architecture. Are the Skylake-SP and Skylake architectures equivalent in terms of FLOPs per cycle? Same question for Ivy Bridge and Sandy Bridge: are they equivalent? If we change the micro-architecture of several CPUs, we might need to update their FLOP/s too.

    Edited by JACQUOT Pierre
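The kind of mismatch described in this comment (input/ and data/ disagreeing on a node's micro-architecture) could be detected with a small script. This is a minimal sketch under loud assumptions: the two inline documents use a simplified, hypothetical schema (a single `microarchitecture` key), the real files are much richer, and the helper `microarchitecture_mismatches` does not exist in the repository. The supported fix remains regenerating data/ from input/ with the rake task.

```ruby
require 'json'
require 'yaml'

# Hypothetical, heavily simplified stand-ins for an input YAML file and a
# generated JSON file. The real schema is different and much larger.
input_yaml = <<~YAML
  roazhon9:
    microarchitecture: Skylake
YAML

generated_json = <<~JSON
  { "roazhon9-1": { "microarchitecture": "Skylake-SP" } }
JSON

# Returns [node, declared, generated] triples for every node whose generated
# micro-architecture differs from the one declared for its cluster.
# Assumes node names are "<cluster>-<number>".
def microarchitecture_mismatches(input, generated)
  generated.filter_map do |node, attrs|
    cluster = node.sub(/-\d+\z/, '')
    declared = input.dig(cluster, 'microarchitecture')
    actual = attrs['microarchitecture']
    [node, declared, actual] unless declared == actual
  end
end

mismatches = microarchitecture_mismatches(YAML.safe_load(input_yaml), JSON.parse(generated_json))
mismatches.each do |node, declared, actual|
  puts "#{node}: input says #{declared.inspect}, generated data says #{actual.inspect}"
end
```

A check like this only reports drift; it does not say which side is right.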
  • I did indeed edit the generated files by hand, since I did not have access to the documentation (the Tech Wiki is private).

    I will fix the inconsistency you mentioned, especially the Ruby file, and regenerate the generated files properly tonight.

    Could you write up a quick-start guide on how to set up the correct environment and generate those files?

  • added 1 commit

    • bc1db4b5 - Fix generation and minor inconsistencies related to the micro-architecture renaming


  • The latest commit updates lib/refrepo/input_loader.rb to recognise Ivy Bridge, Ice Lake-SP and Skylake-SP (and removes the client variants, with a comment stating that they are not in use in grid5k), and regenerates the data.

    The FLOPS figures for abacus 1 and 2 change as a result of their reclassification from Sandy Bridge to Broadwell.

  • I had not seen the update. I will do some double-checking, but Ivy Bridge is a die shrink of Sandy Bridge, so they should be the same.

    Skylake-SP and Skylake cores have some differences, but Skylake-SP and Cascade Lake-SP should behave the same, so renaming Skylake to Skylake-SP should be the correct thing to do.

  • Regarding Skylake client vs Skylake-SP: only Skylake-SP has AVX-512, so the figures for Skylake client would likely differ from those for Skylake-SP and be in line with those for Broadwell (though I am not an expert in performance).

    On low-end Skylake-SP, the single AVX-512 unit is formed by fusing ports 0 and 1 (which hold the usual 256-bit units on Skylake), so low-end Skylake-SP probably has the same FLOPs per cycle as client Skylake.

    On high-end Skylake-SP, a second AVX-512 unit is available, hence double the total.

    Ivy Bridge has no notable micro-architectural change that should impact the FLOPs per cycle.

    Edited by DIDIER Guillaume
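The reasoning in this comment can be checked with a little arithmetic: peak double-precision FLOPs per cycle per core is roughly (number of vector FP units) × (vector width / 64) × (2 if the unit does fused multiply-add, else 1). The sketch below is illustrative only. The `FpUnits` struct, the `dp_flops_per_cycle` helper and the table are hypothetical, the unit counts come from public descriptions of these cores, and none of it is taken from lib/refrepo/input_loader.rb.

```ruby
# Hypothetical description of a core's vector floating-point resources.
FpUnits = Struct.new(:units, :vector_bits, :fma)

# Peak double-precision FLOPs per cycle for one core.
def dp_flops_per_cycle(fp)
  lanes = fp.vector_bits / 64      # 64-bit lanes per vector register
  ops_per_lane = fp.fma ? 2 : 1    # an FMA counts as one multiply plus one add
  fp.units * lanes * ops_per_lane
end

# Assumed unit counts (not values read from the repository).
CORES = {
  'Sandy Bridge' => FpUnits.new(2, 256, false),                 # one add unit + one multiply unit
  'Ivy Bridge' => FpUnits.new(2, 256, false),                   # die shrink of Sandy Bridge
  'Broadwell' => FpUnits.new(2, 256, true),                     # two 256-bit FMA units
  'Skylake-SP (one AVX-512 unit)' => FpUnits.new(1, 512, true),
  'Skylake-SP (two AVX-512 units)' => FpUnits.new(2, 512, true)
}.freeze

CORES.each do |name, fp|
  puts format('%-32s %2d DP FLOPs/cycle/core', name, dp_flops_per_cycle(fp))
end
```

Under these assumptions a single-unit Skylake-SP matches Broadwell, and the second AVX-512 unit doubles the figure, which is consistent with the comment above. Sustained clock speeds under AVX-512 load are a separate question that this count ignores.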
  • By the way, should I rebase my branch on top of the current master?

  • JACQUOT Pierre mentioned in merge request !772 (merged)


  • Hello, I reworked your merge request to make it compatible with our tooling scripts, and rebased it on top of the master branch.

    You can find my merge request here

    If the reworked merge request seems OK to you, I'll close this one. Before merging the new MR, I need to discuss its impact on our g5k-checks tool with the rest of the technical team. Once that is done, we'll be able to decide whether or not to merge it.

    Edited by JACQUOT Pierre