SP17 Item 03 - Security logs on the nodes
Milestone ID: 2991

As a node, I want to keep track of all security-significant actions and events on the node, so that I comply with legal requirements about logs.

Identified legal requirements:

mandatory security logging of user activity (e.g. CNIL in France https://www.cnil.fr/fr/securite-tracer-les-acces-et-gerer-les-incidents), including user connection and disconnection, security-significant actions and events, and anomalies; retention period 6-12 months; each entry includes user ID, date and time, and action
GDPR (en https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A32016R0679 fr https://www.cnil.fr/fr/reglement-europeen-protection-donnees) requirements, including:
- state the purpose of the processing (keep track of security-significant events and actions to comply with legal requirements and investigate security anomalies) and minimize logs
- information banner on the GUI: which logs are collected and for what purpose, how to exercise rights ("contact your DPO")

Tasks:

Code design:
- extend activity logger (FedLogger()) or create separate security logger ?
- choose log format. The format should be human-readable. As a nice to have, the format should be easily parsable with a script (e.g. easy to convert to csv)
Implement logging options:
- syslog (default)
- local file (nice to have: log rotation + deletion after 1 year)
VPN/container environment support:
- check it works
- make logs persistent (outside of container) by default
Provide default GDPR banner on front page + easy way of customizing it for a site
nice to have: security logs on the researcher
- user ID is either "CLI" or "GUI unauthenticated" (no user authentication yet on researcher GUI)
nice to have: option to increase verbosity of security logs
- this option would copy all activity log entries above a certain level (e.g. warning) into the security logs as well.
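
To make the two logging options above concrete, here is a minimal sketch using only Python's standard `logging` module. The function name `make_security_handler`, the pipe-separated layout and the mapping of "deletion after 1 year" to 365 daily backups are illustrative assumptions, not decisions taken in this milestone.

```python
import logging
import logging.handlers

# Hypothetical layout: pipe-separated fields, readable by eye and easy to
# split in a script (or to convert to CSV).
SECURITY_FORMAT = "%(asctime)s|%(user_id)s|%(source)s|%(event)s|%(message)s"


def make_security_handler(target="syslog", path=None, retention_days=365):
    """Build a handler for the security log (hypothetical helper)."""
    if target == "syslog":
        # The default constructor sends over UDP to localhost:514; a real
        # deployment would probably pass the address of the local syslog socket.
        handler = logging.handlers.SysLogHandler()
    elif target == "file":
        # Rotate at midnight and keep at most `retention_days` rotated files,
        # so entries are dropped after roughly one year with the default value.
        handler = logging.handlers.TimedRotatingFileHandler(
            path, when="midnight", backupCount=retention_days, encoding="utf-8"
        )
    else:
        raise ValueError(f"unknown target: {target!r}")
    handler.setFormatter(logging.Formatter(SECURITY_FORMAT))
    return handler
```

With this format, every call has to supply the extra fields, for example `logger.info("dataset registered", extra={"user_id": "CLI", "source": "CLI", "event": "add dataset"})`.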

Log content:

date and time
user ID and source
- GUI + user login
- GUI unauthenticated (if GUI runs without auth)
- CLI (cannot have real IDs for CLI)
- network (request coming from network) + requestor ID (node_id, researcher_id)
events (list below) + event information fields
- general events:
  - node start; node stop
- user events: (after SP17 GUI auth is implemented)
  - user account added; user account deleted; user account modified; user password change; user password recovery (lost password)
  - GUI user connect (authenticate); GUI user disconnect; GUI user session timeout
- dataset events:
  - add Dataset (GUI, -a, -am, -adff, etc.); remove Dataset; modify dataset
- model approval events:
  - register (add+approve) TrainingPlan; add requested TrainingPlan (pending) for approval; approve; reject; delete TrainingPlan; update TrainingPlan
  - note: at node launch, register_update_default_models and check_hashes_for_registered_models perform a combination of: register (default); delete; update
  - error: received TrainingPlan request for an already existing model (from network message)
- network message events:
  - note: The receiving of any message in the on_message function should be logged
  - correct request message received
    - train request received
    - ping request received
    - search request received
    - list request received
    - model approval request received
    - model-status request received
  - suspected bad request message errors (possible message-forging network attack):
    - not implemented (unknown request)
    - json decode error (corrupted request)
    - key error (bad request)
  - suspected bad model or parameter errors (possible model/data attack):
    - error during training (in run_model_training)
    - error during validation (in run_model_training)
    - cannot load model parameters (in run_model_training)
    - cannot instantiate model (in run_model_training)
    - model not approved (in run_model_training)
  - note: Do we want to log request handling errors ? Maybe yes (possible network attack ?)
  - misc request message handling error:
    - type error (errors in the reply serialization, thus internal error)
    - model status request handling error
    - model approval request handling error
    - train request handling error
    - training errors event
      - cannot create train/validation data (in run_model_training)
      - dataset not existing (in parser_task)
      - error in download/upload parameters
      - other training error (in task_manager, run_model_training)
  - note: Do we need to log any reply message ? Maybe not, only for those for which it is a security significant action.
  - reply message sent
    - model approval reply sent
    - train reply sent
    - ping reply sent
    - search reply sent
    - list reply sent
    - model-status reply sent
- train events
  - start handling training request; finish handling training request
  - start training; stop training; start validation; stop validation

Example: log detail for a few events

node start
- common fields for all log entries (date, time, user id, etc.)
- environ['NODE_ID']
- environ['MODEL_APPROVAL'] + environ['ALLOW_DEFAULT_MODEL']
- environ['HASHING_ALGORITHM'] (just because listed as a security parameter ...)
- Note: should we copy all environment ? Maybe not.
approve model
- common fields for all log entries (date, time, user id, etc.)
- (Note: should we copy all model parameters ? dump model code ? Maybe not.)
- model_id
- security related fields
  - algorithm
  - hash
  - researcher_id
- additional model description fields
  - date created + modified + registered + last action (remove modified/created ?)
  - model_path
  - model_type (can be used for default, registered, requested)
  - model_name
add dataset
- common fields for all log entries (date, time, user id, etc.)
- dataset_id
- additional dataset description fields
  - data_type
  - name
  - path
  - shape
  - tags
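
For the "node start" example, one way to avoid copying the whole environment is an explicit whitelist of keys. This is a sketch under the assumption that `environ` behaves like a dictionary; the helper names and the `<unset>` placeholder are hypothetical.

```python
# Keys listed in the "node start" example above; nothing else is copied.
NODE_START_KEYS = (
    "NODE_ID",
    "MODEL_APPROVAL",
    "ALLOW_DEFAULT_MODEL",
    "HASHING_ALGORITHM",
)


def node_start_fields(environ):
    """Return only the whitelisted keys, marking missing ones explicitly."""
    return {key: environ.get(key, "<unset>") for key in NODE_START_KEYS}


def render_fields(fields):
    """Render fields as space-separated key=value pairs for a log line."""
    return " ".join(f"{key}={value!r}" for key, value in fields.items())
```

The same pattern would apply to "approve model" and "add dataset": one tuple of field names per event, so that adding or removing a logged field is a one-line change.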

Additional notes

Some notes following a discussion on July 13th with @jls.

Implementation

There are two options:

(preferred) create a separate logger associated with a FileHandler on the node. The downside is that two separate calls may be necessary for messages that need to be logged in both the security log and the FedLogger, but we do not expect this to happen too often (see discussion below)
extend the current FedLogger to also handle security logging, e.g. by adding a FileHandler to the FedLogger, and maybe setting a threshold to avoid logging DEBUG messages
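
A minimal sketch of the preferred option, assuming the standard `logging` module. The logger names, `configure` and `MirrorToSecurity` are hypothetical, and `activity_logger` is only a stand-in for FedLogger, whose real API is not shown here. The optional mirror handler illustrates the "increase verbosity" nice-to-have from the task list.

```python
import logging

security_logger = logging.getLogger("security")  # hypothetical name
activity_logger = logging.getLogger("activity")  # stand-in for FedLogger


class MirrorToSecurity(logging.Handler):
    """Forward activity records at or above `level` to the security logger."""

    def __init__(self, target, level=logging.WARNING):
        super().__init__(level=level)
        self._target = target

    def emit(self, record):
        # Hand the record to the security logger's own handlers.
        self._target.handle(record)


def configure(security_path, mirror_level=None):
    security_logger.setLevel(logging.INFO)
    # Keep security entries out of any handlers installed on parent loggers.
    security_logger.propagate = False
    security_logger.addHandler(
        logging.FileHandler(security_path, encoding="utf-8")
    )
    if mirror_level is not None:
        activity_logger.addHandler(
            MirrorToSecurity(security_logger, mirror_level)
        )
```

With `mirror_level` left at `None`, a message that belongs in both logs needs two explicit calls, which is the downside noted above.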

What to log

The security logger should be much less verbose than the FedLogger. It probably does not need to log the specific parameters/arguments of the action performed, only the relevant information: which action was performed, when, and by whom.

Dates

Minor remark: at some point we should define an explicit strategy for timezones and dates in the logging.
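
One possible strategy, shown here only as an assumption and not as an agreed choice, is to write every timestamp in UTC with an explicit `Z` suffix. With the standard `logging` module this takes a small formatter subclass (the name `UTCFormatter` is hypothetical):

```python
import logging
import time


class UTCFormatter(logging.Formatter):
    """Render %(asctime)s in UTC, ISO 8601 style, instead of local time."""

    converter = time.gmtime

    def __init__(self, fmt="%(asctime)s %(message)s"):
        super().__init__(fmt=fmt, datefmt="%Y-%m-%dT%H:%M:%SZ")
```

Logging in UTC avoids ambiguous entries around daylight-saving changes and makes logs from nodes in different timezones directly comparable.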
