# The Cluster Exposure Verification (CLEA) Protocol: Specifications of Protocol Version 0

Vincent Roca, Antoine Boutet, Claude Castelluccia

PRIVATICS team, Inria, France

{firstname.lastname}@inria.fr

**_Work in Progress, May 20th, 2021_**


----

[comment]: # ( [[_TOC_]] )
[comment]: # ( ———— )

[[_TOC_]]

----

## 1- Introduction

This document specifies the CLuster Exposure verificAtion (CLEA) protocol, which is meant to warn the participants of a private event (e.g., a wedding or private party) or the persons present in a commercial or public location (e.g., a bar, restaurant, sport center, or train) when this event or location later becomes a cluster because a certain number of people who were present at the same time have tested COVID+.

It is based:

- on a central server, in order to automatically detect potential clusters. This server is under the responsibility of a health authority;

- on the display of a QR code at the location or on a ticket, either in a static (e.g., printed) or dynamic manner (e.g., via a dedicated device, smartphone, or tablet);

- and on a smartphone application.

This smartphone application is used to scan a QR code, to store it locally for the next 14 days, and to perform periodic risk analyses, in a decentralized manner, on the smartphone.
In order to enable this decentralized risk analysis, information about clusters (i.e., the location pseudonyms and timing information) needs to be disclosed.
We believe this is an acceptable downside because this information is not per se sensitive health data (it does not reveal any user health information to an eavesdropper), although it can be considered as personal data (it is indirectly linked to the location manager).

Two broad categories of use-cases exist:

- those involving a synchronous scan of a QR code, for situations where a user scans a QR code upon entering an event or location (e.g., a restaurant);

- those involving an asynchronous scan of a QR code, for situations where a QR code is generated that can be scanned either in advance or after visiting the event or location (e.g., an on-line train ticketing service can add a QR code on the ticket to let a user scan it at their discretion).

Finally, the CLEA protocol is also meant to be used by the location employees, in order to warn them if their workplace is qualified as a cluster or, conversely, to let them upload information to the server if they have themselves tested COVID+.


## 2- CLEA protocol high level principles

### 2.1- Terminology

The following terms are used in this document:

| Name                     | Description                                                                     |
|--------------------------|---------------------------------------------------------------------------------|
| `MCT`                    | Manual Contact Tracing team. |
| **Location**             | Synonymous to venue, this is a closed area where people meet. It can be a private location (e.g., for a wedding or a party event) or a commercial or public location (e.g., a bar, a restaurant, a sport center, an entertainment hall or auditorium, a train). |
| **Device**               | A specialized device, or a general purpose smartphone or tablet with the appropriate software, used by the location manager or event organizer, that displays a QR code. Alternatively, QR codes can also be printed or included in a digital ticket. |
| **(User) terminal**      | The user smartphone. |
| **CLEA application**     | The application on the user terminal (smartphone). |
| **QR code**              | The dynamic or static QR code of a location that is scanned (e.g., when entering the location). It contains a URL ("deep link") structured as: `"country-specific-prefix" "Base64url(location-specific-part)"`. |
|**Location Specific Part**| This is the part of the QR code that contains the information specific to the location. With a dynamic QR code, the information contained in this part is periodically renewed. |
| **Period**               | With dynamic QR codes, time is split into periods (e.g., 24 hours), during which the location pseudonyms (more precisely a temporary cryptographic key and a derived temporary UUID) are stable. After that period, a new location pseudonym is generated. For practical reasons, a new period MUST start at a round predefined hour (e.g., 4:00am may be chosen as a default period start). As a special case, and this is used by static QR codes, a period can also have an unlimited duration, meaning that the location pseudonym will remain unchanged. |
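
The QR code structure given in the table can be illustrated with a minimal Python sketch. It assumes the French prefix `https://tac.gouv.fr?v=0#` quoted in Section 2.4 and an unpadded Base64url output. The padding choice and the function names `build_deep_link` and `parse_deep_link` are illustrative assumptions, not part of the protocol.

```python
import base64

# Assumption: country-specific prefix taken from the French example of Section 2.4.
COUNTRY_PREFIX = "https://tac.gouv.fr?v=0#"


def build_deep_link(lsp: bytes, prefix: str = COUNTRY_PREFIX) -> str:
    """Return "country-specific-prefix" followed by Base64url(location-specific-part)."""
    # Assumption: the trailing '=' padding is stripped to save QR code space.
    encoded = base64.urlsafe_b64encode(lsp).rstrip(b"=").decode("ascii")
    return prefix + encoded


def parse_deep_link(url: str, prefix: str = COUNTRY_PREFIX) -> bytes:
    """Recover the binary location specific part from a scanned deep link."""
    if not url.startswith(prefix):
        raise ValueError("unexpected country-specific prefix")
    encoded = url[len(prefix):]
    # Restore the padding that was stripped before decoding.
    encoded += "=" * (-len(encoded) % 4)
    return base64.urlsafe_b64decode(encoded)
```

A round trip such as `parse_deep_link(build_deep_link(lsp))` returns the original bytes, which is the property the CLEA application relies on when it stores a scanned QR code.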

### 2.2- Overview

#### Two key design choices: centralized cluster detection and decentralized risk estimation and notification

This protocol must comply with two privacy-related requirements:

- the location manager must not be able to collect any data with respect to the clients (unlike written records where clients fill-in their name and contact information), and
- the amount of information uploaded by the CLEA application to the central server must be minimized.

In practice, no information is uploaded to the server unless a client is tested COVID+.
In that case, if the user explicitly agrees (informed consent), the application uploads the list of scanned QR codes during the past 14 days[^footnote-1] along with timing information to the central server, in order to enable a **_centralized anonymous cluster detection_**.
The server can detect clusters by considering the number of COVID+ users in a location at the same time, without having access to the name or address of this location.
Then this central server updates its list of location temporary pseudonyms and time (with an hour granularity by default) corresponding to clusters.
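
The cluster detection step described above could be sketched as follows in Python. This is only an illustration under simplifying assumptions: each upload is reduced to a list of `(LTId, hour slot)` pairs, and a single global `threshold` is used, whereas the actual threshold depends on the location features. The names `detect_clusters`, `uploads`, and `threshold` are hypothetical.

```python
from collections import Counter


def detect_clusters(uploads, threshold):
    """Return the (ltid, hour_slot) pairs whose COVID+ counter exceeds the threshold.

    uploads: iterable of per-user lists of (ltid, hour_slot) pairs, where
             ltid is the location temporary pseudonym and hour_slot an
             hour-granularity time index (both simplified for this sketch).
    """
    counts = Counter()
    for visits in uploads:
        # A given user is counted at most once per location and hour slot.
        for key in set(visits):
            counts[key] += 1
    return sorted(key for key, n in counts.items() if n > threshold)
```

Note that the server only manipulates pseudonyms and time slots here: nothing in the input identifies the name or address of a location.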

<img src="img/CLEA_centralized_cluster_detection.jpg" alt="CLEA_centralized_cluster_detection.jpg" width="700"/>    

_Figure 1: Centralized cluster detection. Here Alice, tested COVID+, agrees to upload her scanned QR codes to the CLEA backend server, which, after verifying the validity of the upload, identifies whether some of the visited locations need to be qualified as potential clusters._    


In parallel, each CLEA application periodically downloads this list containing the latest clusters that have been identified, in order to check locally whether or not there is a match.
In case of a match, the user is informed with a "warning".
The exact type of warning message could be adjusted to reflect the risk level (e.g., if a high number of COVID+ users have been identified in a cluster), which is out of scope of the present specification.
Therefore CLEA performs a **_decentralized risk evaluation_**.
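
The local matching performed by the application could look like the following Python sketch. It is a simplification under stated assumptions: cluster entries are `(LTId, slot start)` pairs with one-hour slots, the user's presence is modelled as a fixed `stay` duration starting at the check-in time, and any non-zero overlap counts as a match. The actual overlap criterion and the names used here (`find_exposures`, `stay`, `SLOT`) are hypothetical.

```python
SLOT = 3600  # assumed cluster time granularity, in seconds (one hour)


def find_exposures(local_list, cluster_list, stay=3600):
    """Return the cluster entries that overlap with the user's own visits.

    local_list:   list of (ltid, t_checkin) tuples stored on the terminal.
    cluster_list: list of (ltid, slot_start) tuples downloaded from the server.
    stay:         assumed presence duration after check-in, in seconds.
    """
    # Index cluster slots by pseudonym for a quick first-step lookup.
    slots_by_ltid = {}
    for ltid, slot_start in cluster_list:
        slots_by_ltid.setdefault(ltid, []).append(slot_start)

    hits = []
    for ltid, t_checkin in local_list:
        for slot_start in slots_by_ltid.get(ltid, []):
            # Overlap of [t_checkin, t_checkin + stay) with [slot_start, slot_start + SLOT).
            overlap = min(t_checkin + stay, slot_start + SLOT) - max(t_checkin, slot_start)
            if overlap > 0:
                hits.append((ltid, slot_start))
    return hits
```

All the data involved stays on the terminal: the only network operation is the download of the public cluster list.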

<img src="img/CLEA_decentralized_risk_evaluation.jpg" alt="CLEA_decentralized_risk_evaluation.jpg" width="600"/>    

_Figure 2: Decentralized risk evaluation. Here Bob first compares his scanned QR codes with the new potential cluster location pseudonyms and, if a match is found, checks whether the corresponding period overlaps significantly with his own presence as stored in his local database._    

We believe that making public the list of location temporary UUIDs and time corresponding to clusters is an acceptable tradeoff, because this information is not per se sensitive health data (it does not reveal any user health information to an eavesdropper), although it can be considered as personal data (it is associated to the location manager)[^footnote-2].

[^footnote-1]: The 14-day value is provided as an example. The national health authority will define the most appropriate epidemiological value, which may also depend on other considerations such as the date of first symptoms when known. The details are out of scope of this document.

[^footnote-2]: This is a big difference with a decentralized contact tracing system, for instance based on the Google/Apple Exposure Notification (GAEN) component, where the pseudonyms of COVID+ users are freely available over the Internet: revealing sensitive health data enables any curious neighbour who uses a dedicated BLE scanning system (and [https://coronadetective.eu](https://coronadetective.eu) has shown how trivial this can be since a web browser is sufficient) to immediately identify the health status of their neighbours if they upload their pseudonyms later on, with potentially major discrimination consequences. With CLEA, a decentralized risk evaluation approach makes sense as it does not disclose sensitive health information.


#### A single protocol, three potential deployments

Central to the deployment of CLEA is the question of the role given to the Manual Contact Tracing Team (MCT).
Three options exist:

- Option 1: the MCT is at the center, for a maximum control.
	Here the upload of Alice's scanned QR code history is done during or after an interview with the MCT, under MCT control;

<img src="img/CLEA_deployment_option1.jpg" alt=".CLEA_deployment_option1.jpg" width="600"/>    

_Figure 3: CLEA deployment option 1, with the MCT at the center._

- Option 2: the MCT is at the edge, for maximum scalability and speed, and to avoid overloading the MCT.
	Here clusters are identified as soon as Alice uploads her scanned QR code history, without any delay.
	The MCT is also informed of those new clusters, yet they are not in the critical path;

<img src="img/CLEA_deployment_option2.jpg" alt=".CLEA_deployment_option2.jpg" width="600"/>    

_Figure 4: CLEA deployment option 2, with the MCT at the edge._

- Option 3: the MCT is not involved in any manner.
	As a direct consequence, it is not possible to couple the digital system with any hand-written attendance register.

<img src="img/CLEA_deployment_option3.jpg" alt=".CLEA_deployment_option3.jpg" width="600"/>    

_Figure 5: CLEA deployment option 3, involving no MCT._

Choosing an option is a local decision, based on local criteria, that does not compromise interoperability with other types of deployment in neighbouring countries.


#### QR codes for synchronous versus asynchronous scans

Regardless of which deployment option is chosen, two types of QR codes exist that reflect two broad categories of use-cases:

- those involving a **synchronous scan** of a QR code, for situations where a user scans a QR code upon entering an event or location.
	This is typically the case with a restaurant.
	The QR code requires a synchronous scan (i.e., when entering a location), and the location check-in timestamp is the scanning time.

- those involving an **asynchronous scan** of a QR code, for situations where a QR code is produced that can be scanned either in advance or after visiting the event or location.
	This is typically the case with an on-line train ticketing service, where the QR code is printed on the ticket itself to let the user scan it at their discretion.
	The QR code enables an asynchronous scan, before, during, or after visiting the location, and the location check-in timestamp is the one provided in the QR code itself rather than the scanning time.
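
The two cases above differ only in where the check-in timestamp comes from, which a short Python sketch can make explicit. It assumes the `LSPtype` convention used later in Section 3.1 (0 for a synchronous scan, 1 for an asynchronous scan). The function name `checkin_time` and its signature are hypothetical.

```python
from typing import Optional

LSP_TYPE_SYNC = 0   # QR code scanned when entering the location
LSP_TYPE_ASYNC = 1  # QR code scanned before, during, or after the visit


def checkin_time(lsp_type: int, t_qr_scan: int, t_event: Optional[int] = None) -> int:
    """Select the location check-in timestamp (seconds, NTP epoch) for a scanned QR code."""
    if lsp_type == LSP_TYPE_SYNC:
        # Synchronous scan: the check-in time is the scanning time itself.
        return t_qr_scan
    if lsp_type == LSP_TYPE_ASYNC:
        # Asynchronous scan: the check-in time is carried by the QR code.
        if t_event is None:
            raise ValueError("an asynchronous QR code must provide t_event")
        return t_event
    raise ValueError("unknown LSPtype: %r" % (lsp_type,))
```

For a train ticket scanned the day before departure, `checkin_time(1, t_qr_scan, t_event)` therefore returns the departure-related `t_event`, not the moment of the scan.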


#### Static versus dynamic QR codes

In order to further improve privacy and security, the current specification defines **_dynamic QR codes_** that are periodically renewed and displayed with the help of a device.
Each QR code includes, among other things, the location temporary UUID (behaving as a temporary pseudonym) that, for instance, changes once a day.
These dynamic QR codes necessarily require a synchronous scan, since the QR code changes over time.

The current specification can also be used with **_static QR codes_** (e.g., printed on paper and made available to clients), either when a location does not own a dedicated device or for QR codes meant for an asynchronous scan.
Being static, this solution has downsides: it is less robust against relay attacks, and it enables an attacker to display all the clusters on a map (since the location pseudonyms do not change, it is relatively easy to collect them all), or to focus on a specific set of locations to learn whether they have been clusters.
When possible, a good practice is to regularly change static QR codes, manually, in particular if the location is identified as a cluster.
This aspect is out of scope of the present specification.

It can also be noticed that both static and dynamic QR codes are processed homogeneously by the same CLEA protocol, the same application and central server.


#### The particular case of employees

Finally, the employees of a location can benefit from the service, in order to be warned if their workplace is a cluster or, conversely, to inform the server that they have tested COVID+.
Since they have a long presence in the location, the employees must scan a specific QR code which slightly differs from regular QR codes scanned by clients.


### 2.3- Attacker model and trust considerations

This specification considers two different types of attackers.

The first type is composed of individuals who try to corrupt the service, deny the service, or break the confidentiality of the service.
Although a certain number of measures are taken to mitigate risks, for instance with dynamic QR codes, there are limits.
For instance, the CLEA protocol cannot prevent a static QR code from being communicated to other persons, potentially tested COVID+, in order to trigger wrong cluster detections.
This is also a direct consequence of a fully anonymous system that is meant to preserve user privacy.

In the second type, the authority that operates the CLEA system could try to learn as much as possible about the users.
Yet the system is expected to be audited by an external trusted Data Protection Authority (DPA, for instance CNIL in case of France).
Because of these audits, the authority in charge of the CLEA system is assumed to be honest: it will not try to modify the CLEA protocol itself, nor the implementation of the CLEA protocol, since this would be detected by the DPA.
However, it may benefit from the recorded information to infer additional information or use it for different purposes.

It follows that the CLEA server needs to be split into several independent entities: a frontend that collects the traffic from the CLEA users and sanitizes it "on-the-fly" (for instance by removing the source IP address), without storing any piece of information beyond what is strictly required (care should be taken with logs, for instance).
Conversely, the backend only processes messages that have been sanitized by the frontend.
The backend may also leverage specific hardware for storing system keys, in order to minimize the security risks in case of intrusion.

It is assumed that the authority in charge of Manual Contact Tracing (MCT) is trustworthy when it comes to dealing with personal data (e.g., when an MCT team contacts a location manager or event organizer), so that it does not take advantage of the information collected beyond what is strictly required to perform its task.
However, this authority must not be involved in the cluster detection process, which is not under its responsibility.

It should be noted that detailed implementation choices (e.g., the exact design of the CLEA server or application) are out of scope of the present document, whereas such considerations could also impact security and privacy properties.


### 2.4- Technical requirements

Several technical requirements, in particular motivated by the use of embedded devices, have shaped the design:

- each QR code contains a country specific URL ("deep link"), composed of a country specific prefix (for instance: `https://tac.gouv.fr?v=0#` in case of France), and a location specific part, defined in Section [Dynamic QR code generation within the device](#dynamic-qr-code-generation-within-the-device).
Therefore, any binary information of the location specific part is first translated to printable characters, using a Base64url encoding (i.e., a URL and filename safe variant of Base64), which adds a 33% overhead compared to the binary size (see [RFC4648](#references), Section 5).
Since the output of a Base64url encoding uses an alphabet of 65 characters (64 symbols plus the `=` padding character), it is not compatible with the Alphanumeric Mode of a QR code (limited to 45 printable characters), and it requires the use of the 8-bit Byte Mode (see [QRcode18004](#references), Section 8.4.4).

- the need to easily and reliably scan a QR code type 2 and the screen size/resolution constraints of the specialized device (e.g., 200 x 200 pixels) impact the maximum QR code size.
This specification targets a Level 12 QR code Type 2 (see [QRcodeWeb](#references)), of size 65x65.
It also uses the 8-bit byte mode (in particular because the `#` character is absent from the alphanumeric mode), and the information capacity ranges between 155 and 367 binary characters, depending on the chosen redundancy.
With maximum sized QR codes, the redundancy is set to the Medium level, leaving a maximum of 287 characters for the URL.
When the URL is shorter (i.e., when `locContactMsg` is absent, see below), the redundancy is set to Q level for a better error correction feature, leaving a maximum of 203 characters for the URL.
In both cases, the content of the location specific part, before Base64url encoding, uses a binary format (rather than JSON or Protobuf) for compactness reasons, in order to comply with the 287 or 203 character size limits.

- a specialized device is typically not connected to the Internet nor any wireless network, it does not feature any connector (no USB), and is powered by a non-rechargeable battery (no power plug, an autonomy of several months being expected).
This specialized device will typically feature a dedicated micro-controller (e.g., a MICROCHIP micro-controller, PIC32MM0256GPM036-I/M2), with low computation capabilities, which also limits the QR code renewal period.
These constraints have been taken into account in the present CLEA design.

- a dedicated tablet could easily remove some of the above limitations, but on the other hand a tablet is more costly, is subject to theft, and is subject to attacks, being connected to wireless networks. It is therefore a potential target device for displaying dynamic QR codes, but not the privileged one.
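
The URL size limits given in the QR code requirements above (287 and 203 characters) translate into a binary budget for the location specific part. The following Python sketch works out that budget for the French prefix, assuming that the Base64url output carries no `=` padding unless `padded=True` is requested. This padding assumption and the helper names `b64url_len` and `max_payload_bytes` are illustrative, not taken from the specification.

```python
# French country-specific prefix quoted in this section (24 characters).
PREFIX = "https://tac.gouv.fr?v=0#"


def b64url_len(n_bytes: int, padded: bool = False) -> int:
    """Number of Base64url characters needed to encode n_bytes of binary data."""
    if padded:
        return 4 * ((n_bytes + 2) // 3)
    return (4 * n_bytes + 2) // 3  # ceil(4 * n / 3)


def max_payload_bytes(url_limit: int, prefix: str = PREFIX, padded: bool = False) -> int:
    """Largest binary location specific part that fits in a URL of url_limit characters."""
    budget = url_limit - len(prefix)
    n = max(budget, 0) * 3 // 4  # upper bound, refined below
    while n > 0 and b64url_len(n, padded) > budget:
        n -= 1
    return n
```

Under these assumptions, `max_payload_bytes(287)` gives 197 bytes and `max_payload_bytes(203)` gives 134 bytes, which shows why a compact binary format is preferred over JSON or Protobuf.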


## 3- Detailed operational description

### 3.1- Acronyms

The following acronyms and variable names are used:

| Short name     | Full Name                 | Description                                        |
|----------------|---------------------------|----------------------------------------------------|
| `LSP`          | locationSpecificPart      | The QR code of a location, at any moment, contains a URL ("deep link"), structured as: `"country-specific-prefix" "Base64url(location-specific-part)"`. The location specific part, renewed periodically with a dynamic QR code, contains information related to the location. |
| `SK_L`         | permanentLocationSecretKey | Permanent location 408-bit secret key. This key is never communicated, but is shared by all the location devices. For instance, this key can be stored in a protected stable memory of a dedicated device (or set of devices) by the manufacturer. The manufacturer should also keep a record of this `SK_L` in a secure place in case the location manager later asks for additional devices. An appropriate location manager authentication mechanism, which is out of the scope of this document, needs to be defined for that purpose. |
| `{PK_SA, SK_SA}` | serverAuthorityPublicKey / SecretKey | Public/secret ECDH key pair of the Authority in charge of the backend server. The public key is known by all devices. |
| `{PK_MCTA, SK_MCTA}` | manualCTAuthorityPublicKey / SecretKey | Public/secret ECDH key pair of the Authority in charge of the manual contact tracing. The public key is known by all devices. It is assumed that this authority is different from the authority in charge of the backend server. |
| `LTKey`     | locationTemporarySecretKey     | Location temporary 256-bit secret key, specific to a given location at a given period. This key is never communicated outside of the device(s). |
| `LTId`      | locationTemporaryPublicID      | Location temporary public universally unique 128-bit Identifier (UUID), specific to a given location at a given period. This public location identifier is derived from the associated secret location key. |
| `t_periodStart` (in seconds) | periodStartingTime | Starting time of the period, expressed as the number of seconds since January 1st, 1900 (NTP timestamp limited to the 32-bit seconds field), by convention in the UTC (Coordinated Universal Time) timezone. A period necessarily starts at a round hour. |
| `ct_periodStart` | compressedPeriodStartingTime   | Compressed form of the period starting time, obtained by dividing `t_periodStart` by 3600, which is guaranteed to be an integral value since a period necessarily starts at a round hour (i.e., is multiple of 3600). |
| `periodDuration` (in hours) | idem           | Duration of the period, expressed as the number of hours. This duration is transmitted in a 1-byte field. Value 0 is invalid and should not be used, values between 1 and 254 inclusive indicate the period duration in hours (up to 254 hours, i.e., 10 days and 14 hours), and value 255 is reserved to the special case of an unlimited period duration. The default value is 24 hours (a period per day), but different values may be defined. |
| `qrCodeRenewalInterval` (in seconds) | idem  | QR codes renewal interval. QR codes are renewed every `qrCodeRenewalInterval` seconds, a value of 0 indicating the QR code is never renewed during the period. This value is chosen by the device and communicated within the QR code as a power of 2, via the `CRIexp` field. |
| `CRIexp` | qrCodeRenewalIntervalExponent     | Compact version of the `qrCodeRenewalInterval` as the exponent of a power of two, coded in 5 bits. When equal to `0x1F`, `qrCodeRenewalInterval` must be set to `0` (i.e., no renewal period), otherwise `qrCodeRenewalInterval` must be set to `2^CRIexp` seconds. |
| `t_qrStart` (in seconds) | qrCodeValidityStartingTime | Starting time of the QR code validity timespan, expressed as the number of seconds since January 1st, 1900 (NTP timestamp limited to the 32-bit seconds field), by convention in UTC (Coordinated Universal Time) timezone. |
| `t_qrScan` (in seconds) | qrCodeScanTime     | Timestamp when a user terminal scans a given QR code, expressed as the number of seconds since January 1st, 1900 (NTP timestamp limited to the 32-bit seconds field), by convention in UTC (Coordinated Universal Time) timezone. |
| `t_event`(in seconds) | eventTime            | When `LSPtype` is equal to 1, this is the time when the user is expected to enter the location, expressed as the number of seconds since January 1st, 1900 (NTP timestamp limited to the 32-bit seconds field), by convention in UTC (Coordinated Universal Time) timezone. |
| `t_checkin` (in seconds) | checkinTime        | The time when the user enters the location, either equal to `t_qrScan` when `LSPtype` is equal to 0, or `t_event` when `LSPtype` is equal to 1. |
| `visitDuration` (in hours) | idem             | When `LSPtype` is equal to 1, this is the expected duration of the stay in the location, expressed in number of hours. This field is not necessarily meaningful nor known upon the generation of QR code. In that case it must contain value 0. |
| `localList` | idem                           | Within the user terminal, this list contains all the `{QR code, t_checkin}` tuples collected by a user within the current 14-day window. Entries are added in this localList as the user visits new locations and scans the corresponding QR code, and automatically deleted after 14 days. |
| `clusterList` | idem                         | Within the backend server, this list contains all the `LTId` and timing information corresponding to a potential cluster. This list is public, it is downloaded by all the user terminals, and is updated each time a new cluster is identified. The cluster qualification happens when the hourly counter of a location exceeds a given threshold that depends on the location features. |
| `dupScanThreshold` (in seconds) | idem       | Time tolerance in the duplicated scan mechanism: for a given `LTId`, a single QR code can be recorded in the localList every `dupScanThreshold` seconds. A similar check is performed on the server frontend. |
| `locationPhone` | idem                       | Phone number of the location contact person, stored as a set of 4-bit sub-fields that each contain a digit. This piece of information is only accessible to the manual contact tracing authority. It is meant to create a link between the digital system and the hand-written attendance register. |
| `locationRegion` | idem                      | Coarse grain geographical information for the location, in order to facilitate the work of the Manual Contact Tracing team. In case of France, it can contain a department number. |
| `locationPIN` | idem                         |  Secret 6 digit PIN, known only by the location contact person, stored as a set of 4-bit sub-fields that each contain a digit. This piece of information is only accessible to the manual contact tracing authority. It is meant to create a link between the digital system and the hand-written attendance register. |
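
Several of the fields above use compact encodings (NTP-based timestamps, `ct_periodStart`, `CRIexp`, `periodDuration`). The following Python sketch shows one way to encode and decode them as described in the table. The helper names are hypothetical, and the use of `None` to represent an unlimited period is an assumption of this sketch.

```python
from datetime import datetime, timezone
from typing import Optional

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01), UTC.
NTP_UNIX_OFFSET = 2_208_988_800


def to_ntp_seconds(dt: datetime) -> int:
    """Convert a timezone-aware datetime to the 32-bit NTP seconds field."""
    return (int(dt.timestamp()) + NTP_UNIX_OFFSET) & 0xFFFFFFFF


def compress_period_start(t_period_start: int) -> int:
    """Compute ct_periodStart = t_periodStart / 3600 (a period starts at a round hour)."""
    if t_period_start % 3600 != 0:
        raise ValueError("a period must start at a round hour")
    return t_period_start // 3600


def renewal_interval(cri_exp: int) -> int:
    """Decode the 5-bit CRIexp field into qrCodeRenewalInterval (in seconds)."""
    if not 0 <= cri_exp <= 0x1F:
        raise ValueError("CRIexp is a 5-bit field")
    # 0x1F is the special value meaning "never renewed during the period".
    return 0 if cri_exp == 0x1F else 2 ** cri_exp


def period_duration_hours(value: int) -> Optional[int]:
    """Decode the 1-byte periodDuration field; None stands for an unlimited period."""
    if not 1 <= value <= 255:
        raise ValueError("periodDuration must be in 1..255 (0 is invalid)")
    return None if value == 255 else value
```

For example, a period starting on May 20th, 2021 at 04:00 UTC has `t_periodStart = 3830472000`, which compresses to `ct_periodStart = 1064020`.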


### 3.2- Initial configuration of the service 

#### Case of a location using a dedicated device(s) or tablet(s) and dynamic QR codes for synchronous scans

The device(s) of a location must be initialized.
This is either managed by the manufacturer in case of specialized device(s), before being used by the location manager, or by the location manager in case of tablet(s).

- The location keeps a long-term secret, `SK_L`, specific to this location, that is never communicated.
This key is 408 bits (51 bytes) long, so that after concatenation with the 32-bit `t_periodStart` and addition of 72 bits of padding, the total is 512 bits long and fits in a single SHA-256 block.
If this location uses several devices, each of them must be configured with the same `SK_L`.
With a dedicated device, this configuration can be done by the device manufacturer, meaning that the manufacturer is in charge of keeping this long-term secret.
With a tablet, this is performed by the CLEA software used on the tablet, and when several tablets are used, a synchronization is required to make sure they all use the same long-term secret.
Details are out of the scope of this document.

- Each device knows the public key of the Authority in charge of the backend server, `PK_SA`.
When the deployment involves the MCT (options 1 and 2), the device also knows the public key of the Authority in charge of manual contact tracing, `PK_MCTA`.

- If the location has several totally independent rooms (e.g., a restaurant across two different buildings), distinct devices initialized with different long-term secrets, `SK_R1` and `SK_R2`, may be used in order to generate different location keys and identifiers.

- A pre-determined round hour (e.g., 4:00am, local timezone) is defined, that ideally corresponds to a moment when this location is closed.
At that time, every day, the location `LTKey` and `LTId` are automatically renewed.
Later, if the authority identifies a cluster in this location, the cluster will be notified through this location temporary UUID, which is known only to the clients of the location who scanned the associated QR code.

- The case of a location that never closes should be handled in the most appropriate manner, for instance with a location `LTKey` and `LTId` renewal that corresponds to a period of low attendance.
The motivation is to reduce the probability that a client who arrives after the renewal, and learns the new `LTId_{i+1}` via the new QR code, misses a cluster warning issued for the previous `LTId_i` because the cluster started before the renewal.

- The case of a long event (e.g., over several days) requires specific attention.
For instance, it can be a private event in a closed location over several days (e.g., a wedding or party over a week-end), with people who may only arrive on the second day, or leave earlier, or stay during the whole event.
Such events are incompatible with a daily renewal of the location `LTKey` and `LTId`, since requiring all users to scan the new QR code after each renewal is impractical.
Having the location `LTKey` and `LTId` lasting the full duration of the event can be a practical solution.
Then, since the exact moment when a participant definitively leaves the event is unknown by default, it can be preferable to use a coarse-grained warning during the risk evaluation process.

- A period can also be shorter than 24 hours, when several activity periods are clearly defined, each of them separated by official closures of the location (e.g., the two services of a restaurant, at noon and in the evening).
It is then possible to configure all the devices of the location with two or more renewal round hours (e.g., by the manufacturer).
Those shorter periods are managed exactly in the same manner without any impact on the present specification.
The main benefit of shorter periods is to prevent a client of the noon service of a restaurant from learning the potential cluster status of this location for the evening service, since this client will not know the new `LTId` used in the evening.

- The `periodDuration` parameter contains the chosen period duration, expressed in number of hours, between 1 and 255 (note that value 0 is invalid), the value 255 being reserved for the special case of an unlimited period duration.
It is part of the information carried in the QR code and it is used by the server.
This `periodDuration` determines the renewal frequency of the location `LTKey` and `LTId`.
It is therefore a key parameter that defines the robustness against attackers who want to monitor the potential cluster status of a set of locations in the long term.

- An appropriate value for the `qrCodeRenewalInterval` parameter (duration after which a QR code is renewed) is chosen, depending on the device specifications and the desired protection against relay attacks. A value of 0 indicates the QR code is never renewed during the period, otherwise `qrCodeRenewalInterval` must be equal to a power of two.
A default value is: `2^^10 = 1024 seconds` (approx. 17 minutes).
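
The configuration constraints listed above can be summarized in a short sketch. It is illustrative only: the `DeviceConfig` class, its field names, and `default_config()` are hypothetical and not part of the specification. The block arithmetic at the end assumes `t_periodStart` is hashed as a 32-bit value.

```python
import secrets
from dataclasses import dataclass

SK_L_SIZE = 51            # bytes (408 bits), as required by the specification
UNLIMITED_PERIOD = 255    # periodDuration value reserved for an unlimited period


@dataclass
class DeviceConfig:
    """Hypothetical container for the per-location settings (names are illustrative)."""
    sk_l: bytes                      # long-term secret, never communicated
    period_duration: int             # hours, 1..255 (255 = unlimited)
    qr_code_renewal_interval: int    # seconds, 0 (never) or a power of two

    def validate(self) -> None:
        if len(self.sk_l) != SK_L_SIZE:
            raise ValueError("SK_L must be 51 bytes long")
        if not 1 <= self.period_duration <= UNLIMITED_PERIOD:
            raise ValueError("periodDuration must be between 1 and 255")
        n = self.qr_code_renewal_interval
        # (n & (n - 1)) == 0 holds exactly for powers of two when n > 0
        if n != 0 and not (1 <= n <= 2 ** 30 and (n & (n - 1)) == 0):
            raise ValueError("qrCodeRenewalInterval must be 0 or a power of two")


def default_config() -> DeviceConfig:
    """Defaults from the text: one period per day, QR code renewed every 2^10 s."""
    return DeviceConfig(secrets.token_bytes(SK_L_SIZE), 24, 2 ** 10)


# SK_L (408 bits) + t_periodStart (32 bits) + padding (72 bits) = one 512-bit block
assert SK_L_SIZE * 8 + 32 + 72 == 512
```

A static QR code would correspond to `DeviceConfig(sk_l, 255, 0)` in this sketch.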


#### Case of a location manager or private event organizer who relies on a web service to generate a static QR code for synchronous scans

It is also possible to use a web service to generate a static QR code.
Here the whole generation process is done within the web browser, thanks to a dedicated JavaScript library: an `SK_L` secret key is generated locally, and the `PK_SA` and potentially `PK_MCTA` public keys are communicated to the web browser.
No state is kept on the web service.
The produced QR code is necessarily static, meaning that `periodDuration = 255` and `qrCodeRenewalInterval = 0`.
With large locations, the manager can easily provide several prints of the same QR code.
In any case, it is recommended to regularly generate and propose a new QR code in order to (slightly) reduce the attack probability and improve the user privacy.


#### Case of static QR codes for asynchronous scans

Static QR codes can be generated by online electronic ticketing systems (e.g., for buses, shared rides, trains, or shows): a QR code is added to the ticket to let the user register her presence.
The QR code is static since a single `LTKey` and `LTId` is generated for the event, and it is necessarily associated to an asynchronous scan since the user can scan it at any time (before, during, or after the event).
There is no anti-replay verification.

For instance, after buying a train ticket, the user will receive a QR code associated to the coach and seat, for that day, with timing information for the trip.


### 3.3- Location Temporary Key (LTKey) and Location Temporary UUID (LTId) generation

#### Step 1: key generation

A key is generated for the location. 
With a dynamic QR code, this is a temporary key which is automatically renewed (by default once a day) at a predefined round hour (e.g., at 4:00 am) which ideally corresponds to a closing time of the location.
For the particular case of a static QR code, this key is in fact never renewed: the location manager needs to go to the website to renew the whole QR code, `SK_L` included.
For the given period, this key is computed as follows:
```
	LTKey(t_periodStart) = SHA256(SK_L | t_periodStart)
```
where `t_periodStart` is the reference timestamp for the beginning of the period, in NTP format (number of seconds since January 1, 1900).
The `t_periodStart` value must match the predefined round hour: it cannot just be the result of a `gettimeofday()` (or similar) converted to an NTP time; rounding to the nearest predefined round hour is necessary.
For instance, 3h59mn48s and 4h00mn31s are both rounded to the same 4h00mn00s `t_periodStart` value, which is also necessarily a multiple of 3600 seconds.
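
The key derivation above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names are hypothetical, `t_periodStart` is assumed to be serialized as a 32-bit big-endian integer, and rounding to the nearest full hour is assumed to coincide with the predefined round hour because the computation runs close to the renewal time.

```python
import hashlib
import struct

HOUR = 3600


def round_to_period_start(t_ntp: int) -> int:
    """Round an NTP timestamp (in seconds) to the nearest round hour."""
    return (t_ntp + HOUR // 2) // HOUR * HOUR


def lt_key(sk_l: bytes, t_period_start: int) -> bytes:
    """LTKey(t_periodStart) = SHA256(SK_L | t_periodStart).

    Assumption: t_periodStart is appended as a 32-bit big-endian integer.
    """
    if t_period_start % HOUR != 0:
        raise ValueError("t_periodStart must be a multiple of 3600 seconds")
    return hashlib.sha256(sk_l + struct.pack(">I", t_period_start)).digest()
```

With this sketch, a clock reading 12 seconds before the round hour and another reading 31 seconds after it both yield the same `t_periodStart`, and therefore the same 32-byte `LTKey`.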


#### Step 2: UUID generation

In order to keep this key secret with respect to the user, the device derives the following UUID from it:
```
	LTId(t_periodStart) = HMAC-SHA-256-128(LTKey(t_periodStart), "1")
```
where HMAC-SHA-256-128 denotes the Keyed-Hash Message Authentication Code (HMAC) in conjunction with the SHA-256 cryptographic hash function and whose output is truncated to 128 bits, as defined in **[RFC4868](https://tools.ietf.org/html/rfc4868)** and **[RFC2104](https://tools.ietf.org/html/rfc2104)**.

Indeed, an attacker who scans a QR code must not learn the `LTKey(t_periodStart)` key, otherwise he could forge a new QR code in place of the device.

If there are several devices, each of them must generate the same `LTKey(t_periodStart)` and `LTId(t_periodStart)` when switching to a new period.
This is guaranteed if the same `t_periodStart` value is generated.
The renewal at a pre-defined well-known full hour, plus a limited drift of the devices (e.g., one or two minutes per year) guarantees this.
Since the devices are not perfectly synchronized (because of different clock drifts across devices), a short inconsistency is possible around the renewal time (i.e., some devices will still display the old QR code and others the new one).
This has no consequence if the location is closed to the public at that moment.
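
The following Python sketch illustrates the UUID derivation and the multi-device property. It rests on stated assumptions: `t_periodStart` is hashed as a 32-bit big-endian integer, the HMAC message `"1"` is the single ASCII character `0x31`, and the helper names (`lt_key`, `lt_id`, `device_lt_id`) are hypothetical.

```python
import hashlib
import hmac
import struct


def lt_key(sk_l: bytes, t_period_start: int) -> bytes:
    # Assumption: t_periodStart is hashed as a 32-bit big-endian integer.
    return hashlib.sha256(sk_l + struct.pack(">I", t_period_start)).digest()


def lt_id(ltkey: bytes) -> bytes:
    """LTId = HMAC-SHA-256-128(LTKey, "1"): HMAC-SHA-256 truncated to 16 bytes.

    Assumption: the message "1" is the single ASCII character 0x31.
    """
    return hmac.new(ltkey, b"1", hashlib.sha256).digest()[:16]


def device_lt_id(sk_l: bytes, device_clock: int) -> bytes:
    """What one device computes at renewal time from its own (drifting) clock."""
    t_period_start = (device_clock + 1800) // 3600 * 3600  # nearest round hour
    return lt_id(lt_key(sk_l, t_period_start))
```

Two devices sharing the same `SK_L`, one running 45 seconds late and the other 70 seconds early, obtain the same `LTId` in this sketch because both clocks round to the same `t_periodStart`.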


### 3.4- QR code content 

#### High level view of the QR code content

The QR code of a location contains a URL ("deep link") structured as follows:
```
	"country-specific-prefix" "Base64url(location-specific-part)"
```
The country specific prefix is: `https://tac.gouv.fr?v=0#` in the case of France, where `v=0` indicates protocol version 0 of CLEA, and the `#` character prevents the text that follows (namely the Base64url encoding of the location specific part) from being sent to the `tac.gouv.fr` server if the application is not already installed on the user terminal.

The remainder of this section defines the location specific part.
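
As an illustration of the deep link structure, the sketch below builds and parses such a URL in Python. The function names are hypothetical, and stripping the trailing `=` padding of the Base64url encoding is an assumption made here to save QR code capacity.

```python
import base64

PREFIX = "https://tac.gouv.fr?v=0#"  # country-specific prefix for France


def build_deep_link(lsp: bytes) -> str:
    """Concatenate the prefix and the Base64url-encoded location-specific part."""
    # Assumption: trailing "=" padding is stripped.
    encoded = base64.urlsafe_b64encode(lsp).rstrip(b"=").decode("ascii")
    return PREFIX + encoded


def parse_deep_link(url: str) -> bytes:
    """Recover the binary location-specific part from a scanned deep link."""
    if not url.startswith(PREFIX):
        raise ValueError("unexpected prefix")
    encoded = url[len(PREFIX):]
    encoded += "=" * (-len(encoded) % 4)  # restore the padding the decoder expects
    return base64.urlsafe_b64decode(encoded)
```

For a 110-byte location-specific part, `build_deep_link()` returns a 171-character URL (24 characters of prefix plus 147 characters of Base64url text).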

With a dynamic QR code, this QR code is renewed when switching from one period to another (change of `LTKey` and `LTId`), but also periodically during the period.
This renewal during the period happens every `qrCodeRenewalInterval` seconds.
It is a trade-off between the computation and battery autonomy constraints of the device on the one hand (the higher the `qrCodeRenewalInterval` value, the better), and the desired protection against relay attacks on the other hand (the lower, the better).
Note that if there are several devices, an asynchronism between them during the renewal of the QR code does not pose any problem: although the `t_qrStart` values may differ, the `LTId(t_periodStart)` will remain identical.
With a static QR code, the `LTKey` and `LTId` are kept unchanged for an undefined duration.


With the current specification for CLEA protocol version 0, two `location-specific-part` (LSP) types are defined:

- `LSPtype = 0`: for a QR code compatible with a synchronous scan (i.e., when entering a location).
	The QR code may either be static or dynamic, and in the latter case associated to a freshness check.
	The check-in timestamp is the time when the user scans this QR code, called `t_qrScan` hereafter.

More precisely, it is structured as follows (high-level view):
```
	LSP(t_periodStart, t_qrStart) = [ version | LSPtype = 0 | reserved1 | LTId(t_periodStart)
		| Enc(PK_SA, msg) ]
```
where:
```
	msg = [ staff | locContactMsgPresent | reserved2 | CRIexp | venueType | venueCategory1
		| venueCategory2 | periodDuration | ct_periodStart | t_qrStart | LTKey(t_periodStart)
		| Enc(PK_MCTA, locContactMsg) if locContactMsgPresent==1 ]
```

- `LSPtype = 1`: for a QR code compatible with an asynchronous scan (i.e., before, during, or after visiting a location).
	The QR code is necessarily static (i.e., the `LTKey` and `LTId` remain constant over the whole period), `qrCodeRenewalInterval` is necessarily equal to 0 (i.e., there is no renewal), and there is no freshness check.
	This QR code corresponds to a unique event, that takes place at a well defined time.
	The check-in timestamp is the one provided in the clear-text part of the LSP, `t_event`, and not the timestamp when scanning the QR code (which may happen several days before or after the visit).
	Sometimes, the check-in time may not exactly correspond to the reality (e.g., case of a delayed train), but this is not an issue since all users will use the same time.
	When meaningful (e.g., a train trip), duration information is also provided in the clear-text part of the LSP; otherwise the duration is that of the full event and is not specified.

More precisely, it is structured as follows (high-level view):
```
	LSP(t_periodStart, t_qrStart) = [ version | LSPtype = 1 | reserved1 | visitDuration | t_event  
		| LTId(t_periodStart) | Enc(PK_SA, msg) ]
```
with the same definitions for `msg`.


The various fields are described below.
The `Enc(PK_MCTA, locContactMsg)` is defined in section ["A user tested COVID+ has used the CLEA system"](#a-user-tested-covid-has-used-the-cléa-system).


#### Binary format of the location-specific-part for LSPtype = 0 (synchronous scan)

The following binary format must be used when `LSPtype = 0`:
```
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ver |t = 0|res|        ...                                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       ...                                                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       ...            LTId (16 bytes)                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       ...                                                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       ...                                                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       ...     |    Enc(PK_SA, msg) (variable size...)         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
.                                                               .
.                                                               .
.                                                               .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
```
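
The type 0 layout above can be serialized with a few lines of Python. This is only a sketch: `pack_lsp_type0` is a hypothetical helper, and `enc_msg` stands for the already encrypted `Enc(PK_SA, msg)` bytes, which are treated as opaque here.

```python
def pack_lsp_type0(version: int, lt_id: bytes, enc_msg: bytes) -> bytes:
    """Serialize [ version | LSPtype = 0 | reserved1 | LTId | Enc(PK_SA, msg) ]."""
    if not 0 <= version < 8:
        raise ValueError("version is a 3-bit field")
    if len(lt_id) != 16:
        raise ValueError("LTId must be 16 bytes long")
    lsp_type, reserved1 = 0, 0
    # First byte, most significant bit first: ver (3 bits) | type (3 bits) | res (2 bits)
    first_byte = (version << 5) | (lsp_type << 2) | reserved1
    return bytes([first_byte]) + lt_id + enc_msg
```

The cleartext part produced this way is 17 bytes long (1 header byte plus the 16-byte `LTId`), followed by the variable-size encrypted message.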


#### Binary format of the location-specific-part for LSPtype = 1 (asynchronous scan)

The following binary format must be used when `LSPtype = 1`:
```
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ver |t = 1|res| visitDuration |       t_event (4 bytes)       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       ...                     |                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       ...                                                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       ...            LTId (16 bytes)                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       ...                                                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       ...                                                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       ...                     |       Enc(PK_SA, msg)         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
.                    (variable size...)                         .
.                                                               .
.                                                               .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
```
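
Conversely, a receiver can split a type 1 location-specific part into its cleartext fields. The Python sketch below follows the layout above; the `LspType1` tuple and `parse_lsp_type1` function are hypothetical names, and the encrypted message is left opaque.

```python
import struct
from typing import NamedTuple


class LspType1(NamedTuple):
    version: int
    visit_duration: int   # hours, 0 when unknown or not meaningful
    t_event: int          # NTP timestamp (32-bit seconds field)
    lt_id: bytes
    enc_msg: bytes        # Enc(PK_SA, msg), left opaque here


def parse_lsp_type1(lsp: bytes) -> LspType1:
    """Split a type 1 location-specific part into its cleartext fields."""
    if len(lsp) < 22:
        raise ValueError("a type 1 LSP has a 22-byte cleartext part")
    # Big-endian: 1 header byte, 1 byte of visitDuration, 4 bytes of t_event
    first_byte, visit_duration, t_event = struct.unpack_from(">BBI", lsp, 0)
    version = first_byte >> 5
    lsp_type = (first_byte >> 2) & 0b111
    if lsp_type != 1:
        raise ValueError("not a type 1 LSP")
    return LspType1(version, visit_duration, t_event, lsp[6:22], lsp[22:])
```

The check-in time used by the application is then `t_event`, not the time at which the QR code was scanned.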


#### Binary format of the msg

Regardless of the `LSPtype`, the following binary format for the `msg` message must be used:
```
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|S|C|       reserved2       | CRIexp  |  vType  | vCat1 | vCat2 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|periodDuration |           ct_periodStart (3 bytes)            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    t_qrStart (4 bytes)                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     LTKey (32 bytes)                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       ...                                                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       ...                                                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       ...                                                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       ...                                                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       ...                                                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       ...                                                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       ...                                                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|        Enc(PK_MCTA, locContactMsg) (65 bytes, if C==1)        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
.                                                               .
.                                                               .
.                                                               .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       ...     |
+-+-+-+-+-+-+-+-+
```
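
The bit layout of the `msg` plaintext can be illustrated with the Python sketch below. `pack_msg` and its parameter names are hypothetical; the optional `enc_loc_contact_msg` argument stands for the already encrypted 65-byte `Enc(PK_MCTA, locContactMsg)` block, and the result is the plaintext before encryption with `PK_SA`.

```python
import struct


def pack_msg(staff: int, cri_exp: int, venue_type: int, venue_cat1: int,
             venue_cat2: int, period_duration: int, ct_period_start: int,
             t_qr_start: int, lt_key: bytes, enc_loc_contact_msg: bytes = b"") -> bytes:
    """Serialize the plaintext msg (big-endian), before encryption with PK_SA."""
    if len(lt_key) != 32:
        raise ValueError("LTKey must be 32 bytes long")
    present = 1 if enc_loc_contact_msg else 0   # locContactMsgPresent (C bit)
    # Word 1: S (1) | C (1) | reserved2 (12, zero) | CRIexp (5) | vType (5) | vCat1 (4) | vCat2 (4)
    word1 = ((staff & 0x1) << 31 | present << 30 | (cri_exp & 0x1F) << 13
             | (venue_type & 0x1F) << 8 | (venue_cat1 & 0xF) << 4 | (venue_cat2 & 0xF))
    # Word 2: periodDuration (8 bits) | ct_periodStart (24 bits)
    word2 = (period_duration & 0xFF) << 24 | (ct_period_start & 0xFFFFFF)
    return struct.pack(">III", word1, word2, t_qr_start) + lt_key + enc_loc_contact_msg
```

Without a `locContactMsg`, the plaintext is 44 bytes long (three 32-bit words plus the 32-byte `LTKey`).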


#### Field descriptions

Big-endian byte order (also called network byte order in the Internet protocol suite), which transmits the most significant byte first, must be used whenever meaningful.
The location specific part contains (in plaintext or encrypted) the following fields, in this order:

- `version` (3 bits) (`ver` in figure):
this is the protocol version number, in order to enable an evolution of the protocol. The present specification corresponds to protocol version 0.

- `LSPtype` (3 bits) (`t=0` or `t=1` in figure):
this is the LSP type:
	- `LSPtype = 0`:
	this type indicates a QR code compatible with a synchronous scan (i.e., when entering a location).

	- `LSPtype = 1`:
	this type indicates a QR code compatible with an asynchronous scan (i.e., before, during, or after visiting a location).

- `reserved1` (2 bits) (`res` in figure):
this field is unused in the current specification and must be set to zero.

- `visitDuration` (1 byte):
Restricted to `LSPtype = 1`, this is the expected duration of the stay in the location, expressed in number of hours.
This field is not necessarily meaningful or known when the QR code is generated.
In that case it must contain value 0.

- `t_event` (4 bytes):
Restricted to `LSPtype = 1`, this is the time when the user is expected to enter the location.
	This time is different from the timestamp when scanning this QR code, which is not used with that type of QR code.
	This event time may not exactly correspond to the reality (e.g., a delayed train), but this is not an issue as all users will use the same theoretical time.

- `LTId` (16 bytes, or 128 bits): 
this field carries the location temporary UUID for the period.

- `staff` (1 bit) (`S` in figure):
this field, when equal to 0, indicates a regular QR code, for regular users, and when equal to 1, indicates a QR code specific to a staff member of the location.

- `locContactMsgPresent` (1 bit) (`C` in figure):
this field, when equal to 1, indicates the `locContactMsg` is used and present in the QR code, and when equal to 0, indicates it is absent.
It follows that the `msg` size can vary significantly, depending on whether a `locContactMsg` is used.

- `reserved2` (12 bits):
this field is unused in the current specification and must be set to zero.

- `CRIexp` (5 bits):
this field communicates the `qrCodeRenewalInterval` value in a compact manner, as the exponent of a power of two.
If this field contains the value `0x1F` (maximum value for a 5-bit field), `qrCodeRenewalInterval` must be set to `0` in order to indicate the QR code will not be renewed during the whole period (no QR code renewal).
Otherwise, `qrCodeRenewalInterval` must be set to the value `2^^CRIexp` seconds.
It follows that:
```
	qrCodeRenewalInterval = (CRIexp == 0x1F) ? 0 : 2^^CRIexp; // value in seconds
```
It means the QR code renewal will happen after an interval between `1` and `2^^30` seconds inclusive, or never (if `qrCodeRenewalInterval == 0`).
Of course, a new QR code must be generated at the start of a new period (because the `LTKey` and `LTId` fields change) even if the current `qrCodeRenewalInterval` has not elapsed.

- `venueType` (5 bits) (`vType` in figure):
this field specifies the type of the location/venue (e.g., a restaurant).
The encoding is country specific (for instance, for France, it can be mapped to the [Types d'ERP](https://www.service-public.fr/professionnels-entreprises/vosdroits/F32351) classification).

- `venueCategory1` (4 bits) (`vCat1` in figure):
this field specifies a first level of venue category. This is an opaque field whose semantics are out of the scope of the present document.

- `venueCategory2` (4 bits) (`vCat2` in figure):
this field specifies a second level of venue category. This is an opaque field whose semantics are out of the scope of the present document.

- `periodDuration` (1 byte):
this field contains the duration, in terms of number of hours, of the period.
Since this period duration is location dependent, this information needs to be communicated to the server.
The value 255 is reserved to the special case of an unlimited period duration.
The default value is 24 hours (a period per day), but different values may be defined up to a maximum of 254 hours (i.e., 10 days and 14 hours).

- `ct_periodStart` (24 bits):
this field contains the starting time of the period in a compressed manner, obtained by dividing `t_periodStart` by 3600. The result is guaranteed to be an integer since a period necessarily starts at a round hour (i.e., `t_periodStart` is a multiple of 3600).
It follows that:
```
	t_periodStart = ct_periodStart * 3600;
```

- `t_qrStart` (32 bits):
this field contains the starting time of the QR code validity timespan, expressed as the number of seconds since January 1st, 1900 (NTP timestamp limited to the 32-bit seconds field).

- `LTKey` (32 bytes, or 256 bits):
this field carries the location temporary key for the period.
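
As an illustration of how a receiver might interpret the compact fields described above, the Python sketch below expands `CRIexp`, `ct_periodStart`, and `periodDuration`. The function names are hypothetical, and the notion of a period end time (start plus duration) is an assumption made here for the example.

```python
UNLIMITED_PERIOD = 255   # periodDuration value reserved for an unlimited period
NO_RENEWAL = 0x1F        # CRIexp value meaning the QR code is never renewed


def qr_code_renewal_interval(cri_exp: int) -> int:
    """Return qrCodeRenewalInterval in seconds (0 means no renewal)."""
    if not 0 <= cri_exp <= 0x1F:
        raise ValueError("CRIexp is a 5-bit field")
    return 0 if cri_exp == NO_RENEWAL else 2 ** cri_exp


def period_bounds(ct_period_start: int, period_duration: int):
    """Return (start, end) of the period in NTP seconds; end is None if unlimited."""
    if not 1 <= period_duration <= 255:
        raise ValueError("periodDuration must be between 1 and 255")
    t_period_start = ct_period_start * 3600
    if period_duration == UNLIMITED_PERIOD:
        return t_period_start, None
    return t_period_start, t_period_start + period_duration * 3600
```

For instance, `qr_code_renewal_interval(10)` gives the default 1024 seconds, and a static QR code (`CRIexp = 0x1F`, `periodDuration = 255`) yields an interval of 0 and no period end.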


#### Encryption and integrity protection

The `msg` message must be encrypted using the ECIES-KEM **[ISO18033-2] [Shoup2006] [Libecc]** hybrid encryption scheme that provides both confidentiality, using an asymmetric encryption scheme, and integrity verification.
More precisely, this scheme uses SECP256R1 ECDH as KEM, KDF1 with the SHA-256 hash as KDF, and AES-256-GCM with a fixed 96-bit IV as DEM and TAG.
To the original `msg` message, the hybrid ECIES-KEM scheme appends a block of 49 bytes that contains both a tag and an ephemeral public key.
A detailed description is given in [Appendix A](#a-description-of-the-hybrid-encryption-scheme-and-the-enc-and-dec-functions).

While only the `msg` message is encrypted, the integrity protection encompasses the whole `LSP` message, including the cleartext part of the LSP: any accidental or malicious modification is therefore automatically detected.


#### Size of the various QR codes

A Level 12 65x65 QR code Type 2 (see [Section 2.4](#24-technical-requirements)) has a limited capacity, 287 binary characters for redundancy level M or 203 for redundancy level Q. 
It is therefore essential that the full "deep link" complies with these limits.
The country specific prefix, namely `https://tac.gouv.fr?v=0#` in case of France, requires 24 characters.

The location specific part size depends on:

- the LSP type, since type 1 adds 5 more bytes (this value is before the Base64url encoding);

- the optional presence of the `locContactMsg` message (when `locContactMsgPresent == 1`), which consists of 16 bytes of cleartext (65 bytes once encrypted).

It should also be noted that:

- the hybrid ECIES-KEM encryption adds a 49-byte overhead;

- the Base64url encoding increases the size by a factor `1.33` approximately.

The following table summarizes the situation.


| Name                                                         | size with LSP Type 0 | size with LSP Type 1 |
|--------------------------------------------------------------|----------------------|----------------------|
| `https://tac.gouv.fr?v=0#` prefix (characters)               | 24 chars             | 24 chars             |
| Plain text part of the LSP (bytes)                           | 17 bytes             | 22 bytes             |
| msg part of the LSP, without `locContactMsg` (bytes)         | 93 bytes             | 93 bytes             |
| `locContactMsg` size (bytes)                                 | 65 bytes             | 65 bytes             |
| size with `locContactMsg` before Base64url encoding (bytes)  | 175 bytes            | 180 bytes            |
| size with `locContactMsg` after Base64url encoding (chars)   | 234 chars            | 240 chars            |
| **_total URL size with `locContactMsg` (characters)_**       | **_258 chars_**      | **_264 chars_**      |
| size w/o `locContactMsg` before Base64url encoding (bytes)   | 110 bytes            | 115 bytes            |
| size w/o `locContactMsg` after Base64url encoding (chars)    | 147 chars            | 154 chars            |
| **_total URL size w/o `locContactMsg` (chars)_**             | **_171 chars_**      | **_178 chars_**      |
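
The figures in this table can be re-derived with a short computation. The Python sketch below is illustrative: the constant and function names are hypothetical, and the Base64url encoding is assumed to be unpadded.

```python
PREFIX_LEN = len("https://tac.gouv.fr?v=0#")   # 24 characters
ECIES_OVERHEAD = 49                             # bytes added by each Enc() call
MSG_PLAINTEXT = 4 + 1 + 3 + 4 + 32              # flags word, periodDuration, ct_periodStart, t_qrStart, LTKey
LOC_CONTACT_MSG = 16 + ECIES_OVERHEAD           # 65 bytes once encrypted with PK_MCTA
CLEARTEXT = {0: 1 + 16, 1: 1 + 1 + 4 + 16}      # cleartext LSP part, per LSPtype


def base64url_len(n_bytes: int) -> int:
    """Length of the unpadded Base64url encoding of n_bytes bytes (ceil(4n/3))."""
    return (n_bytes * 4 + 2) // 3


def url_size(lsp_type: int, with_loc_contact_msg: bool) -> int:
    """Total deep link size, in characters."""
    size = CLEARTEXT[lsp_type] + MSG_PLAINTEXT + ECIES_OVERHEAD
    if with_loc_contact_msg:
        size += LOC_CONTACT_MSG
    return PREFIX_LEN + base64url_len(size)
```

Under these figures, all four variants stay below the 287-character capacity of redundancy level M, while only the variants without `locContactMsg` (171 and 178 characters) also stay below the 203-character capacity of redundancy level Q.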


### 3.5- Synchronous scan of a QR code when a client enters a location (LSP Type 0)

A client entering a location scans the QR code, and the CLEA application adds the following tuple to its local list, `localList`, which records the visited locations:
```
	{QR_code, t_qrScan}
```
where `t_qrScan` is the scan time measured by the CLEA application, in NTP format (32-bit seconds field).
Entries in the local list are automatically removed after 14 days.
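
A minimal Python sketch of this local list is given below. The `Entry` class and the helper functions are hypothetical names chosen for the example; the only external fact used is the 2,208,988,800-second offset between the NTP epoch (1900) and the Unix epoch (1970).

```python
from dataclasses import dataclass
from typing import List

RETENTION = 14 * 24 * 3600        # entries are kept for 14 days (in seconds)
NTP_UNIX_OFFSET = 2_208_988_800   # seconds between 1900-01-01 and 1970-01-01


@dataclass
class Entry:
    qr_code: str     # scanned deep link
    t_qr_scan: int   # NTP timestamp (seconds) of the scan


def unix_to_ntp(unix_seconds: float) -> int:
    """Convert a Unix time (e.g., time.time()) to the 32-bit NTP seconds field."""
    return (int(unix_seconds) + NTP_UNIX_OFFSET) & 0xFFFFFFFF


def record_scan(local_list: List[Entry], qr_code: str, now_ntp: int) -> None:
    """Append {QR_code, t_qrScan} to the local list."""
    local_list.append(Entry(qr_code, now_ntp))


def purge_expired(local_list: List[Entry], now_ntp: int) -> List[Entry]:
    """Return the list without the entries scanned more than 14 days ago."""
    return [e for e in local_list if now_ntp - e.t_qr_scan <= RETENTION]
```

An application would typically call `purge_expired()` before each periodic risk analysis, so that only scans from the last 14 days are considered.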


#### Detection of duplicated scans by the CLEA application

Before adding `{QR_code, t_qrScan}` in the local list, the CLEA application checks that an entry with the same `LTId` is not already there, with a scanning time "close" to `t_qrScan`:
```C
	// assume a previous entry already exists for the same LTId, with a scanning time t_scan0
	if (abs(t_qrScan - t_scan0) > dupScanThreshold) {
		// record the new entry, scanned sufficiently later after the previous scan for this LTId
	} else {
		// reject the new entry as duplicated
	}
```
where `dupScanThreshold` is the time tolerance of the duplicated scan mechanism: for a given `LTId`, at most one QR code can be recorded in the `localList` every `dupScanThreshold` seconds.

This verification is intended to avoid disrupting the "cluster" qualification mechanism by artificially increasing the number of reports for a given time slot and location, which may be accidental (a client unintentionally scans the QR code twice) or deliberate (a malicious client who knows he will likely be tested COVID+).
From this point of view, a large value for `dupScanThreshold` is preferable.
However, a regular client of a restaurant that does not distinguish between the noon and evening services (e.g., it remains continuously open from 11:00am to midnight) will need to scan and register in the `localList` two QR codes with the same `LTId` on that day, one for lunch and another one for dinner.
Choosing an appropriate value of closeness for this check is therefore key.
By default, a value of 3 hours is used:
```
	dupScanThreshold = 3 * 3600;
```
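
The check above can be sketched as a complete function that walks the local list. This is only an illustration: the `local_entry_t` type, the `is_duplicate_scan()` and `abs_diff_u32()` names, and the in-memory array representation are assumptions of this sketch, not part of the specification. The `LTId` is taken to be a 128-bit value, and the time difference is computed without relying on `abs()` since the timestamps are unsigned.

```C
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LTID_LEN            16              /* LTId is a 128-bit value */
#define DUP_SCAN_THRESHOLD  (3u * 3600u)    /* default value, in seconds */

/* Hypothetical in-memory view of a localList entry (Type 0). */
typedef struct {
	uint8_t   ltid[LTID_LEN];   /* LTId found in the cleartext part of the QR code */
	uint32_t  t_qrScan;         /* scan time, NTP seconds */
} local_entry_t;

/* Absolute difference of two unsigned timestamps, without signed overflow. */
static uint32_t abs_diff_u32(uint32_t a, uint32_t b)
{
	return (a > b) ? (a - b) : (b - a);
}

/* Returns true if an entry with the same LTId and a "close" scan time exists. */
bool is_duplicate_scan(const local_entry_t *list, size_t n,
                       const uint8_t ltid[LTID_LEN], uint32_t t_qrScan)
{
	for (size_t i = 0; i < n; i++) {
		if (memcmp(list[i].ltid, ltid, LTID_LEN) == 0 &&
		    abs_diff_u32(list[i].t_qrScan, t_qrScan) <= DUP_SCAN_THRESHOLD)
			return true;    /* reject the new entry as duplicated */
	}
	return false;                   /* the new entry can be recorded */
}
```

With the default threshold, a second scan of the same location one hour after the first one is rejected, whereas a scan more than three hours later is recorded.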

Note that this is not an absolute protection as an attacker using a malicious application could easily bypass this check.
Note also that having a `dupScanThreshold` value that depends on the location specificities (e.g., the expected duration during which a client is supposed to stay in a location) is not feasible since this piece of information is in the encrypted part of the QR code and is not accessible to the CLEA application.


#### Reliability of the t_qrScan timestamp

The replay protection is limited by the availability of a trustworthy `t_qrScan` timestamp, which guarantees that the local terminal clock has not been maliciously modified to match that of the replayed QR code.

Although nothing can prevent a malicious application from storing a specially crafted timestamp, the official application should provide a trustworthy internal clock to be used for this purpose.
The accuracy of this trustworthy clock needs to be in line with the `qrCodeRenewalInterval`.
With an interval of `2^10 = 1024` seconds, the accuracy requirement is fairly low.
The CLEA application benefits from such an internal trustworthy clock, making it relatively robust against such a relay attack.


### 3.6- Asynchronous scan of a QR code (LSP Type 1)

A user who receives a QR code of LSP Type 1 can use the CLEA application to scan it, at her own discretion, before, during or after the event.
Similarly to LSP Type 0, the CLEA application will add the following tuple to its local list:
```
	{QR_code, t_event}
```
where `t_event` is the timestamp in NTP format (32-bit seconds field) contained in the cleartext part of the LSP.
Note that the scanning time is meaningless for these QR codes and is not recorded.
The `t_event` information is also redundant with that contained in the QR code, but is added here in order to enable a uniform processing of the LSP Types 0 and 1.
Entries in the local list are automatically removed after 14 days, a delay that is measured with respect to the `t_event` date.
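
The 14-day retention rule, which applies uniformly to both LSP types, can be sketched as follows. The `local_entry_t` type and the `purge_expired()` function are hypothetical names introduced for this illustration, and the choice to keep entries whose timestamp lies in the future (a Type 1 QR code scanned before the event) is an assumption of this sketch.

```C
#include <stddef.h>
#include <stdint.h>

#define RETENTION_SECONDS  (14u * 86400u)   /* 14 days, in seconds */

/* Hypothetical localList entry: only the timestamp matters for the purge. */
typedef struct {
	uint32_t  t_checkin;    /* t_qrScan (Type 0) or t_event (Type 1), NTP seconds */
	/* the QR code itself is omitted from this sketch */
} local_entry_t;

/*
 * Removes the entries older than 14 days with respect to `now` (NTP seconds),
 * compacting the array in place. Returns the number of remaining entries.
 */
size_t purge_expired(local_entry_t *list, size_t n, uint32_t now)
{
	size_t kept = 0;
	for (size_t i = 0; i < n; i++) {
		/* keep future events (assumption) and entries that are recent enough */
		if (list[i].t_checkin > now ||
		    now - list[i].t_checkin <= RETENTION_SECONDS)
			list[kept++] = list[i];
	}
	return kept;
}
```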


#### Detection of duplicated scans by the CLEA application

Before adding `{QR_code, t_event}` in the local list, the CLEA application checks that an entry with the same `LTId` is not already there.
Since LSP Type 1 QR codes correspond to unique events, there can be only a single entry for a given `LTId` at any time and any duplicate is systematically removed.
This is a difference with respect to LSP Type 0 QR codes where a client can visit the same location several times.


### 3.7- Upload of the location history by a client tested COVID+ and cluster detection on the server

Let us assume the user has been tested COVID+.
In that case, her CLEA application asks for her explicit informed consent to upload her location history.
If the user explicitly agrees, the following operations take place.


#### Processing of the user location history by the frontend server

The user application uploads to the server, within a TLS connection, the location history stored in its local list, `localList`, along with the associated authorisation, meant to prove the user has indeed been tested COVID+.
The details of this authorisation mechanism are out of scope of the present document.

The location history consists of a set of records of the form:
```
	{QR_code_0, t_checkin_0}, {QR_code_1, t_checkin_1}, {QR_code_2, t_checkin_2}...
```
where the `t_checkin` is either a scanning timestamp (in case of LSP Type 0) or the event timestamp (in case of LSP Type 1).

By design, this history is limited to 14 days.
It could be further restricted, or the uploaded data could include additional information.
For instance, if the goal is to do forward tracing, and if the user experienced symptoms starting from a known date, it could be helpful to take advantage of the start of the "infectious period" (i.e., when the user could contaminate others) to remove records prior to that date.
Conversely, if the goal is to do backward tracing, it could be helpful to distinguish between the "infectious period" (when the user could contaminate others) and "infected period" (when the user has potentially been contaminated), when known.
The details of what to do exactly, as they depend on the Health Authority decisions, are out of scope of this specification.

The frontend of the server:

- first of all verifies the COVID+ status of the user and discards an invalid upload from a user who does not show a valid authorisation.

- then it checks that this history does not contain duplicated scans, using the same methodology as before, namely by checking that any two entries with the same `LTId` satisfy: `(abs(t_qrScan - t_qrScan0) > dupScanThreshold)`.
If any duplicated scan is identified (i.e., the test is false for a pair of entries), it is recommended to discard the whole history as coming from an invalid application.
This verification is meant to protect the server against malicious applications that could try to bypass the local duplicated scan check.

- the frontend then sanitizes the message (e.g., by removing the source IP address).

- finally the frontend mixes each entry from this user with that of other COVID+ uploads in order to minimize privacy risks.
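
The duplicated scan verification performed by the frontend can be sketched as a pairwise comparison over the uploaded history. The `upload_record_t` type and the `history_has_duplicates()` function are hypothetical names, and the sketch assumes the `LTId` of each record has already been extracted from the cleartext part of the QR code. A quadratic loop is acceptable here since a 14-day history is expected to remain small.

```C
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LTID_LEN            16              /* LTId is a 128-bit value */
#define DUP_SCAN_THRESHOLD  (3u * 3600u)    /* same default as in the application */

/* Hypothetical view of an uploaded {QR_code, t_checkin} record. */
typedef struct {
	uint8_t   ltid[LTID_LEN];
	uint32_t  t_checkin;    /* NTP seconds */
} upload_record_t;

/* Returns true if two records share an LTId and have "close" timestamps. */
bool history_has_duplicates(const upload_record_t *r, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		for (size_t j = i + 1; j < n; j++) {
			uint32_t a = r[i].t_checkin, b = r[j].t_checkin;
			uint32_t diff = (a > b) ? (a - b) : (b - a);
			if (memcmp(r[i].ltid, r[j].ltid, LTID_LEN) == 0 &&
			    diff <= DUP_SCAN_THRESHOLD)
				return true;    /* the whole history should be discarded */
		}
	}
	return false;
}
```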


#### Processing of a user location record by the backend server

When receiving a given `{QR_code, t_checkin}` tuple (they are processed independently from one another as a result of the frontend mixnet), the backend server:


**_- Step 1:_** decrypts the `msg` part of the QR code, using its `SK_SA` secret key, and checks the message integrity.
In case of problem, the server rejects the tuple.


**_- Step 2:_** With an LSP Type 0 QR code (synchronous scan), if `qrCodeRenewalInterval > 0`, a freshness check is performed for this tuple in order to limit relay attacks.
In that case, the `t_checkin` of this tuple is the QR code scan timestamp, `t_qrScan`.
So, if `t_qrScan` (generated by the CLEA application during the scan) and `t_qrStart` (generated by the device and protected from malicious modifications by being in the encrypted part of the QR code) are "too different", the server rejects the tuple.
The tolerance depends on the `qrCodeRenewalInterval` value, on the possible drift of the device clock (e.g., one or two minutes per year), and on the accuracy of the CLEA application clock on the user terminal.
For instance it checks that: 
```
	| t_qrScan - t_qrStart | < qrCodeRenewalInterval + 300 sec + 300 sec
```
in order to take into account the possibility of scanning the code just before its renewal, including a maximum drift of 5 min for this device compared to the official time, and also a maximum drift of 5 min for the CLEA clock.
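
A minimal sketch of this freshness check is given below. The `is_fresh()` function name and the two drift constants are introduced for the illustration only, and the sketch assumes that `qrCodeRenewalInterval` has already been converted to seconds.

```C
#include <stdbool.h>
#include <stdint.h>

#define DEVICE_DRIFT_SEC  300u   /* maximum drift assumed for the device clock */
#define APP_DRIFT_SEC     300u   /* maximum drift assumed for the CLEA application clock */

/*
 * Returns true if the tuple passes the freshness check of Step 2.
 * All values are in seconds; timestamps use the NTP format.
 */
bool is_fresh(uint32_t t_qrScan, uint32_t t_qrStart, uint32_t qrCodeRenewalInterval)
{
	if (qrCodeRenewalInterval == 0)
		return true;    /* static QR code: no freshness check */

	uint32_t diff = (t_qrScan > t_qrStart) ? (t_qrScan - t_qrStart)
	                                       : (t_qrStart - t_qrScan);
	/* 64-bit arithmetic avoids any overflow when adding the tolerances */
	uint64_t tolerance = (uint64_t)qrCodeRenewalInterval
	                   + DEVICE_DRIFT_SEC + APP_DRIFT_SEC;
	return diff < tolerance;
}
```

With a renewal interval of 1024 seconds, the tolerance is 1624 seconds: a scan timestamp that differs from `t_qrStart` by 1000 seconds is accepted, whereas a difference of 1624 seconds or more leads to the rejection of the tuple.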

This verification is intended to limit (without being able to totally prevent them) relay attacks where the attacker scans a QR code from a target location and communicates it to several supposed patients in order to create a fake cluster afterwards. The attack is thus limited in time to the defined tolerance.

If `qrCodeRenewalInterval == 0` or with an LSP Type 1 QR code (asynchronous scan), there is no freshness check, the QR code being static during the whole period (e.g., a day).


**_- Step 3:_** computes from the information stored in the encrypted `msg`, `HMAC-SHA-256-128(LTKey(t_periodStart), "1")`, and compares its value to the `LTId` value retrieved from the unencrypted part of the QR code. If the two values differ, the server rejects the tuple.


**_- Step 4:_** depending on the location category, the server determines the corresponding exposure time (e.g., 3 hours of presence in the case of a restaurant to take into account the fact that periods of presence are rounded up to the hour).
In the case of a QR code of type "staff" (indicated by the flag `staff` set to 1), the exposure time is to be considered from `t_qrScan` and until the end of the period.


**_- Step 5:_** The server stores the exposure as follows.
If `LTId` has already been flagged as exposed, it retrieves the associated context, otherwise it creates a new context for `LTId`, for example in a record of the following type:
```C
	typedef struct {
	        LTId_t          LTId;                   // for that LTId (a hashtable can be used)
	        uint32_t        t_periodStart;          // start time (in seconds) of the period
	        uint8_t         periodDuration;         // duration in terms of hours, also the number
	                                                // of entries in the hourlyExposureCount table
							// if lower than 255.
	        uint8_t         hourlyExposureCount[];  // number of COVID+ users per hour
	} LTId_exposure_context;
```
where the first three fields are initialized from the corresponding fields in the QR code (after conversion for `t_periodStart`).
The `hourlyExposureCount[]` table should be large enough to encompass the whole duration of a very long event, when `periodDuration` is equal to 255 (possibly by using a different data structure, a list instead of a table). The technical details are out of scope of the present document.

Then, if the exposure is 3 hours (previous example of the restaurant), the server calculates the index of the first hour of exposure:
```C
	h1 = floor((t_qrScan - t_periodStart) / 3600);
```
and increments the three hourly exposure counters of `LTId` (assuming `e` is a pointer to the appropriate context entry of type `LTId_exposure_context`):
```C
	for (uint32_t i = 0; i < 3; i++) {
	        uint32_t        h  = h1 + i;
	        if (h >= e->periodDuration)
	                break;       // beyond the period end, stop immediately
	        if (e->hourlyExposureCount[h] < 255)
	                e->hourlyExposureCount[h]++;
	}
```
It is important to verify that any index is within 0 and `e->periodDuration - 1` (inclusive) before updating any counter, since the 3 exposure hours (previous example) may extend beyond the closure of the location.
The above code avoids wrapping to zero when a counter already reached its maximum value, 255 (counting above 255 is of course possible after changing the data type).
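
The two fragments above can be combined into a single self-contained function, sketched below for an arbitrary number of exposure hours. The `exposure_context_t` type is a simplified, hypothetical variant of `LTId_exposure_context` that uses a fixed-size table, and the `record_exposure()` name is introduced for this illustration. The sketch additionally rejects a check-in time that falls before the start of the period or after its end, which is an assumption on how such tuples should be handled.

```C
#include <stdbool.h>
#include <stdint.h>

#define MAX_HOURS  255   /* fixed-size table, a simplification of this sketch */

typedef struct {
	uint32_t  t_periodStart;                  /* start of the period, in seconds */
	uint8_t   periodDuration;                 /* duration of the period, in hours */
	uint8_t   hourlyExposureCount[MAX_HOURS]; /* number of COVID+ users per hour */
} exposure_context_t;

/*
 * Increments the counters of the `exposureHours` hours that start at the
 * hour containing `t_checkin`. Returns false if `t_checkin` is outside
 * of the period, in which case no counter is modified.
 */
bool record_exposure(exposure_context_t *e, uint32_t t_checkin, uint32_t exposureHours)
{
	if (t_checkin < e->t_periodStart)
		return false;   /* would otherwise wrap around with unsigned arithmetic */

	uint32_t h1 = (t_checkin - e->t_periodStart) / 3600;
	if (h1 >= e->periodDuration)
		return false;   /* check-in after the end of the period */

	for (uint32_t i = 0; i < exposureHours; i++) {
		uint32_t h = h1 + i;
		if (h >= e->periodDuration)
			break;  /* beyond the period end, stop immediately */
		if (e->hourlyExposureCount[h] < UINT8_MAX)
			e->hourlyExposureCount[h]++;    /* saturate instead of wrapping */
	}
	return true;
}
```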

With an LSP Type 1 (asynchronous scan), the QR code itself contains a `visitDuration` field that indicates the exposure (e.g., in case of a train travel, the timespan when the user stays within the wagon).


**_- Step 6:_** if the `e->hourlyExposureCount[h1]` (for instance) goes above the cluster qualification threshold (this threshold may depend on the location category and capacity), it adds `{LTId, h1}` to its `clusterList`, a public list periodically downloaded by all terminals.
This list needs to be structured in batches, in order to make possible the partial download of a subset of it by terminals (see below).
A threshold equal to 1 is likely to accelerate the cluster identification process.
Several levels of severity could also be considered depending on the exposure counter value.
The details on how to exploit the various exposure counters are related to epidemiological policy considerations and are therefore out of scope of the present document.
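
As an illustration only, one possible way to derive cluster entries from the hourly counters is to group consecutive hours whose counter reaches the threshold into a single run, described by a start hour and a duration. The `cluster_run_t` type and the `find_cluster_runs()` function are hypothetical, and both the grouping strategy and the use of a "greater than or equal" comparison (so that a threshold of 1 qualifies an hour with a single report) are assumptions of this sketch, not requirements of the specification.

```C
#include <stddef.h>
#include <stdint.h>

/* A run of consecutive hours qualified as cluster (hypothetical type). */
typedef struct {
	uint32_t  startHour;    /* index of the first hour of the run */
	uint32_t  duration;     /* number of consecutive hours in the run */
} cluster_run_t;

/*
 * Scans the hourly counters and stores at most `maxOut` runs in `out`.
 * Returns the number of runs found.
 */
size_t find_cluster_runs(const uint8_t *counts, size_t nHours, uint8_t threshold,
                         cluster_run_t *out, size_t maxOut)
{
	size_t nRuns = 0;
	size_t h = 0;

	while (h < nHours && nRuns < maxOut) {
		if (counts[h] < threshold) {
			h++;            /* this hour is not qualified, move on */
			continue;
		}
		size_t start = h;
		while (h < nHours && counts[h] >= threshold)
			h++;            /* extend the run as long as hours qualify */
		out[nRuns].startHour = (uint32_t)start;
		out[nRuns].duration  = (uint32_t)(h - start);
		nRuns++;
	}
	return nRuns;
}
```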


### 3.8- Incremental downloads of the clusterList

The `clusterList` is made available by the server to all terminals, for instance via a Content Delivery Network (CDN) service.
The `clusterList` information is structured in a manner that enables a terminal to download the entries in an incremental manner (rather than the whole 14-day content at once).
Therefore, a terminal that fetches the list every day only downloads the latest entries, corresponding to new cluster locations identified since the previous fetch.
This approach helps limit traffic overhead as well as the required amount of processing and storage on the terminal.

To that goal the following data structure[^footnote-4] is used (inspired from GAEN, see [Kessibi2020](#references)):

- the `clusterList` is split into a collection of files, each of them corresponding to a given time span (e.g., 6 hours) and collecting all the new cluster locations identified by the server during this time span;
- a well-known URL is defined, for instance: `https://tacw.example.com/clusterlist/`, that is meant to contain the various files of the `clusterList` collection;
- the server makes available a well-known index, `index.txt`, that lists the various files available, in chronological order, and limited to a 14-day history;
- each file contains some metadata and the identification of all the clusters added to the `clusterList` during that time span (LTId and timing information);
- the file names use the following convention: `cluster_file_ID_DATE.json` where `ID` is a monotonically incremented identifier, starting at 0 when bootstrapping the system, and the `DATE` suffix indicates the corresponding `yyyymmdd` (it is essentially here to facilitate human checks);
- all files are made available as soon as possible in order to quickly let users know if they are at risk. It follows that several files per day should be made available (4 in this example).

Here is an example of `index.txt` file (usually there are as many entries as required to cover the 14 days window):
```
cluster_file_521_20210215.json
cluster_file_522_20210215.json
cluster_file_523_20210215.json
cluster_file_524_20210215.json
cluster_file_525_20210216.json
cluster_file_526_20210216.json
```

Here is an example of `cluster_file_521_20210215.json` file (2 clusters only are listed, corresponding to time span 0am-6am UTC time):
```
{
    clusterListExport: {
        start: 3822336080,
        end: 3822357680,
        signature_infos: {
            TBD
        }
    },
    clusterInfo: [
        {
            TLId: "put-here-the-result-of-base64url-encoding-of-TLId",
            clusterStart: 3822346880,
            clusterDuration: 2,
            warningLevel: 1
        },
        {
            TLId: "put-here-the-result-of-base64url-encoding-of-TLId",
            clusterStart: 3822354080,
            clusterDuration: 3,
            warningLevel: 3
        }
    ]
}
```

where:

- `clusterStart` is the round hour from which the location is considered a cluster, using NTP time.
- `clusterDuration` is the number of hours, starting at `clusterStart` (included), where the location is considered as a cluster.
- `warningLevel` determines the severity of the warning, from "low" (1), "medium" (2), to "high" (3).
	The exact criteria defining the severity of a warning are out of scope of the present document.

Note that Unix timestamps (which use an epoch located at 1/1/1970-00:00h (UTC)) and NTP timestamps (which use 1/1/1900-00:00h) can be converted to one another by adding or subtracting a fixed number of seconds, corresponding to a fixed offset equivalent to 70 years in seconds (there are 17 leap years between the two dates)[^footnote-5]:
```
(70 * 365 + 17) * 86400 = 2208988800
```
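
This conversion can be sketched with two helper functions. The `ntp_to_unix()` and `unix_to_ntp()` names are introduced for the illustration, and the sketch only handles the 32-bit seconds field used throughout this document (it therefore assumes dates that fit in this field).

```C
#include <stdint.h>

/* Number of seconds between 1/1/1900 and 1/1/1970: (70 * 365 + 17) * 86400 */
#define NTP_UNIX_OFFSET  UINT32_C(2208988800)

/* Converts the 32-bit seconds field of an NTP timestamp to Unix time. */
uint32_t ntp_to_unix(uint32_t ntp_seconds)
{
	return ntp_seconds - NTP_UNIX_OFFSET;
}

/* Converts a Unix time (in seconds) to the 32-bit NTP seconds field. */
uint32_t unix_to_ntp(uint32_t unix_seconds)
{
	return unix_seconds + NTP_UNIX_OFFSET;
}
```

For instance, the `start` value `3822336080` of the previous example corresponds to the Unix time `1613347280`, on February 15th, 2021.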

[^footnote-5]: See: [https://stackoverflow.com/questions/29112071/how-to-convert-ntp-time-to-unix-epoch-time-in-c-language-linux]


### 3.9- Decentralized risk analysis in the CLEA application 

Each CLEA application periodically downloads the `clusterList` from the server, in an incremental manner.
This is achieved by downloading the `index.txt` file first, then identifying the newly available files (it is assumed the application remembers the name of the latest file downloaded).
The CLEA application then downloads each of the new files, remembers the name of the last one, and processes them one by one.
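
The identification of the new files can be sketched as follows, relying on the monotonically incremented `ID` of the `cluster_file_ID_DATE.json` naming convention. The `cluster_file_id()` and `select_new_files()` functions are hypothetical, and comparing identifiers (rather than file names) is a design choice of this sketch. The index is assumed to be already split into one string per line.

```C
#include <stddef.h>
#include <stdio.h>

/*
 * Returns the ID embedded in a "cluster_file_ID_DATE.json" file name,
 * or -1 if the name does not follow the convention.
 */
long cluster_file_id(const char *name)
{
	long          id = -1;
	unsigned long date = 0;
	int           consumed = 0;

	/* %n is only assigned if the trailing ".json" literal has been matched */
	if (sscanf(name, "cluster_file_%ld_%8lu.json%n", &id, &date, &consumed) != 2)
		return -1;
	if (consumed <= 0 || name[consumed] != '\0' || id < 0)
		return -1;
	return id;
}

/*
 * Copies into `out` the names whose ID is strictly greater than `lastId`
 * (use -1 if nothing has been downloaded yet). `out` must have room for
 * `n` pointers. Returns the number of selected names.
 */
size_t select_new_files(const char *const *names, size_t n, long lastId,
                        const char **out)
{
	size_t k = 0;
	for (size_t i = 0; i < n; i++) {
		long id = cluster_file_id(names[i]);
		if (id >= 0 && id > lastId)
			out[k++] = names[i];
	}
	return k;
}
```

With the `index.txt` example of the previous section and a latest downloaded file with `ID` 523, the three files with `ID` 524, 525 and 526 are selected.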

Then the CLEA application checks locally whether there are one or more intersections between:

- the information `{LTId, t_checkin}` from each tuple of its `localList` (the `LTId` is available in clear text in the QR code scanned in order to allow this comparison).
- the information `{LTId_cluster, h1_cluster}` from the downloaded `clusterList`.

In case of a match, the application informs the user by means of a warning, indicating for instance the associated date.
If the server provides a certain degree of risk (i.e., distinguishes low and high risks), this information is communicated to the user.
However, since the `{LTId_cluster, h1_cluster}` information is public, a curious user may be able to know more about the exact time of exposure.
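
The local matching can be sketched as follows. The two types and the `highest_warning_level()` function are hypothetical, and the matching rule used here (same `LTId`, and `t_checkin` falling within the cluster time span defined by `clusterStart` and `clusterDuration`) is a simplifying assumption: in particular, it ignores the time the user may have stayed in the location after the check-in.

```C
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LTID_LEN  16    /* LTId is a 128-bit value */

/* Hypothetical view of a localList entry. */
typedef struct {
	uint8_t   ltid[LTID_LEN];
	uint32_t  t_checkin;         /* NTP seconds */
} local_entry_t;

/* Hypothetical view of a clusterInfo entry, after Base64url decoding. */
typedef struct {
	uint8_t   ltid[LTID_LEN];
	uint32_t  clusterStart;      /* NTP seconds */
	uint32_t  clusterDuration;   /* in hours */
	uint8_t   warningLevel;      /* 1 (low) to 3 (high) */
} cluster_entry_t;

/* Returns the highest warning level among all matches, or 0 if none. */
uint8_t highest_warning_level(const local_entry_t *local, size_t nLocal,
                              const cluster_entry_t *clusters, size_t nClusters)
{
	uint8_t level = 0;
	for (size_t i = 0; i < nLocal; i++) {
		for (size_t j = 0; j < nClusters; j++) {
			const cluster_entry_t *c = &clusters[j];
			if (memcmp(local[i].ltid, c->ltid, LTID_LEN) != 0)
				continue;
			/* end of the cluster time span (excluded), computed on 64 bits */
			uint64_t end = (uint64_t)c->clusterStart
			             + (uint64_t)c->clusterDuration * 3600u;
			if (local[i].t_checkin >= c->clusterStart &&
			    local[i].t_checkin < end &&
			    c->warningLevel > level)
				level = c->warningLevel;
		}
	}
	return level;
}
```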


### 3.10- Linking the CLEA digital system and the hand-written attendance register

The use of the CLEA digital system is based on a voluntary decision of the user, the alternative consisting for this user in leaving her name in the hand-written attendance register.
Consequently, a link between the two systems should be established. 
The following sections explain how this can be done, depending on whether a user tested COVID+ has used the CLEA system or the hand-written attendance register.

It should also be noted that there are use-cases where the hand-written attendance register may not exist, for instance in case of digital ticketing.
In that case, the `locContactMsg` should be ignored, by setting the `locContacMsgPresent` flag to 0.
Similarly, the Health Authority may decide not to link the two systems together, in which case the `locContacMsgPresent` flag should be set to 0.

It should also be noted that the link between the two systems is not perfect.
If the cluster qualification threshold is strictly greater than `1`, it can happen that a given location should be qualified as cluster because the total number of COVID+ persons who were there at the same time is sufficient, but no alert is raised because some of them used the CLEA application, and the others the attendance register.


#### A user tested COVID+ has used the CLEA system

In that case, the backend server qualifies as a cluster a given location, based on an uploaded QR code (and perhaps previous ones depending on the threshold).
Since the re-identification of the location is the responsibility of the authority in charge of the manual contact tracing, assumed different from the authority in charge of the backend server, the backend server communicates through a TLS connection the location contact re-identification part of the QR code, encrypted via the public key of the Manual Contact Tracing Authority, along with cluster timing information.

The `locContactMsg` message is structured as follows (high-level view):

```
locContactMsg = [ locationPhone | padding | locationRegion | locationPIN | t_periodStart ]
```

The following binary format must be used:
```
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|              locationPhone (60 bits)                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|              ...                                      | pad   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|locationRegion |          locationPIN (3 bytes)                |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|              t_periodStart (4 bytes)                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
```
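
The first 64 bits of this format (the `locationPhone` field followed by the 4-bit padding, both detailed below) can be produced as in the following sketch. The `encode_location_phone()` function is a hypothetical helper, which expects the E.164 number as a string of digits without the leading `+` sign, and which assumes that the first digit is stored in the most significant nibble of the first byte, in line with the figure above.

```C
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Encodes up to 15 decimal digits, one per 4-bit nibble, into 8 bytes:
 * 15 nibbles for locationPhone (unused ones set to 0xF), then 4 padding
 * bits set to zero. Returns false if the input is not a valid digit string
 * (in which case the content of `out` is unspecified).
 */
bool encode_location_phone(const char *digits, uint8_t out[8])
{
	size_t n = strlen(digits);
	if (n == 0 || n > 15)
		return false;

	memset(out, 0xFF, 8);           /* unused nibbles must contain 0xF */
	for (size_t i = 0; i < n; i++) {
		if (digits[i] < '0' || digits[i] > '9')
			return false;
		uint8_t d = (uint8_t)(digits[i] - '0');
		if (i % 2 == 0)         /* even index: most significant nibble */
			out[i / 2] = (uint8_t)((d << 4) | (out[i / 2] & 0x0F));
		else                    /* odd index: least significant nibble */
			out[i / 2] = (uint8_t)((out[i / 2] & 0xF0) | d);
	}
	out[7] &= 0xF0;                 /* the 4-bit padding is set to zero */
	return true;
}
```

For the `+33 1 02 03 04 05` phone number, i.e., the `"33102030405"` digit string, the resulting bytes are `33 10 20 30 40 5F FF F0` (in hexadecimal).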

- `locationPhone` (60 bits):
this field contains a phone number, where each digit is stored one by one in a 4-bit nibble.
The phone number must be encoded using the [E.164](https://www.itu.int/rec/T-REC-E.164/) standard that requires phone numbers to have a maximum length of 15 digits.
For instance, in case of France, `+33 1 02 03 04 05` will be stored as (binary) `0011 0011  0001 0000  0010 0000  0011 0000  0100 0000  0101 1111  1111 1111  1111`.
Unused nibbles must contain the `1111` / `0xF` value.

- `padding` (4 bits) (`pad` in figure):
this field is unused in the current specification and must be set to zero.

- `locationRegion` (1 byte):
this field contains coarse grain geographical information for the location, in order to facilitate the work of the Manual Contact Tracing team (e.g., for countries that rely on a regional organisation, it enables the cluster record to be routed directly to the right regional Manual Contact Tracing team).
In case of France, it can contain a department number.

- `locationPIN` (3 bytes):
this field contains a 6-digit secret PIN known only by the location contact, communicated when registering with the device manufacturer or on the web site when generating a static QR code.
It is meant to prevent an attacker who knows the contact phone number of a target location (this phone number is usually public) from forging a new QR code and handing it to a user tested COVID+.
Thanks to the `locationPIN`, the manual contact tracing team can check the QR code validity with the location contact: if the two PIN codes do not match, the QR code is deemed invalid and ignored (note that the CLEA users are not at risk, the forged `LTKey` and `LTId` being totally distinct from the ones actually used in this location).