Surveillance System Wiki

Overview


This surveillance system processes video feeds using a distributed Edge-to-Cloud architecture. Its primary goal is to detect dangerous animals in video footage and alert residents and authorities. The system combines edge devices (for real-time processing) with cloud servers (for compute-intensive tasks such as object recognition), ensuring quick and accurate responses.

Table of Contents

  1. Overview
  2. System Components
  3. Docker Compose Tutorial
  4. Deploying on K3S using Enoslib on Grid'5000

System Components

Camera

  • Function: Captures video frames, resizes them for optimized processing, and serializes and transmits them to the Motion Detection service.
  • Key Features:
    • Handles video sequences based on the configured animal type.
    • Configurable appearance frequency for specific events.
    • Tracks the frame rate (FPS) and integrates with a TracerProvider for distributed tracing.
  • Connection: Maintains a persistent TCP connection for frame transmission.
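
The exact wire format lives in src/camera.py, but the overall pattern looks roughly like the following minimal sketch. It assumes OpenCV for capture and resizing, pickle for serialization, and a 4-byte length prefix so the receiver knows where each frame ends; none of these specifics are confirmed by the source, and the file path and resolution are placeholders.

import pickle
import socket
import struct

import cv2  # assumed capture/resize library

# Values mirror the MDHOST / MDPORT environment variables described later
MD_HOST, MD_PORT = "motion_detector_1", 9998

cap = cv2.VideoCapture("footage/tiger.mp4")          # hypothetical video file
sock = socket.create_connection((MD_HOST, MD_PORT))  # persistent TCP connection

while True:
    ok, frame = cap.read()
    if not ok:
        break
    frame = cv2.resize(frame, (640, 480))            # downscale before sending
    payload = pickle.dumps(frame)
    # Length-prefix the payload so the motion detector can reassemble it
    sock.sendall(struct.pack(">I", len(payload)) + payload)

cap.release()
sock.close()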

Motion Detection

  • Function: Processes frames from the Camera to detect significant motion and forwards relevant frames to Object Recognition.
  • Key Features:
    • Motion detection using grayscale and Gaussian blur.
    • Real-time performance monitoring of CPU usage, frame processing time, and FPS.
    • Integrates with TracerProvider for tracing frame reception, processing, and transmission.
  • Connection: Listens for incoming frames from the Camera and sends processed frames to Object Recognition.
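
The grayscale-plus-Gaussian-blur step boils down to standard OpenCV frame differencing. A minimal sketch of the idea follows; the blur kernel, threshold, and minimum contour area are illustrative values, not necessarily those used in src/motion_detection.py.

import cv2

def has_motion(prev_gray, frame, min_area=500):
    """Return (motion_detected, current_gray) using simple frame differencing."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (21, 21), 0)
    if prev_gray is None:
        # First frame: nothing to compare against yet
        return False, gray
    delta = cv2.absdiff(prev_gray, gray)
    thresh = cv2.threshold(delta, 25, 255, cv2.THRESH_BINARY)[1]
    thresh = cv2.dilate(thresh, None, iterations=2)
    contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    moving = any(cv2.contourArea(c) >= min_area for c in contours)
    return moving, gray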

Object Recognizer

  • Function: Uses YOLO to detect objects in frames and tracks performance metrics like processing time and response time.
  • Key Features:
    • YOLO v3 for object detection.
    • Distributed tracing for end-to-end monitoring.
    • Supports multiple clients with concurrency.
  • Connection: Listens for incoming frames from the Motion Detection service and processes them asynchronously.
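
A minimal sketch of YOLOv3 inference with OpenCV's DNN module, assuming standard Darknet config and weights files; the service may load and post-process the model differently (for example with non-maximum suppression and bounding boxes).

import cv2
import numpy as np

# Hypothetical paths; the service ships its own copies of the model files
net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")
layer_names = net.getUnconnectedOutLayersNames()

def detect(frame, conf_threshold=0.5):
    """Return a list of (class_id, confidence) detections for one frame."""
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    outputs = net.forward(layer_names)
    detections = []
    for output in outputs:
        for row in output:
            scores = row[5:]
            class_id = int(np.argmax(scores))
            confidence = float(scores[class_id])
            if confidence >= conf_threshold:
                detections.append((class_id, confidence))
    return detections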

Other Components

OpenTelemetry Collector

  • Function: Collects, processes, and exports telemetry data (traces, metrics, logs).
  • Metrics Sent:
    • Collects data from the Camera, Motion Detection, and Object Recognizer services.
    • Sends telemetry data to Prometheus or other backends for visualization.
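
The Camera, Motion Detection, and Object Recognizer services each register a TracerProvider that exports spans to this collector. Below is a minimal sketch with the opentelemetry-python SDK, assuming OTLP over gRPC on port 4317 (the port exposed by the otel-collector service in the compose file later in this page); the exact exporter and resource attributes used by the services are assumptions.

from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

# Register a provider that exports spans to the collector over OTLP/gRPC
provider = TracerProvider(resource=Resource.create({"service.name": "camera"}))
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="otel-collector:4317", insecure=True))
)
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("camera")
with tracer.start_as_current_span("send_frame"):
    pass  # serialize and transmit the frame here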

cAdvisor

  • Function: Monitors container resource usage (CPU, memory, network).
  • Metrics Sent:
    • CPU, memory, network I/O, disk I/O, container lifespan metrics.

Node Exporter

  • Function: Exports hardware and system performance metrics from Linux systems.
  • Metrics Sent:
    • CPU load, memory usage, disk utilization, network stats, and system uptime.

Prometheus

  • Function: Scrapes and stores metrics from various exporters and services.
  • Metrics Collected:
    • Application, service, and system-level metrics (e.g., FPS, processing time, system performance).
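
Besides the web UI on port 9090, Prometheus exposes an HTTP API that can be queried programmatically. A short sketch using the standard /api/v1/query endpoint; the metric name camera_fps is purely hypothetical and should be replaced with whatever the services actually export.

import requests

PROMETHEUS_URL = "http://localhost:9090"

resp = requests.get(
    f"{PROMETHEUS_URL}/api/v1/query",
    params={"query": "camera_fps"},  # hypothetical metric name
    timeout=5,
)
resp.raise_for_status()
# Each result carries the label set and the latest (timestamp, value) pair
for result in resp.json()["data"]["result"]:
    print(result["metric"], result["value"])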

Docker Compose Tutorial

Step 1: Build Docker Images for Services

  1. Clone the repository:
    First, clone the repository to your local machine:

    git clone https://gitlab.inria.fr/sikaddou/surveillance-system-edge-to-cloud-video-processing.git
    cd surveillance-system-edge-to-cloud-video-processing
  2. Build the Docker images locally:
    You don't need to navigate to each service's directory individually. From the root of the cloned repository, run the following commands to build the Docker images for each service:

    docker build -t camera:latest ./services/camera
    docker build -t motion_detector:latest ./services/motion_detector
    docker build -t object_recognizer:latest ./services/object_recognizer

    These commands will automatically find the Dockerfile in each service's directory (./services/camera, ./services/motion_detector, and ./services/object_recognizer). The latest tag is used to mark the most recent image.

Step 2: Create or Update docker-compose.yml

Service Environment Variables

Object Recognizer:

  • No environment variables specified.

Motion Detector:

  • INDEX: A unique identifier for each motion detector instance (e.g., 1, 2, 3).
  • OR_HOST: The hostname of the object recognizer service (set to object_recognizer).
  • OR_PORT: The port on which the object recognizer service is listening (set to 9999).

Camera:

  • CAMERA: Set to true, indicating that the container is a camera service.
  • ANIMAL_NAME: The name of the animal being observed by the camera (e.g., tiger, bear, wolf).
  • APPEARANCE_RATE: The frequency at which the animal appears in the footage (in frames per minute).
  • MDHOST: The hostname of the corresponding motion detector (e.g., motion_detector_1, motion_detector_2, motion_detector_3).
  • MDPORT: The port on which the corresponding motion detector is listening (set to 9998).
  • INDEX: A unique identifier for each camera (e.g., 1, 2, 3).
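
Inside each container these variables are read from the environment at startup. A minimal, hypothetical sketch of how the camera service might pick them up; the real src/camera.py may handle defaults and type conversion differently.

import os

# Hypothetical startup snippet mirroring the variables defined in docker-compose.yml
ANIMAL_NAME = os.environ.get("ANIMAL_NAME", "tiger")
APPEARANCE_RATE = int(os.environ.get("APPEARANCE_RATE", "600"))
MD_HOST = os.environ.get("MDHOST", "motion_detector_1")
MD_PORT = int(os.environ.get("MDPORT", "9998"))
INDEX = int(os.environ.get("INDEX", "1"))

print(f"camera {INDEX}: streaming {ANIMAL_NAME} to {MD_HOST}:{MD_PORT} "
      f"at an appearance rate of {APPEARANCE_RATE} frames per minute")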

Here is an example docker-compose.yml for reference. Make sure your file is set up along these lines:

version: '3'

services:
  object_recognizer:
    image: object_recognizer:latest
    ports:
      - "9999:9999"
      - "5000:5000"

  motion_detector_1:
    image: motion_detector:latest
    ports:
      - "9998:9998"
    depends_on:
      - object_recognizer
      - jaeger
    environment:
      - INDEX=1
      - OR_HOST=object_recognizer
      - OR_PORT=9999
    command: python src/motion_detection.py

  motion_detector_2:
    image: motion_detector:latest
    ports:
      - "9997:9998"
    depends_on:
      - object_recognizer
      - jaeger
    environment:
      - INDEX=2
      - OR_HOST=object_recognizer
      - OR_PORT=9999
    command: python src/motion_detection.py

  motion_detector_3:
    image: motion_detector:latest
    ports:
      - "9996:9998"
    depends_on:
      - object_recognizer
      - jaeger
    environment:
      - INDEX=3
      - OR_HOST=object_recognizer
      - OR_PORT=9999
    command: python src/motion_detection.py

  camera_1:
    image: camera:latest
    depends_on:
      - motion_detector_1
      - otel-collector
    environment:
      - CAMERA=true
      - ANIMAL_NAME=tiger
      - APPEARANCE_RATE=600
      - MDHOST=motion_detector_1
      - MDPORT=9998
      - INDEX=1
    command: /bin/sh -c "python src/camera.py"

  camera_2:
    image: camera:latest
    depends_on:
      - motion_detector_2
      - otel-collector
    environment:
      - CAMERA=true
      - ANIMAL_NAME=bear
      - APPEARANCE_RATE=500
      - MDHOST=motion_detector_2
      - MDPORT=9998
      - INDEX=2
    command: /bin/sh -c "python src/camera.py"

  camera_3:
    image: camera:latest
    depends_on:
      - motion_detector_3
      - otel-collector
    environment:
      - CAMERA=true
      - ANIMAL_NAME=wolf
      - APPEARANCE_RATE=700
      - MDHOST=motion_detector_3
      - MDPORT=9998
      - INDEX=3
    command: /bin/sh -c "python src/camera.py"

  jaeger:
    image: jaegertracing/all-in-one
    ports:
      - "16686:16686"
      - "6831:6831/udp"
      - "14268"
      - "14250"

  zipkin-all-in-one:
    image: openzipkin/zipkin:latest
    environment:
      - JAVA_OPTS=-Xms1024m -Xmx1024m -XX:+ExitOnOutOfMemoryError
    restart: always
    ports:
      - "9411:9411"

  otel-collector:
    image: otel/opentelemetry-collector-contrib
    restart: always
    command: ["--config=/etc/otel-collector-config.yaml", "${OTELCOL_ARGS}"]
    volumes:
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
    ports:
      - "1888:1888"   # pprof extension
      - "8888:8888"   # Prometheus metrics exposed by the collector
      - "8889:8889"   # Prometheus exporter metrics
      - "13133:13133" # health_check extension
      - "4317:4317"   # OTLP gRPC receiver
      - "55679:55679" # zpages extension
    depends_on:
      - jaeger
      - zipkin-all-in-one

  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    hostname: cadvisor
    platform: linux/aarch64
    volumes:
      - "/:/rootfs:ro"
      - "/var/run:/var/run:ro"
      - "/sys:/sys:ro"
      - "/var/lib/docker/:/var/lib/docker:ro"
      - "/dev/disk/:/dev/disk:ro"
    ports:
      - "8080:8080"

  prometheus:
    image: prom/prometheus:latest
    restart: always
    volumes:
      - ./prometheus.yaml:/etc/prometheus/prometheus.yml
      - ./rules.yml:/etc/prometheus/rules.yml
    ports:
      - "9090:9090"

  node-exporter:
    image: prom/node-exporter:latest
    container_name: node-exporter
    restart: unless-stopped
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.rootfs=/rootfs'
      - '--path.sysfs=/host/sys'
      - '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)'
    expose:
      - 9100

Step 3: Deploy with Docker Compose

  1. Start the services with Docker Compose: After updating the docker-compose.yml, run the following command to start the services locally:

    docker-compose -f ./deploy/docker-compose/docker-compose.yml up -d --build --force-recreate
  2. Stop and remove existing containers: To stop and remove any previously running containers and volumes first, run the following before the up command above:

    docker-compose -f ./deploy/docker-compose/docker-compose.yml down --volumes --remove-orphans

Step 4: Verify the Deployment

  1. Check the status of running containers: You can check if the containers are running properly with the following command:

    docker ps
  2. Access the services:
    Based on the port mappings in docker-compose.yml, the web UIs are available at http://localhost:16686 (Jaeger), http://localhost:9090 (Prometheus), http://localhost:9411 (Zipkin), and http://localhost:8080 (cAdvisor). A quick scripted check is sketched below.
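
For a scripted sanity check, the exposed endpoints can be probed over HTTP. A minimal sketch, assuming the default host port mappings from the compose file above and the standard health endpoints of each tool:

import requests

# Hypothetical smoke test: URLs follow from the host port mappings in docker-compose.yml
endpoints = {
    "Prometheus": "http://localhost:9090/-/healthy",
    "Jaeger UI": "http://localhost:16686",
    "Zipkin": "http://localhost:9411/health",
    "cAdvisor": "http://localhost:8080/healthz",
}

for name, url in endpoints.items():
    try:
        status = requests.get(url, timeout=5).status_code
        print(f"{name}: HTTP {status}")
    except requests.RequestException as exc:
        print(f"{name}: unreachable ({exc})")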

Deploying on K3S using Enoslib on Grid'5000

This tutorial walks you through the steps required to deploy a K3s Kubernetes cluster using Enoslib on the Grid'5000 (G5k) infrastructure. The focus is on deploying the surveillance system, which includes components such as the Camera, Motion Detection, and Object Recognizer. These steps cover cluster configuration, deployment of Kubernetes resources, Helm setup, Prometheus and Chaos Mesh installation, and additional tasks for monitoring and resilience testing of the system.

Step 0: Prerequisites

  • Grid'5000 account and access to the infrastructure.
  • enoslib installed on your local machine.
  • kubectl, helm, and ssh tools installed on your local machine.
  • K3s installed on your machines.

Step 1: Initialize Logging

First, initialize the logging for the deployment process.

import enoslib as en
from datetime import datetime

# Initialize logging
_ = en.init_logging()

# Generate a timestamp for the job name
timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
job_name = f"usecase_{timestamp}"

Step 2: Cluster Resource Configuration

Define the configuration for the cluster resources on Grid'5000. This will specify the number of machines for each role (master, agent, etc.).

cluster = "parasilo"
conf = (
    en.VMonG5kConf
    .from_settings(job_name=job_name, walltime="04:00:00")
    .add_machine(roles=["master"], cluster=cluster, number=1, flavour="large")
    .add_machine(roles=["agent", "object_recognizer"], cluster=cluster, number=1, flavour_desc={"core": 8, "mem": 8192})
    .add_machine(roles=["agent", "motion_detector"], cluster=cluster, number=3, flavour_desc={"core": 4, "mem": 4096})
    .add_machine(roles=["agent", "camera"], cluster=cluster, number=3, flavour_desc={"core": 1, "mem": 1024})
    .finalize()
)

# Initialize the provider
provider = en.VMonG5k(conf)
roles, networks = provider.init()
en.wait_for(roles)

Step 3: Kubernetes Setup

Deploy the K3s cluster and ensure that it’s set up correctly.

# Initialize K3s with master and agent nodes
k3s = en.K3s(master=roles["master"], agent=roles["agent"])
k3s.deploy()

# Create a tunnel to the head node (replace the placeholder with your Grid'5000 username)
G5K_username = "g5k_username_placeholder"
print("Create a tunnel from your local machine to the head node:")
print(f"ssh -NL 8001:{roles['master'][0].address}:8001 {G5K_username}@access.grid5000.fr")

Step 4: Label Nodes

Label your nodes with specific attributes based on their roles (cloud, edge, camera).

def label_nodes(node_names, labels):
    label_str = ",".join(f"{key}={value}" for key, value in labels.items())

    for node in node_names:
        command = f"kubectl label nodes {node} {label_str} --overwrite"
        try:
            en.run_command(command, roles=roles['master'])
            print(f"Successfully labeled {node} with {label_str}")
        except Exception as e:
            # en.run_command raises on failure; report the error and continue
            print(f"Failed to label {node}. Error: {e}")

# Label nodes
label_nodes([str(node.alias) for node in roles["object_recognizer"]], {"pos": "cloud"})
label_nodes([str(node.alias) for node in roles["motion_detector"]], {"pos": "edge"})
label_nodes([str(node.alias) for node in roles["camera"]], {"pos": "camera"})

Step 5: Docker Registry Secret

Create a Docker registry secret with anonymized credentials to authenticate with Docker Hub.

docker_username = "username_placeholder"
docker_password = "password_placeholder"
docker_email = "email_placeholder"

# Create the Docker registry secret
en.run_command(f"kubectl create secret docker-registry dockerhub-secret \
  --docker-username={docker_username} \
  --docker-password={docker_password} \
  --docker-email={docker_email}", roles=roles['master'])

Step 6: Send Files to Cluster

Transfer necessary configuration files to the cluster using scp.

import os

def send_file(file_path, file_name):
    scp_command = f"scp {file_path} root@{roles['master'][0].address}:{file_name}"
    # Remove any stale host key for the master node before copying
    os.system(f"ssh-keygen -f /home/{G5K_username}/.ssh/known_hosts -R {roles['master'][0].address}")
    os.system(scp_command)
    return scp_command

# Send configuration files to the cluster
local_directory = "deploy/G5k/K3S/config"
for root, dirs, files in os.walk(local_directory):
    for file_name in files:
        send_file(os.path.join(root, file_name), file_name)

# Apply Kubernetes manifests
en.run_command("kubectl apply -f deployement.yaml", roles=roles['master'])
en.run_command("kubectl apply -f configmap.yaml", roles=roles['master'])

Step 7: Helm Installation and Prometheus Setup

Install Helm, add Prometheus charts, and deploy Prometheus for monitoring.

with en.actions(roles=roles["master"]) as a:
    a.raw("curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3")
    a.raw("chmod 700 get_helm.sh")
    a.raw("./get_helm.sh")

    # Install Prometheus
    a.raw("helm --kubeconfig /etc/rancher/k3s/k3s.yaml repo add prometheus-community https://prometheus-community.github.io/helm-charts")
    a.raw("helm --kubeconfig /etc/rancher/k3s/k3s.yaml repo update")
    a.raw("helm --kubeconfig /etc/rancher/k3s/k3s.yaml install prometheus prometheus-community/prometheus -n default --set grafana.enabled=True -f costum-values.yaml")
    a.raw("helm --kubeconfig /etc/rancher/k3s/k3s.yaml upgrade --kubeconfig /etc/rancher/k3s/k3s.yaml prometheus prometheus-community/prometheus --namespace default --set prometheus-node-exporter.service.port=9500")

Step 8: Chaos Mesh Installation

Install Chaos Mesh using Helm for fault injection and resilience testing.

with en.actions(roles=roles["master"]) as a:
    a.raw("helm --kubeconfig /etc/rancher/k3s/k3s.yaml repo add chaos-mesh https://charts.chaos-mesh.org")
    a.raw("helm --kubeconfig /etc/rancher/k3s/k3s.yaml repo update")
    a.raw("helm --kubeconfig /etc/rancher/k3s/k3s.yaml install chaos-mesh chaos-mesh/chaos-mesh --set chaosDaemon.runtime=containerd --set chaosDaemon.containerdSocket=/run/containerd/containerd.sock -n default")
    a.raw("kubectl apply -f rbac.yaml")
    a.raw("kubectl create token account-default-admin-eechh")
    a.raw('kubectl patch svc chaos-dashboard -n default -p \'{"spec": {"ports": [{"name": "http", "protocol": "TCP", "port": 2333, "targetPort": 2333, "nodePort": 30312}]}}\'')

Step 9: Patch Prometheus Service

Expose Prometheus service on a NodePort for web access.

with en.actions(roles=roles["master"]) as a:
    a.raw('kubectl patch svc prometheus-server -n default -p \'{"spec": {"type": "NodePort", "ports": [{"name": "http", "port": 80, "targetPort": 9090, "nodePort": 30090}]}}\'')

Step 10: Fetch Webpage Hosts

Retrieve the host IPs of the Prometheus and Chaos Dashboard services.

def get_host(service):
    results = en.run_command(f"kubectl get pods -n kube-system -l app={service}", roles=roles['master'])
    import re
    pattern = r"host='([^']*)'"
    match = re.search(pattern, str(results))

    if match:
        host = match.group(1)
        print("Extracted host:", host)
    else:
        print("Host not found")
        return None

    results = en.run_command(f"kubectl describe node {host}", roles=roles['master'])
    pattern = r"InternalIP:\s*([\d.]+)"
    match = re.search(pattern, str(results))

    if match:
        return match.group(1)
    else:
        print("Host not found")
        return None

# Fetch the Prometheus and Chaos Dashboard IPs
url = f"http://{get_host('prometheus')}:30090"
print(f"Prometheus web page host: {url}")

url = f"http://{get_host('chaos-dashboard')}:30312"
print(f"Chaos-mesh web page host: {url}")

Step 11: Token Creation and Cleanup

Create the necessary tokens and perform cleanup after the job is complete.

results = en.run_command("kubectl create token account-default-admin-eechh", roles=roles["master"])
for res in results:
    print(res.payload['stdout'])

# Cleanup the resources
provider.destroy()

Conclusion

By following the steps above, you have successfully deployed the surveillance system on a Kubernetes cluster using K3s on Grid'5000, set up monitoring with Prometheus and fault injection with Chaos Mesh, and labeled the nodes for efficient management.