# Phase 1: Camera Calibration and Real-Time Object Detection with Depth
Pipeline for the non-contact monitoring project using camera01 from the Free-Viewpoint RGB-D Video Dataset (SJTU).
## What it does

- **Camera calibration:** loads intrinsics (and extrinsics) from `Camera Parameters/paras.txt` for camera01 (index 0).
- **Depth:** converts grayscale depth frames to metric depth (meters) using the dataset formula.
- **Detection:** runs face (Haar) and person (HOG) detection on the RGB stream via OpenCV (no extra model files needed).
- **Depth per ROI:** for each detection bounding box, computes the median depth and overlays it on the image.
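The depth-per-ROI step can be sketched as follows. The linear gray-to-meters mapping and the `max_depth_m` value below are illustrative assumptions, not the dataset's actual formula (that lives in `depth_utils.py`); the median computation matches the description above.

```python
import numpy as np

def gray_to_depth(gray: np.ndarray, max_depth_m: float = 10.0) -> np.ndarray:
    """Map an 8-bit grayscale depth frame to meters.

    NOTE: a linear mapping is assumed here for illustration only;
    the dataset's real conversion formula is implemented in depth_utils.py.
    """
    return gray.astype(np.float32) / 255.0 * max_depth_m

def median_depth_in_box(depth_m: np.ndarray, box) -> float:
    """Median depth inside an (x, y, w, h) box, ignoring zero/invalid pixels."""
    x, y, w, h = box
    roi = depth_m[y:y + h, x:x + w]
    valid = roi[roi > 0]          # zero depth = no measurement
    return float(np.median(valid)) if valid.size else float("nan")

# Example on a synthetic 4x4 depth frame:
gray = np.full((4, 4), 128, dtype=np.uint8)
depth = gray_to_depth(gray)
print(round(median_depth_in_box(depth, (0, 0, 2, 2)), 2))  # → 5.02
```

Taking the median (rather than the mean) makes the overlaid distance robust to depth holes and background pixels that fall inside the detection box.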
## Setup

```bash
cd "Phase 4 - Track B - Nvidia Jetson/5. Edge AI Optimization/non-contact-monitoring-edge"
pip install -r phase1/requirements.txt
```

Ensure the dataset is present:

```
Free-Viewpoint-RGB-D-Video-Dataset-main/camera01-rgb.mp4
Free-Viewpoint-RGB-D-Video-Dataset-main/camera01-depth.mp4
Free-Viewpoint-RGB-D-Video-Dataset-main/Camera Parameters/paras.txt
```
## Run

From the `non-contact-monitoring-edge` directory:

Or from `phase1`:
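The exact commands are missing from this section of the source; a likely form, assuming `run_pipeline.py` (listed in the module table below) is the entry point and accepts the options documented next:

```shell
# From non-contact-monitoring-edge (paths are assumptions based on the module table):
python phase1/run_pipeline.py --dataset-dir Free-Viewpoint-RGB-D-Video-Dataset-main

# Or from phase1/ (the default --dataset-dir already points one level up):
python run_pipeline.py
```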
## Options

| Option | Description |
|---|---|
| `--dataset-dir PATH` | Root of the dataset (default: `../Free-Viewpoint-RGB-D-Video-Dataset-main`) |
| `--camera-index N` | Camera index in `paras.txt` (0 = camera01) |
| `--no-person` | Run face detection only (faster) |
| `--no-display` | No GUI (e.g. headless); still processes frames and prints progress |
| `--out PATH` | Write an output video (e.g. `phase1_out.mp4`) |
| `--max-frames N` | Process at most N frames (0 = all) |
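A sketch of how these options could be wired up with `argparse`. The flag names and the `--dataset-dir` default come from the table above; the types, help strings, and remaining defaults are assumptions.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Mirrors the options table; everything beyond the flag names is illustrative.
    p = argparse.ArgumentParser(description="Phase 1 RGB-D detection pipeline")
    p.add_argument("--dataset-dir", default="../Free-Viewpoint-RGB-D-Video-Dataset-main",
                   help="Root of the dataset")
    p.add_argument("--camera-index", type=int, default=0,
                   help="Camera index in paras.txt (0 = camera01)")
    p.add_argument("--no-person", action="store_true",
                   help="Only run face detection (faster)")
    p.add_argument("--no-display", action="store_true",
                   help="Headless mode: no GUI window")
    p.add_argument("--out", default=None,
                   help="Write output video, e.g. phase1_out.mp4")
    p.add_argument("--max-frames", type=int, default=0,
                   help="Process at most N frames (0 = all)")
    return p

args = build_parser().parse_args(["--no-person", "--max-frames", "100"])
print(args.no_person, args.max_frames, args.camera_index)  # → True 100 0
```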
## Module overview

| Module | Role |
|---|---|
| `calibration.py` | Parses `paras.txt`; exposes `K`, `R`, `t`; world-to-image projection. |
| `depth_utils.py` | Grayscale → metric depth; median depth within a bounding box. |
| `detection.py` | Face (Haar) and person (HOG) detectors; a composite detector that avoids duplicate person/face boxes. |
| `run_pipeline.py` | Main entry point: loads calibration, reads RGB + depth, detects, computes depth per box, visualizes. |
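The world-to-image projection attributed to `calibration.py` follows the standard pinhole model, x ∝ K (R·X + t). A minimal sketch under that assumption (the function name and coordinate conventions are illustrative, not the module's actual API):

```python
import numpy as np

def project_point(K: np.ndarray, R: np.ndarray, t: np.ndarray, X: np.ndarray) -> np.ndarray:
    """Project a 3D world point X to pixel coordinates via x ~ K (R X + t)."""
    Xc = R @ X + t          # world -> camera coordinates
    uvw = K @ Xc            # camera -> homogeneous image coordinates
    return uvw[:2] / uvw[2] # perspective divide -> (u, v)

# Identity rotation, zero translation, simple intrinsics:
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)
t = np.zeros(3)

# A point on the optical axis projects to the principal point:
print(project_point(K, R, t, np.array([0.0, 0.0, 2.0])))  # → [320. 240.]
```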
## Citation
When using the dataset, cite: Guo S, Zhou K, Hu J, et al. A new free viewpoint video dataset and DIBR benchmark. Proceedings of the 13th ACM Multimedia Systems Conference. 2022: 265–271.