Introduction
This is the PyTorch implementation of our paper "OSLO: On-the-Sphere Learning for Omnidirectional images and its application to 360-degree image compression", accepted in IEEE Transactions on Image Processing. If you use this code, please cite our paper as described in the Citation section. Some pretrained models and a few 360-degree images are provided in oslo_data. Note that in our paper we used the SUN360 equirectangular image database (images of size 9104×4552) for the experiments, but the images provided in the oslo_data repository differ from the SUN360 images.
Prerequisites
The code is based on PyTorch, so first install PyTorch in your virtual environment according to your CUDA version (if you do not have CUDA, use the CPU version of PyTorch). You also need to install PyG (PyTorch Geometric), again matching your CUDA and PyTorch versions. The remaining dependencies can be installed with the following command:
pip install -r requirements.txt
HEALPix sampling
HEALPix (an acronym for Hierarchical Equal Area isoLatitude Pixelation) is an algorithm for pixelisation of the 2-sphere in which each pixel covers the same surface area as every other pixel. In HEALPix, the tessellation process begins by partitioning the spherical surface into 12 equal-area regions (base resolution). To have finer pixelization, each region is recursively divided into 2x2 equal-area sub-pixels until the desired resolution is reached.
A HEALPix tessellation is parametrized by a number NSIDE=2^(resolution), and the total number of pixels equals 12xNSIDE^(2)=12x2^(resolution)x2^(resolution).
By construction, each pixel in HEALPix has eight adjacent neighbors (structured in a diamond pattern), except for 24 pixels that have seven neighbors (at any resolution higher than the base resolution). The orientation of the neighbors relative to the central pixel is almost fixed over the whole sphere. Each neighbor can be identified by its relative direction to the central pixel: SW, W, NW, N, NE, E, SE, and S.
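The pixel-count relations above can be checked with a few lines of arithmetic (pure Python, no HEALPix library required):

```python
# Number of HEALPix pixels as a function of the resolution level.
def nside(resolution: int) -> int:
    return 2 ** resolution

def npix(resolution: int) -> int:
    # 12 base pixels, each recursively subdivided into 2x2 sub-pixels per level.
    return 12 * nside(resolution) ** 2

# Base resolution: 12 pixels.
print(npix(0))   # 12
# Resolution 10, the input resolution used in the paper.
print(npix(10))  # 12582912
```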
Step 1: Sampling from Equirectangular images
In the first step, the equirectangular images must be sampled according to the HEALPix sampling. For that, use generate_healpix_samples_and_split_to_train_and_test.py. The script samples the sphere and, at the same time, splits the dataset into train, validation, and test sets. For example:
python ./generate_healpix_samples_and_split_to_train_and_test.py -i ../oslo_data/images_equirectangular -e jpg -r 10 -s -o ../oslo_data/Healpix_res_10 -t 0.8 -v 0.1
The script generates the sampled files in numpy .npy format. It also generates three text files named train.txt, test.txt, and validation.txt.
Arguments description:
- [-i|--image-dir]: Directory of the equirectangular images.
- [-e|--image-ext]: Image type (accepted values are ["jpg", "jpeg", "png", "tiff"]).
- [-r|--healpix-res]: Resolution of the HEALPix sampling (see HEALPix sampling).
- [-s|--run-healpix-sampling]: Run the HEALPix sampling. Otherwise, the script only splits the dataset into train/test/validation sets.
- [-o|--out-dir]: Output directory where the HEALPix samplings are stored as numpy arrays.
- [-t|--ratio-train-data]: Ratio of the data used for training.
- [-v|--ratio-validation-data]: Ratio of the data used for validation.
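The train/validation/test split driven by -t and -v can be sketched as follows (a simplified illustration, not the script's actual code; the file names and ratios are hypothetical):

```python
import random

def split_dataset(filenames, ratio_train=0.8, ratio_val=0.1, seed=0):
    """Split a list of file names into train/validation/test subsets.

    The remaining (1 - ratio_train - ratio_val) fraction goes to the test set.
    """
    files = sorted(filenames)
    random.Random(seed).shuffle(files)
    n_train = int(len(files) * ratio_train)
    n_val = int(len(files) * ratio_val)
    train = files[:n_train]
    val = files[n_train:n_train + n_val]
    test = files[n_train + n_val:]
    return train, val, test

# Hypothetical file names following the sampling script's output pattern.
files = [f"healpix_sampling_res_10_{i:02d}.npy" for i in range(10)]
train, val, test = split_dataset(files, ratio_train=0.8, ratio_val=0.1)
print(len(train), len(val), len(test))  # 8 1 1
```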
Step 2: Neighboring structure
Now you need to create a structure that determines the neighbors of each pixel. Each pixel in HEALPix has eight adjacent neighbors, each identified by its relative direction to the central pixel: SW, W, NW, N, NE, E, SE, and S. This structure can be created using generate_and_save_healpix_oslo_struct.py. For example:
python ./generate_and_save_healpix_oslo_struct.py -o ../oslo_data/neighbor_structure -hr 10 -pr 8
Arguments description:
- [-o|--out-dir]: Directory where the structures are saved.
- [-hr|--healpix-res]: Resolution of the HEALPix sampling.
- [-pr|--patch-res]: Resolution of the HEALPix patches.
Note: In our paper we used the following values for the pair (-hr, -pr): {(10, 8), (9, 7), (8, 6), (7, 5), (6, 4), (5, 3), (4, 2)}. For the first element in the set, i.e., (10, 8), hr=10 determines the input resolution of the HEALPix images, which is equivalent to 12x2^(10)x2^(10)=12582912 pixels (a resolution almost equal to 4K for 2D equirectangular images), and pr=8 determines the patch size of 2^(8)x2^(8)=256x256, as suggested in [1]. The remaining pairs in the set refer to the downsampled HEALPix resolutions and their corresponding patch sizes used in the auto-encoder architectures described in [1], [2]. So, to construct the rest we run:
python ./generate_and_save_healpix_oslo_struct.py -o ../oslo_data/neighbor_structure -hr 9 -pr 7
python ./generate_and_save_healpix_oslo_struct.py -o ../oslo_data/neighbor_structure -hr 8 -pr 6
python ./generate_and_save_healpix_oslo_struct.py -o ../oslo_data/neighbor_structure -hr 7 -pr 5
python ./generate_and_save_healpix_oslo_struct.py -o ../oslo_data/neighbor_structure -hr 6 -pr 4
python ./generate_and_save_healpix_oslo_struct.py -o ../oslo_data/neighbor_structure -hr 5 -pr 3
python ./generate_and_save_healpix_oslo_struct.py -o ../oslo_data/neighbor_structure -hr 4 -pr 2
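For each (-hr, -pr) pair above, the patch size and the number of patches tiling the sphere follow directly from the HEALPix pixel counts (a quick sanity check, not part of the scripts):

```python
def npix(res):
    # Total HEALPix pixels at a given resolution: 12 x 2^res x 2^res.
    return 12 * (2 ** res) ** 2

def num_patches(hr, pr):
    # Each patch holds 2^pr x 2^pr pixels; the sphere is tiled by
    # npix(hr) / patch_size patches.
    patch_size = (2 ** pr) ** 2
    return npix(hr) // patch_size

# All pairs keep hr - pr = 2, so the sphere is always covered by
# 12 x 4^(hr-pr) = 192 patches.
for hr, pr in [(10, 8), (9, 7), (8, 6), (7, 5), (6, 4), (5, 3), (4, 2)]:
    print(hr, pr, num_patches(hr, pr))  # 192 for every pair
```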
Side Note: If you want to create the DeepSphere graph structure described in [3], which results in less expressive isotropic convolutional filters, you can use the script generate_and_save_healpix_deepsphere_graph.py, which takes almost the same arguments.
Step 3: Training
The training is done with the main_sphere_compression.py script. Two architectures are implemented in this script: the factorized prior [1] and the scale hyperprior [2]. For example, to train the scale hyperprior architecture:
python ./main_sphere_compression.py --gpu -m SphereScaleHyperprior -nd ../oslo_data/neighbor_structure -i ../oslo_data/Healpix_res_10/train.txt -o ../oslo_data/scaleHyperprior_lambda0.0018 --lambda 0.0018 -q 1 --dataloader-num-workers 2 -bst 10 --checkpoint-interval 10 --no-scheduler
Note: If you get a segmentation fault (core dumped) while running the training, please double-check that your torch version is compatible with your torch_geometric (PyG) version.
Arguments description:
- [-g|--gpu]: Run on GPU (CUDA). Remove this argument if no GPU is available.
- [-gid|--gpu_id]: GPU device index. --gpu has to be set. Default is -1 (cuda).
- [-m|--model]: Architecture to use; must be either SphereFactorizedPrior or SphereScaleHyperprior.
- [-i|--train-data]: Path to the text file listing the training images in HEALPix format (created in Step 1).
- [-nd|--neighbor-struct-dir]: Path to the neighboring structure (created in Step 2).
- [-o|--out-dir]: Output directory where the model and results are saved.
- [--lambda]: Rate-distortion trade-off parameter in "Rate + lambda . Distortion".
- [-q|--quality]: Quality level that determines the number of parameters in the architecture (1: lowest, 8: highest).
- [--dataloader-num-workers]: Number of worker processes for multi-process data loading.
- [-bst|--batch-size-train]: Batch size for training.
- [--checkpoint-interval]: Interval at which checkpoints are saved, either for inference or for resuming training.
- [--no-scheduler]: Disable the scheduler.
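The objective "Rate + lambda . Distortion" controlled by --lambda can be sketched in a few lines (a schematic illustration with made-up numbers, not the training code; in the real model the rate term comes from the learned entropy model):

```python
import math

def rate_bits(likelihoods):
    # Rate estimate: -log2 of the likelihood the entropy model assigns
    # to each quantized latent, summed over all latents.
    return sum(-math.log2(p) for p in likelihoods)

def rd_loss(likelihoods, mse, num_pixels, lmbda):
    # Bits-per-pixel plus lambda-weighted distortion.
    bpp = rate_bits(likelihoods) / num_pixels
    return bpp + lmbda * mse

# Toy example: 4 latents, each with likelihood 0.5 -> 4 bits total.
loss = rd_loss([0.5, 0.5, 0.5, 0.5], mse=100.0, num_pixels=16, lmbda=0.0018)
print(loss)  # bpp 0.25 + 0.0018 * 100 distortion, i.e. about 0.43
```

A larger --lambda shifts the optimum toward lower distortion at a higher bit-rate, which is why the paper trains one model per (lambda, quality) pair.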
You can provide a validation set during training to enable a scheduler that reduces the learning rate based on the loss on the validation set:
python ./main_sphere_compression.py --gpu -m SphereScaleHyperprior -nd ../oslo_data/neighbor_structure -i ../oslo_data/Healpix_res_10/train.txt -v ../oslo_data/Healpix_res_10/validation.txt -o ../oslo_data/scaleHyperprior_lambda0.0018 --lambda 0.0018 -q 1 --dataloader-num-workers 2 -bst 10 -bsvt 10 --checkpoint-interval 1 --scheduler-patience 20
Additional Arguments:
- [-v|--validation-data]: Path to the text file listing the validation images in HEALPix format (created in Step 1).
- [-bsvt|--batch-size-valtest]: Batch size for the validation and test datasets.
- [--scheduler-patience]: Patience, in number of epochs, for the ReduceLROnPlateau scheduler.
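The effect of --scheduler-patience can be illustrated with a minimal plateau scheduler (a simplified re-implementation for illustration only; the training script uses PyTorch's ReduceLROnPlateau):

```python
class PlateauScheduler:
    """Reduce the learning rate when the validation loss stops improving."""

    def __init__(self, lr, patience, factor=0.5):
        self.lr = lr
        self.patience = patience
        self.factor = factor
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs > self.patience:
                self.lr *= self.factor  # reduce LR after the plateau persists
                self.bad_epochs = 0
        return self.lr

sched = PlateauScheduler(lr=1e-4, patience=2)
for loss in [1.0, 0.9, 0.9, 0.9, 0.9]:  # the loss plateaus after epoch 2
    lr = sched.step(loss)
print(lr)  # learning rate halved once the patience is exceeded
```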
Note: To load a checkpoint model for inference or to resume training, use the following argument:
- [--checkpoint-file]: Path of a previously saved checkpoint to resume training from.
For example, to load the checkpoint of epoch 800:
python ./main_sphere_compression.py --gpu -m SphereScaleHyperprior -nd ../oslo_data/neighbor_structure -i ../oslo_data/Healpix_res_10/train.txt -v ../oslo_data/Healpix_res_10/validation.txt -o ../oslo_data/scaleHyperprior_lambda0.0018 --lambda 0.0018 -q 1 --dataloader-num-workers 2 -bst 10 -bsvt 10 --checkpoint-interval 1 --scheduler-patience 20 --checkpoint-file ../oslo_data/scaleHyperprior_lambda0.0018/checkpoint_800.pth.tar
Side note: The script can also compress images using the DeepSphere architecture [3], which results in less expressive isotropic convolutional filters due to the use of graph convolution. To compress a HEALPix image with DeepSphere, run the script with the arguments -pf max_pool -upf pixel_shuffle -c ChebConv -w gaussian:
python ./main_sphere_compression.py --gpu -m SphereScaleHyperprior -nd ../oslo_data/neighbor_structure -i ../oslo_data/Healpix_res_10/train.txt -v ../oslo_data/Healpix_res_10/validation.txt -o ../oslo_data/DeepSphere_lambda0.0018 --lambda 0.0018 -q 1 --dataloader-num-workers 2 -bst 10 -bsvt 10 -pf max_pool -upf pixel_shuffle -c ChebConv -w gaussian --checkpoint-interval 1 --scheduler-patience 20 --checkpoint-file ../oslo_data/DeepSphere_lambda0.0018/checkpoint_800.pth.tar
Note that to use the above command for DeepSphere, the graph structures must first be created with generate_and_save_healpix_deepsphere_graph.py and saved in ../oslo_data/neighbor_structure.
Step 4: Evaluation
The same main_sphere_compression.py script can be used to evaluate a trained model on the test dataset. The trained model is loaded with [--checkpoint-file]. For example, to evaluate a trained FactorizedPrior model:
python ./main_sphere_compression.py --gpu -m SphereFactorizedPrior -nd ../oslo_data/neighbor_structure -t ../oslo_data/Healpix_res_10/test.txt -o ../oslo_data/factorizedPrior_lambda0.0932 --lambda 0.0932 -q 7 --dataloader-num-workers 2 -bsvt 10 --checkpoint-file ../oslo_data/checkpoints/factorizedPrior/factorizedPrior_lambda_0.0932_q_7.pth.tar --foldername-valtest reconstruction
Additional Arguments:
- [--foldername-valtest]: Name of the folder (created inside the directory given by the -o argument) in which the reconstructed files are stored in .npy format. Note that:
  - In addition to the reconstructed files in .npy format, grayscale Mollweide projections of the reconstructed HEALPix files are stored in png format for visualization.
  - The rate of each file is written to rates.txt. In this file, for each image, the theoretical rate (second column) and the actual rate (third column) are written. Due to some inefficiency of the entropy coder and some implementation details, there is a slight difference between the actual and theoretical rates.
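The rates.txt layout, one image per line with theoretical and actual rate columns, can be parsed as in the following sketch (the exact column layout is assumed from the description above; the file names and numbers are made up):

```python
def parse_rates(text):
    """Parse a rates.txt-style file with assumed layout:
    'name theoretical_rate actual_rate' per line."""
    rates = {}
    for line in text.strip().splitlines():
        name, theoretical, actual = line.split()[:3]
        rates[name] = (float(theoretical), float(actual))
    return rates

# Hypothetical file content with made-up rate values.
sample = """healpix_sampling_res_10_01 0.142 0.145
healpix_sampling_res_10_02 0.238 0.241"""

rates = parse_rates(sample)
print(rates["healpix_sampling_res_10_01"])  # (0.142, 0.145)
```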
Note: The lambda and quality values must match the values used during training, with which the checkpoint was saved. For example, when changing the lambda value to 0.0009, the quality is changed to 1 as follows:
python ./main_sphere_compression.py --gpu -m SphereFactorizedPrior -nd ../oslo_data/neighbor_structure -t ../oslo_data/Healpix_res_10/test.txt -o ../oslo_data/factorizedPrior_lambda0.0009 --lambda 0.0009 -q 1 --dataloader-num-workers 2 -bsvt 10 --checkpoint-file ../oslo_data/checkpoints/factorizedPrior/factorizedPrior_lambda_0.0009_q_1.pth.tar --foldername-valtest reconstruction
Note: In the experiments reported in our paper, we chose the (lambda, quality) pairs from the following set: {(0.0005, 1), (0.0009, 1), (0.0018, 1), (0.0035, 2), (0.0067, 3), (0.0130, 4), (0.0250, 5), (0.0483, 6), (0.0932, 7), (0.1800, 8)}.
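To avoid mismatches between a checkpoint and the command line, the (lambda, quality) pairs above can be kept in a small lookup table (a convenience sketch, not part of the repository):

```python
# (lambda, quality) pairs used in the paper's experiments.
LAMBDA_TO_QUALITY = {
    0.0005: 1, 0.0009: 1, 0.0018: 1, 0.0035: 2, 0.0067: 3,
    0.0130: 4, 0.0250: 5, 0.0483: 6, 0.0932: 7, 0.1800: 8,
}

def quality_for(lmbda):
    """Return the quality level matching a training lambda from the paper."""
    try:
        return LAMBDA_TO_QUALITY[lmbda]
    except KeyError:
        raise ValueError(f"lambda {lmbda} was not used in the paper") from None

print(quality_for(0.0932))  # 7
```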
Computing Spherical BD-rates
The script compute_spherical_psnr.py, provided in the misc folder, computes the rate and spherical quality points needed for the BD-rate computation. Spherical PSNR (S-PSNR) and Weighted-to-Spherically-uniform PSNR (WS-PSNR) are implemented as the objective quality metrics. The script is called as follows:
python ./misc/compute_spherical_psnr.py --original-dir ../oslo_data/images_equirectangular --original-ext jpg --projection-original erp -t ../oslo_data/Healpix_res_10/test.txt --test-files-prefix healpix_sampling_res_10_ --reconstruction-dir ../oslo_data/factorizedPrior_lambda0.0009 --reconstruction-subfolder reconstruction --reconstruction-ext npy --reconstruction-prefix healpix_sampling_res_10_ --reconstruction-suffix _reconstructed --projection healpix --rate-prefix healpix_sampling_res_10_ --sphere-points ./misc/sphere_655362.txt
Additional Arguments:
- [-o|--original-dir]: Directory containing the original images.
- [-oe|--original-ext]: Extension of the original images.
- [-po|--projection-original]: Projection type of the original images; erp (equirectangular) in our case.
- [-t|--test-data]: Text file listing the test images.
- [--test-files-prefix]: Common prefix in the uncompressed test file names (compared to the names of the original images).
- [--test-files-suffix]: Common suffix in the uncompressed test file names (compared to the names of the original images).
- [-r|--reconstruction-dir]: Main directory containing the reconstructed projections.
- [-rf|--reconstruction-subfolder]: Name of the folder containing the reconstructed projections. It is the folder name passed to --foldername-valtest during the evaluation in Step 4, where rates.txt is stored.
- [-re|--reconstruction-ext]: Extension of the reconstructed files.
- [-rp|--reconstruction-prefix]: Common prefix of the reconstructed file names.
- [-rs|--reconstruction-suffix]: Common suffix of the reconstructed file names.
- [-p|--projection]: Projection of the reconstructed images (healpix in our case).
- [-ratef|--rate-prefix]: File name prefix of each reconstructed image in the rate text file.
- [-rates|--rate-suffix]: File name suffix of each reconstructed image in the rate text file.
- [-s|--sphere-points]: File listing uniformly sampled sphere points (needed for S-PSNR; stored in the misc folder).
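Once the rate and S-PSNR/WS-PSNR points are collected for two codecs, the BD-rate itself can be computed with the standard Bjøntegaard metric. A common cubic-fit implementation looks like this (a generic sketch, not the repository's code; the rate and PSNR values below are made up):

```python
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Average rate difference (%) between two RD curves (Bjøntegaard metric).

    Fits log-rate as a cubic polynomial of PSNR for each curve and compares
    the integrals over the overlapping PSNR range.
    """
    p_anchor = np.polyfit(psnr_anchor, np.log(rate_anchor), 3)
    p_test = np.polyfit(psnr_test, np.log(rate_test), 3)
    lo = max(np.min(psnr_anchor), np.min(psnr_test))
    hi = min(np.max(psnr_anchor), np.max(psnr_test))
    int_anchor = np.polyval(np.polyint(p_anchor), hi) - np.polyval(np.polyint(p_anchor), lo)
    int_test = np.polyval(np.polyint(p_test), hi) - np.polyval(np.polyint(p_test), lo)
    avg_log_diff = (int_test - int_anchor) / (hi - lo)
    return (np.exp(avg_log_diff) - 1) * 100

# Sanity check: a codec that always needs twice the rate -> +100% BD-rate.
psnr = [30.0, 32.0, 34.0, 36.0]
rate = [0.1, 0.2, 0.4, 0.8]
print(round(bd_rate(rate, psnr, [2 * r for r in rate], psnr)))  # 100
```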
Visualisation
To plot a HEALPix-sampled map in the Mollweide projection, the script create_mollweide_projection.py is provided in the misc folder. Note that the Mollweide projection is unrelated to the HEALPix sampling itself; it is only a means of visualization. The script can be used as follows:
python ./misc/create_mollweide_projection.py -i ../oslo_data/Healpix_res_10/healpix_sampling_res_10_01.npy -o ../oslo_data/Healpix_res_10/healpix_sampling_res_10_01.png -w 4000
Additional Arguments:
- [-i|--input]: Path of the input HEALPix-sampled map (stored as an .npy file).
- [-o|--output]: Path where the Mollweide projection is saved.
- [-w|--width]: Width of the Mollweide projection image.
License
This code is licensed under GNU General Public License v3.0.
Citation
If you use this project, please cite our relevant publication:
@ARTICLE{mahmoudian2022oslo,
author={Mahmoudian Bidgoli, Navid and de A. Azevedo, Roberto G. and Maugey, Thomas and Roumy, Aline and Frossard, Pascal},
journal={IEEE Transactions on Image Processing},
title={OSLO: On-the-Sphere Learning for Omnidirectional Images and Its Application to 360-Degree Image Compression},
year={2022},
volume={31},
number={},
pages={5813-5827},
doi={10.1109/TIP.2022.3202357},
}
References
[1] J. Ballé, V. Laparra, and E. P. Simoncelli, “End-to-end optimized image compression,” in International Conference on Learning Representations (ICLR), 2017.
[2] J. Ballé, D. Minnen, S. Singh, S. J. Hwang, and N. Johnston, “Variational image compression with a scale hyperprior,” in International Conference on Learning Representations (ICLR), 2018.
[3] N. Perraudin, M. Defferrard, T. Kacprzak, and R. Sgier, “DeepSphere: Efficient spherical convolutional neural network with HEALPix sampling for cosmological applications,” Astronomy and Computing, vol. 27, pp. 130–146, Apr. 2019.