Video Compression Using Convolutional Neural Networks of Video with Chroma Subsampling
Date & Time
Monday, October 24, 2022, 12:15 PM - 12:45 PM
Rob Gonsalves, Vahid Khorasani Ghassab

In the context of video compression based on convolutional neural networks (CNNs), and motivated by the human visual system's lower acuity for color differences than for luma, we investigate a video compression framework in which autoencoder networks encode and decode video using less chroma information than luma information. Instead of converting Y’CbCr 4:2:2/4:2:0 video to and from RGB 4:4:4, as current state-of-the-art methods do, we keep the video in Y’CbCr 4:2:2/4:2:0 and merge the luma and chroma channels after downsampling the luma to match the chroma size. The decoder performs the inverse operation. We evaluate our models against the 4:4:4 baseline using the CPSNR, MS-SSIM, and VMAF metrics. Our experiments show that, compared with video compression that converts to and from RGB 4:4:4, the proposed method increases video quality by about 5% for Y’CbCr 4:2:2 and 6% for Y’CbCr 4:2:0 while reducing the amount of computation by nearly 37% for Y’CbCr 4:2:2 and 40% for Y’CbCr 4:2:0. These results point toward optimizing current state-of-the-art autoencoders for 4:2:2 and 4:2:0 video.
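The merge step described in the abstract can be illustrated with a small sketch. It is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the luma is downsampled, so plain block averaging stands in for it here, and the names `LUMA_FACTORS`, `downsample_luma`, and `merge_planes` are hypothetical. Planes are plain lists of rows to keep the example dependency-free.

```python
# Assumed luma downsampling factors (vertical, horizontal) per chroma format:
# 4:2:0 halves chroma in both directions, 4:2:2 halves it horizontally only.
LUMA_FACTORS = {"4:2:0": (2, 2), "4:2:2": (1, 2)}


def downsample_luma(luma, fy, fx):
    """Average each fy-by-fx block of a luma plane given as a list of rows.

    Block averaging is an assumption; a CNN would more likely learn this step.
    """
    height, width = len(luma), len(luma[0])
    if height % fy or width % fx:
        raise ValueError("luma size must be a multiple of the downsampling factors")
    return [
        [
            sum(luma[y + dy][x + dx] for dy in range(fy) for dx in range(fx)) / (fy * fx)
            for x in range(0, width, fx)
        ]
        for y in range(0, height, fy)
    ]


def merge_planes(luma, cb, cr, subsampling="4:2:0"):
    """Return [Y_small, Cb, Cr]: three planes of identical size."""
    fy, fx = LUMA_FACTORS[subsampling]
    y_small = downsample_luma(luma, fy, fx)
    shape = (len(y_small), len(y_small[0]))
    for name, plane in (("Cb", cb), ("Cr", cr)):
        if (len(plane), len(plane[0])) != shape:
            raise ValueError(f"{name} plane does not match the downsampled luma size")
    return [y_small, cb, cr]


# Toy 4x4 frame in 4:2:0: the luma plane is 4x4, each chroma plane is 2x2.
luma = [
    [16, 16, 32, 32],
    [16, 16, 32, 32],
    [64, 64, 128, 128],
    [64, 64, 128, 128],
]
cb = [[120, 121], [122, 123]]
cr = [[130, 131], [132, 133]]

merged = merge_planes(luma, cb, cr, "4:2:0")
print(len(merged), len(merged[0]), len(merged[0][0]))  # channels, rows, columns
```

The merged input has three channels at chroma resolution, so the network processes fewer samples per frame than it would for an RGB 4:4:4 input of the same picture size. The decoder side would need a matching step that restores the luma to full resolution, which this sketch does not cover.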

Location Name
Salon 1
Take-Aways from this Presentation
We designed a new codec framework. Our contributions are as follows: (1) We propose a novel autoencoder architecture in which the encoder and the decoder each consist of two parallel sections. (2) The proposed video compression models increase decoded video quality while decreasing computational cost compared with current autoencoder-based compression methods. (3) Instead of converting Y'CbCr 4:2:0 video to and from RGB 4:4:4, we propose keeping the video in Y'CbCr 4:2:0 and merging the luma and chroma channels after the luma is downsampled to match the chroma size.