Name
Efficient Content-Driven Encoding Towards a Target Video Quality
Date & Time
Thursday, October 24, 2024, 9:30 AM - 10:00 AM
Description

Video streaming has become a prevalent method for accessing multimedia content, making it crucial to optimize video quality while ensuring smooth playback. Traditional methods use a fixed bitrate ladder with predetermined bitrate-resolution pairs, which often fail to co-optimize bitrate and video quality. This mismatch results in either suboptimal video quality for a given bitrate or excessive bitrate for the desired quality. This paper introduces a novel content-driven approach that predicts encoding parameters to achieve a target perceptual video quality. We utilize the Video Multimethod Assessment Fusion (VMAF) metric, which aligns closely with human perception and is derived from a machine learning model trained on extensive video content and subjective quality assessments. Our approach employs a deep learning model with a SlowFast network architecture that analyzes video at two resolutions to predict a VMAF rate-distortion curve. The model uses a few higher-resolution frames for spatial information and more frequent lower-resolution frames for temporal detail, optimizing encoding efficiency and perceptual quality. Our results demonstrate that this method reduces encoding attempts and cloud computing resources while maximizing video quality and encoding efficiency.
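
A minimal sketch of the two-pathway idea described above, assuming a PyTorch implementation; the class name TwoPathwayVMAFRegressor, layer sizes, frame counts, input resolutions, and the fixed grid of rate-distortion points are illustrative assumptions, not the authors' actual model.

```python
# Sketch of a SlowFast-style two-pathway regressor that maps a video clip to
# points on a VMAF rate-distortion curve. All names and sizes are illustrative.
import torch
import torch.nn as nn


class TwoPathwayVMAFRegressor(nn.Module):
    """Slow pathway: a few high-resolution frames (spatial detail).
    Fast pathway: many low-resolution frames (temporal detail).
    Output: predicted VMAF at a fixed grid of candidate bitrates."""

    def __init__(self, num_rd_points: int = 8):
        super().__init__()
        # Slow pathway: 3D conv stack over sparsely sampled high-resolution frames.
        self.slow = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=(1, 7, 7), stride=(1, 2, 2), padding=(0, 3, 3)),
            nn.ReLU(inplace=True),
            nn.Conv3d(32, 64, kernel_size=(1, 3, 3), stride=(1, 2, 2), padding=(0, 1, 1)),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1),
        )
        # Fast pathway: lighter channels, denser temporal sampling at low resolution.
        self.fast = nn.Sequential(
            nn.Conv3d(3, 8, kernel_size=(5, 3, 3), stride=(1, 2, 2), padding=(2, 1, 1)),
            nn.ReLU(inplace=True),
            nn.Conv3d(8, 16, kernel_size=(3, 3, 3), stride=(1, 2, 2), padding=(1, 1, 1)),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1),
        )
        # Fuse pooled features and regress one VMAF value per candidate bitrate.
        self.head = nn.Sequential(
            nn.Linear(64 + 16, 128),
            nn.ReLU(inplace=True),
            nn.Linear(128, num_rd_points),
        )

    def forward(self, slow_frames: torch.Tensor, fast_frames: torch.Tensor) -> torch.Tensor:
        # slow_frames: (B, 3, T_slow, H_hi, W_hi); fast_frames: (B, 3, T_fast, H_lo, W_lo)
        s = self.slow(slow_frames).flatten(1)
        f = self.fast(fast_frames).flatten(1)
        return self.head(torch.cat([s, f], dim=1))  # (B, num_rd_points)


# Example: 4 high-resolution frames for the slow path, 32 low-resolution frames
# for the fast path (resolutions kept small here just to keep the demo cheap).
model = TwoPathwayVMAFRegressor(num_rd_points=8)
slow = torch.randn(1, 3, 4, 224, 224)
fast = torch.randn(1, 3, 32, 96, 96)
print(model(slow, fast).shape)  # torch.Size([1, 8])
```

In this reading, the slow pathway supplies texture and detail cues that drive spatial complexity, while the fast pathway captures motion, so a single forward pass can estimate the whole rate-quality curve without trial encodes.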

Technical Depth of Presentation
Intermediate
Take-Aways from this Presentation

Can we employ learning methodologies and predictive models to learn encoding parameters? How important is content in determining the encoding parameters? Which curve shapes work best for modeling such a VMAF rate-distortion curve?
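
A minimal sketch of how a predicted VMAF rate-distortion curve could be used to pick the lowest bitrate expected to reach a target quality; the candidate bitrates, predicted scores, and the helper bitrate_for_target_vmaf are hypothetical values for illustration, not results from the paper.

```python
# Given predicted VMAF scores on a fixed grid of candidate bitrates (e.g. the
# output of the model sketched above), interpolate to find the lowest bitrate
# expected to hit a target VMAF. All numbers below are made up.
import numpy as np

candidate_bitrates_kbps = np.array([300, 600, 1200, 1800, 2500, 3500, 5000, 7500])
predicted_vmaf = np.array([58.0, 71.5, 81.0, 86.0, 89.5, 92.0, 94.0, 95.5])


def bitrate_for_target_vmaf(target_vmaf: float) -> float:
    """Interpolate the (monotonically increasing) predicted RD curve to find
    the bitrate expected to just reach the target VMAF."""
    if target_vmaf <= predicted_vmaf[0]:
        return float(candidate_bitrates_kbps[0])
    if target_vmaf >= predicted_vmaf[-1]:
        return float(candidate_bitrates_kbps[-1])
    # np.interp expects increasing x; VMAF increases with bitrate here.
    return float(np.interp(target_vmaf, predicted_vmaf, candidate_bitrates_kbps))


print(bitrate_for_target_vmaf(90.0))  # ~2700 kbps for a VMAF-90 target
```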