Name
Efficient Content-Driven Encoding Towards a Target Video Quality
Date & Time
Thursday, October 24, 2024, 9:30 AM - 10:00 AM
Description

Video streaming has become a prevalent method for accessing multimedia content, making it crucial to optimize video quality while ensuring smooth playback. Traditional methods use a fixed bitrate ladder with predetermined bitrate-resolution pairs, which often fail to co-optimize bitrate and video quality. This mismatch results in either suboptimal video quality for a given bitrate or excessive bitrate for the desired quality. This paper introduces a novel content-driven approach that predicts encoding parameters to achieve a target perceptual video quality. We utilize the Video Multimethod Assessment Fusion (VMAF) metric, which aligns closely with human perception and is derived from a machine learning model trained on extensive video content and subjective quality assessments. Our approach employs a deep learning model with a SlowFast network architecture that analyzes video at two resolutions to predict a VMAF rate-distortion curve. The model uses a few higher-resolution frames for spatial information and more frequent lower-resolution frames for temporal detail, optimizing encoding efficiency and perceptual quality. Our results demonstrate that this method reduces encoding attempts and cloud computing resources while maximizing video quality and encoding efficiency.
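
A minimal sketch of the two-pathway idea described above, assuming a PyTorch implementation; the class name TwoPathwayVMAFRegressor, layer sizes, frame counts, input resolutions, and the fixed grid of rate-distortion points are illustrative assumptions, not the authors' actual model.

```python
# Sketch of a SlowFast-style two-pathway regressor that maps a video clip to
# points on a VMAF rate-distortion curve. All names and sizes are illustrative.
import torch
import torch.nn as nn


class TwoPathwayVMAFRegressor(nn.Module):
    """Slow pathway: a few high-resolution frames (spatial detail).
    Fast pathway: many low-resolution frames (temporal detail).
    Output: predicted VMAF at a fixed grid of candidate bitrates."""

    def __init__(self, num_rd_points: int = 8):
        super().__init__()
        # Slow pathway: 3D conv stack over sparsely sampled high-resolution frames.
        self.slow = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=(1, 7, 7), stride=(1, 2, 2), padding=(0, 3, 3)),
            nn.ReLU(inplace=True),
            nn.Conv3d(32, 64, kernel_size=(1, 3, 3), stride=(1, 2, 2), padding=(0, 1, 1)),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1),
        )
        # Fast pathway: lighter channels, denser temporal sampling at low resolution.
        self.fast = nn.Sequential(
            nn.Conv3d(3, 8, kernel_size=(5, 3, 3), stride=(1, 2, 2), padding=(2, 1, 1)),
            nn.ReLU(inplace=True),
            nn.Conv3d(8, 16, kernel_size=(3, 3, 3), stride=(1, 2, 2), padding=(1, 1, 1)),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1),
        )
        # Fuse pooled features and regress one VMAF value per candidate bitrate.
        self.head = nn.Sequential(
            nn.Linear(64 + 16, 128),
            nn.ReLU(inplace=True),
            nn.Linear(128, num_rd_points),
        )

    def forward(self, slow_frames: torch.Tensor, fast_frames: torch.Tensor) -> torch.Tensor:
        # slow_frames: (B, 3, T_slow, H_hi, W_hi); fast_frames: (B, 3, T_fast, H_lo, W_lo)
        s = self.slow(slow_frames).flatten(1)
        f = self.fast(fast_frames).flatten(1)
        return self.head(torch.cat([s, f], dim=1))  # (B, num_rd_points)


# Example: 4 high-resolution frames for the slow path, 32 low-resolution frames
# for the fast path (resolutions kept small here just to keep the demo cheap).
model = TwoPathwayVMAFRegressor(num_rd_points=8)
slow = torch.randn(1, 3, 4, 224, 224)
fast = torch.randn(1, 3, 32, 96, 96)
print(model(slow, fast).shape)  # torch.Size([1, 8])
```

In this reading, the slow pathway supplies texture and detail cues that drive spatial complexity, while the fast pathway captures motion, so a single forward pass can estimate the whole rate-quality curve without trial encodes.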

Technical Depth of Presentation
Intermediate
Take-Aways from this Presentation

Can we employ learning methodologies and predictive models to learn encoding parameters? How important is content in determining the encoding parameters? Which curve shapes work best for modeling such a VMAF rate-distortion curve?
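
A minimal sketch of how a predicted VMAF rate-distortion curve could be used to pick the lowest bitrate expected to reach a target quality; the candidate bitrates, predicted scores, and the helper bitrate_for_target_vmaf are hypothetical values for illustration, not results from the paper.

```python
# Given predicted VMAF scores on a fixed grid of candidate bitrates (e.g. the
# output of the model sketched above), interpolate to find the lowest bitrate
# expected to hit a target VMAF. All numbers below are made up.
import numpy as np

candidate_bitrates_kbps = np.array([300, 600, 1200, 1800, 2500, 3500, 5000, 7500])
predicted_vmaf = np.array([58.0, 71.5, 81.0, 86.0, 89.5, 92.0, 94.0, 95.5])


def bitrate_for_target_vmaf(target_vmaf: float) -> float:
    """Interpolate the (monotonically increasing) predicted RD curve to find
    the bitrate expected to just reach the target VMAF."""
    if target_vmaf <= predicted_vmaf[0]:
        return float(candidate_bitrates_kbps[0])
    if target_vmaf >= predicted_vmaf[-1]:
        return float(candidate_bitrates_kbps[-1])
    # np.interp expects increasing x; VMAF increases with bitrate here.
    return float(np.interp(target_vmaf, predicted_vmaf, candidate_bitrates_kbps))


print(bitrate_for_target_vmaf(90.0))  # ~2700 kbps for a VMAF-90 target
```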