There is often a lot of confusion from people using MP4Box to create MP4 files or DASH content with a specific aspect ratio. This confusion often comes from badly-chosen acronyms in the MP4/DASH standards. In this post, we clarify what MP4Box uses and does.
First, there is an acronym mess! The acronym PAR can refer either to Pixel Aspect Ratio (aka Sample Aspect Ratio, SAR) or to Picture Aspect Ratio (aka Display Aspect Ratio, DAR). Unfortunately, SAR sometimes also refers to Storage Aspect Ratio…
Regarding the input to MP4Box: when encoding a video stream, it is possible to achieve better compression by not encoding all input pixels, for example by sub-sampling in one direction, most often the horizontal one. This is illustrated in the following figure.
A decoder will then perform the opposite process and stretch the decoded pixels on the display. With horizontal sub-sampling by a factor of 2, one pixel in the encoded image will correspond to two pixels in the displayed image. This is what happens in anamorphic video.
Now, let’s take an example. Consider an input image made of 200×100 pixels, encoded with the above configuration, i.e. with sub-sampling by a factor of 2 in the horizontal direction. Encoding it with H.264|AVC or HEVC will produce a bitstream with the following characteristics:
- SPS dimensions: 100×100
- VUI message with desired sample aspect ratio 2:1.
Upon importing such a bitstream, MP4Box will produce an MP4 file with the following characteristics:
- stsd: width: 100 / height: 100
- pasp box: hSpacing: 2 / vSpacing: 1
- track header: width: 200 / height: 100
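The track header size above follows from the coded size and the pasp values. Here is a minimal Python sketch of that arithmetic, assuming that only the width is scaled, by hSpacing/vSpacing. The function name `display_size` is hypothetical and not part of MP4Box.

```python
from fractions import Fraction


def display_size(coded_width, coded_height, h_spacing, v_spacing):
    """Return the (width, height) at which the coded picture should be shown.

    Assumption: only the width is scaled, by the pixel aspect ratio
    h_spacing / v_spacing; the height is kept as coded.
    """
    pixel_aspect = Fraction(h_spacing, v_spacing)
    width = coded_width * pixel_aspect
    # Round to the nearest integer pixel in case the ratio does not divide evenly.
    return round(width), coded_height


print(display_size(100, 100, 2, 1))  # (200, 100)
```

With the values from the example (stsd 100×100, pasp 2:1), this gives the 200×100 listed for the track header.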
Note that the same effect can be obtained, without re-encoding the video, by passing the ‘par’ parameter when adding the video stream to the MP4 file:
MP4Box -add file.264:par=2:1 file.mp4
When DASH-ing such a file, MP4Box will produce a DASH Representation with the following characteristics:
- width = 100 / height = 100
- sar = 2:1
- par = 200:100
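These three attributes are tied together: the picture aspect ratio is the coded size multiplied by the sample aspect ratio. Here is a short Python sketch of that relation, assuming the par value is written unreduced, as in the list above. The function name `dash_par` is hypothetical and is not an MP4Box API.

```python
def dash_par(width, height, sar):
    """Derive the picture aspect ratio string from the coded size and sar.

    `sar` is a string such as "2:1". The result is left unreduced to
    match the 200:100 value listed above; reduce it if a canonical
    form is preferred.
    """
    sar_x, sar_y = (int(v) for v in sar.split(":"))
    return f"{width * sar_x}:{height * sar_y}"


print(dash_par(100, 100, "2:1"))  # 200:100
```

Here sar plays the role of the Pixel (Sample) Aspect Ratio and par that of the Picture (Display) Aspect Ratio, which is exactly the acronym overlap described at the start of this post.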