EvalVid with GPAC - Usage

You need some software that can produce MP4 files, e.g., MP4Box from GPAC, mp4creator from MPEG4IP, or Apple QuickTime Pro. Only the use of MP4Box is described here. I also recommend installing the Git version of ffmpeg:
 git clone git://git.ffmpeg.org/ffmpeg.git 
If you have trouble compiling ffmpeg, try my pre-compiled binaries.

Video Source

You can download several video source files in CIF format from, e.g., here.

Codec

These examples create raw compressed video streams at 30 frames per second with a GOP length of 30 frames and no B-frames. The bitrate control of XviD does not work, so it is omitted here. Also, ffmpeg's H.263 encoder does not support B-frames.

  1. XviD (MPEG-4)
    xvid_encraw -i akiyo_cif.yuv -w 352 -h 288 -framerate 30 -max_key_interval 30 -o a01.m4v
  2. ffmpeg (MPEG-4)
    ffmpeg -s cif -r 30 -b 64000 -bt 3200 -g 30 -i akiyo_cif.yuv -vcodec mpeg4 a02.m4v
  3. x264 (H.264)
    x264 -I 30 -B 64 --fps 30 -o a03.264 --input-res 352x288 akiyo_cif.yuv
  4. JM 10.2 (H.264)
    lencod -d a04.cfg
  5. ffmpeg (H.263)
    ffmpeg -s cif -vcodec h263 -r 30 -b 48000 -bt 2400 -g 30 -i akiyo_cif.yuv -f h263 a05.263

MP4-Container

The following command lines create ISO MP4 files containing the video samples (frames) and a hint track that describes how to packetize the frames for transport via RTP.
      MP4Box -hint -mtu 1024 -fps 30 -add a01.m4v a01.mp4
      MP4Box -hint -mtu 1024 -fps 30 -add a02.m4v a02.mp4
      MP4Box -hint -mtu 1024 -fps 30 -add a03.264 a03.mp4
      MP4Box -hint -mtu 1024 -fps 30 -add a04.264 a04.mp4
      MP4Box -hint -mtu 1024 -fps 30 -add a05.263 a05.mp4
    
MP4Box is part of GPAC (a very good framework, btw). You should install the SVN version of GPAC:
  svn co https://gpac.svn.sourceforge.net/svnroot/gpac/trunk/gpac gpac 
If you have trouble compiling MP4Box, try my pre-compiled binaries.

Creating Reference Videos

For some (most) video quality assessment methods you need a reference video. This is either the original YUV file before encoding or the YUV file created by decoding the coded video. Whether you need the "decoded" YUV depends on what you want to achieve. If you want to assess the video quality of a complete video-over-network system and not only the encoder quality, you should create the "decoded" YUV file. To produce these YUVs, use the corresponding video decoder (xvid_decraw, ffmpeg, ldec). If one does not exist or cannot output YUV, you can try this:
      ffmpeg -i a01.mp4 a01_ref.yuv
      ffmpeg -i a02.mp4 a02_ref.yuv
      ffmpeg -i a03.mp4 a03_ref.yuv
      ffmpeg -i a04.mp4 a04_ref.yuv
      ffmpeg -i a05.mp4 a05_ref.yuv
    

Sender and Receiver

The mp4trace tool from EvalVid is able to send a hinted MP4 file via RTP/UDP to a specified destination host.

  mp4trace -f -s 192.168.0.2 12346 a01.mp4 
sends the video track of a01.mp4 to UDP port 12346 of host 192.168.0.2. You can watch the video with, e.g., Apple's QuickTime Player. For this purpose you need an SDP file including the lines:
      m=video 12346 RTP/AVP 96
      a=rtpmap:96 H264/90000 (or H263-1998/90000 or whatever)
    
If you have MP4Box, you can extract the SDP-file with:
      MP4Box -std -sdp a01.mp4 > a01.sdp
    
and replace port 0 by, e.g., 12346 in the "m=video" line of a01.sdp.
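The port replacement can be done in an editor or scripted; here is a minimal sketch with sed, assuming the generated m-line looks like "m=video 0 RTP/AVP 96" (the exact layout of your SDP may differ):

```shell
# Replace the placeholder port 0 with 12346 in the "m=video" line
# and write the result to a new SDP file for playback.
sed 's/^m=video 0 /m=video 12346 /' a01.sdp > a01_play.sdp
```
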

If you are only interested in off-line evaluation you should use something like

  netcat -l -u -p 12346 > /dev/null 
to avoid unnecessary ICMP traffic. Furthermore, the output of mp4trace will be needed later, so it should be redirected to a file:
  mp4trace -f -s 192.168.0.2 12346 a01.mp4 > st_a01 

In order to evaluate the transmission you have to trace the IP packets at the sender and at the receiver. For this purpose I recommend tcpdump/windump. The command line on the sender is:

  tcpdump -n -tt -v udp port 12346 > sd_a01 
and on the receiver respectively:
  tcpdump -n -tt -v udp port 12346 > rd_a01 
Alternatively, it is possible to use the RTP sequence number instead of the IPv4 identification field. This is mandatory for IPv6 traces, since IPv6 no longer has such a field. The command line in this case is:
 tcpdump -n -tt -v -T rtp udp port 12346 

If both a video and an audio track are encapsulated in the MP4 file, mp4trace will send both streams to the given host address. The audio track is delivered on the given port plus 2. The resulting trace file includes packets/frames from both streams. It can be split with:

 grep A st > sta 
 grep -v A st > stv 
The corresponding dump files can be generated/recorded for instance as follows:
 tcpdump -w all.cap 
 tcpdump -r all.cap -n -tt -v udp port 12346 > [s/r]dv 
 tcpdump -r all.cap -n -tt -v udp port 12348 > [s/r]da 
The evaluation with etmp4 can be done separately for the audio and the video track; there is no difference in the command line options. For the evaluation of the perceptual quality of the audio track I recommend PESQ (for speech) or PEAQ (for general audio), respectively.

Video and Trace Files

When mp4trace has finished the transmission of the video, press ^C in the shells where tcpdump runs at the sender and at the receiver (or use "killall tcpdump" or the like). Now you have the MP4 file and the corresponding trace files for both the sending and the receiving side. Together with the original YUV file, this is all you need to evaluate the QoS and the perceptual quality with EvalVid.
akiyo_cif.yuv (and a01_ref.yuv) raw source files (before and after encoding)
a01.mp4 encoded, encapsulated and hinted video file
sd_a01 sender dump (IP packet dump sender)
rd_a01 receiver dump (IP packet dump receiver)
st_a01 sender trace (information about frame types, packet segmentation, ...)

Error generation

There are three possibilities to generate errors without a real network transmission:
  1. Manually delete lines (IP packets) from the rd file. This is useful, e.g., to test the behaviour of a codec in the case of specific, predetermined losses.
    A complete rd file (no losses) can be obtained by transmitting the video over a wired link with low traffic (optimal is a direct link between sender and receiver with a crossover cable).
  2. The eg tool (error generator from EvalVid) takes an sd and an st file and, given a bit error rate and an error distribution model, generates an rd file in which the lost packets are marked.
      eg sd_a01 rd_a01g st_a01 AWGN 250000
    This generates the file rd_a01g assuming an AWGN error model and a BER of 4E-6 (1/250000).
  3. Integrate the framework into a network simulation, e.g., OMNeT++ or NS-2. If you are interested in this possibility, drop me a mail.
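For possibility 1, the deletion can also be scripted. A minimal sketch that drops the 10th and 25th packet lines from a loss-free receiver dump (the chosen line numbers and the output file name are just examples):

```shell
# Delete lines 10 and 25 (i.e., two specific IP packets) from the
# loss-free receiver dump to simulate two deterministic packet losses.
sed '10d;25d' rd_a01 > rd_a01_manual
```
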

Evaluation

The first step in the evaluation process is the calculation of the reference PSNR. That's the PSNR of the coded and then decoded video without transmission errors/losses in relation to the uncoded raw video source.

  psnr 352 288 420 akiyo_cif.yuv a01_ref.yuv > ref_psnr.txt 
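The psnr tool computes the usual PSNR over 8-bit samples, i.e., PSNR = 20*log10(255/sqrt(MSE)) dB for a frame with mean squared error MSE. As a quick sanity check of the scale (the MSE value here is just an example, not taken from any real run):

```shell
# PSNR in dB for an example per-frame MSE of 100 over 8-bit samples:
# 20*log10(255/sqrt(100)) = 20*log10(25.5)
awk 'BEGIN { mse = 100; printf "%.2f\n", 20 * log(255/sqrt(mse)) / log(10) }'
```
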

The next step is the reconstruction of the transmitted video as it is seen by the receiver. For this, the video and trace files are processed by etmp4 (Evaluate Traces of MP4-file transmission):

  etmp4 -F -x sd_a01 rd_a01 st_a01 a01.mp4 a01e 
This generates a (possibly corrupted) video file in which all frames that were lost or corrupted are deleted from the original video track. Actually, two files are saved: an MP4 file containing the damaged video track (a01e.mp4) and a raw video file containing only the undamaged frames (a01e.264). These files are decoded to produce the YUV file as seen at the receiver. If the appropriate decoder is not able to produce a YUV with as many frames as the original, you can try:
  ffmpeg -i a01e.mp4 a01e.yuv 
The resulting YUV file should contain exactly as many frames as the original YUV file. Unfortunately, most codecs are not able to decode corrupted video files properly. E.g., ffmpeg often produces fewer frames or even crashes. There is nothing I can (or rather want to) do about this. However, in my tests the decoding mostly worked. Sometimes the last frame is not decoded, but this is not really a problem. The PSNR of the received video is calculated by:
  psnr 352 288 420 akiyo_cif.yuv a01e.yuv > psnr_a01e.txt 
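Whether the decoded YUV really has the same number of frames as the original can be checked from the file size alone: one CIF frame in YUV 4:2:0 occupies 352*288*1.5 = 152064 bytes. A sketch, using the file names from the example above:

```shell
# Number of frames = file size / bytes per CIF 4:2:0 frame (152064).
for f in akiyo_cif.yuv a01e.yuv; do
  echo "$f: $(( $(wc -c < "$f") / 152064 )) frames"
done
```
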

Etmp4 also creates some more files:
loss_a01e.txt contains the I, P, B, and overall frame loss in %
delay_a01e.txt contains frame number, lost flag, end-to-end delay, inter-frame gap at the sender, inter-frame gap at the receiver, and cumulative jitter in seconds
rate_s_a01e.txt contains time, bytes per second (current time interval), and bytes per second (cumulative) measured at the sender
rate_r_a01e.txt contains time, bytes per second (current time interval), and bytes per second (cumulative) measured at the receiver
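Since these files are plain whitespace-separated columns, they can be processed directly with standard tools. For example, the peak per-interval sender rate (column 2, as described above) could be extracted like this (file name taken from the example run):

```shell
# Print the maximum per-interval rate (column 2) of the sender rate file.
awk 'max < $2 { max = $2 } END { print max " bytes/s peak" }' rate_s_a01e.txt
```
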

If you are interested in delay or jitter distributions, the hist tool could be of interest. E.g.,

  awk '{print $3}' delay_a01.txt | hist - 0 .05 50 
gives the time, PDF and CDF of the end-to-end delay of transmission a01.

Video Quality

Since the PSNR alone does not mean much, you might want to use a quality metric that relates the quality of the encoded video to that of the received (possibly corrupted) video. For this purpose EvalVid provides the mos and miv tools. They calculate the "Mean Opinion Score" of every single frame of the received video and compare it to the MOS of every single frame of the original video, counting (within a given interval) the number of frames with a MOS worse than in the original. If you have made two measurements and calculated, e.g., ref_psnr.txt, psnr_01.txt, and psnr_02.txt, and put these files into the directory "/work", the command line:
  mos /work ref_psnr.txt 25 > mos.txt
calculates the average MOS of every PSNR file (last column of mos.txt) and the percentage of frames with a MOS worse than in the reference file within a sliding interval of 25 frames. These percentages are stored in miv_a01e.txt. Finally, the miv command calculates the maximum percentage of frames with a MOS worse than in the original:
  miv /work > miv.txt
For further explanations of this metric have a look at this paper.
