You can use this short video, with the object bounding box at the first frame given as [320.8751 175.1776 103.5404 129.0504] in [x y width height] form, where (x, y) is the top-left corner. Track the center of the box and see how well your tracker works.
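As a starting point, the center to track can be computed from the initial box. This is a minimal sketch that assumes the four numbers are ordered as left (x), top (y), width, height, which is the usual MATLAB bounding-box convention; the variable names are only illustrative.

```matlab
% Initial bounding box from the assignment, assumed to be [left top width height].
bbox = [320.8751 175.1776 103.5404 129.0504];

% The center is the top-left corner shifted by half the width and half the height.
center = [bbox(1) + bbox(3)/2, bbox(2) + bbox(4)/2];

fprintf('center = (%.4f, %.4f)\n', center(1), center(2));
% prints: center = (372.6453, 239.7028)
```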
You can also use a built-in MATLAB function to detect the face every k frames. See whether it helps, and comment on what you observe.
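One possible way to set up the periodic detection is sketched below. It assumes the built-in detector meant here is `vision.CascadeObjectDetector` from the Computer Vision Toolbox, which may not be the one intended; the file name `face_video.mp4` and the value `k = 10` are placeholders, and your own tracker update is left as a stub.

```matlab
% Sketch only: the file name and k are placeholder assumptions.
videoFile = 'face_video.mp4';   % hypothetical file name
k = 10;                         % run the detector every k frames (assumed value)

% True on frames k, 2k, 3k, ...; frame 1 keeps the box given in the assignment.
isDetectFrame = @(idx) mod(idx, k) == 0;

bbox = [320.8751 175.1776 103.5404 129.0504];   % initial [left top width height]

if isfile(videoFile)
    reader = VideoReader(videoFile);
    detector = vision.CascadeObjectDetector();  % default model detects frontal faces
    frameIdx = 0;
    while hasFrame(reader)
        frame = readFrame(reader);
        frameIdx = frameIdx + 1;
        if isDetectFrame(frameIdx)
            found = detector(frame);            % M-by-4 rows of [x y width height]
            if ~isempty(found)
                bbox = found(1, :);             % naive choice: take the first detection
            end
        else
            % Your own tracker would update bbox here (not shown).
        end
        center = [bbox(1) + bbox(3)/2, bbox(2) + bbox(4)/2];
        fprintf('frame %d: center = (%.1f, %.1f)\n', frameIdx, center(1), center(2));
    end
end
```

Taking the first detection is the simplest choice; if the detector returns several boxes, picking the one closest to the previous center may be more robust. Comparing the tracked center with and without the periodic reset is one way to observe whether detection reduces drift.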