Video programmes contain many different types of shot boundary, and those where the boundary is spread over a number of frames may not be detected by the method above. For example, a fade or dissolve lasting 3 seconds in, say, a cookery or gardening programme spans 75 frames at 25 frames per second, and the incremental difference between adjacent frames in this sequence will be quite small. Furthermore, the two shots on either side of such a transition may have similar colouring, and hence similar colour histograms, anyway. Such omissions are difficult to avoid with colour histogram based segmentation; nevertheless, the method is very popular and, when used correctly, reliable and accurate. On the downside, it is slow to compute because each frame of the digital video has to be decoded and processed to extract its colour values.
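As a rough illustration of why gradual transitions slip past a per-frame threshold, the sketch below compares the histogram difference across a hard cut with the largest adjacent-frame difference inside a 3 second dissolve. It is a toy model under stated assumptions, not the method referred to above: frames are flat lists of grey-level values rather than decoded colour frames, the rate of 25 frames per second is the one implied by the 75-frame figure, the dissolve is a simple linear blend, and the names `histogram`, `histogram_difference`, and `dissolve` are hypothetical.

```python
def histogram(frame, bins=8, max_value=256):
    """Count how many pixel values fall into each of `bins` equal-width bins."""
    counts = [0] * bins
    width = max_value // bins
    for value in frame:
        counts[min(value // width, bins - 1)] += 1
    return counts


def histogram_difference(frame_a, frame_b, bins=8):
    """Sum of absolute bin-wise differences, normalised to the range [0, 1]."""
    hist_a = histogram(frame_a, bins)
    hist_b = histogram(frame_b, bins)
    total = sum(abs(a - b) for a, b in zip(hist_a, hist_b))
    return total / (2 * len(frame_a))


def dissolve(shot_a, shot_b, n_frames):
    """Linearly blend shot_a into shot_b over n_frames frames (assumed model)."""
    frames = []
    for i in range(n_frames):
        alpha = i / (n_frames - 1)
        frames.append([round((1 - alpha) * a + alpha * b)
                       for a, b in zip(shot_a, shot_b)])
    return frames


FPS = 25                            # assumed frame rate
shot_a = list(range(0, 64))         # dark toy frame, 64 pixels
shot_b = list(range(192, 256))      # bright toy frame, 64 pixels

# Difference across an abrupt cut from shot_a to shot_b.
cut_difference = histogram_difference(shot_a, shot_b)

# Largest difference between adjacent frames inside a 3 second dissolve.
transition = dissolve(shot_a, shot_b, 3 * FPS)
max_step = max(histogram_difference(transition[i], transition[i + 1])
               for i in range(len(transition) - 1))

print("frames in dissolve:", len(transition))
print("difference across a cut:", cut_difference)
print("largest adjacent-frame difference in the dissolve:", max_step)
```

In this toy case, any threshold set high enough to ignore ordinary frame-to-frame variation would sit well above the largest step inside the dissolve, so the transition would go undetected.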




Many algorithms have been proposed for detecting video shot boundaries and classifying shots and shot transition types. Few published studies compare the available algorithms, and those that do have looked at a limited range of test material. Several shot boundary detection and classification techniques, and their variations, can be compared, including histogram, discrete cosine transform, motion vector, and block matching methods. The performance of these algorithms, and the ease of selecting good thresholds for them, are evaluated on a wide variety of video sequences with a good mix of transition types. Threshold selection requires a trade-off between recall and precision that must be guided by the target application.
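The trade-off between recall and precision can be made concrete with a small sketch. The function below is a hypothetical helper, not taken from any of the studies mentioned: it matches detected boundary frames against a reference list and reports precision (the fraction of detections that are correct) and recall (the fraction of true boundaries that were found). The scores and reference frames are invented purely for illustration.

```python
def precision_recall(detected, ground_truth, tolerance=0):
    """Match detected boundary frames to reference ones within `tolerance` frames."""
    unmatched = sorted(ground_truth)
    true_positives = 0
    for frame in sorted(detected):
        # Each reference boundary may be matched by at most one detection.
        match = next((g for g in unmatched if abs(g - frame) <= tolerance), None)
        if match is not None:
            unmatched.remove(match)
            true_positives += 1
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Illustrative discontinuity scores per frame index (made-up numbers).
scores = {100: 0.9, 180: 0.35, 250: 0.8, 400: 0.3, 575: 0.6, 600: 0.25}
reference = [100, 250, 400, 575]    # assumed true boundaries

results = {}
for threshold in (0.2, 0.5):
    detected = [frame for frame, z in scores.items() if z > threshold]
    results[threshold] = precision_recall(detected, reference)
    print(threshold, results[threshold])
```

With these numbers the low threshold finds every reference boundary but also raises two false alarms, while the high threshold raises no false alarms but misses the weak boundary at frame 400. Which behaviour is preferable depends on the application.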



Different features that can be used to measure visual discontinuity:

Pixel differences

Two common approaches:

• Calculate the pixel-to-pixel difference between two frames and compare the sum with a threshold.
• Count the number of pixels whose value changes by more than some threshold and compare the total against a second threshold.
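The two approaches can be sketched as follows. This is a minimal illustration that assumes each frame is a flat list of grey-level values of equal length; the function names and the threshold values are hypothetical and would need tuning on real material.

```python
def sum_of_differences(frame_a, frame_b):
    """Approach 1: total absolute pixel-to-pixel difference."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b))


def count_changed_pixels(frame_a, frame_b, pixel_threshold):
    """Approach 2: number of pixels that change by more than pixel_threshold."""
    return sum(1 for a, b in zip(frame_a, frame_b)
               if abs(a - b) > pixel_threshold)


def is_boundary_by_sum(frame_a, frame_b, sum_threshold):
    """Declare a boundary when the summed difference exceeds the threshold."""
    return sum_of_differences(frame_a, frame_b) > sum_threshold


def is_boundary_by_count(frame_a, frame_b, pixel_threshold, count_threshold):
    """Declare a boundary when enough pixels have changed significantly."""
    changed = count_changed_pixels(frame_a, frame_b, pixel_threshold)
    return changed > count_threshold


previous = [10, 10, 10, 10, 200, 200, 200, 200]
noisy = [12, 9, 11, 10, 198, 201, 200, 203]       # same shot, slight noise
new_shot = [200, 200, 200, 200, 10, 10, 10, 10]   # different content

print(sum_of_differences(previous, noisy))         # 10
print(sum_of_differences(previous, new_shot))      # 1520
print(count_changed_pixels(previous, noisy, 20))   # 0
print(count_changed_pixels(previous, new_shot, 20))  # 8
```

The second approach ignores small per-pixel fluctuations entirely, which makes it less sensitive to noise than a plain sum, at the cost of a second threshold to choose.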




In the first step of this process, feature extraction is performed; the features capture various aspects of the visual content of a video. A metric is then used to quantify the feature variation from frame k to frame k+l. The discontinuity value z(k, k+l) is the magnitude of this variation and serves as input to the detector, where it is compared against a threshold T. If the threshold is exceeded, a shot boundary between frames k and k+l is detected. To draw reliable conclusions about the presence or absence of a shot boundary between frames k and k+l, the features and metrics used to compute the discontinuity values z(k, k+l) must be as discriminating as possible.
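The pipeline described here (feature extraction, a metric giving z(k, k+l), and a comparison against T) can be sketched generically. The function `detect_boundaries` and its parameter names are assumptions made for illustration, with `skip` playing the role of l; the mean-intensity feature and absolute-difference metric are deliberately simple stand-ins and not particularly discriminating.

```python
def detect_boundaries(frames, extract, metric, threshold, skip=1):
    """Return (k, k + skip) pairs whose discontinuity value exceeds threshold.

    extract: maps a frame to its feature representation
    metric:  maps two feature representations to a discontinuity value z
    skip:    the inter-frame distance l used when comparing frames
    """
    features = [extract(frame) for frame in frames]
    boundaries = []
    for k in range(len(features) - skip):
        z = metric(features[k], features[k + skip])   # z(k, k+l)
        if z > threshold:                             # compare against T
            boundaries.append((k, k + skip))
    return boundaries


def mean_intensity(frame):
    """Toy feature: average grey level of the frame."""
    return sum(frame) / len(frame)


def absolute_difference(feature_a, feature_b):
    """Toy metric: magnitude of the change in the feature."""
    return abs(feature_a - feature_b)


frames = [[10, 12], [11, 13], [12, 12], [200, 202], [201, 203]]
found = detect_boundaries(frames, mean_intensity, absolute_difference,
                          threshold=50)
print(found)   # [(2, 3)]
```

Keeping the feature and the metric as separate arguments makes it easy to swap in more discriminating choices, such as colour histograms with a bin-wise distance, without changing the detector itself.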



Problem: Given a video V consisting of n shots, find the beginning and the end of each shot. Solving this problem is fundamental to any kind of video analysis and video application, since it enables segmentation of a video into its basic components.
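Once cut positions are known, the beginning and end of each shot follow directly. The helper below is a hypothetical sketch that assumes hard cuts only and that each cut is reported as the index of the first frame of the new shot; gradual transitions would need a start and end frame of their own.

```python
def shots_from_cuts(cut_frames, total_frames):
    """Turn cut positions into (first_frame, last_frame) pairs, one per shot.

    cut_frames:   index of the first frame of each new shot (assumed convention)
    total_frames: number of frames in the video
    """
    starts = [0] + sorted(cut_frames)
    ends = [start - 1 for start in starts[1:]] + [total_frames - 1]
    return list(zip(starts, ends))


shots = shots_from_cuts([120, 300], total_frames=450)
print(shots)   # [(0, 119), (120, 299), (300, 449)]
```

A video with n shots therefore has n - 1 cuts under this convention.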




Shot boundary detection is the first step towards scene extraction in videos, which is useful for video content analysis and indexing. Shots are the basic units of a video, and there are many types of transition between them. Shot boundaries can be classified into two main categories: cut and gradual. A cut is an abrupt shot change that occurs over a single frame, while a gradual transition is a slower change that spans a number of consecutive frames. Among gradual transitions, fades and dissolves are common. A fade is usually a change in brightness with one or several solid black frames in between, while a dissolve occurs when the images of the current shot get dimmer as the images of the next shot get brighter.




