video clustering in matlab

https://stackoverflow.com/questions/18035978

23-06-2022
|

Question

i'm trying to write the code of this article:"Improving Cluster Selection and Event Modeling in Unsupervised Mining for Automatic Audiovisual Video Structuring"
a part of the it is about video clustering:
"The video stream is segmented into shots based on color histograms to detect abrupt changes and progressive transitions. Each of the resulting shot is summarized by a key frame, taken in the middle of the shot, in turn represented as a RGB histogram with 8 bins per color. Bottom-up clustering relies on the Euclidean distance between the 512-dimension color histograms using Ward’s linkage."
i've done this and reached to an array of numbers like this:
1.0e+03 *

that after performing the dendrogram part, the result became:

but according to a toy example that was given by authors of the article the result should be intervals like:
{47000, 50000}, {143400, 146400}, {185320, 187880},{228240, 231240}, {249440, 252000}, {346000, 349000} what is wrong here?

Solution

You should have 512 dimensional vectors at the first step, one such vector per frame, or equivalently a 512 x n matrix.

Then in the second step I don't think they use the plain built-in hierarchical clustering - which is not time aware, and will not produce intervals, plus it will scale O(n^3) which is really bad - but instead they use a customized clustering algorithm, inspired by hierarchical clustering and using Ward's linkage, but which operates on time intervals; starting with single frames, but only joining neighboring intervals, not arbitrary intervals like regular hierarchical clustering would.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow