08-Mar-2007

Dave Zes, UCLA statistics graduate student. Ph.D. advisor: Professor Yingnian Wu.

The following is with reference to Doretto, Chiuso, Wu, Soatto, "Dynamic Textures", International Journal of Computer Vision 51(2), 91-109, 2003.

I transcribed the original MATLAB code into R. Image processing in R requires libtiff on Unix, plus the rtiff and pixmap packages in R.

Using the algorithm given in the paper I processed a number of videos.

The process goes as follows:

The video is discretized into individual frames; from each frame the RGB pixel data is separated by channel; each bitmap is compressed by PCA; the resulting matrices are processed by the method of the paper cited above; the extrapolated textures are expanded back to their original dimensionality; finally the frames are sequenced back into video.
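The steps above can be sketched for a single colour channel as follows. This is a minimal illustration in Python with NumPy, not the R transcription described on this page. It assumes the closed-form (suboptimal) learning procedure from the cited paper: PCA by SVD, a least-squares state transition, and a noise term estimated from the one-step residuals. The function names, the state dimension `n`, and the layout of `Y` (one column per frame) are assumptions made for the example.

```python
import numpy as np

def fit_dynamic_texture(Y, n):
    """Fit a linear dynamic texture to Y (pixels x frames, one colour channel).

    n is the number of PCA components kept (the state dimension); assumed here.
    """
    y_mean = Y.mean(axis=1, keepdims=True)           # temporal mean image
    U, s, Vt = np.linalg.svd(Y - y_mean, full_matrices=False)
    C = U[:, :n]                                     # PCA basis (appearance)
    X = s[:n, None] * Vt[:n, :]                      # compressed states, one column per frame
    A = X[:, 1:] @ np.linalg.pinv(X[:, :-1])         # least-squares state transition
    E = X[:, 1:] - A @ X[:, :-1]                     # one-step prediction residuals
    Q = E @ E.T / (X.shape[1] - 1)                   # residual covariance
    w, V = np.linalg.eigh(Q)                         # Q is symmetric
    B = V * np.sqrt(np.clip(w, 0.0, None))           # chosen so that B @ B.T == Q
    return y_mean, C, X, A, B

def extrapolate(y_mean, C, X, A, B, total_frames, rng):
    """Return pixels x total_frames: the training portion, then synthesized frames."""
    tau = X.shape[1]
    states = [X[:, t] for t in range(tau)]
    x = X[:, -1]                                     # continue from the last training state
    for _ in range(total_frames - tau):
        x = A @ x + B @ rng.standard_normal(B.shape[1])
        states.append(x)
    # Expand the compressed states back to the original pixel dimensionality
    return y_mean + C @ np.column_stack(states)
```

In use, each of the R, G and B channels would be flattened to a pixels-by-frames matrix, fitted and extrapolated separately, and then recombined into frames. Note that the training portion of the output is the PCA approximation of the original frames, so it matches them exactly only when `n` is at least the rank of the data.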

The single most interesting result is that the Dynamic Texture extrapolation sometimes shows evidence of "splicing", and sometimes does not.

The process allows the user to supply a training video, and returns a video of arbitrary length that includes the training portion.

In every video below, except "Star", the original video is given at 24 fps and the extrapolated video at 8 fps.

The most striking example is the "Clouds" video:

Original (61 frames)

Extrapolated (150 frames, total)

Notice there is no evidence of repetition; the training portion smoothly transitions into synthesized textures.

It is likewise difficult to see evidence of splicing in "Flame":

Original (89 frames)

Extrapolated (150 frames, total)

Splicing is evident in "Fire":

Original (32 frames)

Extrapolated (100 frames, total)

And also "Ocean":

Original (32 frames)

Extrapolated (100 frames, total)

 

I also constructed some strongly cyclic video, with which I was able to compare the Dynamic Texture process with Adaptive Least Squares (ALS).

For each pixel, ALS views the values as an autoregressive series; correlation through time is easy to handle since there is no moving average component (either in the video or in the model).

The original "Star" video is not shown; it is 18 frames (one-fourth of a rotation).

Star Dynamic Texture (100 frames)

Star ALS (100 frames)

In the ALS process, the (rather acute) problem of a singular variance-covariance matrix was handled easily by slight "diagonal loading", also known as regularization.
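To illustrate what diagonal loading does in a per-pixel autoregressive fit, here is a minimal sketch in Python with NumPy. The exact ALS variant used for the "Star" video is not spelled out on this page, so the code substitutes a plain batch least-squares AR fit with a small ridge term added to the diagonal of the normal-equations matrix. The function names, the AR order, and the `loading` value are all assumptions made for the example.

```python
import numpy as np

def fit_ar_loaded(series, order, loading=1e-6):
    """Least-squares AR(order) coefficients for one pixel, with diagonal loading."""
    series = np.asarray(series, dtype=float)
    n = len(series)
    # Column k holds the lag-(k+1) values aligned with the targets series[order:]
    lags = np.column_stack(
        [series[order - k - 1 : n - k - 1] for k in range(order)]
    )
    target = series[order:]
    gram = lags.T @ lags                      # can be singular or nearly so
    gram = gram + loading * np.eye(order)     # diagonal loading (ridge regularization)
    return np.linalg.solve(gram, lags.T @ target)

def forecast_ar(series, coef, steps):
    """Extend the series by `steps` values using the fitted AR coefficients."""
    values = [float(v) for v in series]
    order = len(coef)
    for _ in range(steps):
        recent = values[-1 : -order - 1 : -1]  # most recent value first
        values.append(float(np.dot(coef, recent)))
    return np.array(values)
```

A constant pixel shows why the loading matters: every lag column is identical, so the unloaded matrix is exactly singular, while the loaded system still has a unique solution. Applying the two functions independently to every pixel's time series gives a forecast for the whole frame sequence.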