
Model and Method of Key Frame Search for Video Summarization

Electronic Archive of Kharkiv National University of Radio Electronics (Open Access Repository of KHNURE)

 
 
Field Value
 
Creator Михнова, Е. Д.
 
Date 2015-03-13T09:43:52Z
2015-03-13T09:43:52Z
2014
 
Identifier Михнова, Е. Д. Model and method of key frame search for video summarization : author's abstract of a Candidate of Technical Sciences dissertation : 05.13.23 "Systems and Means of Artificial Intelligence" / Е. Д. Михнова ; Ministry of Education and Science of Ukraine, Kharkiv National University of Radio Electronics. – Kharkiv, 2014. – 147 p.
http://hdl.handle.net/123456789/1909
 
Description The thesis advances a topical area of video data processing: the search for
key frames in video sequences of different genres using spatio-temporal segmentation
methods. For temporal segmentation, the advantages and drawbacks of existing scene
boundary detection methods are examined. For spatial segmentation, the field of view of
each frame is partitioned using Voronoi diagrams that take into account the color, texture,
and shape of regions. The method for locating the salient points from which the Voronoi
diagrams are built has been improved. The possibilities and specifics of constructing
Voronoi diagrams of different orders to represent video content are analyzed.
Specialized metrics have been developed that make it possible to compare Voronoi diagrams
and detect significant content changes throughout a video.
Based on the proposed mathematical model for representing and analyzing video content,
a key frame search method has been built that not only extracts frames with significant
content but also eliminates frames that are close in content.
The experiments carried out confirmed the competitiveness of the new model and method,
which have been successfully deployed at an industrial enterprise.
The integration of video endoscopes and cameras into industrial, medical, security, and
other tracking systems has generated great interest in intelligent video processing. As a rule,
monitoring any object requires a huge amount of time and human resources, not
only for decision making that influences the state of the objects under control, but also for
long-lasting tracking and identification of interesting or unusual conditions. Video
summarization is an excellent tool for eliminating redundancy in video, which is achieved
by key frame extraction. These static images with significant content provide a brief
overview of what happened over hours of video. Key frames can also facilitate
indexing, archiving, searching, and cataloging video information.
Despite the variety of existing key frame extraction methods and content representation
models, the main problem they face is the gap between the low-level information retrieved
from video and the high-level semantic interpretation required for efficient summarization.
Another challenge lies in the differing lighting conditions and camera characteristics
under which a video is shot. For these reasons, video summarization attracts growing research
and development effort.
To overcome these problems, a novel model and method have been
developed. Their mathematical basis is the Voronoi diagram, applied to the
spatial segmentation of frames. A Voronoi diagram is a decomposition of a plane
(a frame, in our case) into Voronoi tessellations, or regions. Each tessellation of a
first-order Voronoi diagram is associated with its own salient point and built under the
following rule: the distance from any point of the tessellation to its salient
point is less than or equal to its distance to any other salient point. Voronoi diagrams of
higher order (or generalized Voronoi diagrams) are built according to the following rule: for
any point of a tessellation, the distance to the farthest of its corresponding
generator points is less than or equal to the distance to the nearest generator point of any
other tessellation. An arbitrary Voronoi tessellation of order k may contain from 0 to k
generator points. A simple first-order Voronoi diagram is a particular case of higher-order
Voronoi diagrams.
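The two construction rules above can be illustrated with a minimal brute-force sketch in Python. It is not the thesis implementation: the function names `voronoi_labels` and `order_k_cells`, the use of squared Euclidean distance on a pixel grid, and the (row, column) point format are all assumptions made for illustration.

```python
import numpy as np

def voronoi_labels(height, width, points):
    """First-order rule: label each pixel with the index of its nearest generator point.

    `points` is a sequence of (row, col) generator (salient) points. Also returns
    the matrix of squared distances, shape (height * width, n_points).
    """
    ys, xs = np.mgrid[0:height, 0:width]
    pixels = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    pts = np.asarray(points, dtype=float)
    d2 = ((pixels[:, None, :] - pts[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1).reshape(height, width), d2

def order_k_cells(d2, k, height, width):
    """Order-k rule: for each pixel, the sorted indices of its k nearest generators.

    Pixels that share the same index set lie in the same order-k region: the
    farthest generator in the set is no farther than the nearest one outside it.
    """
    nearest = np.argsort(d2, axis=1, kind="stable")[:, :k]
    return np.sort(nearest, axis=1).reshape(height, width, k)

# Usage: three hypothetical salient points on a 10x10 frame.
labels, d2 = voronoi_labels(10, 10, [(0, 0), (0, 9), (9, 0)])
cells2 = order_k_cells(d2, 2, 10, 10)
```

With k equal to the number of generator points, every pixel maps to the same index set, so the whole frame forms a single region. The brute-force distance matrix is only practical for small frames and few points.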
Applying Voronoi diagrams has several advantages over traditional
segmentation into objects. First of all, a Voronoi diagram takes far fewer computational
resources to build and, unlike object tracing, it is strictly defined. Higher-order Voronoi
diagrams are also more stable under dynamic changes in video content than salient
points alone. Salient points carry less information, and their changes from frame to frame
are more significant and harder to identify. In addition, Voronoi diagrams have not
previously been used for content representation and key frame extraction, which makes them
of great interest in terms of competitiveness with other methods and models of this kind.
Throughout the thesis, the constructed model is defined in terms of several metrics
for comparing Voronoi diagrams, which represent the content of each frame from different
points of view, namely color, texture, and shape. Other frame characteristics, such as
motion and the area of segments, have not been taken into consideration, for
reasons given in the thesis. A resulting metric has also been proposed that unites
these frame characteristics for machine-level understanding. Each of the metrics is
expressed using Voronoi tessellations and salient points only. The model aims to provide
a full representation and description of frame content. A new method based on the model is
described in detail.
Several improvements have been designed for the method. The first
enhances the Harris method (for salient point selection) with k-means
clustering, which makes the result invariant to the initial placement of salient points (in
terms of quality only, not the time needed for the clustering iterations). The
second stabilizes key frame descriptions with higher-order Voronoi
diagrams. The third integrates a shot boundary detection method into the
proposed key frame extraction procedure. The highest-order Voronoi diagrams turned out to
be the most stable under changes in content. Highest-order Voronoi diagrams constructed for
frames from the same shot may look almost identical, so they can also be used for
shot boundary detection, along with the other methods analyzed. Frame size and
resolution optimization and centroid-based clustering techniques have been studied to
select the most reasonable solution for key frame extraction.
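The first improvement, clustering candidate salient points with k-means, can be sketched as follows. This is a plain Lloyd-style k-means in Python, not the thesis procedure: the function names `refine_salient_points` and `nearest_center`, the random initialization, and the assumption that candidate points come from a Harris corner detector run beforehand are all illustrative.

```python
import numpy as np

def nearest_center(pts, centers):
    """Index of the nearest center for every point (squared Euclidean distance)."""
    d2 = ((pts[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def refine_salient_points(candidates, k, iters=50, seed=0):
    """Cluster candidate (row, col) points into k centroids with Lloyd's k-means.

    `candidates` are assumed to be corner locations produced elsewhere,
    for example by a Harris detector. Returns (centroids, assignment).
    """
    pts = np.asarray(candidates, dtype=float)
    rng = np.random.default_rng(seed)
    centers = pts[rng.choice(len(pts), size=k, replace=False)]
    for _ in range(iters):
        assign = nearest_center(pts, centers)
        # Move each center to the mean of its points; keep it in place if empty.
        updated = np.array([
            pts[assign == j].mean(axis=0) if np.any(assign == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    return centers, nearest_center(pts, centers)

# Usage: six hypothetical corner candidates forming two tight groups.
corners = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, assignment = refine_salient_points(corners, k=2)
```

The resulting centroids could then serve as generator points for the Voronoi diagram of a frame. Ordinary k-means can converge to different solutions for different starting points in general, so this sketch does not by itself demonstrate the invariance reported in the abstract.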
The results have been verified on test samples taken from the open-source
TRECVid collection, the Internet Archive, the Movie Content Analysis Project, and the Open
Video Project. The test samples also include several commercials and self-made
high-definition videos (shot in the city centre). Different estimators have been applied to
check the validity, performance, and quality of the proposed model and method. The method
has also been compared with some existing key frame extraction methods. The scientific
results have been adopted at the industrial enterprise ELTA, which reduced the time experts
need to process endoscopic material and made it possible to add key frames as images to
reports. The scientific results have also been certified by UkrSEPRO and adopted in
academic activity at Kharkiv National University of Radio Electronics.
 
Language uk
 
Subject video summarization
key frame
spatio-temporal segmentation
Voronoi diagram
salient point
 
Title Model and Method of Key Frame Search for Video Summarization
 
Type Abstract