This paper proposes a framework to aid video analysts in detecting suspicious
activity within the tremendous amount of video data that exists in today’s
world of omnipresent surveillance video. Ideas and techniques for closing the
semantic gap between low-level machine readable features of video data and
high-level events seen by a human observer are discussed. An evaluation of the
event classification and detection technique is presented, and a future experiment
to refine this technique is proposed. These experiments lead into a
discussion of the most suitable machine learning algorithm for learning the event
representation scheme proposed in this paper.