Codecs work in two ways – using temporal and spatial
compression. Both schemes generally work with "lossy" compression,
which means information that is redundant or unnoticeable to the viewer gets
discarded (and hence is not retrievable).
Temporal compression is a method of compression that
looks for information that is not necessary for continuity to the human eye. It
looks at the video information on a frame-by-frame basis for changes between
frames.
frames. For example, if you're working with video of a section of freeway,
there's a lot of redundant information in the image. The background rarely
changes and most of the motion involved is from vehicles passing through the
scene. The compression algorithm compares the first frame (known as a key
frame) with the next (called a delta frame) to find anything that changes.
After the key frame, it keeps only the information that does change, thus
discarding a large portion of the image. It does this for each frame. If there is a
scene change, it tags the first frame of the new scene as the next key frame
and continues comparing the following frames with this new key frame. As the
number of key frames increases, so does the amount of motion delay; this
happens, for example, when an operator pans a camera from left to right,
because nearly every frame changes.
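As an illustration of the key-frame/delta-frame idea described above, the following minimal Python sketch compares each frame against the current key frame and stores only the pixels that change, starting a new key frame when most of the image differs (a stand-in for a scene change). The pixel threshold, the 50 percent scene-change heuristic, and the function names are illustrative assumptions, not part of any particular codec standard.

import numpy as np

def encode_temporal(frames, threshold=8):
    # frames: list of 2-D NumPy arrays (8-bit grayscale), all the same size
    key = frames[0]
    encoded = [("key", key.copy())]
    for frame in frames[1:]:
        diff = np.abs(frame.astype(int) - key.astype(int))
        changed = diff > threshold                   # pixels that differ noticeably
        if changed.mean() > 0.5:                     # most of the image changed:
            key = frame                              # treat it as a scene change and
            encoded.append(("key", frame.copy()))    # start a new key frame
        else:
            coords = np.argwhere(changed)            # coordinates of changed pixels
            encoded.append(("delta", coords, frame[changed]))
    return encoded

def decode_temporal(encoded):
    frames, key = [], None
    for entry in encoded:
        if entry[0] == "key":
            key = entry[1].copy()
            frames.append(key.copy())
        else:
            _, coords, values = entry
            frame = key.copy()
            frame[tuple(coords.T)] = values          # overwrite only the changed pixels
            frames.append(frame)
    return frames

In this sketch, a mostly static freeway scene produces small delta records, while a camera pan forces frequent new key frames, which is the source of the motion delay noted above.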
Spatial compression uses a different method to delete
information that is common to the entire file, or to an entire sequence within the
file. It also looks for redundant information but, instead of specifying each
pixel in a uniform area individually, it defines that area using coordinates.
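As a minimal sketch of this idea in Python, the run-length example below stores each uniform stretch of a row of pixels as a start coordinate, a length, and a single value rather than repeating the value pixel by pixel. Run-length coding is only one simple form of spatial compression (and, unlike most codecs, it is lossless); the function names and pixel values here are illustrative.

def run_length_encode(row):
    # Encode one row of pixel values as (start_column, length, value) runs.
    runs, start = [], 0
    for col in range(1, len(row) + 1):
        if col == len(row) or row[col] != row[start]:
            runs.append((start, col - start, row[start]))
            start = col
    return runs

def run_length_decode(runs, width):
    # Rebuild the row from its runs.
    row = [0] * width
    for start, length, value in runs:
        row[start:start + length] = [value] * length
    return row

# A row of mostly uniform pavement with one bright vehicle compresses to three runs.
row = [90] * 50 + [200] * 4 + [90] * 46
runs = run_length_encode(row)            # [(0, 50, 90), (50, 4, 200), (54, 46, 90)]
assert run_length_decode(runs, len(row)) == row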
Both of these compression methods reduce the overall
transmission bandwidth requirements. If this is not sufficient, one can make a
larger reduction by reducing the frame rate (that is, how many frames of video
go by in a given second). Depending on the degree of changes one makes in each
of these areas, the final output can vary greatly in quality.
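As a rough, illustrative example only: a stream that needs about 3 Mbit/sec at 30 frames per second will, to a first approximation, drop to roughly 1 Mbit/sec if the frame rate is cut to 10 frames per second, at the cost of noticeably jerkier motion. The actual saving depends on the codec, since sending fewer frames also leaves fewer near-identical frames for temporal compression to exploit.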
Hardware codecs are an efficient way to compress and
decompress video files. They are expensive, but deliver high-quality
results. Using a hardware-compression device will deliver high-quality source
video, but requires viewers to have the same decompression device in order to
watch it. Hardware codecs are often used in video conferencing, where the
equipment of the audience and the broadcaster is configured in the same way. A
number of standards have been developed for video compression – MPEG, JPEG, and
the video conferencing (H-series) standards.
Video Compression
MPEG stands for the Moving Picture Experts Group. MPEG is
an ISO/IEC working group, established in 1988 to develop standards for digital
audio and video formats. There are five MPEG standards being used or in
development. Each compression standard was designed with a specific application
and bit rate in mind, although MPEG compression scales well with increased bit
rates.
Following is a list of video compression standards:
•MPEG-1 – designed for transmission rates of up to 1.5
Mbit/sec – is a standard for the compression of moving pictures and audio. It
was based on CD-ROM video applications, and is a popular standard for video on
the Internet, transmitted as .mpg files. In addition, Layer 3 of MPEG-1 audio is the
most popular standard for digital compression of audio, known as MP3. This
standard is available in most of the video codec units supplied for FMS and
traffic management systems.
•MPEG-2 – designed for transmission rates between 1.5 and 15
Mbit/sec – is the standard on which digital television set-top boxes and DVD
compression are based. It is based on MPEG-1, but designed for the compression
and transmission of digital broadcast television. The most significant
enhancement over MPEG-1 is its ability to efficiently compress interlaced
video. MPEG-2 scales well to HDTV resolution and bit rates, obviating the need
for an MPEG-3. This standard is also provided in many of the video codecs
supplied for FMS.
•MPEG-4 – a standard for multimedia and Web compression –
MPEG-4 uses object-based compression, similar in nature to the Virtual Reality
Modeling Language (VRML). Individual objects within a scene are tracked
separately and compressed together to create an MPEG-4 file. The files are sent
as data packages and assembled at the viewer's end. The result is a high-quality
motion picture. The more image data that is sent, the greater the lag time (or
latency) before the video begins to play. Currently, this compression standard
is not suited for real-time traffic observation systems that require
pan-tilt-zoom capability, because the store-and-forward scheme used by this
standard inhibits hand-eye coordination. However, this is an evolving standard,
and the latency between image capture and image viewing is being reduced.
The latency can be cut to a minimum if the image and motion quality
do not have to meet commercial video production standards; most surveillance
systems can function without this quality and can still use pan-tilt-zoom functions.
•MPEG-7 – this standard, currently under development, is
also called the Multimedia Content Description Interface. When released, it is
hoped that this standard will provide a framework for multimedia content that
will include information on content manipulation, filtering and
personalization, as well as the integrity and security of the content. Unlike
the previous MPEG standards, which encode the actual content, MPEG-7 will
represent information about the content.
•MPEG-21 – work on this standard, also called the Multimedia
Framework, has just begun. MPEG-21 will attempt to describe the elements needed
to build an infrastructure for the delivery and consumption of multimedia
content, and how they will relate to each other.
•JPEG – stands for Joint Photographic Experts Group. It is
also an ISO/IEC working group, but one that develops standards for continuous-tone
image coding. JPEG is a lossy compression technique for full-color or
gray-scale still images that exploits the fact that the human eye will not notice
small color changes. Motion JPEG (M-JPEG) is a standard used for compression of
images transmitted from CCTV cameras. It provides compressed motion video in the same
manner as MPEG, but each frame is compressed independently using the JPEG standard.
•H.261 – is an ITU standard designed for two-way
communication over ISDN lines (video conferencing) and supports data rates
that are multiples of 64 Kbit/s.
•H.263 – is based on H.261 with enhancements that improve
video quality over modems.
•H.264 – is the latest MPEG standard for video encoding, geared
to take video beyond DVD quality by supporting high-definition CCTV video.
H.264 can reduce the size of digital video by more than 80% compared with
M-JPEG and by as much as 50% compared with MPEG-4, all without compromising
image quality. This means that much less network bandwidth and storage space
are required. Since the typical storage costs for surveillance projects
represent between 20 and 30 percent of the project cost, significant savings
can be made.
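As a rough, illustrative calculation only: if a camera stream needs about 10 Mbit/sec with M-JPEG, the figures above suggest the same scene might need on the order of 2 Mbit/sec with H.264 (an 80 percent reduction), with storage requirements shrinking in roughly the same proportion. For a project in which storage is 25 percent of total cost, an 80 percent storage reduction would trim on the order of 20 percent from the overall budget. Actual savings vary with scene content, resolution, and frame rate.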
Advantages:
1. H.264 cameras reduce the amount of bandwidth needed. If a megapixel camera
needed 10 Mb/s before (with M-JPEG), it might now need only 1.5 Mb/s, so each
camera saves a large amount of bandwidth.
2. Eliminates barriers: enables many more networks to support megapixel cameras.
3. The bitstream is fully compatible with existing decoders, with no error or drift.
Disadvantages:
1. Using analytics with these cameras reduces the H.264 benefit.
2. Costs a few hundred dollars more per camera.