A streaming media and video glossary that contains definitions of video terms, technologies and techniques related to live streaming, broadcasting and video hosting.
These video terms are relevant for both new techniques and legacy methods, which still have ramifications today when handling older media. The glossary will be continuously updated as the industry evolves.
2:3 Pull Down (aka: Three-two Pulldown)
A process used to convert material from film to interlaced NTSC display rates, from 24 to 29.97 frames per second. This is done by duplicating fields: 2 from one frame and then 3 from the next frame, or vice versa.
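The field cadence can be illustrated with a minimal sketch. The function name and the letter labels for film frames are illustrative assumptions; real telecine works on actual top and bottom fields and also applies the small NTSC slowdown that turns 30 into 29.97.

```python
def pulldown_2_3(film_frames):
    """Spread film frames across fields in a repeating 2, 3 cadence."""
    fields = []
    for index, frame in enumerate(film_frames):
        repeat = 2 if index % 2 == 0 else 3  # 2 fields, then 3, then 2, ...
        fields.extend([frame] * repeat)
    # Pair consecutive fields into interlaced video frames.
    # A leftover single field at the end is dropped.
    return [tuple(fields[i:i + 2]) for i in range(0, len(fields) - 1, 2)]


# Four film frames become five video frames (24 film frames become 30).
print(pulldown_2_3(["A", "B", "C", "D"]))
# [('A', 'A'), ('B', 'B'), ('B', 'C'), ('C', 'D'), ('D', 'D')]
```

The two mixed frames, ('B', 'C') and ('C', 'D'), are the ones that reverse telecine has to detect and undo.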
608 Captions (aka: line 21 captions, EIA-608, CEA-608)
These captions contain white text against a black box that surrounds the text. They appear on top of video content and support up to four caption tracks.
708 Captions (aka: CEA-708)
These captions were designed with digital distribution of content in mind. They are a more flexible version of captions over the older 608 caption approach, allowing for more caption tracks, more character types and the ability to modify the appearance.
AAC (aka: Advanced Audio Coding)
This audio coding format is lossy, meaning its compression does impact the audio quality. Compared to MP3, it offers better compression and supports higher sample frequencies.
AC-3 (aka: Audio Codec 3, Advanced Codec 3, Acoustic Coder 3)
A Dolby Digital audio format found on many home media releases. Dolby Digital is a lossy format, featuring compression that will impact audio quality. The technology is capable of utilizing up to six different channels of sound. The most common surround experience is a 5.1 presentation.
Adaptive Streaming (aka: Adaptive Bitrate Streaming)
This streaming approach offers multiple streams of the same content at varying qualities. These streams are served inside the same video player and often differ based on bitrate and resolution. Ideally the player should serve the viewer the bitrate most appropriate to their setup, based on qualifications like download speed.
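As a rough sketch of how a player might choose among the available streams, the example below picks the highest bitrate that fits a measured download speed. The rendition ladder, the function name and the 0.8 safety margin are all hypothetical values chosen for illustration, not taken from any particular player.

```python
# Hypothetical rendition ladder: (label, bitrate in kbps).
RENDITIONS = [
    ("240p", 400),
    ("480p", 1200),
    ("720p", 2500),
    ("1080p", 5000),
]


def pick_rendition(measured_kbps, renditions=RENDITIONS, safety=0.8):
    """Return the highest-bitrate rendition that fits the download speed."""
    # Only spend part of the measured speed, leaving headroom for dips.
    budget = measured_kbps * safety
    affordable = [r for r in renditions if r[1] <= budget]
    if not affordable:
        # Nothing fits, so fall back to the lowest bitrate available.
        return min(renditions, key=lambda r: r[1])
    return max(affordable, key=lambda r: r[1])


print(pick_rendition(4000))  # ('720p', 2500)
print(pick_rendition(100))   # ('240p', 400)
```

Real players usually re-run a decision like this as conditions change, which is what lets the stream switch quality mid-playback.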
B-frames (aka: Bi-directional Predicted Frames)
These frames follow another frame and only contain part of the image in a video. B-frames look backward and forward to a previous or later p-frame or keyframe (i-frame) and only contain new information not already presented.
Bandwidth
In relation to video, bandwidth is used to describe an internet connection speed or as a form of consumption in relation to web hosting. For speed, it is used as a point of reference for an internet connection. When it comes to streaming content, this is important as a viewer has to have enough bandwidth in order to watch. For web hosting, bandwidth can be used as a measure of consumption.
Bit Rate (aka: data rate or bitrate)
The amount of data per unit of time. For streaming, this is in the context of video and audio content and is usually measured per second, expressed in kilobits per second (kbps) or megabits per second (Mbps).
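Since bit rate is data over time, the total data for a stream follows from simple arithmetic. The sketch below assumes decimal units (1 kilobit = 1,000 bits, 1 megabyte = 1,000,000 bytes); the function name and the sample bitrates are illustrative.

```python
def stream_size_megabytes(video_kbps, audio_kbps, seconds):
    """Estimate the data transferred for a stream of the given length."""
    total_bits = (video_kbps + audio_kbps) * 1000 * seconds
    return total_bits / 8 / 1_000_000  # bits -> bytes -> megabytes


# Ten minutes of 2,500 kbps video with 128 kbps audio.
size = stream_size_megabytes(2500, 128, 600)
print(f"{size:.1f} MB")  # 197.1 MB
```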
Buffering
Video streaming involves sending chunks of video data to an end user. The video player builds a buffer from chunks that have not yet been viewed, so the viewer can keep watching from the buffer if a video chunk is lost or delayed. Ideally the missing chunk will be received before the buffer is emptied, causing no disruption in viewing. However, the viewer’s connection speed can be poor enough that the chunk does not arrive before the buffer is empty. If this occurs, the video content will stop, and the player will generally display a buffering message while it waits for the missing chunk and attempts to rebuild the buffer.
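The buffering behavior described above can be modeled with a small simulation. This is a simplified sketch under stated assumptions: time advances in ticks of one chunk duration, `arrivals` lists how many chunks are received in each tick, and `startup_chunks` is a hypothetical threshold the player waits for before it starts or resumes playback.

```python
def simulate_playback(arrivals, startup_chunks=2):
    """Return what the viewer sees in each tick: 'play' or 'buffering'."""
    buffered = 0
    playing = False
    timeline = []
    for received in arrivals:
        buffered += received
        # Start (or resume) once enough chunks are waiting in the buffer.
        if not playing and buffered >= startup_chunks:
            playing = True
        if playing and buffered > 0:
            buffered -= 1  # one chunk is played back during this tick
            timeline.append("play")
        else:
            playing = False  # buffer ran dry, wait for it to rebuild
            timeline.append("buffering")
    return timeline


# Chunks stop arriving for three ticks in the middle of the stream.
print(simulate_playback([2, 1, 0, 0, 0, 2, 1]))
# ['play', 'play', 'play', 'buffering', 'buffering', 'play', 'play']
```

The first tick without arrivals causes no disruption because the buffer still holds a chunk; the stall only starts once it is empty.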
CDN (aka: Content Delivery Network)
These are large networks of servers that have copies of data, pulled from an origin server, and are often geographically diverse in their location. The end user pulls the needed resources from the server that is closest to them, which is called an edge server. This process is done to decrease any delays that might be caused due to server proximity to the end user, as larger physical distances will result in longer delays, and ideally avoid congestion issues. Due to the resource intensive process of video streaming, most streaming platforms utilize a CDN.
CRTP (aka: Compressed Real Time Transport Protocol)
This is a compressed form of RTP. It was designed to reduce the size of the IP, UDP (User Datagram Protocol) and RTP headers. It performs best over a fast and dependable link; on links with long delays or packet loss, its performance degrades.
Deinterlacing
Deinterlacing filters combine the two alternating fields found in interlaced video to form a clean shot in a progressive video. Without deinterlacing, the interlaced content will often display motion with a line-like appearance.
Embedded Player
This is a media player that is enclosed in a web source, which can range dramatically from being seen in an HTML document on a website to a post on a forum. Players will vary based on appearance, features and available end user controls. An iframe embed, which can be used to embed a variety of content, is one of the most common methods of embedding a video player.
H.264 (aka: MPEG-4 Part 10, Advanced Video Coding, MPEG-4 AVC)
A video compression technology, commonly referred to as a codec, that is defined in Part 10 of the MPEG-4 specification. H.264 video is most often stored in the MP4 container format, although it can also be carried in other containers such as MPEG-TS.
HDS (aka: HTTP Dynamic Streaming)
Adobe’s HTTP Dynamic Streaming is an HTTP-based technology for adaptive streaming. It segments the video content into smaller video chunks, allowing switching between bit rates when viewing.
HLS (aka: HTTP Live Streaming)
Apple’s HTTP Live Streaming is an adaptive streaming technology. It functions by breaking down the stream into smaller MPEG-2 TS files. These files vary by bitrate and often resolution, and ideally are served to the viewer based on the criteria of their setup, such as download speed.
Interlaced Video
A technique used for television video formats, such as NTSC and PAL, in which each full frame of video actually consists of alternating lines taken from two separate fields captured at slightly different times. The two fields are then interlaced or interleaved into the alternating odd and even lines of the full video frame. When displayed on television equipment, the alternating fields are displayed in sequence, depending on the field dominance of the source material.
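The way two fields share one frame can be shown with a short sketch. Frames are modeled as lists of scan lines, the function names are illustrative, and the frame is assumed to have an even number of lines.

```python
def split_fields(frame):
    """Split a frame into its top field (even lines) and bottom field (odd lines)."""
    return frame[0::2], frame[1::2]


def weave(top_field, bottom_field):
    """Interleave two fields back into the alternating lines of a full frame."""
    frame = []
    for top_line, bottom_line in zip(top_field, bottom_field):
        frame.extend([top_line, bottom_line])
    return frame


# Two moments in time, each captured as a full set of lines.
frame_t0 = ["t0-line0", "t0-line1", "t0-line2", "t0-line3"]
frame_t1 = ["t1-line0", "t1-line1", "t1-line2", "t1-line3"]

# An interlaced frame carries the top field from one moment and the
# bottom field from the next.
top, _ = split_fields(frame_t0)
_, bottom = split_fields(frame_t1)
print(weave(top, bottom))
# ['t0-line0', 't1-line1', 't0-line2', 't1-line3']
```

Because adjacent lines come from different moments, anything that moved between the two captures shows up as the line-like artifacts that deinterlacing tries to remove.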
IP Camera (aka: Internet Protocol Camera)
A digital camera that can both send and receive data via the Internet or computer network. These cameras are designed to support a limited number of users that could connect directly to the camera to view. They are RTSP (Real Time Streaming Protocol) based, and for that reason are not largely supported by broadcasting platforms without using special encoders.
Keyframe (aka: i-frame, Intra Frame)
This is the full frame of the image in a video. Subsequent frames only contain the information that has changed between frames. This process is done to compress the video content.
Key Frame Interval (aka: Keyframe Interval)
Set inside the encoder or when the video is being encoded, the key frame interval controls how often a keyframe is created in the video. The keyframe is a full frame of the image. Other frames will generally only contain the information that has changed.
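To make the relationship concrete, the sketch below marks which frames would be keyframes for a given interval. It assumes the interval is expressed in seconds and labels every non-keyframe as "P" for simplicity; actual encoders may express the interval in frames and may also insert b-frames or extra keyframes at scene changes.

```python
def frame_types(total_frames, fps, interval_seconds):
    """Label each frame 'I' (keyframe) or 'P' for a fixed keyframe interval."""
    # Number of frames from one keyframe to the next; never less than 1.
    step = max(1, round(fps * interval_seconds))
    return ["I" if index % step == 0 else "P" for index in range(total_frames)]


# Toy numbers to keep the output short: 3 fps with a keyframe every 2 seconds.
print("".join(frame_types(total_frames=12, fps=3, interval_seconds=2)))
# IPPPPPIPPPPP
```

At 30 frames per second, the same 2 second interval would place a keyframe every 60 frames.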
Live Streaming
Relates to media content being delivered live over the Internet. The process involves a source (video camera, screen captured content, etc.), an encoder to digitize the feed (Teradek VidiU, Telestream Wirecast, etc.), and a platform such as Ustream or another provider that will typically take the feed and publish it over a CDN (Content Delivery Network). Content that is live streamed will typically be delayed by a matter of seconds compared to the source.
Lossless
Lossless encoding is any compression scheme, especially for audio and video data, that uses a nondestructive method that retains all the original information. Consequently, lossless compression does not degrade sound or video quality, meaning the original data can be completely reconstructed from the compressed data.
Lossy
Lossy encoding is any compression scheme, especially for audio and video data, that removes some of the original information in order to significantly reduce the size of the compressed data. Lossy image and audio compression schemes such as JPEG and MP3 try to eliminate information in subtle ways so that the change is barely perceptible, and sound or video quality is not seriously degraded.
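The difference between the two approaches can be demonstrated on a handful of sample values. The lossless half uses Python's standard `zlib` module; the lossy half uses coarse quantization as a stand-in, which is only a toy illustration and far simpler than what JPEG or MP3 actually do.

```python
import zlib

# Illustrative 8-bit sample values, repeated to give the compressor something to work with.
samples = bytes([10, 12, 13, 13, 200, 201, 203, 90] * 50)


def lossless_roundtrip(data):
    """Compress and decompress; every original byte comes back."""
    return zlib.decompress(zlib.compress(data))


def quantize(data, step=8):
    """Lossy step: round every value down to a multiple of `step`."""
    return bytes((value // step) * step for value in data)


print(lossless_roundtrip(samples) == samples)  # True: nothing was lost

approx = quantize(samples)
worst = max(abs(a - b) for a, b in zip(samples, approx))
print(approx == samples, worst)  # False 5: detail is gone, but the error is small
```

Once the values have been quantized, no later step can recover the originals, which is the defining property of a lossy scheme.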
MPEG-DASH (aka: Dynamic Adaptive Streaming over HTTP)
An adaptive bitrate streaming technology. A presentation contains the encoded audio and video streams along with manifest files that identify the streams. The video stream is broken down into a sequence of small files delivered over HTTP, which allows playback to switch from one quality level to another.
MPEG-TS (aka: Transport Stream, MTS, TS)
A container format that hosts packetized elementary streams for transmitting MPEG video muxed with other streams. It can also have separate streams for video, audio and closed captions. It’s commonly used for digital television and streaming across networks, including the internet.
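As a sketch of what the packetized structure looks like on the wire, the example below reads the fixed 4-byte header of a single transport stream packet. It relies on the MPEG-2 Systems layout of 188-byte packets that begin with the sync byte 0x47; the function name and the sample packet are made up for illustration, and adaptation fields and payload parsing are left out.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


def parse_ts_header(packet):
    """Decode a few fields from the 4-byte header of one TS packet."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a valid transport stream packet")
    return {
        "transport_error": bool(packet[1] & 0x80),
        "payload_unit_start": bool(packet[1] & 0x40),
        # 13-bit packet identifier: tells video, audio and caption streams apart.
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],
        # 4-bit counter that lets a receiver notice missing packets.
        "continuity_counter": packet[3] & 0x0F,
    }


# Hand-built sample packet: header bytes followed by an empty payload.
sample = bytes([0x47, 0x41, 0x00, 0x17]) + bytes(184)
print(parse_ts_header(sample))
# {'transport_error': False, 'payload_unit_start': True, 'pid': 256, 'continuity_counter': 7}
```

The packet identifier is what allows the separate video, audio and caption streams mentioned above to be muxed into one stream and separated again by the receiver.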
P-frames (aka: Predictive Frames, Predicted Frames)
The p-frame follows another frame and only contains part of the image in a video. P-frames look backwards to a previous p-frame or keyframe for redundancies.
Program Stream (aka: PS)
These streams are optimized for efficient storage. They contain elementary streams without an error detection or correction process, and they assume the decoder has access to the entire stream for synchronization purposes. Consequently, program streams are often found in physical media formats, such as DVDs.
Progressive Video
A video track that consists of complete frames without interlaced fields. Each individual frame is a coherent image at a single moment in time. This means a video could be paused and the entire image could be seen. Streaming files are generally progressive, and this should not be confused with the use of keyframes and p- or b-frames.
Reverse Telecine (aka: Inverse Telecine, IVTC)
This is a process used to reverse the effect of 2:3 pull down. This is achieved by removing the extra fields that were inserted to stretch 24 frames per second film to 29.97 frames per second interlaced video.
RTMP (aka: Real Time Messaging Protocol)
A TCP-based protocol that allows for low-latency communication. In the context of video, it allows for delivering live and on demand media content that can be viewed over Adobe Flash applications, although the source can be modified for other playback methods.
RTP (aka: Real Time Transport Protocol)
A network protocol designed to deliver video and audio content over IP networks, typically running on top of UDP. The components of RTP include a sequence number, a payload identification, frame indication, source identification, and intramedia synchronization.
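The components listed above map onto the fixed 12-byte RTP header, which the sketch below decodes with Python's standard `struct` module. The field layout follows the RTP specification; the function name and the sample values are illustrative, and header extensions and contributing-source lists are not handled.

```python
import struct


def parse_rtp_header(packet):
    """Decode the fixed 12-byte header at the start of an RTP packet."""
    first, second, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": first >> 6,
        "marker": bool(second >> 7),       # frame indication
        "payload_type": second & 0x7F,     # payload identification
        "sequence_number": sequence,       # detects loss and reordering
        "timestamp": timestamp,            # intramedia synchronization
        "ssrc": ssrc,                      # source identification
    }


# Hand-built sample header: version 2, marker set, payload type 96.
sample = struct.pack("!BBHII", 0x80, 0x80 | 96, 1234, 90000, 0xDEADBEEF)
header = parse_rtp_header(sample)
print(header["version"], header["payload_type"], header["sequence_number"])
# 2 96 1234
```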
RTSP (aka: Real Time Streaming Protocol)
A method for streaming video content through controlling media sessions between end points. This protocol uses port 554. Using this method, data is often sent via RTP. RTSP is a common technology found in IP cameras. However, some encoders, like Wirecast, can actually take the IP camera feed and deliver it in an RTMP format.
Silverlight
Microsoft’s Silverlight is both a video playback solution and an authoring environment. The user interface and description language is Extensible Application Markup Language (XAML). The technology is natively compatible with the Windows Media format.
Smooth Streaming (aka: IIS Smooth Streaming)
Microsoft’s Smooth Streaming for Silverlight is an adaptive bitrate technology. It’s a hybrid media delivery method that is based on HTTP progressive download. The downloads are sent in a series of small video chunks. Like other adaptive technology, Smooth Streaming offers multiple encoded bitrates of the same content that can then be served to a viewer based on their setup.
Streaming Video (aka: Streaming Media)
Refers to video and/or audio content that can be played directly over the Internet. Unlike progressive download, an alternative method, the content does not need to be downloaded onto the device first in order to be viewed or heard. It allows for the end user to begin watching as additional content is constantly being transmitted to them.
Transcoding
The process of transcoding involves converting video from one format into another. This is often done to make a file compatible with a particular service.
UDP (aka: User Datagram Protocol)
A connectionless transport protocol that sends data without guaranteeing delivery or ordering, which keeps overhead and latency low and makes it a widely used way to transmit or receive audio or video over a network. Among real-time protocols, RTMP (Real Time Messaging Protocol) is based on TCP (Transmission Control Protocol), which led to the creation of RTMFP (Real Time Media Flow Protocol), which is based on UDP.
Video Compression
This process uses codecs to present video content in a less resource intensive format. Due to the high data rate of uncompressed video, most video content is compressed. Compression techniques range from simple processes, such as compressing each image on its own, to more sophisticated techniques such as inter frame compression, which looks for redundancies between different frames in the video and only presents changes via delta frames from a keyframe point.
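The inter frame idea can be sketched in a few lines. Here a frame is modeled as a flat list of pixel values and a delta frame as a dictionary of only the positions that changed; both representations and the function names are simplifying assumptions, since real codecs work on blocks of pixels with motion prediction.

```python
def encode(frames):
    """Store the first frame whole, then only what changed in each later frame."""
    keyframe = list(frames[0])
    deltas = []
    previous = keyframe
    for frame in frames[1:]:
        changed = {i: new for i, (old, new) in enumerate(zip(previous, frame)) if old != new}
        deltas.append(changed)
        previous = frame
    return keyframe, deltas


def decode(keyframe, deltas):
    """Rebuild every frame by applying each delta to the frame before it."""
    frames = [list(keyframe)]
    for delta in deltas:
        frame = list(frames[-1])
        for index, value in delta.items():
            frame[index] = value
        frames.append(frame)
    return frames


video = [[1, 1, 1, 1], [1, 2, 1, 1], [1, 2, 1, 5]]
keyframe, deltas = encode(video)
print(keyframe, deltas)  # [1, 1, 1, 1] [{1: 2}, {3: 5}]
print(decode(keyframe, deltas) == video)  # True
```

Each delta depends on the frames before it, which is why playback has to begin from a keyframe.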
Video Encoding
A process to reduce the size of video data, often with audio data included, through the use of a compression scheme. The compressed result can be packaged for storage, as a program stream (PS), or for transmission, as a transport stream (TS).
Video Scaling (aka: Trans-sizing)
A process to either reduce or enlarge an image or video sequence by squeezing or stretching the entire image to a smaller or larger image resolution. While this sometimes can just involve a resolution change, it can also involve changing the aspect ratio, like converting a 4:3 image to a “widescreen” 16:9 image.
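The two cases mentioned above, a plain resolution change and a change of aspect ratio, can be compared with a small calculation. This sketch only computes target dimensions; the function name is illustrative, and the actual resampling of pixels is left to whatever tool performs the scaling.

```python
def fit_within(src_w, src_h, max_w, max_h):
    """Largest size that fits the target box while keeping the aspect ratio."""
    scale = min(max_w / src_w, max_h / src_h)
    return round(src_w * scale), round(src_h * scale)


# Same aspect ratio: a pure resolution change from 1080p to 720p.
print(fit_within(1920, 1080, 1280, 720))  # (1280, 720)

# A 4:3 source in a 16:9 target: keeping the ratio leaves unused width.
print(fit_within(640, 480, 1280, 720))    # (960, 720)
```

Stretching the 4:3 source to the full 1280 by 720 instead would fill the frame but change the aspect ratio, which is the conversion the definition describes.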
VOD (aka: Video On Demand)
VOD refers to content that can be viewed on demand by an end user. The term is commonly used to differentiate it from live content, as VODs are previously recorded. That said, previously recorded content can also be presented in a way that is not on demand, such as televised programming that does not give the end user control over what they watch.