An Introduction To Digital TV Technology

This page is for people who need to understand what happens in a digital TV system, but who don’t know the details of how a digital TV signal is put together.

This is not an introduction to MPEG, and if you don’t know what MPEG is, take a look at the Tektronix guide to MPEG. Instead, this tutorial will concentrate on what makes a digital TV signal special and the terminology used to describe a DVB digital TV signal, and it will generally help you to understand what the hell people talking about digital TV actually mean.

It’s also not an introduction to broadcast engineering. Another page describes the basics of broadcast engineering, and shows you how an MPEG stream gets from the encoder to the viewer.

Disclaimer: some of this information is specific to the DVB standard. The map below (originally from the DVB web site) shows which parts of the world currently use one of the DVB standards – if you’re interested in the systems used in the US or Canada, the ATSC (Advanced Television Systems Committee) web site is a good place to start. The ATSC service information tutorial elsewhere on this site describes the major differences between DVB services and digital services in the USA or Canada, but does not include information about the basic details of the transmission mechanism.

Anatomy of an MPEG-2 stream

OK, now that we’ve got that out of the way, let’s start talking TV. A digital TV signal is transmitted as a stream of MPEG-2 data known as a transport stream. Each transport stream has a data rate of up to 40 megabits/second for a cable or satellite network, which is enough for seven or eight separate TV channels, or approximately 25 megabits/second for a terrestrial network.

Each transport stream consists of a set of sub-streams (known as elementary streams), where each elementary stream can contain either MPEG-2 encoded audio, MPEG-2 encoded video, or data encapsulated in an MPEG-2 stream. Each of these elementary streams has a ‘packet identifier’ (usually known as a PID) that acts as a unique identifier for that stream within the transport stream.

The only restriction on the number of elementary streams in any transport stream is that each elementary stream must have a unique PID value within its containing transport stream. Since this is stored as a 13-bit value, this is not a major restriction. In practice, the number of elementary streams is limited by the total bitrate of the transport stream. Transmission issues mean that transport streams with bitrates much above 40 megabits/second can’t usually be transmitted reliably.

A transport stream consists of a number of audio and video streams that are multiplexed together. First, each service in the transport stream will have its audio and video components encoded using MPEG-2 compression. The result of this process is a set of MPEG-2 elementary streams, each containing one video channel or one (mono or stereo) audio track. These streams are simply a continuous set of video frames or audio data, which is not really suitable for multiplexing. Therefore, we split these streams into packets in order to make the multiplexing process easier. The result of this is a packetized elementary stream, or PES.

To create a transport stream, each of these packetized elementary streams is packetized again and the data from the stream is stored in transport packets. Each transport packet has a length of 188 bytes, which is much smaller than a PES packet, and so a single PES packet will be split across several transport packets. This extra level of packetization allows the stream to support much more powerful error-correcting techniques – PES packets provide a way of multiplexing several elementary streams into one bigger stream, and are more concerned with identifying the type of data contained in the packet and the time at which it should be decoded and displayed. Transport packets, on the other hand, are almost purely concerned with providing error correction.
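
The fixed layout of the transport packet header means a demultiplexer can find the PID of every packet very cheaply. Below is a minimal sketch in Python of how those header fields are unpacked; it is not based on any particular library, just on the standard 4-byte header layout:

```python
def parse_ts_header(packet: bytes) -> dict:
    """Parse the 4-byte header of a 188-byte MPEG-2 transport packet."""
    if len(packet) != 188 or packet[0] != 0x47:   # every TS packet starts with sync byte 0x47
        raise ValueError("not a valid transport packet")
    return {
        "transport_error": bool(packet[1] & 0x80),     # set when the packet is uncorrectably damaged
        "payload_unit_start": bool(packet[1] & 0x40),  # a new PES packet or section starts here
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],  # the 13-bit packet identifier
        "scrambling": (packet[3] >> 6) & 0x03,
        "adaptation_field": (packet[3] >> 4) & 0x03,
        "continuity_counter": packet[3] & 0x0F,        # increments packet-by-packet on each PID
    }

# A hand-built packet carrying PID 100, with the payload-unit-start flag set:
pkt = bytes([0x47, 0x40, 0x64, 0x10]) + bytes(184)
print(parse_ts_header(pkt)["pid"])   # → 100
```

A receiver filters on the PID field to pull a single elementary stream back out of the multiplex.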

So far, we have just considered audio and video data. We may also want to include data streams as part of our service, for applications, Teletext information or other reasons. This is not too hard, because MPEG provides a well-defined way of carrying non-AV data inside transport packets. These are called private sections, and we will look at them in more detail a little later. Since most of the equipment that generates this data will produce a stream of transport packets containing private sections, multiplexing them into our transport stream is easy.

Once we have a complete set of transport packets for the different parts of our services, we can insert them into our final transport stream. When doing this, we have to be careful to insert packets in the correct order. This is not just a case of ensuring that all of the packets within the stream come in the right order – MPEG defines a strict buffering model for MPEG decoders, and so we have to take care that each elementary stream in our transport stream is given a data rate that is constant enough to ensure that the receiver can decode that stream smoothly, with no buffer underruns or overruns. We also have to take care because video streams will use a much larger proportion of the final transport stream than audio streams, and so we can’t simply insert a packet from each stream in turn. A ratio of ten video packets to every audio packet is fairly close to what we would likely see.

If we just multiplexed these transport packets together, we would have a transport stream that contains a number of elementary streams with no indication of what type of data is in these streams or how to reconstruct these streams into something that a receiver can present to the user. To solve this problem, MPEG and DVB both specify that other information should be added to the transport stream. This data is encoded in a number of elementary streams that are added to the transport stream during the multiplexing process, and is known as service information.

So what does this service information look like? Basically, it’s a fairly simple database that describes the structure of the transport stream. We will examine this in more detail later, but at the simplest level it contains a number of tables that each describe one service in the transport stream. These tables list each stream in the service and give its PID and the type of data contained in the stream.

The anatomy of a DVB transport stream.

Information about the types of stream in a service allows the receiver not only to identify which streams are audio and video, but also to identify different types of data stream – separating teletext information from service information from broadcast filesystems, for instance. This makes it easy for the receiver to know which streams it should pass on to different parts of its software stack for decoding.

Describing the structure in this way, rather than embedding it into the elementary streams, means that we can re-use elementary streams across services. This is shown in the example in the next section of a real transport stream – two elementary streams (the streams with PID values of 32 and 1102) appear in more than one service. This allows efficient re-use of streams across services, and is most commonly used for data streams.

If our transport stream contains more than one service, we can simply multiplex all the audio, video and data streams for all the services together. The service information describes which elementary stream belongs to which service, as well as carrying some other information that is more for the benefit of the viewer than the receiver. This may include channel names and descriptions, information about the TV schedules, and parental ratings information.

So, if we take a look at this from a different perspective, we get this picture:

Elementary streams within a transport stream.

In this case, we have a transport stream containing eight elementary streams, split across two services. PIDs 100, 200 and 201 contain video, while the other elementary streams contain audio tracks in different languages. PID 204 contains an MHP application.

As we can see, for both services, the elementary streams containing the video and one audio track continue across events – this is typically done simply to make life easier for the broadcaster and is not required. When there are multiple audio tracks (or multiple camera angles, as in the case of service 2), then there will be several different elementary streams on other PIDs.

This does not have to happen at an event boundary. As we can see from the case of the MHP application, this application is only available for part of the event. When it is no longer available, the elementary stream that contains it need not be broadcast any more. The ability to update the contents of a transport stream in this way offers a great deal of flexibility to the broadcasters.
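
The stream-to-service mapping shown in the figure can be thought of as a simple lookup table. Here is a sketch in Python using the PID values from the example above; the stream-type labels and the audio/language assignments are purely illustrative:

```python
# Hypothetical layout for the two services in the figure: the PID values
# 100, 200, 201 and 204 come from the example, everything else is made up.
service_map = {
    1: [(100, "video"), (101, "audio/eng"), (102, "audio/deu")],
    2: [(200, "video"), (201, "video/alt-angle"),
        (202, "audio/eng"), (204, "data/MHP-application")],
}

def pids_for_service(service_id):
    """All the PIDs a receiver must filter to present the given service."""
    return [pid for pid, _ in service_map[service_id]]

print(pids_for_service(2))   # → [200, 201, 202, 204]
```

The service information broadcast in the stream is, in effect, a serialized form of a table like this one.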

A transport stream is different from the type of stream used in DVDs (which is known as a program stream). They are both MPEG-2 streams, and both of them contain multiplexed MPEG-2 audio and video data. However, there are two major differences between them.

The first difference is that a program stream does not contain as much service information as a transport stream. The reason for this is that a program stream has a simpler structure, and can’t contain more than one service. Every elementary stream in an MPEG program stream belongs to the same service.

Secondly, transport streams are used in environments where there is much more chance of data corruption. Program streams don’t have to worry about this since they are usually stored on optical disks or hard drives. Transport streams, on the other hand, may be transmitted to and from satellites, over terrestrial TV networks or over cable TV networks. This means that they have to be much more resilient, and so transport streams have extra levels of packetization and error-correcting information to help cope with the challenges of the environment that they are used in.

We’ve skipped a number of points in this discussion of MPEG and transport streams, especially issues such as synchronization and timing. For a more thorough discussion of these and the rest of the MPEG standard, take a look at the guide to MPEG from Tektronix.

Networks, bouquets, services, events & multiplexes

In DTV-speak, each transport stream is also known as a multiplex, because it consists of a number of services multiplexed together. Every multiplex is broadcast on a single frequency, and only one multiplex can be broadcast on each frequency. As I said earlier, each multiplex typically has a data rate of around 40 megabits/second for satellite or cable systems, and around 25 megabits/second for terrestrial networks.

Within a multiplex, each group of elementary streams that makes up a single TV channel is called a service. The number of elementary streams in a service doesn’t have to stay constant. This can vary between TV shows on that service (for instance, some shows may be broadcast in multiple languages or with multiple camera angles), or it may even change within a TV show. These changes are all perfectly legal in MPEG – not common, but legal.

In digital TV systems, each TV show is known as an event. Thus, from one point of view each service consists of a number of elementary streams that are transmitted simultaneously, but from another point of view the service consists of a series of individual events broadcast one after another.

The image below should give you an idea of what a real transport stream looks like. This is a screen grab taken from a transport stream analyzer, showing one of the multiplexes being broadcast on the Astra satellite:

An example of a real transport stream.

One thing that you will notice is that in this screenshot, services are referred to as programs. This is an MPEG term, and basically means the same thing as a service. One of the reasons an MPEG-2 program stream is so named is the fact that it only contains a single program.

As you can see from this image, the multiplex contains a number of different services, where each service contains at least one audio stream, at least one video stream and usually several data streams. The first column indicates the type of the stream (some can’t be identified by the software in this case) – the number that prefixes the type is the number of the service (or program).

The second column shows the PID value for each elementary stream. Although it’s not obvious in this screenshot, it’s normally good practice to give elementary streams belonging to the same service similar PID values. That way, it’s easy to identify which service a given stream belongs to. The final two columns show (graphically and numerically) the bit-rate of each elementary stream in megabits/sec.

As the screenshot shows, digital TV video signals are usually coded at 3-5 megabits/sec for standard-definition video, with the total data rate of a service being 4-6 megabits/sec. This is a little less than DVD quality, but it does allow the broadcaster to fit a reasonable number of services in each multiplex. Broadcasters like to fit as many services as possible into a multiplex, since this means that they can fit more channels into every frequency band (remember, every multiplex is broadcast on a single frequency band) and so they can use fewer transponders on a satellite to broadcast the same number of services.
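
The back-of-the-envelope arithmetic behind this trade-off is simple. The 38 megabits/sec figure below is an assumed usable payload for a satellite multiplex; the real value depends on the modulation and error-correction settings:

```python
# Rough capacity estimate: how many SD services fit in one multiplex?
MUX_BITRATE = 38.0e6      # usable satellite multiplex payload, bits/s (assumed figure)
SERVICE_BITRATE = 5.0e6   # one SD service: 3-5 Mbit/s video plus audio and data

services_per_mux = int(MUX_BITRATE // SERVICE_BITRATE)
print(services_per_mux)   # → 7
```

Squeeze the video down to 3 megabits/sec per service and the same multiplex carries nine or ten services instead, which is exactly the trade-off broadcasters make.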

OK, so now we know what goes into a transport stream. There are some other things that are worth knowing, however. The transport stream physically groups a set of services together, but services in DVB systems can also be logically grouped as well. A logical group of services is called a bouquet. Why do we need this? Assume for a minute that you work at a large broadcaster, where you broadcast 50 channels. You broadcast over satellite, so each of your transport streams is limited to 40 megabits/sec (around 8 channels). So, you need 7 transport streams to contain all your services.

You sell access to these services in packages, so that a consumer can choose to buy your basic package (which contains 13 channels), your sports package (which contains 8 sports channels) or your movie package (which contains 5 movie channels). How can you identify in a machine-readable way which channels are part of which package? You could group them in transport streams, but your basic package is too big to fit in one TS. Instead, by assigning a bouquet to each package, you can group the services into transport streams in the most efficient way while still having a mechanism for grouping the services in a logical way.

Digital TV systems also have the concept of a network. This is not a computer network: instead, it is a set of transport streams that share some common service information. These transport streams will often be broadcast by the same company (e.g. your satellite or cable operator), and more than one network may be available at any time. This is especially true in terrestrial systems, where there may be several networks operating at the same time in the same area (e.g. several national networks, plus one or more regional operators depending on coverage). In this case, the receiver will normally use automatic channel scanning to find all of the available channels, rather than relying on service information.

The company that owns the network may or may not own the actual delivery medium – in the case of a cable TV system, for instance, the owner of the cable infrastructure is usually also the network operator. In the case of a satellite TV system, however, the network operator (e.g. SES Astra) typically does not run any of the TV networks that are broadcast over that satellite. Similarly, in terrestrial systems, a network (e.g. Freeview) may not own any of the transmitters or distribution mechanisms that are used to get its signals to the viewer’s home.

So, what we have is:

  • A network consists of one or more transport streams that are broadcast by the same entity
  • A transport stream is an MPEG-2 stream containing several services
  • Each service is a TV channel, and consists of a series of events one after the other
  • Each event is a single TV show, and consists of a number of elementary streams
  • Each elementary stream is a packetized MPEG-2 stream containing MPEG-2 encoded audio, video or binary data.
  • Several services (possibly from several different transport streams) can be grouped together logically in a bouquet.
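
The hierarchy in the list above can be sketched as a set of data structures. This is just an illustrative model for thinking about the relationships, not a representation defined by any standard:

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class ElementaryStream:
    pid: int
    stream_type: str                  # e.g. "video", "audio", "data"

@dataclass
class Service:                        # one TV channel
    service_id: int
    streams: List[ElementaryStream] = field(default_factory=list)

@dataclass
class TransportStream:                # one multiplex, one frequency
    ts_id: int
    services: List[Service] = field(default_factory=list)

@dataclass
class Network:                        # transport streams broadcast by one entity
    network_id: int
    transport_streams: List[TransportStream] = field(default_factory=list)

@dataclass
class Bouquet:
    """A logical grouping that can span transport streams: just a list of
    (ts_id, service_id) pairs, independent of the physical multiplexing."""
    name: str
    members: List[Tuple[int, int]] = field(default_factory=list)
```

Note that the bouquet sits outside the containment hierarchy: it refers to services by identifier rather than containing them, which is what lets one bouquet span several transport streams.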

The structure of a DVB transport stream.

Every service in a DVB network can be uniquely identified by three values. These values are the original network ID (the ID of the network that originally broadcast the service), the transport stream ID (to identify a particular transport stream from that network) and a service ID to identify a service within that transport stream. We can actually go further than this. Each elementary stream in a service may have a component tag that allows the unique identification of a given elementary stream. This is used by the receiver to decide which streams to present, and by interactive applications that want to switch the receiver to decode a different audio or video stream.
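
MHP-style locators combine this triplet into a single dvb:// URL with dot-separated hexadecimal fields. A sketch of how such a locator might be built; treat the exact URL syntax as illustrative rather than normative:

```python
def dvb_locator(original_network_id: int, transport_stream_id: int,
                service_id: int, component_tag: int = None) -> str:
    """Build a dvb:// style locator string from the DVB service triplet,
    optionally narrowed down to a single component (elementary stream)."""
    loc = f"dvb://{original_network_id:x}.{transport_stream_id:x}.{service_id:x}"
    if component_tag is not None:
        loc += f".{component_tag:x}"       # pick out one elementary stream
    return loc

print(dvb_locator(0x1, 0x2, 0x3))          # → dvb://1.2.3
print(dvb_locator(0x1, 0x2, 0x3, 0xA))     # → dvb://1.2.3.a
```

The point of using the original network ID (rather than the current network ID) in the triplet is that the locator stays valid even when the transport stream is rebroadcast by a different operator.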

DVB transport streams will have both an original network ID and a network ID, which identify the network that originally produced the transport stream (e.g. the BBC) and the one that is currently transmitting it (e.g. BSkyB).

ATSC systems use a combination of the transport stream ID and the source ID to identify a particular service. Depending on the value of the source ID, it may be unique only within the current transport stream or it may be unique at a regional level (or at the network level for satellite signals). This allows some co-ordination of source IDs across different terrestrial networks.

Service information

So, you have a transport stream, which contains several services, where each service contains several elementary streams. How do we tell what services we’re broadcasting? How do we tell which elementary stream belongs to which service? How do we even tell what types of elementary stream we’re broadcasting?

As we saw briefly above, the answer is a special set of elementary streams that contain a set of database tables describing the structure of the transport stream, the services within it and some useful information that digital TV receivers can show the user, such as the name of the service and schedule information for the services. These tables are collectively known as service information (SI). Every transport stream (DVB or not) has some service information that the MPEG standard declares mandatory, but DVB defines several extra SI tables in addition to the standard ones.

These tables are broadcast as elementary streams within the transport stream. Some of them are tied to specific services within the transport stream, while some are more general and describe either the structure of the transport stream itself or properties of the network. In some cases, elementary streams containing SI are broadcast on a fixed PID to make it easier for decoders to find them, while in other cases the PID on which an SI table is broadcast is stored in another SI table.

ATSC service information is covered in a separate tutorial, and so we won’t consider it here. The SI tables that are commonly found in a DVB transport stream are:

  • Program Association table (PAT) – defined by the MPEG standard
  • Program Map Table (PMT) – defined by the MPEG standard
  • Network Information Table (NIT)
  • Service Description Table (SDT)
  • Event Information Table (EIT)
  • Conditional Access Table (CAT)
  • Bouquet Association Table (BAT)
  • Time and Date Table (TDT)
  • Time Offset Table (TOT)

The Program Association Table is the fundamental table for service information. It describes which PID contains the Program Map Table for each service (see below), as well as the Network Information Table for the transport stream in those networks that use it.

The Network Information Table describes how transport streams are organized on the current network, and also describes some of the physical properties of the network itself. The NIT also contains the name of the network, and the network ID. This is a value that uniquely identifies the network that is currently broadcasting the transport stream, and may be different from the original network ID that we discussed earlier, if the transport stream is being rebroadcast.

The Conditional Access Table describes the CA systems that are in use in the transport stream, and provides information about how to decode them.

The Program Map Table is the table that actually describes how a service is put together. This table describes all the streams in a service, and tells the receiver which stream contains the MPEG Program Clock Reference for the service. The PMT is not broadcast on a fixed PID, and a transport stream will contain one PMT for each service it contains.
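
The PAT-to-PMT lookup is simple enough to sketch. The code below extracts the (program number, PMT PID) pairs from a PAT section; it is a minimal sketch that assumes a single complete, well-formed section and does no CRC checking:

```python
def parse_pat_programs(section: bytes):
    """Extract (program_number, PMT PID) pairs from a PAT section.

    `section` is one complete PAT section (table_id 0x00), as carried
    on the reserved PID 0x0000.
    """
    section_length = ((section[1] & 0x0F) << 8) | section[2]
    programs = []
    # The program loop starts after the 8-byte header; the final
    # 4 bytes of the section are the CRC_32.
    pos, end = 8, 3 + section_length - 4
    while pos + 4 <= end:
        program_number = (section[pos] << 8) | section[pos + 1]
        pid = ((section[pos + 2] & 0x1F) << 8) | section[pos + 3]
        if program_number == 0:
            programs.append(("network (NIT)", pid))  # program 0 points at the NIT
        else:
            programs.append((program_number, pid))
        pos += 4
    return programs
```

Tuning to a service is then a two-step walk: read the PAT on PID 0x0000 to find the PMT PID for the service, then read that PMT to find the PIDs of the audio, video and data streams themselves.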

Together, the PAT, PMT, and CAT are known as Program Specific Information (PSI) and are defined by MPEG. All other tables are specific to DVB systems.

The Service Description Table gives more user-oriented information about services in a transport stream. Unlike the PMTs, there is only one SDT in a transport stream, and that contains the information for every service. The SDT typically contains information such as the name of the service, the service ID, the status of the service (e.g. running/not running/starting in a few seconds) and whether the service is scrambled or not.

The Event Information Table provides schedule information about events on a service. This includes the event name, start time, duration and the status of the event. This table is actually split into two separate tables: the EIT-present/following, which contains information about the current and next events, and the EIT-schedule, which contains other schedule information. These in turn can be split into tables describing the current (actual) transport stream and other transport streams. It is mandatory for the transport stream to contain the EIT-present/following for the actual transport stream, while other EIT tables are optional.

The Bouquet Association Table lists and describes the services in a bouquet. This does not provide very detailed information, since this can be gained from other SI tables. Instead, it just provides a list of the services contained in a bouquet.

The Time and Date Table and the Time Offset Table provide a time reference for the stream. The TDT contains the current UTC (Universal/GMT) time, while the TOT contains both this and the offset from UTC for local time. This can be used to calculate schedule information accurately, if needed.
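
The TDT and TOT carry the date as a 16-bit Modified Julian Date followed by the time in BCD. Converting the MJD to a calendar date uses the formula given in Annex C of the DVB SI specification (EN 300 468); a Python sketch:

```python
def mjd_to_date(mjd: int):
    """Convert the 16-bit Modified Julian Date used in the TDT/TOT
    to a (year, month, day) tuple, per EN 300 468 Annex C."""
    y1 = int((mjd - 15078.2) / 365.25)
    m1 = int((mjd - 14956.1 - int(y1 * 365.25)) / 30.6001)
    day = mjd - 14956 - int(y1 * 365.25) - int(m1 * 30.6001)
    k = 1 if m1 in (14, 15) else 0     # month correction for January/February
    year = 1900 + y1 + k
    month = m1 - 1 - k * 12
    return year, month, day

# The worked example from the DVB SI specification: MJD 0xC079 is 13 October 1993.
print(mjd_to_date(0xC079))   # → (1993, 10, 13)
```

The hours, minutes and seconds that follow the MJD in the table are stored as six BCD digits, so decoding them is just a matter of unpacking nibbles.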

Some of these tables may contain information about other transport streams, as well as information about the current transport stream. The NIT, SDT and EIT must contain information about the current transport stream (these tables are known as the NIT-actual, SDT-actual and EIT-actual respectively), but the transport stream may also contain versions of these tables that refer to other transport streams. These are known as the NIT-other, SDT-other and EIT-other.

The table below describes which of these tables are mandatory and which are optional.

Service information tables in a DVB system.

Table                            Status            Reserved PID
PAT                              Mandatory (MPEG)  0x0000
PMT (one per service)            Mandatory (MPEG)  none (signalled in the PAT)
CAT                              Mandatory (MPEG)  0x0001
NIT-actual                       Mandatory (DVB)   0x0010
NIT-other                        Optional (DVB)    0x0010
SDT-actual                       Mandatory (DVB)   0x0011
SDT-other                        Optional (DVB)    0x0011
EIT-present/following (actual)   Mandatory (DVB)   0x0012
EIT-present/following (other)    Optional (DVB)    0x0012
EIT-schedule (actual & other)    Optional (DVB)    0x0012
TDT                              Mandatory (DVB)   0x0014
TOT                              Optional (DVB)    0x0014
BAT                              Optional (DVB)    0x0011

Of course, life is not as simple as it appears. Although DVB requires that the mandatory tables are present, it doesn’t require that they are populated. That would be too easy. So, it’s not entirely unusual to see transport streams containing empty SI tables, either because the broadcaster was too lazy to add the information, or because they are squeezing every last drop of bandwidth from a broadcast. The contents of an SI table may change over time – for instance, the contents of the EIT will change when the current event changes. Each SI table has an associated version number, which is incremented with every change. The receiver will track these version numbers, and load new versions of the tables when they are broadcast.
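
A receiver’s version tracking can be sketched as a small cache keyed on table identity. Note that version numbers are 5-bit values and wrap around modulo 32, so a simple inequality check (rather than a greater-than comparison) is the right test:

```python
class TableMonitor:
    """Track SI table versions so a receiver only re-parses a table
    when its version_number actually changes (a minimal sketch)."""

    def __init__(self):
        self.versions = {}   # (table_id, table_id_extension) -> last seen version

    def is_new(self, table_id: int, table_id_ext: int, version: int) -> bool:
        key = (table_id, table_id_ext)
        if self.versions.get(key) == version:
            return False     # same version: nothing changed, skip re-parsing
        self.versions[key] = version
        return True
```

In practice the current_next_indicator bit also matters here: a section may describe the *next* version of a table before it becomes current, letting the receiver prepare for a change in advance.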

Structure of an SI table

Each SI table is fairly similar in structure. They all consist of a header, followed by zero or more descriptor loops. Each descriptor loop contains one or more descriptors that provide the information for one row in the table. Each descriptor may only contain some of the information for a given row in the table, and some descriptors that contain more generic information may appear in different types of table. This re-use of descriptors makes SI parsers slightly simpler, since each descriptor has its own header (which includes the descriptor ID), as well as the actual information it contains.

A full list of descriptors can be found in the MPEG-2 systems specification, the DVB SI specification, and the ATSC PSIP specification, depending on the system in use, but some of the most common and useful DVB descriptors are:

  • The service_descriptor, which is found in the SDT and gives the name and type of a service.
  • The linkage_descriptor, which is found in several of the tables and provides a reference to a source of more information about an element of the service information, as well as a reference to a replacement service if the current service is not running or is scrambled.
  • The component_descriptor, which is found in the EIT and gives information about an elementary stream such as the content type and format and in some cases (e.g. audio streams) the language of the stream.
  • The data_broadcast_id_descriptor is found in the PMT and gives information about the type of encoding used for a data stream.
  • The stream_identifier_descriptor is also found in the PMT and is used to attach a component tag to an elementary stream so that individual streams may be uniquely identified.
  • The CA_identifier_descriptor appears in several tables and identifies the scrambling system (if any) that is used for a given service or event.
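
All of these descriptors share the same outer framing: a one-byte tag, a one-byte length, and then the payload. That makes walking a descriptor loop straightforward; a minimal sketch (the example tags 0x52 and 0x48 are the real DVB tags for the stream_identifier_descriptor and service_descriptor, but the payload bytes are made up):

```python
def parse_descriptors(data: bytes):
    """Walk a descriptor loop: each descriptor is a tag (8 bits),
    a length (8 bits), then `length` bytes of payload."""
    descriptors = []
    pos = 0
    while pos + 2 <= len(data):
        tag, length = data[pos], data[pos + 1]
        descriptors.append((tag, data[pos + 2 : pos + 2 + length]))
        pos += 2 + length
    return descriptors

loop = bytes([0x52, 0x01, 0x05,          # stream_identifier_descriptor, component tag 5
              0x48, 0x02, 0xAB, 0xCD])   # service_descriptor with dummy payload
print(parse_descriptors(loop))
```

A parser that does not recognise a tag can simply skip `length` bytes and continue, which is what makes the descriptor mechanism so easy to extend.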

MPEG Sections

Since these tables can sometimes be quite large, there is a need to split them to fit inside a transport packet. Each chunk of data is known as a section, and these can be used to hold any type of binary data, not just SI tables.

Sections that contain data and not audio or video streams are typically known as private sections, even when the data format is publicly known. These sections mostly follow a standard format:

The MPEG-2 private section format. Source: ISO/IEC 13818-1:2000 (MPEG-2 systems specification).

Syntax                                              No. of bits   Identifier
private_section() {
    table_id                                        8             uimsbf
    section_syntax_indicator                        1             bslbf
    private_indicator                               1             bslbf
    reserved                                        2             bslbf
    private_section_length                          12            uimsbf
    if (section_syntax_indicator == '0') {
        for (i = 0; i < N; i++) {
            private_data_byte                       8             uimsbf
        }
    } else {
        table_id_extension                          16            uimsbf
        reserved                                    2             bslbf
        version_number                              5             bslbf
        current_next_indicator                      1             bslbf
        section_number                              8             uimsbf
        last_section_number                         8             uimsbf
        for (i = 0; i < private_section_length - 9; i++) {
            private_data_byte                       8             uimsbf
        }
        CRC_32                                      32            rpchof
    }
}

The table ID is used to identify the contents of the elementary stream (whereas the stream type listed in the PMT only describes the type of stream – the PMT will tell you it’s a data stream, but the table ID may help tell you what the data actually is). In the case of SI tables, this identifies which table is being broadcast (as you’d expect) but it is also used for identifying other private data streams.

The table ID extension is used to sub-class this information, where the table ID may identify a given class of data, and the table ID extension may identify a particular data stream within that class. In the case of SI tables, this is used to identify those tables which can contain information about the current transport stream and others. For instance, the table ID for the SDT-actual is different from the table ID for the SDT-other, and in both cases, the table ID extension contains the transport stream ID, identifying which transport stream the SDT refers to.
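
Putting the syntax table above into practice, here is a sketch of decoding the generic section header. It assumes the buffer holds at least one complete section and leaves CRC checking to the caller:

```python
def parse_section_header(section: bytes) -> dict:
    """Decode the generic MPEG-2 section header fields."""
    hdr = {
        "table_id": section[0],
        "section_syntax_indicator": (section[1] >> 7) & 1,
        "section_length": ((section[1] & 0x0F) << 8) | section[2],
    }
    if hdr["section_syntax_indicator"]:
        # The long form adds identity, versioning and segmentation fields.
        hdr.update({
            "table_id_extension": (section[3] << 8) | section[4],
            "version_number": (section[5] >> 1) & 0x1F,
            "current_next_indicator": section[5] & 1,
            "section_number": section[6],
            "last_section_number": section[7],
        })
    return hdr
```

The section_number/last_section_number pair is what lets a table larger than one section be reassembled: a receiver collects sections until it has seen every number from 0 up to last_section_number for the same table identity and version.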

Presenting video – decoder format conversion

MPEG-2 video will typically be encoded at PAL or NTSC resolutions (depending on the country of origin). In either case, the encoded video may have an aspect ratio of 4:3, 16:9, or some other aspect ratio (with 4:3 being the most common at the time of writing). This aspect ratio, together with other information, is broadcast in the MPEG stream as part of the data. This data is known as the active format description.

There is no guarantee, however, that a TV used to view the signal will have the same aspect ratio as the encoded video. This is where decoder format conversion comes in. Basically, this is a conversion operation carried out within the receiver to convert the video signal from the original aspect ratio to the aspect ratio of the display. Usually, this is carried out in the analogue stage of the receiver, after the signal has been generated but before it is sent to the TV.
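
As a concrete example of such a conversion, the letterboxing arithmetic can be done purely in aspect ratios and display lines, which keeps it independent of pixel aspect ratio. The 576-line display figure below is illustrative (a PAL raster):

```python
from fractions import Fraction

def letterbox_bars(display_h: int, display_aspect: Fraction,
                   source_aspect: Fraction) -> int:
    """Lines of black bar (top and bottom each) when letterboxing a
    wide source onto a narrower display."""
    # The picture fills the display width; its height shrinks by the
    # ratio of the two aspect ratios.
    picture_h = int(display_h * display_aspect / source_aspect)
    return (display_h - picture_h) // 2

# 16:9 material letterboxed on a 4:3, 576-line display:
print(letterbox_bars(576, Fraction(4, 3), Fraction(16, 9)))   # → 72
```

Pillarboxing (4:3 material on a 16:9 display) is the same calculation rotated ninety degrees, applied to the display width instead of the height.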

Although the conversion from 16:9 to 4:3 (or vice versa) is most common, there are other common conversion operations to support the different types of conversion such as letterboxing, shoot-and-protect and pillarboxing. The UK Digital Terrestrial Group has published a set of implementation guidelines in PDF format detailing how receivers can handle these conversions. These conversions are also covered in the EACEM E-Book.

This article at TVTechnology.com describes a little more about decoder format conversion and active format descriptions from a US perspective – it’s not quite identical to the DVB situation, but it does discuss DVB and the similarities are far more important than the differences.

DSM-CC

DSM-CC is a standard for data broadcasting (amongst other things) based on MPEG streams. It provides a great deal of functionality that is used in MHP and OCAP – as well as being one of the main methods for transmitting applications to a receiver, it can provide data streaming, timecode information for MPEG streams and a host of other functionality.

DSM-CC is a complex topic, and so it is covered in a separate DSM-CC tutorial. While only middleware developers need to know the gory details of DSM-CC, application developers and head-end manufacturers can also benefit from knowing something about its inner workings.