Robert Joseph, Ph.D.,
Jörgen Rosengren, M.Sc.
The MHEG-5 standard was developed to support the distribution
of interactive multimedia applications in a client/server architecture
across platforms of different types and brands. MHEG-5 defines
a final-form representation for application interchange. The applications
consist mainly of declarative code, but provisions for calling
procedural code have been made. MHEG-5 applications need only
be authored once and can run on any platform that is MHEG-5 compliant.
This paper gives the reader an overview of the current Draft International
Standard (DIS) of the MHEG-5 standard (ISO/IEC DIS 13522-5, dated
December 1995). This is an informal description; in the event
of a discrepancy between this text and the actual DIS text, the
DIS text takes precedence.
The global scope of MHEG-5 is to define the syntax and semantics
of a set of object classes that can be used for interoperability
of multimedia applications across minimal-resources platforms.
The developed applications will reside on a server, and as portions
of the application are needed, they will be downloaded to the
client. In a broadcast environment, this download mechanism could
rely, for instance, on cyclic rebroadcasting of all portions of
the application. It is the responsibility of the client to have
a runtime that interprets the application parts, presents the
application to the user, and handles the local interaction with
the user.
Figure 1: MHEG application on a server, with parts downloaded
to the client
Some of the application types that MHEG-5 is targeted for are:
video on demand, near video on demand, news on demand, navigation,
and interactive home shopping.
The major goals of MHEG-5 are:
The MHEG-5 work item was created in November 1994, in order to
meet the urgent need of the international community for a standardized
application format adapted to minimal resources platforms. In
December 1995, it reached the status of Draft International Standard
(DIS), and it is scheduled for promotion to International Standard
in July 1996.
This section describes the functionality that is provided by the
MHEG-5 standard in terms of the MHEG-5 classes.
An MHEG-5 application is made up of Scenes and objects that are
common to all Scenes. A Scene contains a group of objects, called
Ingredients, that represent information (graphics, sound, video,
etc.) along with localized behavior based on events firing (e.g.,
the 'Left' button being pushed activating a sound). At most one
Scene is active at any one time. Navigation in an application
is done by transitioning between Scenes.
An MHEG engine has the ability to display visual objects in a
rectangular coordinate system with a fixed size, and to play audible
objects. A user input device (e.g., remote control, game controller,
etc.) can be used with the runtime to allow interaction with the
applications.
The most important classes in an MHEG application are:
An Application object contains objects that are shared between
several Scenes. A Scene is a collection of objects that are intended
for coordinated presentation. Both Applications and Scenes can
contain objects of the class Ingredient. The Ingredient class
has the subclasses Link, Procedure, Variable, Presentable, Palette,
Font, and CursorShape.
A Link object consists of a LinkCondition and a LinkEffect.
The LinkEffect, which is a list of elementary actions, is executed
when the LinkCondition becomes true. A LinkCondition is always
triggered by the occurrence of an event.
Presentable objects can be presented to the user as a part
of a Scene. Presentables are either visible or audible, or combine
visible and audible objects. The Presentable class has the subclasses
Visible, Audio, Stream, TokenGroup, and TemplateGroup.
Visible objects can be displayed on the screen. The various
types of Visibles are Lineart (including Rectangle), RTGraphics,
Bitmap, Button (including SwitchButton, Hotspot, and PushButton),
Slider, Text (including EntryField and HyperText), and Video.
Some of these items allow the runtime to control the user interaction
with them. These objects are called Interactible. For example,
the EntryField object is an interactible.
Variable objects can hold a value (e.g., an integer or a text string).
They are typically used to store the state of an object or to
communicate with the outside world, for example with a server.
Procedure objects encapsulate calls to procedural code, the format
of which is not directly defined by MHEG-5. The Procedure objects
make it possible to escape from the MHEG-5 paradigm to perform
a task that is better expressed in procedural code.
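The class relationships listed above can be summarised in a short Python sketch. The class names follow the text. The common base named `Group`, the `name` and `items` attributes, and the empty class bodies are assumptions made for illustration only and say nothing about how a real engine is structured.

```python
# Illustrative sketch: class names follow the text, everything else is assumed.

class Ingredient:
    """Base class for objects that an Application or a Scene can contain."""

class Link(Ingredient): pass
class Procedure(Ingredient): pass
class Variable(Ingredient): pass
class Palette(Ingredient): pass
class Font(Ingredient): pass
class CursorShape(Ingredient): pass

class Presentable(Ingredient):
    """Objects that can be presented to the user as part of a Scene."""

class Visible(Presentable): pass
class Audio(Presentable): pass
class Stream(Presentable): pass
class TokenGroup(Presentable): pass
class TemplateGroup(Presentable): pass


class Group:
    """Assumed common base: both Applications and Scenes hold Ingredients."""
    def __init__(self, name, items=()):
        self.name = name
        self.items = list(items)

class Application(Group): pass   # objects shared between several Scenes
class Scene(Group): pass         # objects intended for coordinated presentation


app = Application("InfoApp", [Variable(), Font()])
scene = Scene("InfoScene1", [Visible(), Link()])
```

Both containers accept any Ingredient, which mirrors the statement that Applications and Scenes can each contain objects of that class.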
Below is a very simple scene that displays a bitmap and text.
The user can press the 'Left' button on the input device and a
transition is made from the current scene, InfoScene1, to a new
scene, InfoScene2.
Figure 2: InfoScene1 is an example of a scene with presentable
information and a simple link.
The pseudo-code from the above scene may look like the following:
The MHEG application is event-driven, in the sense that all actions
are called as the result of an event firing a link. Events can
be divided into two main groups: synchronous events and asynchronous
events. Asynchronous events are events that occur asynchronously
to the processing of Links in the MHEG engine. These include timer
events and user input events. An application area of MHEG-5 (such
as DAVIC) must specify the permissible UserInput events within
that area. Synchronous events are events that can only occur as
the result of an MHEG-5 action being targeted to some objects.
A typical example of a synchronous event is #IsSelected, which
can only occur as the result of the MHEG-5 action Select being
invoked. Synchronous events are always dealt with immediately;
asynchronous events are queued.
The mechanism at the heart of the MHEG engine, therefore, is the
following:
When all events have been processed, the process starts again
at 1.
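The different treatment of the two event groups can be illustrated with a small Python sketch. It is not the algorithm of the DIS: the `Engine` class, its method names and the tuple layout used for links are assumptions made for this example. It only shows that asynchronous events wait in a queue, while a synchronous event raised by an action is handled before the next queued event.

```python
from collections import deque


class Engine:
    """Hypothetical, heavily simplified event dispatcher."""

    def __init__(self):
        self.links = []              # (source, event_type, action) tuples
        self.async_queue = deque()   # pending asynchronous events
        self.log = []                # record of what was handled, in order

    def add_link(self, source, event_type, action):
        self.links.append((source, event_type, action))

    def post_async(self, source, event_type):
        """Queue an asynchronous event (e.g. a timer or user input event)."""
        self.async_queue.append((source, event_type))

    def raise_sync(self, source, event_type):
        """Handle a synchronous event immediately."""
        self._dispatch(source, event_type)

    def _dispatch(self, source, event_type):
        self.log.append((source, event_type))
        for link_source, link_type, action in list(self.links):
            if (link_source, link_type) == (source, event_type):
                action(self)

    def run(self):
        """Process queued asynchronous events until the queue is empty."""
        while self.async_queue:
            self._dispatch(*self.async_queue.popleft())


engine = Engine()
# A user input event triggers an action that selects Button1, which in turn
# raises the synchronous IsSelected event.
engine.add_link("Scene1", "UserInput",
                lambda e: e.raise_sync("Button1", "IsSelected"))
engine.add_link("Button1", "IsSelected",
                lambda e: e.log.append("IsSelected handled"))
engine.post_async("Scene1", "UserInput")
engine.post_async("Timer1", "TimerFired")
engine.run()
```

In this run the IsSelected event is handled before the queued timer event, even though the timer event was posted earlier.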
Before doing anything to an object, the MHEG-5 engine must prepare
it. Preparing an object typically entails retrieving it from the
server, decoding the interchange format and creating the corresponding
internal data structures, and making the object available for
further processing. The preparation of an object is asynchronous;
its completion is signalled by an IsAvailable event.
All objects that are part of an application or a scene have a
RunningStatus, which is either true or false. Objects
whose RunningStatus is true are said to be running, which
means that they perform the behaviour they are programmed for.
More concretely:
The MHEG-5 mix-in class Interactible groups some functionality
associated with user interface-related objects (Slider, HyperText,
EntryField, Buttons). These objects can all be highlighted (by
setting their HighlightStatus to True). They also have the attribute
InteractionStatus, which, when set to true, allows the
object to interact directly with the user, thus bypassing the
normal processing of UserInput events by the MHEG-5 engine. Exactly
how an Interactible reacts when its InteractionStatus is true
is implementation-specific. As an example, the way that a user
enters characters in an EntryField can be implemented in different
ways in different MHEG-5 engines.
At most one Interactible at a time can have its InteractionStatus
set to True.
For objects that are visible on the screen, the following rules
apply
It is possible within MHEG-5 to share objects between some or
all scenes of an application. As an example, this can be used
to have variables retain their value over scene changes, or to
have an audio stream play on across a scene change. Shared objects
are always contained in an Application object. Since there is always
exactly one Application object running whenever a scene is running,
the objects contained in an application object are visible to
each of its scenes.
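A minimal Python sketch of this sharing rule is given below. The `Application` class shown here, its dictionaries and the `transition_to` method are hypothetical names chosen for the example. The point is only that scene-local state is replaced on a scene change while application-level state survives.

```python
class Application:
    """Hypothetical container: shared variables plus one active scene."""

    def __init__(self, shared_variables):
        self.shared = dict(shared_variables)   # visible to every scene
        self.scene_name = None                 # at most one scene is active
        self.scene_variables = {}

    def transition_to(self, scene_name, scene_variables):
        # Scene-local variables are replaced; shared ones are left untouched.
        self.scene_name = scene_name
        self.scene_variables = dict(scene_variables)


app = Application({"score": 0})
app.transition_to("InfoScene1", {"page": 1})
app.shared["score"] += 10            # shared variable changed in InfoScene1
app.scene_variables["page"] += 1     # scene-local variable changed as well
app.transition_to("InfoScene2", {"page": 1})
```

After the transition, `score` still holds the value set in the first scene, whereas `page` starts from the value supplied by the new scene.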
The MHEG-5 specification does not prescribe any specific formats
for the encoding of content. For example, it is conceivable that
a Video object is encoded as MPEG or as motion-JPEG. This means
that the group using MHEG-5 must define which content encoding
schemes to apply for the different objects in order to achieve
interoperability.
However, MHEG-5 does specify a final-form encoding of the MHEG-5
objects themselves. This encoding is specified in ASN.1 notation and
uses the Basic Encoding Rules (BER).
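To give a feeling for what a BER encoding looks like, the following Python sketch encodes two ASN.1 primitive values as tag-length-value triplets. It is generic BER, not the MHEG-5 ASN.1 module: the object types and tags defined by the standard are not reproduced here, and the helper names are made up for this example. Only single-octet tags and the definite length form are handled.

```python
def ber_length(n):
    """Encode a length in definite form (short form below 128, else long form)."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def ber_tlv(tag, content):
    """Build one tag-length-value triplet for a single-octet tag."""
    return bytes([tag]) + ber_length(len(content)) + content


def ber_integer(value):
    """Encode an INTEGER (universal tag 0x02) in minimal two's complement."""
    magnitude = value if value >= 0 else ~value
    bits = magnitude.bit_length() + 1          # one extra bit for the sign
    return ber_tlv(0x02, value.to_bytes((bits + 7) // 8, "big", signed=True))


def ber_octet_string(data):
    """Encode an OCTET STRING (universal tag 0x04)."""
    return ber_tlv(0x04, data)


encoded_position = ber_integer(320)
encoded_text = ber_octet_string(b"1. Lubricate...")
```

Here `ber_integer(320)` yields the four octets `02 02 01 40`: the INTEGER tag, a length of two, and the value 320 in two's complement.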
The issue of conformance, though of crucial importance, has not
yet been extensively addressed by the MHEG committee. It is expected
that a conformance definition for the standard will have to be
drafted, probably lagging the standard itself by some period of
time.
This section describes certain MHEG-5 leaf classes in more
detail. Classes that are described in detail above are left out.
Link objects express the dynamic aspects of an MHEG-5 application.
A Link object consists of a LinkCondition and a LinkEffect. The
LinkCondition, in turn, consists of an EventSourceCondition, an
EventTypeCondition, and an EventDataCondition. A Link object can
be fired as the result of an event occurring. An event always emanates
from exactly one object (for example, an IsSelected event can
be generated by a Button object). Some events also have a piece
of data associated with them, called the EventData.
A Link fires if and only if the right type of event emanates from
the right object and carries the right associated data. When a
Link is fired, it executes its LinkEffect, which is an Action
object.
MHEG-5 Action objects consist of a sequence of elementary actions.
Elementary actions are comparable to methods in an object-oriented
paradigm. Executing an Action object means that its elementary
actions are invoked one after another, in order.
As an example, consider the following Link, which transitions
to another Scene when the character "A" is entered in
the EntryField EF1.
Note that MHEG-5 does not specify the actual encoding of the data
structure representing a colour look-up table, font, or cursor.
The TokenGroup, TemplateGroup and List classes all inherit an abstract mixin class, TokenManager,
which has the ability to administer a token over a group of items.
A typical use of this feature is to let the item that has the
token be highlighted. TokenManager also implements a MovementTable,
which makes it possible to translate UserInput events into movements
of the token between the members of the group.
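The token mechanism can be sketched in Python as follows. The layout of the movement table (one list per movement, indexed by the current token position) and the 1-based token numbering are assumptions made for this example, as are the movement names `right` and `left`.

```python
class TokenManager:
    """Hypothetical token administration over a group of items."""

    def __init__(self, movement_table, initial_token=1):
        # movement_table[movement][current_token - 1] gives the new position.
        self.movement_table = movement_table
        self.token = initial_token   # 1-based index of the item holding the token

    def move(self, movement):
        """Move the token according to one row of the movement table."""
        self.token = self.movement_table[movement][self.token - 1]
        return self.token


# Three items in a row; the token wraps around at both ends.
menu = TokenManager({
    "right": [2, 3, 1],
    "left":  [3, 1, 2],
})
positions = [menu.move("right"), menu.move("right"),
             menu.move("right"), menu.move("left")]
```

An application would typically use Links to translate UserInput events into such moves and highlight whichever item currently holds the token.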
A HyperText object differs from a plain Text object in that
it supports hyperlinks, i.e., the possibility to
associate words or groups of words with a link to, for instance,
another page. More formally, it is possible to program an MHEG
link to be fired when a particular hypertext link has been selected
by the user.
An EntryField is a Text object which can be interacted with by
the user to change the text it contains. In addition to the attributes
of the Text class, it has attributes to control the type of input
expected, whether the input is echoed to the user or not, and
what the length of the input is expected to be.
Table of Contents
Introduction
Description of the MHEG-5 Functionality
Basic concepts
(scene: InfoScene1
  <other scene attributes here>
  group-items:
    (bitmap: BgndInfo
      content-hook: #bitmapHook
      original-box-size: (320 240)
      original-position: (0 0)
      content-data: referenced-content: "InfoBngd"
    )
    (text:
      content-hook: #textHook
      original-box-size: (280 20)
      original-position: (40 50)
      content-data: included-content: "1. Lubricate..."
    )
  links:
    (link: Link1
      event-source: InfoScene1
      event-type: #UserInput
      event-data: #Left
      link-effect: action: transition-to: InfoScene2
    )
)
Interaction within a Scene
Availability; Running Status
Interactibles
Visual Representation
Object Sharing Between Scenes
Object Encoding
Conformance
Detailed Description of the MHEG-5 Classes
Link and Action
Example:
(link: Link1
  event-source: EF1
  event-type: #NewChar
  event-data: 'A'
  link-effect:
    (action: transition-to: Scene2)
)
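The firing rule for Links (matching event source, event type and event data, followed by sequential execution of the elementary actions) can be sketched in Python. The sketch mirrors the EF1 example above. The `Event` and `Link` data classes, the use of `None` for an absent EventData condition and the representation of elementary actions as callables are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class Event:
    source: str
    type: str
    data: Any = None


@dataclass
class Link:
    event_source: str
    event_type: str
    event_data: Any = None     # None: no condition on the event data (assumption)
    link_effect: List[Callable[[], None]] = field(default_factory=list)

    def matches(self, event):
        """True if source, type and (where given) data all match."""
        return (event.source == self.event_source
                and event.type == self.event_type
                and (self.event_data is None or event.data == self.event_data))

    def fire_if_matching(self, event):
        if not self.matches(event):
            return False
        for elementary_action in self.link_effect:   # invoked sequentially
            elementary_action()
        return True


trace = []
link1 = Link("EF1", "NewChar", "A",
             [lambda: trace.append("transition-to: Scene2")])
fired_b = link1.fire_if_matching(Event("EF1", "NewChar", "B"))
fired_a = link1.fire_if_matching(Event("EF1", "NewChar", "A"))
```

Entering "B" leaves the Link idle; entering "A" fires it and runs its single elementary action.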
Procedure
As was noted above, a Procedure object is an abstraction of a
piece of procedural code that performs a specific task. The task
will typically be something that is not easy to express within
the MHEG-5 paradigm. There are three types of Procedure objects:
custom, remote, and script. Custom Procedure objects
are used for calls to the runtime environment. Remote Procedure
objects are used to encapsulate an RPC mechanism. Script Procedure
objects, finally, are used to encapsulate a piece of exchanged
procedural code, which could be written in a scripting language.
Palette, Font, and CursorShape
These three classes are used to encapsulate colour look-up tables,
fonts, and cursors, respectively. The objects of these classes
are not used stand-alone, but rather referenced from objects of
other classes. For example, the only way to use a Font object
is to reference it from a Text object. The CursorShape class has
a similar relationship with the Scene class, and the Palette
class is used by classes that have colour attributes.
Variable
As was noted above, Variables are objects that can be used to
store a value. That value can be either an integer, a byte string,
a boolean, a content reference, or an object reference. There
are actions to set the value of variables and to test the value.
Variables can also be used as parameters to actions; for instance,
the TransitionTo action can take a variable as the parameter that
denotes the Scene to transition to.
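How a Variable can stand in for a literal parameter is sketched below in Python. The method names `set_value` and `test`, the set of comparison operators and the helper `resolve` are hypothetical. The sketch only illustrates the indirection described above, with `transition_to` returning a string instead of performing a real scene change.

```python
import operator


class Variable:
    """Hypothetical Variable holding a single value."""

    _COMPARISONS = {"==": operator.eq, "!=": operator.ne,
                    "<": operator.lt, ">": operator.gt}

    def __init__(self, value):
        self.value = value

    def set_value(self, value):
        self.value = value

    def test(self, comparison, other):
        """Compare the stored value with another value."""
        return self._COMPARISONS[comparison](self.value, other)


def resolve(parameter):
    """Return a Variable's value, or the parameter itself if it is a literal."""
    return parameter.value if isinstance(parameter, Variable) else parameter


def transition_to(target):
    # Stand-in for the TransitionTo action: only reports the resolved target.
    return "transition-to: " + str(resolve(target))


next_scene = Variable("InfoScene2")
direct = transition_to("InfoScene1")
indirect = transition_to(next_scene)
next_scene.set_value("InfoScene3")
updated = transition_to(next_scene)
```

Because the target is resolved when the action runs, changing the Variable changes where the same action leads.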
TokenGroup, TemplateGroup and List
These classes provide the functionality of logically grouping
Visibles and other objects. The TokenGroup class can be used to
implement simple jumping-highlight navigation. The TemplateGroup
class does the same, but also supports the description of behaviour
and content that is common to all members of the group in a so-called
template. The List class, finally, is a TemplateGroup that also
allows certain dynamic features, such as scrolling and adding
or deleting members.
Stream
The Stream class provides the functionality of a multiplex of
one or more audio or video objects that are to be presented in
synchronization. It also supports the handling of stream events,
which makes it possible to link an MHEG behaviour to a specific
point in a stream, or to the occurrence of a specific, user-defined
event.
Audio
The Audio class is used to encapsulate an audible object (such
as an audio clip encoded in MPEG-2). The audio object can be either
part of a stream or stand-alone.
Rectangle
Rectangles are objects which, when run, draw a rectangle on the
screen. The line size and color as well as the fill color (if
any) are attributes of the Rectangle object.
Example:
(rectangle: Rect1
  original-position: ( 0 0 )
  line-colour: 15
  box-size: ( 40 50 )
)
Bitmaps
Bitmaps are objects which, when run, draw a bitmap on the screen.
A "bitmap", in this context, is any two-dimensional
static picture, and can therefore also be used to represent, for
instance, an MPEG-2 still picture.
Example:
(bitmap: BgndInfo
  content-hook: #bitmapHook
  content-data: referenced-content: "Info.bitmap"
  box-size: ( 320 240 )
  original-position: ( 0 0 )
)
Video and RTGraphics
The Video and RTGraphics classes are used to encapsulate a visible
object which changes in real time (such as a video clip encoded
in MPEG-2, or a subtitle stream). Objects of both these classes
can be either part of a stream or stand-alone. The difference
between the classes is minimal: a Video object supports the
ScaleVideo action, whereas an RTGraphics object does not.
Text; HyperText; EntryField
Text objects, when run, render a text string on the screen according
to some attributes. The most important attributes are the Font
and colours to be used. In addition, justification and more esoteric
features can be specified. The Text class also has two subclasses:
HyperText and EntryField.
Button
Buttons come in three varieties. All three varieties provide the
functionality of rendering a button (of a certain type) and of
changing the appearance of the button depending on the two internal
attributes HighlightStatus (inherited from Interactible)
and SelectionStatus. It is the responsibility of the application
(not the MHEG-5 engine) to make sure that UserInput events are
coupled to state changes in a Button. This can be done using the
MHEG-5 actions SetHighlightStatus and Select/Deselect/Toggle.
The different types of buttons are:
SwitchButton: This type of button has a two-state representation, which depends
on its SelectionStatus. In other words, it can store one
bit of information. Depending on its "Style" attribute,
it can be rendered as a CheckBox, PushButton, or RadioButton.
PushButton: This type of button acts as a flip-flop, i.e., its SelectionStatus
is only true just after the PushButton has been selected,
and then returns to false.
Hotspot: This type of button is similar to the PushButton, but is invisible
unless highlighted or selected.
Example:
(switchbutton: Switch1
  style: #radiobutton
  position: ( 50 70 )
  label: "On"
)
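The difference between a button that stores one bit and a button that acts as a flip-flop can be sketched in Python. The class bodies, the method names and the `on_selected` callback (which stands in for a Link reacting to the selection) are assumptions made for this example and not the behaviour prescribed by the DIS.

```python
class SwitchButton:
    """Two-state button: keeps its SelectionStatus until it is changed again."""

    def __init__(self):
        self.selection_status = False

    def select(self):
        self.selection_status = True

    def deselect(self):
        self.selection_status = False

    def toggle(self):
        self.selection_status = not self.selection_status


class PushButton:
    """Flip-flop button: SelectionStatus is true only just after selection."""

    def __init__(self, on_selected):
        self.selection_status = False
        self.on_selected = on_selected   # hypothetical stand-in for a Link

    def select(self):
        self.selection_status = True
        self.on_selected(self)           # observers see the status as True
        self.selection_status = False    # the button then returns to False


switch = SwitchButton()
switch.toggle()                          # stays selected afterwards

seen = []
push = PushButton(lambda button: seen.append(button.selection_status))
push.select()                            # selected only while being handled
```

After these calls the SwitchButton remains selected, while the PushButton has already returned to its unselected state.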
Slider
Sliders are objects which represent in a graphical form the relationship
between a value and an interval. For instance, the value 1 in
the interval [0,2] would be represented by an indicator which
is positioned in the middle. The value represented by a slider
can be changed by the user. Since it is an interactible, it also
has the HighlightStatus attribute, which makes it possible
for the engine to control the highlighting of the object. Sliders
come in different styles, i.e., thermometer, normal, and proportional.
Example:
(slider: Slider1
  box-size: ( 40 5 )
  original-position: ( 100 100 )
  max-value: 20
  orientation: #right
)
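The relationship between a slider value and the indicator position is a linear interpolation over the interval. A small Python sketch follows. The function names are hypothetical, and the minimum value of 0 and the track length of 40 used for Slider1 are assumptions, since the example above only specifies max-value and box-size.

```python
def indicator_fraction(value, min_value, max_value):
    """Return the indicator position as a fraction of the slider length."""
    if max_value <= min_value:
        raise ValueError("max_value must be greater than min_value")
    if not min_value <= value <= max_value:
        raise ValueError("value lies outside the interval")
    return (value - min_value) / (max_value - min_value)


def indicator_offset(value, min_value, max_value, track_length):
    """Return the indicator offset along the track, in the same units as track_length."""
    return round(indicator_fraction(value, min_value, max_value) * track_length)


middle = indicator_fraction(1, 0, 2)            # the example from the text
slider1_offset = indicator_offset(10, 0, 20, 40)
```

For the value 1 in the interval [0,2] the fraction is 0.5, i.e. the indicator sits in the middle, as stated in the text.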
Acknowledgements
The authors of this paper would like to thank all the people on
the MHEG ISO committee. We would especially like to thank the
chairman of the MHEG-5 group, Klaus Hofrichter, and the MHEG-5
editor, Christine Théot.