MPEG-4 Systems and DMIF
Doug Young Suh, Ph.D.
Kyung Hee University
Seminar on Promising Core Component Technologies for the 21st Century
Outline
• Overview
• ISO/IEC 14496-1 MPEG-4 Systems
• ISO/IEC 14496-6 DMIF
Overview
14496-1 MPEG-4 Systems
14496-2 MPEG-4 Video
14496-3 MPEG-4 Audio
• MPEG-4 Systems : interactive audio-visual scene
[Figure: Interactive VOD based on MPEG-4. Server side: authoring tool, MP4 file, video and audio encoders (video ES, audio ES), BIFS encoder, SL, FlexMux, and DMIF TransMux with DMIF call setup and control. Client side: DMIF TransMux with DMIF call setup and control, FlexMux, SL, BIFS composition, MP4 file, and video and audio decoders. Server and client are connected over RTP/UDP/IP.]
Concept 1 : Layered Model
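[Figure: layered communication between two philosophers.]
• Korean philosopher <-> Nepali philosopher : philosopher protocol
• Interpreter (Korean/English) <-> interpreter (Nepali/English) : interpreter protocol
• Korean telecom <-> Nepali telecom : communication protocol
• Philosopher/interpreter interface
• Interpreter/telecom interface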
Concept 2: Object-oriented
• Encapsulation : data, method
• Inheritance
• Not object-based
[Figure: class hierarchy. Human (Name, Age, Call()); Customer (Balance, Register()) and Employee (Salary, Fire()) derive from Human.]
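The Human, Customer, and Employee boxes on this slide can be sketched in Python as follows. This is only an illustration of encapsulation and inheritance; the constructor arguments and the strings returned by the methods are assumptions made for the example, not part of the slide.

```python
class Human:
    """Base class: encapsulates data (name, age) and a method (call)."""

    def __init__(self, name, age):
        self.name = name
        self.age = age

    def call(self):
        return f"Calling {self.name}"


class Customer(Human):
    """Inherits name, age, and call() from Human; adds balance and register()."""

    def __init__(self, name, age, balance=0):
        super().__init__(name, age)
        self.balance = balance

    def register(self):
        return f"{self.name} registered with balance {self.balance}"


class Employee(Human):
    """Inherits from Human; adds salary and fire()."""

    def __init__(self, name, age, salary):
        super().__init__(name, age)
        self.salary = salary

    def fire(self):
        return f"{self.name} dismissed"
```

The MPEG-4 descriptors later in the deck use the same idea: every descriptor extends a common base class and adds its own fields.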
[Figure: the ISO/IEC 14496 terminal architecture. From bottom to top: transmission/storage medium; delivery layer (FlexMux over (RTP)/UDP/IP, (PES)/MPEG-2 TS, AAL2/ATM, H.223/PSTN, or DAB Mux), carrying multiplexed streams; DMIF Application Interface; sync layer (SL), carrying SL-packetized streams; Elementary Stream Interface; compression layer, carrying elementary streams (AV object data, scene description information, object descriptors, upstream information); composition and rendering; display and user interaction with the interactive audiovisual scene.]
• Compression layer : 14496-2 video, 14496-3 audio
• Sync layer : 14496-1 Systems
• Delivery layer : 14496-6 DMIF
Tools in Systems
• Terminal model with time and buffer management
• BIFS (Binary Format for Scenes)
• OD (Object Descriptor)
• Interface to IPMP systems
• SL (Sync Layer)
• FlexMux
• MPEG-Java : an application engine
14496-1 Terminology
• A scene is composed of one or more objects.
Example: in a weather forecast scene, there are a person (object1) and a background (object2), and sound (object3) is played.
• ES : compressed media data, usually in a 1:1 relation with an object
• AU : usually one VOP for video, one frame (e.g. 10 ms) for audio
• CU : the smallest unit that can be handled independently after decoding
Systems Decoder Model
[Figure: systems decoder model. Elements shown: DMIF Application Interface (encapsulates the demultiplexer), decoding buffers DB1 ... DBn, Elementary Stream Interface, decoders 1 ... n, composition memories CB1 ... CBn, and the compositor.]
Systems Buffer Model
• DB : absorbs bitrate variation and network jitter
• CM : used for prediction (P-, B-VOP); absorbs differences in CU decoding time
• DB and CM determine the initial delay
• CM should be minimized (especially on PDAs)
Time Model
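• Why it is needed
- Lip synchronization : CTS, DTS
- Clock recovery : e.g. broadcast, IMT-2000
• Assumptions
- At the DTS instant the AU is decoded instantaneously and removed from the DB, and the decoded CU is stored in the CM
- Composition happens between the current CTS and the next CTS (a CU must stay in the CM at least until the CTS of the next CU)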
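The instantaneous-decoding assumption makes decoding buffer occupancy easy to track. The sketch below computes the peak DB occupancy for a list of access units. It assumes, for simplicity, that each AU arrives all at once; the function name and the tuple layout are hypothetical.

```python
def peak_buffer_occupancy(access_units):
    """Return the peak decoding-buffer occupancy in bytes.

    access_units: iterable of (arrival_time, dts, size_bytes).
    Each AU is added to the buffer at arrival_time and removed
    instantaneously at its DTS, as the time model assumes.
    """
    events = []
    for arrival, dts, size in access_units:
        if dts < arrival:
            raise ValueError("AU arrives after its DTS (buffer underflow)")
        events.append((arrival, 0, size))   # 0: arrival adds bytes
        events.append((dts, 1, -size))      # 1: decoding removes bytes
    # At equal times arrivals sort before removals, giving a conservative peak.
    events.sort()
    occupancy = peak = 0
    for _, _, delta in events:
        occupancy += delta
        peak = max(peak, occupancy)
    return peak
```

A terminal would compare this peak with the buffer size announced for the stream (bufferSizeDB in DecoderConfigDescriptor) to check that the buffer cannot overflow.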
DTS and CTS
[Figure: DTS and CTS timeline. AU0 and AU1 enter the decoding buffer at Arrival(AU0) and Arrival(AU1) and are decoded at DTS(AU0) and DTS(AU1). The resulting CU0 and CU1 are held in composition memory and become available for composition at CTS(CU0) and CTS(CU1).]
Time Base
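• STB (system time base) in the decoder system
• OTB (object time base) for media source systems
- Video : 60 times in a second
- Audio : 44100 times in a second
• Mapping OTB to STB
$$t_{SCT} = \frac{\Delta t_{STB}}{\Delta t_{OTB}}\, t_{OCT} - \frac{\Delta t_{STB}}{\Delta t_{OTB}}\, t_{OTB\text{-}START} + t_{STB\text{-}START}$$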
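Read as a linear rescale, the OTB-to-STB mapping converts an object time into a system time by scaling the elapsed object time and adding the system-side start offset. The sketch below assumes that reading; the function and parameter names are hypothetical, and delta_stb / delta_otb stands for the ratio between the two clocks over the same interval.

```python
def otb_to_stb(t_oct, otb_start, stb_start, delta_stb, delta_otb):
    """Map a time t_oct expressed on the OTB onto the STB.

    Assumed form: t_sct = (delta_stb / delta_otb) * (t_oct - otb_start) + stb_start
    """
    if delta_otb == 0:
        raise ValueError("delta_otb must be non-zero")
    scale = delta_stb / delta_otb
    return scale * (t_oct - otb_start) + stb_start
```

For example, with an OTB that advances two ticks per STB tick, `otb_to_stb(90, 30, 1000, 1, 2)` places object time 90 at 30 STB ticks after the start offset of 1000.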
MP4 File
• Self-contained cf. *.asf of MS Media Player
• Include IOD, OD, BIFS, ES
[Figure: mp4 file layout. 'moov' holds the IOD and one trak each for BIFS, OD, video, and audio; 'mdat' holds the interleaved, time-ordered BIFS, OD, video, and audio access units; other atoms may follow.]
OD Framework
• Basic syntax
abstract aligned(8) expandable(2^28-1) class BaseDescriptor : bit(8) tag=0 {
  // empty. To be filled by classes extending this class.
}
abstract aligned(8) expandable(2^28-1) class BaseCommand : bit(8) tag=0 {
  // empty. To be filled by classes extending this class.
}
• IPMP : IPMP OD, IPMP ES
• Command : OD stream, OD as an ES (convey, update, and remove ODs)
• Descriptor : OD components (Object, IOD, ES, Decoder, QoS)
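Every descriptor and command therefore starts with an 8-bit tag followed by an expandable size field. The sketch below reads that header under the assumption that the size is coded in up to four bytes, each carrying a continuation bit and 7 size bits (4 x 7 = 28 bits, which matches the 2^28-1 limit). The function name is hypothetical.

```python
def read_descriptor_header(data, offset=0):
    """Return (tag, size_of_instance, offset_of_body).

    Assumed size coding: up to 4 bytes, MSB = "another size byte follows",
    low 7 bits = next 7 bits of the size.
    """
    tag = data[offset]
    offset += 1
    size = 0
    for _ in range(4):                      # 4 x 7 bits = 28 bits
        byte = data[offset]
        offset += 1
        size = (size << 7) | (byte & 0x7F)
        if not byte & 0x80:                 # continuation bit clear: done
            return tag, size, offset
    raise ValueError("size field longer than 4 bytes")
```

A parser can use the returned size to skip descriptors whose tag it does not recognize, which is what makes the framework extensible.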
OD Stream
• Carries commands (convey, update, and remove)
• Examples
class ObjectDescriptorUpdate extends BaseCommand : bit(8) tag=ObjectDescrUpdateTag {
  ObjectDescriptorBase OD[1 .. 255];
}
class ObjectDescriptorRemove extends BaseCommand : bit(8) tag=ObjectDescrRemoveTag {
  bit(10) objectDescriptorId[(sizeOfInstance*8)/10];
}
class ES_DescriptorRemove extends BaseCommand : bit(8) tag=ES_DescrRemoveTag {
  bit(10) objectDescriptorId;
  aligned(8) bit(16) ES_ID[1..255];
}
class IPMP_DescriptorRemove extends BaseCommand : bit(8) tag=IPMP_DescrRemoveTag {
  bit(8) IPMP_DescriptorID[1..255];
}
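In ObjectDescriptorRemove the IDs are 10 bits wide and packed back to back, so their count is (sizeOfInstance*8)/10 and any leftover bits are padding. A minimal sketch of unpacking such a payload, assuming most-significant-bit-first packing and a hypothetical function name:

```python
def unpack_od_ids(payload):
    """Unpack consecutive 10-bit objectDescriptorId values from payload bytes."""
    total_bits = len(payload) * 8
    count = total_bits // 10                 # (sizeOfInstance*8)/10
    bits = int.from_bytes(payload, "big")
    ids = []
    for i in range(count):
        shift = total_bits - 10 * (i + 1)    # position of the i-th 10-bit field
        ids.append((bits >> shift) & 0x3FF)
    return ids
```

Three IDs occupy 30 bits, so a 4-byte payload carries them with 2 padding bits at the end.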
[Figure: object descriptors linking scene description to elementary streams. The initialObjectDescriptor carries ES_Descriptors whose ES_IDs identify the scene description stream (e.g. BIFS Command, Replace Scene) and the object descriptor stream (ObjectDescriptorUpdate). Scene description nodes such as MovieTexture and AudioSource refer to an ObjectDescriptorID; each ObjectDescriptor carries ES_Descriptors whose ES_IDs identify the visual streams (e.g. base layer, temporal enhancement) and the audio stream.]
OD Component 1: IOD
// An OD that holds the ES_Descriptors for BIFS and for the per-media ODs
// Needed for call setup
class InitialObjectDescriptor extends BaseDescriptor : bit(8) tag=InitialObjectDescrTag {
  bit(10) ObjectDescriptorID;
  bit(1) URL_Flag;
  bit(1) includeInlineProfileLevelFlag;
  const bit(4) reserved=0b1111;
  if (URL_Flag) {
    bit(8) URLlength;
    bit(8) URLstring[URLlength];
  } else {
    bit(8) ODProfileLevelIndication;
    bit(8) sceneProfileLevelIndication;
    bit(8) audioProfileLevelIndication;
    bit(8) visualProfileLevelIndication; // e.g. Simple, Simple Scalable, Core, Main, etc.
    bit(8) graphicsProfileLevelIndication;
    ES_Descriptor ESD[1 .. 255]; // at least one required
    OCI_Descriptor ociDescr[0 .. 255]; // optional
    IPMP_DescriptorPointer ipmpDescrPtr[0 .. 255];
  }
  ExtensionDescriptor extDescr[0 .. 255];
}
OD Component 2 : OD
class ObjectDescriptor extends BaseDescriptor : bit(8) tag=ObjectDescrTag {
  bit(10) ObjectDescriptorID;
  bit(1) URL_Flag;
  const bit(5) reserved=0b1111.1;
  if (URL_Flag) {
    bit(8) URLlength;
    bit(8) URLstring[URLlength]; // points to another OD
  } else {
    ES_Descriptor esDescr[1 .. 255]; // an array of ES_Descriptors, at least one required
    OCI_Descriptor ociDescr[0 .. 255];
    IPMP_DescriptorPointer ipmpDescrPtr[0 .. 255];
  }
  ExtensionDescriptor extDescr[0 .. 255];
}
OD Component 3 : ES_Descriptor
class ES_Descriptor extends BaseDescriptor : bit(8) tag=ES_DescrTag {
  bit(16) ES_ID;
  bit(1) streamDependenceFlag;
  bit(1) URL_Flag;
  const bit(1) reserved=1;
  bit(5) streamPriority;
  if (streamDependenceFlag)
    bit(16) dependsOn_ES_ID;
  if (URL_Flag) {
    bit(8) URLlength;
    bit(8) URLstring[URLlength];
  }
  DecoderConfigDescriptor decConfigDescr;
  SLConfigDescriptor slConfigDescr;
  QoS_Descriptor qosDescr[0 .. 1]; // at most one, if present
  IPMPDescriptor ipmpDescrPtr[0 .. 1];
  ... (omitted) ...
  ExtensionDescriptor extDescr[0 .. 255];
}
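The fixed part of ES_Descriptor can be read directly from the field widths listed on this slide. The sketch below parses only that leading part (it stops before decConfigDescr) and assumes the body starts right after the tag and size fields. The function name, the returned dictionary keys, and the UTF-8 decoding of the URL are assumptions.

```python
def parse_es_descriptor_start(body):
    """Parse ES_ID, flags, priority, and the optional fields that follow.

    Returns (fields, offset) where offset points at the first byte
    after the parsed part (i.e. where decConfigDescr would begin).
    """
    es_id = int.from_bytes(body[0:2], "big")
    flags = body[2]
    stream_dependence = bool(flags & 0x80)   # bit(1) streamDependenceFlag
    url_flag = bool(flags & 0x40)            # bit(1) URL_Flag
    stream_priority = flags & 0x1F           # bit(5) streamPriority
    pos = 3

    depends_on = None
    if stream_dependence:
        depends_on = int.from_bytes(body[pos:pos + 2], "big")
        pos += 2

    url = None
    if url_flag:
        length = body[pos]
        pos += 1
        url = body[pos:pos + length].decode("utf-8")  # encoding is an assumption
        pos += length

    fields = {
        "ES_ID": es_id,
        "streamPriority": stream_priority,
        "dependsOn_ES_ID": depends_on,
        "URL": url,
    }
    return fields, pos
```

The dependsOn_ES_ID field is how an enhancement-layer stream points at the base-layer stream it needs.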
OD Component 4 : DecoderConfigDescriptor
class DecoderConfigDescriptor extends BaseDescriptor : bit(8) tag=DecoderConfigDescrTag {
bit(8) objectTypeIndication; // MPEG-1,-2 video, audio, etc.
bit(6) streamType;
bit(1) upStream;
const bit(1) reserved=1;
bit(24) bufferSizeDB;
bit(32) maxBitrate;
bit(32) avgBitrate;
DecoderSpecificInfo decSpecificInfo[0 .. 1];
}
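The fixed fields of DecoderConfigDescriptor add up to 13 bytes (8 + 6 + 1 + 1 + 24 + 32 + 32 bits), so they can be unpacked in one step. A minimal sketch, assuming the body starts right after the tag and size fields; the function name and dictionary keys are hypothetical, and the values in the usage note are arbitrary.

```python
import struct


def parse_decoder_config(body):
    """Unpack the 13 fixed bytes of a DecoderConfigDescriptor body."""
    if len(body) < 13:
        raise ValueError("need at least 13 bytes")
    object_type, packed, db_hi, db_lo, max_br, avg_br = struct.unpack(
        ">BBBHII", body[:13]
    )
    return {
        "objectTypeIndication": object_type,
        "streamType": packed >> 2,            # upper 6 bits
        "upStream": bool((packed >> 1) & 1),  # next bit; lowest bit is reserved
        "bufferSizeDB": (db_hi << 16) | db_lo,  # 24-bit value
        "maxBitrate": max_br,
        "avgBitrate": avg_br,
    }
```

For instance, a body announcing bufferSizeDB = 4096 tells the terminal how large the decoding buffer for that stream must be before the first access unit arrives.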
Other OD Components
• QoS_Descriptor : delay, loss, AU_Size, etc.
• DecoderSpecificInfo
• SLConfigDescriptor
• ContentIdentificationDescriptor
BIFS
• Binary information needed to combine, reconstruct, and present audio-visual data at the client side (not at the server side)
• spatio-temporal location/scale/orientation of audio-visual objects
• largely based on VRML (ISO/IEC 14772-1)
• BIFS_ES, BIFS AU (BIFS-Command, BIFS-Anim), BIFS SL, BIFS time base, BIFS decoder
• For interactivity, SENSOR node
Object-based multimedia scene
[Figure: an audiovisual presentation built from audiovisual objects (3D objects, 2D background, voice, sprite) placed in a scene coordinate system (x, y, z). A hypothetical viewer sees the scene through a projection plane; the video compositor and audio compositor drive the display and speaker; user input produces user events. Multiplexed downstream control/data enters the terminal and multiplexed upstream control/data leaves it.]
Logical structure of the scene
[Figure: scene tree. scene -> person (voice, sprite), 2D background, furniture (globe, desk), audiovisual presentation.]
• a graph with links and nodes (refer to graph theory.)
[Figure: timelines of five playback cases controlled by the time parameters loop, duration, startTime, and stopTime: (1) startTime to startTime+duration; (2) startTime to stopTime; (3) startTime, startTime+duration, startTime+2*duration, and so on; (4) stopTime set during playback; (5) loop set to FALSE during playback.]
1. Play once
2. Stop during play (loop=FALSE, startTime<stopTime<startTime+duration)
3. Repeat continuously (loop=TRUE, stopTime<=startTime)
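The three listed cases can be checked with a small predicate. This is a sketch based on one reading of the VRML-style rules for time-dependent nodes (a stopTime that is not later than startTime is ignored); the function name is hypothetical and pause/resume behavior is not modeled.

```python
def is_playing(now, start_time, stop_time, duration, loop):
    """Return True if the media object is active at time `now`."""
    if now < start_time:
        return False                         # not started yet
    if stop_time > start_time and now >= stop_time:
        return False                         # explicitly stopped
    if not loop and now >= start_time + duration:
        return False                         # single cycle has finished
    return True
```

With startTime=0 and duration=10: case 1 is `loop=False, stop_time=0` (plays until 10), case 2 is `loop=False, stop_time=4` (stops at 4), and case 3 is `loop=True, stop_time=0` (never stops).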
BIFS-Command
• Modify properties of the scene graph, its nodes, and behaviors
• applied to conditional nodes
1. ReplaceEntireScene(new_scene_graph) // random access point
2. Insertion(nodeID,event,ROUTE)
3. Deletion(nodeID,event,ROUTE)
4. Replace(nodeID,event,ROUTE)
BIFS-Anim
• Updates certain fields of nodes in the scene graph
• meshes, 2D/3D positions, rotations, scale factors, and color attributes
• Separate ESs for BIFS-Command (CommandFrames) and BIFS-Anim (AnimationFrames)
CompositeTexture2D example (projected on 3D cube)
CompositeTexture2D{
eventIn MFNode addChildren
eventIn MFNode removeChildren
exposedField MFNode children
exposedField SFInt32 pixelWidth
exposedField SFInt32 pixelHeight
exposedField SFNode background
exposedField SFInt32 viewport
}
Sync layer (SL)
• defines a syntax for the packetization of each ES into AUs or parts of AU
• SPS (SL packet stream) : the sequence of SL packets from one ES
[Figure: sync layer. Elementary streams cross the Elementary Stream Interface, each one is packetized by its own SL instance, and the resulting SL-packetized streams cross the DMIF Application Interface.]
class SLConfigDescriptor extends BaseDescriptor : bit(8) tag=SLConfigDescrTag {
  bit(8) predefined;
  if (predefined==0) {
    bit(1) useAccessUnitStartFlag;
    bit(1) useAccessUnitEndFlag;
    bit(1) useRandomAccessPointFlag;
    bit(1) hasRandomAccessUnitsOnlyFlag;
    bit(1) usePaddingFlag;
    bit(1) useTimeStampsFlag;
    bit(1) useIdleFlag;
    bit(1) durationFlag;
    bit(32) timeStampResolution;
    bit(32) OCRResolution;
    bit(8) timeStampLength; // must be <= 64
    bit(8) OCRLength; // must be <= 64
    bit(8) AU_Length; // must be <= 32
    bit(8) instantBitrateLength;
    bit(4) degradationPriorityLength;
    bit(5) AU_seqNumLength; // must be <= 16
    bit(5) packetSeqNumLength; // must be <= 16
    bit(2) reserved=0b11;
  }
  if (durationFlag) {
    bit(32) timeScale;
    bit(16) accessUnitDuration;
    bit(16) compositionUnitDuration;
  }
  if (!useTimeStampsFlag) {
    bit(timeStampLength) startDecodingTimeStamp;
    bit(timeStampLength) startCompositionTimeStamp;
  }
}
SLConfigDescriptor in ES_Descriptor
SL Packet Header
• packetSequenceNumber
• degradationPriority
• objectClockReference
• decodingTimeStamp
• compositionTimeStamp
• accessUnitLength
• instantBitrate
MPEG-Java
• Flexible programmatic control system (not parametric)
• Capability for graceful degradation under limited or time varying resources
• Capability to respond to user interaction and provide enhanced multimedia functionality
MPEG-J System
• Combine MPEG-media and safe executable code (Java code)
• Components of MPEG-4 player
- Execution and presentation resources
- Decoders
- Network resources
- Scene graph
• Downloadable decoder????
MPEG-J enabled MPEG-4 System
[Figure: MPEG-J enabled MPEG-4 system. Version 1 player: DMIF, DEMUX, decoding buffers 1..n, media decoders 1..n, composition buffers 1..n, compositor and renderer, BIFS decoder, scene graph, channel and back channel. MPEG-J part: MPEG-J application, class loader, buffer, I/O devices, Scene Graph Manager (SG API), Resource Manager (RM API), Network Manager (NW API), and MD API. Legend: interface, control, data.]
FlexMux (optional)
• Multiplexing or separate channel?
- Multiplexing : circuit switching
- Separate channels : packet switching
• Multiplexing : low overhead
- RTP/UDP/IP header size (40 bytes or more) compared to audio packet payload (20 bytes)
- Simpler than MPEG-2 TS
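The overhead argument can be made concrete with the numbers on this slide. The sketch below compares sending each 20-byte audio packet in its own RTP/UDP/IP packet with multiplexing several of them behind one 40-byte header. The 2-byte per-packet FlexMux header is an assumption (simple-mode index and length bytes), and SL headers are ignored.

```python
RTP_UDP_IP_HEADER = 40   # bytes, from the slide
AUDIO_PAYLOAD = 20       # bytes, from the slide
FLEXMUX_HEADER = 2       # bytes, assumed: index + length per multiplexed packet


def overhead_ratio(n_streams, multiplexed):
    """Fraction of transmitted bytes spent on headers."""
    payload = n_streams * AUDIO_PAYLOAD
    if multiplexed:
        headers = RTP_UDP_IP_HEADER + n_streams * FLEXMUX_HEADER
    else:
        headers = n_streams * RTP_UDP_IP_HEADER
    return headers / (headers + payload)
```

For three streams, separate channels spend 120 of 180 bytes on headers, while one multiplexed packet spends 46 of 106 bytes.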
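Simple Mode
[Figure: a FlexMux-PDU consists of a header (index, length) and a payload holding one SL-PDU.]
MuxCode Mode
[Figure: a FlexMux-PDU consists of a header (index, length, version) and a payload holding several SL-PDUs, each with its own header (H) and payload.]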
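A simple-mode FlexMux-PDU is small enough to build by hand. The sketch below assumes an 8-bit index and an 8-bit length in front of one SL-PDU, and assumes that index values above 237 are reserved for MuxCode mode; both function names are hypothetical.

```python
MAX_SIMPLE_INDEX = 237   # assumption: larger index values select MuxCode mode


def pack_simple_pdu(index, sl_pdu):
    """Prefix one SL-PDU with a simple-mode FlexMux header (index, length)."""
    if not 0 <= index <= MAX_SIMPLE_INDEX:
        raise ValueError("index out of simple-mode range")
    if len(sl_pdu) > 255:
        raise ValueError("SL-PDU too long for an 8-bit length field")
    return bytes([index, len(sl_pdu)]) + bytes(sl_pdu)


def unpack_simple_pdus(stream):
    """Split a byte stream of simple-mode FlexMux-PDUs into (index, SL-PDU) pairs."""
    pos = 0
    pdus = []
    while pos < len(stream):
        index, length = stream[pos], stream[pos + 1]
        if index > MAX_SIMPLE_INDEX:
            raise ValueError("MuxCode mode is not handled in this sketch")
        pdus.append((index, bytes(stream[pos + 2:pos + 2 + length])))
        pos += 2 + length
    return pdus
```

The index plays the role of a channel number, so interleaving PDUs from several elementary streams into one byte stream needs no extra framing.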
MP4 File format
• (normally) self-contained file cf. *.asf
• Protocol-unaware, media-unaware
[Figure: mp4 file layout. 'moov' holds the IOD and one trak each for BIFS, OD, video, audio, and hint; 'mdat' holds the interleaved, time-ordered BIFS, OD, video, and audio access units, and hint instructions; other atoms may follow.]
MP4 File Usage
• Interchange
• Content creation : authoring
• Preparation for streaming : interleaving
• Local presentation : CD, DVD-ROM
• Streamed presentation (not yet, in IM1)
MP4 Terminology
• atom : ‘object’ in the sense of the object-oriented concept, e.g. ‘iods’ OD atom, ‘moov’ movie atom, ‘mdat’ media data atom, etc.
• trak : ES + [hint trak], e.g. video trak, audio trak
• hint trak : packetization information
• Container : file, ‘moov’, ‘mvhd’, ‘mdhd’
Hint track
• Bridge between MPEG-4 and a protocol
• Each TransMux has its own hint track format (ES over TransMuxes).
• aligned(8) class HintMediaHeaderAtom extends FullAtom(‘hmhd’, version = 0, 0) {
  unsigned int(16) maxPDUsize;
  unsigned int(16) avgPDUsize;
  unsigned int(32) maxbitrate;
  unsigned int(32) avgbitrate;
  unsigned int(32) slidingavgbitrate;
}
DMIF
[Figure: the three layers and the interfaces between them.]
• Compression Layer : media aware, delivery unaware (ISO/IEC 14496-2 Visual, ISO/IEC 14496-3 Audio)
- Elementary Stream Interface (ESI)
• Sync Layer : media unaware, delivery unaware (ISO/IEC 14496-1 Systems)
- DMIF Application Interface (DAI)
• Delivery Layer : media unaware, delivery aware (ISO/IEC 14496-6 DMIF)
DMIF Usage
[Figure: DMIF usage. The originating application (client) talks through the DAI to a DMIF filter that selects one of three originating DMIF instances: for broadcast, for local files, or for a remote server. Each reaches a target DMIF and target application: from a broadcast source, from local storage, or across an interactive network (via Sigmap and the DNI) to the server. Flows between independent systems are normative; flows internal to a single system are either informative or out of DMIF scope.]
DMIF Terminology
• Service : DMIF provides a service to an application(or user).
• Service session : local association between DMIF instance and a service
• Network session : an association between two DMIF peers
• Channel : the path over which a DMIF user sends or receives data
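[Figure: two DMIF users, each attached to a DMIF instance through a service and a service session (uu = user-to-user data). The two DMIF instances are linked across the network by a network session (dd = DMIF-to-DMIF data) and TransMux channels.]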
DMIF Terminology
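Network service primitives
[Figure: two users communicating across the network. 1. Request (from the initiating user), 2. Indication (to the peer user), 3. Response (from the peer user), 4. Confirm (to the initiating user).]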
DMIF-Application Interface
• Service primitives, e.g. DA_ServiceAttach(IN: URL, uuDataInBuffer, uuDataInLen; OUT: response, serviceSessionId, uuDataOutBuffer, uuDataOutLen)
• Channel primitives, e.g. DA_ChannelDelete(IN: loop(channelHandle, reason); OUT: loop(response))
• Data primitives, e.g. DA_Data(IN: channelHandle, streamDataBuffer, streamDataLen)
DMIF Network Interface
• Session primitives : setup and release
- DN_SessionSetup(), DN_SessionRelease()
• Service primitives : attach and detach
- DN_ServiceAttach(), DN_ServiceDetach()
• Transmux primitives : setup, release, and config
- DN_TransMuxSetup(), DN_TransMuxRelease(), DN_TransMuxConfig()
• Channel primitives : add and delete
[Figure: service attach message sequence between the Origin DMIF Terminal (Client) and the Target DMIF Terminal (Server). Each terminal is drawn as Application, DAI, DMIF Layer, with DNI + Network + DNI between them. The figure numbers the steps 1 to 8; in order they are:]
- The application initiates the service : DA_ServiceAttach (IN: DMIF_URL, uuData)
- Determine whether a new network session is needed : DN_SessionSetup (IN: nsId, CalledAddr, CallingAddr, CapDescr) (OUT: rsp, CapDescr)
- Attach to the service : DN_ServiceAttach (IN: nsId, serviceId, serviceName, ddData) (OUT: rsp, ddData)
- Connect to the application running the service : DA_ServiceAttach (IN: ssId, serviceName, uuData)
- The application running the service replies : (OUT: rsp, uuData)
- The reply reaches the originating application : (OUT: rsp, ssId, uuData)
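The "determine whether a new network session is needed" step can be sketched as a lookup on the originating side. This is a toy model, not the DMIF API: the class, its attributes, and the logged tuples are hypothetical, and only the primitive names DN_SessionSetup and DN_ServiceAttach come from the slides.

```python
class OriginatingDmif:
    """Toy originating DMIF layer that reuses network sessions per peer address."""

    def __init__(self):
        self._sessions = {}      # called address -> network session id (nsId)
        self._next_ns_id = 1
        self.log = []            # DNI primitives issued, in order

    def service_attach(self, called_addr, service_name):
        """Handle a DA_ServiceAttach from the application; return the nsId used."""
        ns_id = self._sessions.get(called_addr)
        if ns_id is None:
            # No session to this peer yet: set one up first.
            ns_id = self._next_ns_id
            self._next_ns_id += 1
            self._sessions[called_addr] = ns_id
            self.log.append(("DN_SessionSetup", ns_id, called_addr))
        self.log.append(("DN_ServiceAttach", ns_id, service_name))
        return ns_id
```

Attaching to two services on the same server then produces one DN_SessionSetup followed by two DN_ServiceAttach entries in the log.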
Conclusion
• Semantic and syntax
• General or specific applications
• Multimedia over [mobile] Internet
- ATM => All IP
- QoS issues (time varying and limited)
• Start implementation with IM1 : mpeg4.nist.gov/IM1
Future Works
• Downloadable decoder cf. SDR (software defined radio)
• All IP (<= all ATM)
• QoS control : transport layer => network layer, IETF (RSVP, diffServ, intServ, MPLS)
Abbreviations
• AU access unit
• AV audio-visual
• BIFS binary format for scenes
• CM composition memory
• CTS composition time stamp
• CU composition unit
• DAI DMIF-application interface
• DB decoding buffer
• DNI DMIF-network interface
• DTS decoding time stamp
• ES elementary stream
• ESI elementary stream interface
• ESID elementary stream identifier
• IPMP intellectual property management and protection
• OCI object content information
• OCR object clock reference
• OD object descriptor
• OTB object time base
• PLL phase locked loop
• QoS quality of service
• SDM system decoder model
• SL synchronization layer
• SPS SL-packetized stream
• STB system time base
• URL uniform resource locator
• VOP video object plane
• VRML virtual reality modeling language