DiaStar Multimedia IVVR
What is IVVR?
The next step beyond Interactive Voice Response (IVR) is Interactive Voice and Video Response (IVVR). Both allow user interaction with the media to control the flow of the application through menus and data entry into the app with DTMF tones. But IVVR allows much more information to be displayed, enabling sophisticated applications that would be impossible in a voice-only world.
In the same way that IVR systems enabled companies to create self-help telephony applications and reduce contacts with their agents, IVVR will extend that paradigm, allowing companies to build self-help audio/video applications that address significantly more complex tasks. Complex instructions can be delivered to precisely the location where they are needed.
For some applications, a small web browser interface on a hand-held device may be too complex. Here, video menus can provide a simplified user interface on a mobile phone.
Here are some good examples of viable mobile IVVR applications:
- Mobile Banking - access account info, transfer between accounts, pay bills, etc. Operations are clearer with video menus. Figures remain on the screen until the user dismisses them
- Store Locator - provides address, telephone number (with one-button calling) and maps of various scales to find the address. Ads for current specials can be viewed
- Some Assembly Required - an interactive, pictorial/video walk-through of putting together a product such as an iPod docking station and its speakers
DiaStar has the capabilities to play and record multimedia (simultaneous audio and video) files when used in conjunction with an Asterisk client. While audio-only media with DiaStar and Asterisk is handled on Asterisk, DiaStar controls the playing and recording of multimedia files. The media terminates/originates on DiaStar, rather than being passed through to Asterisk. Control of the IVVR app flow is still accomplished through the Asterisk dialplan or AGI - inbound calls are answered, outbound calls made and multimedia sessions initiated.
DiaStar delivers multimedia to any SIP-compliant endpoint. This includes desktop video SIP phones, video SIP softphones running on desktop or laptop PCs, and SIP clients on handheld devices with WiFi internet connectivity. 3G-324M delivery of multimedia to 3G mobile phones is also possible, either through a 3G gateway or directly from the DiaStar server. Video codec support includes H.263 and MPEG-4, with H.264 support available soon.
For further ideas on IVVR applications and information on how mobile video fits into the world of telecom in general, look through these links to application notes and white papers about mobile video on Dialogic products:
- An Introduction to Multimedia Services
- Addressing Video Processing Challenges with the IP Multimedia Subsystem
- Bringing Video to the Mobile Handheld Market
- Increasing Customer Loyalty with Enhanced Video Services
- 3G Video Applications Enter a New Era
Mobile Video
Mobile video is, by the nature of the devices used, viewed on a small screen. Devices include 3G mobile phones, Smart Phones and Personal Digital Assistants. In addition to handhelds, small desktop devices such as dedicated video SIP phones and soft SIP phones running on a laptop or desktop PC are likely video endpoints.
All of these devices use formats known as Common Intermediate Format (CIF) or Quarter Common Intermediate Format (QCIF). These formats are sized as follows:
- CIF 352 x 288 pixels
- QCIF 176 x 144 pixels
The physical size of each of these formats depends on the size of an individual pixel, but each looks best at around 100 pixels/inch.
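As a rough illustration of the 100 pixels/inch figure, the Python sketch below converts each format's pixel dimensions into a physical size. The function name and the default density parameter are illustrative choices, not part of DiaStar.

```python
# Pixel dimensions of the two formats listed above.
FORMATS = {"CIF": (352, 288), "QCIF": (176, 144)}


def physical_size_inches(name, pixels_per_inch=100):
    """Return (width, height) in inches for a format at a given pixel density."""
    width_px, height_px = FORMATS[name]
    return width_px / pixels_per_inch, height_px / pixels_per_inch


for name in FORMATS:
    width_in, height_in = physical_size_inches(name)
    print(f"{name}: {width_in:.2f} x {height_in:.2f} inches")
```

At 100 pixels/inch, CIF works out to about 3.5 x 2.9 inches and QCIF to about 1.8 x 1.4 inches, which shows why close-up framing matters on handheld screens.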
Certain video and audio codecs are standard for mobile video. More information on these will be found below.
Multimedia Production for Mobile Video
As with IVR, there is a production process that needs to be followed for IVVR. The following steps show what goes into this process:
- The show is first planned and scripted
- Video clips are shot or purchased
- Clips are edited and menus created
- Post-production of the clips is done to get the proper formats for mobile video
- An Asterisk Dialplan is written
- Media files are stored on the DiaStar server
- The production is ready for use
Scripting & Storyboard
A storyboard should first be drawn up. This outlines the flow of the interactive multimedia application, and includes menus, data entry screens and video clips. The clips themselves will usually include spoken dialog, either from the character in the clip, or overlaid after the filming is done. Once the script for the clips has been finished, filming can begin.
Multimedia Capture
Video elements can be created from stock video footage or by filming original content.
When using stock footage (or any video created by someone other than yourself) make sure that you have secured the proper permission to use that footage as part of your project.
When filming original footage you should film in the highest possible resolution for your video camera. While a professional quality camera is usually not necessary for the resolutions needed for mobile video, you may need to create higher resolution source files if your application will be used on wireline SIP phones.
Paying attention to basic production values can make your application appear very professional, without going to the expense of using a professional videographer.
- Control the Lighting - In general, bright, even lighting is good. However, you need to make sure not to create any "hot spots", where the light is so bright that it confuses your camera. Hot spots can be created by very shiny or very white objects that get caught in the frame. The same thing goes for areas of darkness as you pan across your subject. Some cameras cannot adjust to sudden changes in lighting, resulting in missing key moments of the action.
- Patterns - Certain patterns on clothing, furniture or as a font-fill look great in person or at a particular resolution, and can be completely unviewable at another. Small patterns can seem to "strobe" or seem out of focus all the time.
- Use a Tripod or Camera Stand - Even the smallest amount of unsteadiness can cause problems. Use a stand of some sort.
- Get Close to Your Subject - Be as close as possible to the subject of your shot, especially if you are creating an IVVR that is used to deliver physical instructions (like assembling parts). Remember that your video may be viewed on a very small screen, and unless the subject takes up the full frame, it may not be watchable at all.
- Motion - Fast motion and/or fast panning can be hard to view on a small screen. Move, pan and zoom slowly.
- Do Not Use the Built-in Microphone on the Camera - You should always use an external microphone when filming. That way the microphone can stay the same distance from the subjects, even if the camera is moving. If you do not need exact synchronization of audio and video (that is, you don't see people talking in your video) it is best to record the audio tracks separately from the video. That way you can totally control the recording of audio and then just add it to the video in a basic editing tool.
Output from the filming will usually be in a common video format suitable for transfer to a PC, such as .WMV (Windows Media Video) or .AVI (Audio Video Interleave). Once downloaded to a PC, editing can take place.
Video Editing
This is another area where spending a little time can make your IVVR look very professional. An expensive editing tool is not necessary. The early IVVR demos were created for DiaStar using a $20 video editing tool. All you really need is the basic ability to trim and arrange video clips, create text overlays (for menus) and add (or add to) the video track.
One of the major offenses in video editing is going with the mindset that "if the tool can do it, then I should use it." This is often seen when developers first start to use PowerPoint or other presentation creation tools. Don't try to pack every video effect or transition into your IVVR. Pick one transition that works and stick with it. Remember that the point of an IVVR is to get the user to the required information, not to provide an entertaining video presentation.
Also keep in mind that many IVVR systems may be used in mobile situations, so you want to keep menu options short and to the point. Don't present the user with a menu that has a dozen choices or options.
Because the creation of IVVR systems is still a relatively new art form, the industry has not yet generated a comprehensive set of real style guidelines. Until such a time as IVVR style guides are created and adopted, you can do a search on "tips for creating good presentations" (without the quotes) and apply many of the tips provided to your IVVR video segments.
Here are several editors:
- OpenShot Non-Linear Video Editor for Linux
- AVIdemux
- Kdenlive free and open-source video editor for GNU/Linux and FreeBSD
At this point the video production is complete and post-production for mobile formats can be done.
Post-Production Processing for Mobile Delivery
WMV and AVI are formats designed for higher bandwidth data paths and larger, higher resolution screens. Correct formatting and bandwidth limitations must be imposed, as well as packaging the audio and video in a standard mobile video format. This "multimedia container format" is known as 3GP.
The format is specified by the 3rd Generation Partnership Project (3GPP), an organization that unites telecommunications standards bodies around mobile broadband standards. The group's website can be found [here].
The following codecs are used in the 3GP format, with those in italics currently supported by DiaStar:
Video Codecs
- MPEG-4 Part 2
- H.263
- MPEG-4 Part 10 (AVC/H.264)
Audio Codecs
- AMR-NB
- AMR-WB
- AMR-WB+
- AAC-LC
- HE-AAC v1 or v2
Bandwidth limitations for multimedia are especially important if the content will be delivered over a 3G-324M connection. In 3G-324M, audio, video and multiplexing/control information all need to fit into a standard TDM 64 kbps digital channel. Using AMR (Adaptive Multi-Rate codec) for audio consumes about 12.5 kbps, and control about the same. The video stream must be processed so that it runs at a steady rate of around 38 kbps, without higher bursts. Exceeding available bandwidth can literally "back up the pipe" and cause gaps, losses of synchronization and other undesirable effects. The preferred format for 3G-324M is QCIF, with a frame rate of 10 fps.
Delivery over either wired or wireless Ethernet is less stringent. It is usually not difficult to get a good quality CIF format that handles motion well with an available bandwidth of around 384 kbps.
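The 3G-324M arithmetic above can be checked with a short Python sketch. It only restates the approximate figures quoted in the text. The function names are illustrative, and the per-frame figure is a simple average that ignores the difference between key frames and predicted frames.

```python
CHANNEL_KBPS = 64.0   # one standard TDM digital channel
AUDIO_KBPS = 12.5     # approximate AMR audio rate quoted above
CONTROL_KBPS = 12.5   # approximate multiplexing/control overhead quoted above


def video_budget_kbps(channel=CHANNEL_KBPS, audio=AUDIO_KBPS, control=CONTROL_KBPS):
    """Bandwidth left for video once audio and control are accounted for."""
    return channel - audio - control


def fits_channel(video_kbps):
    """True if a steady video rate stays within the remaining budget."""
    return video_kbps <= video_budget_kbps()


def average_frame_bits(video_kbps, fps):
    """Average number of bits available per video frame."""
    return video_kbps * 1000 / fps


print(video_budget_kbps())         # remaining budget in kbps
print(fits_channel(38))            # the recommended steady video rate
print(average_frame_bits(38, 10))  # average bits per frame at 10 fps
```

With these figures the remaining budget is 39 kbps, so a 38 kbps video stream leaves only about 1 kbps of headroom. That is why bursts above the steady rate cause trouble.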
There are a number of options available for mobile video postprocessing. Proprietary encoders that work with DiaStar are:
A well-known open source encoder that is able to produce 3GP files is:
It is described as "a complete, cross-platform solution to record, convert and stream audio and video". However, it runs as a command line tool with a lot of options and getting all the parameters straight can be a challenge.
Conversion to Proprietary DiaStar Format
A final step is necessary to convert a standard .3gp file into the proprietary format used by DiaStar. There is a Dialogic utility that may be freely downloaded, "hmp3gp", that will do the conversion. It is a relatively simple process. The conversion direction is given and audio output type (linear PCM) is specified, as well as proprietary and .3gp file names. For example:
> hmp3gp -d1 -aMM07 main_menu.vid main_menu.pcm main_menu.3gp
The .vid and .pcm files are now ready for use by DiaStar.
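When a production has many clips, the same command line can be generated for each one. The Python sketch below builds the argument list using exactly the flags from the example above (`-d1`, `-aMM07`) and only prints the resulting commands. The helper name and the list of clip names are hypothetical. Consult the hmp3gp documentation before relying on these flags for other conversion directions or audio types.

```python
import shlex


def hmp3gp_command(stem):
    """Build the hmp3gp argument list for one clip, mirroring the example above."""
    return ["hmp3gp", "-d1", "-aMM07", f"{stem}.vid", f"{stem}.pcm", f"{stem}.3gp"]


# Hypothetical clip names; replace with the names from your storyboard.
for stem in ["main_menu", "play_menu", "record_intro"]:
    print(shlex.join(hmp3gp_command(stem)))
```

The list form can be passed directly to `subprocess.run` on a system where hmp3gp is installed.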
Linux Multimedia Offline Conversion Tools and full documentation may be downloaded here.
Note that direct .3gp play/record will be available in a future release of DiaStar.
DiaStar Media Files
While audio files are located on an Asterisk system, multimedia files are located on the DiaStar system. They are expected to be in a predefined file system structure, as the server searches for the various types of media that are specified by an incoming call. If it cannot find a file in a called-for format, it will transcode to that format if possible.
The four video file types expected are:
- MPEG-4, CIF format
- H.263, CIF format
- MPEG-4, QCIF format
- H.263, QCIF format
Audio files are, by default, 8 kHz, 16-bit, signed-linear PCM. This is the same as Asterisk’s .sln format. This may prove useful if existing Asterisk audio files, e.g. files containing background music, are reused.
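Because this audio format has a fixed data rate, clip length and file size can be converted into each other directly. The Python sketch below assumes headerless, single-channel audio (as with Asterisk's .sln files). The function names are illustrative.

```python
SAMPLE_RATE_HZ = 8000   # 8 kHz sampling
BYTES_PER_SAMPLE = 2    # 16-bit signed-linear samples
# Assumption: mono audio with no file header.
BYTES_PER_SECOND = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE


def pcm_duration_seconds(size_bytes):
    """Playing time of a raw PCM file of the given size."""
    return size_bytes / BYTES_PER_SECOND


def pcm_size_bytes(seconds):
    """Size of a raw PCM file holding the given playing time."""
    return int(seconds * BYTES_PER_SECOND)


print(pcm_duration_seconds(160000))  # seconds of audio in a 160000-byte file
print(pcm_size_bytes(30))            # bytes needed for a 30-second clip
```

This is a quick way to check that a .pcm file matches the length of the video clip it accompanies.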
The file system structure is similar to that used by Asterisk:
/var/lib/diastar/media/language/audio_video_codec/video_format/application_name/file.extension
A set of multimedia files for the main menu in the verification_demo application would be stored as follows:
/var/lib/diastar/media/en_US/h263/CIF/verification_demo/main_menu_000.vid
/var/lib/diastar/media/en_US/h263/QCIF/verification_demo/main_menu_000.vid
/var/lib/diastar/media/en_US/mpv4-es/CIF/verification_demo/main_menu_000.vid
/var/lib/diastar/media/en_US/mpv4-es/QCIF/verification_demo/main_menu_000.vid
/var/lib/diastar/media/en_US/vox/verification_demo/main_menu_000.pcm
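The layout can be expressed as a pair of small path-building helpers. This Python sketch is only an illustration of the directory scheme shown above. The function names are hypothetical, and the codec directory names (`h263`, `mpv4-es`, `vox`) are copied from the example paths rather than from a DiaStar specification.

```python
from pathlib import PurePosixPath

MEDIA_ROOT = PurePosixPath("/var/lib/diastar/media")


def video_path(language, codec, video_format, application, name):
    """Path of a .vid file: language/codec/video_format/application/file."""
    return MEDIA_ROOT / language / codec / video_format / application / f"{name}.vid"


def audio_path(language, application, name, codec="vox"):
    """Path of a .pcm file: audio has no video_format directory level."""
    return MEDIA_ROOT / language / codec / application / f"{name}.pcm"


# List every file needed for one menu in the verification demo.
for codec in ("h263", "mpv4-es"):
    for video_format in ("CIF", "QCIF"):
        print(video_path("en_US", codec, video_format, "verification_demo", "main_menu_000"))
print(audio_path("en_US", "verification_demo", "main_menu_000"))
```

Generating the paths this way makes it easier to confirm that all four video variants and the audio file exist before a dialplan references them.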
See the Media Files section in the DiaStar wiki for more information.
A DiaStar Tech Brief is planned that will cover the details of video post-production.
Asterisk Dialplan Development
Asterisk dialplan development can begin either when video production is finished, or done with a series of "dummy" media files. These files can be simple multimedia announcements of menus and clips specified by the storyboard. They are most easily done with a PC camera and a video SIP softphone on the PC. A DiaStar/Asterisk application such as the verification demo can be used to record these dummy clips. Once the clips are in place in the directory structure and names assigned according to the storyboard, the dialplan programming to create menus and play and record clips can be done.
The main difference between standard dialplan programming and multimedia programming with DiaStar is the use of the three multimedia play and record commands:
exten => s,n,WoomeraPlayback(audio_file,video_file)
exten => s,n,WoomeraBackground(audio_file,video_file)
exten => s,n,WoomeraRecord(audio_file,video_file)
These work the same way as their Asterisk audio counterparts:
- WoomeraPlayback plays multimedia without allowing interruption by DTMF
- WoomeraBackground plays multimedia as a background task while monitoring for DTMF
- WoomeraRecord records multimedia into a pair of files until a # is detected
The dialplan for the verification demo that comes with DiaStar is shown below to illustrate many of DiaStar's IVVR features:
;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; Asterisk Dialplan for DiaStar System Multimedia Verification Demo
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;
;
[globals]
;
; Media files are selected based on incoming SDP requests from SIP.
; Audio and video files are kept in standard directories organized by
; codec and resolution, followed by a directory named after the application.
FILE_LOC=verification_demo
[dialogic]
; Always start here. Any extension starts the demo
exten => _X.,1,Answer
exten => _X.,n,Goto(greeting,s,1)
[play_nointerrupt]
exten => s,1,Set(LOCAL(CALLING_CONTEXT)=${ARG1})
exten => s,n,Set(LOCAL(NEXT_CONTEXT)=${ARG2})
exten => s,n,Set(LOCAL(COMMENT)=${ARG3})
exten => s,n,Verbose(Context: ${CALLING_CONTEXT})
exten => s,n,Verbose(Next Context: ${NEXT_CONTEXT})
exten => s,n,Verbose(Comment: ${COMMENT})
; Play without interruption
exten => s,n,WoomeraPlayback(${FILE_LOC}/${CALLING_CONTEXT},${FILE_LOC}/${CALLING_CONTEXT})
exten => s,n,Goto(${NEXT_CONTEXT},s,1)
exten => s,n,Return
[greeting]
; Beginning of the Demo - Start Here
exten => s,1,Set(COMMENT=Verification Demo Intro)
exten => s,n,Set(NEXT_CONTEXT=main_menu)
exten => s,n,Gosub(play_nointerrupt,s,1(${CONTEXT},${NEXT_CONTEXT},${COMMENT}))
[main_menu]
; Menu listing demo options
exten => s,1,Verbose(Context: main_menu)
exten => s,n,Verbose(Press 1 to play a video clip, 2 to record and replay a clip, 3 to hang up)
exten => s,n,WoomeraBackground(${FILE_LOC}/${CONTEXT},${FILE_LOC}/${CONTEXT})
exten => s,n,WaitExten(20)
exten => 1,1,Verbose(Choice is "1" )
exten => 1,n,Goto(play_menu,s,1)
exten => 2,1,Verbose(Choice is "2" )
exten => 2,n,Goto(record_intro,s,1)
exten => 3,1,Verbose(Choice is "3" )
exten => 3,n,Goto(hangup,s,1)
exten => t,1,Goto(main_menu,s,1)
exten => i,1,Goto(main_menu,s,1)
[play_menu]
; Menu listing clips to play
exten => s,1,Verbose(Context: play_menu)
exten => s,n,Verbose(Press 1 to play a XXX video clip, 2 to play a YYY clip)
exten => s,n,WoomeraBackground(${FILE_LOC}/${CONTEXT},${FILE_LOC}/${CONTEXT})
exten => s,n,WaitExten(20)
exten => 1,1,Verbose(Choice is "1" )
exten => 1,n,Goto(video_clip_callcenter,s,1)
exten => 2,1,Verbose(Choice is "2" )
exten => 2,n,Goto(RaceFootage,s,1)
exten => t,1,Goto(play_menu,s,1)
exten => i,1,Goto(play_menu,s,1)
[record_intro]
; Instructions on how to record. Then, go to context to make recording
exten => s,1,Verbose(Context: record_intro)
exten => s,n,Verbose(Press 1 to start the video recording, # to stop and replay)
exten => s,n,WoomeraBackground(${FILE_LOC}/${CONTEXT},${FILE_LOC}/${CONTEXT})
exten => s,n,WaitExten(20)
exten => 1,1,Verbose(Choice is "1" )
exten => 1,n,Goto(record_replay,s,1)
exten => t,1,Goto(record_intro,s,1)
exten => i,1,Goto(record_intro,s,1)
[video_clip_callcenter]
exten => s,1,Set(COMMENT=Play call center video clip)
exten => s,n,Set(NEXT_CONTEXT=main_menu)
exten => s,n,Gosub(play_nointerrupt,s,1(${CONTEXT},${NEXT_CONTEXT},${COMMENT}))
exten => s,n,Goto(main_menu,s,1)
[RaceFootage]
exten => s,1,Set(COMMENT=Play NASCAR video clip)
exten => s,n,Set(NEXT_CONTEXT=main_menu)
exten => s,n,Gosub(play_nointerrupt,s,1(${CONTEXT},${NEXT_CONTEXT},${COMMENT}))
exten => s,n,Goto(main_menu,s,1)
[record_replay]
exten => s,1,Verbose(Context: record_replay)
exten => s,n,Set(NEXT_CONTEXT=main_menu)
exten => s,n,WoomeraRecord(${FILE_LOC}/audio_recording,${FILE_LOC}/video_recording)
exten => s,n,Set(COMMENT=Replaying recording just made)
exten => s,n,WoomeraPlayback(${FILE_LOC}/audio_recording,${FILE_LOC}/video_recording)
exten => s,n,Goto(main_menu,s,1)
[hangup]
exten => s,1,Verbose(Context: hangup)
exten => s,n,Wait(1)
exten => s,n,Hangup()
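The menu contexts in the dialplan above ([main_menu], [play_menu], [record_intro]) all follow one pattern: play the menu clip with WoomeraBackground, wait for a digit, jump to the chosen context, and return to the same menu on timeout or invalid input. As a minimal sketch, the Python helper below renders such a context as dialplan text from a storyboard-style mapping of digits to target contexts. The helper name and its parameters are hypothetical and not part of DiaStar or Asterisk. It leaves out the descriptive Verbose prompt lines.

```python
def render_menu_context(name, choices, timeout=20):
    """Render a menu context in the style of the verification demo dialplan.

    name    -- context name; also the media file name via ${CONTEXT}
    choices -- mapping of DTMF digit (as a string) to target context name
    timeout -- seconds to wait for a digit before replaying the menu
    """
    lines = [
        f"[{name}]",
        f"exten => s,1,Verbose(Context: {name})",
        # Plain string (not an f-string) so the dialplan variables pass through untouched.
        "exten => s,n,WoomeraBackground(${FILE_LOC}/${CONTEXT},${FILE_LOC}/${CONTEXT})",
        f"exten => s,n,WaitExten({timeout})",
    ]
    for digit, target in choices.items():
        lines.append(f'exten => {digit},1,Verbose(Choice is "{digit}" )')
        lines.append(f"exten => {digit},n,Goto({target},s,1)")
    # Timeout (t) and invalid input (i) both replay the same menu.
    lines.append(f"exten => t,1,Goto({name},s,1)")
    lines.append(f"exten => i,1,Goto({name},s,1)")
    return "\n".join(lines)


print(render_menu_context("main_menu", {"1": "play_menu", "2": "record_intro", "3": "hangup"}))
```

The generated text can be pasted into the dialplan file and reviewed by hand. Each target context still needs to be written separately.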
Setting Up DiaStar for Multimedia
Setting up DiaStar for IVVR is one of the simplest configurations, as it needs no extra PSTN hardware. A DiaStar license is needed and it is tied to the system's MAC address. See the DiaStar Licenses section for details.
Once DiaStar is configured and licensed, a SIP-related configuration file is generated that specifies all the devices available for processing IVVR calls. See the Configuring the DiaStar Server section of the User's Guide for further information.
In addition, the config file contains all the parameters needed for the SIP configuration of the server:
- IP address
- SIP port
- SIP registrar and outbound proxy
- Security - SIP Digest Authentication
See the SIP Configuration section of the User's Guide for details.
The DiaStar ISO installation CD also contains a verification demo that gives users a single-port multimedia license. It contains examples of the three extensions to the Asterisk dialplan that can be used for IVVR.
Two full IVVR demos, including dialplans and multimedia files, are available from DiaStar SVN. Check them out as follows:
> svn checkout svn://svn.projectdiastar.org/tools/trunk/demos


