
Wednesday, December 2, 2009

The MPEG Standard

Finally, we come to the heart of the matter: the MPEG (Motion Picture Experts Group) standards. These are the main algorithms used to compress videos and have been international standards since 1993. Because movies contain both images and sound, MPEG can compress both audio and video. We have already examined audio compression and still image compression, so let us now examine video compression.
The first standard to be finalized was MPEG-1 (International Standard 11172). Its goal was to produce video-recorder-quality output (352 × 240 for NTSC) using a bit rate of 1.2 Mbps. A 352 × 240 image with 24 bits/pixel and 25 frames/sec requires 50.7 Mbps, so getting it down to 1.2 Mbps is not entirely trivial. A factor of 40 compression is needed. MPEG-1 can be transmitted over twisted pair transmission lines for modest distances. MPEG-1 is also used for storing movies on CD-ROM.
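To make these figures concrete, the following short Python calculation (ours, purely illustrative) reproduces the arithmetic:
    # Raw bit rate of the uncompressed MPEG-1 source format and the compression
    # factor needed to reach the 1.2-Mbps target quoted above.
    width, height = 352, 240          # pixels per frame
    bits_per_pixel = 24               # 8 bits for each of R, G, B
    frames_per_sec = 25

    raw_bps = width * height * bits_per_pixel * frames_per_sec
    target_bps = 1.2e6                # the MPEG-1 target bit rate

    print(f"Uncompressed: {raw_bps / 1e6:.1f} Mbps")                   # about 50.7 Mbps
    print(f"Compression factor needed: {raw_bps / target_bps:.0f}x")   # roughly 40x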
The next standard in the MPEG family was MPEG-2 (International Standard 13818), which was originally designed for compressing broadcast-quality video into 4 to 6 Mbps, so it could fit in an NTSC or PAL broadcast channel. Later, MPEG-2 was expanded to support higher resolutions, including HDTV. It is very common now, as it forms the basis for DVD and digital satellite television.
The basic principles of MPEG-1 and MPEG-2 are similar, but the details are different. To a first approximation, MPEG-2 is a superset of MPEG-1, with additional features, frame formats, and encoding options. We will first discuss MPEG-1, then MPEG-2. MPEG-1 has three parts: audio, video, and system, which integrates the other two, as shown in Fig. 7-20. The audio and video encoders work independently, which raises the issue of how the two streams get synchronized at the receiver.

The JPEG Standard

A video is just a sequence of images (plus sound). If we could find a good algorithm for encoding a single image, this algorithm could be applied to each image in succession to achieve video compression. Good still image compression algorithms exist, so let us start our study of video compression there. The JPEG (Joint Photographic Experts Group) standard for compressing continuous-tone still pictures (e.g., photographs) was developed by photographic experts working under the joint auspices of ITU, ISO, and IEC, another standards body. It is important
for multimedia because, to a first approximation, the multimedia standard for moving pictures, MPEG, is just the JPEG encoding of each frame separately, plus some extra features for interframe compression and motion detection. JPEG is defined in International Standard 10918.
JPEG has four modes and many options. It is more like a shopping list than a single algorithm. For our purposes, though, only the lossy sequential mode is relevant, and that one is illustrated in Fig. 7-15. Furthermore, we will concentrate on the way JPEG is normally used to encode 24-bit RGB video images and will leave out some of the minor details for the sake of simplicity.
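The core of lossy sequential JPEG is an 8 × 8 discrete cosine transform of each block, followed by quantization. The sketch below is our simplification of that step (it uses a single uniform quantization step instead of JPEG's real quantization tables) and is meant only to show where the compression comes from:
    import numpy as np

    def dct_matrix(n=8):
        # Orthonormal DCT-II basis matrix; JPEG applies this transform to 8 x 8 blocks.
        k = np.arange(n)
        c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
        c[0, :] = np.sqrt(1.0 / n)
        return c

    C = dct_matrix()
    # One 8 x 8 luminance block, level-shifted to be centered on zero as JPEG does.
    block = np.random.randint(0, 256, (8, 8)).astype(float) - 128

    coeffs = C @ block @ C.T   # 2-D DCT: most of the energy ends up near the top-left corner
    step = 16                  # illustrative uniform quantization step (real JPEG uses tables)
    quantized = np.round(coeffs / step)
    print(int((quantized == 0).sum()), "of 64 coefficients quantize to zero")
The many coefficients that quantize to zero are what the later run-length and Huffman stages compress so effectively.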

Digital Systems

The simplest representation of digital video is a sequence of frames, each consisting of a rectangular grid of picture elements, or pixels. Each pixel can be a single bit, to represent either black or white. The quality of such a system is similar to what you get by sending a color photograph by fax—awful. (Try it if you can; otherwise photocopy a color photograph on a copying machine that does not rasterize.)
The next step up is to use 8 bits per pixel to represent 256 gray levels. This scheme gives high-quality black-and-white video. For color video, good systems use 8 bits for each of the RGB colors, although nearly all systems mix these into composite video for transmission. While using 24 bits per pixel limits the number of colors to about 16 million, the human eye cannot even distinguish this many colors, let alone more. Digital color images are produced using three scanning beams, one per color. The geometry is the same as for the analog system of Fig. 7-14 except that the continuous scan lines are now replaced by neat rows of discrete pixels.
To produce smooth motion, digital video, like analog video, must display at least 25 frames/sec. However, since good-quality computer monitors often rescan the screen from images stored in memory at 75 times per second or more, interlacing is not needed and consequently is not normally used. Just repainting (i.e., redrawing) the same frame three times in a row is enough to eliminate flicker. In other words, smoothness of motion is determined by the number of different images per second, whereas flicker is determined by the number of times the screen is painted per second. These two parameters are different. A still image painted at 20 frames/sec will not show jerky motion, but it will flicker because one frame will decay from the retina before the next one appears. A movie with 20 different frames per second, each of which is painted four times in a row, will not flicker, but the motion will appear jerky.
The significance of these two parameters becomes clear when we consider the bandwidth required for transmitting digital video over a network. Current computer monitors mostly use the 4:3 aspect ratio so they can use inexpensive, mass-produced picture tubes designed for the consumer television market. Common configurations are 1024 × 768, 1280 × 960, and 1600 × 1200. Even the smallest of these with 24 bits per pixel and 25 frames/sec needs to be fed at 472 Mbps. It would take a SONET OC-12 carrier to manage this, and running an OC-12 SONET carrier into everyone’s house is not exactly on the agenda. Doubling this rate to avoid flicker is even less attractive. A better solution is to transmit 25 frames/sec and have the computer store each one and paint it twice. Broadcast television does not use this strategy because television sets do not have memory. And even if they did have memory, analog signals cannot be stored in RAM without conversion to digital form first, which requires extra hardware. As a consequence, interlacing is needed for broadcast television but not for digital video.
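As a quick back-of-the-envelope check (our own calculation, not from the text), the feed rates for the three configurations mentioned above work out as follows:
    # Feed rate for each common monitor configuration at 24 bits/pixel and 25 frames/sec.
    for width, height in [(1024, 768), (1280, 960), (1600, 1200)]:
        bps = width * height * 24 * 25
        print(f"{width} x {height}: {bps / 1e6:.0f} Mbps")
    # 1024 x 768 alone needs about 472 Mbps, hence the OC-12 carrier mentioned above.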

Analog Systems

To understand video, it is best to start with simple, old-fashioned black-and-white television. To represent the two-dimensional image in front of it as a one-dimensional voltage as a function of time, the camera scans an electron beam rapidly across the image and slowly down it, recording the light intensity as it goes. At the end of the scan, called a frame, the beam retraces. This intensity as a function of time is broadcast, and receivers repeat the scanning process to reconstruct the image. The exact scanning parameters vary from country to country. The system used in North and South America and Japan has 525 scan lines, a horizontal-to-vertical aspect ratio of 4:3, and 30 frames/sec. The European system has 625 scan lines, the same aspect ratio of 4:3, and 25 frames/sec. In both systems, the top few and bottom few lines are not displayed (to approximate a rectangular image on the original round CRTs). Only 483 of the 525 NTSC scan lines (and 576 of the 625 PAL/SECAM scan lines) are displayed. The beam is turned off during the vertical retrace, so many stations (especially in Europe) use this time to broadcast TeleText (text pages containing news, weather, sports, stock prices, etc.).
While 25 frames/sec is enough to capture smooth motion, at that frame rate many people, especially older ones, will perceive the image to flicker (because the old image has faded off the retina before the new one appears). Rather than increase the frame rate, which would require using more scarce bandwidth, a different approach is taken. Instead of the scan lines being displayed in order, first all the odd scan lines are displayed, then the even ones are displayed. Each of these half frames is called a field. Experiments have shown that although people notice flicker at 25 frames/sec, they do not notice it at 50 fields/sec. This technique is called interlacing. Noninterlaced television or video is called progressive.
Note that movies run at 24 fps, but each frame is fully visible for 1/24 sec.

Comparison of H.323 and SIP

H.323 and SIP have many similarities but also some differences. Both allow two-party and multiparty calls using both computers and telephones as end points. Both support parameter negotiation, encryption, and the RTP/RTCP protocols. Although the feature sets are similar, the two protocols differ widely in philosophy. H.323 is a typical, heavyweight, telephone-industry standard, specifying
the complete protocol stack and defining precisely what is allowed and what is forbidden. This approach leads to very well defined protocols in each layer, easing the task of interoperability. The price paid is a large, complex, and rigid standard that is difficult to adapt to future applications.
In contrast, SIP is a typical Internet protocol that works by exchanging short lines of ASCII text. It is a lightweight module that interworks well with other Internet protocols but less well with existing telephone system signaling protocols. Because the IETF model of voice over IP is highly modular, it is flexible and can be adapted to new applications easily. The downside is potential interoperability problems, although these are addressed by frequent meetings where different
implementers get together to test their systems.
Voice over IP is an up-and-coming topic. Consequently, there are several books on the subject already. The May/June 2002 issue of Internet Computing has several articles on this topic.

SIP—The Session Initiation Protocol

H.323 was designed by ITU. Many people in the Internet community saw it as a typical telco product: large, complex, and inflexible. Consequently, IETF set up a committee to design a simpler and more modular way to do voice over IP. The major result to date is the SIP (Session Initiation Protocol), which is described in RFC 3261. This protocol describes how to set up Internet telephone calls, video conferences, and other multimedia connections. Unlike H.323, which is a complete protocol suite, SIP is a single module, but it has been designed to interwork well with existing Internet applications. For example, it defines telephone numbers as URLs, so that Web pages can contain them, allowing a click on a link to initiate a telephone call (the same way the mailto scheme allows a click on a link to bring up a program to send an e-mail message).
SIP can establish two-party sessions (ordinary telephone calls), multiparty sessions (where everyone can hear and speak), and multicast sessions (one sender, many receivers). The sessions may contain audio, video, or data, the latter being useful for multiplayer real-time games, for example. SIP just handles setup, management, and termination of sessions. Other protocols, such as RTP/RTCP, are used for data transport. SIP is an application-layer protocol and can run over UDP or TCP.
SIP supports a variety of services, including locating the callee (who may not be at his home machine) and determining the callee’s capabilities, as well as handling the mechanics of call setup and termination. In the simplest case, SIP sets up a session from the caller’s computer to the callee’s computer, so we will examine that case first.
Telephone numbers in SIP are represented as URLs using the sip scheme, for example, sip:ilse@cs.university.edu for a user named Ilse at the host specified by the DNS name cs.university.edu. SIP URLs may also contain IPv4 addresses, IPv6 addresses, or actual telephone numbers.
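As an illustration, a minimal SIP INVITE for the URL above might look roughly like the text built below. The caller's identities, Call-ID, and Via branch are made-up example values, and a real request would normally carry an SDP body describing the media session:
    # Build an illustrative (not complete or validated) SIP INVITE request.
    invite = "\r\n".join([
        "INVITE sip:ilse@cs.university.edu SIP/2.0",
        "Via: SIP/2.0/UDP caller.example.com:5060;branch=z9hG4bKexample",
        "Max-Forwards: 70",
        "To: Ilse <sip:ilse@cs.university.edu>",
        "From: Caller <sip:caller@example.com>;tag=12345",
        "Call-ID: 7d1e2a9b@caller.example.com",
        "CSeq: 1 INVITE",
        "Contact: <sip:caller@caller.example.com>",
        "Content-Length: 0",
        "", "",                      # blank line ends the header section
    ])
    print(invite)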

Saturday, November 7, 2009

H.323

One thing that was clear to everyone from the start was that if each vendor designed its own protocol stack, the system would never work. To avoid this problem, a number of interested parties got together under ITU auspices to work out standards. In 1996 ITU issued recommendation H.323, entitled ‘‘Visual Telephone Systems and Equipment for Local Area Networks Which Provide a Non-Guaranteed Quality of Service.’’ Only the telephone industry would think of such a name. The recommendation was revised in 1998, and this revised H.323 was the basis for the first widespread Internet telephony systems. H.323 is more of an architectural overview of Internet telephony than a specific protocol. It references a large number of specific protocols for speech coding, call setup, signaling, data transport, and other areas rather than specifying these things itself. The general model is as follows. At the center is a gateway that connects the Internet to the telephone network. It speaks the H.323 protocols on the Internet side and the PSTN protocols on the telephone side. The communicating devices are called terminals. A LAN may have a gatekeeper, which controls the end points under its jurisdiction, called a zone.

Voice over IP

Once upon a time, the public switched telephone system was primarily used for voice traffic with a little bit of data traffic here and there. But the data traffic grew and grew, and by 1999, the number of data bits moved equaled the number of voice bits (since voice is in PCM on the trunks, it can be measured in bits/sec).
By 2002, the volume of data traffic was an order of magnitude more than the volume of voice traffic and still growing exponentially, with voice traffic being almost flat (5% growth per year).
As a consequence of these numbers, many packet-switching network operators suddenly became interested in carrying voice over their data networks. The amount of additional bandwidth required for voice is minuscule since the packet networks are dimensioned for the data traffic. However, the average person’s phone bill is probably larger than his Internet bill, so the data network operators saw Internet telephony as a way to earn a large amount of additional money without having to put any new fiber in the ground. Thus Internet telephony (also known as voice over IP) was born.

Streaming Audio

Let us now move from the technology of digital audio to three of its network applications. Our first one is streaming audio, that is, listening to sound over the Internet. This is also called music on demand. In the next two, we will look at Internet radio and voice over IP, respectively.
The Internet is full of music Web sites, many of which list song titles that users can click on to play the songs. Some of these are free sites (e.g., new bands looking for publicity); others require payment in return for music, although these often offer some free samples as well (e.g., the first 15 seconds of a song). The process starts when the user clicks on a song. Then the browser goes into action. Step 1 is for it to establish a TCP connection to the Web server to which the song is hyperlinked. Step 2 is to send over a GET request in HTTP to request the song. Next (steps 3 and 4), the server fetches the song (which is just a file in MP3 or some other format) from the disk and sends it back to the browser. If the file is larger than the server’s memory, it may fetch and send the music a block at a time.
Using the MIME type, for example, audio/mp3 (or the file extension), the browser looks up how it is supposed to display the file. Normally, there will be a helper application such as RealOne Player, Windows Media Player, or Winamp, associated with this type of file. Since the usual way for the browser to communicate with a helper is to write the content to a scratch file, it will save the entire music file as a scratch file on the disk (step 5) first. Then it will start the media player and pass it the name of the scratch file. In step 6, the media player starts fetching and playing the music, block by block. In principle, this approach is completely correct and will play the music. The only trouble is that the entire song must be transmitted over the network before
the music starts. If the song is 4 MB (a typical size for an MP3 song) and the modem is 56 kbps, the user will be greeted by almost 10 minutes of silence while the song is being downloaded. Not all music lovers like this idea. Especially since the next song will also start with 10 minutes of download time, and the one after that as well.
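A minimal sketch of this download-then-play sequence, using only the Python standard library (the URL and the player command are placeholders, not taken from the text):
    import subprocess
    import tempfile
    import urllib.request

    SONG_URL = "http://music.example.com/song.mp3"   # placeholder URL

    # Steps 1-5: fetch the whole file, block by block, into a scratch file on disk.
    with urllib.request.urlopen(SONG_URL) as response, \
         tempfile.NamedTemporaryFile(suffix=".mp3", delete=False) as scratch:
        while True:
            block = response.read(64 * 1024)
            if not block:
                break
            scratch.write(block)

    # Step 6: only now hand the scratch file to a helper application.
    # Nothing plays until the entire song has arrived, which is the problem
    # described above. "mpv" is just an example; any media player would do.
    subprocess.run(["mpv", scratch.name])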

The MBone—The Multicast Backbone

While all these industries are making great—and highly publicized—plans for future (inter)national digital video on demand, the Internet community has been quietly implementing its own digital multimedia system, MBone (Multicast Backbone). In this section we will give a brief overview of what it is and how it works.
MBone can be thought of as Internet television. Unlike video on demand, where the emphasis is on calling up and viewing precompressed movies stored on a server, MBone is used for broadcasting live video in digital form all over the world via the Internet. It has been operational since early 1992. Many scientific conferences, especially IETF meetings, have been broadcast, as well as newsworthy scientific events, such as space shuttle launches. A Rolling Stones concert
was once broadcast over MBone, as were portions of the Cannes Film Festival. Whether this qualifies as a newsworthy scientific event is arguable. Technically, MBone is a virtual overlay network on top of the Internet. It consists of multicast-capable islands connected by tunnels, as illustrated in Fig. 7-25.
In the figure, MBone consists of six islands, A through F, connected by seven tunnels. Each island (typically a LAN or group of interconnected LANs) supports hardware multicast to its hosts. The tunnels propagate MBone packets between the islands. Some day in the future, when all the routers are capable of handling multicast traffic directly, this superstructure will no longer be needed, but for the moment, it does the job.
Each island contains one or more special routers called mrouters (multicast routers). Some of these are actually normal routers, but most are just UNIX workstations running special user-level software (but as the root). The mrouters are logically connected by tunnels. MBone packets are encapsulated within IP packets and sent as regular unicast packets to the destination mrouter’s IP address. Tunnels are configured manually. Usually, a tunnel runs over a path for
which a physical connection exists, but this is not a requirement. If, by accident, the physical path underlying a tunnel goes down, the mrouters using the tunnel will not even notice it, since the Internet will automatically reroute all the IP traffic between them via other lines. When a new island appears and wishes to join MBone, such as G in Fig. 7-25, its administrator sends a message announcing its existence to the MBone mailing list. The administrators of nearby sites then contact him to arrange to set up tunnels. Sometimes existing tunnels are reshuffled to take advantage of the new island to optimize the topology. After all, tunnels have no physical existence. They are defined by tables in the mrouters and can be added, deleted, or moved simply by changing these tables. Typically, each country on MBone has a backbone,
with regional islands attached to it. Normally, MBone is configured with one or two tunnels crossing the Atlantic and Pacific oceans, making MBone global in scale.
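Since tunnels exist only as table entries, reconfiguring MBone amounts to editing those tables. The toy sketch below (illustrative data structures only, not actual mrouter software) makes the point:
    # Toy model of mrouter tunnel tables: each island's mrouter keeps the unicast IP
    # addresses of the mrouters it tunnels to. All addresses here are made up.
    tunnels = {
        "A": {"B": "193.1.1.1", "C": "194.2.2.2"},
        "B": {"A": "192.0.0.1", "D": "195.3.3.3"},
    }

    def add_tunnel(island1, addr1, island2, addr2):
        """Join a new island (such as G in the text) by adding a tunnel to both tables."""
        tunnels.setdefault(island1, {})[island2] = addr2
        tunnels.setdefault(island2, {})[island1] = addr1

    add_tunnel("B", "192.0.0.1", "G", "196.4.4.4")   # the new island G appears
    print(tunnels["G"])                              # tunnels can be moved or deleted the same way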

The Distribution Network

The distribution network is the set of switches and lines between the source and destination. As we saw in Fig. 7-22, it consists of a backbone, connected to a local distribution network. Usually, the backbone is switched and the local distribution network is not.
The main requirement imposed on the backbone is high bandwidth. It used to be that low jitter was also a requirement, but with even the smallest PC now able to buffer 10 sec of high-quality MPEG-2 video, low jitter is not a requirement anymore.
Local distribution is highly chaotic, with different companies trying out different networks in different regions. Telephone companies, cable TV companies, and new entrants, such as power companies, are all convinced that whoever gets there first will be the big winner. Consequently, we are now seeing a proliferation of technologies being installed. In Japan, some sewer companies are in the Internet business, arguing that they have the biggest pipe of all into everyone’s house (they run an optical fiber through it, but have to be very careful about precisely where it emerges). The four main local distribution schemes for video on demand go by the acronyms ADSL, FTTC, FTTH, and HFC. We will now explain each of these in turn. ADSL is the telephone industry’s first entrant in the local distribution sweepstakes. We studied ADSL in Chap. 2 and will not repeat that material here. The idea is that virtually every house in the United States, Europe, and Japan already has a copper twisted pair going into it (for analog telephone service). If these wires could be used for video on demand, the telephone companies could clean up.
The problem, of course, is that these wires cannot support even MPEG-1 over their typical 10-km length, let alone MPEG-2. High-resolution, full-color, full-motion video needs 4–8 Mbps, depending on the quality desired. ADSL is not really fast enough except for very short local loops.
The second telephone company design is FTTC (Fiber To The Curb). In FTTC, the telephone company runs optical fiber from the end office into each residential neighborhood, terminating in a device called an ONU (Optical Network Unit). On the order of 16 copper local loops can terminate in an ONU.
These loops are now so short that it is possible to run full-duplex T1 or T2 over them, allowing MPEG-1 and MPEG-2 movies, respectively. In addition, videoconferencing for home workers and small businesses is now possible because FTTC is symmetric.
The third telephone company solution is to run fiber into everyone’s house. It is called FTTH (Fiber To The Home). In this scheme, everyone can have an OC-1, OC-3, or even higher carrier if that is required. FTTH is very expensive and will not happen for years but clearly will open a vast range of new possibilities when it finally happens. In Fig. 7-7 we saw how everybody could operate his or her own radio station. What do you think about each member of the family operating his or her own personal television station? ADSL, FTTC, and FTTH are all point-to-point local distribution networks, which is not surprising given how the current telephone system is organized.
A completely different approach is HFC (Hybrid Fiber Coax), which is the preferred solution currently being installed by cable TV providers. It is illustrated in Fig. 2-47(a). The story goes something like this. The current 300- to 450-MHz coax cables are being replaced by 750-MHz coax cables, upgrading the capacity from 50 to 75 6-MHz channels to 125 6-MHz channels. Seventy-five of the 125 channels will be used for transmitting analog television.
The 50 new channels will each be modulated using QAM-256, which provides about 40 Mbps per channel, giving a total of 2 Gbps of new bandwidth. The headends will be moved deeper into the neighborhoods so that each cable runs past only 500 houses. Simple division shows that each house can then be allocated a dedicated 4-Mbps channel, which can handle an MPEG-2 movie.
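The arithmetic behind these figures is simple enough to check directly (our own illustrative calculation):
    # HFC capacity division as described above: 50 new digital channels, QAM-256 at
    # roughly 40 Mbps each, shared by the ~500 houses a cable segment passes.
    new_channels = 125 - 75            # 125 channels total, 75 kept for analog TV
    mbps_per_channel = 40              # QAM-256 in a 6-MHz channel
    houses_per_segment = 500

    total_mbps = new_channels * mbps_per_channel
    print(total_mbps / 1000, "Gbps of new bandwidth")          # 2.0 Gbps
    print(total_mbps / houses_per_segment, "Mbps per house")   # 4.0 Mbps, enough for MPEG-2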
While this sounds wonderful, it does require the cable providers to replace all the existing cables with 750-MHz coax, install new headends, and remove all the one-way amplifiers—in short, replace the entire cable TV system. Consequently, the amount of new infrastructure here is comparable to what the telephone companies need for FTTC. In both cases the local network provider has to run fiber into residential neighborhoods. Again, in both cases, the fiber terminates at an optoelectrical converter. In FTTC, the final segment is a point-to-point local loop using twisted pairs. In HFC, the final segment is a shared coaxial cable. Technically, these two systems are not really as different as their respective proponents often make out.

Video on Demand

Video on demand is sometimes compared to an electronic video rental store. The user (customer) selects any one of a large number of available videos and takes it home to view. Only with video on demand, the selection is made at home using the television set’s remote control, and the video starts immediately. No trip to the store is needed. Needless to say, implementing video on demand is a wee bit more complicated than describing it. In this section, we will give an overview of the basic ideas and their implementation.
Is video on demand really like renting a video, or is it more like picking a movie to watch from a 500-channel cable system? The answer has important technical implications. In particular, video rental users are used to the idea of being able to stop a video, make a quick trip to the kitchen or bathroom, and then resume from where the video stopped. Television viewers do not expect to put programs on pause.
If video on demand is going to compete successfully with video rental stores, it may be necessary to allow users to stop, start, and rewind videos at will. Giving users this ability virtually forces the video provider to transmit a separate copy to each one.
On the other hand, if video on demand is seen more as advanced television, then it may be sufficient to have the video provider start each popular video, say, every 10 minutes, and run these nonstop. A user wanting to see a popular video may have to wait up to 10 minutes for it to start. Although pause/resume is not possible here, a viewer returning to the living room after a short break can switch to another channel showing the same video but 10 minutes behind. Some material will be repeated, but nothing will be missed. This scheme is called near video on demand. It offers the potential for much lower cost, because the same feed from the video server can go to many users at once. The difference between video on demand and near video on demand is similar to the difference between driving your own car and taking the bus.
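As a small illustration of near video on demand, the helper below (hypothetical, not from the text) works out which staggered copy a returning viewer should join after a break and how much material gets repeated:
    import math

    def streams_to_skip_back(break_minutes, stagger=10):
        """With copies starting every `stagger` minutes, return how many streams
        'behind' to switch to so nothing is missed, plus the minutes re-watched."""
        hop = math.ceil(break_minutes / stagger)      # each later copy runs `stagger` min behind
        repeated = hop * stagger - break_minutes
        return hop, repeated

    hop, repeated = streams_to_skip_back(4)
    print(f"switch {hop} stream(s) back; about {repeated} minutes will be repeated")
    # A 4-minute break: join the copy 10 minutes behind and re-watch about 6 minutes.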
Watching movies on (near) demand is but one of a vast array of potential new services possible once wideband networking is available. The general model that many people use is illustrated in Fig. 7-22. Here we see a high-bandwidth (national or international) wide area backbone network at the center of the system.
Connected to it are thousands of local distribution networks, such as cable TV or telephone company distribution systems. The local distribution systems reach into people’s houses, where they terminate in set-top boxes, which are, in fact, powerful, specialized personal computers.

Video Compression

It should be obvious by now that transmitting uncompressed video is completely out of the question. The only hope is that massive compression is possible. Fortunately, a large body of research over the past few decades has led to many compression techniques and algorithms that make video transmission feasible. In this section we will study how video compression is accomplished.
All compression systems require two algorithms: one for compressing the data at the source, and another for decompressing it at the destination. In the literature, these algorithms are referred to as the encoding and decoding algorithms, respectively. We will use this terminology here, too.
These algorithms exhibit certain asymmetries that are important to understand. First, for many applications, a multimedia document, say, a movie, will be encoded only once (when it is stored on the multimedia server) but will be decoded thousands of times (when it is viewed by customers). This asymmetry means that it is acceptable for the encoding algorithm to be slow and require expensive hardware provided that the decoding algorithm is fast and does not require expensive hardware. After all, the operator of a multimedia server might be quite willing to rent a parallel supercomputer for a few weeks to encode its entire video library, but requiring consumers to rent a supercomputer for 2 hours to view a video is not likely to be a big success. Many practical compression systems go to great lengths to make decoding fast and simple, even at the price of making encoding slow and complicated. On the other hand, for real-time multimedia, such as video conferencing, slow encoding is unacceptable. Encoding must happen on-the-fly, in real time. Consequently, real-time multimedia uses different algorithms or parameters than storing videos on disk, often with appreciably less compression.
A second asymmetry is that the encode/decode process need not be invertible. That is, when compressing a file, transmitting it, and then decompressing it, the user expects to get the original back, accurate down to the last bit. With multimedia, this requirement does not exist. It is usually acceptable to have the video signal after encoding and then decoding be slightly different from the original. When the decoded output is not exactly equal to the original input, the system is said to be lossy. If the input and output are identical, the system is lossless. Lossy systems are important because accepting a small amount of information loss can give a huge payoff in terms of the compression ratio possible.


Internet Radio

Once it became possible to stream audio over the Internet, commercial radio stations got the idea of broadcasting their content over the Internet as well as over the air. Not so long after that, college radio stations started putting their signal out over the Internet. Then college students started their own radio stations. With current technology, virtually anyone can start a radio station. The whole area of Internet radio is very new and in a state of flux, but it is worth saying a little bit about.
There are two general approaches to Internet radio. In the first one, the programs are prerecorded and stored on disk. Listeners can connect to the radio station’s archives and pull up any program and download it for listening. In fact, this is exactly the same as the streaming audio we just discussed. It is also possible to store each program just after it is broadcast live, so the archive is only running, say, half an hour, or less behind the live feed. The advantages of this approach are that it is easy to do, all the techniques we have discussed work here too, and listeners can pick and choose among all the programs in the archive. The other approach is to broadcast live over the Internet. Some stations broadcast over the air and over the Internet simultaneously, but there are increasingly many radio stations that are Internet only. Some of the techniques that are applicable to streaming audio are also applicable to live Internet radio, but there are also some key differences.
One point that is the same is the need for buffering on the user side to smooth out jitter. By collecting 10 or 15 seconds worth of radio before starting the playback, the audio can be kept going smoothly even in the face of substantial jitter over the network. As long as all the packets arrive before they are needed, it does not matter when they arrived.
One key difference is that streaming audio can be pushed out at a rate greater than the playback rate since the receiver can stop it when the high-water mark is hit. Potentially, this gives it the time to retransmit lost packets, although this strategy is not commonly used. In contrast, live radio is always broadcast at exactly the rate it is generated and played back.
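A sketch of the receiver-side buffering just described, with illustrative low-water and high-water thresholds (the class and its names are ours, not any player's actual API):
    # Illustrative playout buffer: start playing only after `low_water` seconds are
    # queued; ask the sender to pause once `high_water` seconds are queued.
    class PlayoutBuffer:
        def __init__(self, low_water=10.0, high_water=30.0):
            self.low_water = low_water
            self.high_water = high_water
            self.buffered = 0.0        # seconds of audio currently queued
            self.playing = False

        def on_packet(self, seconds_of_audio):
            self.buffered += seconds_of_audio
            if not self.playing and self.buffered >= self.low_water:
                self.playing = True                    # enough margin to absorb jitter
            return self.buffered < self.high_water     # False means: tell sender to pause

        def on_playback_tick(self, seconds_played):
            if self.playing:
                self.buffered = max(0.0, self.buffered - seconds_played)

    buf = PlayoutBuffer()
    keep_sending = buf.on_packet(0.5)   # half a second of audio has arrived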

Audio Compression

CD-quality audio requires a transmission bandwidth of 1.411 Mbps, as we just saw. Clearly, substantial compression is needed to make transmission over the Internet practical. For this reason, various audio compression algorithms have been developed. Probably the most popular one is MPEG audio, which has three layers (variants), of which MP3 (MPEG audio layer 3) is the most powerful and best known. Large amounts of music in MP3 format are available on the Internet, not all of it legal, which has resulted in numerous lawsuits from the artists and copyright owners. MP3 belongs to the audio portion of the MPEG video compression standard. We will discuss video compression later in this chapter; let us look at audio compression now.
Audio compression can be done in one of two ways. In waveform coding, the signal is transformed mathematically by a Fourier transform into its frequency components. The amplitude of each component is then encoded in a minimal way. The goal is to reproduce the waveform accurately at the other end in as few bits as possible.
The other way, perceptual coding, exploits certain flaws in the human auditory system to encode a signal in such a way that it sounds the same to a human listener, even if it looks quite different on an oscilloscope. Perceptual coding is based on the science of psychoacoustics—how people perceive sound. MP3 is based on perceptual coding.
The key property of perceptual coding is that some sounds can mask other sounds. Imagine you are broadcasting a live flute concert on a warm summer day. Then all of a sudden, a crew of workmen nearby turn on their jackhammers and start tearing up the street. No one can hear the flute any more. Its sounds have been masked by the jackhammers. For transmission purposes, it is now sufficient to encode just the frequency band used by the jackhammers because the listeners cannot hear the flute anyway. This is called frequency masking—the ability of a loud sound in one frequency band to hide a softer sound in another frequency band that would have been audible in the absence of the loud sound. In fact, even after the jackhammers stop, the flute will be inaudible for a short period of time because the ear turns down its gain when they start and it takes a finite time to turn it up again. This effect is called temporal masking.
To make these effects more quantitative, imagine experiment 1. A person in a quiet room puts on headphones connected to a computer’s sound card. The computer generates a pure sine wave at 100 Hz at low, but gradually increasing power. The person is instructed to strike a key when she hears the tone. The computer records the current power level and then repeats the experiment at 200 Hz, 300 Hz, and all the other frequencies up to the limit of human hearing. When averaged over many people, a log-log graph of how much power it takes for a tone to
be audible looks like that of Fig. 7-2(a). A direct consequence of this curve is that it is never necessary to encode any frequencies whose power falls below the threshold of audibility. For example, if the power at 100 Hz were 20 dB, it could be omitted from the output with no perceptible loss of quality because 20 dB at 100 Hz falls below the level of audibility.
Now consider experiment 2. The computer runs experiment 1 again, but this time with a constant-amplitude sine wave at, say, 150 Hz, superimposed on the test frequency. What we discover is that the threshold of audibility for frequencies near 150 Hz is raised. The consequence of this new observation is that by keeping track of which signals are being masked by more powerful signals in nearby frequency bands, we can omit more and more frequencies in the encoded signal, saving bits. A signal that is masked in this way can be left out of the output entirely, and no one will be able to hear the difference. Even after a powerful signal stops in some frequency band, knowledge of its temporal masking properties allows us to continue to omit the masked frequencies for some time interval as the ear recovers. The essence of MP3 is to Fourier-transform the sound to get the power at each frequency and then transmit only the unmasked frequencies, encoding these in as few bits as possible.
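As a toy illustration of that last sentence (not MP3 itself, which uses filter banks and a far more detailed psychoacoustic model), one can transform a block of samples and keep only the components that rise above a crude masking threshold:
    import numpy as np

    rate = 44100
    t = np.arange(rate) / rate
    # A loud 150-Hz tone plus a much weaker 1-kHz tone (the "masked" signal).
    signal = 1.0 * np.sin(2 * np.pi * 150 * t) + 0.01 * np.sin(2 * np.pi * 1000 * t)

    spectrum = np.fft.rfft(signal)
    power_db = 20 * np.log10(np.abs(spectrum) + 1e-12)

    # Crude stand-in for the psychoacoustic model: keep only components within
    # 30 dB of the strongest one; everything weaker is treated as inaudible or masked.
    threshold = power_db.max() - 30
    kept = power_db >= threshold
    print(f"keeping {kept.sum()} of {kept.size} frequency components")
Only the components marked as kept would need to be encoded and transmitted; the rest cost no bits at all.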

Friday, October 2, 2009

WEB SITE LIFE CYCLE

A web site in many ways resembles other types of corporate information systems. Each web site has a limited life span, similar to the waterfall software life cycle model. One major difference is the emphasis on content development in multimedia applications. The phases of web site development are as follows: idea formulation, general web site design, detailed design of the web site, testing of the implementation, and maintenance.
(1) Idea Formulation: During the idea formulation phase, specific target-marketing programs, content goals, and objectives must be set. Since a web site development project can become very time consuming and a major capital investment for owners of small businesses, it may be more effective to identify opportunities in specialized markets that big companies have ignored. Furthermore, the profile of netizens must be carefully studied [Choi99] to find out who is surfing the net and what these people are looking at. Small businesses should be aware of the dynamics of the on-line marketplace and develop strategies and plans accordingly. The ideas of this phase can lay the foundation for developing a comprehensive plan for web site design.
(2) Web Site Design: The web site should be integrated into the company's backbone information system so that the web site can grow along with the business. To be successful, companies must integrate e-commerce into their overall business strategies and processes. Moreover, content needs to be targeted to specific users' needs. Visitor information should be collected so that the company will be able to tailor the web pages to the specific needs of interested customers. Furthermore, it is important that the web site can be surfed quickly and efficiently. In addition, users should be involved by giving them an opportunity to submit suggestions and complaints. The development of navigational cues and the user interface is of critical importance. The actual design tasks can be outsourced by a small company. Also, a new web site should be submitted to as many search engines as possible; this increases the chance that the web site is visited. The financial infrastructure should be developed properly as well.
(3) Testing: Once the implementation is complete, the company should conduct a pilot to test its integrity and effectiveness. The pilot provides an opportunity to obtain feedback from functional groups, customers, and business partners. It ensures the quality and usability of the site.
(4) Maintenance: It is essential that new content is developed and the web site is kept fresh. Timeliness is the key on the web. Moreover, appointing a web master to manage the site on a day-to-day basis is imperative. The web master can troubleshoot errors such as a link to a defunct web address, track the traffic of the web site, use reader feedback to build a loyal following, and ensure server maintenance and security. Also, this person or persons should make sure that the company's web site supports the latest versions of popular browsers.

HIGH PRESENCE AND HIGH TOUCH

The Internet and multimedia are changing the rules of the economy and redefining our businesses and our lives. They are eroding the traditional advantages of big business, such as mass production, segmented pricing, and the barriers of time and distance. A company can develop a web page and advertising campaign and quickly compete in the world market. This has led to the flattening of the economy, whereby established companies and individuals doing business on their own can compete on an equal plane. The small companies that succeed in challenging the large companies are the ones that can maintain a global presence and yet make people feel that they are personal and easy to deal with.
(1) Small companies can interact closely with their customers, so that the customers feel they are able to communicate to the small company what they need, as opposed to merely accepting the mass-produced product that large companies sell, which leaves little room for deviation from the product.
(2) The web has changed from just a means of advertising to a medium for rapidly exchanging ideas with potential customers. Since the small company listens to what customers say, this not only produces satisfied (and probably faithful) customers but also increases sales significantly over time.
(3) The Internet's primary advantage in advertising is not so much in attracting attention and conveying a brief message (the tasks assigned to traditional advertising media), but lies instead in delivering in-depth, detailed information. Its real power is the ability to provide almost infinite layers of detail about a product or service, interactively, at the behest of the user.
However, small companies have to work smarter and respond more quickly [Murr98]. They have to avoid mistakes and make the best possible use of everything. Corporations with big budgets can afford to lose their investments, while a small company looks at the web as survival, not as an investment.
Small businesses also need to realize that having a web site does not automatically mean that the company will reach millions of potential customers. It simply means that there is the potential to reach millions of potential customers. The company has to promote the site through advertisements, e-mail, links from other sites, and cutting-edge multimedia technology to attract lots of visitors. For a new start-up small company, a brand new idea is always crucial. Second, multimedia technology should be used to provide various kinds of services on the web site. Third, once the site starts catching on and e-mails start rolling in, more and more person-hours should be put into keeping up with it all.

Metadata consumption at the proxy

Multimedia adaptation is a key technology for assuring the quality of end-to-end delivery in the network. Adaptation can dynamically react to different kinds of presentation devices and to unpredictable resource availability. Proxy servers situated in the middle of the delivery chain constitute an ideal place to control multimedia delivery and adapt the stream if resource availability changes. In this context, we built the Quality-Based Intelligent Proxy (QBIX), which is a terminal capabilities- and metadata-aware Real Time Streaming Protocol (RTSP; http://www.rtsp.org) proxy server that supports real-time adaptation in the compressed (temporal adaptation only) and uncompressed domains. Adaptation improves the hit rate in the proxy cache and lets the proxy act as a media gateway supporting transcoding in real time based on metadata.9 One part of the metadata is sent by the server and describes the video variations (MPEG-7 VariationSet descriptions), and the other part is provided by the terminal end user (in the form of Composite Capabilities/Preferences Profiles, which we describe later in this article) sending terminal capabilities to the proxy.
The proxy extracts the terminal capability information from the video request and checks if it has this music video in its cache in a quality that matches the terminal capabilities. If the video is already available, but its properties aren’t in accordance with the terminal’s characteristics (for example, the video’s bit rate is too high to be consumed at the end user’s site), the MPEG-7 descriptions that accompany the video are examined by the proxy. They contain hints on which variations of the original video should be selected or created (if one doesn’t exist) to meet the delivery and presentation constraints. These hints contain the video’s expected size, bit rate, and quality. The proxy then chooses from among these variations the one with the highest quality that meets the restrictions of the end user’s terminal. The proxy can either load the required variation from the server or generate it with a sequence of appropriate transcoding procedures.9
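The selection step can be pictured as follows; the field names and numbers below are made up for illustration and are not QBIX code or MPEG-7 syntax:
    # Among the variations whose properties satisfy the terminal constraints,
    # take the one with the highest quality (a simplified sketch).
    variations = [
        {"bitrate_kbps": 1500, "width": 640, "quality": 0.95},
        {"bitrate_kbps": 700,  "width": 352, "quality": 0.80},
        {"bitrate_kbps": 300,  "width": 176, "quality": 0.55},
    ]
    terminal = {"max_bitrate_kbps": 800, "max_width": 352}

    feasible = [v for v in variations
                if v["bitrate_kbps"] <= terminal["max_bitrate_kbps"]
                and v["width"] <= terminal["max_width"]]
    best = max(feasible, key=lambda v: v["quality"]) if feasible else None

    # If `best` is cached, serve it; otherwise load it from the server or create it
    # by transcoding, as described above.
    print(best)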
We implemented metadata consumption at the proxy and the terminal using our Video ToolKit (ViTooKi). ViTooKi is principally a set of libraries that support adaptive, standard-compliant video streaming, transcoding, and proxy caching. It supports MPEG-1/2/4/7/21 and MP3/AAC file types, stored in various containers like mp4 and avi, using standard protocols with retransmission, and it includes the server, proxy, and player.

Multimedia Data Cartridge

The MPEG-7 MDC is at the center of the metadata life cycle because it must manage all the metadata produced and deliver it to the consuming elements. The MDC is an extension of an Oracle database that can store, index, and query multimedia metadata based on the MPEG-7 standard. It currently consists of three main parts (see Figure 3). The core system consists of a multimedia database schema based on MPEG-7, the Multimedia Indexing Framework (MIF) supporting query optimization, and a set of internal and external libraries for incoming requests and queries.
The MDC has been implemented by a small group of database programmers with experience working with the Oracle DBMS kernel. They kept the extensions to the Oracle database as modular as possible. Part of these modules, mainly the indexing framework, is being prepared as a SourceForge project (http://sourceforge.net/index.php), which CODAC plans to make available to the public in Spring 2005.

MPEG-7-based database schema

The multimedia schema relies on the MPEG-7 standard to provide a complete multimedia metadata schema for low-level descriptions (such as color, texture, and shape for images) and high-level descriptions (such as structural and semantic descriptions) for all media types. We have mapped the MPEG-7 descriptors, formulated as XML types, to object-relational tables to enable fine-grained querying on these data. (A detailed explanation of the mapping is available elsewhere.7)

Library support

A set of internal and external libraries is used for incoming requests and queries. The internal libraries serve as access points to the core system and consist of InitLib, used for creating new instances of the MDC data type; InsertLib, which provides insert functionality for MPEG-7 documents; DeleteLib, for deleting MPEG-7 documents; UpdateLib, for updating parts of stored MPEG-7 documents; and QueryLib, for query services. Furthermore, external libraries are used to offer application-specific services. The services we described in the use case scenario are VideoLib, for obtaining videos with semantic search criteria, and AudioLib, for querying with the humming functionality. Both external libraries (VideoLib and AudioLib) rely on the search functionality of QueryLib, which is basically a translation of search criteria into complex SQL and XPATH statements on the schema tables.

Video variation and metadata for content adaptation

The annotation framework automatically produces metadata for video variation as a means for adaptation. In video variation, the idea is to generate new videos (variation videos) from the source video, with a reduced data size and quality, by applying a variation or reduction method. The supported variations include
❚ temporal variation, which reduces the visual data’s frame rate through frame dropping;
❚ spatial variation, which encodes fewer pixels (pixel subsampling) and thereby reduces the level of detail in the frame; and
❚ color variation, which reduces each pixel’s color depth.
The CODAC project has defined and implemented two new methods of video variation:5
❚ Object-based variation extracts the foreground and background objects in a source video and re-encodes the video with two visual objects. This facilitates object-based adaptation, which otherwise would be impossible for a video encoded without objects. For instance, in our use case scenario, object-based variation would be useful for segments with a dynamic singer foreground and a static image background. This lets an adaptation engine discard, in case of resource bottlenecks, the static background image.
❚ Segment-based variation lets us apply variation methods selectively for video segments based on the physical characteristics of motion, texture, and color, thereby minimizing the quality loss and/or maximizing the reduction in data size. We segment the source video into its component shots and automatically select and apply a set of appropriate methods by analyzing the degree of physical characteristics within the segment.
We accomplish variation creation by implementing a server module called the Variation Factory, which generates the variations.5 All the variation methods, including the newly defined object-based and segment-based variation methods, are implemented in the Variation Factory. A source video can be subjected to a single method or a combination of them as deemed necessary, for instance, temporal variation followed by spatial variation, and so on. At each step, the Variation Factory produces a variation video, thereby creating a tree of variations. The Variation Factory is an application programming interface (API) including user interfaces to guide the metadata generation process. The user can control the adaptation results and rearrange parameters, for instance, if the perceived quality is too low.
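Conceptually, chaining variation methods grows a tree of variation videos rooted at the source; the toy sketch below (illustrative names only, not the CODAC API) models that structure:
    # Toy model of a variation tree: each node is a video produced by applying one
    # variation method to its parent. Method names and parameters are illustrative.
    class VariationNode:
        def __init__(self, label, parent=None):
            self.label = label
            self.parent = parent
            self.children = []

        def apply(self, method):
            child = VariationNode(f"{self.label} -> {method}", parent=self)
            self.children.append(child)
            return child

    source = VariationNode("source video")
    temporal = source.apply("temporal (drop to 15 fps)")
    temporal.apply("spatial (subsample to 176x144)")    # temporal followed by spatial
    source.apply("color (reduced color depth)")         # a second branch of the tree
    print(source.children[0].children[0].label)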
The Variation Factory’s architecture is developed in Java under the Java Media Framework (JMF). This API supports the integration of audio and video playback into Java applications and applets.6 The input is a video, and the outputs are one or more adapted variation videos and an MPEG-7 metadata document that describes the source and the variation videos using descriptors of the VariationSet description scheme. The MPEG-7 Document Processor produces the metadata. It uses the Java API for XML Processing (JAXP), which is generally used to parse and transform XML documents. First, a Document Object Model (DOM) tree is constructed. Then, the DOM tree is parsed and an XML text file is produced. The descriptors include information on the variations’ fidelity, data size, and priority. We store the created variation videos in the media server along with the source video, and the MPEG-7 document in the metadatabase. During delivery, the MPEG-7 metadata document is streamed together with the requested video.

The Life Cycle of Multimedia Metadata

During its lifetime, multimedia content undergoes different stages or cycles from production to consumption. Content is created, processed or modified in a postproduction stage, delivered to users, and finally, consumed. Metadata, or descriptive data about the multimedia content, pass through similar stages but with different time lines.1 Metadata may be produced, modified, and consumed by all actors involved in the content production-consumption chain (see the “Life-Cycle Spaces” sidebar for more information). At each step of the chain, different kinds of metadata may be produced by highly different methods and of substantially different semantic value.
Different metadata let us tie the different multimedia processes in a life cycle together. However, to employ these metadata, they must be appropriately generated. The CODAC Project, led by Harald Kosch, implements different multimedia processes and ties them together in the life cycle. CODAC uses distributed systems to implement multimedia processes. Figure 1 gives the architectural overview of this system.
The project’s core component is a Multimedia Database Management System (MMDBMS),2 which stores content and MPEG-7-based metadata.3 It communicates with a streaming server for data delivery. The database is realized in the Multimedia Data Cartridge (MDC), which is an extension of the Oracle Database Management System, to handle multimedia content and MPEG-7 metadata.

Use case scenario

To demonstrate metadata’s life cycle, let’s consider a use case scenario. A user watches an interesting music video on a colleague’s screen and wants to retrieve the same music video. The only information she retained was that the singer in the music video was Avril Lavigne, and she remembers the song’s melody. She can’t ask her colleague directly, so she wants to access it from a multimedia database.
The query service of our Multimedia Data Cartridge (MDC) offers a solution for finding such a music video. First, the user can enter a query by thematic means, thus specifying that the music video’s singer is Avril Lavigne. (See Figure 2a for the query interface.) In response to the first query, the service returns several music videos that pertain to the singer. To narrow the search, the user can hum the melody to the query with a humming service, as Figure 2b shows.
The multimedia database retrieves information on the music video that meets the query request and delivers it to the user. Such information includes the full title, full information on the singer, the production date, and so on, and finally the address of the media server where the user can obtain the video. Thus, the user finds out that the song she has searched for is entitled “Skater Boy” and can now access the music video.
Luckily, the user is registered with the media server storing the video and can request the music video from this server. In addition, the user specifies in the video request her mobile device’s terminal capabilities. Unfortunately, the media server has only copies of the music video in a quality that doesn’t meet the terminal constraints. The server examines metadata generated in postprocessing of the video to generate a variation of the video with the best possible quality satisfying the constraints and then delivers this variation.
Let’s further assume that the request goes over an authorized proxy cache that examines whether a cached copy is present. If it’s there, but not in the appropriate quality, delivery-related metadata describing the possibility for the video to be adapted to resource constraints is used by the proxy cache to adapt the video accordingly.

How sound is converted into digital data

When analog audio (such as a live recording or playback from an analog source) is converted into digital data, the sound is sampled at a rate measured in Hertz. The higher the sampling rate, the more accurately the recording captures the original sound. For example, a sampling rate of 44 kHz (kilohertz) is far more detailed than a sampling rate of 11 kHz. It is easy to compare this to a digital picture: the higher the scanning resolution, the more detailed the picture will be. Of course, file size also increases with higher sampling rates. When preparing audio for multimedia output, especially web output, it is usually unnecessary to go beyond a 22 kHz sampling rate. A given sampling rate captures frequencies up to about half its value, so 22 kHz covers content up to roughly 11 kHz, which is enough for speech and most casual listening. Some highs and lows will be lost; however, it is difficult to detect this, especially with the quality of speakers found on most computers today.
Another technicality to be concerned about when sampling sound is bit depth. Bit depth essentially determines the dynamic range (or amplitude resolution) of a piece; it also controls the resolution of the sound wave (higher bit resolution results in a smoother wave). For example, a classical piece with great dynamic differences between a solo flute passage and a full orchestral section benefits from a high bit depth, while many pop tunes need less bit depth because they stay within a narrower dynamic range. When sampling music for web formats, a 16-bit depth is adequate; in many instances an 8-bit depth may work as well and will decrease the file size.
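
The trade-off between sampling rate, bit depth, and file size is simple arithmetic: uncompressed size is roughly sampling rate × (bit depth ÷ 8) × channels × duration in seconds. A minimal sketch of that calculation (the function name and example settings are just illustrative):

def uncompressed_audio_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Estimate the size of uncompressed PCM audio in bytes."""
    return int(sample_rate_hz * (bit_depth / 8) * channels * seconds)

# One minute of audio at two common settings:
cd_like = uncompressed_audio_bytes(44_100, 16, 2, 60)    # ~10.6 MB (stereo, CD-like)
web_like = uncompressed_audio_bytes(22_050, 8, 1, 60)    # ~1.3 MB (mono, web-friendly)
print(cd_like / 1_000_000, web_like / 1_000_000)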


Multi-track recording software

When recording and mixing audio, perhaps the first thing to consider is whether to go with a Digital Audio Workstation (DAW) or strictly computer software. Digital Audio Workstations are hardware devices that attach to your computer, allowing you to store and mix audio through an external unit. Some common components are mixing boards, multi-track recording devices, and CD burners. The advantages are that processing is usually faster and that the unit provides additional storage for digital information. However, DAWs are usually priced higher than the software versions. Some common manufacturers of DAWs are Tascam and Mackie Designs. Perhaps the top professional multi-track recording tool is Digidesign's Pro Tools. Digidesign's previous price of $8,000 made it impossible for private studios and desktop/home studio recorders to own and operate the equipment. However, this has all changed with the release of the Digi 001, a simpler version that includes both the software and the hardware interface for under $1,000. The hardware component is a single-space box and a PCI card that works with both Mac and PC. A downloadable Pro Tools software version is also available for free. Some of the clients who use Digidesign Pro Tools include Nine Inch Nails, Björk, Smash Mouth, Philip Glass, Third Eye Blind, Paramount Pictures, the Canadian Broadcasting Corporation, and such movie projects as Nutty Professor II, The Perfect Storm, The Matrix, American Beauty, and The Prince of Egypt.
Multi-track software packages are perhaps the most popular option, varying in price and editing power. Common programs include Samplitude, which samples at 24 bits and 96 kHz (Windows, $69-$399), and Vegas™ Audio 2.0 (Windows, $449), which offers unlimited tracks, 18 effects, and output options such as streaming media files (WMA and RM formats). Cakewalk® Pro Suite™ is also a popular program (Windows, $429-$599). Cakewalk differs from other programs in that it allows for MIDI recording and editing as well as multi-track recording and editing.

Thursday, February 26, 2009

Career options

There is a wide variety of jobs available in the animation industry, some of which are
as follows:

Designer

Audio & video specialist

Visualiser

Graphic designer
A graphic designer is responsible for the layout and presentation of different types of media (such as a poster, a package or a website).

Multimedia author

Web designer
A web designer creates web pages, using graphic design skills as well as tools like Flash, HTML, CSS, etc.

Content developer

Modeler
Modelers facilitate the filming of puppets or any form of 3D models. The puppets are positioned and filmed before being moved slightly and filmed again. This gives the impression that the models are moving. A modeler should have a solid understanding of anatomy, form and volume.


Texture artist
A texture artist applies a surface to the 3D modeled character, object or environment. Coordinates are laid out to give the model an applicable surface for colour and texture.

Rigging artist
A rigging artist takes the modeled, textured 3D character or object and sets it up with a skeletal system or joints (if required). Without this step, the 3D model would not be able to animate, talk or move fluidly and correctly.

2D animator
2D animation involves the creation of a high volume of separate drawings that define a sequence. This technique is widely used in creating characters for animations and cartoon programmes.

3D animator
The 3D animator takes the sculpted (or modeled), textured and rigged 3D model and breathes life into it. This is done by setting key frames in sequence so that the model appears to be in motion.

Compositing artist
Compositing consists of layering individual frames of animation on top of one another to create the final images. These images are then strung together to create complete shots or mini animated movies.

Editor
An editor assembles the various visual and audio components of the film into a coherent and effective whole.

Storyboard artist
A storyboard artist creates a series of panels that contain a visual interpretation of the screenplay, much like a comic book.

Character animator
A character animator brings characters to life and generally has knowledge of traditional animation, stop-motion animation as well as claymation.

Effects artist
Effects artists create a believable world for the action to take place in.

In-between artist
Tweening (short for in-betweening) consists of drawings that are inserted between the 'key' or important drawings to make the first image flow smoothly into the next one.

Image editor

Multimedia developer

Digital post-production artist


Special effects artist
Special effects artists integrate live-action footage with Computer Generated Imagery (CGI) or other elements (such as model work).

Programmer

Animation careers

Animation is a booming industry with a wide variety of jobs available.

Industry overview
Animators can work on full-length animation movies, create television commercials, make DVDs, make games for the Internet, mobile, PC or consoles (like PlayStation or Xbox), work in the advertising industry or as web designers. The e-learning industry also uses animators, and so do fields like medicine, engineering and architecture.
The entertainment industry, including movies, TV programmes and special effects (VFX) for movies or TV, is a major employer.
A typical animated film requires 700 to 800 animators. Of the 30,000 animation studios around the world, 70 percent have a turnover of US $1 million.
India, South Korea, Philippines, Singapore, Japan and China are seeing a deluge of outsourced animation work from across the globe. Indian companies are creating a number of animated films and cartoons for US and European studios.
The latest NASSCOM statistics predict the size of the Indian animation industry to be US $1.5 billion by 2009. Currently, it is at US $550 million. India has over 300 animation studios employing over 12,000 animation professionals. In addition to outsourcing, content is being made for the Indian market as well. Films like Hanuman, Krishna, My Friend Ganesha, etc. have proved that animation can make huge profits at the box office.
Animation is a global industry with a large turnover (estimated at $59 billion in 2006) and growing demand; this is causing a shortage of skilled people.
What does it take to be a complete animator?

A good animator should have knowledge of:
~ Drawing techniques
~ Animation techniques
~ Different styles of animation such as 2D and 3D animation
~ Design and layout
~ How people move and express their feelings
~ How animals move
~ How to create different moods and feelings in characters
~ Computers and animation software applications
~ The history of art and design
~ Film and television production


Besides this, he or she also needs to:


~ Be artistic, creative and innovative

~ Be a good communicator

~ Have an inclination for good music

~ Be able to ideate and conceptualise

~ Be focused, self-disciplined and self-motivated

~ Be able to use knowledge of the human body and how animals move to create animations

~ Be versatile and adaptable and able to accept criticism

~ Be able to work to a deadline

~ Be observant, with an eye for detail

~ Be able to work well in a team

~ Be able to understand the comic nature of cartoons
How does animation work?

A simple theory known as persistence of vision offers an explanation. The Greek astronomer Ptolemy discovered this principle back in 130 AD. If images are flashed before the eye at a speed of at least ten frames per second, the brain perceives them as a single moving image. The number of Frames Per Second (FPS) directly correlates with how smooth the movement appears. If the frame rate is too low, the motion will look awkward and jerky; if too much movement occurs between one frame and the next, the motion will appear to blur or strobe.
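
As a quick worked example of the frame-rate arithmetic (the clip length and rates are just illustrative), the number of individual images needed is simply duration × frame rate:

def frames_needed(seconds, fps):
    """Number of individual images needed for a clip of the given length."""
    return int(seconds * fps)

# A 30-second clip at a few common frame rates:
for fps in (10, 12, 24, 30):
    print(fps, "fps ->", frames_needed(30, fps), "frames")
# 10 fps -> 300, 12 fps -> 360, 24 fps -> 720, 30 fps -> 900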

Animation techniques

2D cel animation

Also known as traditional animation, 2D animation involves the creation of a high volume of separate drawings that define a sequence. These drawings are then traced with ink onto transparent celluloid sheets called cels, which are scanned and painted using special application software. The cels are layered on each other to create a sequence, which is later edited to synchronise the audio and video content. This technique is widely used in creating characters for animations and cartoon programmes.
Did you know that a full-length feature film produced using cel animation often requires a million or more drawings to complete?


3D CGI animation

This technique makes extensive use of animation software programmes. 3D objects are constructed using curves or 2D geometric figures, and software is used to modify the texture, lighting and colour of the object surfaces. Virtual cameras are used to zoom, focus, illuminate and resize the 3D objects. Key frames are developed to regulate the flow of the intermediate frames (a small in-betweening sketch follows below). This technique is commonly used to create animation for television programmes, movies, and online and console games.
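
To make the key-frame idea concrete, here is a minimal sketch of linear in-betweening: given two key values for a property (say, a character's x-position), the intermediate frames are interpolated between them. The function names and numbers are invented for illustration; production software uses richer interpolation (easing curves, splines).

def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t

def inbetween(key_start, key_end, num_frames):
    """Generate the values for the frames from one key frame to the next."""
    return [lerp(key_start, key_end, i / (num_frames - 1)) for i in range(num_frames)]

# Move a character's x-position from 0 to 100 over 5 frames:
print(inbetween(0, 100, 5))    # [0.0, 25.0, 50.0, 75.0, 100.0]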


3D motion capture animation

This process of creating 3D characters is similar to the 3D CGI animation technique; however, the two techniques differ in when the animation effects are introduced. To produce animation effects, sensors connected to a computer are physically attached to a human actor. These sensors coordinate the actor's real-time movements with the movements of a computerised 3D character. This technique is widely used for low-resolution games, Internet characters, live TV performances and special effects for animated movies.
What is multimedia?

Multimedia is a combination of different media such as pictures, images, videos, sounds and touch input. The field can be broadly divided into four clusters, namely
Graphic Designing,
Web Designing,
Visual Effects and
3D Animation.
All designs and color combinations, from 50-paise shampoo packets to business logo designs worth crores of rupees, are a part of Graphic Designing.
Web Designing is a merger of website creation and the editing, modification and updating of existing websites; it requires creativity and patience. Learning DHTML, JavaScript, PHP, etc. would add glamour to your career.
Visual Effects deals with sound creation and editing, video editing and modification, and so on; the entire entertainment industry depends upon visual effects.
3D Animation deals with the creation of new characters; in a general sense, whatever you can dream up, you can make through 3D animation.

A multimedia project may involve a team of experts such as a Web Designer/Programmer, Video Producer, Audio Engineer, Graphic Artist, and Interactive Designer. CE Multimedia can supply the skill set, project-manage the team, and provide tutoring in the content management of multimedia assets.

However, small projects are also catered for, e.g. the production of GIF/Flash animations or banners, podcasting news stories, highlighting product features, streaming audio for music, streaming 'vox pop' video, or implementing 360° photo shots.

Tuesday, February 24, 2009

Multimedia Picture Gallery






Planning and Managing for a Better Product:

Planning. A multimedia product, such as a Power Point presentation, can really get students going in a computer lab. Presentations can become extremely complex, and help is often needed when students are in the middle of creating special effects. Not much will be accomplished if there has been no planning between the teacher and the computer tech. Students must have research geared towards this product; it will not work if they bring in a typed report intending to turn it into a multimedia product. Use of graphic organizers is key to the success of a multimedia project, because students must become aware that they are NOT writing a series of paragraphs. They will be presenting a series of main ideas, facts, or short descriptions. The presentation will be a series of charts, and students must have a good idea of how their topic is to be broken up. Bibliographies should also be complete before beginning the project. Students should have all research and graphic organizers complete before beginning Power Point, or any other multimedia software, because otherwise they will not have a good plan for the total presentation. Teachers can review drafts of the text in handwritten or typed, printed format. Power Point allows easy printing of the text only for teacher review.

Types of multimedia projects:

* Self Playing presentations. With this type, slides advance automatically and all special effects play automatically. There is no manual intervention at all, and the presentation becomes a "show."

* Manually advanced presentations with linear progression. This type of presentation will advance slides and text only on the click of the mouse. Special effects can be set to go off automatically or by a mouse click. Students who are required to give an oral presentation accompanying their multimedia presentation should use this type. Examples of topics with linear progression are biographies or history.

* Manually advanced presentation without linear progression, or "interactive." This type of presentation is the most difficult for students to plan in advance if they haven't seen it done already. The front slide has hyperlinks which can be clicked in any order by the presenter. When clicked, another slide appears which has specific information, special effects, and a place to click which takes the presentation back to the front slide or on to another. Students should be required to make a "map" in advance, with lines connecting all slides (boxes) which are to be connected by action buttons; otherwise it becomes difficult to visualize. This type of presentation is for topics where there is no particular order in which information should be presented. Examples are parts of a plant, math projects, etc. It can be created with an accompanying oral talk in mind, or as a presentation which students take turns sitting down and studying interactively. This is a particularly good format for students [or teachers] who want to create interactive quizzes. Power Point has a feature for action buttons which, when clicked, will take the viewer back to whatever slide brought the viewer to that point. This means that a generic "wrong answer" slide can be prepared which will work again and again to return the viewer to whichever slide they came from. Buttons can also activate sound effects or other actions.

Pulling Multimedia In:

A multimedia presentation has a number of ways that the media can be manipulated to support or enhance the topic. Students should be taught the various ways and encouraged to create presentations which are a "total experience." The available facets of multimedia are discussed below:

* Graphical design and the use of color have a powerful effect in a multimedia presentation. Students should be taught (if time allows) how colors are associated with moods. Teachers should decide in advance how much time to allocate to this topic. Areas of the art curriculum which deal with the use of imagery, lines and color can be emphasized here.

* Background music also has a powerful effect in enhancing and supporting a topic. Teachers should think about and establish rules for the use of music and how much freedom student preference should be given. Many music files are available on the Internet, which allows students to conform to the period/ethnic theme of the presentation. If time allows, students should be taught how to coordinate music patterns with animation patterns.

* Sound effects are available on the Internet, and some in Power Point. Used sparingly and with careful selection for quality, they add tremendous depth of feeling to a presentation. Due to the inclusion of inappropriate sounds in all the sound websites I've seen, I do not have a link to sound effects for students to browse through as I do for music. However, teachers can search on "sound effects" and come up with a number of sites having a rich variety of sound effects, most of which are designated as public domain files.

* Students can also use a microphone to create their own sound effects. Unexpectedly hearing student voices during a presentation jars the "viewer" into attention with stunning exactness. Again, teachers should decide in advance how much time to allocate to this activity. Due to the need for complete silence, the recording session usually has to be the same period for everybody in the class. The students take turns using the microphones. I have also had a "free period" designated for sound recording when a whole class needed to do this.

* Video also adds fascination to a presentation. Video clips can be downloaded or student created video can be added. If students create their own clips, they must be encouraged to keep the time to 5 seconds or below as the file sizes can become enormous.

* Buttons add a fascination for the viewer. If free access to the total presentation is granted by generous application of buttons, the viewer becomes engaged in a learning experience. Students should be encouraged to explore interactive presentations.

* Animation and timing. It is critical that students be taught how important it is that the motion and progression of the multimedia be controlled in a manner that allows the viewer to digest the content. This means readable text, supporting graphics, and animation that directs, not distracts.

Technology Skills:

Many technology skills can be chosen selectively for introduction during a multimedia project. These include:

* Text manipulation (font, size, bold, italic, underline, justification, color, text art, text boxes, spell checking, grammar checking, copying, cutting, pasting, deleting). Text needs to be readable, inviting (i.e., NOT long paragraphs) and attractive in multimedia.

* Methods to save and retrieve files (the correct server, path, file name, file extension, folder, drive)

* Object formatting (text boxes, sound, video, and graphics boxes, formatted for animation, autoplay, manual advance, autostart, loop, timing, dim/flash)

* Graphics manipulation (inserting, resizing, cropping, framing, copying, cutting, pasting, deleting, moving, grouping)

* Use of peripherals (microphone, CD ROM, scanner, digital camera, and printer)

* Other technology skills: tool bars, shortcut icons, the menu system, the help system, shortcut keys, task switching, and other features which can be utilized during word processing. Certainly, if there is much searching of the Internet for graphics, video or music, task switching can become second nature to the students.

Types of Multimedia Data:

There are a number of data types that can be characterized as multimedia data types. These are typically the elements, or building blocks, of more generalized multimedia environments, platforms, or integrating tools. The basic types can be described as follows:

1. Text: The form in which text is stored can vary greatly. In addition to ASCII-based files, text is typically stored in word processor files, spreadsheets, databases and annotations on more general multimedia objects. With the availability and proliferation of GUIs and text fonts, the job of storing text is becoming more complex, allowing special effects (color, shading, etc.).
2. Images: There is great variance in the quality and storage size of still images. A digitized image is a sequence of pixels that represents a region of the user's graphical display. The space required for a still image varies with its resolution, size, complexity, and the compression scheme used to store it. Popular image formats include JPG, PNG, BMP and TIFF.
3. Audio: Audio is an increasingly popular data type integrated into most applications. It is quite space intensive: one minute of uncompressed sound can take up 2-3 MB of space. Several techniques are used to compress it into a suitable format.
4. Video: One of the most space-consuming multimedia data types is digitized video. Digitized video is stored as a sequence of frames; depending on its resolution and size, a single frame can consume up to 1 MB. For realistic playback, the transmission, compression, and decompression of digitized video also require a sustained, continuous transfer rate (see the sketch after this list for a rough size estimate).
5. Graphic Objects: These consist of special data structures used to define 2D and 3D shapes from which we can build multimedia objects. They include the various formats used by image and video editing applications.
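
The storage figures above follow from simple arithmetic: an uncompressed frame needs width × height × bytes per pixel, and video multiplies that by the frame rate. A rough sketch of the estimate (the resolution and frame rate are just example values):

def frame_bytes(width, height, bytes_per_pixel=3):
    """Uncompressed size of a single frame (24-bit color by default)."""
    return width * height * bytes_per_pixel

def video_bytes_per_second(width, height, fps, bytes_per_pixel=3):
    """Uncompressed data rate for video, in bytes per second."""
    return frame_bytes(width, height, bytes_per_pixel) * fps

# A 640x480 frame is roughly 0.9 MB uncompressed; at 25 fps that is about 23 MB
# per second, which is why compression and a sustained transfer rate are essential.
print(frame_bytes(640, 480) / 1_000_000)                  # ~0.92 MB per frame
print(video_bytes_per_second(640, 480, 25) / 1_000_000)   # ~23 MB per second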