# HG changeset patch # User arpi # Date 1039977642 0 # Node ID e421b4ab7815cd3a304b3f26e8c90dd192b52d1f # Parent 800d7766684386b3ae9a854da9c82b6ce17a7b4b encoding tips - collected from mplayer-users list mailings by Martin Pavon (some additions/changes by me) diff -r 800d77666843 -r e421b4ab7815 DOCS/tech/encoding-tips.txt --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/DOCS/tech/encoding-tips.txt Sun Dec 15 18:40:42 2002 +0000 @@ -0,0 +1,554 @@ + +ENCODING QUALITY - OR WHY AUTOMATISM IS BAD. + +Hi everyone. + +Some days ago someone suggested adding some preset options to mencoder. +At that time I replied 'don't do that', and now I decided to elaborate +on that. + +Warning: this is rather long, and it involves mathematics. But if you +don't want to bother with either then why are you encoding in the +first place? Go do something different! + +The good news is: it's all about the bpp (bits per pixel). + +The bad news is: it's not THAT easy ;) + +This mail is about encoding a DVD to MPEG4. It's about the video +quality, not (primarily) about the audio quality or some other fancy +things like subtitles. + +The first step is to encode the audio. Why? Well if you encode the +audio prior to the video you'll have to make the video fit onto one +(or two) CD(s). That way you can use oggenc's quality based encoding +mode which is much more sophisticated than its ABR based mode. + +After encoding the audio you have a certain amount of space left to +fill with video. Let's assume the audio takes 60M (no problem with +Vorbis), and you aim at a 700M CD. This leaves you 640M for the video. +Let's further assume that the video is 100 minutes or 6000 seconds +long, encoded at 25fps (those nasty NTSC fps values give me +headaches. Adjust to your needs, of course!). This leaves you with +a video bitrate of: + + $videosize * 8 +$videobitrate = -------------- + $length * 1000 + +$videosize in bytes, $length in seconds, $videobitrate in kbit/s. +In my example I end up with $videobitrate = 895. + +And now comes the question: how do I chose my encoding parameters +so that the results will be good? First let's take a look at a +typical mencoder line: + +mencoder -dvd 1 -o /dev/null -oac copy -ovc lavc \ + -lavcopts vcodec=mpeg4:vbitrate=1000:vhq:vqmin=2:\ + vlelim=-4:vcelim=9:lumi_mask=0.05:dark_mask=0.01:vpass=1 \ + -vop scale=640:480,crop=716:572:2:2 + +Phew, all those parameters! Which ones should I change? NEVER leave +out 'vhq'. Never ever. 'vqmin=2' is always good if you aim for sane +settings - like 'normal length' movies on one CD, 'very long movies' +on two CDs and so on. vcodec=mpeg4 is mandatory. + +The 'vlelim=-4:vcelim=9:lumi_mask=0.05:dark_mask=0.01' are parameters +suggested by D Richard Felker for non-animated movies, and they +improve quality a bit. + +But the two things that have the most influence on quality are +vbitate and scale. Why? Because both together tell the codec how +many bits it may spend on each frame for each bit: and this is +the 'bpp' value (bits per pixel). It's simply defined as + + $videobitrate * 1000 +$bpp = ----------------------- + $width * $height * $fps + +I've attached a small Perl script that calculates the $bpp for +a movie. You'll have to give it four parameters: +a) the cropped but unscaled resolution (use '-vop cropdetect'), +b) the encoded aspect ratio. All DVDs come at 720x576 but contain +a flag that tells the player wether it should display the DVD at +an aspect ratio of 4/3 (1.333) or at 16/9 (1.777). Have a look +at mplayer's output - there's something about 'prescaling'. That's +what you are looking for. +c) the video bitrate in kbit/s and +d) the fps. + +In my example the command line and calcbpp.pl's output would look +like this (warning - long lines ahead): + +mosu@anakin:~$ ./calcbpp.pl 720x440 16/9 896 25 +Prescaled picture: 1023x440, AR 2.33 +720x304, diff 5, new AR 2.37, AR error 1.74% scale=720:304 bpp: 0.164 +704x304, diff -1, new AR 2.32, AR error 0.50% scale=704:304 bpp: 0.167 +688x288, diff 8, new AR 2.39, AR error 2.58% scale=688:288 bpp: 0.181 +672x288, diff 1, new AR 2.33, AR error 0.26% scale=672:288 bpp: 0.185 +656x288, diff -6, new AR 2.28, AR error 2.17% scale=656:288 bpp: 0.190 +640x272, diff 3, new AR 2.35, AR error 1.09% scale=640:272 bpp: 0.206 +624x272, diff -4, new AR 2.29, AR error 1.45% scale=624:272 bpp: 0.211 +608x256, diff 5, new AR 2.38, AR error 2.01% scale=608:256 bpp: 0.230 +592x256, diff -2, new AR 2.31, AR error 0.64% scale=592:256 bpp: 0.236 +576x240, diff 8, new AR 2.40, AR error 3.03% scale=576:240 bpp: 0.259 +560x240, diff 1, new AR 2.33, AR error 0.26% scale=560:240 bpp: 0.267 +544x240, diff -6, new AR 2.27, AR error 2.67% scale=544:240 bpp: 0.275 +528x224, diff 3, new AR 2.36, AR error 1.27% scale=528:224 bpp: 0.303 +512x224, diff -4, new AR 2.29, AR error 1.82% scale=512:224 bpp: 0.312 +496x208, diff 5, new AR 2.38, AR error 2.40% scale=496:208 bpp: 0.347 +480x208, diff -2, new AR 2.31, AR error 0.85% scale=480:208 bpp: 0.359 +464x192, diff 7, new AR 2.42, AR error 3.70% scale=464:192 bpp: 0.402 +448x192, diff 1, new AR 2.33, AR error 0.26% scale=448:192 bpp: 0.417 +432x192, diff -6, new AR 2.25, AR error 3.43% scale=432:192 bpp: 0.432 +416x176, diff 3, new AR 2.36, AR error 1.54% scale=416:176 bpp: 0.490 +400x176, diff -4, new AR 2.27, AR error 2.40% scale=400:176 bpp: 0.509 +384x160, diff 5, new AR 2.40, AR error 3.03% scale=384:160 bpp: 0.583 +368x160, diff -2, new AR 2.30, AR error 1.19% scale=368:160 bpp: 0.609 +352x144, diff 7, new AR 2.44, AR error 4.79% scale=352:144 bpp: 0.707 +336x144, diff 0, new AR 2.33, AR error 0.26% scale=336:144 bpp: 0.741 +320x144, diff -6, new AR 2.22, AR error 4.73% scale=320:144 bpp: 0.778 + +A word for the $bpp. For a fictional movie which is only black and +white: if you have a $bpp of 1 then the movie would be stored +uncompressed :) For a real life movie with 24bit color depth you +need compression of course. And the $bpp can be used to make the +decision easier. + +As you can see the resolutions suggested by the script are all +dividable by 16. This will make the aspect ratio slightly wrong, +but no one will notice. + +Now if you want to decide which resolution (and scaling parameters) +to chose you can do that by looking at the $bpp: + +< 0.10: don't do it. Please. I beg you! +< 0.15: It will look bad. +< 0.20: You will notice blocks, but it will look ok. +< 0.25: It will look really good. +> 0.25: It won't really improve visually. +> 0.30: Don't do that either - try a bigger resolution instead. + +Of course these values are not absolutes! For movies with really lots +of black areas 0.15 may look very good. Action movies with only high +motion scenes on the other hand may not look perfect at 0.25. But these +values give you a great idea about which resolution to chose. + +I see a lot of people always using 512 for the width and scaling +the height accordingly. For my (real-world-)example this would be +simply a waste of bandwidth. The encoder would probably not even +need the full bitrate, and the resulting file would be smaller +than my targetted 700M. + +After encoding you'll do your 'quality check'. First fire up the movie +and see whether it looks good to you or not. But you can also do a +more 'scientific' analysis. The second Perl script I attached counts +the quantizers used for the encoding. Simply call it with + +countquant.pl < divx2pass.log + +It will print out which quantizer was used how often. If you see that +e.g. the lowest quantizer (vqmin=2) gets used for > 95% of the frames +then you can safely increase your picture size. + +> The "counting the quantesizer"-thing could improve the quality of +> full automated scripts, as I understand ? + +Yes, the log file analysis can be used be tools to automatically adjust +the scaling parameters (if you'd do that you'd end up with a three-pass +encoding for the video only ;)), but it can also provide answers for +you as a human. From time to time there's a question like 'hey, +mencoder creates files that are too small! I specified this bitrate and +the resulting file is 50megs short of the target file size!'. The +reason is probably that the codec already uses the minimum quantizer +for nearly all frames so it simply does not need more bits. A quick +glance at the distribution of the quantizers can be enlightening. + +Another thing is that q=2 and q=3 look really good while the 'bigger' +quantizers really lose quality. So if your distribution shows the +majority of quantizers at 4 and above then you should probably decrease +the resolution (you'll definitly see block artefacts). + + +Well... Several people will probably disagree with me on certain +points here, especially when it comes down to hard values (like the +$bpp categories and the percentage of the quantizers used). But +the idea is still valid. + +And that's why I think that there should NOT be presets in mencoder +like the presets lame knows. 'Good quality' or 'perfect quality' are +ALWAYS relative. They always depend on a person's personal preferences. +If you want good quality then spend some time reading and - more +important - understanding what steps are involved in video encoding. +You cannot do it without mathematics. Oh well, you can, but you'll +end up with movies that could certainly look better. + +Now please shoot me if you have any complaints ;) + +-- + ==> Ciao, Mosu (Moritz Bunkus) + +=========== +ANOTHER APPROACH: BITS PER BLOCK: + +> $videobitrate * 1000 +> $bpp = ----------------------- +> $width * $height * $fps + +Well, I came to similar equation going through different route. Only I +didn't use bits per pixel, in my case it was bits per block (BPB). The block +is 16x16 because lots of software depends on video width/height being +divisable by 16. And because I didn't like this 0.2 bit per pixel, when +bit is quite atomic ;) + +So the equation was something like: + + bitrate +bpb = ----------------- + fps * ((width * height) / (16 * 16)) + +(width and height are from destination video size, and bitrate is in +bits (i.e. 900kbps is 900000)) + +This way it apeared that the minimum bits per block is ~40, very +good results are with ~50, and everything above 60 is a waste of bandwith. +And what's actually funny is that it was independant of codec used. The +results were exactly the same, whether I used DIV3 (with tricky nandub's +magick), ffmpeg odivx, DivX5 on Windows or XviD. + +Surprisingly there is one advantage of using nandub-DIV3 for bitrate +starved encoding: ringing almost never apears this way. + +But I also found out, that the quality/BPB isn't constant for +drastically different resolutions. Smaller picture (like MPEG1 sizes) +need more BPB to look good than say typical MPEG2 resolutions. + +Robert + + +=========== +DON'T SCALE DOWN TOO MUCH + +Sometimes I found that encoding to y-scaled only DVD qualty (ie 704 x +288 for a 2.85 film) gives better visual quality than a scaled-down +version even if the quantizers are significantly higher than for the +scaled-down version. +Keep in mind that blocs, fuzzy parts and generaly mpeg artefacts in a +704x288 image will be harder to spot in full-screen mode than on a +512x208 image. In fact I've see example where the same movie looks +better compressed to 704x288 with an average weighted quantizer of +~3 than the same movie scaled to 576x240 with an average weighted +quantizer of 2.4. +Btw, a print of the weighted average quantizer would be nice in +countquant.pl :) + +Another point in favor of not trying to scale down too much : on hard +scaled-down movies, the MPEG codec will need to compress relatively +high frequencies rather than low frequencies and it doesn't like that +at all. You will see less and less returns while you scale down and +scale down again in desesperate need of some bandwidth :) + +In my experience, don't try to go below a width of 576 without closely +watching what's going on. + +-- +Rémi + +=========== +TIPS FOR ENCODING + +That being said, with video you have some tradeoffs you can make. Most +people seem to encode with really basic options, but if you play with +single coefficient elimination and luma masking settings, you can save lots +of bits, resulting in lower quantizers, which means less blockiness and +less ugly noise (ringing) around sharp borders. The tradeoff, however, is +that you'll get some "muddiness" in some parts of the image. Play around +with the settings and see for yourself. The options I typically use for +(non-animated) movies are: + +vlelim=-4 +vcelim=9 +lumi_mask=0.05 +dark_mask=0.01 + +If things look too muddy, making the numbers closer to 0. For anime and +other animation, the above recommendations may not be so good. + +Another option that may be useful is allowing four motion vectors per +macroblock (v4mv). This will increase encoding time quite a bit, and +last I checked it wasn't compatible with B frames. AFAIK, specifying +v4mv should never reduce quality, but it may prevent some old junky +versions of DivX from decoding it (can anyone conform?). Another issue +might be increased cpu time needed for decoding (again, can anyone +confirm?). + +To get more fair distribution of bits between low-detail and +high-detail scenes, you should probably try increasing vqcomp from the +default (0.5) to something in the range 0.6-0.8. + +Of course you also want to make sure you crop ALL of the black border and +any half-black pixels at the edge of the image, and make sure the final +image dimensions after cropping and scaling are multiples of 16. Failing to +do so will drastically reduce quality. + +Finally, if you can't seem to get good results, you can try scaling the +movie down a bit smaller or applying a weak gaussian blur to reduce the +amount of detail. + +Now, my personal success story! I just recently managed to fit a beautiful +encode of Kundun (well over 2 hours long, but not too many high-motion +scenes) on one cd at 640x304, with 66 kbit/sec abr ogg audio, using the +options I described above. So, IMHO it's definitely possible to get very +good results with libavcodec (certainly MUCH better than all the idiot +"release groups" using DivX3 make), as long as you take some time to play +around with the options. + + +Rich + +============ +ABOUT VLELIM, VCELIM, LUMI_MASK AND DARK_MASK PART I: LUMA & CHROMA + + +The l/c in vlelim and vcelim stands for luma (brightness plane) and chroma +(color planes). These are encoded separately in all mpeg-like algorithms. +Anyway, the idea behind these options is (at least from what I understand) +to use some good heuristics to determine when the change in a block is less +than the threshold you specify, and in such a case, to just encode the +block as "no change". This saves bits and perhaps speeds up encoding. Using +a negative value for either one means the same thing as the corresponding +positive value, but the DC coefficient is also considered. Unfortunately +I'm not familiar enough with the mpeg terminology to know what this means +(my first guess would be that it's the constant term from the DCT), but it +probably makes the encoder less likely to apply single coefficient +elimination in cases where it would look bad. It's presumably recommended +to use negative values for luma (which is more noticable) and positive for +chroma. + +The other options -- lumi_mask and dark_mask -- control how the quantizer +is adjusted for really dark or bright regions of the picture. You're +probably already at least a bit familiar with the concept of quantizers +(qscale, lower = more precision, higher quality, but more bits needed to +encode). What not everyone seems to know is that the quantizer you see +(e.g. in the 2pass logs) is just an average for the whole frame, and lower +or higher quantizers may in fact be used in parts of the picture with more +or less detail. Increasing the values of lumi_mask and dark_mask will cause +lavc to aggressively increase the quantizer in very dark or very bright +regions of the picture (which are presumably not as noticable to the human +eye) in order to save bits for use elsewhere. + +Rich + +=================== +ABOUT VLELIM, VCELIM, LUMI_MASK AND DARK_MASK PART II: VQSCALE + +OK, a quick explanation. The quantizer you set with vqscale=N is the +per-frame quantizer parameter (aka qp). However, with mpeg4 it's +allowed (and recommended!) for the encoder to vary the quantizer on a +per-macroblock (mb) basis (as I understand it, macroblocks are 16x16 +regions composed of 4 8x8 luma blocks and 2 8x8 chroma blocks, u and +v). To do this, lavc scores each mb with a complexity value and +weights the quantizer accordingly. However, you can control this +behavior somewhat with scplx_mask, tcplx_mask, dark_mask, and +lumi_mask. + +scplx_mask -- raise quantizer on mb's with lots of spacial complexity. +Spacial complexity is measured by variance of the texture (this is +just the actual image for I blocks and the difference from the +previous coded frame for P blocks). + +tcplx_mask -- raise quantizer on mb's with lots of temporal +complexity. Temporal complexity is measured according to motion +vectors. + +dark_mask -- raise quantizer on very dark mb's. + +lumi_mask -- raise quantizer on very bright mb's. +Somewhere around 0-0.15 is a safe range for these values, IMHO. You +might try as high as 0.25 or 0.3. You should probably never go over +0.5 or so. + +Now, about naq. When you adjust the quantizers on a per-mb basis like +this (called adaptive quantization), you might decrease or (more +likely) increase the average quantizer used, so that it no longer +matches the requested average quantizer (qp) for the frame. This will +result in weird things happening with the bitrate, at least from my +experience. What naq does is "normalize adaptive quantization". That +is, after the above masking parameters are applied on a per-mb basis, +the quantizers of all the blocks are rescaled so that the average +stays fixed at the desired qp. + +So, if I used vqscale=4 with naq and fairly large values for the +masking parameters, I might be likely to see lots of frames using +qscale 2,3,4,5,6,7 across different macroblocks as needed, but with +the average sticking around 4. However, I haven't actually tested such +a setup yet, so it's just speculation right now. + +Have fun playing around with it. + +Rich + +====================== +TIPS FOR ENCODING OLD BLACK & WHITE MOVIES: + +I found myself that 4:3 B&W old movies are very hard to compress well. In +addition to the 4:3 aspect ratio which eats lots of bits, those movies are +typically very "noisy", which doesn't help at all. Anyway : + +> After a few tries I am +> still a little bit disappointed with the video quality. Since it is a +> "dark" movies, there is a lot of black on the pictures, and on the +> encoded avi I can see a lot of annoying "mpeg squares". I am using +> avifile codec, but the best I think is to give you the command line I +> used to encode a preview of the result: + +> +> First pass: +> mencoder TITLE01-ANGLE1.VOB -oac copy -ovc lavc -lavcopts +> vcodec=mpeg4:vhq:vpass=1:vbitrate=800:keyint=48 -ofps 23.976 -npp lb +> -ss 2:00 -endpos 0:30 -vop scale -zoom -xy 640 -o movie.avi + +1) keyint=48 is way too low. The default value is 250, this is in *frames* +not seconds. Key frames are significantly larger than P or B frames, so the +less key frames you have, better the overall movie will be. (huh, like Yoda +I speak ;). Try keyint=300 or 350. Don't go beyond that if you want +relatively precise seeking. + +2) you may want to play with vlelim and vcelim options. This can gives you +a significant "quality" boost. Try one of these couples : + +vlelim=-2:vcelim=3 +vlelim=-3:vcelim=5 +vlelim=-4:vcelim=7 +(and yes, there's a minus) + +3) crop & rescale the movie before passing it to the codec. First crop the +movie to not encode black bars if there's any. For a 1h40mn movie +compressed to a 700 MB file, I would try something between 512x384 and +480x320. Don't go below that if you want something relatively sharp when +viewed fullscreen. + +4) I would recommend using the Ogg Vorbis audio codec with the .ogm +container format. Ogg Vorbis compress audio better than MP3. On a typical +old, mono-only audio stream, a 45 kbits/s Vorbis stream is ok. How to +extract & compress an audio stream from a ripped DVD (mplayer -dvd 1 +-dumpstream) : + +rm -f audiodump.pcm ; mkfifo -m 600 audiodump.pcm +mplayer -quiet -vc null -vo null -aid 128 -ao pcm -nowaveheader stream.dump & +oggenc --raw --raw-bits=16 --raw-chan=2 --raw-rate=48000 -q 1 -o audio-us.ogg ++audiodump.pcm & +wait + +For a nice set of utilities to manager the .ogm format, see Moritz Bunkus' +ogmtools (google is your friend). + +5) use the "v4mv" option. This could gives you a few more bits at the +expense of a slightly longer encoding. This is a "lossless" option, I mean +with this option you don't throw away some video information, it just +selects a more precise motion estimation method. Be warned that on some +very un-typical scenes this option may gives you a longer file than +without, although it's very rare and on a whole film I think it's always a +win. + +6) you can try the new luminance & darkness masking code. Play +with the "lumi_mask" and "dark_mask" options. I would recommend using +something like : +lumi_mask=0.07:dark_mask=0.10:naq: +lumi_mask=0.10:dark_mask=0.12:naq: +lumi_mask=0.12:dark_mask=0.15:naq +lumi_mask=0.13:dark_mask=0.16:naq: +Be warned that these options are really experimental and the result +could be very good or very bad depending on your visualization device +(computer CRT, TV or TFT screen). Don't push too hard these options. + +> Second pass: +> the same with vpass=2 + +7) I've found that lavc gives better results when the first pass is done +with "vqscale=2" instead of a target bitrate. The statistics collected +seems to be more precise. YMMV. + +> I am new to mencoder, so please tell me any idea you have even if it +> obvious. I also tried the "gray" option of lavc, to encode B&W only, +> but strangely it gives me "pink" squares from time to time. + +Yes, I've seen that too. Playing the resulting file with "-lavdopts gray" +fix the problem but it's not very nice ... + +> So if you could tell me what option of mencoder or lavc I should be +> looking at to lower the number of "squares" on the image, it would be +> great. The version of mencoder i use is 0.90pre8 on a macos x PPC +> platform. I guess I would have the same problem by encoding anime +> movies, where there are a lot of region of the image with the same +> color. So if you managed to solve this problem... + +You could also try the "mpeg_quant" flag. It selects a different set of +quantizers and produce somewhat sharper pictures and less blocks on large +zones with the same or similar luminance, at the expense of some bits. + +> This is completely off topic, but do you know how I can create good +> subtitles from vobsub subtitles ? I checked the -dumpmpsub option of +> mplayer, but is there a way to do it really fast (ie without having to +> play the whole movie) ? + +I didn't find a way under *nix to produce reasonably good text subtitles +from vobsubs. OCR *nix softwares seems either not suited to the task, not +powerful enough or both. I'm extracting the vobsub subtitles and simply use +them with the .ogm + +/ .avi : +1) rip the DVD to harddisk with "mplayer -dvd 1 -dumpstream" +2) mount the DVD and copy the .ifo file +2) extract all vobsubs to one single file with something like : + +for f in 0 1 2 3 4 5 6 7 8 9 10 11 ; do \ + mencoder -ovc copy -oac copy -o /dev/null -sid $f -vobsubout sous-titres ++-vobsuboutindex $f -ifo vts_01_0.ifo stream.dump +done + +(and yes, I've a DVD with 12 subtitles) +-- +Rémi + +================================ + +TIPS FOR SMOKE & CLOUDS + +Q: I'm trying to encode Dante's Peak and I'm having problems with clouds, +fog and smoke: They don't look fine (they look very bad if I watch the +movie in TVout). There are some artifacts, white clouds looks as snow +mountains, there are things likes hip in the colors so one can see frontier +curves between white and light gray and dark gray ... (I don't know if you +can understand me, I want to mean that the colors don't change smoothly) +In particular I'm using vqscale=2:vhq:v4mv + +A: Try adding "vqcomp=0.7:vqblur=0.2:mpeg_quant" to lavcopts. + +Q: I tried your suggestion and it improved the image a little ... but not +enough. I was playing with different options and I couldn't find the way. +I suppose that the vob is not so good (watching it in TV trough the +computer looks better than my encoding, but it isn't a lot of better). + +A: Yes, those scenes with qscale=2 looks terrible :-( + +Try with vqmin=1 in addition to mpeg_quant:vlelim=-4:vcelim=-7 (and maybe +with "-sws 10 -ssf ls=1" to sharpen a bit the image) and read about vqmin=1 +in DOCS/tech/libavc-options.txt. + +If after the whole movie is encoded you still see the same problem, it will +means that the second pass didn't picked-up q=1 for this scene. Force q=1 +with the "vrc_override" option. + +Q: By the way, is there a special difficult in encode clouds or smoke? + +A: I would say it depends on the sharpness of these clouds / smokes and the +fact that they are mostly black/white/grey or colored. The codec will do +the right thing with vqmin=2 for example on a cigarette smoke (sharp) or on +a red/yellow cloud (explosion, cloud of fire). But may not with a grey and +very fuzzy cloud like in the chocolat scene. Note that I don't know exactly +why ;) + +A = Rémi + +