Mercurial > mplayer.hg
view DOCS/tech/encoding-tips.txt @ 17088:f067a9de373c
mplayer.c:1928: warning: implicit declaration of function 'cache_uninit'
mplayer.c:2897: warning: implicit declaration of function 'mpcodecs_config_vo'
author | rathann |
---|---|
date | Mon, 05 Dec 2005 01:18:10 +0000 |
parents | 2de4d051dd21 |
children | 35c8d3361404 |
line wrap: on
line source
Some important URLs: ~~~~~~~~~~~~~~~~~~~~ http://www.mplayerhq.hu/~michael/codec-features.html <- lavc vs. divx5 vs. xvid http://www.ee.oulu.fi/~tuukkat/mplayer/tests/rguyom.ath.cx/ <- lavc benchmarks, options http://ffdshow.sourceforge.net/tikiwiki/tiki-view_articles.php <- lavc for win32 :) http://www.bunkus.org/dvdripping4linux/index.html <- a nice tutorial http://forum.zhentarim.net/viewtopic.php?p=237 <- lavc option comparison http://www.ee.oulu.fi/~tuukkat/mplayer/tests/readme.html <- series of benchmarks http://thread.gmane.org/gmane.comp.video.mencoder.user/1196 <- free codecs shoutout and recommended encoding settings ================================================================================ FIXING A/V SYNC WHEN ENCODING I know this is a popular topic on the list, so I thought I'd share a few comments on my experience fixing a/v sync. As everyone seems to know, mencoder unfortunately doesn't have a -delay option. But that doesn't mean you can't fix a/v sync. There are a couple ways to still do it. In example 1, we'll suppose you want to re-encode the audio anyway. This will be essential if your source audio isn't mp3, e.g. for DVD's or nasty avi files with divx/wma audio. This approach makes things much easier. Step 1: Dump the audio with mplayer -ao pcm -nowaveheader. There are various options that can be used to speed this up, most notably -vo null, -vc null, and/or -hardframedrop. -benchmark also seemed to help in the past. :) Step 2: Figure out what -delay value syncs the audio right in mplayer. If this number is positive, use a command like the following: dd if=audiodump.wav bs=1764 skip=[delay] | lame -x - out.mp3 where [delay] is replaced by your delay amount in hundredths of a second (1/10 the value you use with mplayer). Otherwise, if delay is negative, use a command like this: ( dd if=/dev/zero bs=1764 skip=[delay] ; cat audiodump.wav ) | lame -x - out.mp3 Don't include the minus (-) sign in delay. Also, keep in mind you'll have to change the 1764 number and provide additional options to lame if your audio stream isn't 44100/16bit/little-endian/stereo. Step 3: Use mencoder to remux your new mp3 file with the movie: mencoder -audiofile out.mp3 -oac copy ... You can either copy video as-is (with -ovc copy) or re-encode it at the same time you merge in the audio like this. Finally, as a variation on this method (makes things a good bit faster and doesn't use tons of temporary disk space) you can merge steps 1 and 2 by making a named pipe called "audiodump.wav" (type mkfifo audiodump.wav) and have mplayer write the audio to it at the same time you're running lame to encode. Now for example 2. This time we won't re-encode audio at all. Just dump the mp3 stream from the avi file with mplayer -dumpaudio. Then, you have to cut and paste the raw mp3 stream a bit... If delay is negative, things are easier. Just use lame to encode silence for the duration of delay, at the same samplerate and samplesize used in your avi file. Then, do something like: cat silence.mp3 stream.dump > out.mp3 mencoder -audiofile out.mp3 -oac copy ... On the other hand, if delay is positive, you'll need to crop off part of the mp3 from the beginning. If it's (at least roughly) CBR this is easy -- just take off the first (bitrate*delay/8) bytes of the file. You can use the excellent dd tool, or just your favorite binary-friendly text editor to do this. Otherwise, you'll have to experiment with cutting off different amounts. You can test with mplayer -audiofile before actually spending time remuxing/encoding with mencoder to make sure you cut the right amount. I hope this has all been informative. If anyone would like to clean this message up a bit and make it into part of the docs, feel free. Of course mencoder should eventually just get -delay. :) Rich ================================================================================ ENCODING QUALITY - OR WHY AUTOMATISM IS BAD. Hi everyone. Some days ago someone suggested adding some preset options to mencoder. At that time I replied 'don't do that', and now I decided to elaborate on that. Warning: this is rather long, and it involves mathematics. But if you don't want to bother with either then why are you encoding in the first place? Go do something different! The good news is: it's all about the bpp (bits per pixel). The bad news is: it's not THAT easy ;) This mail is about encoding a DVD to MPEG4. It's about the video quality, not (primarily) about the audio quality or some other fancy things like subtitles. The first step is to encode the audio. Why? Well if you encode the audio prior to the video you'll have to make the video fit onto one (or two) CD(s). That way you can use oggenc's quality based encoding mode which is much more sophisticated than its ABR based mode. After encoding the audio you have a certain amount of space left to fill with video. Let's assume the audio takes 60M (no problem with Vorbis), and you aim at a 700M CD. This leaves you 640M for the video. Let's further assume that the video is 100 minutes or 6000 seconds long, encoded at 25fps (those nasty NTSC fps values give me headaches. Adjust to your needs, of course!). This leaves you with a video bitrate of: $videosize * 8 $videobitrate = -------------- $length * 1000 $videosize in bytes, $length in seconds, $videobitrate in kbit/s. In my example I end up with $videobitrate = 895. And now comes the question: how do I chose my encoding parameters so that the results will be good? First let's take a look at a typical mencoder line: mencoder dvd://1 -o /dev/null -oac copy -ovc lavc \ -lavcopts vcodec=mpeg4:vbitrate=1000:vhq:vqmin=2:\ vlelim=-4:vcelim=9:lumi_mask=0.05:dark_mask=0.01:vpass=1 \ -vf crop=716:572:2:2,scale=640:480 Phew, all those parameters! Which ones should I change? NEVER leave out 'vhq'. Never ever. 'vqmin=2' is always good if you aim for sane settings - like 'normal length' movies on one CD, 'very long movies' on two CDs and so on. vcodec=mpeg4 is mandatory. The 'vlelim=-4:vcelim=9:lumi_mask=0.05:dark_mask=0.01' are parameters suggested by D Richard Felker for non-animated movies, and they improve quality a bit. But the two things that have the most influence on quality are vbitate and scale. Why? Because both together tell the codec how many bits it may spend on each frame for each bit: and this is the 'bpp' value (bits per pixel). It's simply defined as $videobitrate * 1000 $bpp = ----------------------- $width * $height * $fps I've attached a small Perl script that calculates the $bpp for a movie. You'll have to give it four parameters: a) the cropped but unscaled resolution (use '-vf cropdetect'), b) the encoded aspect ratio. All DVDs come at 720x576 but contain a flag that tells the player wether it should display the DVD at an aspect ratio of 4/3 (1.333) or at 16/9 (1.777). Have a look at mplayer's output - there's something about 'prescaling'. That's what you are looking for. c) the video bitrate in kbit/s and d) the fps. In my example the command line and calcbpp.pl's output would look like this (warning - long lines ahead): mosu@anakin:~$ ./calcbpp.pl 720x440 16/9 896 25 Prescaled picture: 1023x440, AR 2.33 720x304, diff 5, new AR 2.37, AR error 1.74% scale=720:304 bpp: 0.164 704x304, diff -1, new AR 2.32, AR error 0.50% scale=704:304 bpp: 0.167 688x288, diff 8, new AR 2.39, AR error 2.58% scale=688:288 bpp: 0.181 672x288, diff 1, new AR 2.33, AR error 0.26% scale=672:288 bpp: 0.185 656x288, diff -6, new AR 2.28, AR error 2.17% scale=656:288 bpp: 0.190 640x272, diff 3, new AR 2.35, AR error 1.09% scale=640:272 bpp: 0.206 624x272, diff -4, new AR 2.29, AR error 1.45% scale=624:272 bpp: 0.211 608x256, diff 5, new AR 2.38, AR error 2.01% scale=608:256 bpp: 0.230 592x256, diff -2, new AR 2.31, AR error 0.64% scale=592:256 bpp: 0.236 576x240, diff 8, new AR 2.40, AR error 3.03% scale=576:240 bpp: 0.259 560x240, diff 1, new AR 2.33, AR error 0.26% scale=560:240 bpp: 0.267 544x240, diff -6, new AR 2.27, AR error 2.67% scale=544:240 bpp: 0.275 528x224, diff 3, new AR 2.36, AR error 1.27% scale=528:224 bpp: 0.303 512x224, diff -4, new AR 2.29, AR error 1.82% scale=512:224 bpp: 0.312 496x208, diff 5, new AR 2.38, AR error 2.40% scale=496:208 bpp: 0.347 480x208, diff -2, new AR 2.31, AR error 0.85% scale=480:208 bpp: 0.359 464x192, diff 7, new AR 2.42, AR error 3.70% scale=464:192 bpp: 0.402 448x192, diff 1, new AR 2.33, AR error 0.26% scale=448:192 bpp: 0.417 432x192, diff -6, new AR 2.25, AR error 3.43% scale=432:192 bpp: 0.432 416x176, diff 3, new AR 2.36, AR error 1.54% scale=416:176 bpp: 0.490 400x176, diff -4, new AR 2.27, AR error 2.40% scale=400:176 bpp: 0.509 384x160, diff 5, new AR 2.40, AR error 3.03% scale=384:160 bpp: 0.583 368x160, diff -2, new AR 2.30, AR error 1.19% scale=368:160 bpp: 0.609 352x144, diff 7, new AR 2.44, AR error 4.79% scale=352:144 bpp: 0.707 336x144, diff 0, new AR 2.33, AR error 0.26% scale=336:144 bpp: 0.741 320x144, diff -6, new AR 2.22, AR error 4.73% scale=320:144 bpp: 0.778 A word for the $bpp. For a fictional movie which is only black and white: if you have a $bpp of 1 then the movie would be stored uncompressed :) For a real life movie with 24bit color depth you need compression of course. And the $bpp can be used to make the decision easier. As you can see the resolutions suggested by the script are all dividable by 16. This will make the aspect ratio slightly wrong, but no one will notice. Now if you want to decide which resolution (and scaling parameters) to chose you can do that by looking at the $bpp: < 0.10: don't do it. Please. I beg you! < 0.15: It will look bad. < 0.20: You will notice blocks, but it will look ok. < 0.25: It will look really good. > 0.25: It won't really improve visually. > 0.30: Don't do that either - try a bigger resolution instead. Of course these values are not absolutes! For movies with really lots of black areas 0.15 may look very good. Action movies with only high motion scenes on the other hand may not look perfect at 0.25. But these values give you a great idea about which resolution to chose. I see a lot of people always using 512 for the width and scaling the height accordingly. For my (real-world-)example this would be simply a waste of bandwidth. The encoder would probably not even need the full bitrate, and the resulting file would be smaller than my targetted 700M. After encoding you'll do your 'quality check'. First fire up the movie and see whether it looks good to you or not. But you can also do a more 'scientific' analysis. The second Perl script I attached counts the quantizers used for the encoding. Simply call it with countquant.pl < divx2pass.log It will print out which quantizer was used how often. If you see that e.g. the lowest quantizer (vqmin=2) gets used for > 95% of the frames then you can safely increase your picture size. > The "counting the quantesizer"-thing could improve the quality of > full automated scripts, as I understand ? Yes, the log file analysis can be used be tools to automatically adjust the scaling parameters (if you'd do that you'd end up with a three-pass encoding for the video only ;)), but it can also provide answers for you as a human. From time to time there's a question like 'hey, mencoder creates files that are too small! I specified this bitrate and the resulting file is 50megs short of the target file size!'. The reason is probably that the codec already uses the minimum quantizer for nearly all frames so it simply does not need more bits. A quick glance at the distribution of the quantizers can be enlightening. Another thing is that q=2 and q=3 look really good while the 'bigger' quantizers really lose quality. So if your distribution shows the majority of quantizers at 4 and above then you should probably decrease the resolution (you'll definitly see block artefacts). Well... Several people will probably disagree with me on certain points here, especially when it comes down to hard values (like the $bpp categories and the percentage of the quantizers used). But the idea is still valid. And that's why I think that there should NOT be presets in mencoder like the presets lame knows. 'Good quality' or 'perfect quality' are ALWAYS relative. They always depend on a person's personal preferences. If you want good quality then spend some time reading and - more important - understanding what steps are involved in video encoding. You cannot do it without mathematics. Oh well, you can, but you'll end up with movies that could certainly look better. Now please shoot me if you have any complaints ;) -- ==> Ciao, Mosu (Moritz Bunkus) =========== ANOTHER APPROACH: BITS PER BLOCK: > $videobitrate * 1000 > $bpp = ----------------------- > $width * $height * $fps Well, I came to similar equation going through different route. Only I didn't use bits per pixel, in my case it was bits per block (BPB). The block is 16x16 because lots of software depends on video width/height being divisable by 16. And because I didn't like this 0.2 bit per pixel, when bit is quite atomic ;) So the equation was something like: bitrate bpb = ----------------- fps * ((width * height) / (16 * 16)) (width and height are from destination video size, and bitrate is in bits (i.e. 900kbps is 900000)) This way it apeared that the minimum bits per block is ~40, very good results are with ~50, and everything above 60 is a waste of bandwidth. And what's actually funny is that it was independant of codec used. The results were exactly the same, whether I used DIV3 (with tricky nandub's magick), ffmpeg odivx, DivX5 on Windows or XviD. Surprisingly there is one advantage of using nandub-DIV3 for bitrate starved encoding: ringing almost never apears this way. But I also found out, that the quality/BPB isn't constant for drastically different resolutions. Smaller picture (like MPEG1 sizes) need more BPB to look good than say typical MPEG2 resolutions. Robert =========== DON'T SCALE DOWN TOO MUCH Sometimes I found that encoding to y-scaled only DVD qualty (ie 704 x 288 for a 2.85 film) gives better visual quality than a scaled-down version even if the quantizers are significantly higher than for the scaled-down version. Keep in mind that blocs, fuzzy parts and generaly mpeg artefacts in a 704x288 image will be harder to spot in full-screen mode than on a 512x208 image. In fact I've see example where the same movie looks better compressed to 704x288 with an average weighted quantizer of ~3 than the same movie scaled to 576x240 with an average weighted quantizer of 2.4. Btw, a print of the weighted average quantizer would be nice in countquant.pl :) Another point in favor of not trying to scale down too much : on hard scaled-down movies, the MPEG codec will need to compress relatively high frequencies rather than low frequencies and it doesn't like that at all. You will see less and less returns while you scale down and scale down again in desesperate need of some bandwidth :) In my experience, don't try to go below a width of 576 without closely watching what's going on. -- Rémi =========== TIPS FOR ENCODING That being said, with video you have some tradeoffs you can make. Most people seem to encode with really basic options, but if you play with single coefficient elimination and luma masking settings, you can save lots of bits, resulting in lower quantizers, which means less blockiness and less ugly noise (ringing) around sharp borders. The tradeoff, however, is that you'll get some "muddiness" in some parts of the image. Play around with the settings and see for yourself. The options I typically use for (non-animated) movies are: vlelim=-4 vcelim=9 lumi_mask=0.05 dark_mask=0.01 If things look too muddy, making the numbers closer to 0. For anime and other animation, the above recommendations may not be so good. Another option that may be useful is allowing four motion vectors per macroblock (v4mv). This will increase encoding time quite a bit, and last I checked it wasn't compatible with B frames. AFAIK, specifying v4mv should never reduce quality, but it may prevent some old junky versions of DivX from decoding it (can anyone conform?). Another issue might be increased cpu time needed for decoding (again, can anyone confirm?). To get more fair distribution of bits between low-detail and high-detail scenes, you should probably try increasing vqcomp from the default (0.5) to something in the range 0.6-0.8. Of course you also want to make sure you crop ALL of the black border and any half-black pixels at the edge of the image, and make sure the final image dimensions after cropping and scaling are multiples of 16. Failing to do so will drastically reduce quality. Finally, if you can't seem to get good results, you can try scaling the movie down a bit smaller or applying a weak gaussian blur to reduce the amount of detail. Now, my personal success story! I just recently managed to fit a beautiful encode of Kundun (well over 2 hours long, but not too many high-motion scenes) on one cd at 640x304, with 66 kbit/sec abr ogg audio, using the options I described above. So, IMHO it's definitely possible to get very good results with libavcodec (certainly MUCH better than all the idiot "release groups" using DivX3 make), as long as you take some time to play around with the options. Rich ============ ABOUT VLELIM, VCELIM, LUMI_MASK AND DARK_MASK PART I: LUMA & CHROMA The l/c in vlelim and vcelim stands for luma (brightness plane) and chroma (color planes). These are encoded separately in all mpeg-like algorithms. Anyway, the idea behind these options is (at least from what I understand) to use some good heuristics to determine when the change in a block is less than the threshold you specify, and in such a case, to just encode the block as "no change". This saves bits and perhaps speeds up encoding. Using a negative value for either one means the same thing as the corresponding positive value, but the DC coefficient is also considered. Unfortunately I'm not familiar enough with the mpeg terminology to know what this means (my first guess would be that it's the constant term from the DCT), but it probably makes the encoder less likely to apply single coefficient elimination in cases where it would look bad. It's presumably recommended to use negative values for luma (which is more noticable) and positive for chroma. The other options -- lumi_mask and dark_mask -- control how the quantizer is adjusted for really dark or bright regions of the picture. You're probably already at least a bit familiar with the concept of quantizers (qscale, lower = more precision, higher quality, but more bits needed to encode). What not everyone seems to know is that the quantizer you see (e.g. in the 2pass logs) is just an average for the whole frame, and lower or higher quantizers may in fact be used in parts of the picture with more or less detail. Increasing the values of lumi_mask and dark_mask will cause lavc to aggressively increase the quantizer in very dark or very bright regions of the picture (which are presumably not as noticable to the human eye) in order to save bits for use elsewhere. Rich =================== ABOUT VLELIM, VCELIM, LUMI_MASK AND DARK_MASK PART II: VQSCALE OK, a quick explanation. The quantizer you set with vqscale=N is the per-frame quantizer parameter (aka qp). However, with mpeg4 it's allowed (and recommended!) for the encoder to vary the quantizer on a per-macroblock (mb) basis (as I understand it, macroblocks are 16x16 regions composed of 4 8x8 luma blocks and 2 8x8 chroma blocks, u and v). To do this, lavc scores each mb with a complexity value and weights the quantizer accordingly. However, you can control this behavior somewhat with scplx_mask, tcplx_mask, dark_mask, and lumi_mask. scplx_mask -- raise quantizer on mb's with lots of spacial complexity. Spacial complexity is measured by variance of the texture (this is just the actual image for I blocks and the difference from the previous coded frame for P blocks). tcplx_mask -- raise quantizer on mb's with lots of temporal complexity. Temporal complexity is measured according to motion vectors. dark_mask -- raise quantizer on very dark mb's. lumi_mask -- raise quantizer on very bright mb's. Somewhere around 0-0.15 is a safe range for these values, IMHO. You might try as high as 0.25 or 0.3. You should probably never go over 0.5 or so. Now, about naq. When you adjust the quantizers on a per-mb basis like this (called adaptive quantization), you might decrease or (more likely) increase the average quantizer used, so that it no longer matches the requested average quantizer (qp) for the frame. This will result in weird things happening with the bitrate, at least from my experience. What naq does is "normalize adaptive quantization". That is, after the above masking parameters are applied on a per-mb basis, the quantizers of all the blocks are rescaled so that the average stays fixed at the desired qp. So, if I used vqscale=4 with naq and fairly large values for the masking parameters, I might be likely to see lots of frames using qscale 2,3,4,5,6,7 across different macroblocks as needed, but with the average sticking around 4. However, I haven't actually tested such a setup yet, so it's just speculation right now. Have fun playing around with it. Rich ================================================================================ TIPS FOR ENCODING OLD BLACK & WHITE MOVIES: I found myself that 4:3 B&W old movies are very hard to compress well. In addition to the 4:3 aspect ratio which eats lots of bits, those movies are typically very "noisy", which doesn't help at all. Anyway : > After a few tries I am > still a little bit disappointed with the video quality. Since it is a > "dark" movies, there is a lot of black on the pictures, and on the > encoded avi I can see a lot of annoying "mpeg squares". I am using > avifile codec, but the best I think is to give you the command line I > used to encode a preview of the result: > > First pass: > mencoder TITLE01-ANGLE1.VOB -oac copy -ovc lavc -lavcopts > vcodec=mpeg4:vhq:vpass=1:vbitrate=800:keyint=48 -ofps 23.976 -npp lb > -ss 2:00 -endpos 0:30 -vf scale -zoom -xy 640 -o movie.avi 1) keyint=48 is way too low. The default value is 250, this is in *frames* not seconds. Keyframes are significantly larger than P or B frames, so the less keyframes you have, better the overall movie will be. (huh, like Yoda I speak ;). Try keyint=300 or 350. Don't go beyond that if you want relatively precise seeking. 2) you may want to play with vlelim and vcelim options. This can gives you a significant "quality" boost. Try one of these couples : vlelim=-2:vcelim=3 vlelim=-3:vcelim=5 vlelim=-4:vcelim=7 (and yes, there's a minus) 3) crop & rescale the movie before passing it to the codec. First crop the movie to not encode black bars if there's any. For a 1h40mn movie compressed to a 700 MB file, I would try something between 512x384 and 480x320. Don't go below that if you want something relatively sharp when viewed fullscreen. 4) I would recommend using the Ogg Vorbis audio codec with the .ogm container format. Ogg Vorbis compress audio better than MP3. On a typical old, mono-only audio stream, a 45 kbits/s Vorbis stream is ok. How to extract & compress an audio stream from a ripped DVD (mplayer dvd://1 -dumpstream) : rm -f audiodump.pcm ; mkfifo -m 600 audiodump.pcm mplayer -quiet -vc null -vo null -aid 128 -ao pcm -nowaveheader stream.dump & oggenc --raw --raw-bits=16 --raw-chan=2 --raw-rate=48000 -q 1 -o audio-us.ogg +audiodump.pcm & wait For a nice set of utilities to manager the .ogm format, see Moritz Bunkus' ogmtools (google is your friend). 5) use the "v4mv" option. This could gives you a few more bits at the expense of a slightly longer encoding. This is a "lossless" option, I mean with this option you don't throw away some video information, it just selects a more precise motion estimation method. Be warned that on some very un-typical scenes this option may gives you a longer file than without, although it's very rare and on a whole film I think it's always a win. 6) you can try the new luminance & darkness masking code. Play with the "lumi_mask" and "dark_mask" options. I would recommend using something like : lumi_mask=0.07:dark_mask=0.10:naq: lumi_mask=0.10:dark_mask=0.12:naq: lumi_mask=0.12:dark_mask=0.15:naq lumi_mask=0.13:dark_mask=0.16:naq: Be warned that these options are really experimental and the result could be very good or very bad depending on your visualization device (computer CRT, TV or TFT screen). Don't push too hard these options. > Second pass: > the same with vpass=2 7) I've found that lavc gives better results when the first pass is done with "vqscale=2" instead of a target bitrate. The statistics collected seems to be more precise. YMMV. > I am new to mencoder, so please tell me any idea you have even if it > obvious. I also tried the "gray" option of lavc, to encode B&W only, > but strangely it gives me "pink" squares from time to time. Yes, I've seen that too. Playing the resulting file with "-lavdopts gray" fix the problem but it's not very nice ... > So if you could tell me what option of mencoder or lavc I should be > looking at to lower the number of "squares" on the image, it would be > great. The version of mencoder i use is 0.90pre8 on a macos x PPC > platform. I guess I would have the same problem by encoding anime > movies, where there are a lot of region of the image with the same > color. So if you managed to solve this problem... You could also try the "mpeg_quant" flag. It selects a different set of quantizers and produce somewhat sharper pictures and less blocks on large zones with the same or similar luminance, at the expense of some bits. > This is completely off topic, but do you know how I can create good > subtitles from vobsub subtitles ? I checked the -dumpmpsub option of > mplayer, but is there a way to do it really fast (ie without having to > play the whole movie) ? I didn't find a way under *nix to produce reasonably good text subtitles from vobsubs. OCR *nix softwares seems either not suited to the task, not powerful enough or both. I'm extracting the vobsub subtitles and simply use them with the .ogm / .avi : 1) rip the DVD to harddisk with "mplayer dvd://1 -dumpstream" 2) mount the DVD and copy the .ifo file 2) extract all vobsubs to one single file with something like : for f in 0 1 2 3 4 5 6 7 8 9 10 11 ; do \ mencoder -ovc copy -oac copy -o /dev/null -sid $f -vobsubout sous-titres +-vobsuboutindex $f -ifo vts_01_0.ifo stream.dump done (and yes, I've a DVD with 12 subtitles) -- Rémi ================================================================================ TIPS FOR SMOKE & CLOUDS Q: I'm trying to encode Dante's Peak and I'm having problems with clouds, fog and smoke: They don't look fine (they look very bad if I watch the movie in TVout). There are some artifacts, white clouds looks as snow mountains, there are things likes hip in the colors so one can see frontier curves between white and light gray and dark gray ... (I don't know if you can understand me, I want to mean that the colors don't change smoothly) In particular I'm using vqscale=2:vhq:v4mv A: Try adding "vqcomp=0.7:vqblur=0.2:mpeg_quant" to lavcopts. Q: I tried your suggestion and it improved the image a little ... but not enough. I was playing with different options and I couldn't find the way. I suppose that the vob is not so good (watching it in TV trough the computer looks better than my encoding, but it isn't a lot of better). A: Yes, those scenes with qscale=2 looks terrible :-( Try with vqmin=1 in addition to mpeg_quant:vlelim=-4:vcelim=-7 (and maybe with "-sws 10 -ssf ls=1" to sharpen a bit the image) and read about vqmin=1 in DOCS/tech/libavc-options.txt. If after the whole movie is encoded you still see the same problem, it will means that the second pass didn't picked-up q=1 for this scene. Force q=1 with the "vrc_override" option. Q: By the way, is there a special difficult in encode clouds or smoke? A: I would say it depends on the sharpness of these clouds / smokes and the fact that they are mostly black/white/grey or colored. The codec will do the right thing with vqmin=2 for example on a cigarette smoke (sharp) or on a red/yellow cloud (explosion, cloud of fire). But may not with a grey and very fuzzy cloud like in the chocolat scene. Note that I don't know exactly why ;) A = Rémi ================================================================================ TIPS FOR TWEAKING RATECONTROL (For the purpose of this explanation, consider "2nd pass" to be any beyond the 1st. The algorithm is run only on P-frames; I- and B-frames use QPs based on the adjacent P. While x264's 2pass ratecontrol is based on lavc's, it has diverged somewhat and not all of this is valid for x264.) Consider the default ratecontrol equation in lavc: "tex^qComp". At the beginning of the 2nd pass, rc_eq is evaluated for each frame, and the result is the number of bits allocated to that frame (multiplied by some constant as needed to match the total requested bitrate). "tex" is the complexity of a frame, i.e. the estimated number of bits it would take to encode at a given quantizer. (If the 1st pass was CQP and not turbo, then we know tex exactly. Otherwise it is calculated by multiplying the 1st pass's bits by the QP of that frame. But that's not why CQP is potentially good; more on that later.) "qComp" is just a constant. It has no effect outside the rc_eq, and is directly set by the vqcomp parameter. If vqcomp=1, then rc_eq=tex^1=tex, so 2pass allocates to each frame the number of bits needed to encode them all at the same QP. If vqcomp=0, then rc_eq=tex^0=1, so 2pass allocates the same number of bits to each frame, i.e. CBR. (Actually, this is worse than 1pass CBR in terms of quality; CBR can vary within its allowed buffer size, while vqcomp=0 tries to make each frame exactly the same size.) If vqcomp=0.5, then rc_eq=sqrt(tex), so the allocation is somewhere between CBR and CQP. High complexity frames get somewhat lower quality than low complexity, but still more bits. While the actual selection of a good value of vqcomp is experimental, the following underlying factors determine the result: Arguing towards CQP: You want the movie to be somewhere approaching constant quality; oscillating quality is even more annoying than constant low quality. (However, constant quality does not mean constant PSNR nor constant QP. Details are less noticeable in high-motion scenes, so you can get away with somewhat higher QP in high-complexity frames for the same perceived quality.) Arguing towards CBR: You get more quality per bit if you spend those bits in frames where motion compensation works well (which tends to be correlated with "tex"): A given artifact may stick around several seconds in a low-motion scene, and you only have to fix it in one frame to improve the quality of the whole sequence. Now for why the 1st pass ratecontrol method matters: The number of bits at constant quant is as good a measure of complexity as any other simple formula for the purpose of allocating bits. But it's not perfect for predicting which QP will produce the desired number of bits. Bits are approximately inversely proportional to QP, but the exact scaling is non-linear, and depends on the amount of detail/noise, the complexity of motion, the quality of previous frames, and other stuff not measured by the ratecontrol. So it predicts best when the QP used for a given frame in the 2nd pass is close to the QP used in the 1st pass. When the prediction is wrong, lavc needs to distribute the surplus or deficit of bits among future frames, which means that they too deviate from the planned distribution. Obviously, with vqcomp=1 you can get the 1st pass QPs very close by using CQP, and with vqcomp=0 a CBR 1st pass is very close. But with vqcomp=0.5 it's more ambiguous. The accepted wisdom is that CBR is better for vqcomp=0.5, mostly because you then don't have to guess a QP in advance. But with vqcomp=0.6 or 0.7, the 2nd pass QPs vary less, so a CQP 1st pass (with the right QP) will be a better predictor than CBR. To make it more confusing, 1pass CBR uses the same rc_eq with a different meaning. In CBR, we don't have a real encode to estimate from, so "tex" is calculated from the full-pixel precision motion-compensation residual. While the number of bits allocated to a given frame is decided by the rc_eq just like in 2nd pass, the target bitrate is constant (instead of being the sum of per-frame rc_eq values). So the scaling factor (which is constant in 2nd pass) now varies in order to keep the local average bitrate near the CBR target. So vqcomp does affect CBR, though it only determines the local allocation of bits, not the long-term allocation. --Loren Merritt