# HG changeset patch
# User iive
# Date 1127031056 0
# Node ID e67e1c3069f971c5ee29f6310b1b40a8ce83d548
# Parent  3a24be1b0a600ed0b4e47c392ab12dda089dc716
remove very obsolate draft...

diff -r 3a24be1b0a60 -r e67e1c3069f9 DOCS/tech/libvo2.txt
--- a/DOCS/tech/libvo2.txt	Sat Sep 17 22:32:35 2005 +0000
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,583 +0,0 @@

*******************************************************************************
*******************************************************************************
 WARNING: THIS FILE IS OBSOLETE, SEE libvo.txt INSTEAD
*******************************************************************************
*******************************************************************************

============================================================

NOTE: the 'libvo2 from scratch' plan was abandoned, we're changing libvo1 now.

So, this draft is ONLY A DRAFT, see libvo.txt for the current code docs!

============================================================

//First announce by Ivan Kalvachev
//Some explanations by Arpi & Pontscho

If you have any suggestions related to the subjects in this document, you can
send them to the MPlayer developer or advanced users mailing lists. If you
are a developer and have CVS access, do not delete parts of this document,
but feel free to add paragraphs signed with your name.
Be warned that the text may be changed or modified and that your name may be
moved to the top of the document.

1. libvo2 drivers
1.1 functions
Currently these functions are implemented:
  init
  control
  start
  stop
  get_surface
  update_surface - renamed draw
  show_surface - renamed flip_page
  query
  hw_decode
  subpicture

Here is a detailed description of the functions:

  init - initialization. It is called once on MPlayer start.

  control - a message-oriented interface for controlling the libvo2 driver.

  start - sets the given mode and displays it on the screen.

  stop - closes the libvo2 driver; after stop we may call start again.

  query - the negotiation is more complex than just finding which imgfmt the
    device can show; we must have a list of capabilities, etc.
    This function will have at least 3 modes:
    a) return a list with descriptions of the available modes.
    b) check whether we can use a mode with given parameters. E.g. if we want
       RGB32 with 3 surfaces for a window image of 800x600, we may run out of
       video memory. We don't want an error, because this mode could still be
       used with 2 surfaces.
    c) return the supported subpicture formats, if any.
   +d) the functionality supported by hw_decode.

As you may see, I have removed some functionality from control() and made it
a separate function. Why? It is generally a good thing for functions that are
critical to the driver to have their own implementation.

  get_surface - this function gives us surfaces to write to. In most cases
    this is video memory, but it may also be system RAM with some special
    meaning (AGP memory, X shared memory, GL textures...).

  update_surface - as in the note above, this is the draw function. Why did I
    change its name? I have 2 reasons: first, I don't want an implementation
    like vo1; second, it really must update the video surface, i.e. directly
    call the system function that does it. This function should work only
    with slices; the size of a slice should not be limited and should be
    passed (e.g. ystart, yend). If we want a draw function, we will call one
    from the libvo2 core that calls this one with ystart=0; yend=Ymax;.
    Also, some system screen update functions wait for vertical retrace
    before returning, and other functions just can't handle partial updates.
    In such cases we should inform the libvo2 core that the device cannot do
    slices; the core must then take care of the additional buffering, and
    update_surface becomes a usual draw function.
    When update_surface() is used in combination with get_surface(), THE ONLY
    VALID POINTERS ARE THOSE RETURNED BY get_surface(). Watch out with
    cropping.
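
To make the split between core and driver concrete, here is a minimal sketch
of how the core's full-frame draw could sit on top of a slice-only
update_surface(). All names and signatures are hypothetical, invented for
illustration:

    /* Hypothetical sketch only; nothing here is a finalized API. */
    #define SLICE_H 8   /* the usual slice height, see section 2.3.3 */

    struct vo2_driver {
        int (*update_surface)(void *surf, int ystart, int yend);
        /* ... init, control, start, stop, get_surface,
           show_surface, query, hw_decode, subpicture ... */
    };

    /* The full-frame draw the core would offer, built from slices. */
    static int vo2_draw_frame(struct vo2_driver *drv, void *surf, int height)
    {
        int y;
        for (y = 0; y < height; y += SLICE_H) {
            int yend = (y + SLICE_H < height) ? (y + SLICE_H) : height;
            if (drv->update_surface(surf, y, yend) < 0)
                return -1;  /* device refused the partial update */
        }
        return 0;
    }
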
  show_surface - this function is always called on frame change. It is used
    to show the given surface on the screen.
    If there is only one surface, then it is always visible and this function
    does nothing.

  hw_decode - to make all DVB, DXR3, TV etc. developers happy: this function
    is for you. Be careful, don't monopolize it; think of the future too.
    This function should also be able to control hardware IDCT and MC, which
    one day will be supported under Linux as well. Be careful :)

  subpicture - this function will place subtitles. It must be called once to
    place them and once to remove them; it should not be called on every
    frame, the driver will take care of that. Currently I propose this
    implementation: we get an array of bitmaps. Each one has its own starting
    x, y and its own height and width; each one (or all together) could be in
    a specific imgfmt (spfmt). THE BITMAPS SHOULD NOT OVERLAP! This may not
    be a hardware limitation, but software subtitles may get confused if they
    work as a 'c' filter (see my libvo2 core). Anyway, so far I don't know of
    hardware that has such a limitation, but it is safer this way (and
    faster, I think). It is generally good to merge small bitmaps (like
    characters) into larger ones and to make all subtitles one bitmap (or one
    bitmap per subtitle line). There will also be one bitmap for each OSD
    element: time and the seek/brightness/contrast/volume bars.

1.2 control()
OK, here is a list of some control()s that I think could be useful:
  SET_ASPECT
  SET_SCALE_X, SET_SIZE_X
  SET_SCALE_Y, SET_SIZE_Y
  RESET_SIZE
  GET/SET_POSITION_X
  GET/SET_POSITION_Y
  GET/SET_RESOLUTION
  GET/SET_DISPLAY
  GET/SET_ATTRIBUTES
+ GET/SET_WIN_DECORATION

Here is a description of how these controls are to be used:

  SET_ASPECT - this is the movie/video aspect. Why not calculate it in a
    different place (mplayer.c) and pass the result to the driver via
    set_size_x/y? First, this only applies if the hardware can scale.
    Second, we may need this value if we have a TV and won't calculate any
    new height and width.

  SET_SCALE_X/Y - this is to enlarge/downscale the image. It WILL NOT
    override SET_ASPECT; they have a cumulative effect. This could be used
    for deinterlacing (HALF SIZE). Also, if we want to zoom to 200% we don't
    want to lose the aspect calculations. Or would it be better for
    SET_SCALE to work on the current size?

  SET_SIZE_X/Y - this is for custom enlarging, to save some scale
    calculations and for more precise results.

  RESET_SIZE - sets the original size of the image; we must call SET_ASPECT
    again.

  GET/SET_POSITION_X/Y - this is for windowed modes only, to allow custom
    moving of the window.

  GET/SET_RESOLUTION - change resolution and/or bpp if possible. To be used
    for changing the desktop resolution or the resolution of the current
    fullscreen mode (NOT TO SET IT, just to change it if we don't like it).

  GET/SET_DISPLAY - mainly for X11 and remote displays. Not very useful, but
    may be handy.

  GET/SET_ATTRIBUTES - Xv overlays have contrast, brightness, hue,
    saturation etc.; these and others could be controlled by this. If we
    want to query them we must call GET_*, and then check whether our
    attribute is among them (Xv developers, be careful: 2 or 3 of the
    default attributes are sometimes not reported by the query, but can
    still be set).
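
As an illustration of the message-oriented style (the constant names are
mine, nothing was finalized), control() could be a single entry point taking
a message and an untyped argument:

    /* Hypothetical sketch of the control() messages listed above. */
    enum vo2_ctrl {
        VO2_SET_ASPECT,
        VO2_SET_SCALE_X, VO2_SET_SIZE_X,
        VO2_SET_SCALE_Y, VO2_SET_SIZE_Y,
        VO2_RESET_SIZE,
        VO2_GET_POSITION_X, VO2_SET_POSITION_X,
        VO2_GET_POSITION_Y, VO2_SET_POSITION_Y,
        VO2_GET_RESOLUTION, VO2_SET_RESOLUTION,
        VO2_GET_DISPLAY,    VO2_SET_DISPLAY,
        VO2_GET_ATTRIBUTES, VO2_SET_ATTRIBUTES,
        VO2_GET_WIN_DECORATION, VO2_SET_WIN_DECORATION
    };

    /* One message, one untyped argument, driver-defined meaning. */
    int vo2_control(enum vo2_ctrl cmd, void *arg);
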
Do you think that TV encoding (NTSC, PAL, SECAM) should have its own
attribute?
I would like to hear from the GUI developers. Can we separate mouse/keyboard
handling from the driver? What info do you need to do it? Don't forget that
SDL has its own keyboard/mouse interface. Maybe we should allow the video
driver to change the libin driver?


Arpi wrote:
I've asked Pontscho (he doesn't understand English well...).
There are 2 options for the GUI <-> mplayer interface.

The current, ugly (IMHO) way:
The GUI has control of the video window; it handles resizing, moving, key
events etc. All window manipulation in the libvo drivers is disabled when
the GUI is enabled. This was required because libvo isn't initialized and
running when the GUI already displays the video window.

The wanted way:
The GUI shouldn't control the X window directly; it should use libvo2
control calls to resize/move/etc. it. But there is a big problem: X cannot
be opened twice from one process. It means the GUI and libvo2 should share
the X connection. And, as the GUI runs first (libvo2 is only started when a
file is selected etc.), it should connect to X and later pass the connection
to libvo2. It needs an extra control() call and some extra code in
mplayer.c.

But this way the GUI could work with non-X stuff too, like SDL, fbdev (on a
second head for TV-out etc.), hardware decoders (DVB, DXR3) etc.

As X is so special, libvo2 should have a core function to open/get an X
connection, and it should be used by all X-based drivers and the GUI.

Also, the GUI needs functions to get mouse and keyboard events, and to
enable/disable window decorations (title, border).

We need a fullscreen switch control function too.

> Maybe we should allow the video driver to change the libin driver?
Forget libin. Most input stuff is handled by the libvo drivers.
Think of all the X stuff (x11, xv, dga, xmga, gl), SDL, aalib, libcaca,
svgalib. Only a few transparent drivers (fbdev, mga, tdfxfb, vesa) do not
handle input, but all of them run on the console (and maybe on a second
head) at fullscreen, so they may not need mouse events. Console keyboard
events are already caught and handled by getch2.

I can't see any sense in writing libin.

mplayer.c should _handle_ all input events, collected from the lirc
interface, getch2, libvo2 etc., and it should set update flags for the GUI
and OSD.

But we should share some plugin code, for example the *_vid code and all the
common X code. It can be done either by implementing it in the libvo2 core
(and calling it from the plugins) or by including these files from all
drivers which need them. The latter method is a bit cleaner (from the
viewpoint of core-plugin independency) but results in bigger binaries...


Btw. when we finish we will have libin, but it will be spread around
mplayer. I agree that libin could be built into the libvo2 driver, but there
has to be a standard way to send commands to mplayer itself.


1.3. query()

Here are some attributes for the queried modes; each supported mode should
have such a description. It is even possible to have more than one mode that
can display a given imgfmt. I think that we have to separate window from
fullscreen modes and to have a yv12 window mode and a yv12 fullscreen mode.
We also need a naming scheme, in order to have *.conf control over the modes
- to disable buggy modes, to limit surfaces (buggy ones), to manually
disable slices etc. The naming should not change from one computer to
another and has to be flexible.
{
  IMGFMT - image format (RGB, YV12, etc...)

  Height - the height of the fullscreen mode or the maximum height of the
    window mode

  Width - the width of the fullscreen mode or the maximum width of the
    window mode
}
{
  Scale y/n - hardware scaling; do you think that we must have one for x and
    one for y (Windows does)?

  Fullscreen y/n - whether the supported mode is fullscreen; if we have yv12
    for fullscreen and window, we must treat them as separate modes.

  Window y/n - the mode will show the image in a window. Could be removed,
    as it is mutually exclusive with Fullscreen.

  GetSurface y/n - if the driver can give us a video surface, we'll use
    get_surface().

  UpdateSurface y/n - if the driver updates the video surface through a
    system function (X, SDL).

  HWdecode y/n - if the driver can take advantage of hw_decode().

  MaxSurfaces 1..n - theoretical maximum of surfaces.

  SubPicture y/n - can we put subpictures (OSD) of any kind in hardware?

  WriteCombine y/n - if GetSurface==yes: most (or all) PCI & AGP cards are
    extremely slow on byte access; this is a hint to the vo2 core which
    surfaces are affected by write combining. Some surfaces are in system
    memory (X shm, OpenGL textures). This is only a hint.

  us_clip y/n - if UpdateSurface==yes, this shows whether update_surface()
    can remove strides (when stride > width); this is also used for
    cropping. If not, we must do it ourselves.

  us_slice y/n - if UpdateSurface==yes, this shows that update_surface() can
    draw slices and that it won't wait for vertical retrace after updating
    the surface, so we can update the surface slice by slice.
    If us_slice==n, we will have to accumulate all slices in some buffer.

  us_upsidedown y/n - if UpdateSurface==yes, this shows that
    update_surface() can flip the image vertically. In some cases this can
    be combined with us_clip /stride tricks/.

  switch_resolution y/n - if window==y, this shows whether we can switch the
    desktop resolution; if fullscreen==y, it shows that we can change the
    resolution after we have set the fullscreen mode.

  deinterlace y/n - indicates that the device can deinterlace on its own
    (Radeon, TV-out).
}
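
Expressed as a C record (a sketch; the field names are invented, and a real
implementation would pick its own layout), one queried mode could look like
this:

    /* Hypothetical per-mode capability record returned by query(). */
    struct vo2_mode {
        unsigned int imgfmt;       /* IMGFMT_RGB32, IMGFMT_YV12, ... */
        int width, height;         /* fullscreen size or window maximum */
        int max_surfaces;          /* theoretical maximum, >= 1 */
        unsigned scale:1;          /* hardware scaling */
        unsigned fullscreen:1;
        unsigned get_surface:1;    /* driver can hand out video surfaces */
        unsigned update_surface:1; /* driver draws via a system function */
        unsigned hw_decode:1;
        unsigned subpicture:1;     /* hardware OSD/subpicture */
        unsigned write_combine:1;  /* surfaces slow on byte access */
        unsigned us_clip:1;        /* update_surface() can remove strides */
        unsigned us_slice:1;       /* update_surface() can draw slices */
        unsigned us_upsidedown:1;  /* update_surface() can flip vertically */
        unsigned switch_resolution:1;
        unsigned deinterlace:1;
    };
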
1.4 conclusion

As you see, I have removed all additional buffering from the driver. There
is a lot of functionality that should be checked and handled by the libvo2
core. If some functionality is not supported, the libvo2 core should add
filters that provide it in software.

Some of the parameters should be overridable by user config, mainly to
disable buggy modes or parameters. I believe that this should not be done on
the command line, as there are enough commands now.

I await comments and ideas.
//--------------------------------------------------------------------------
2. libvo2 core
2.1 functions
Now these functions are implemented:
  init
  new
  start
  query_format
  close

and as draw.c:
  choose_buffering
  draw_slice_start
  draw_slice
  draw_frame
  flip

init() is called at mplayer start. Internal initialization.
new() -> rename to open_drv() or something like that.
query_format -> not usable in this form; this function means that all
  negotiation will be performed outside libvo2. Replace it or find a better
  name.
close -> open/close :)

choose_buffering - all buffering must stay hidden. The only exception is for
  hw_decode. In the new implementation this function is not usable; it will
  be replaced with some kind of negotiation.

draw_slice_start, draw_slice -> if you like it this way, then it's OK. But I
  think that a draw_slice_done could help.

draw_frame -> classic draw function.
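
Assuming the suggested draw_slice_done() were adopted, the slice path
through the core could look like this (a sketch with invented names and
signatures; nothing here was agreed on):

    /* Hypothetical frame rendering through the slice interface. */
    struct vo2_ctx;   /* opaque core context, invented for this sketch */

    void vo2_draw_slice_start(struct vo2_ctx *ctx);
    void vo2_draw_slice(struct vo2_ctx *ctx, unsigned char *planes[3],
                        int strides[3], int y, int h);
    void vo2_draw_slice_done(struct vo2_ctx *ctx);
    void vo2_flip(struct vo2_ctx *ctx);

    static void render_frame(struct vo2_ctx *ctx, unsigned char *planes[3],
                             int strides[3], int height, int slice_h)
    {
        int y;
        vo2_draw_slice_start(ctx);             /* a new frame begins      */
        for (y = 0; y < height; y += slice_h)
            vo2_draw_slice(ctx, planes, strides, y, slice_h);
        vo2_draw_slice_done(ctx);              /* all slices delivered    */
        vo2_flip(ctx);                         /* show the finished frame */
    }
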
2.2 Minimal buffering

I should say that I stand behind the idea that all buffering,
postprocessing, format conversion, software drawing of subtitles, etc. be
done in the libvo2 core. Why? First, this is the only way we can fully
control buffering and decrease it to a minimum. Fewer buffers mean less
copying. In some cases this can have the opposite effect (look at direct
rendering).

The first step of the analysis is to find out what we need:

DECODER
  num_out_buffers={1/2/3/...}
  {
    buffer_type:{fixed/static/movable}
    read_only:{yes/no}
  } * (num_out_buffers)
  slice:{not/supported}

FILTER 1..x
  processing:{ c-copy(buff1,buff2), p-process(buff1) },
  slice:{not/supported}
  write_combine:{not/safe},
  runtime_remove:{static/dynamic}

VIDEO_OUT
  method:{get_surface,update_surface},
  slice:{not/supported},
  write_combine:{not/safe},
  clip:{can/not},
  upsidedown:{can/not},
  surfaces:{1/2/3,..,n}


I use one-letter codes for the types of filters; you can find them in the
filter section. Details:

DECODER - we always get a buffer from the decoder. Some decoders can give a
  pointer to their internal buffers; others take pointers to buffers where
  they should store the final image. Some decoders can call draw_slice after
  they have finished with some portion of the image.

  num_out_buffers - number of output buffers. Each one can have its own
    parameters. In the usual case there will be only one buffer. Some
    decoders may have 2 internal buffers, like odivx, or, like mpeg12, 3
    buffers of different types (2 static and 1 temp).

  buffer_type -
    - fixed - we have no control over where the buffer is. We can just take
      a pointer to it. No direct rendering is possible.
    - static - we can set this buffer, but then we can't change its
      position.
    - movable - we can set this buffer to any location at any time.

  read_only - the data in this buffer will be used in the future, so we must
    not write in there or we'll corrupt the video. If we have any 'p' kind
    of filter, we'll make a copy.

  slice - this flag shows that the decoder knows about and wants to work
    with slices.

FILTER - postprocessing, software drawing of subtitles, format conversion,
  crop, external filters.

  slice - whether this filter can work in slice order. We can use slices
    even when the decoder does not support them; we just need 2 or more
    filters that do. This can give us a remarkable speed boost.

  processing - some filters copy the image from one buffer to another; I
    call them 'c'. Convert and crop (stride copy) are good examples, but
    don't forget the simple 1:1 copy. Other filters process only part of the
    image and can reuse the given buffer, e.g. putting subtitles; I call
    them 'p'. Other filters can work in one buffer, but can also work with
    2; I call them class 't'. After the analysis they will resolve to 'c' or
    'p'.

  runtime_remove - postprocessing with autoq. Subtitles appear and
    disappear; should we copy the image from one buffer to another if there
    is no processing at all?

//clip, crop, upsidedown - all 'c' filters must support strides, and should
  be able to remove them and to do some tricks like crop and upside_down.

VIDEO_OUT - take a look at the libvo2 driver I propose.
  method - 'S' if we get a surface; 'd' if we use draw* (update_surface).
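
The same analysis records, written down as C data (all field and enumerator
names are invented for illustration):

    /* Hypothetical capability records for the chain analysis. */
    enum buf_type   { BUF_FIXED, BUF_STATIC, BUF_MOVABLE };
    enum filt_class { FILT_C, FILT_P, FILT_T }; /* copy / in-place / either */

    struct dec_caps {
        int num_out_buffers;                  /* 1, 2, 3, ... */
        struct {
            enum buf_type type;
            int read_only;
        } buf[4];                             /* one entry per buffer */
        int slice;                            /* can deliver slices */
    };

    struct filt_caps {
        enum filt_class processing;
        int slice;                            /* works in slice order */
        int write_combine_safe;
        int runtime_remove;                   /* may drop out (autoq) */
    };

    struct vo_caps {
        int get_surface, update_surface;      /* method: 'S' / 'd' */
        int slice, write_combine_safe;
        int can_clip, can_upsidedown;
        int surfaces;                         /* 1..n */
    };
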
As you may see, hw_decode doesn't have complicated buffering :)

I do the analysis this way: first I put the decoder buffer, then I put all
filters that may be needed, and finally I put the video out method. Then I
add temp buffers where needed. This is simple enough to be done at runtime.

2.5 Various
2.5.1 clip & crop - we have x1,y1 showing how much to remove from the
beginning and x2,y2 how much to remove from the end:
  plane += (x1*sizeof(pixel)) + (y1*stride); // point to 1st visible pixel
  height -= y1+y2;
  width  -= x1+x2;
Isn't it simple? No copying, just changing a few variables. In order to get
a normal plane again, we just need to copy it to a frame where
stride==width.

2.5.2 flip, upsidedown - in Windows this is indicated by a negative height;
here in MPlayer we may use a negative stride, so we must make sure that
filters and drivers can handle a negative stride:
  plane  += (height-1)*stride; // point to the first pixel of the last line
  stride  = -stride;           // make stride point to the previous line
This one is very simple too, and I hope it can work with all known image
formats.

  Be careful: some modes may pack 2 pixels in 1 byte!
  Other modes (YUYV) require x1 to be a multiple of 2.

  The stride is always in bytes, while width & height are in pixels.

2.5.3 PostProcessing
Arpi was afraid that postprocessing needs more internal data to work. I
think that the quantization table should be passed as an additional plane.
How should this be done? When using the Frame structure there is qbase,
which should point to the quantization table. The only problem is that the
table usually has a fixed size. I expect recommendations on how to implement
this properly. Should we crop it? Or add qstride, qheight, qwidth? Or mark
the size of the macroblocks and calculate the table size from the image
size? Currently pp works with fixed 8x8 blocks. There may also be a problem
with interlaced images.
/ for Frame look at 2.3.4 /
I recommend splitting postprocessing into its original filters and adding
the ability to use them separately.

2.3. Rules for minimal buffering
2.3.1 Direct rendering.
Direct rendering means that the decoder will use a video surface as its
output buffer. Most decoders have internal buffers and on request copy the
ready image from one of them to a given location. As we can't get a pointer
to the internal buffer, the fastest way is to give a video surface as the
output buffer and the decoder will draw it for us. This is safe, as most
copy routines are optimized for double-word aligned access.
If we can get the internal buffer, we can copy the image on our own. This is
not direct rendering, but it gets the same speed. In fact, that's why
-vc odivx is faster than -vc divx4 while they use the same divx4linux
library.
Sometimes it's possible to set a video surface as the internal buffer, but
in most cases the decoding process is byte-oriented and many unaligned
accesses are performed. Moreover, reading from video memory on some cards is
extremely slow, about 4 times and more (and this is without setting MTRR),
but some decoders could still take advantage of this. In the best case
(reads served from the cache and writes combined) we'll watch a DivX movie
at the same speed at which DivX4 is skipping frames.

What do we need for direct rendering?
1. We should be able to get video surfaces.
2. The decoder should have at least one buffer with buffer_type != fixed.
3. If we have a 'c' filter, we cannot use direct rendering. If we have a 'p'
   filter, we may allow it.
4. If the decoder has one static buffer, then we are limited to 1 video
   surface. In this case we may see how the frame is rendered (ugly refresh
   in the best case).
5. Each static buffer and each read_only buffer needs its own video surface.
   If we don't have enough... well, we could do some tricks, but it is too
   complicated //using direct rendering for the first ones in the list while
   the rest use memory buffering; and we must have (1 or 2) free video
   surfaces for the rest of the decoder buffers//
6. A normal (buffer_type=movable, read_only=no) buffer can be redirected to
   any available video surface.
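
The rules above could be checked mechanically. A sketch, reusing the
hypothetical capability records from section 2.2:

    /* Sketch of the direct-rendering precondition (rules 1-3); rules 4-6
       then constrain how surfaces are assigned to the buffers. */
    static int can_direct_render(const struct dec_caps *dec,
                                 const struct filt_caps *filt, int nfilt,
                                 const struct vo_caps *vo)
    {
        int i, non_fixed = 0;

        if (!vo->get_surface)                     /* rule 1 */
            return 0;
        for (i = 0; i < dec->num_out_buffers; i++)
            if (dec->buf[i].type != BUF_FIXED)    /* rule 2 */
                non_fixed++;
        if (non_fixed == 0)
            return 0;
        for (i = 0; i < nfilt; i++)
            if (filt[i].processing == FILT_C)     /* rule 3: 'c' blocks DR */
                return 0;
        return 1;
    }
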
2.3.2 Normal process
In the usual case the libvo2 core takes responsibility for moving the data.
It must follow these rules:
 1. The 'p' filters process in the buffer on the left; if we have a
read_only buffer, then the vo2 core must insert a 'c' copy filter and a temp
buffer.
 2. With a 'c' filter we must make sure that we have a buffer on the right
(->) side.
 3. In the usual case 't' filters are replaced with 'p', except when the 't'
is right before the video surface. We must have at least one 'c' if the core
has to crop, clip, or flip the image upside down.
 4. Take care of the additional buffering when we have 1 surface (the libvo1
way).
 5. Be aware that some filters must come before others. E.g. postprocessing
should come before subtitles :)
 6. If we want scaling (-zoom) and the vo2 driver can't do it, then add a
scale filter 'c'. For easier understanding I have only one convert filter
that can copy, convert, scale, or convert and scale. In mplayer it really
will be only one filter.
 7. If we have a video surface, then the final 'c' filter will update it for
us. If the filter and the video surface are not write-combine safe, we may
add buffering. In case we use both get_surface and update_surface, we must
call the update_surface() function after writing to the video surface.

If we must call update_surface(), then we will call it with the last buffer.
This buffer could even be the internal decoder buffer, if there are no 'c'
filters. It could also be the one returned by get_surface().

2.3.3 Slices.
A slice is a small rectangle of the image. In the decoder's world it
represents an independently rendered portion of the image. In MPlayer the
slice width is equal to the image width; the height is usually 8, but there
is no problem with varying it.
The advantage of slices is that by working with a smaller part of the image,
most of the data stays in the cache, so postprocessing reads the data almost
for free. This makes slice processing of video data preferable even when the
decoder and/or video driver can't work with slices.
Smaller slices increase the chance of the data being in the cache, but they
also increase the overhead of function calls (branch prediction too), so it
may be good to tune the size where possible (mainly with 2-filter slices).

Here are some rules:
1. Slices always have the width of the image.
2. Slices always come one after another, so you cannot skip a few lines
   because they have not changed. This is done for the postprocessing
   filter, as its output image may differ depending on the neighbouring
   lines (slices).
3. A slice run always finishes with the last line; this extends rule 2.
4. Slice buffers are normal buffers that can contain a whole frame. This is
   needed in case we have to accumulate slices for frame processing (draw),
   and also for pp filters.
5. Slice processing can be used if:
5.1. the decoder knows about slices and calls a function when one is
   completed; the next filter (or video driver) should then be able to work
   with slices.
5.2. two or more filters can work with slices. Call them one after another;
   the result will be accumulated in the buffer of the last filter (look
   down for the 'p' type).
5.3. the final filter can slice and the vo2_driver can slice.
6. All filters should have independent counters for processed lines. These
   counters must be controlled by the vo2 core.

2.3.3.1 Slice counters.
For the incoming image we need:
1. a value that shows the last valid line.
2. a value that shows the line from which the filter will start working. It
   is updated by the filter to remember what portion of the image has been
   processed. The vo2 core will zero it on a new frame.

For the resulting image we need:
1. a value that shows which line is ready. This will be the last valid line
   for the next filter.

The filter may need more internal variables. And as it may be used 2 or more
times in one chain, it must be reentrant. So the internal variables should
be passed to the filter as a parameter.
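
In C, the counters could live in a small per-filter state block, passed in
on every call so the same filter code stays reentrant (names invented for
illustration):

    /* Hypothetical per-filter slice bookkeeping. */
    struct slice_state {
        int in_ready;   /* last valid input line, set by previous stage */
        int in_pos;     /* line this filter will continue from; the vo2
                           core zeroes it on a new frame */
        int out_ready;  /* last output line ready for the next stage */
    };

    struct frame;       /* see section 2.3.4 */

    /* Passing the state explicitly keeps the filter reentrant, so it can
       appear twice in one chain. */
    void filter_run(struct slice_state *st, const struct frame *in,
                    struct frame *out);
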
2.3.3.2 Auto slice.
In case we have a complete frame that will be processed by a few filters
that support slices, we must start processing this frame slice by slice. We
have the same situation when one filter accumulates too many lines and
forces the next filters to work with a bigger slice. To avoid that case and
to start slicing automatically, we need to limit the slice size and to break
a slice apart when it is bigger. If some filter needs more image lines, then
it will wait until it has accumulated them.

2.3.4. Frame structure
So far we have buffers that contain images and filters that work with
buffers. For cropping and for normal work with the image data we need to
know the dimensions of the image. We also need some structure to pass to the
filters, as they have to know where to read from and where to write to.
So I introduce the Frame struct:
{
  imgfmt - the image format, the most important parameter
  height, width - dimensions in pixels
  stride - size of an image line in bytes; it could be larger than
    width*sizeof(pixel), and it could also be negative (for vertical flip)
  base0, base1, base2, base3 - pointers to the planes; they depend on imgfmt
  baseq - quant storage plane; we may need to add qstride or some
    qheight/qwidth
  palette - pointer to a table with the palette colors
  flags - read-only: this frame is read-only
  //screen position ??
}


2.4 Negotiation
A few words about negotiation. It is a hard thing to find the best mode.
Here is an algorithm that could find it. But first I must say that we need
some kind of weight for the filters and the drawing. I think that we could
use something like megabytes/second, something that we can measure or
benchmark.

  1. We choose a codec.
  2. We choose a video driver.
  3. For each combination, find the total weight, and if there are any
     optional filters, find the min and max weight. Be careful: the max
     weight is not always at maximum filters!! (e.g. cropping)
  4. Compare the results.

I may say that we don't need automatic codec selection, as we can put the
best codecs at the beginning of codecs.conf, as it is now. We may need to do
the same thing with videodrv.conf. Or, better, make config files with the
preferred order of decoders and video modes :)

I await comments and ideas.
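
The comparison in steps 3-4 could be as simple as summing the per-stage
weights and keeping the cheapest chain (a sketch; the weight unit and all
names are only suggestions):

    /* Hypothetical chain-weight comparison, weights in megabytes/second
       of memory traffic as suggested above. */
    struct chain {
        double weight_min, weight_max;  /* with optional filters off/on */
        /* ... codec, driver, filter list ... */
    };

    static const struct chain *pick_chain(const struct chain *c, int n)
    {
        const struct chain *best = &c[0];
        int i;
        for (i = 1; i < n; i++)
            if (c[i].weight_max < best->weight_max) /* lower cost wins */
                best = &c[i];
        return best;
    }
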