log cabac.h @ 4019:6d4ac21853d7 libavcodec

age author description
Fri, 13 Oct 2006 14:21:25 +0000 michael dehack *ps_state indexing in the branchless decoder libavcodec
Thu, 12 Oct 2006 21:32:56 +0000 michael add "memory" to the clobber list; we change memory, so we need it. This also fixes some problems with gcc svn libavcodec
Thu, 12 Oct 2006 14:49:19 +0000 michael prevent "mb level" get_cabac() calls from being inlined (3% faster decode_mb_cabac() on P3) libavcodec
Thu, 12 Oct 2006 07:51:18 +0000 gpoirier adds some useful comments after some of the #else, #elseif, libavcodec
Wed, 11 Oct 2006 23:17:58 +0000 diego Rename ABS macro to FFABS. libavcodec
Wed, 11 Oct 2006 17:59:40 +0000 michael slightly faster on P3, slightly slower on Athlon, and probably faster on P4 libavcodec
Wed, 11 Oct 2006 16:39:50 +0000 michael move the lps state transition code up a little in the branched asm code (1% faster on P3) libavcodec
Wed, 11 Oct 2006 16:11:41 +0000 michael write the cabac low and range variables as early as possible to prevent stalls from reading them before they were written; the P4 is said to dislike that a lot, and on P3 it's 2% faster (START/STOP_TIMER over decode_residual) libavcodec
Wed, 11 Oct 2006 15:20:08 +0000 michael use ecx instead of cl (no speed change on P3, but might avoid partial register stalls on some CPUs) libavcodec
Wed, 11 Oct 2006 14:44:17 +0000 michael make the state transition tables global, as they are constant and the code is slightly faster that way libavcodec
Wed, 11 Oct 2006 13:25:29 +0000 michael 10l libavcodec
Wed, 11 Oct 2006 13:21:42 +0000 michael make lps_range a global table, it's constant anyway (saves 1 addition for accessing it) libavcodec
Wed, 11 Oct 2006 12:23:40 +0000 michael enable CMOV_IS_FAST, as it's faster or equal in speed on every CPU (Duron, Athlon, PM, P3) from which I've seen benchmarks; it might be slower on P4, but no one has posted benchmarks ... libavcodec
Tue, 10 Oct 2006 08:16:41 +0000 diego BRANCHLESS_CABAD --> BRANCHLESS_CABAC_DECODER libavcodec
Tue, 10 Oct 2006 06:56:51 +0000 michael move another bit&1 out; this is as fast as with it in there, but it makes more sense with it outside of the loop libavcodec
Tue, 10 Oct 2006 01:17:39 +0000 michael move the &1 out of the asm so gcc can optimize it away in inlined cases (yes this is slightly faster) libavcodec
Tue, 10 Oct 2006 01:08:39 +0000 michael replace a few and/sub/... by cmov libavcodec
Mon, 09 Oct 2006 21:57:10 +0000 michael reading 8-bit memory into an 8-bit register needs 2 uops on P4; 8-bit->32-bit with zero extension needs just 1 libavcodec
Mon, 09 Oct 2006 21:39:07 +0000 michael on the P4, inc needs twice as much time as add libavcodec
Mon, 09 Oct 2006 21:21:10 +0000 michael 10l libavcodec
Mon, 09 Oct 2006 21:14:16 +0000 michael reverse remainder of the failed attempt to optimize *state=c->mps_state[s] libavcodec
Mon, 09 Oct 2006 20:51:33 +0000 michael x86 branchless cabac decoder libavcodec
Mon, 09 Oct 2006 20:44:11 +0000 michael optimize branchless C CABAC decoder libavcodec
Mon, 09 Oct 2006 18:20:00 +0000 michael move commented-out START/STOP_TIMER to a hopefully better place for benchmarking ... libavcodec
Mon, 09 Oct 2006 15:52:17 +0000 michael drop failed attempt to optimize *state= c->mps_state[s]; libavcodec
Mon, 09 Oct 2006 14:15:53 +0000 michael 10l bugfix for some disabled code libavcodec
Mon, 09 Oct 2006 14:15:14 +0000 michael first try of a handwritten get_cabac() for x86; this is 10-20% faster on P3, depending on whether you try to subtract the START/STOP_TIMER overhead libavcodec
Mon, 09 Oct 2006 12:25:24 +0000 michael remove bytestream_end checks; seems to work fine without them, and the bitstream reader doesn't check for the end either libavcodec
Mon, 09 Oct 2006 00:59:42 +0000 michael decrease ff_h264_norm_shift[] size libavcodec
Sun, 08 Oct 2006 21:26:08 +0000 michael cleanup libavcodec