log

age author description
Wed, 11 Oct 2006 16:11:41 +0000 michael write cabac low and range variables as early as possible to prevent stalls from reading them before they were written, the P4 is said to dislike that a lot, on P3 it's 2% faster (START/STOP_TIMER over decode_residual) libavcodec
Wed, 11 Oct 2006 15:20:08 +0000 michael use ecx instead of cl (no speed change on P3 but might avoid partial register stalls on some cpus) libavcodec
Wed, 11 Oct 2006 14:44:17 +0000 michael make state transition tables global as they are constant and the code is slightly faster that way libavcodec
Wed, 11 Oct 2006 13:25:29 +0000 michael 10l libavcodec
Wed, 11 Oct 2006 13:21:42 +0000 michael make lps_range a global table, it's constant anyway (saves 1 addition for accessing it) libavcodec
Wed, 11 Oct 2006 12:23:40 +0000 michael enable CMOV_IS_FAST as it's faster or equal in speed on every CPU (Duron, Athlon, PM, P3) for which I've seen benchmarks, it might be slower on P4 but no one has posted benchmarks ... libavcodec
Wed, 11 Oct 2006 10:29:00 +0000 michael doxy libavcodec
Wed, 11 Oct 2006 08:30:13 +0000 diego Move CFLAGS handling to common.mak. libavcodec
Wed, 11 Oct 2006 07:47:59 +0000 diego Switch to the LGPL as agreed to by the author according to the libavcodec
Wed, 11 Oct 2006 04:15:04 +0000 kostya Targa image decoder libavcodec
Tue, 10 Oct 2006 12:07:25 +0000 diego Rename SIGN macro to the more fitting UNFOLD. libavcodec
Tue, 10 Oct 2006 08:16:41 +0000 diego BRANCHLESS_CABAD --> BRANCHLESS_CABAC_DECODER libavcodec
Tue, 10 Oct 2006 08:01:19 +0000 gpoirier Move TRANSPOSE8 macro to dsputil_altivec.h. libavcodec
Tue, 10 Oct 2006 06:56:51 +0000 michael moving another bit&1 out, this is as fast as with it in there, but it makes more sense with it outside of the loop libavcodec
Tue, 10 Oct 2006 01:17:39 +0000 michael move the &1 out of the asm so gcc can optimize it away in inlined cases (yes this is slightly faster) libavcodec
Tue, 10 Oct 2006 01:08:39 +0000 michael replace a few and/sub/... by cmov libavcodec
Mon, 09 Oct 2006 21:57:10 +0000 michael reading 8bit mem into an 8bit register needs 2 uops on P4, 8bit->32bit with zero extension needs just 1 libavcodec
Mon, 09 Oct 2006 21:39:07 +0000 michael on the P4 inc needs twice as much time as add libavcodec
Mon, 09 Oct 2006 21:21:10 +0000 michael 10l libavcodec
Mon, 09 Oct 2006 21:14:16 +0000 michael reverse remainder of the failed attempt to optimize *state=c->mps_state[s] libavcodec
Mon, 09 Oct 2006 20:51:33 +0000 michael x86 branchless cabac decoder libavcodec
Mon, 09 Oct 2006 20:44:11 +0000 michael optimize branchless C CABAC decoder libavcodec
Mon, 09 Oct 2006 18:29:46 +0000 lu_zero removing ALTIVEC_USE_REFERENCE_C_CODE, since it has no use anymore libavcodec
Mon, 09 Oct 2006 18:20:00 +0000 michael move commented-out START/STOP_TIMER to a hopefully better place for benchmarking ... libavcodec
Mon, 09 Oct 2006 15:52:17 +0000 michael drop failed attempt to optimize *state= c->mps_state[s]; libavcodec
Mon, 09 Oct 2006 14:15:53 +0000 michael 10l bugfix for some disabled code libavcodec
Mon, 09 Oct 2006 14:15:14 +0000 michael first try of a handwritten get_cabac() for x86, this is 10-20% faster on P3 depending on if you try to subtract the START/STOP_TIMER overhead libavcodec
Mon, 09 Oct 2006 13:37:43 +0000 lu_zero add_bytes passes tests libavcodec