log x86/vp8dsp.asm @ 12475:9fef0a8ddd63 libavcodec

age author description
Sun, 05 Sep 2010 10:10:16 +0000 reimar Use "d" suffix for general-purpose registers used with movd. libavcodec
Tue, 24 Aug 2010 16:52:27 +0000 rbultje Mark xmm registers as clobbered in simple loopfilter. Should fix the last libavcodec
Mon, 23 Aug 2010 02:41:22 +0000 rbultje Fix segfaults in VP8 SIMD code on Win64 (and FATE/win64 failures). libavcodec
Mon, 02 Aug 2010 20:18:09 +0000 darkshikari VP8: move zeroing of luma DC block into the WHT libavcodec
Sat, 31 Jul 2010 23:13:15 +0000 rbultje Use word-writing instead of dword-writing (with two cached but otherwise libavcodec
Mon, 26 Jul 2010 21:18:19 +0000 rbultje Use pmaddubsw for the mbedge_filter (>=ssse3), 6-10 cycles faster. libavcodec
Mon, 26 Jul 2010 19:34:00 +0000 darkshikari VP8: Much faster SSE2 MC libavcodec
Mon, 26 Jul 2010 14:07:57 +0000 rbultje Enable no-loop memory/register saving for ssse3/sse4 also. libavcodec
Mon, 26 Jul 2010 14:00:15 +0000 rbultje Save a register (or regsize of stackspace for x86-32) for the no-loop libavcodec
Mon, 26 Jul 2010 13:56:51 +0000 rbultje Use nested ifs instead of &&, which appears to not work with %ifidn (i.e. this libavcodec
Mon, 26 Jul 2010 13:50:59 +0000 rbultje Split pextrw macro-spaghetti into several opt-specific macros, this will make libavcodec
Sun, 25 Jul 2010 02:42:40 +0000 rbultje Fix obvious bug in assignment. Somehow, the test vectors don't test this... libavcodec
Sat, 24 Jul 2010 19:33:05 +0000 rbultje Fix SPLATB_REG mess. Used to be a if/elseif/elseif/elseif spaghetti, so this libavcodec
Fri, 23 Jul 2010 06:02:52 +0000 darkshikari VP8: optimize DC-only chroma case in the same way as luma. libavcodec
Fri, 23 Jul 2010 03:02:56 +0000 darkshikari VP8 asm: cosmetics (spacing) libavcodec
Fri, 23 Jul 2010 02:58:27 +0000 darkshikari VP8: 30% faster idct_mb libavcodec
Fri, 23 Jul 2010 00:07:16 +0000 darkshikari VP8: clear DCT blocks in iDCT instead of using clear_blocks. libavcodec
Thu, 22 Jul 2010 19:59:34 +0000 rbultje Use pextrw for SSE4 mbedge filter result writing, speedup 5-10cycles on libavcodec
Thu, 22 Jul 2010 01:35:26 +0000 rbultje Fix and enable horizontal >=SSE2 mbedge loopfilter. libavcodec
Wed, 21 Jul 2010 22:41:37 +0000 darkshikari Eliminate one instruction in VP8 dc_add_sse4 libavcodec
Wed, 21 Jul 2010 22:11:03 +0000 darkshikari Various VP8 x86 deblocking speedups libavcodec
Wed, 21 Jul 2010 20:51:01 +0000 darkshikari Make mmx VP8 WHT faster libavcodec
Tue, 20 Jul 2010 22:58:56 +0000 rbultje VP8 MBedge loopfilter MMX/MMX2/SSE2 functions for both luma (width=16) libavcodec
Tue, 20 Jul 2010 22:04:18 +0000 rbultje Chroma (width=8) inner loopfilter MMX/MMX2/SSE2 for VP8 decoder. libavcodec
Mon, 19 Jul 2010 23:57:09 +0000 rbultje Revert r24339 (it causes fate failures on x86-64) - I'll figure out what's libavcodec
Mon, 19 Jul 2010 21:53:28 +0000 rbultje Implement chroma (width=8) inner loopfilter MMX/MMX2/SSE2 functions. libavcodec
Mon, 19 Jul 2010 21:45:36 +0000 rbultje Be more efficient with registers or stack memory. Saves 8/16 bytes stack libavcodec
Mon, 19 Jul 2010 21:18:04 +0000 rbultje Change function prototypes for width=8 inner and mbedge loopfilter functions libavcodec
Fri, 16 Jul 2010 21:35:30 +0000 rbultje Attempt to fix x86-64 testsuite on fate. libavcodec
Fri, 16 Jul 2010 19:54:47 +0000 rbultje Remove duplicate define. libavcodec
Fri, 16 Jul 2010 19:54:25 +0000 rbultje Revert 24270, it contained some stuff that shouldn't have been in there. libavcodec
Fri, 16 Jul 2010 19:42:32 +0000 rbultje Remove duplicate define. libavcodec
Fri, 16 Jul 2010 19:38:10 +0000 rbultje Give x86 r%d registers names, this will simplify implementation of the chroma libavcodec
Fri, 16 Jul 2010 18:29:14 +0000 rbultje Change return statement, the REP_RET is a mistake since the else case (x86-64, libavcodec
Thu, 15 Jul 2010 23:02:34 +0000 rbultje VP8 H/V inner loopfilter MMX/MMXEXT/SSE2 optimizations. libavcodec
Sat, 03 Jul 2010 19:26:30 +0000 rbultje Simple H/V loopfilter for VP8 in MMX, MMX2 and SSE2 (yay for yasm macros). libavcodec
Sat, 03 Jul 2010 00:48:12 +0000 darkshikari SSSE3 versions of vp8 width4 bilinear MC functions libavcodec
Fri, 02 Jul 2010 05:27:41 +0000 darkshikari SSSE3 versions of width4 VP8 6-tap MC functions libavcodec
Tue, 29 Jun 2010 17:23:17 +0000 darkshikari Use add instead of lshift in mmxext vp8 idct libavcodec
Tue, 29 Jun 2010 17:04:29 +0000 rbultje Remove unused macros (duplicates from the now-LGPL x86util.asm). libavcodec
Tue, 29 Jun 2010 14:43:11 +0000 rbultje MMX idct_add for VP8. libavcodec
Tue, 29 Jun 2010 01:41:59 +0000 darkshikari Add mmxext version of VP8 DC Hadamard transform libavcodec
Mon, 28 Jun 2010 22:13:14 +0000 darkshikari Fix VP8 bilinear mc on x86_64 libavcodec
Mon, 28 Jun 2010 19:14:40 +0000 darkshikari Add x86 asm functions for VP8 put_pixels libavcodec
Mon, 28 Jun 2010 18:56:24 +0000 darkshikari Add MMX, SSE2, SSSE3 asm for VP8 bilinear MC libavcodec
Sun, 27 Jun 2010 02:01:45 +0000 rbultje First shot at VP8 optimizations: libavcodec