annotate mp3lib/dct36_3dnow.c @ 28992:947ef23ba798

Test whether create_vdp_decoder() might succeed by calling it from config() with a small value for max_reference_frames. This does not make automatic recovery by falling back to the software decoder possible, but it lets MPlayer fail more gracefully on buggy hardware (which actually exists) that does not support certain H264 widths when using hardware-accelerated decoding (784, 864, 944, 1024, 1808, 1888 pixels on NVIDIA G98), and when the user tries to hardware-decode more samples at the same time than the hardware supports. Might break playback of H264 Intra-Only samples on hardware with very little video memory.
author cehoyos
date Sat, 21 Mar 2009 20:11:05 +0000
parents bd6833421e56
children 347d152a5cfa
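
As a reading aid for the listing that follows: the opening run of movq/pfadd/punpckldq and movd/punpckldq/pfadd instructions on (%eax) appears to perform two in-place pre-addition passes over 18 consecutive floats (byte offsets 0 to 68). The scalar C below is a sketch of that reading, not code taken from MPlayer. It assumes %eax holds inbuf, which only the operand constraints outside this excerpt can confirm, and the names dct36_preadd and DCT36_INPUTS are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define DCT36_INPUTS 18  /* hypothetical name: 18 floats at byte offsets 0..68 */

/* Hypothetical scalar equivalent of the two pre-addition passes. */
void dct36_preadd(float *in)
{
    int i;

    assert(in != NULL);

    /* Pass 1: every sample gets its lower neighbour added.
     * Walking downwards means each sum uses the neighbour's original value,
     * matching the assembly, which loads each pair before overwriting it. */
    for (i = DCT36_INPUTS - 1; i >= 1; i--)
        in[i] += in[i - 1];

    /* Pass 2: odd-indexed samples get the previous odd-indexed sample added,
     * again walking downwards so pass-1 values are used unmodified. */
    for (i = DCT36_INPUTS - 1; i >= 3; i -= 2)
        in[i] += in[i - 2];
}
```

Both loops run from the highest index down because the additions are done in place; running upwards would feed already-updated values into later sums and give a different result from the assembly.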
10322
9163bdb578a6 moved 3dnow and 3dnowex dct36 optimisations into gcc inline assembly
alex
parents:
diff changeset
/*
 * dct36_3dnow.c - 3DNow! optimized dct36()
 *
 * This code based 'dct36_3dnow.s' by Syuuhei Kashiyama
 * <squash@mb.kcom.ne.jp>, only two types of changes have been made:
 *
 * - removed PREFETCH instruction for speedup
 * - changed function name for support 3DNow! automatic detection
 *
 * You can find Kashiyama's original 3dnow! support patch
 * (for mpg123-0.59o) at
 * http://user.ecc.u-tokyo.ac.jp/~g810370/linux-simd/ (Japanese).
 *
 * by KIMURA Takuhiro <kim@hannah.ipc.miyakyo-u.ac.jp> - until 31.Mar.1999
 * <kim@comtec.co.jp> - after 1.Apr.1999
 *
18783
0783dd397f74 CVS --> Subversion in copyright notices
diego
parents: 16989
diff changeset
 * Modified for use with MPlayer, for details see the changelog at
 * http://svn.mplayerhq.hu/mplayer/trunk/
15167
07e7a572bd84 Mark modified imported files as such to comply with (L)GPL §2a.
diego
parents: 10322
diff changeset
 * $Id$
 *
10322
 * Original disclaimer:
 * The author of this program disclaim whole expressed or implied
 * warranties with regard to this program, and in no event shall the
 * author of this program liable to whatever resulted from the use of
 * this program. Use it at your own risk.
 *
 * 2003/06/21: Moved to GCC inline assembly - Alex Beregszaszi
 */

#define real float /* ugly - but only way */

28117
bd6833421e56 Consistently include config.h before mangle.h, fixes possible compilation
reimar
parents: 27757
diff changeset
#include "config.h"
16989
e7a129082fda Unify include paths, -I.. is in CFLAGS.
diego
parents: 15167
diff changeset
#include "mangle.h"
25325
7c7885350d89 Identifiers starting with __ are reserved for the system.
diego
parents: 18783
diff changeset
#ifdef DCT36_OPTIMIZE_FOR_K7
10322
void dct36_3dnowex(real *inbuf, real *o1,
    real *o2, real *wintab, real *tsbuf)
#else
void dct36_3dnow(real *inbuf, real *o1,
    real *o2, real *wintab, real *tsbuf)
#endif
{
27757
b5a46071062a Replace all occurrences of '__volatile__' and '__volatile' by plain 'volatile'.
diego
parents: 25325
diff changeset
    __asm__ volatile(
10322
        "movq (%%eax),%%mm0\n\t"
        "movq 4(%%eax),%%mm1\n\t"
        "pfadd %%mm1,%%mm0\n\t"
        "movq %%mm0,4(%%eax)\n\t"
        "psrlq $32,%%mm1\n\t"
        "movq 12(%%eax),%%mm2\n\t"
        "punpckldq %%mm2,%%mm1\n\t"
        "pfadd %%mm2,%%mm1\n\t"
        "movq %%mm1,12(%%eax)\n\t"
        "psrlq $32,%%mm2\n\t"
        "movq 20(%%eax),%%mm3\n\t"
        "punpckldq %%mm3,%%mm2\n\t"
        "pfadd %%mm3,%%mm2\n\t"
        "movq %%mm2,20(%%eax)\n\t"
        "psrlq $32,%%mm3\n\t"
        "movq 28(%%eax),%%mm4\n\t"
        "punpckldq %%mm4,%%mm3\n\t"
        "pfadd %%mm4,%%mm3\n\t"
        "movq %%mm3,28(%%eax)\n\t"
        "psrlq $32,%%mm4\n\t"
        "movq 36(%%eax),%%mm5\n\t"
        "punpckldq %%mm5,%%mm4\n\t"
        "pfadd %%mm5,%%mm4\n\t"
        "movq %%mm4,36(%%eax)\n\t"
        "psrlq $32,%%mm5\n\t"
        "movq 44(%%eax),%%mm6\n\t"
        "punpckldq %%mm6,%%mm5\n\t"
        "pfadd %%mm6,%%mm5\n\t"
        "movq %%mm5,44(%%eax)\n\t"
        "psrlq $32,%%mm6\n\t"
        "movq 52(%%eax),%%mm7\n\t"
        "punpckldq %%mm7,%%mm6\n\t"
        "pfadd %%mm7,%%mm6\n\t"
        "movq %%mm6,52(%%eax)\n\t"
        "psrlq $32,%%mm7\n\t"
        "movq 60(%%eax),%%mm0\n\t"
        "punpckldq %%mm0,%%mm7\n\t"
        "pfadd %%mm0,%%mm7\n\t"
        "movq %%mm7,60(%%eax)\n\t"
        "psrlq $32,%%mm0\n\t"
        "movd 68(%%eax),%%mm1\n\t"
        "pfadd %%mm1,%%mm0\n\t"
        "movd %%mm0,68(%%eax)\n\t"
        "movd 4(%%eax),%%mm0\n\t"
        "movd 12(%%eax),%%mm1\n\t"
        "punpckldq %%mm1,%%mm0\n\t"
        "punpckldq 20(%%eax),%%mm1\n\t"
        "pfadd %%mm1,%%mm0\n\t"
        "movd %%mm0,12(%%eax)\n\t"
        "psrlq $32,%%mm0\n\t"
        "movd %%mm0,20(%%eax)\n\t"
        "psrlq $32,%%mm1\n\t"
        "movd 28(%%eax),%%mm2\n\t"
        "punpckldq %%mm2,%%mm1\n\t"
        "punpckldq 36(%%eax),%%mm2\n\t"
        "pfadd %%mm2,%%mm1\n\t"
        "movd %%mm1,28(%%eax)\n\t"
        "psrlq $32,%%mm1\n\t"
        "movd %%mm1,36(%%eax)\n\t"
        "psrlq $32,%%mm2\n\t"
        "movd 44(%%eax),%%mm3\n\t"
        "punpckldq %%mm3,%%mm2\n\t"
        "punpckldq 52(%%eax),%%mm3\n\t"
        "pfadd %%mm3,%%mm2\n\t"
        "movd %%mm2,44(%%eax)\n\t"
        "psrlq $32,%%mm2\n\t"
        "movd %%mm2,52(%%eax)\n\t"
        "psrlq $32,%%mm3\n\t"
        "movd 60(%%eax),%%mm4\n\t"
        "punpckldq %%mm4,%%mm3\n\t"
        "punpckldq 68(%%eax),%%mm4\n\t"
        "pfadd %%mm4,%%mm3\n\t"
        "movd %%mm3,60(%%eax)\n\t"
        "psrlq $32,%%mm3\n\t"
        "movd %%mm3,68(%%eax)\n\t"

        "movq 24(%%eax),%%mm0\n\t"
        "movq 48(%%eax),%%mm1\n\t"
        "movd "MANGLE(COS9)"+12,%%mm2\n\t"
        "punpckldq %%mm2,%%mm2\n\t"
        "movd "MANGLE(COS9)"+24,%%mm3\n\t"
        "punpckldq %%mm3,%%mm3\n\t"
        "pfmul %%mm2,%%mm0\n\t"
        "pfmul %%mm3,%%mm1\n\t"
        "pushl %%eax\n\t"
        "movl $1,%%eax\n\t"
        "movd %%eax,%%mm7\n\t"
        "pi2fd %%mm7,%%mm7\n\t"
        "popl %%eax\n\t"
        "movq 8(%%eax),%%mm2\n\t"
        "movd "MANGLE(COS9)"+4,%%mm3\n\t"
        "punpckldq %%mm3,%%mm3\n\t"
        "pfmul %%mm3,%%mm2\n\t"
        "pfadd %%mm0,%%mm2\n\t"
        "movq 40(%%eax),%%mm3\n\t"
        "movd "MANGLE(COS9)"+20,%%mm4\n\t"
        "punpckldq %%mm4,%%mm4\n\t"
        "pfmul %%mm4,%%mm3\n\t"
        "pfadd %%mm3,%%mm2\n\t"
        "movq 56(%%eax),%%mm3\n\t"
        "movd "MANGLE(COS9)"+28,%%mm4\n\t"
        "punpckldq %%mm4,%%mm4\n\t"
        "pfmul %%mm4,%%mm3\n\t"
        "pfadd %%mm3,%%mm2\n\t"
        "movq (%%eax),%%mm3\n\t"
        "movq 16(%%eax),%%mm4\n\t"
        "movd "MANGLE(COS9)"+8,%%mm5\n\t"
        "punpckldq %%mm5,%%mm5\n\t"
        "pfmul %%mm5,%%mm4\n\t"
        "pfadd %%mm4,%%mm3\n\t"
        "movq 32(%%eax),%%mm4\n\t"
        "movd "MANGLE(COS9)"+16,%%mm5\n\t"
        "punpckldq %%mm5,%%mm5\n\t"
        "pfmul %%mm5,%%mm4\n\t"
        "pfadd %%mm4,%%mm3\n\t"
        "pfadd %%mm1,%%mm3\n\t"
        "movq 64(%%eax),%%mm4\n\t"
        "movd "MANGLE(COS9)"+32,%%mm5\n\t"
        "punpckldq %%mm5,%%mm5\n\t"
        "pfmul %%mm5,%%mm4\n\t"
        "pfadd %%mm4,%%mm3\n\t"
        "movq %%mm2,%%mm4\n\t"
        "pfadd %%mm3,%%mm4\n\t"
        "movq %%mm7,%%mm5\n\t"
        "punpckldq "MANGLE(tfcos36)"+0,%%mm5\n\t"
        "pfmul %%mm5,%%mm4\n\t"
        "movq %%mm4,%%mm5\n\t"
        "pfacc %%mm5,%%mm5\n\t"
        "movd 108(%%edx),%%mm6\n\t"
        "punpckldq 104(%%edx),%%mm6\n\t"
        "pfmul %%mm6,%%mm5\n\t"
25325
#ifdef DCT36_OPTIMIZE_FOR_K7
10322
        "pswapd %%mm5,%%mm5\n\t"
        "movq %%mm5,32(%%ecx)\n\t"
#else
        "movd %%mm5,36(%%ecx)\n\t"
        "psrlq $32,%%mm5\n\t"
        "movd %%mm5,32(%%ecx)\n\t"
#endif
        "movq %%mm4,%%mm6\n\t"
        "punpckldq %%mm6,%%mm5\n\t"
        "pfsub %%mm6,%%mm5\n\t"
        "punpckhdq %%mm5,%%mm5\n\t"
        "movd 32(%%edx),%%mm6\n\t"
        "punpckldq 36(%%edx),%%mm6\n\t"
        "pfmul %%mm6,%%mm5\n\t"
        "movd 32(%%esi),%%mm6\n\t"
        "punpckldq 36(%%esi),%%mm6\n\t"
        "pfadd %%mm6,%%mm5\n\t"
        "movd %%mm5,1024(%%ebx)\n\t"
        "psrlq $32,%%mm5\n\t"
        "movd %%mm5,1152(%%ebx)\n\t"
        "movq %%mm3,%%mm4\n\t"
        "pfsub %%mm2,%%mm4\n\t"
        "movq %%mm7,%%mm5\n\t"
        "punpckldq "MANGLE(tfcos36)"+32,%%mm5\n\t"
        "pfmul %%mm5,%%mm4\n\t"
        "movq %%mm4,%%mm5\n\t"
        "pfacc %%mm5,%%mm5\n\t"
        "movd 140(%%edx),%%mm6\n\t"
        "punpckldq 72(%%edx),%%mm6\n\t"
        "pfmul %%mm6,%%mm5\n\t"
        "movd %%mm5,68(%%ecx)\n\t"
        "psrlq $32,%%mm5\n\t"
        "movd %%mm5,0(%%ecx)\n\t"
        "movq %%mm4,%%mm6\n\t"
        "punpckldq %%mm6,%%mm5\n\t"
        "pfsub %%mm6,%%mm5\n\t"
        "punpckhdq %%mm5,%%mm5\n\t"
        "movd 0(%%edx),%%mm6\n\t"
        "punpckldq 68(%%edx),%%mm6\n\t"
        "pfmul %%mm6,%%mm5\n\t"
        "movd 0(%%esi),%%mm6\n\t"
        "punpckldq 68(%%esi),%%mm6\n\t"
        "pfadd %%mm6,%%mm5\n\t"
        "movd %%mm5,0(%%ebx)\n\t"
        "psrlq $32,%%mm5\n\t"
        "movd %%mm5,2176(%%ebx)\n\t"
/* qN below is the quadword (two packed floats) at byte offset 8*N from %eax. */
/* mm2 = (q1 - q5 - q7) * COS9[3] (byte offset 12, 4-byte floats) */
"movq 8(%%eax),%%mm2\n\t"
"movq 40(%%eax),%%mm3\n\t"
"pfsub %%mm3,%%mm2\n\t"
"movq 56(%%eax),%%mm3\n\t"
"pfsub %%mm3,%%mm2\n\t"
"movd "MANGLE(COS9)"+12,%%mm3\n\t"
"punpckldq %%mm3,%%mm3\n\t"
"pfmul %%mm3,%%mm2\n\t"
/* mm3 = (q2 - q4 - q8) * COS9[6] - q6 + q0 */
"movq 16(%%eax),%%mm3\n\t"
"movq 32(%%eax),%%mm4\n\t"
"pfsub %%mm4,%%mm3\n\t"
"movq 64(%%eax),%%mm4\n\t"
"pfsub %%mm4,%%mm3\n\t"
"movd "MANGLE(COS9)"+24,%%mm4\n\t"
"punpckldq %%mm4,%%mm4\n\t"
"pfmul %%mm4,%%mm3\n\t"
"movq 48(%%eax),%%mm4\n\t"
"pfsub %%mm4,%%mm3\n\t"
"movq (%%eax),%%mm4\n\t"
"pfadd %%mm4,%%mm3\n\t"
/* Sum branch: mm4 = (mm2 + mm3), scaled lane-wise by { low dword of mm7, dword at tfcos36+4 } */
"movq %%mm2,%%mm4\n\t"
"pfadd %%mm3,%%mm4\n\t"
"movq %%mm7,%%mm5\n\t"
"punpckldq "MANGLE(tfcos36)"+4,%%mm5\n\t"
"pfmul %%mm5,%%mm4\n\t"
/* pfacc with identical operands: both lanes of mm5 = low lane + high lane of mm4 */
"movq %%mm4,%%mm5\n\t"
"pfacc %%mm5,%%mm5\n\t"
"movd 112(%%edx),%%mm6\n\t"
"punpckldq 100(%%edx),%%mm6\n\t"
"pfmul %%mm6,%%mm5\n\t"
"movd %%mm5,40(%%ecx)\n\t"
"psrlq $32,%%mm5\n\t"
"movd %%mm5,28(%%ecx)\n\t"
/* After punpckhdq, both lanes of mm5 = low lane - high lane of mm4 */
"movq %%mm4,%%mm6\n\t"
"punpckldq %%mm6,%%mm5\n\t"
"pfsub %%mm6,%%mm5\n\t"
"punpckhdq %%mm5,%%mm5\n\t"
"movd 28(%%edx),%%mm6\n\t"
"punpckldq 40(%%edx),%%mm6\n\t"
"pfmul %%mm6,%%mm5\n\t"
"movd 28(%%esi),%%mm6\n\t"
"punpckldq 40(%%esi),%%mm6\n\t"
"pfadd %%mm6,%%mm5\n\t"
"movd %%mm5,896(%%ebx)\n\t"
"psrlq $32,%%mm5\n\t"
"movd %%mm5,1280(%%ebx)\n\t"
/* Difference branch: mm4 = (mm3 - mm2), scaled lane-wise by { low dword of mm7, dword at tfcos36+28 } */
"movq %%mm3,%%mm4\n\t"
"pfsub %%mm2,%%mm4\n\t"
"movq %%mm7,%%mm5\n\t"
"punpckldq "MANGLE(tfcos36)"+28,%%mm5\n\t"
"pfmul %%mm5,%%mm4\n\t"
"movq %%mm4,%%mm5\n\t"
"pfacc %%mm5,%%mm5\n\t"
"movd 136(%%edx),%%mm6\n\t"
"punpckldq 76(%%edx),%%mm6\n\t"
"pfmul %%mm6,%%mm5\n\t"
"movd %%mm5,64(%%ecx)\n\t"
"psrlq $32,%%mm5\n\t"
"movd %%mm5,4(%%ecx)\n\t"
"movq %%mm4,%%mm6\n\t"
"punpckldq %%mm6,%%mm5\n\t"
"pfsub %%mm6,%%mm5\n\t"
"punpckhdq %%mm5,%%mm5\n\t"
"movd 4(%%edx),%%mm6\n\t"
"punpckldq 64(%%edx),%%mm6\n\t"
"pfmul %%mm6,%%mm5\n\t"
"movd 4(%%esi),%%mm6\n\t"
"punpckldq 64(%%esi),%%mm6\n\t"
"pfadd %%mm6,%%mm5\n\t"
"movd %%mm5,128(%%ebx)\n\t"
"psrlq $32,%%mm5\n\t"
"movd %%mm5,2048(%%ebx)\n\t"

/* qN below is the quadword (two packed floats) at byte offset 8*N from %eax. */
/* mm2 = q1 * COS9[5] - mm0 - q5 * COS9[7] + q7 * COS9[1] */
"movq 8(%%eax),%%mm2\n\t"
"movd "MANGLE(COS9)"+20,%%mm3\n\t"
"punpckldq %%mm3,%%mm3\n\t"
"pfmul %%mm3,%%mm2\n\t"
"pfsub %%mm0,%%mm2\n\t"
"movq 40(%%eax),%%mm3\n\t"
"movd "MANGLE(COS9)"+28,%%mm4\n\t"
"punpckldq %%mm4,%%mm4\n\t"
"pfmul %%mm4,%%mm3\n\t"
"pfsub %%mm3,%%mm2\n\t"
"movq 56(%%eax),%%mm3\n\t"
"movd "MANGLE(COS9)"+4,%%mm4\n\t"
"punpckldq %%mm4,%%mm4\n\t"
"pfmul %%mm4,%%mm3\n\t"
"pfadd %%mm3,%%mm2\n\t"
/* mm3 = q0 - q2 * COS9[8] - q4 * COS9[2] + mm1 + q8 * COS9[4] */
"movq (%%eax),%%mm3\n\t"
"movq 16(%%eax),%%mm4\n\t"
"movd "MANGLE(COS9)"+32,%%mm5\n\t"
"punpckldq %%mm5,%%mm5\n\t"
"pfmul %%mm5,%%mm4\n\t"
"pfsub %%mm4,%%mm3\n\t"
"movq 32(%%eax),%%mm4\n\t"
"movd "MANGLE(COS9)"+8,%%mm5\n\t"
"punpckldq %%mm5,%%mm5\n\t"
"pfmul %%mm5,%%mm4\n\t"
"pfsub %%mm4,%%mm3\n\t"
"pfadd %%mm1,%%mm3\n\t"
"movq 64(%%eax),%%mm4\n\t"
"movd "MANGLE(COS9)"+16,%%mm5\n\t"
"punpckldq %%mm5,%%mm5\n\t"
"pfmul %%mm5,%%mm4\n\t"
"pfadd %%mm4,%%mm3\n\t"
/* Sum branch: mm4 = (mm2 + mm3), scaled lane-wise by { low dword of mm7, dword at tfcos36+8 } */
"movq %%mm2,%%mm4\n\t"
"pfadd %%mm3,%%mm4\n\t"
"movq %%mm7,%%mm5\n\t"
"punpckldq "MANGLE(tfcos36)"+8,%%mm5\n\t"
"pfmul %%mm5,%%mm4\n\t"
"movq %%mm4,%%mm5\n\t"
"pfacc %%mm5,%%mm5\n\t"
"movd 116(%%edx),%%mm6\n\t"
"punpckldq 96(%%edx),%%mm6\n\t"
"pfmul %%mm6,%%mm5\n\t"
"movd %%mm5,44(%%ecx)\n\t"
"psrlq $32,%%mm5\n\t"
"movd %%mm5,24(%%ecx)\n\t"
"movq %%mm4,%%mm6\n\t"
"punpckldq %%mm6,%%mm5\n\t"
"pfsub %%mm6,%%mm5\n\t"
"punpckhdq %%mm5,%%mm5\n\t"
"movd 24(%%edx),%%mm6\n\t"
"punpckldq 44(%%edx),%%mm6\n\t"
"pfmul %%mm6,%%mm5\n\t"
"movd 24(%%esi),%%mm6\n\t"
"punpckldq 44(%%esi),%%mm6\n\t"
"pfadd %%mm6,%%mm5\n\t"
"movd %%mm5,768(%%ebx)\n\t"
"psrlq $32,%%mm5\n\t"
"movd %%mm5,1408(%%ebx)\n\t"
/* Difference branch: mm4 = (mm3 - mm2), scaled lane-wise by { low dword of mm7, dword at tfcos36+24 } */
"movq %%mm3,%%mm4\n\t"
"pfsub %%mm2,%%mm4\n\t"
"movq %%mm7,%%mm5\n\t"
"punpckldq "MANGLE(tfcos36)"+24,%%mm5\n\t"
"pfmul %%mm5,%%mm4\n\t"
"movq %%mm4,%%mm5\n\t"
"pfacc %%mm5,%%mm5\n\t"
"movd 132(%%edx),%%mm6\n\t"
"punpckldq 80(%%edx),%%mm6\n\t"
"pfmul %%mm6,%%mm5\n\t"
"movd %%mm5,60(%%ecx)\n\t"
"psrlq $32,%%mm5\n\t"
"movd %%mm5,8(%%ecx)\n\t"
"movq %%mm4,%%mm6\n\t"
"punpckldq %%mm6,%%mm5\n\t"
"pfsub %%mm6,%%mm5\n\t"
"punpckhdq %%mm5,%%mm5\n\t"
"movd 8(%%edx),%%mm6\n\t"
"punpckldq 60(%%edx),%%mm6\n\t"
"pfmul %%mm6,%%mm5\n\t"
"movd 8(%%esi),%%mm6\n\t"
"punpckldq 60(%%esi),%%mm6\n\t"
"pfadd %%mm6,%%mm5\n\t"
"movd %%mm5,256(%%ebx)\n\t"
"psrlq $32,%%mm5\n\t"
"movd %%mm5,1920(%%ebx)\n\t"
/* qN below is the quadword (two packed floats) at byte offset 8*N from %eax. */
/* mm2 = q1 * COS9[7] - mm0 + q5 * COS9[1] - q7 * COS9[5] */
"movq 8(%%eax),%%mm2\n\t"
"movd "MANGLE(COS9)"+28,%%mm3\n\t"
"punpckldq %%mm3,%%mm3\n\t"
"pfmul %%mm3,%%mm2\n\t"
"pfsub %%mm0,%%mm2\n\t"
"movq 40(%%eax),%%mm3\n\t"
"movd "MANGLE(COS9)"+4,%%mm4\n\t"
"punpckldq %%mm4,%%mm4\n\t"
"pfmul %%mm4,%%mm3\n\t"
"pfadd %%mm3,%%mm2\n\t"
"movq 56(%%eax),%%mm3\n\t"
"movd "MANGLE(COS9)"+20,%%mm4\n\t"
"punpckldq %%mm4,%%mm4\n\t"
"pfmul %%mm4,%%mm3\n\t"
"pfsub %%mm3,%%mm2\n\t"
/* mm3 = q0 - q2 * COS9[4] + q4 * COS9[8] + mm1 - q8 * COS9[2] */
"movq (%%eax),%%mm3\n\t"
"movq 16(%%eax),%%mm4\n\t"
"movd "MANGLE(COS9)"+16,%%mm5\n\t"
"punpckldq %%mm5,%%mm5\n\t"
"pfmul %%mm5,%%mm4\n\t"
"pfsub %%mm4,%%mm3\n\t"
"movq 32(%%eax),%%mm4\n\t"
"movd "MANGLE(COS9)"+32,%%mm5\n\t"
"punpckldq %%mm5,%%mm5\n\t"
"pfmul %%mm5,%%mm4\n\t"
"pfadd %%mm4,%%mm3\n\t"
"pfadd %%mm1,%%mm3\n\t"
"movq 64(%%eax),%%mm4\n\t"
"movd "MANGLE(COS9)"+8,%%mm5\n\t"
"punpckldq %%mm5,%%mm5\n\t"
"pfmul %%mm5,%%mm4\n\t"
"pfsub %%mm4,%%mm3\n\t"
/* Sum branch: mm4 = (mm2 + mm3), scaled lane-wise by { low dword of mm7, dword at tfcos36+12 } */
"movq %%mm2,%%mm4\n\t"
"pfadd %%mm3,%%mm4\n\t"
"movq %%mm7,%%mm5\n\t"
"punpckldq "MANGLE(tfcos36)"+12,%%mm5\n\t"
"pfmul %%mm5,%%mm4\n\t"
"movq %%mm4,%%mm5\n\t"
"pfacc %%mm5,%%mm5\n\t"
"movd 120(%%edx),%%mm6\n\t"
"punpckldq 92(%%edx),%%mm6\n\t"
"pfmul %%mm6,%%mm5\n\t"
"movd %%mm5,48(%%ecx)\n\t"
"psrlq $32,%%mm5\n\t"
"movd %%mm5,20(%%ecx)\n\t"
"movq %%mm4,%%mm6\n\t"
"punpckldq %%mm6,%%mm5\n\t"
"pfsub %%mm6,%%mm5\n\t"
"punpckhdq %%mm5,%%mm5\n\t"
"movd 20(%%edx),%%mm6\n\t"
"punpckldq 48(%%edx),%%mm6\n\t"
"pfmul %%mm6,%%mm5\n\t"
"movd 20(%%esi),%%mm6\n\t"
"punpckldq 48(%%esi),%%mm6\n\t"
"pfadd %%mm6,%%mm5\n\t"
"movd %%mm5,640(%%ebx)\n\t"
"psrlq $32,%%mm5\n\t"
"movd %%mm5,1536(%%ebx)\n\t"
/* Difference branch: mm4 = (mm3 - mm2), scaled lane-wise by { low dword of mm7, dword at tfcos36+20 } */
"movq %%mm3,%%mm4\n\t"
"pfsub %%mm2,%%mm4\n\t"
"movq %%mm7,%%mm5\n\t"
"punpckldq "MANGLE(tfcos36)"+20,%%mm5\n\t"
"pfmul %%mm5,%%mm4\n\t"
"movq %%mm4,%%mm5\n\t"
"pfacc %%mm5,%%mm5\n\t"
"movd 128(%%edx),%%mm6\n\t"
"punpckldq 84(%%edx),%%mm6\n\t"
"pfmul %%mm6,%%mm5\n\t"
"movd %%mm5,56(%%ecx)\n\t"
"psrlq $32,%%mm5\n\t"
"movd %%mm5,12(%%ecx)\n\t"
"movq %%mm4,%%mm6\n\t"
"punpckldq %%mm6,%%mm5\n\t"
"pfsub %%mm6,%%mm5\n\t"
"punpckhdq %%mm5,%%mm5\n\t"
"movd 12(%%edx),%%mm6\n\t"
"punpckldq 56(%%edx),%%mm6\n\t"
"pfmul %%mm6,%%mm5\n\t"
"movd 12(%%esi),%%mm6\n\t"
"punpckldq 56(%%esi),%%mm6\n\t"
"pfadd %%mm6,%%mm5\n\t"
"movd %%mm5,384(%%ebx)\n\t"
9163bdb578a6 moved 3dnow and 3dnowex dct36 optimisations into gcc inline assembly
alex
parents:
diff changeset
461 "psrlq $32,%%mm5\n\t"
9163bdb578a6 moved 3dnow and 3dnowex dct36 optimisations into gcc inline assembly
alex
parents:
diff changeset
462 "movd %%mm5,1792(%%ebx)\n\t"

    /* mm4 = inbuf[0..1] - inbuf[4..5] + inbuf[8..9] - inbuf[12..13] + inbuf[16..17] */
    "movq (%%eax),%%mm4\n\t"
    "movq 16(%%eax),%%mm3\n\t"
    "pfsub %%mm3,%%mm4\n\t"
    "movq 32(%%eax),%%mm3\n\t"
    "pfadd %%mm3,%%mm4\n\t"
    "movq 48(%%eax),%%mm3\n\t"
    "pfsub %%mm3,%%mm4\n\t"
    "movq 64(%%eax),%%mm3\n\t"
    "pfadd %%mm3,%%mm4\n\t"
    /* scale per lane by (low lane of mm7, tfcos36[4]) */
    "movq %%mm7,%%mm5\n\t"
    "punpckldq "MANGLE(tfcos36)"+16,%%mm5\n\t"
    "pfmul %%mm5,%%mm4\n\t"
    /* (lo + hi) * wintab[31], wintab[22] -> o2[13], o2[4] */
    "movq %%mm4,%%mm5\n\t"
    "pfacc %%mm5,%%mm5\n\t"
    "movd 124(%%edx),%%mm6\n\t"
    "punpckldq 88(%%edx),%%mm6\n\t"
    "pfmul %%mm6,%%mm5\n\t"
    "movd %%mm5,52(%%ecx)\n\t"
    "psrlq $32,%%mm5\n\t"
    "movd %%mm5,16(%%ecx)\n\t"
    /* (lo - hi) * wintab[4], wintab[13] + o1[4], o1[13] -> tsbuf[128], tsbuf[416] */
    "movq %%mm4,%%mm6\n\t"
    "punpckldq %%mm6,%%mm5\n\t"
    "pfsub %%mm6,%%mm5\n\t"
    "punpckhdq %%mm5,%%mm5\n\t"
    "movd 16(%%edx),%%mm6\n\t"
    "punpckldq 52(%%edx),%%mm6\n\t"
    "pfmul %%mm6,%%mm5\n\t"
    "movd 16(%%esi),%%mm6\n\t"
    "punpckldq 52(%%esi),%%mm6\n\t"
    "pfadd %%mm6,%%mm5\n\t"
    "movd %%mm5,512(%%ebx)\n\t"
    "psrlq $32,%%mm5\n\t"
    "movd %%mm5,1664(%%ebx)\n\t"

    /* femms: clear the MMX/3DNow! state so x87 floating-point code can run again */
    "femms\n\t"
    : /* no output operands */
    : "a" (inbuf), "S" (o1), "c" (o2), "d" (wintab), "b" (tsbuf)
    : "memory");
}
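
The last arithmetic block above (source lines 464 to 496) can be read as the scalar C below. This is a reading aid only, not code from the MPlayer tree. The function name `dct36_last_block_scalar` and the parameters `mm7_lo` and `cos4` are hypothetical: `mm7_lo` stands for whatever the low lane of `%mm7` holds at that point (it is set earlier in the function, outside this excerpt), and `cos4` stands for the float at `MANGLE(tfcos36)+16`, which is `tfcos36[4]` assuming 4-byte floats. Array indices are the byte offsets from the assembly divided by 4.

```c
/* Hypothetical scalar sketch of the final 3DNow! block (source lines 464-496).
 * Pointer roles follow the asm constraints:
 *   eax = in (inbuf), esi = o1, ecx = o2, edx = wintab, ebx = tsbuf.
 * mm7_lo and cos4 are illustrative parameters, see the text above. */
static void dct36_last_block_scalar(const float *in, const float *o1,
                                    float *o2, const float *wintab,
                                    float *tsbuf, float mm7_lo, float cos4)
{
    /* movq/pfsub/pfadd chain: each movq loads two adjacent floats */
    float lo = in[0] - in[4] + in[8] - in[12] + in[16];
    float hi = in[1] - in[5] + in[9] - in[13] + in[17];

    /* punpckldq builds (mm7_lo, cos4); pfmul scales each lane */
    lo *= mm7_lo;
    hi *= cos4;

    /* pfacc leaves lo + hi in both lanes */
    float sum = lo + hi;
    o2[13] = sum * wintab[31];  /* 52(%ecx), 124(%edx) */
    o2[4]  = sum * wintab[22];  /* 16(%ecx),  88(%edx) */

    /* punpckldq/pfsub/punpckhdq leave lo - hi in both lanes */
    float diff = lo - hi;
    tsbuf[128] = o1[4]  + diff * wintab[4];   /*  512(%ebx) */
    tsbuf[416] = o1[13] + diff * wintab[13];  /* 1664(%ebx) */
}
```

The block at source lines 437 to 462 follows the same shape, starting from `mm3 - mm2` and using the offsets shown in the assembly.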