annotate mp3lib/dct36_3dnow.c @ 30795:1001c606f94c

Make emulated Win32 critical sections thread safe. Earlier, cs->locked was accessed outside the mutex to get around the problem that default pthread mutexes are not recursive (i.e., you cannot do a double-lock from the same thread). This caused a thread-safety problem, which was both detected by Helgrind and showed up in some multithreaded codecs. The ideal solution here would be to simply use recursive pthread mutexes, but there were concerns about reduced debuggability and possibly portability. Thus, instead, rewrite the critical sections to be a simple lock count (with owner) protected by a regular mutex. Whenever a thread wants to enter the critical section and lock_count is not 0, it sleeps on a special event that tells it when the critical section is available.
author sesse
date Thu, 04 Mar 2010 15:57:08 +0000
parents 347d152a5cfa
children 0ad2da052b2e
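The locking scheme described in the changeset message above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from that changeset: the names `crit_section`, `cs_enter`, `cs_leave` and `cs_demo` are hypothetical, and a pthread condition variable stands in for the "special event" the message mentions.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical type and names, for illustration only. */
typedef struct {
    pthread_mutex_t mutex;      /* plain (non-recursive) mutex guarding the fields below */
    pthread_cond_t  available;  /* stand-in for the "event": signalled when lock_count reaches 0 */
    pthread_t       owner;      /* meaningful only while lock_count > 0 */
    int             lock_count; /* recursion depth of the owning thread */
} crit_section;

static void cs_init(crit_section *cs)
{
    pthread_mutex_init(&cs->mutex, NULL);
    pthread_cond_init(&cs->available, NULL);
    cs->lock_count = 0;
}

static void cs_enter(crit_section *cs)
{
    pthread_mutex_lock(&cs->mutex);
    if (cs->lock_count > 0 && pthread_equal(cs->owner, pthread_self())) {
        cs->lock_count++;               /* recursive entry by the current owner */
    } else {
        while (cs->lock_count != 0)     /* loop also absorbs spurious wakeups */
            pthread_cond_wait(&cs->available, &cs->mutex);
        cs->owner = pthread_self();
        cs->lock_count = 1;
    }
    pthread_mutex_unlock(&cs->mutex);
}

static void cs_leave(crit_section *cs)
{
    pthread_mutex_lock(&cs->mutex);
    if (cs->lock_count > 0 && pthread_equal(cs->owner, pthread_self())) {
        if (--cs->lock_count == 0)
            pthread_cond_signal(&cs->available); /* wake one waiting thread */
    }
    pthread_mutex_unlock(&cs->mutex);
}

typedef struct { crit_section *cs; long *counter; int iters; } worker_arg;

static void *worker(void *p)
{
    worker_arg *a = p;
    for (int i = 0; i < a->iters; i++) {
        cs_enter(a->cs);
        cs_enter(a->cs);    /* nested entry from the owner must not deadlock */
        ++*a->counter;
        cs_leave(a->cs);
        cs_leave(a->cs);
    }
    return NULL;
}

/* Runs nthreads workers and returns the final value of the shared counter. */
static long cs_demo(int nthreads, int iters)
{
    enum { MAX_THREADS = 16 };
    pthread_t tid[MAX_THREADS];
    crit_section cs;
    long counter = 0;
    worker_arg arg = { &cs, &counter, iters };

    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    cs_init(&cs);
    for (int i = 0; i < nthreads; i++)
        pthread_create(&tid[i], NULL, worker, &arg);
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    pthread_cond_destroy(&cs.available);
    pthread_mutex_destroy(&cs.mutex);
    return counter;
}
```

Because every access to `owner` and `lock_count` happens with the plain mutex held, no field is read outside the lock, which is the property the message says the earlier code lacked. Build with `-pthread` where the platform requires it.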
rev   line source
10322
9163bdb578a6 moved 3dnow and 3dnowex dct36 optimisations into gcc inline assembly
alex
parents:
diff changeset
/*
 * dct36_3dnow.c - 3DNow! optimized dct36()
 *
 * This code based 'dct36_3dnow.s' by Syuuhei Kashiyama
 * <squash@mb.kcom.ne.jp>, only two types of changes have been made:
 *
 * - removed PREFETCH instruction for speedup
 * - changed function name for support 3DNow! automatic detection
 *
 * You can find Kashiyama's original 3dnow! support patch
 * (for mpg123-0.59o) at
 * http://user.ecc.u-tokyo.ac.jp/~g810370/linux-simd/ (Japanese).
 *
 * by KIMURA Takuhiro <kim@hannah.ipc.miyakyo-u.ac.jp> - until 31.Mar.1999
 * <kim@comtec.co.jp> - after 1.Apr.1999
 *
18783
0783dd397f74 CVS --> Subversion in copyright notices
diego
parents: 16989
diff changeset
 * Modified for use with MPlayer, for details see the changelog at
 * http://svn.mplayerhq.hu/mplayer/trunk/
15167
07e7a572bd84 Mark modified imported files as such to comply with (L)GPL §2a.
diego
parents: 10322
diff changeset
 * $Id$
 *
10322
 * Original disclaimer:
 * The author of this program disclaim whole expressed or implied
 * warranties with regard to this program, and in no event shall the
 * author of this program liable to whatever resulted from the use of
 * this program. Use it at your own risk.
 *
 * 2003/06/21: Moved to GCC inline assembly - Alex Beregszaszi
 */

28117
bd6833421e56 Consistently include config.h before mangle.h, fixes possible compilation
reimar
parents: 27757
diff changeset
#include "config.h"
16989
e7a129082fda Unify include paths, -I.. is in CFLAGS.
diego
parents: 15167
diff changeset
#include "mangle.h"
30167
347d152a5cfa Refactor real --> float #define to a typedef in a common header.
diego
parents: 28117
diff changeset
#include "mpg123.h"
25325
7c7885350d89 Identifiers starting with __ are reserved for the system.
diego
parents: 18783
diff changeset
#ifdef DCT36_OPTIMIZE_FOR_K7
10322
void dct36_3dnowex(real *inbuf, real *o1,
    real *o2, real *wintab, real *tsbuf)
#else
void dct36_3dnow(real *inbuf, real *o1,
    real *o2, real *wintab, real *tsbuf)
#endif
{
27757
b5a46071062a Replace all occurrences of '__volatile__' and '__volatile' by plain 'volatile'.
diego
parents: 25325
diff changeset
    __asm__ volatile(
10322
        /* Pass 1 (x[] is the float array addressed by %eax):
         * x[i] += x[i-1] for i = 1..17, two floats per step; each sum
         * uses the neighbour's value from before this pass. */
        "movq (%%eax),%%mm0\n\t"
        "movq 4(%%eax),%%mm1\n\t"
        "pfadd %%mm1,%%mm0\n\t"
        "movq %%mm0,4(%%eax)\n\t"
        "psrlq $32,%%mm1\n\t"
        "movq 12(%%eax),%%mm2\n\t"
        "punpckldq %%mm2,%%mm1\n\t"
        "pfadd %%mm2,%%mm1\n\t"
        "movq %%mm1,12(%%eax)\n\t"
        "psrlq $32,%%mm2\n\t"
        "movq 20(%%eax),%%mm3\n\t"
        "punpckldq %%mm3,%%mm2\n\t"
        "pfadd %%mm3,%%mm2\n\t"
        "movq %%mm2,20(%%eax)\n\t"
        "psrlq $32,%%mm3\n\t"
        "movq 28(%%eax),%%mm4\n\t"
        "punpckldq %%mm4,%%mm3\n\t"
        "pfadd %%mm4,%%mm3\n\t"
        "movq %%mm3,28(%%eax)\n\t"
        "psrlq $32,%%mm4\n\t"
        "movq 36(%%eax),%%mm5\n\t"
        "punpckldq %%mm5,%%mm4\n\t"
        "pfadd %%mm5,%%mm4\n\t"
        "movq %%mm4,36(%%eax)\n\t"
        "psrlq $32,%%mm5\n\t"
        "movq 44(%%eax),%%mm6\n\t"
        "punpckldq %%mm6,%%mm5\n\t"
        "pfadd %%mm6,%%mm5\n\t"
        "movq %%mm5,44(%%eax)\n\t"
        "psrlq $32,%%mm6\n\t"
        "movq 52(%%eax),%%mm7\n\t"
        "punpckldq %%mm7,%%mm6\n\t"
        "pfadd %%mm7,%%mm6\n\t"
        "movq %%mm6,52(%%eax)\n\t"
        "psrlq $32,%%mm7\n\t"
        "movq 60(%%eax),%%mm0\n\t"
        "punpckldq %%mm0,%%mm7\n\t"
        "pfadd %%mm0,%%mm7\n\t"
        "movq %%mm7,60(%%eax)\n\t"
        "psrlq $32,%%mm0\n\t"
        "movd 68(%%eax),%%mm1\n\t"
        "pfadd %%mm1,%%mm0\n\t"
        "movd %%mm0,68(%%eax)\n\t"
        /* Pass 2: x[i] += x[i-2] for odd i = 3..17, again using the
         * values from before this pass (x[] is the float array at %eax). */
        "movd 4(%%eax),%%mm0\n\t"
        "movd 12(%%eax),%%mm1\n\t"
        "punpckldq %%mm1,%%mm0\n\t"
        "punpckldq 20(%%eax),%%mm1\n\t"
        "pfadd %%mm1,%%mm0\n\t"
        "movd %%mm0,12(%%eax)\n\t"
        "psrlq $32,%%mm0\n\t"
        "movd %%mm0,20(%%eax)\n\t"
        "psrlq $32,%%mm1\n\t"
        "movd 28(%%eax),%%mm2\n\t"
        "punpckldq %%mm2,%%mm1\n\t"
        "punpckldq 36(%%eax),%%mm2\n\t"
        "pfadd %%mm2,%%mm1\n\t"
        "movd %%mm1,28(%%eax)\n\t"
        "psrlq $32,%%mm1\n\t"
        "movd %%mm1,36(%%eax)\n\t"
        "psrlq $32,%%mm2\n\t"
        "movd 44(%%eax),%%mm3\n\t"
        "punpckldq %%mm3,%%mm2\n\t"
        "punpckldq 52(%%eax),%%mm3\n\t"
        "pfadd %%mm3,%%mm2\n\t"
        "movd %%mm2,44(%%eax)\n\t"
        "psrlq $32,%%mm2\n\t"
        "movd %%mm2,52(%%eax)\n\t"
        "psrlq $32,%%mm3\n\t"
        "movd 60(%%eax),%%mm4\n\t"
        "punpckldq %%mm4,%%mm3\n\t"
        "punpckldq 68(%%eax),%%mm4\n\t"
        "pfadd %%mm4,%%mm3\n\t"
        "movd %%mm3,60(%%eax)\n\t"
        "psrlq $32,%%mm3\n\t"
        "movd %%mm3,68(%%eax)\n\t"

        "movq 24(%%eax),%%mm0\n\t"
        "movq 48(%%eax),%%mm1\n\t"
        "movd "MANGLE(COS9)"+12,%%mm2\n\t"
        "punpckldq %%mm2,%%mm2\n\t"
        "movd "MANGLE(COS9)"+24,%%mm3\n\t"
        "punpckldq %%mm3,%%mm3\n\t"
        "pfmul %%mm2,%%mm0\n\t"
        "pfmul %%mm3,%%mm1\n\t"
        "pushl %%eax\n\t"
        "movl $1,%%eax\n\t"
        "movd %%eax,%%mm7\n\t"
        "pi2fd %%mm7,%%mm7\n\t"
        "popl %%eax\n\t"
        "movq 8(%%eax),%%mm2\n\t"
        "movd "MANGLE(COS9)"+4,%%mm3\n\t"
        "punpckldq %%mm3,%%mm3\n\t"
        "pfmul %%mm3,%%mm2\n\t"
        "pfadd %%mm0,%%mm2\n\t"
        "movq 40(%%eax),%%mm3\n\t"
        "movd "MANGLE(COS9)"+20,%%mm4\n\t"
        "punpckldq %%mm4,%%mm4\n\t"
        "pfmul %%mm4,%%mm3\n\t"
        "pfadd %%mm3,%%mm2\n\t"
        "movq 56(%%eax),%%mm3\n\t"
        "movd "MANGLE(COS9)"+28,%%mm4\n\t"
        "punpckldq %%mm4,%%mm4\n\t"
        "pfmul %%mm4,%%mm3\n\t"
        "pfadd %%mm3,%%mm2\n\t"
        "movq (%%eax),%%mm3\n\t"
        "movq 16(%%eax),%%mm4\n\t"
        "movd "MANGLE(COS9)"+8,%%mm5\n\t"
        "punpckldq %%mm5,%%mm5\n\t"
        "pfmul %%mm5,%%mm4\n\t"
        "pfadd %%mm4,%%mm3\n\t"
        "movq 32(%%eax),%%mm4\n\t"
        "movd "MANGLE(COS9)"+16,%%mm5\n\t"
        "punpckldq %%mm5,%%mm5\n\t"
        "pfmul %%mm5,%%mm4\n\t"
        "pfadd %%mm4,%%mm3\n\t"
        "pfadd %%mm1,%%mm3\n\t"
        "movq 64(%%eax),%%mm4\n\t"
        "movd "MANGLE(COS9)"+32,%%mm5\n\t"
        "punpckldq %%mm5,%%mm5\n\t"
        "pfmul %%mm5,%%mm4\n\t"
        "pfadd %%mm4,%%mm3\n\t"
        "movq %%mm2,%%mm4\n\t"
        "pfadd %%mm3,%%mm4\n\t"
        "movq %%mm7,%%mm5\n\t"
        "punpckldq "MANGLE(tfcos36)"+0,%%mm5\n\t"
        "pfmul %%mm5,%%mm4\n\t"
        "movq %%mm4,%%mm5\n\t"
        "pfacc %%mm5,%%mm5\n\t"
        "movd 108(%%edx),%%mm6\n\t"
        "punpckldq 104(%%edx),%%mm6\n\t"
        "pfmul %%mm6,%%mm5\n\t"
25325
#ifdef DCT36_OPTIMIZE_FOR_K7
10322
        "pswapd %%mm5,%%mm5\n\t"
        "movq %%mm5,32(%%ecx)\n\t"
#else
        "movd %%mm5,36(%%ecx)\n\t"
        "psrlq $32,%%mm5\n\t"
        "movd %%mm5,32(%%ecx)\n\t"
#endif
        "movq %%mm4,%%mm6\n\t"
        "punpckldq %%mm6,%%mm5\n\t"
        "pfsub %%mm6,%%mm5\n\t"
        "punpckhdq %%mm5,%%mm5\n\t"
        "movd 32(%%edx),%%mm6\n\t"
        "punpckldq 36(%%edx),%%mm6\n\t"
        "pfmul %%mm6,%%mm5\n\t"
        "movd 32(%%esi),%%mm6\n\t"
        "punpckldq 36(%%esi),%%mm6\n\t"
        "pfadd %%mm6,%%mm5\n\t"
        "movd %%mm5,1024(%%ebx)\n\t"
        "psrlq $32,%%mm5\n\t"
        "movd %%mm5,1152(%%ebx)\n\t"
        "movq %%mm3,%%mm4\n\t"
        "pfsub %%mm2,%%mm4\n\t"
        "movq %%mm7,%%mm5\n\t"
        "punpckldq "MANGLE(tfcos36)"+32,%%mm5\n\t"
        "pfmul %%mm5,%%mm4\n\t"
        "movq %%mm4,%%mm5\n\t"
        "pfacc %%mm5,%%mm5\n\t"
        "movd 140(%%edx),%%mm6\n\t"
        "punpckldq 72(%%edx),%%mm6\n\t"
        "pfmul %%mm6,%%mm5\n\t"
        "movd %%mm5,68(%%ecx)\n\t"
        "psrlq $32,%%mm5\n\t"
        "movd %%mm5,0(%%ecx)\n\t"
        "movq %%mm4,%%mm6\n\t"
        "punpckldq %%mm6,%%mm5\n\t"
        "pfsub %%mm6,%%mm5\n\t"
        "punpckhdq %%mm5,%%mm5\n\t"
        "movd 0(%%edx),%%mm6\n\t"
        "punpckldq 68(%%edx),%%mm6\n\t"
        "pfmul %%mm6,%%mm5\n\t"
        "movd 0(%%esi),%%mm6\n\t"
        "punpckldq 68(%%esi),%%mm6\n\t"
        "pfadd %%mm6,%%mm5\n\t"
        "movd %%mm5,0(%%ebx)\n\t"
        "psrlq $32,%%mm5\n\t"
        "movd %%mm5,2176(%%ebx)\n\t"
    /* With in[] denoting the floats at %eax (two per MMX register):
     * mm2 = (in[2..3] - in[10..11] - in[14..15]) * COS9[3] */
    "movq 8(%%eax),%%mm2\n\t"
    "movq 40(%%eax),%%mm3\n\t"
    "pfsub %%mm3,%%mm2\n\t"
    "movq 56(%%eax),%%mm3\n\t"
    "pfsub %%mm3,%%mm2\n\t"
    "movd "MANGLE(COS9)"+12,%%mm3\n\t"
    "punpckldq %%mm3,%%mm3\n\t"
    "pfmul %%mm3,%%mm2\n\t"
    /* mm3 = (in[4..5] - in[8..9] - in[16..17]) * COS9[6] - in[12..13] + in[0..1] */
    "movq 16(%%eax),%%mm3\n\t"
    "movq 32(%%eax),%%mm4\n\t"
    "pfsub %%mm4,%%mm3\n\t"
    "movq 64(%%eax),%%mm4\n\t"
    "pfsub %%mm4,%%mm3\n\t"
    "movd "MANGLE(COS9)"+24,%%mm4\n\t"
    "punpckldq %%mm4,%%mm4\n\t"
    "pfmul %%mm4,%%mm3\n\t"
    "movq 48(%%eax),%%mm4\n\t"
    "pfsub %%mm4,%%mm3\n\t"
    "movq (%%eax),%%mm4\n\t"
    "pfadd %%mm4,%%mm3\n\t"
    /* mm4 = (mm2 + mm3) * {mm7.lo, tfcos36[1]}; mm5 = mm4.lo + mm4.hi in both halves. */
    "movq %%mm2,%%mm4\n\t"
    "pfadd %%mm3,%%mm4\n\t"
    "movq %%mm7,%%mm5\n\t"
    "punpckldq "MANGLE(tfcos36)"+4,%%mm5\n\t"
    "pfmul %%mm5,%%mm4\n\t"
    "movq %%mm4,%%mm5\n\t"
    "pfacc %%mm5,%%mm5\n\t"
    /* Scale the sum by the floats at 112 and 100(%edx); store at 40 and 28(%ecx). */
    "movd 112(%%edx),%%mm6\n\t"
    "punpckldq 100(%%edx),%%mm6\n\t"
    "pfmul %%mm6,%%mm5\n\t"
    "movd %%mm5,40(%%ecx)\n\t"
    "psrlq $32,%%mm5\n\t"
    "movd %%mm5,28(%%ecx)\n\t"
    /* mm5 = (mm4.lo - mm4.hi) in both halves, scaled by the floats at 28 and
     * 40(%edx), plus the floats at 28 and 40(%esi); stored at 896 and 1280(%ebx). */
    "movq %%mm4,%%mm6\n\t"
    "punpckldq %%mm6,%%mm5\n\t"
    "pfsub %%mm6,%%mm5\n\t"
    "punpckhdq %%mm5,%%mm5\n\t"
    "movd 28(%%edx),%%mm6\n\t"
    "punpckldq 40(%%edx),%%mm6\n\t"
    "pfmul %%mm6,%%mm5\n\t"
    "movd 28(%%esi),%%mm6\n\t"
    "punpckldq 40(%%esi),%%mm6\n\t"
    "pfadd %%mm6,%%mm5\n\t"
    "movd %%mm5,896(%%ebx)\n\t"
    "psrlq $32,%%mm5\n\t"
    "movd %%mm5,1280(%%ebx)\n\t"
    /* mm4 = (mm3 - mm2) * {mm7.lo, tfcos36[7]}; mm5 = mm4.lo + mm4.hi in both halves. */
    "movq %%mm3,%%mm4\n\t"
    "pfsub %%mm2,%%mm4\n\t"
    "movq %%mm7,%%mm5\n\t"
    "punpckldq "MANGLE(tfcos36)"+28,%%mm5\n\t"
    "pfmul %%mm5,%%mm4\n\t"
    "movq %%mm4,%%mm5\n\t"
    "pfacc %%mm5,%%mm5\n\t"
    /* Scale the sum by the floats at 136 and 76(%edx); store at 64 and 4(%ecx). */
    "movd 136(%%edx),%%mm6\n\t"
    "punpckldq 76(%%edx),%%mm6\n\t"
    "pfmul %%mm6,%%mm5\n\t"
    "movd %%mm5,64(%%ecx)\n\t"
    "psrlq $32,%%mm5\n\t"
    "movd %%mm5,4(%%ecx)\n\t"
    /* mm5 = (mm4.lo - mm4.hi) in both halves, scaled by the floats at 4 and
     * 64(%edx), plus the floats at 4 and 64(%esi); stored at 128 and 2048(%ebx). */
    "movq %%mm4,%%mm6\n\t"
    "punpckldq %%mm6,%%mm5\n\t"
    "pfsub %%mm6,%%mm5\n\t"
    "punpckhdq %%mm5,%%mm5\n\t"
    "movd 4(%%edx),%%mm6\n\t"
    "punpckldq 64(%%edx),%%mm6\n\t"
    "pfmul %%mm6,%%mm5\n\t"
    "movd 4(%%esi),%%mm6\n\t"
    "punpckldq 64(%%esi),%%mm6\n\t"
    "pfadd %%mm6,%%mm5\n\t"
    "movd %%mm5,128(%%ebx)\n\t"
    "psrlq $32,%%mm5\n\t"
    "movd %%mm5,2048(%%ebx)\n\t"

    /* With in[] denoting the floats at %eax (two per MMX register) and mm0
     * loaded earlier in the function:
     * mm2 = in[2..3]*COS9[5] - mm0 - in[10..11]*COS9[7] + in[14..15]*COS9[1] */
    "movq 8(%%eax),%%mm2\n\t"
    "movd "MANGLE(COS9)"+20,%%mm3\n\t"
    "punpckldq %%mm3,%%mm3\n\t"
    "pfmul %%mm3,%%mm2\n\t"
    "pfsub %%mm0,%%mm2\n\t"
    "movq 40(%%eax),%%mm3\n\t"
    "movd "MANGLE(COS9)"+28,%%mm4\n\t"
    "punpckldq %%mm4,%%mm4\n\t"
    "pfmul %%mm4,%%mm3\n\t"
    "pfsub %%mm3,%%mm2\n\t"
    "movq 56(%%eax),%%mm3\n\t"
    "movd "MANGLE(COS9)"+4,%%mm4\n\t"
    "punpckldq %%mm4,%%mm4\n\t"
    "pfmul %%mm4,%%mm3\n\t"
    "pfadd %%mm3,%%mm2\n\t"
    /* mm3 = in[0..1] - in[4..5]*COS9[8] - in[8..9]*COS9[2] + mm1 + in[16..17]*COS9[4]
     * (mm1 is also loaded earlier in the function). */
    "movq (%%eax),%%mm3\n\t"
    "movq 16(%%eax),%%mm4\n\t"
    "movd "MANGLE(COS9)"+32,%%mm5\n\t"
    "punpckldq %%mm5,%%mm5\n\t"
    "pfmul %%mm5,%%mm4\n\t"
    "pfsub %%mm4,%%mm3\n\t"
    "movq 32(%%eax),%%mm4\n\t"
    "movd "MANGLE(COS9)"+8,%%mm5\n\t"
    "punpckldq %%mm5,%%mm5\n\t"
    "pfmul %%mm5,%%mm4\n\t"
    "pfsub %%mm4,%%mm3\n\t"
    "pfadd %%mm1,%%mm3\n\t"
    "movq 64(%%eax),%%mm4\n\t"
    "movd "MANGLE(COS9)"+16,%%mm5\n\t"
    "punpckldq %%mm5,%%mm5\n\t"
    "pfmul %%mm5,%%mm4\n\t"
    "pfadd %%mm4,%%mm3\n\t"
    /* mm4 = (mm2 + mm3) * {mm7.lo, tfcos36[2]}; mm5 = mm4.lo + mm4.hi in both halves. */
    "movq %%mm2,%%mm4\n\t"
    "pfadd %%mm3,%%mm4\n\t"
    "movq %%mm7,%%mm5\n\t"
    "punpckldq "MANGLE(tfcos36)"+8,%%mm5\n\t"
    "pfmul %%mm5,%%mm4\n\t"
    "movq %%mm4,%%mm5\n\t"
    "pfacc %%mm5,%%mm5\n\t"
    /* Scale the sum by the floats at 116 and 96(%edx); store at 44 and 24(%ecx). */
    "movd 116(%%edx),%%mm6\n\t"
    "punpckldq 96(%%edx),%%mm6\n\t"
    "pfmul %%mm6,%%mm5\n\t"
    "movd %%mm5,44(%%ecx)\n\t"
    "psrlq $32,%%mm5\n\t"
    "movd %%mm5,24(%%ecx)\n\t"
    /* mm5 = (mm4.lo - mm4.hi) in both halves, scaled by the floats at 24 and
     * 44(%edx), plus the floats at 24 and 44(%esi); stored at 768 and 1408(%ebx). */
    "movq %%mm4,%%mm6\n\t"
    "punpckldq %%mm6,%%mm5\n\t"
    "pfsub %%mm6,%%mm5\n\t"
    "punpckhdq %%mm5,%%mm5\n\t"
    "movd 24(%%edx),%%mm6\n\t"
    "punpckldq 44(%%edx),%%mm6\n\t"
    "pfmul %%mm6,%%mm5\n\t"
    "movd 24(%%esi),%%mm6\n\t"
    "punpckldq 44(%%esi),%%mm6\n\t"
    "pfadd %%mm6,%%mm5\n\t"
    "movd %%mm5,768(%%ebx)\n\t"
    "psrlq $32,%%mm5\n\t"
    "movd %%mm5,1408(%%ebx)\n\t"
    /* mm4 = (mm3 - mm2) * {mm7.lo, tfcos36[6]}; mm5 = mm4.lo + mm4.hi in both halves. */
    "movq %%mm3,%%mm4\n\t"
    "pfsub %%mm2,%%mm4\n\t"
    "movq %%mm7,%%mm5\n\t"
    "punpckldq "MANGLE(tfcos36)"+24,%%mm5\n\t"
    "pfmul %%mm5,%%mm4\n\t"
    "movq %%mm4,%%mm5\n\t"
    "pfacc %%mm5,%%mm5\n\t"
    /* Scale the sum by the floats at 132 and 80(%edx); store at 60 and 8(%ecx). */
    "movd 132(%%edx),%%mm6\n\t"
    "punpckldq 80(%%edx),%%mm6\n\t"
    "pfmul %%mm6,%%mm5\n\t"
    "movd %%mm5,60(%%ecx)\n\t"
    "psrlq $32,%%mm5\n\t"
    "movd %%mm5,8(%%ecx)\n\t"
    /* mm5 = (mm4.lo - mm4.hi) in both halves, scaled by the floats at 8 and
     * 60(%edx), plus the floats at 8 and 60(%esi); stored at 256 and 1920(%ebx). */
    "movq %%mm4,%%mm6\n\t"
    "punpckldq %%mm6,%%mm5\n\t"
    "pfsub %%mm6,%%mm5\n\t"
    "punpckhdq %%mm5,%%mm5\n\t"
    "movd 8(%%edx),%%mm6\n\t"
    "punpckldq 60(%%edx),%%mm6\n\t"
    "pfmul %%mm6,%%mm5\n\t"
    "movd 8(%%esi),%%mm6\n\t"
    "punpckldq 60(%%esi),%%mm6\n\t"
    "pfadd %%mm6,%%mm5\n\t"
    "movd %%mm5,256(%%ebx)\n\t"
    "psrlq $32,%%mm5\n\t"
    "movd %%mm5,1920(%%ebx)\n\t"
    /* With in[] denoting the floats at %eax (two per MMX register) and mm0
     * loaded earlier in the function:
     * mm2 = in[2..3]*COS9[7] - mm0 + in[10..11]*COS9[1] - in[14..15]*COS9[5] */
    "movq 8(%%eax),%%mm2\n\t"
    "movd "MANGLE(COS9)"+28,%%mm3\n\t"
    "punpckldq %%mm3,%%mm3\n\t"
    "pfmul %%mm3,%%mm2\n\t"
    "pfsub %%mm0,%%mm2\n\t"
    "movq 40(%%eax),%%mm3\n\t"
    "movd "MANGLE(COS9)"+4,%%mm4\n\t"
    "punpckldq %%mm4,%%mm4\n\t"
    "pfmul %%mm4,%%mm3\n\t"
    "pfadd %%mm3,%%mm2\n\t"
    "movq 56(%%eax),%%mm3\n\t"
    "movd "MANGLE(COS9)"+20,%%mm4\n\t"
    "punpckldq %%mm4,%%mm4\n\t"
    "pfmul %%mm4,%%mm3\n\t"
    "pfsub %%mm3,%%mm2\n\t"
    /* mm3 = in[0..1] - in[4..5]*COS9[4] + in[8..9]*COS9[8] + mm1 - in[16..17]*COS9[2]
     * (mm1 is also loaded earlier in the function). */
    "movq (%%eax),%%mm3\n\t"
    "movq 16(%%eax),%%mm4\n\t"
    "movd "MANGLE(COS9)"+16,%%mm5\n\t"
    "punpckldq %%mm5,%%mm5\n\t"
    "pfmul %%mm5,%%mm4\n\t"
    "pfsub %%mm4,%%mm3\n\t"
    "movq 32(%%eax),%%mm4\n\t"
    "movd "MANGLE(COS9)"+32,%%mm5\n\t"
    "punpckldq %%mm5,%%mm5\n\t"
    "pfmul %%mm5,%%mm4\n\t"
    "pfadd %%mm4,%%mm3\n\t"
    "pfadd %%mm1,%%mm3\n\t"
    "movq 64(%%eax),%%mm4\n\t"
    "movd "MANGLE(COS9)"+8,%%mm5\n\t"
    "punpckldq %%mm5,%%mm5\n\t"
    "pfmul %%mm5,%%mm4\n\t"
    "pfsub %%mm4,%%mm3\n\t"
    /* mm4 = (mm2 + mm3) * {mm7.lo, tfcos36[3]}; mm5 = mm4.lo + mm4.hi in both halves. */
    "movq %%mm2,%%mm4\n\t"
    "pfadd %%mm3,%%mm4\n\t"
    "movq %%mm7,%%mm5\n\t"
    "punpckldq "MANGLE(tfcos36)"+12,%%mm5\n\t"
    "pfmul %%mm5,%%mm4\n\t"
    "movq %%mm4,%%mm5\n\t"
    "pfacc %%mm5,%%mm5\n\t"
    /* Scale the sum by the floats at 120 and 92(%edx); store at 48 and 20(%ecx). */
    "movd 120(%%edx),%%mm6\n\t"
    "punpckldq 92(%%edx),%%mm6\n\t"
    "pfmul %%mm6,%%mm5\n\t"
    "movd %%mm5,48(%%ecx)\n\t"
    "psrlq $32,%%mm5\n\t"
    "movd %%mm5,20(%%ecx)\n\t"
    /* mm5 = (mm4.lo - mm4.hi) in both halves, scaled by the floats at 20 and
     * 48(%edx), plus the floats at 20 and 48(%esi); stored at 640 and 1536(%ebx). */
    "movq %%mm4,%%mm6\n\t"
    "punpckldq %%mm6,%%mm5\n\t"
    "pfsub %%mm6,%%mm5\n\t"
    "punpckhdq %%mm5,%%mm5\n\t"
    "movd 20(%%edx),%%mm6\n\t"
    "punpckldq 48(%%edx),%%mm6\n\t"
    "pfmul %%mm6,%%mm5\n\t"
    "movd 20(%%esi),%%mm6\n\t"
    "punpckldq 48(%%esi),%%mm6\n\t"
    "pfadd %%mm6,%%mm5\n\t"
    "movd %%mm5,640(%%ebx)\n\t"
    "psrlq $32,%%mm5\n\t"
    "movd %%mm5,1536(%%ebx)\n\t"
    /* mm4 = (mm3 - mm2) * {mm7.lo, tfcos36[5]}; mm5 = mm4.lo + mm4.hi in both halves. */
    "movq %%mm3,%%mm4\n\t"
    "pfsub %%mm2,%%mm4\n\t"
    "movq %%mm7,%%mm5\n\t"
    "punpckldq "MANGLE(tfcos36)"+20,%%mm5\n\t"
    "pfmul %%mm5,%%mm4\n\t"
    "movq %%mm4,%%mm5\n\t"
    "pfacc %%mm5,%%mm5\n\t"
    /* Scale the sum by the floats at 128 and 84(%edx); store at 56 and 12(%ecx). */
    "movd 128(%%edx),%%mm6\n\t"
    "punpckldq 84(%%edx),%%mm6\n\t"
    "pfmul %%mm6,%%mm5\n\t"
    "movd %%mm5,56(%%ecx)\n\t"
    "psrlq $32,%%mm5\n\t"
    "movd %%mm5,12(%%ecx)\n\t"
    /* mm5 = (mm4.lo - mm4.hi) in both halves, scaled by the floats at 12 and
     * 56(%edx), plus the floats at 12 and 56(%esi); the low float goes to 384(%ebx). */
    "movq %%mm4,%%mm6\n\t"
    "punpckldq %%mm6,%%mm5\n\t"
    "pfsub %%mm6,%%mm5\n\t"
    "punpckhdq %%mm5,%%mm5\n\t"
    "movd 12(%%edx),%%mm6\n\t"
    "punpckldq 56(%%edx),%%mm6\n\t"
    "pfmul %%mm6,%%mm5\n\t"
    "movd 12(%%esi),%%mm6\n\t"
    "punpckldq 56(%%esi),%%mm6\n\t"
    "pfadd %%mm6,%%mm5\n\t"
    "movd %%mm5,384(%%ebx)\n\t"
460 "psrlq $32,%%mm5\n\t"
9163bdb578a6 moved 3dnow and 3dnowex dct36 optimisations into gcc inline assembly
alex
parents:
diff changeset
461 "movd %%mm5,1792(%%ebx)\n\t"
9163bdb578a6 moved 3dnow and 3dnowex dct36 optimisations into gcc inline assembly
alex
parents:
diff changeset
462
    "movq (%%eax),%%mm4\n\t"
    "movq 16(%%eax),%%mm3\n\t"
    "pfsub %%mm3,%%mm4\n\t"
    "movq 32(%%eax),%%mm3\n\t"
    "pfadd %%mm3,%%mm4\n\t"
    "movq 48(%%eax),%%mm3\n\t"
    "pfsub %%mm3,%%mm4\n\t"
    "movq 64(%%eax),%%mm3\n\t"
    "pfadd %%mm3,%%mm4\n\t"
    "movq %%mm7,%%mm5\n\t"
    "punpckldq "MANGLE(tfcos36)"+16,%%mm5\n\t"
    "pfmul %%mm5,%%mm4\n\t"
    "movq %%mm4,%%mm5\n\t"
    "pfacc %%mm5,%%mm5\n\t"
    "movd 124(%%edx),%%mm6\n\t"
    "punpckldq 88(%%edx),%%mm6\n\t"
    "pfmul %%mm6,%%mm5\n\t"
    "movd %%mm5,52(%%ecx)\n\t"
    "psrlq $32,%%mm5\n\t"
    "movd %%mm5,16(%%ecx)\n\t"
    "movq %%mm4,%%mm6\n\t"
    "punpckldq %%mm6,%%mm5\n\t"
    "pfsub %%mm6,%%mm5\n\t"
    "punpckhdq %%mm5,%%mm5\n\t"
    "movd 16(%%edx),%%mm6\n\t"
    "punpckldq 52(%%edx),%%mm6\n\t"
    "pfmul %%mm6,%%mm5\n\t"
    "movd 16(%%esi),%%mm6\n\t"
    "punpckldq 52(%%esi),%%mm6\n\t"
    "pfadd %%mm6,%%mm5\n\t"
    "movd %%mm5,512(%%ebx)\n\t"
    "psrlq $32,%%mm5\n\t"
    "movd %%mm5,1664(%%ebx)\n\t"

    /* Leave MMX/3DNow! state so that x87 floating-point code can run again. */
    "femms\n\t"
    : /* no outputs */
    /* Inputs: inbuf in eax, o1 in esi, o2 in ecx, wintab in edx, tsbuf in ebx. */
    : "a" (inbuf), "S" (o1), "c" (o2), "d" (wintab), "b" (tsbuf)
    /* The asm stores through the pointers above, so memory is clobbered. */
    : "memory");
}