annotate ppc/idct_altivec.c @ 7855:9a135b6a1dc7 libavcodec

Correct order of parsing for pulse scalefactor band and offset to match the specification. Patch by Alex Converse (alex converse gmail com)
author superdump
date Sat, 13 Sep 2008 18:47:43 +0000
parents 266d4949aa15
children 1615d6b75ada
rev   line source
828
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
1 /*
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
2 * Copyright (c) 2001 Michel Lespinasse
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
3 *
3947
c8c591fe26f8 Change license headers to say 'FFmpeg' instead of 'this program/this library'
diego
parents: 3036
diff changeset
4 * This file is part of FFmpeg.
c8c591fe26f8 Change license headers to say 'FFmpeg' instead of 'this program/this library'
diego
parents: 3036
diff changeset
5 *
c8c591fe26f8 Change license headers to say 'FFmpeg' instead of 'this program/this library'
diego
parents: 3036
diff changeset
6 * FFmpeg is free software; you can redistribute it and/or
828
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
7 * modify it under the terms of the GNU Lesser General Public
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
8 * License as published by the Free Software Foundation; either
3947
c8c591fe26f8 Change license headers to say 'FFmpeg' instead of 'this program/this library'
diego
parents: 3036
diff changeset
9 * version 2.1 of the License, or (at your option) any later version.
828
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
10 *
3947
c8c591fe26f8 Change license headers to say 'FFmpeg' instead of 'this program/this library'
diego
parents: 3036
diff changeset
11 * FFmpeg is distributed in the hope that it will be useful,
828
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
12 * but WITHOUT ANY WARRANTY; without even the implied warranty of
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
13 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
14 * Lesser General Public License for more details.
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
15 *
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
16 * You should have received a copy of the GNU Lesser General Public
3947
c8c591fe26f8 Change license headers to say 'FFmpeg' instead of 'this program/this library'
diego
parents: 3036
diff changeset
17 * License along with FFmpeg; if not, write to the Free Software
3036
0b546eab515d Update licensing information: The FSF changed postal address.
diego
parents: 2979
diff changeset
18 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
828
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
19 */
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
20
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
21 /*
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
22 * NOTE: This code is based on GPL code from the libmpeg2 project. The
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
23 * author, Michel Lespinasse, has given explicit permission to release
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
24 * under LGPL as part of ffmpeg.
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
25 */
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
26
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
27 /*
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
28 * FFmpeg integration by Dieter Shirley
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
29 *
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
30 * This file is a direct copy of the altivec idct module from the libmpeg2
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
31 * project. I've deleted all of the libmpeg2 specific code, renamed the functions and
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
32 * re-ordered the function parameters. The only change to the IDCT function
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
33 * itself was to factor out the partial transposition, and to perform a full
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
34 * transpose at the end of the function.
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
35 */
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
36
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
37
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
38 #include <stdlib.h> /* malloc(), free() */
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
39 #include <string.h>
6763
f7cbb7733146 Use full path for #includes from another directory.
diego
parents: 6105
diff changeset
40 #include "libavcodec/dsputil.h"
1277
f3152eb76f1a altivec gcc-3 fixes by (Magnus Damm <damm at opensource dot se>)
michaelni
parents: 1064
diff changeset
41
f3152eb76f1a altivec gcc-3 fixes by (Magnus Damm <damm at opensource dot se>)
michaelni
parents: 1064
diff changeset
42 #include "gcc_fixes.h"
f3152eb76f1a altivec gcc-3 fixes by (Magnus Damm <damm at opensource dot se>)
michaelni
parents: 1064
diff changeset
43
6105
33674fb857b5 Change some files to only include the necessary headers.
diego
parents: 5746
diff changeset
44 #include "dsputil_ppc.h"
828
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
45
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
46 #define vector_s16_t vector signed short
5746
55ed6dc5d476 Remove const vector macro indirection that is useless and obfuscating
diego
parents: 5215
diff changeset
47 #define const_vector_s16_t const vector signed short
828
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
48 #define vector_u16_t vector unsigned short
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
49 #define vector_s8_t vector signed char
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
50 #define vector_u8_t vector unsigned char
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
51 #define vector_s32_t vector signed int
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
52 #define vector_u32_t vector unsigned int
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
53
2979
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
54 #define IDCT_HALF \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
55 /* 1st stage */ \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
56 t1 = vec_mradds (a1, vx7, vx1 ); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
57 t8 = vec_mradds (a1, vx1, vec_subs (zero, vx7)); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
58 t7 = vec_mradds (a2, vx5, vx3); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
59 t3 = vec_mradds (ma2, vx3, vx5); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
60 \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
61 /* 2nd stage */ \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
62 t5 = vec_adds (vx0, vx4); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
63 t0 = vec_subs (vx0, vx4); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
64 t2 = vec_mradds (a0, vx6, vx2); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
65 t4 = vec_mradds (a0, vx2, vec_subs (zero, vx6)); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
66 t6 = vec_adds (t8, t3); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
67 t3 = vec_subs (t8, t3); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
68 t8 = vec_subs (t1, t7); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
69 t1 = vec_adds (t1, t7); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
70 \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
71 /* 3rd stage */ \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
72 t7 = vec_adds (t5, t2); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
73 t2 = vec_subs (t5, t2); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
74 t5 = vec_adds (t0, t4); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
75 t0 = vec_subs (t0, t4); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
76 t4 = vec_subs (t8, t3); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
77 t3 = vec_adds (t8, t3); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
78 \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
79 /* 4th stage */ \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
80 vy0 = vec_adds (t7, t1); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
81 vy7 = vec_subs (t7, t1); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
82 vy1 = vec_mradds (c4, t3, t5); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
83 vy6 = vec_mradds (mc4, t3, t5); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
84 vy2 = vec_mradds (c4, t4, t0); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
85 vy5 = vec_mradds (mc4, t4, t0); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
86 vy3 = vec_adds (t2, t6); \
828
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
87 vy4 = vec_subs (t2, t6);
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
88
2967
ef2149182f1c COSMETICS: Remove all trailing whitespace.
diego
parents: 1839
diff changeset
89
2979
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
90 #define IDCT \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
91 vector_s16_t vx0, vx1, vx2, vx3, vx4, vx5, vx6, vx7; \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
92 vector_s16_t vy0, vy1, vy2, vy3, vy4, vy5, vy6, vy7; \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
93 vector_s16_t a0, a1, a2, ma2, c4, mc4, zero, bias; \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
94 vector_s16_t t0, t1, t2, t3, t4, t5, t6, t7, t8; \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
95 vector_u16_t shift; \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
96 \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
97 c4 = vec_splat (constants[0], 0); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
98 a0 = vec_splat (constants[0], 1); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
99 a1 = vec_splat (constants[0], 2); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
100 a2 = vec_splat (constants[0], 3); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
101 mc4 = vec_splat (constants[0], 4); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
102 ma2 = vec_splat (constants[0], 5); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
103 bias = (vector_s16_t)vec_splat ((vector_s32_t)constants[0], 3); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
104 \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
105 zero = vec_splat_s16 (0); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
106 shift = vec_splat_u16 (4); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
107 \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
108 vx0 = vec_mradds (vec_sl (block[0], shift), constants[1], zero); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
109 vx1 = vec_mradds (vec_sl (block[1], shift), constants[2], zero); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
110 vx2 = vec_mradds (vec_sl (block[2], shift), constants[3], zero); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
111 vx3 = vec_mradds (vec_sl (block[3], shift), constants[4], zero); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
112 vx4 = vec_mradds (vec_sl (block[4], shift), constants[1], zero); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
113 vx5 = vec_mradds (vec_sl (block[5], shift), constants[4], zero); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
114 vx6 = vec_mradds (vec_sl (block[6], shift), constants[3], zero); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
115 vx7 = vec_mradds (vec_sl (block[7], shift), constants[2], zero); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
116 \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
117 IDCT_HALF \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
118 \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
119 vx0 = vec_mergeh (vy0, vy4); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
120 vx1 = vec_mergel (vy0, vy4); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
121 vx2 = vec_mergeh (vy1, vy5); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
122 vx3 = vec_mergel (vy1, vy5); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
123 vx4 = vec_mergeh (vy2, vy6); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
124 vx5 = vec_mergel (vy2, vy6); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
125 vx6 = vec_mergeh (vy3, vy7); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
126 vx7 = vec_mergel (vy3, vy7); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
127 \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
128 vy0 = vec_mergeh (vx0, vx4); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
129 vy1 = vec_mergel (vx0, vx4); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
130 vy2 = vec_mergeh (vx1, vx5); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
131 vy3 = vec_mergel (vx1, vx5); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
132 vy4 = vec_mergeh (vx2, vx6); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
133 vy5 = vec_mergel (vx2, vx6); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
134 vy6 = vec_mergeh (vx3, vx7); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
135 vy7 = vec_mergel (vx3, vx7); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
136 \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
137 vx0 = vec_adds (vec_mergeh (vy0, vy4), bias); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
138 vx1 = vec_mergel (vy0, vy4); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
139 vx2 = vec_mergeh (vy1, vy5); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
140 vx3 = vec_mergel (vy1, vy5); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
141 vx4 = vec_mergeh (vy2, vy6); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
142 vx5 = vec_mergel (vy2, vy6); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
143 vx6 = vec_mergeh (vy3, vy7); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
144 vx7 = vec_mergel (vy3, vy7); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
145 \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
146 IDCT_HALF \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
147 \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
148 shift = vec_splat_u16 (6); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
149 vx0 = vec_sra (vy0, shift); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
150 vx1 = vec_sra (vy1, shift); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
151 vx2 = vec_sra (vy2, shift); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
152 vx3 = vec_sra (vy3, shift); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
153 vx4 = vec_sra (vy4, shift); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
154 vx5 = vec_sra (vy5, shift); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
155 vx6 = vec_sra (vy6, shift); \
828
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
156 vx7 = vec_sra (vy7, shift);
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
157
1033
b4172ff70d27 Altivec on non darwin systems patch by Romain Dolbeau
bellard
parents: 1015
diff changeset
158
1839
b370288f004d Metrowerks CodeWarrior patches by (John Dalgliesh <johnd at defyne dot org>)
michael
parents: 1352
diff changeset
159 static const_vector_s16_t constants[5] = {
7373
266d4949aa15 Remove AltiVec vector declaration compiler compatibility macros.
diego
parents: 7333
diff changeset
160 {23170, 13573, 6518, 21895, -23170, -21895, 32, 31},
266d4949aa15 Remove AltiVec vector declaration compiler compatibility macros.
diego
parents: 7333
diff changeset
161 {16384, 22725, 21407, 19266, 16384, 19266, 21407, 22725},
266d4949aa15 Remove AltiVec vector declaration compiler compatibility macros.
diego
parents: 7333
diff changeset
162 {22725, 31521, 29692, 26722, 22725, 26722, 29692, 31521},
266d4949aa15 Remove AltiVec vector declaration compiler compatibility macros.
diego
parents: 7333
diff changeset
163 {21407, 29692, 27969, 25172, 21407, 25172, 27969, 29692},
266d4949aa15 Remove AltiVec vector declaration compiler compatibility macros.
diego
parents: 7333
diff changeset
164 {19266, 26722, 25172, 22654, 19266, 22654, 25172, 26722}
828
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
165 };
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
166
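Note on the table above: the first row of `constants` looks like Q15 fixed-point data (values scaled by 2^15). The numbers are consistent with cos(pi/4), tan(pi/8), tan(pi/16) and tan(3*pi/16), followed by negated copies of the first and fourth entries and the rounding terms 32 and 31. This reading is inferred from the values themselves; the source does not state it. The sketch below checks the interpretation in plain C using trigonometric identities, so it needs neither AltiVec hardware nor the math library. The names `row0`, `q15_to_double`, `absdiff` and `check_row0` are hypothetical helpers, not part of the file.

```c
#include <assert.h>
#include <stdint.h>

/* First row of constants[], copied from the listing. */
static const int16_t row0[8] = {
    23170, 13573, 6518, 21895, -23170, -21895, 32, 31
};

/* Interpret a 16-bit value as Q15 fixed point (assumed scale: 2^15). */
static double q15_to_double(int16_t q)
{
    return q / 32768.0;
}

/* Absolute difference, written out to avoid depending on <math.h>. */
static double absdiff(double x, double y)
{
    return x > y ? x - y : y - x;
}

/* Check the assumed meaning of each entry with identities only. */
static void check_row0(void)
{
    const double tol = 1e-4;               /* a few Q15 steps */
    double c4 = q15_to_double(row0[0]);    /* assumed cos(pi/4)    */
    double a0 = q15_to_double(row0[1]);    /* assumed tan(pi/8)    */
    double a1 = q15_to_double(row0[2]);    /* assumed tan(pi/16)   */
    double a2 = q15_to_double(row0[3]);    /* assumed tan(3*pi/16) */

    /* cos^2(pi/4) = 1/2 */
    assert(absdiff(2.0 * c4 * c4, 1.0) < tol);
    /* tan(pi/8) = sqrt(2) - 1, hence (tan(pi/8) + 1)^2 = 2 */
    assert(absdiff((a0 + 1.0) * (a0 + 1.0), 2.0) < tol);
    /* double-angle formula: tan(pi/8) = 2t / (1 - t^2), t = tan(pi/16) */
    assert(absdiff(2.0 * a1 / (1.0 - a1 * a1), a0) < tol);
    /* tan(3*pi/16) = tan(pi/4 - pi/16) = (1 - t) / (1 + t) */
    assert(absdiff((1.0 - a1) / (1.0 + a1), a2) < tol);
    /* entries 4 and 5 are the negated c4 and a2 (mc4 and ma2 in IDCT) */
    assert(row0[4] == -row0[0]);
    assert(row0[5] == -row0[3]);
}
```

Calling `check_row0()` from a test harness passes silently if the interpretation holds within the stated tolerance. The remaining four rows are used as per-row prescale factors in the `IDCT` macro; their derivation is not checked here.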
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
167 void idct_put_altivec(uint8_t* dest, int stride, vector_s16_t* block)
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
168 {
1352
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1277
diff changeset
169 POWERPC_PERF_DECLARE(altivec_idct_put_num, 1);
828
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
170 vector_u8_t tmp;
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
171
4521
891590781d9e rename POWERPC_PERFORMANCE_REPORT to CONFIG_POWERPC_PERF
mru
parents: 3973
diff changeset
172 #ifdef CONFIG_POWERPC_PERF
1352
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1277
diff changeset
173 POWERPC_PERF_START_COUNT(altivec_idct_put_num, 1);
1839
b370288f004d Metrowerks CodeWarrior patches by (John Dalgliesh <johnd at defyne dot org>)
michael
parents: 1352
diff changeset
174 #endif
828
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
175 IDCT
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
176
2979
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
177 #define COPY(dest,src) \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
178 tmp = vec_packsu (src, src); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
179 vec_ste ((vector_u32_t)tmp, 0, (unsigned int *)dest); \
828
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
180 vec_ste ((vector_u32_t)tmp, 4, (unsigned int *)dest);
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
181
2979
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
182 COPY (dest, vx0) dest += stride;
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
183 COPY (dest, vx1) dest += stride;
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
184 COPY (dest, vx2) dest += stride;
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
185 COPY (dest, vx3) dest += stride;
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
186 COPY (dest, vx4) dest += stride;
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
187 COPY (dest, vx5) dest += stride;
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
188 COPY (dest, vx6) dest += stride;
828
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
189 COPY (dest, vx7)
1009
3b7cc8e4b83f AltiVec perf (take 2), plus a couple AltiVec functions by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 828
diff changeset
190
1352
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1277
diff changeset
191 POWERPC_PERF_STOP_COUNT(altivec_idct_put_num, 1);
828
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
192 }
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
193
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
194 void idct_add_altivec(uint8_t* dest, int stride, vector_s16_t* block)
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
195 {
1352
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1277
diff changeset
196 POWERPC_PERF_DECLARE(altivec_idct_add_num, 1);
828
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
197 vector_u8_t tmp;
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
198 vector_s16_t tmp2, tmp3;
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
199 vector_u8_t perm0;
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
200 vector_u8_t perm1;
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
201 vector_u8_t p0, p1, p;
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
202
4521
891590781d9e rename POWERPC_PERFORMANCE_REPORT to CONFIG_POWERPC_PERF
mru
parents: 3973
diff changeset
203 #ifdef CONFIG_POWERPC_PERF
1352
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1277
diff changeset
204 POWERPC_PERF_START_COUNT(altivec_idct_add_num, 1);
1839
b370288f004d Metrowerks CodeWarrior patches by (John Dalgliesh <johnd at defyne dot org>)
michael
parents: 1352
diff changeset
205 #endif
1009
3b7cc8e4b83f AltiVec perf (take 2), plus a couple AltiVec functions by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 828
diff changeset
206
828
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
207 IDCT
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
208
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
209 p0 = vec_lvsl (0, dest);
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
210 p1 = vec_lvsl (stride, dest);
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
211 p = vec_splat_u8 (-1);
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
212 perm0 = vec_mergeh (p, p0);
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
213 perm1 = vec_mergeh (p, p1);
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
214
2979
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
215 #define ADD(dest,src,perm) \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
216 /* *(uint64_t *)&tmp = *(uint64_t *)dest; */ \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
217 tmp = vec_ld (0, dest); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
218 tmp2 = (vector_s16_t)vec_perm (tmp, (vector_u8_t)zero, perm); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
219 tmp3 = vec_adds (tmp2, src); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
220 tmp = vec_packsu (tmp3, tmp3); \
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
221 vec_ste ((vector_u32_t)tmp, 0, (unsigned int *)dest); \
828
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
222 vec_ste ((vector_u32_t)tmp, 4, (unsigned int *)dest);
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
223
2979
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
224 ADD (dest, vx0, perm0) dest += stride;
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
225 ADD (dest, vx1, perm1) dest += stride;
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
226 ADD (dest, vx2, perm0) dest += stride;
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
227 ADD (dest, vx3, perm1) dest += stride;
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
228 ADD (dest, vx4, perm0) dest += stride;
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
229 ADD (dest, vx5, perm1) dest += stride;
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
230 ADD (dest, vx6, perm0) dest += stride;
828
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
231 ADD (dest, vx7, perm1)
1009
3b7cc8e4b83f AltiVec perf (take 2), plus a couple AltiVec functions by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 828
diff changeset
232
1352
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1277
diff changeset
233 POWERPC_PERF_STOP_COUNT(altivec_idct_add_num, 1);
828
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
234 }
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents:
diff changeset
235
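Note on the arithmetic used in the listing: `IDCT_HALF` is built almost entirely from `vec_mradds`, `vec_adds` and `vec_subs`, and the `COPY` and `ADD` macros finish with `vec_packsu`. Per 16-bit lane, `vec_mradds(a, b, c)` multiplies `a` by `b`, adds 2^14 for rounding, keeps the product shifted right by 15 bits, adds `c`, and saturates to the signed 16-bit range. `vec_packsu` clamps each signed 16-bit lane to 0..255. The following scalar sketch models one lane of each so the behaviour can be tried on any machine. The function names `sat_s16`, `mradds_lane`, `packsu_lane` and `lane_examples` are hypothetical and do not appear in the file; this is an illustration, not a replacement for the vector code.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a 32-bit intermediate to the signed 16-bit range. */
static int16_t sat_s16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Scalar model of one lane of vec_mradds(a, b, c):
 * saturate(floor((a * b + 2^14) / 2^15) + c). */
static int16_t mradds_lane(int16_t a, int16_t b, int16_t c)
{
    int32_t t = (int32_t)a * (int32_t)b + 0x4000;
    /* Floor division written out explicitly, because right-shifting a
     * negative signed value is implementation-defined in C. */
    int32_t high = t >= 0 ? t / 32768 : -((-t + 32767) / 32768);
    return sat_s16(high + c);
}

/* Scalar model of one lane of vec_packsu: signed 16-bit to unsigned 8-bit. */
static uint8_t packsu_lane(int16_t v)
{
    if (v > 255) return 255;
    if (v < 0)   return 0;
    return (uint8_t)v;
}

/* A few worked values. */
static void lane_examples(void)
{
    /* 0.5 * 0.5 in Q15 gives 0.25 in Q15 */
    assert(mradds_lane(16384, 16384, 0) == 8192);
    /* the third operand is added after the multiply */
    assert(mradds_lane(16384, 16384, 100) == 8292);
    /* a coefficient of 1 shifted left by 4, times 16384, as in the prescale */
    assert(mradds_lane(16, 16384, 0) == 8);
    assert(mradds_lane(-16, 16384, 0) == -8);
    /* results saturate instead of wrapping */
    assert(mradds_lane(32767, 32767, 32767) == 32767);
    assert(mradds_lane(-32768, 32767, -32768) == -32768);
    /* pixel clamp applied before the stores */
    assert(packsu_lane(300) == 255);
    assert(packsu_lane(-5) == 0);
    assert(packsu_lane(77) == 77);
}
```

The saturating behaviour is why the listing can chain many additions and multiplies in 16-bit lanes without explicit overflow checks: out-of-range intermediates are pinned to the limits rather than wrapping around.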