annotate postproc/yuv2rgb_altivec.c @ 17557:3f863d1d8b43

vYCoeffsBank and vCCoeffsBank are allocated and initialized using incorrect sizes based on the image width instead of height. patch by Alan Curry, pacman at world dot std dot com
author diego
date Wed, 08 Feb 2006 08:16:53 +0000
parents 08cac43f1e38
children ad90899eeee6
rev   line source
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
1 /*
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
2 marc.hoffman@analog.com March 8, 2004
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
3
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
4 Altivec Acceleration for Color Space Conversion revision 0.2
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
5
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
6 convert I420 YV12 to RGB in various formats,
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
7 it rejects images that are not in 420 formats
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
8 it rejects images that don't have widths of multiples of 16
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
9 it rejects images that don't have heights of multiples of 2
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
10 rejected images fall back to the C simulation code.
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
11
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
12 lots of optimizations to be done here
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
13
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
14 1. need to fix saturation code, I just couldn't get it to fly with packs and adds.
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
15 so we currently use max min to clip
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
16
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
17 2. the inefficient use of chroma loading needs a bit of brushing up
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
18
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
19 3. analysis of pipeline stalls needs to be done, use shark to identify pipeline stalls
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
20
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
21
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
22 MODIFIED to calculate coeffs from currently selected color space.
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
23 MODIFIED core to be a macro which you spec the output format.
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
24 ADDED UYVY conversion which is never called due to something in SWSCALE.
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
25 CORRECTED algorithm selection to be strict on input formats.
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
26 ADDED runtime detection of altivec.
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
27
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
28 ADDED altivec_yuv2packedX vertical scl + RGB converter
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
29
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
30 March 27,2004
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
31 PERFORMANCE ANALYSIS
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
32
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
33 The C version uses 25% of the processor or ~250Mips for D1 video (rawvideo used as test)
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
34 The ALTIVEC version uses 10% of the processor or ~100Mips for the same D1 video sequence
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
35
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
36 720*480*30 ~10MPS
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
37
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
38 so we have roughly 10 clocks per pixel; this is too high, something has to be wrong.
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
39
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
40 OPTIMIZED clip codes to utilize vec_max and vec_packs removing the need for vec_min.
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
41
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
42 OPTIMIZED DST OUTPUT cache/dma controls. we are pretty much
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
43 guaranteed to have the input video frame, as it was just decompressed, so
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
44 it probably resides in L1 caches. However we are creating the
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
45 output video stream; this needs to use the DSTST instruction to
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
46 optimize for the cache. We couple this with the fact that we are
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
47 not going to be visiting the input buffer again so we mark it Least
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
48 Recently Used. This shaves 25% of the processor cycles off.
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
49
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
50 Now MEMCPY is the largest mips consumer in the system, probably due
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
51 to the inefficient X11 stuff.
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
52
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
53 GL libraries seem to be very slow on this machine 1.33Ghz PB running
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
54 Jaguar, this is not the case for my 1Ghz PB. I thought it might be
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
55 a versioning issue, however I have libGL.1.2.dylib for both
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
56 machines. ((We need to figure this out now))
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
57
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
58 GL2 libraries work now with patch for RGB32
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
59
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
60 NOTE quartz vo driver ARGB32_to_RGB24 consumes 30% of the processor
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
61
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
62 Integrated luma prescaling adjustment for saturation/contrast/brightness adjustment.
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
63
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
64 */
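The throughput figures in the header comment above can be rechecked with a few lines of arithmetic. The sketch below is not part of the original file: the function names are hypothetical, and it assumes that one "Mips" corresponds to roughly one million clock cycles per second, which the comment implies but does not state.

```c
#include <stdio.h>

/* Pixels processed per second for a given frame geometry and frame rate. */
long pixel_rate(long width, long height, long fps)
{
    return width * height * fps;
}

/* Instruction budget per pixel (clock budget under the assumption above). */
double cost_per_pixel(double mips, long pixels_per_second)
{
    return mips * 1e6 / (double)pixels_per_second;
}

/* Prints both figures for the D1 case quoted in the header comment. */
void print_d1_budget(void)
{
    long rate = pixel_rate(720, 480, 30);
    printf("pixels/s: %ld\n", rate);
    printf("cost per pixel at 100 Mips: %.2f\n", cost_per_pixel(100.0, rate));
}
```

For D1 video this gives 10,368,000 pixels per second, so the ~100 Mips measured for the AltiVec path works out to a little under 10 per pixel, in line with the "roughly 10 clocks per pixel" estimate.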
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
65 #include <stdio.h>
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
66 #include <stdlib.h>
12836
9a310b31359f some fixes
alex
parents: 12698
diff changeset
67 #include <string.h>
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
68 #include <inttypes.h>
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
69 #include <assert.h>
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
70 #include "config.h"
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
71 #include "rgb2rgb.h"
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
72 #include "swscale.h"
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
73 #include "swscale_internal.h"
16985
08cac43f1e38 Unify include paths, -I.. is in CFLAGS.
diego
parents: 13564
diff changeset
74 #include "mangle.h"
08cac43f1e38 Unify include paths, -I.. is in CFLAGS.
diego
parents: 13564
diff changeset
75 #include "libvo/img_format.h" //FIXME try to reduce dependency of such stuff
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
76
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
77 #undef PROFILE_THE_BEAST
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
78 #undef INC_SCALING
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
79
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
80 typedef unsigned char ubyte;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
81 typedef signed char sbyte;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
82
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
83
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
84 /* RGB interleaver, 16 planar pels 8-bit samples per channel in
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
85 homogeneous vector registers x0,x1,x2 are interleaved with the
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
86 following technique:
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
87
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
88 o0 = vec_mergeh (x0,x1);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
89 o1 = vec_perm (o0, x2, perm_rgb_0);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
90 o2 = vec_perm (o0, x2, perm_rgb_1);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
91 o3 = vec_mergel (x0,x1);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
92 o4 = vec_perm (o3,o2,perm_rgb_2);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
93 o5 = vec_perm (o3,o2,perm_rgb_3);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
94
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
95 perm_rgb_0: o0(RG).h v1(B) --> o1*
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
96 0 1 2 3 4
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
97 rgbr|gbrg|brgb|rgbr
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
98 0010 0100 1001 0010
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
99 0102 3145 2673 894A
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
100
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
101 perm_rgb_1: o0(RG).h v1(B) --> o2
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
102 0 1 2 3 4
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
103 gbrg|brgb|bbbb|bbbb
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
104 0100 1001 1111 1111
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
105 B5CD 6EF7 89AB CDEF
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
106
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
107 perm_rgb_2: o3(RG).l o2(rgbB.l) --> o4*
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
108 0 1 2 3 4
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
109 gbrg|brgb|rgbr|gbrg
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
110 1111 1111 0010 0100
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
111 89AB CDEF 0182 3945
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
112
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
113 perm_rgb_3: o3(RG).l o2(rgbB.l) ---> o5*
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
114 0 1 2 3 4
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
115 brgb|rgbr|gbrg|brgb
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
116 1001 0010 0100 1001
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
117 a67b 89cA BdCD eEFf
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
118
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
119 */
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
120 static
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
121 const vector unsigned char
13564
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
122 perm_rgb_0 = (const vector unsigned char)AVV(0x00,0x01,0x10,0x02,0x03,0x11,0x04,0x05,
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
123 0x12,0x06,0x07,0x13,0x08,0x09,0x14,0x0a),
13564
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
124 perm_rgb_1 = (const vector unsigned char)AVV(0x0b,0x15,0x0c,0x0d,0x16,0x0e,0x0f,0x17,
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
125 0x18,0x19,0x1a,0x1b,0x1c,0x1d,0x1e,0x1f),
13564
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
126 perm_rgb_2 = (const vector unsigned char)AVV(0x10,0x11,0x12,0x13,0x14,0x15,0x16,0x17,
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
127 0x00,0x01,0x18,0x02,0x03,0x19,0x04,0x05),
13564
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
128 perm_rgb_3 = (const vector unsigned char)AVV(0x1a,0x06,0x07,0x1b,0x08,0x09,0x1c,0x0a,
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
129 0x0b,0x1d,0x0c,0x0d,0x1e,0x0e,0x0f,0x1f);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
130
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
131 #define vec_merge3(x2,x1,x0,y0,y1,y2) \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
132 do { \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
133 typeof(x0) o0,o2,o3; \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
134 o0 = vec_mergeh (x0,x1); \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
135 y0 = vec_perm (o0, x2, perm_rgb_0);\
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
136 o2 = vec_perm (o0, x2, perm_rgb_1);\
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
137 o3 = vec_mergel (x0,x1); \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
138 y1 = vec_perm (o3,o2,perm_rgb_2); \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
139 y2 = vec_perm (o3,o2,perm_rgb_3); \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
140 } while(0)
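The permute tables and the vec_merge3 steps above can be checked without AltiVec hardware using a scalar model. The sketch below is an illustration only and not part of the original file. The type v16 and the functions model_perm, model_merge and merge3_model are hypothetical names, and the model assumes the usual AltiVec semantics: vec_perm selects bytes 0x00-0x0f from its first operand and 0x10-0x1f from its second, while vec_mergeh and vec_mergel interleave the first and last eight bytes respectively.

```c
#include <stdint.h>

/* Scalar stand-in for a 16-byte AltiVec register (element 0 first). */
typedef struct { uint8_t b[16]; } v16;

/* Same index tables as perm_rgb_0..perm_rgb_3 in the listing. */
static const v16 model_perm_rgb[4] = {
    {{0x00,0x01,0x10,0x02,0x03,0x11,0x04,0x05,
      0x12,0x06,0x07,0x13,0x08,0x09,0x14,0x0a}},
    {{0x0b,0x15,0x0c,0x0d,0x16,0x0e,0x0f,0x17,
      0x18,0x19,0x1a,0x1b,0x1c,0x1d,0x1e,0x1f}},
    {{0x10,0x11,0x12,0x13,0x14,0x15,0x16,0x17,
      0x00,0x01,0x18,0x02,0x03,0x19,0x04,0x05}},
    {{0x1a,0x06,0x07,0x1b,0x08,0x09,0x1c,0x0a,
      0x0b,0x1d,0x0c,0x0d,0x1e,0x0e,0x0f,0x1f}},
};

/* vec_perm model: indices 0x00-0x0f pick from a, 0x10-0x1f pick from b. */
static v16 model_perm(v16 a, v16 b, v16 sel)
{
    v16 r;
    for (int i = 0; i < 16; i++) {
        uint8_t idx = sel.b[i] & 0x1f;
        r.b[i] = idx < 16 ? a.b[idx] : b.b[idx - 16];
    }
    return r;
}

/* vec_mergeh (low_half = 0) / vec_mergel (low_half = 1) model. */
static v16 model_merge(v16 a, v16 b, int low_half)
{
    v16 r;
    int base = low_half ? 8 : 0;
    for (int i = 0; i < 8; i++) {
        r.b[2 * i]     = a.b[base + i];
        r.b[2 * i + 1] = b.b[base + i];
    }
    return r;
}

/* Same steps as vec_merge3: three 16-byte planes in, 48 interleaved bytes out. */
void merge3_model(const uint8_t p0[16], const uint8_t p1[16],
                  const uint8_t p2[16], uint8_t out[48])
{
    v16 x0, x1, x2;
    for (int i = 0; i < 16; i++) {
        x0.b[i] = p0[i];
        x1.b[i] = p1[i];
        x2.b[i] = p2[i];
    }
    v16 o0 = model_merge(x0, x1, 0);
    v16 y0 = model_perm(o0, x2, model_perm_rgb[0]);
    v16 o2 = model_perm(o0, x2, model_perm_rgb[1]);
    v16 o3 = model_merge(x0, x1, 1);
    v16 y1 = model_perm(o3, o2, model_perm_rgb[2]);
    v16 y2 = model_perm(o3, o2, model_perm_rgb[3]);
    for (int i = 0; i < 16; i++) {
        out[i]      = y0.b[i];
        out[16 + i] = y1.b[i];
        out[32 + i] = y2.b[i];
    }
}
```

Feeding three distinguishable planes through merge3_model should yield out[3*i], out[3*i+1], out[3*i+2] equal to p0[i], p1[i], p2[i] for every i, which is the packed 24-bit layout the macro is after.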
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
141
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
142 #define vec_mstrgb24(x0,x1,x2,ptr) \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
143 do { \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
144 typeof(x0) _0,_1,_2; \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
145 vec_merge3 (x0,x1,x2,_0,_1,_2); \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
146 vec_st (_0, 0, ptr++); \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
147 vec_st (_1, 0, ptr++); \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
148 vec_st (_2, 0, ptr++); \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
149 } while (0);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
150
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
151 #define vec_mstbgr24(x0,x1,x2,ptr) \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
152 do { \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
153 typeof(x0) _0,_1,_2; \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
154 vec_merge3 (x2,x1,x0,_0,_1,_2); \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
155 vec_st (_0, 0, ptr++); \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
156 vec_st (_1, 0, ptr++); \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
157 vec_st (_2, 0, ptr++); \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
158 } while (0);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
159
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
160 /* pack the pixels in rgb0 format
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
161 msb R
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
162 lsb 0
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
163 */
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
164 #define vec_mstrgb32(T,x0,x1,x2,x3,ptr) \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
165 do { \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
166 T _0,_1,_2,_3; \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
167 _0 = vec_mergeh (x0,x1); \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
168 _1 = vec_mergeh (x2,x3); \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
169 _2 = (T)vec_mergeh ((vector unsigned short)_0,(vector unsigned short)_1); \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
170 _3 = (T)vec_mergel ((vector unsigned short)_0,(vector unsigned short)_1); \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
171 vec_st (_2, 0*16, (T *)ptr); \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
172 vec_st (_3, 1*16, (T *)ptr); \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
173 _0 = vec_mergel (x0,x1); \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
174 _1 = vec_mergel (x2,x3); \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
175 _2 = (T)vec_mergeh ((vector unsigned short)_0,(vector unsigned short)_1); \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
176 _3 = (T)vec_mergel ((vector unsigned short)_0,(vector unsigned short)_1); \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
177 vec_st (_2, 2*16, (T *)ptr); \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
178 vec_st (_3, 3*16, (T *)ptr); \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
179 ptr += 4; \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
180 } while (0);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
181
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
182 /*
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
183
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
184 | 1 0 1.4021 | | Y |
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
185 | 1 -0.3441 -0.7142 |x| Cb|
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
186 | 1 1.7718 0 | | Cr|
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
187
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
188
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
189 Y: [-128 127]
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
190 Cb/Cr : [-128 127]
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
191
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
192 Typical YUV conversions work on Y: 0-255; this version has been optimized for JPEG decode.
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
193
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
194 */
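As a cross-check for the matrix in the comment above, a floating-point reference for a single pixel might look like the following. This is a sketch, not code from the file: the names clamp_u8 and ycbcr_to_rgb_ref are hypothetical, the coefficients are copied from the comment, and the model assumes Cb and Cr have already been centered on zero while Y is passed through unchanged.

```c
#include <stdint.h>

/* Clamp to the 8-bit range and round to nearest. */
static uint8_t clamp_u8(double x)
{
    if (x < 0.0)
        return 0;
    if (x > 255.0)
        return 255;
    return (uint8_t)(x + 0.5);
}

/* cb and cr are assumed to be centered on zero, i.e. in [-128, 127]. */
void ycbcr_to_rgb_ref(int y, int cb, int cr, uint8_t rgb[3])
{
    rgb[0] = clamp_u8(y + 1.4021 * cr);                /* R row of the matrix */
    rgb[1] = clamp_u8(y - 0.3441 * cb - 0.7142 * cr);  /* G row of the matrix */
    rgb[2] = clamp_u8(y + 1.7718 * cb);                /* B row of the matrix */
}
```

A neutral input such as y = 128, cb = 0, cr = 0 maps to the gray value (128, 128, 128), which is a quick sanity test for any fixed-point version of the same matrix.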
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
195
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
196
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
197
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
198
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
199 #define vec_unh(x) \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
200 (vector signed short) \
13564
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
201 vec_perm(x,(typeof(x))AVV(0),\
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
202 (vector unsigned char)AVV(0x10,0x00,0x10,0x01,0x10,0x02,0x10,0x03,\
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
203 0x10,0x04,0x10,0x05,0x10,0x06,0x10,0x07))
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
204 #define vec_unl(x) \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
205 (vector signed short) \
13564
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
206 vec_perm(x,(typeof(x))AVV(0),\
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
207 (vector unsigned char)AVV(0x10,0x08,0x10,0x09,0x10,0x0A,0x10,0x0B,\
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
208 0x10,0x0C,0x10,0x0D,0x10,0x0E,0x10,0x0F))
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
209
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
210 #define vec_clip(x) \
13564
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
211 vec_max (vec_min (x, (typeof(x))AVV(235)), (typeof(x))AVV(16))
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
212
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
213 #define vec_packclp_a(x,y) \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
214 (vector unsigned char)vec_pack (vec_clip (x), vec_clip (y))
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
215
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
216 #define vec_packclp(x,y) \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
217 (vector unsigned char)vec_packs \
13564
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
218 ((vector unsigned short)vec_max (x,(vector signed short) AVV(0)), \
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
219 (vector unsigned short)vec_max (y,(vector signed short) AVV(0)))
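The two clipping macros above behave differently, which is easier to see in scalar form. The following per-lane sketch is an illustration only: packclp_model and packclp_a_model are hypothetical names, and the model assumes that vec_packs on unsigned halfwords saturates at 255 while vec_pack simply keeps the low byte.

```c
#include <stdint.h>

/* vec_packclp model: max(x, 0), then unsigned saturating pack to 8 bits. */
uint8_t packclp_model(int16_t x)
{
    uint16_t u = (uint16_t)(x < 0 ? 0 : x);  /* vec_max (x, 0) */
    return u > 255 ? 255 : (uint8_t)u;       /* vec_packs saturates at 255 */
}

/* vec_packclp_a model: clamp to [16, 235], then truncating pack. */
uint8_t packclp_a_model(int16_t x)
{
    int16_t c = x > 235 ? 235 : x;           /* vec_min (x, 235) */
    c = c < 16 ? 16 : c;                     /* vec_max (.., 16) */
    return (uint8_t)(c & 0xff);              /* vec_pack keeps the low byte */
}
```

The first form yields the full 0-255 range and needs only one explicit comparison per lane, since the saturating pack supplies the upper bound; this matches the header note about dropping vec_min. The second form restricts output to 16-235.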
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
220
13564
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
221 //#define out_pixels(a,b,c,ptr) vec_mstrgb32(typeof(a),((typeof (a))AVV(0)),a,a,a,ptr)
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
222
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
223
12836
9a310b31359f some fixes
alex
parents: 12698
diff changeset
224 static inline void cvtyuvtoRGB (SwsContext *c,
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
225 vector signed short Y, vector signed short U, vector signed short V,
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
226 vector signed short *R, vector signed short *G, vector signed short *B)
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
227 {
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
228 vector signed short vx,ux,uvx;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
229
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
230 Y = vec_mradds (Y, c->CY, c->OY);
13564
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
231 U = vec_sub (U,(vector signed short)
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
232 vec_splat((vector signed short)AVV(128),0));
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
233 V = vec_sub (V,(vector signed short)
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
234 vec_splat((vector signed short)AVV(128),0));
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
235
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
236 // ux = (CBU*(u<<c->CSHIFT)+0x4000)>>15;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
237 ux = vec_sl (U, c->CSHIFT);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
238 *B = vec_mradds (ux, c->CBU, Y);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
239
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
240 // vx = (CRV*(v<<c->CSHIFT)+0x4000)>>15;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
241 vx = vec_sl (V, c->CSHIFT);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
242 *R = vec_mradds (vx, c->CRV, Y);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
243
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
244 // uvx = ((CGU*u) + (CGV*v))>>15;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
245 uvx = vec_mradds (U, c->CGU, Y);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
246 *G = vec_mradds (V, c->CGV, uvx);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
247 }
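The inline comments in cvtyuvtoRGB describe each term as (coeff * value + 0x4000) >> 15, which is what one lane of vec_mradds computes before adding its third operand with saturation. A scalar model of a single lane, useful for checking coefficient tables off-target, could be sketched as below. The names sat16, asr15 and mradds_model are hypothetical and not part of the file.

```c
#include <stdint.h>

/* Saturate a 32-bit value to the signed 16-bit range. */
static int16_t sat16(int32_t v)
{
    if (v > 32767)
        return 32767;
    if (v < -32768)
        return -32768;
    return (int16_t)v;
}

/* Floor division by 2^15 that does not rely on right-shifting negatives. */
static int32_t asr15(int32_t v)
{
    return v >= 0 ? v >> 15 : -((-v + 32767) >> 15);
}

/* One lane of vec_mradds: sat(((a * b + 0x4000) >> 15) + c). */
int16_t mradds_model(int16_t a, int16_t b, int16_t c)
{
    int32_t prod = (int32_t)a * (int32_t)b;
    return sat16(asr15(prod + 0x4000) + (int32_t)c);
}
```

With this model, a coefficient of 16384 acts as a multiplication by 0.5, for example mradds_model(16384, 16384, 0) evaluates to 8192, and sums that exceed the halfword range are pinned at 32767 rather than wrapping.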
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
248
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
249
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
250 /*
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
251 ------------------------------------------------------------------------------
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
252 CS converters
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
253 ------------------------------------------------------------------------------
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
254 */
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
255
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
256
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
257 #define DEFCSP420_CVT(name,out_pixels) \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
258 static int altivec_##name (SwsContext *c, \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
259 unsigned char **in, int *instrides, \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
260 int srcSliceY, int srcSliceH, \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
261 unsigned char **oplanes, int *outstrides) \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
262 { \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
263 int w = c->srcW; \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
264 int h = srcSliceH; \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
265 int i,j; \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
266 int instrides_scl[3]; \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
267 vector unsigned char y0,y1; \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
268 \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
269 vector signed char u,v; \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
270 \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
271 vector signed short Y0,Y1,Y2,Y3; \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
272 vector signed short U,V; \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
273 vector signed short vx,ux,uvx; \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
274 vector signed short vx0,ux0,uvx0; \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
275 vector signed short vx1,ux1,uvx1; \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
276 vector signed short R0,G0,B0; \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
277 vector signed short R1,G1,B1; \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
278 vector unsigned char R,G,B; \
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
279 \
d2aef091743c altivec yuv->rgb converter
michael
parents:
  vector unsigned char *uivP, *vivP; \
  vector unsigned char align_perm; \
  \
  vector signed short \
    lCY = c->CY, \
    lOY = c->OY, \
    lCRV = c->CRV, \
    lCBU = c->CBU, \
    lCGU = c->CGU, \
    lCGV = c->CGV; \
  \
  vector unsigned short lCSHIFT = c->CSHIFT; \
  \
  ubyte *y1i = in[0]; \
  ubyte *y2i = in[0]+w; \
  ubyte *ui = in[1]; \
  ubyte *vi = in[2]; \
  \
  vector unsigned char *oute \
    = (vector unsigned char *) \
      (oplanes[0]+srcSliceY*outstrides[0]); \
  vector unsigned char *outo \
    = (vector unsigned char *) \
      (oplanes[0]+srcSliceY*outstrides[0]+outstrides[0]); \
  \
  \
  instrides_scl[0] = instrides[0]; \
  instrides_scl[1] = instrides[1]-w/2; /* the loop moves ui by w/2 */ \
  instrides_scl[2] = instrides[2]-w/2; /* the loop moves vi by w/2 */ \
  \
  \
  for (i=0;i<h/2;i++) { \
    vec_dstst (outo, (0x02000002|(((w*3+32)/32)<<16)), 0); \
    vec_dstst (oute, (0x02000002|(((w*3+32)/32)<<16)), 1); \
    \
    for (j=0;j<w/16;j++) { \
      \
      y0 = vec_ldl (0,y1i); \
      y1 = vec_ldl (0,y2i); \
      uivP = (vector unsigned char *)ui; \
      vivP = (vector unsigned char *)vi; \
      \
      align_perm = vec_lvsl (0, ui); \
      u = (vector signed char)vec_perm (uivP[0], uivP[1], align_perm); \
      \
      align_perm = vec_lvsl (0, vi); \
      v = (vector signed char)vec_perm (vivP[0], vivP[1], align_perm); \
      \
      u = (vector signed char) \
        vec_sub (u,(vector signed char) \
                 vec_splat((vector signed char)AVV(128),0)); \
      v = (vector signed char) \
        vec_sub (v,(vector signed char) \
                 vec_splat((vector signed char)AVV(128),0)); \
      \
      U = vec_unpackh (u); \
      V = vec_unpackh (v); \
      \
      \
      Y0 = vec_unh (y0); \
      Y1 = vec_unl (y0); \
      Y2 = vec_unh (y1); \
      Y3 = vec_unl (y1); \
      \
      Y0 = vec_mradds (Y0, lCY, lOY); \
      Y1 = vec_mradds (Y1, lCY, lOY); \
      Y2 = vec_mradds (Y2, lCY, lOY); \
      Y3 = vec_mradds (Y3, lCY, lOY); \
      \
      /* ux = (CBU*(u<<CSHIFT)+0x4000)>>15 */ \
      ux = vec_sl (U, lCSHIFT); \
      ux = vec_mradds (ux, lCBU, (vector signed short)AVV(0)); \
      ux0 = vec_mergeh (ux,ux); \
      ux1 = vec_mergel (ux,ux); \
      \
      /* vx = (CRV*(v<<CSHIFT)+0x4000)>>15; */ \
      vx = vec_sl (V, lCSHIFT); \
      vx = vec_mradds (vx, lCRV, (vector signed short)AVV(0)); \
      vx0 = vec_mergeh (vx,vx); \
      vx1 = vec_mergel (vx,vx); \
      \
      /* uvx = ((CGU*u) + (CGV*v))>>15 */ \
      uvx = vec_mradds (U, lCGU, (vector signed short)AVV(0)); \
      uvx = vec_mradds (V, lCGV, uvx); \
      uvx0 = vec_mergeh (uvx,uvx); \
      uvx1 = vec_mergel (uvx,uvx); \
      \
      R0 = vec_add (Y0,vx0); \
      G0 = vec_add (Y0,uvx0); \
      B0 = vec_add (Y0,ux0); \
      R1 = vec_add (Y1,vx1); \
      G1 = vec_add (Y1,uvx1); \
      B1 = vec_add (Y1,ux1); \
      \
      R = vec_packclp (R0,R1); \
      G = vec_packclp (G0,G1); \
      B = vec_packclp (B0,B1); \
      \
      out_pixels(R,G,B,oute); \
      \
      R0 = vec_add (Y2,vx0); \
      G0 = vec_add (Y2,uvx0); \
      B0 = vec_add (Y2,ux0); \
      R1 = vec_add (Y3,vx1); \
      G1 = vec_add (Y3,uvx1); \
      B1 = vec_add (Y3,ux1); \
      R = vec_packclp (R0,R1); \
      G = vec_packclp (G0,G1); \
      B = vec_packclp (B0,B1); \
      \
      \
      out_pixels(R,G,B,outo); \
      \
      y1i += 16; \
      y2i += 16; \
      ui += 8; \
      vi += 8; \
      \
    } \
    \
    outo += (outstrides[0])>>4; \
    oute += (outstrides[0])>>4; \
    \
    ui += instrides_scl[1]; \
    vi += instrides_scl[2]; \
    y1i += instrides_scl[0]; \
    y2i += instrides_scl[0]; \
  } \
  return srcSliceH; \
}
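The per-lane arithmetic of the macro body can be followed more easily in scalar form. The sketch below is an illustration only, not part of the original file: it models one lane of vec_mradds as saturate(((a*b + 0x4000) >> 15) + c) and applies the same sequence of steps to a single pixel. The struct csp_coeffs, the helper names and any coefficient values passed to it are hypothetical; the real coefficients are vectors read from the SwsContext and are set up elsewhere in the file. It also assumes that vec_unh/vec_unl zero-extend the luma bytes and that vec_packclp clamps to 0..255.

```c
#include <assert.h>
#include <stdint.h>

/* Saturate a 32-bit value to the signed 16-bit range. */
static int16_t sat16(int32_t x)
{
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int16_t)x;
}

/* Floor division by 2^15, written so that it does not depend on how
   the compiler shifts negative values. */
static int32_t floor_shift15(int32_t x)
{
    return x >= 0 ? x >> 15 : -((-x + 32767) >> 15);
}

/* Scalar model of one vec_mradds lane. */
static int16_t mradds16(int16_t a, int16_t b, int16_t c)
{
    int32_t prod = (int32_t)a * (int32_t)b + 0x4000;
    return sat16(floor_shift15(prod) + (int32_t)c);
}

/* Hypothetical scalar stand-in for the vector coefficients in the context. */
struct csp_coeffs {
    int16_t CY, OY, CRV, CBU, CGU, CGV;
    unsigned CSHIFT;
};

/* Clamp to 0..255, as the final pack step is assumed to do. */
static uint8_t clip_u8(int x)
{
    return (uint8_t)(x < 0 ? 0 : x > 255 ? 255 : x);
}

/* One pixel through the same steps as the macro body. */
static void yuv_pixel_to_rgb(const struct csp_coeffs *k,
                             uint8_t y, uint8_t u, uint8_t v, uint8_t rgb[3])
{
    int16_t U, V, Y, ux, vx, uvx;

    assert(k->CSHIFT < 8);
    U = (int16_t)(u - 128);                 /* centre chroma on zero */
    V = (int16_t)(v - 128);
    Y = mradds16((int16_t)y, k->CY, k->OY); /* scale and offset luma */
    ux = mradds16((int16_t)(U * (1 << k->CSHIFT)), k->CBU, 0);
    vx = mradds16((int16_t)(V * (1 << k->CSHIFT)), k->CRV, 0);
    uvx = mradds16(V, k->CGV, mradds16(U, k->CGU, 0));

    rgb[0] = clip_u8(Y + vx);
    rgb[1] = clip_u8(Y + uvx);
    rgb[2] = clip_u8(Y + ux);
}
```

With neutral chroma (u = v = 128) the three chroma terms are zero, so the output is gray whatever the chroma coefficients are; this is a convenient sanity check for any coefficient set.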


#define out_abgr(a,b,c,ptr) vec_mstrgb32(typeof(a),((typeof (a))AVV(0)),c,b,a,ptr)
#define out_bgra(a,b,c,ptr) vec_mstrgb32(typeof(a),c,b,a,((typeof (a))AVV(0)),ptr)
#define out_rgba(a,b,c,ptr) vec_mstrgb32(typeof(a),a,b,c,((typeof (a))AVV(0)),ptr)
#define out_argb(a,b,c,ptr) vec_mstrgb32(typeof(a),((typeof (a))AVV(0)),a,b,c,ptr)
#define out_rgb24(a,b,c,ptr) vec_mstrgb24(a,b,c,ptr)
#define out_bgr24(a,b,c,ptr) vec_mstbgr24(c,b,a,ptr)

DEFCSP420_CVT (yuv2_abgr32, out_abgr)
#if 1
DEFCSP420_CVT (yuv2_bgra32, out_argb)
#else
static int altivec_yuv2_bgra32 (SwsContext *c,
                                unsigned char **in, int *instrides,
                                int srcSliceY, int srcSliceH,
                                unsigned char **oplanes, int *outstrides)
{
  int w = c->srcW;
  int h = srcSliceH;
  int i,j;
  int instrides_scl[3];
  vector unsigned char y0,y1;

  vector signed char u,v;

  vector signed short Y0,Y1,Y2,Y3;
  vector signed short U,V;
  vector signed short vx,ux,uvx;
  vector signed short vx0,ux0,uvx0;
  vector signed short vx1,ux1,uvx1;
  vector signed short R0,G0,B0;
  vector signed short R1,G1,B1;
  vector unsigned char R,G,B;

  vector unsigned char *uivP, *vivP;
  vector unsigned char align_perm;

  vector signed short
    lCY = c->CY,
    lOY = c->OY,
    lCRV = c->CRV,
    lCBU = c->CBU,
    lCGU = c->CGU,
    lCGV = c->CGV;

  vector unsigned short lCSHIFT = c->CSHIFT;

  ubyte *y1i = in[0];
  ubyte *y2i = in[0]+w;
  ubyte *ui = in[1];
  ubyte *vi = in[2];

  vector unsigned char *oute
    = (vector unsigned char *)
      (oplanes[0]+srcSliceY*outstrides[0]);
  vector unsigned char *outo
    = (vector unsigned char *)
      (oplanes[0]+srcSliceY*outstrides[0]+outstrides[0]);


  instrides_scl[0] = instrides[0];
  instrides_scl[1] = instrides[1]-w/2; /* the loop moves ui by w/2 */
  instrides_scl[2] = instrides[2]-w/2; /* the loop moves vi by w/2 */


  for (i=0;i<h/2;i++) {
    vec_dstst (outo, (0x02000002|(((w*3+32)/32)<<16)), 0);
    vec_dstst (oute, (0x02000002|(((w*3+32)/32)<<16)), 1);

    for (j=0;j<w/16;j++) {

      y0 = vec_ldl (0,y1i);
      y1 = vec_ldl (0,y2i);
      uivP = (vector unsigned char *)ui;
      vivP = (vector unsigned char *)vi;

      align_perm = vec_lvsl (0, ui);
      u = (vector signed char)vec_perm (uivP[0], uivP[1], align_perm);

      align_perm = vec_lvsl (0, vi);
      v = (vector signed char)vec_perm (vivP[0], vivP[1], align_perm);
      u = (vector signed char)
        vec_sub (u,(vector signed char)
                 vec_splat((vector signed char)AVV(128),0));

      v = (vector signed char)
        vec_sub (v, (vector signed char)
                 vec_splat((vector signed char)AVV(128),0));

      U = vec_unpackh (u);
      V = vec_unpackh (v);


      Y0 = vec_unh (y0);
      Y1 = vec_unl (y0);
      Y2 = vec_unh (y1);
      Y3 = vec_unl (y1);

      Y0 = vec_mradds (Y0, lCY, lOY);
      Y1 = vec_mradds (Y1, lCY, lOY);
      Y2 = vec_mradds (Y2, lCY, lOY);
      Y3 = vec_mradds (Y3, lCY, lOY);

      /* ux = (CBU*(u<<CSHIFT)+0x4000)>>15 */
      ux = vec_sl (U, lCSHIFT);
      ux = vec_mradds (ux, lCBU, (vector signed short)AVV(0));
      ux0 = vec_mergeh (ux,ux);
      ux1 = vec_mergel (ux,ux);

      /* vx = (CRV*(v<<CSHIFT)+0x4000)>>15; */
      vx = vec_sl (V, lCSHIFT);
      vx = vec_mradds (vx, lCRV, (vector signed short)AVV(0));
      vx0 = vec_mergeh (vx,vx);
      vx1 = vec_mergel (vx,vx);
      /* uvx = ((CGU*u) + (CGV*v))>>15 */
      uvx = vec_mradds (U, lCGU, (vector signed short)AVV(0));
      uvx = vec_mradds (V, lCGV, uvx);
      uvx0 = vec_mergeh (uvx,uvx);
      uvx1 = vec_mergel (uvx,uvx);
      R0 = vec_add (Y0,vx0);
      G0 = vec_add (Y0,uvx0);
      B0 = vec_add (Y0,ux0);
      R1 = vec_add (Y1,vx1);
      G1 = vec_add (Y1,uvx1);
      B1 = vec_add (Y1,ux1);
      R = vec_packclp (R0,R1);
      G = vec_packclp (G0,G1);
      B = vec_packclp (B0,B1);

      out_argb(R,G,B,oute);
      R0 = vec_add (Y2,vx0);
      G0 = vec_add (Y2,uvx0);
      B0 = vec_add (Y2,ux0);
      R1 = vec_add (Y3,vx1);
      G1 = vec_add (Y3,uvx1);
      B1 = vec_add (Y3,ux1);
      R = vec_packclp (R0,R1);
      G = vec_packclp (G0,G1);
      B = vec_packclp (B0,B1);

      out_argb(R,G,B,outo);
      y1i += 16;
      y2i += 16;
      ui += 8;
      vi += 8;

    }

    outo += (outstrides[0])>>4;
    oute += (outstrides[0])>>4;

    ui += instrides_scl[1];
    vi += instrides_scl[2];
    y1i += instrides_scl[0];
    y2i += instrides_scl[0];
  }
  return srcSliceH;
}

#endif
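The chroma loads in the inner loop use the usual AltiVec idiom for reading 16 bytes from an address that may not be 16-byte aligned: two aligned loads (uivP[0] and uivP[1]) are combined with vec_perm, using the shift pattern produced by vec_lvsl. The following scalar model is a sketch for illustration, assuming that a vector load ignores the low four address bits; the model_* names are hypothetical and the array mem stands in for memory whose index 0 is 16-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of vec_lvsl: selectors sh, sh+1, ..., sh+15 with sh = addr & 15. */
static void model_lvsl(size_t addr, uint8_t perm[16])
{
    int i;
    for (i = 0; i < 16; i++)
        perm[i] = (uint8_t)((addr & 15) + (size_t)i);
}

/* Model of vec_perm: the low five bits of each selector pick one byte
   out of the 32-byte concatenation of a and b. */
static void model_perm(const uint8_t a[16], const uint8_t b[16],
                       const uint8_t sel[16], uint8_t out[16])
{
    int i;
    for (i = 0; i < 16; i++) {
        unsigned s = sel[i] & 0x1f;
        out[i] = s < 16 ? a[s] : b[s - 16];
    }
}

/* Read the 16 bytes that start at mem[addr], using only "aligned" block
   accesses. Like the original idiom, this always touches the following
   16-byte block as well, so the caller must make sure it is readable. */
static void model_unaligned_load(const uint8_t *mem, size_t addr,
                                 uint8_t out[16])
{
    size_t base = addr & ~(size_t)15;   /* aligned block containing addr */
    uint8_t perm[16];

    assert(mem != NULL);
    model_lvsl(addr, perm);
    model_perm(mem + base, mem + base + 16, perm, out);
}
```

For an aligned address the selectors are simply 0..15, so the result is the first block unchanged and the second load contributes nothing.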


DEFCSP420_CVT (yuv2_rgba32, out_rgba)
DEFCSP420_CVT (yuv2_argb32, out_argb)
DEFCSP420_CVT (yuv2_rgb24, out_rgb24)
DEFCSP420_CVT (yuv2_bgr24, out_bgr24)
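DEFCSP420_CVT is used here as a function template: one macro holds the whole conversion loop, and the second argument names a store macro that decides the byte order of the output. The same pattern can be shown in portable C. This is a minimal sketch with hypothetical names (put3, put4, put_rgba, put_bgr24, DEF_PLANAR_CVT); it copies planar R, G, B bytes rather than converting YUV, so only the templating mechanism is illustrated.

```c
#include <assert.h>
#include <stdint.h>

/* Store helpers: write the given bytes in order and advance the pointer. */
#define put4(c0,c1,c2,c3,ptr) \
    do { *(ptr)++ = (c0); *(ptr)++ = (c1); *(ptr)++ = (c2); *(ptr)++ = (c3); } while (0)
#define put3(c0,c1,c2,ptr) \
    do { *(ptr)++ = (c0); *(ptr)++ = (c1); *(ptr)++ = (c2); } while (0)

/* Layout macros: only the argument order differs between them. */
#define put_rgba(r,g,b,ptr)  put4(r,g,b,0,ptr)
#define put_bgr24(r,g,b,ptr) put3(b,g,r,ptr)

/* Generate one function per output layout. */
#define DEF_PLANAR_CVT(name, out_pixels) \
static void planar_to_##name(const uint8_t *r, const uint8_t *g, \
                             const uint8_t *b, int n, uint8_t *dst) \
{ \
    int i; \
    assert(n >= 0); \
    for (i = 0; i < n; i++) \
        out_pixels(r[i], g[i], b[i], dst); \
}

DEF_PLANAR_CVT(rgba, put_rgba)
DEF_PLANAR_CVT(bgr24, put_bgr24)
```

Because the layout is resolved at preprocessing time, each generated function contains a fixed store sequence and no per-pixel branch on the output format.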
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
577
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
578
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
579 // uyvy|uyvy|uyvy|uyvy
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
580 // 0123 4567 89ab cdef
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
581 static
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
582 const vector unsigned char
13564
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
583 demux_u = (const vector unsigned char)AVV(0x10,0x00,0x10,0x00,
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
584 0x10,0x04,0x10,0x04,
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
585 0x10,0x08,0x10,0x08,
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
586 0x10,0x0c,0x10,0x0c),
13564
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
587 demux_v = (const vector unsigned char)AVV(0x10,0x02,0x10,0x02,
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
588 0x10,0x06,0x10,0x06,
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
589 0x10,0x0A,0x10,0x0A,
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
590 0x10,0x0E,0x10,0x0E),
13564
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
591 demux_y = (const vector unsigned char)AVV(0x10,0x01,0x10,0x03,
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
592 0x10,0x05,0x10,0x07,
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
593 0x10,0x09,0x10,0x0B,
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
594 0x10,0x0D,0x10,0x0F);
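The three permute tables above can be read as byte selectors: index 0x10 picks the first byte of the second vec_perm operand (an all-zero vector in the code that follows), while indices 0x00 to 0x0F pick bytes of the packed UYVY input. Below is a minimal scalar sketch of that behaviour, assuming big-endian 16-bit lanes as on AltiVec. The names perm16, to_lanes and demux_uyvy are hypothetical helpers for illustration and are not part of this file.

```c
#include <stdint.h>

/* Scalar model of vec_perm: each selector byte picks one of the 32 bytes
   formed by a (indices 0x00..0x0F) followed by b (indices 0x10..0x1F). */
static void perm16(const uint8_t a[16], const uint8_t b[16],
                   const uint8_t sel[16], uint8_t out[16])
{
    for (int i = 0; i < 16; i++) {
        uint8_t s = sel[i] & 0x1F;
        out[i] = (s < 16) ? a[s] : b[s - 16];
    }
}

/* Same selector values as demux_u, demux_v and demux_y in the source. */
static const uint8_t sel_u[16] = {0x10,0x00,0x10,0x00, 0x10,0x04,0x10,0x04,
                                  0x10,0x08,0x10,0x08, 0x10,0x0c,0x10,0x0c};
static const uint8_t sel_v[16] = {0x10,0x02,0x10,0x02, 0x10,0x06,0x10,0x06,
                                  0x10,0x0A,0x10,0x0A, 0x10,0x0E,0x10,0x0E};
static const uint8_t sel_y[16] = {0x10,0x01,0x10,0x03, 0x10,0x05,0x10,0x07,
                                  0x10,0x09,0x10,0x0B, 0x10,0x0D,0x10,0x0F};

/* Join byte pairs (high, low) into 16-bit lanes, big-endian as on AltiVec. */
static void to_lanes(const uint8_t bytes[16], int16_t lanes[8])
{
    for (int i = 0; i < 8; i++)
        lanes[i] = (int16_t)((bytes[2 * i] << 8) | bytes[2 * i + 1]);
}

/* Hypothetical helper: split 8 packed UYVY pixels (16 bytes) into
   zero-extended U, Y and V lanes. Each chroma sample appears twice so
   that it lines up with the two luma samples it covers. */
static void demux_uyvy(const uint8_t uyvy[16],
                       int16_t U[8], int16_t Y[8], int16_t V[8])
{
    static const uint8_t zero[16] = {0};
    uint8_t tmp[16];

    perm16(uyvy, zero, sel_u, tmp); to_lanes(tmp, U);
    perm16(uyvy, zero, sel_v, tmp); to_lanes(tmp, V);
    perm16(uyvy, zero, sel_y, tmp); to_lanes(tmp, Y);
}
```

Feeding one 16-byte UYVY group through demux_uyvy yields eight Y lanes and eight chroma lanes in which every U and V value is repeated once, which is the layout cvtyuvtoRGB is called with in the function that follows.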
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
595
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
596 /*
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
597 this is so I can play live CCIR raw video
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
598 */
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
599 static int altivec_uyvy_rgb32 (SwsContext *c,
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
600 unsigned char **in, int *instrides,
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
601 int srcSliceY, int srcSliceH,
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
602 unsigned char **oplanes, int *outstrides)
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
603 {
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
604 int w = c->srcW;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
605 int h = srcSliceH;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
606 int i,j;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
607 vector unsigned char uyvy;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
608 vector signed short Y,U,V;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
609 vector signed short vx,ux,uvx;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
610 vector signed short R0,G0,B0,R1,G1,B1;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
611 vector unsigned char R,G,B;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
612 vector unsigned char *out;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
613 ubyte *img;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
614
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
615 img = in[0];
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
616 out = (vector unsigned char *)(oplanes[0]+srcSliceY*outstrides[0]);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
617
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
618 for (i=0;i<h;i++) {
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
619 for (j=0;j<w/16;j++) {
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
620 uyvy = vec_ld (0, img);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
621 U = (vector signed short)
13564
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
622 vec_perm (uyvy, (vector unsigned char)AVV(0), demux_u);
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
623
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
624 V = (vector signed short)
13564
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
625 vec_perm (uyvy, (vector unsigned char)AVV(0), demux_v);
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
626
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
627 Y = (vector signed short)
13564
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
628 vec_perm (uyvy, (vector unsigned char)AVV(0), demux_y);
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
629
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
630 cvtyuvtoRGB (c, Y,U,V,&R0,&G0,&B0);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
631
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
632 uyvy = vec_ld (16, img);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
633 U = (vector signed short)
13564
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
634 vec_perm (uyvy, (vector unsigned char)AVV(0), demux_u);
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
635
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
636 V = (vector signed short)
13564
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
637 vec_perm (uyvy, (vector unsigned char)AVV(0), demux_v);
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
638
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
639 Y = (vector signed short)
13564
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
640 vec_perm (uyvy, (vector unsigned char)AVV(0), demux_y);
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
641
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
642 cvtyuvtoRGB (c, Y,U,V,&R1,&G1,&B1);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
643
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
644 R = vec_packclp (R0,R1);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
645 G = vec_packclp (G0,G1);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
646 B = vec_packclp (B0,B1);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
647
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
648 // vec_mstbgr24 (R,G,B, out);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
649 out_rgba (R,G,B,out);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
650
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
651 img += 32;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
652 }
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
653 }
12836
9a310b31359f some fixes
alex
parents: 12698
diff changeset
654 return srcSliceH;
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
655 }
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
656
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
657
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
658
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
659 /* Ok currently the acceleration routine only supports
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
660 inputs of widths a multiple of 16
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
661 and heights a multiple of 2
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
662
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
663 So for anything else we just fall back to the C code.
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
664 */
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
665 SwsFunc yuv2rgb_init_altivec (SwsContext *c)
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
666 {
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
667 if (!(c->flags & SWS_CPU_CAPS_ALTIVEC))
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
668 return NULL;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
669
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
670 /*
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
671 and this seems not to matter too much; I tried a bunch of
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
672 videos with abnormal widths and mplayer crashes elsewhere.
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
673 mplayer -vo x11 -rawvideo on:w=350:h=240 raw-350x240.eyuv
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
674 boom with X11 bad match.
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
675
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
676 */
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
677 if ((c->srcW & 0xf) != 0) return NULL;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
678
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
679 switch (c->srcFormat) {
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
680 case IMGFMT_YVU9:
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
681 case IMGFMT_IF09:
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
682 case IMGFMT_YV12:
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
683 case IMGFMT_I420:
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
684 case IMGFMT_IYUV:
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
685 case IMGFMT_CLPL:
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
686 case IMGFMT_Y800:
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
687 case IMGFMT_Y8:
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
688 case IMGFMT_NV12:
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
689 case IMGFMT_NV21:
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
690 if ((c->srcH & 0x1) != 0)
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
691 return NULL;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
692
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
693 switch(c->dstFormat){
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
694 case IMGFMT_RGB24:
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
695 MSG_WARN("ALTIVEC: Color Space RGB24\n");
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
696 return altivec_yuv2_rgb24;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
697 case IMGFMT_BGR24:
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
698 MSG_WARN("ALTIVEC: Color Space BGR24\n");
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
699 return altivec_yuv2_bgr24;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
700 case IMGFMT_RGB32:
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
701 MSG_WARN("ALTIVEC: Color Space ARGB32\n");
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
702 return altivec_yuv2_argb32;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
703 case IMGFMT_BGR32:
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
704 MSG_WARN("ALTIVEC: Color Space BGRA32\n");
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
705 // return profile_altivec_bgra32;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
706
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
707 return altivec_yuv2_bgra32;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
708 default: return NULL;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
709 }
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
710 break;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
711
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
712 case IMGFMT_UYVY:
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
713 switch(c->dstFormat){
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
714 case IMGFMT_RGB32:
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
715 MSG_WARN("ALTIVEC: Color Space UYVY -> RGB32\n");
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
716 return altivec_uyvy_rgb32;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
717 default: return NULL;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
718 }
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
719 break;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
720
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
721 }
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
722 return NULL;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
723 }
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
724
13564
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
725 static uint16_t roundToInt16(int64_t f){
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
726 int r= (f + (1<<15))>>16;
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
727 if(r<-0x7FFF) return 0x8000;
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
728 else if(r> 0x7FFF) return 0x7FFF;
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
729 else return r;
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
730 }
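roundToInt16 above treats its argument as a 16.16 fixed-point value: it adds half a unit (1<<15), shifts out the 16 fractional bits, and saturates to the signed 16-bit range. A minimal standalone sketch of the same idea follows. The name round_q16_to_int16 is hypothetical, and unlike the original it keeps the intermediate in 64 bits and returns a signed type, so it is an illustration and not a drop-in replacement.

```c
#include <stdint.h>

/* Round a 16.16 fixed-point value to the nearest integer and saturate
   to the int16 range. Relies on >> being an arithmetic shift for
   negative values, which is implementation-defined in C but holds on
   the usual compilers. */
static int16_t round_q16_to_int16(int64_t f)
{
    int64_t r = (f + (1 << 15)) >> 16;   /* add 0.5, drop fraction */

    if (r < -0x7FFF) return -0x8000;     /* same bit pattern as 0x8000 */
    if (r >  0x7FFF) return  0x7FFF;
    return (int16_t)r;
}
```

For example, 1.5 in 16.16 form (3<<15) rounds to 2, and very large magnitudes are clamped instead of wrapping.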
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
731
13564
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
732 void yuv2rgb_altivec_init_tables (SwsContext *c, const int inv_table[4],int brightness,int contrast, int saturation)
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
733 {
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
734 union {
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
735 signed short tmp[8] __attribute__ ((aligned(16)));
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
736 vector signed short vec;
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
737 } buf;
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
738
13564
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
739 buf.tmp[0] = ( (0xffffLL) * contrast>>8 )>>9; //cy
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
740 buf.tmp[1] = -256*brightness; //oy
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
741 buf.tmp[2] = (inv_table[0]>>3) *(contrast>>16)*(saturation>>16); //crv
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
742 buf.tmp[3] = (inv_table[1]>>3) *(contrast>>16)*(saturation>>16); //cbu
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
743 buf.tmp[4] = -((inv_table[2]>>1)*(contrast>>16)*(saturation>>16)); //cgu
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
744 buf.tmp[5] = -((inv_table[3]>>1)*(contrast>>16)*(saturation>>16)); //cgv
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
745
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
746
13564
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
747 c->CSHIFT = (vector unsigned short)vec_splat((vector unsigned short)AVV(2),0);
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
748 c->CY = vec_splat ((vector signed short)buf.vec, 0);
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
749 c->OY = vec_splat ((vector signed short)buf.vec, 1);
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
750 c->CRV = vec_splat ((vector signed short)buf.vec, 2);
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
751 c->CBU = vec_splat ((vector signed short)buf.vec, 3);
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
752 c->CGU = vec_splat ((vector signed short)buf.vec, 4);
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
753 c->CGV = vec_splat ((vector signed short)buf.vec, 5);
12836
9a310b31359f some fixes
alex
parents: 12698
diff changeset
754 #if 0
13564
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
755 {
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
756 int i;
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
757 char *v[6]={"cy","oy","crv","cbu","cgu","cgv"};
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
758 for (i=0; i<6;i++)
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
759 printf("%s %d ", v[i],buf.tmp[i] );
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
760 printf("\n");
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
761 }
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
762 #endif
12836
9a310b31359f some fixes
alex
parents: 12698
diff changeset
763 return;
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
764 }
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
765
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
766
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
767 void
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
768 altivec_yuv2packedX (SwsContext *c,
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
769 int16_t *lumFilter, int16_t **lumSrc, int lumFilterSize,
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
770 int16_t *chrFilter, int16_t **chrSrc, int chrFilterSize,
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
771 uint8_t *dest, int dstW, int dstY)
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
772 {
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
773 int i,j;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
774 short tmp __attribute__((aligned (16)));
13564
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
775 int16_t *p;
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
776 short *f;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
777 vector signed short X,X0,X1,Y0,U0,V0,Y1,U1,V1,U,V;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
778 vector signed short R0,G0,B0,R1,G1,B1;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
779
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
780 vector unsigned char R,G,B,pels[3];
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
781 vector unsigned char *out,*nout;
13564
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
782
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
783 vector signed short RND = vec_splat((vector signed short)AVV(1<<3),0);
992960f68af0 postproc/yuv2rgb_altivec.c compile fix
michael
parents: 12837
diff changeset
784 vector unsigned short SCL = vec_splat((vector unsigned short)AVV(4),0);
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
785 unsigned long scratch[16] __attribute__ ((aligned (16)));
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
786
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
787 vector signed short *vYCoeffsBank, *vCCoeffsBank;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
788
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
789 vector signed short *YCoeffs, *CCoeffs;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
790
17557
3f863d1d8b43 vYCoeffsBank and vCCoeffsBank are allocated and initialized using incorrect
diego
parents: 16985
diff changeset
791 vYCoeffsBank = malloc (sizeof (vector signed short)*lumFilterSize*c->dstH);
3f863d1d8b43 vYCoeffsBank and vCCoeffsBank are allocated and initialized using incorrect
diego
parents: 16985
diff changeset
792 vCCoeffsBank = malloc (sizeof (vector signed short)*chrFilterSize*c->dstH);
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
793
17557
3f863d1d8b43 vYCoeffsBank and vCCoeffsBank are allocated and initialized using incorrect
diego
parents: 16985
diff changeset
794 for (i=0;i<lumFilterSize*c->dstH;i++) {
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
795 tmp = c->vLumFilter[i];
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
796 p = &vYCoeffsBank[i];
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
797 for (j=0;j<8;j++)
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
798 p[j] = tmp;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
799 }
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
800
17557
3f863d1d8b43 vYCoeffsBank and vCCoeffsBank are allocated and initialized using incorrect
diego
parents: 16985
diff changeset
801 for (i=0;i<chrFilterSize*c->dstH;i++) {
12698
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
802 tmp = c->vChrFilter[i];
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
803 p = &vCCoeffsBank[i];
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
804 for (j=0;j<8;j++)
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
805 p[j] = tmp;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
806 }
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
807
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
808 YCoeffs = vYCoeffsBank+dstY*lumFilterSize;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
809 CCoeffs = vCCoeffsBank+dstY*chrFilterSize;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
810
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
811 out = (vector unsigned char *)dest;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
812
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
813 for(i=0; i<dstW; i+=16){
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
814 Y0 = RND;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
815 Y1 = RND;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
816 /* extract 16 coeffs from lumSrc */
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
817 for(j=0; j<lumFilterSize; j++) {
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
818 X0 = vec_ld (0, &lumSrc[j][i]);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
819 X1 = vec_ld (16, &lumSrc[j][i]);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
820 Y0 = vec_mradds (X0, YCoeffs[j], Y0);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
821 Y1 = vec_mradds (X1, YCoeffs[j], Y1);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
822 }
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
823
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
824 U = RND;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
825 V = RND;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
826 /* extract 8 coeffs from U,V */
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
827 for(j=0; j<chrFilterSize; j++) {
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
828 X = vec_ld (0, &chrSrc[j][i/2]);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
829 U = vec_mradds (X, CCoeffs[j], U);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
830 X = vec_ld (0, &chrSrc[j][i/2+2048]);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
831 V = vec_mradds (X, CCoeffs[j], V);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
832 }
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
833
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
834 /* scale and clip signals */
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
835 Y0 = vec_sra (Y0, SCL);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
836 Y1 = vec_sra (Y1, SCL);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
837 U = vec_sra (U, SCL);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
838 V = vec_sra (V, SCL);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
839
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
840 Y0 = vec_clip (Y0);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
841 Y1 = vec_clip (Y1);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
842 U = vec_clip (U);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
843 V = vec_clip (V);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
844
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
845 /* now we have
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
846 Y0= y0 y1 y2 y3 y4 y5 y6 y7 Y1= y8 y9 y10 y11 y12 y13 y14 y15
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
847 U= u0 u1 u2 u3 u4 u5 u6 u7 V= v0 v1 v2 v3 v4 v5 v6 v7
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
848
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
849 Y0= y0 y1 y2 y3 y4 y5 y6 y7 Y1= y8 y9 y10 y11 y12 y13 y14 y15
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
850 U0= u0 u0 u1 u1 u2 u2 u3 u3 U1= u4 u4 u5 u5 u6 u6 u7 u7
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
851 V0= v0 v0 v1 v1 v2 v2 v3 v3 V1= v4 v4 v5 v5 v6 v6 v7 v7
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
852 */
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
853
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
854 U0 = vec_mergeh (U,U);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
855 V0 = vec_mergeh (V,V);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
856
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
857 U1 = vec_mergel (U,U);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
858 V1 = vec_mergel (V,V);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
859
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
860 cvtyuvtoRGB (c, Y0,U0,V0,&R0,&G0,&B0);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
861 cvtyuvtoRGB (c, Y1,U1,V1,&R1,&G1,&B1);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
862
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
863 R = vec_packclp (R0,R1);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
864 G = vec_packclp (G0,G1);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
865 B = vec_packclp (B0,B1);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
866
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
867 out_rgba (R,G,B,out);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
868 }
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
869
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
870 if (i < dstW) {
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
871 i -= 16;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
872
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
873 Y0 = RND;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
874 Y1 = RND;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
875 /* extract 16 coeffs from lumSrc */
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
876 for(j=0; j<lumFilterSize; j++) {
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
877 X0 = vec_ld (0, &lumSrc[j][i]);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
878 X1 = vec_ld (16, &lumSrc[j][i]);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
879 Y0 = vec_mradds (X0, YCoeffs[j], Y0);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
880 Y1 = vec_mradds (X1, YCoeffs[j], Y1);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
881 }
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
882
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
883 U = RND;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
884 V = RND;
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
885 /* extract 8 coeffs from U,V */
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
886 for(j=0; j<chrFilterSize; j++) {
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
887 X = vec_ld (0, &chrSrc[j][i/2]);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
888 U = vec_mradds (X, CCoeffs[j], U);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
889 X = vec_ld (0, &chrSrc[j][i/2+2048]);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
890 V = vec_mradds (X, CCoeffs[j], V);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
891 }
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
892
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
893 /* scale and clip signals */
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
894 Y0 = vec_sra (Y0, SCL);
d2aef091743c altivec yuv->rgb converter
michael
parents:
diff changeset
895 Y1 = vec_sra (Y1, SCL);
    U = vec_sra (U, SCL);
    V = vec_sra (V, SCL);

    Y0 = vec_clip (Y0);
    Y1 = vec_clip (Y1);
    U = vec_clip (U);
    V = vec_clip (V);

    /* now we have
      Y0= y0 y1 y2 y3 y4 y5 y6 y7     Y1= y8 y9 y10 y11 y12 y13 y14 y15
      U=  u0 u1 u2 u3 u4 u5 u6 u7     V=  v0 v1 v2 v3 v4 v5 v6 v7

      Y0= y0 y1 y2 y3 y4 y5 y6 y7     Y1= y8 y9 y10 y11 y12 y13 y14 y15
      U0= u0 u0 u1 u1 u2 u2 u3 u3     U1= u4 u4 u5 u5 u6 u6 u7 u7
      V0= v0 v0 v1 v1 v2 v2 v3 v3     V1= v4 v4 v5 v5 v6 v6 v7 v7
    */

    U0 = vec_mergeh (U,U);
    V0 = vec_mergeh (V,V);

    U1 = vec_mergel (U,U);
    V1 = vec_mergel (V,V);

    cvtyuvtoRGB (c, Y0,U0,V0,&R0,&G0,&B0);
    cvtyuvtoRGB (c, Y1,U1,V1,&R1,&G1,&B1);

    R = vec_packclp (R0,R1);
    G = vec_packclp (G0,G1);
    B = vec_packclp (B0,B1);

    nout = (vector unsigned char *)scratch;
    out_rgba (R,G,B,nout);

    /* copy the remaining (dstW-i) pixels; each 32-bit pixel is 4 bytes,
       so the byte count is (dstW-i)*4, not (dstW-i)/4 */
    memcpy (&((uint32_t*)dest)[i], scratch, (dstW-i)*4);
  }

  if (vYCoeffsBank) free (vYCoeffsBank);
  if (vCCoeffsBank) free (vCCoeffsBank);

}