annotate bfin/idct_bfin.S @ 8404:60b6a780100b libavcodec

Port x264 deblocking code to libavcodec. This includes SSE2 luma deblocking code and both MMXEXT and SSE2 luma intra deblocking code for H.264 decoding. This assembly is available under --enable-gpl and speeds decoding of Cathedral by 7%.
author darkshikari
date Fri, 19 Dec 2008 13:45:13 +0000
parents 78aa57eba353
children 8327c5b4df9b
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
4765
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
1 /*
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
2 * idct BlackFin
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
3 *
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
4 * Copyright (C) 2007 Marc Hoffman <marc.hoffman@analog.com>
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
5 *
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
6 * This file is part of FFmpeg.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
7 *
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
8 * FFmpeg is free software; you can redistribute it and/or
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
9 * modify it under the terms of the GNU Lesser General Public
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
10 * License as published by the Free Software Foundation; either
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
11 * version 2.1 of the License, or (at your option) any later version.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
12 *
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
13 * FFmpeg is distributed in the hope that it will be useful,
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
14 * but WITHOUT ANY WARRANTY; without even the implied warranty of
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
15 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
16 * Lesser General Public License for more details.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
17 *
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
18 * You should have received a copy of the GNU Lesser General Public
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
19 * License along with FFmpeg; if not, write to the Free Software
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
20 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
21 */
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
22 /*
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
23 This blackfin DSP code implements an 8x8 inverse type II DCT.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
24
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
25 Prototype : void ff_bfin_idct(DCTELEM *in)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
26
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
27 Registers Used : A0, A1, R0-R7, I0-I3, B0, B2, B3, M0-M2, L0-L3, P0-P5, LC0.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
28
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
29 Performance :
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
30 Code Size : 498 Bytes.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
31 Cycle Count : 417 Cycles
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
32
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
33
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
34 -----------------------------------------------------------
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
35 FFMPEG conformance testing results
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
36 -----------------------------------------------------------
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
37
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
38 dct-test: modified with the following
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
39 dct_error("BFINidct", 1, ff_bfin_idct, idct, test);
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
40 produces the following output
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
41
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
42 root:/u/ffmpeg/bhead/libavcodec> ./dct-test -i
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
43 ffmpeg DCT/IDCT test
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
44
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
45 8 15 -2 21 24 17 0 10
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
46 2 -10 -5 -5 -3 7 -14 -3
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
47 2 -13 -10 -19 18 -6 6 -2
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
48 9 4 16 -3 9 12 10 15
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
49 15 -9 -2 10 1 16 0 -15
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
50 -15 5 7 3 13 0 13 20
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
51 -6 -15 24 9 -18 1 9 -22
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
52 -8 25 23 2 -7 0 30 13
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
53 IDCT BFINidct: err_inf=1 err2=0.01002344 syserr=0.00150000 maxout=266 blockSumErr=64
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
54 IDCT BFINidct: 88.3 kdct/s
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
55
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
56 */
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
57
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
58 #include "config_bfin.h"
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
59
6362
78aa57eba353 FLAT objects cannot have multiple sections, so using the L1 attributes breaks
diego
parents: 5001
diff changeset
60 #ifdef __FDPIC__
4765
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
61 .section .l1.data.B,"aw",@progbits
6362
78aa57eba353 FLAT objects cannot have multiple sections, so using the L1 attributes breaks
diego
parents: 5001
diff changeset
62 #else
78aa57eba353 FLAT objects cannot have multiple sections, so using the L1 attributes breaks
diego
parents: 5001
diff changeset
63 .data
78aa57eba353 FLAT objects cannot have multiple sections, so using the L1 attributes breaks
diego
parents: 5001
diff changeset
64 #endif
4765
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
65
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
66 .align 4;
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
67 coefs:
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
68 .short 0x5a82; // C4
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
69 .short 0x5a82; // C4
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
70 .short 0x30FC; //cos(3pi/8) C6
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
71 .short 0x7642; //cos(pi/8) C2
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
72 .short 0x18F9; //cos(7pi/16)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
73 .short 0x7D8A; //cos(pi/16)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
74 .short 0x471D; //cos(5pi/16)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
75 .short 0x6A6E; //cos(3pi/16)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
76 .short 0x18F9; //cos(7pi/16)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
77 .short 0x7D8A; //cos(pi/16)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
78
6362
78aa57eba353 FLAT objects cannot have multiple sections, so using the L1 attributes breaks
diego
parents: 5001
diff changeset
79 #ifdef __FDPIC__
78aa57eba353 FLAT objects cannot have multiple sections, so using the L1 attributes breaks
diego
parents: 5001
diff changeset
80 .section .l1.data.A,"aw",@progbits
78aa57eba353 FLAT objects cannot have multiple sections, so using the L1 attributes breaks
diego
parents: 5001
diff changeset
81 #endif
4765
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
82
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
83 vtmp: .space 256
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
84
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
85 #define TMP0 FP-8
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
86 #define TMP1 FP-12
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
87 #define TMP2 FP-16
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
88
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
89
6362
78aa57eba353 FLAT objects cannot have multiple sections, so using the L1 attributes breaks
diego
parents: 5001
diff changeset
90 .text
4765
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
91 DEFUN(idct,mL1,
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
92 (DCTELEM *block)):
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
93
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
94 /********************** Function Prologue *********************************/
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
95 link 16;
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
96 [--SP] = (R7:4, P5:3); // Push the registers onto the stack.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
97 B0 = R0; // Pointer to Input matrix
6362
78aa57eba353 FLAT objects cannot have multiple sections, so using the L1 attributes breaks
diego
parents: 5001
diff changeset
98 RELOC(R1, P3, coefs); // Pointer to Coefficients
78aa57eba353 FLAT objects cannot have multiple sections, so using the L1 attributes breaks
diego
parents: 5001
diff changeset
99 RELOC(R2, P3, vtmp); // Pointer to Temporary matrix
4765
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
100 B3 = R1;
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
101 B2 = R2;
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
102 L3 = 20; // L3 is used for making the coefficient array
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
103 // circular.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
104 // MUST BE RESTORED TO ZERO at function exit.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
105 M1 = 16 (X); // All these registers are initialized for
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
106 M3 = 8(X); // modifying address offsets.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
107
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
108 I0 = B0; // I0 points to Input Element (0, 0).
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
109 I2 = B0; // I2 points to Input Element (0, 0).
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
110 I2 += M3 || R0.H = W[I0];
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
111 // Element 0 is read into R0.H
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
112 I1 = I2; // I1 points to input Element (0, 6).
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
113 I1 += 4 || R0.L = W[I2++];
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
114 // I2 points to input Element (0, 4).
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
115 // Element 4 is read into R0.L.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
116 P2 = 8 (X);
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
117 P3 = 32 (X);
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
118 P4 = -32 (X);
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
119 P5 = 98 (X);
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
120 R7 = 0x8000(Z);
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
121 I3 = B3; // I3 points to Coefficients
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
122 P0 = B2; // P0 points to array Element (0, 0) of temp
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
123 P1 = B2;
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
124 R7 = [I3++] || [TMP2]=R7; // Coefficient C4 is read into R7.H and R7.L.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
125 MNOP;
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
126 NOP;
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
127
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
128 /*
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
129 * A1 = Y0 * cos(pi/4)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
130 * A0 = Y0 * cos(pi/4)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
131 * A1 = A1 + Y4 * cos(pi/4)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
132 * A0 = A0 - Y4 * cos(pi/4)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
133 * load:
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
134 * R1=(Y2,Y6)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
135 * R7=(C2,C6)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
136 * res:
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
137 * R3=Y0, R2=Y4
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
138 */
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
139 A1=R7.H*R0.H, A0=R7.H*R0.H (IS) || I0+= 4 || R1.L=W[I1++];
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
140 R3=(A1+=R7.H*R0.L), R2=(A0-=R7.H*R0.L) (IS) || R1.H=W[I0--] || R7=[I3++];
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
141
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
142 LSETUP (.0, .1) LC0 = P2; // perform 8 1d idcts
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
143
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
144 P2 = 112 (X);
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
145 P1 = P1 + P2; // P1 points to element (7, 0) of temp buffer.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
146 P2 = -94(X);
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
147
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
148 .0:
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
149 /*
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
150 * A1 = Y2 * cos(3pi/8)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
151 * A0 = Y2 * cos(pi/8)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
152 * A1 = A1 - Y6 * cos(pi/8)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
153 * A0 = A0 + Y6 * cos(3pi/8)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
154 * R5 = (Y1,Y7)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
155 * R7 = (C1,C7)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
156 * res:
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
157 * R1=Y2, R0=Y6
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
158 */
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
159 A1=R7.L*R1.H, A0=R7.H*R1.H (IS) || I0+=4 || R5.H=W[I0];
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
160 R1=(A1-=R7.H*R1.L), R0=(A0+=R7.L*R1.L) (IS) || R5.L=W[I1--] || R7=[I3++];
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
161 /*
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
162 * Y0 = Y0 + Y6.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
163 * Y4 = Y4 + Y2.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
164 * Y2 = Y4 - Y2.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
165 * Y6 = Y0 - Y6.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
166 * R3 is saved
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
167 * R6.l=Y3
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
168 * note: R3: Y0, R2: Y4, R1: Y2, R0: Y6
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
169 */
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
170 R3=R3+R0, R0=R3-R0;
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
171 R2=R2+R1, R1=R2-R1 || [TMP0]=R3 || R6.L=W[I0--];
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
172 /*
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
173 * Compute the odd portion (1,3,5,7) even is done.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
174 *
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
175 * Y1 = C7 * Y1 - C1 * Y7 + C3 * Y5 - C5 * Y3.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
176 * Y7 = C1 * Y1 + C7 * Y7 + C5 * Y5 + C3 * Y3.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
177 * Y5 = C5 * Y1 + C3 * Y7 + C7 * Y5 - C1 * Y3.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
178 * Y3 = C3 * Y1 - C5 * Y7 - C1 * Y5 - C7 * Y3.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
179 */
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
180 // R5=(Y1,Y7) R6=(Y5,Y3) // R7=(C1,C7)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
181 A1 =R7.L*R5.H, A0 =R7.H*R5.H (IS) || [TMP1]=R2 || R6.H=W[I2--];
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
182 A1-=R7.H*R5.L, A0+=R7.L*R5.L (IS) || I0-=4 || R7=[I3++];
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
183 A1+=R7.H*R6.H, A0+=R7.L*R6.H (IS) || I0+=M1; // R7=(C3,C5)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
184 R3 =(A1-=R7.L*R6.L), R2 =(A0+=R7.H*R6.L) (IS);
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
185 A1 =R7.L*R5.H, A0 =R7.H*R5.H (IS) || R4=[TMP0];
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
186 A1+=R7.H*R5.L, A0-=R7.L*R5.L (IS) || I1+=M1 || R7=[I3++]; // R7=(C1,C7)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
187 A1+=R7.L*R6.H, A0-=R7.H*R6.H (IS);
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
188 R7 =(A1-=R7.H*R6.L), R6 =(A0-=R7.L*R6.L) (IS) || I2+=M1;
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
189 // R3=Y1, R2=Y7, R7=Y5, R6=Y3
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
190
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
191 /* Transpose write column. */
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
192 R5.H=R4+R2 (RND12); // Y0=Y0+Y7
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
193 R5.L=R4-R2 (RND12) || R4 = [TMP1]; // Y7=Y7-Y0
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
194 R2.H=R1+R7 (RND12) || W[P0++P3]=R5.H; // Y2=Y2+Y5 st Y0
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
195 R2.L=R1-R7 (RND12) || W[P1++P4]=R5.L || R7=[I3++]; // Y5=Y2-Y5 st Y7
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
196 R5.H=R0-R3 (RND12) || W[P0++P3]=R2.H || R1.L=W[I1++]; // Y1=Y6-Y1 st Y2
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
197 R5.L=R0+R3 (RND12) || W[P1++P4]=R2.L || R0.H=W[I0++]; // Y6=Y6+Y1 st Y5
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
198 R3.H=R4-R6 (RND12) || W[P0++P3]=R5.H || R0.L=W[I2++]; // Y3=Y3-Y4 st Y1
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
199 R3.L=R4+R6 (RND12) || W[P1++P4]=R5.L || R1.H=W[I0++]; // Y4=Y3+Y4 st Y6
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
200
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
201 /* pipeline loop start, + drain Y3, Y4 */
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
202 A1=R7.H*R0.H, A0=R7.H*R0.H (IS) || W[P0++P2]= R3.H || R1.H = W[I0--];
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
203 .1: R3=(A1+=R7.H*R0.L), R2=(A0-=R7.H*R0.L) (IS) || W[P1++P5]= R3.L || R7 = [I3++];
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
204
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
205
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
206
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
207 I0 = B2; // I0 points to Input Element (0, 0)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
208 I2 = B2; // I2 points to Input Element (0, 0)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
209 I2 += M3 || R0.H = W[I0];
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
210 // Y0 is read in R0.H
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
211 I1 = I2; // I1 points to input Element (0, 6)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
212 I1 += 4 || R0.L = W[I2++];
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
213 // I2 points to input Element (0, 4)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
214 // Y4 is read in R0.L
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
215 P2 = 8 (X);
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
216 I3 = B3; // I3 points to Coefficients
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
217 P0 = B0; // P0 points to array Element (0, 0) for writing
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
218 // output
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
219 P1 = B0;
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
220 R7 = [I3++]; // R7.H = C4 and R7.L = C4
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
221 NOP;
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
222
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
223 /*
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
224 * A1 = Y0 * cos(pi/4)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
225 * A0 = Y0 * cos(pi/4)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
226 * A1 = A1 + Y4 * cos(pi/4)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
227 * A0 = A0 - Y4 * cos(pi/4)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
228 * load:
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
229 * R1=(Y2,Y6)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
230 * R7=(C2,C6)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
231 * res:
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
232 * R3=Y0, R2=Y4
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
233 */
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
234 A1=R7.H*R0.H, A0=R7.H*R0.H (IS) || I0+=4 || R1.L=W[I1++];
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
235 R3=(A1+=R7.H*R0.L), R2=(A0-=R7.H*R0.L) (IS) || R1.H=W[I0--] || R7=[I3++];
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
236
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
237 LSETUP (.2, .3) LC0 = P2; // peform 8 1d idcts
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
238 P2 = 112 (X);
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
239 P1 = P1 + P2;
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
240 P2 = -94(X);
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
241
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
242 .2:
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
243 /*
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
244 * A1 = Y2 * cos(3pi/8)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
245 * A0 = Y2 * cos(pi/8)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
246 * A1 = A1 - Y6 * cos(pi/8)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
247 * A0 = A0 + Y6 * cos(3pi/8)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
248 * R5 = (Y1,Y7)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
249 * R7 = (C1,C7)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
250 * res:
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
251 * R1=Y2, R0=Y6
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
252 */
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
253 A1=R7.L*R1.H, A0=R7.H*R1.H (IS) || I0+=4 || R5.H=W[I0];
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
254 R1=(A1-=R7.H*R1.L), R0=(A0+=R7.L*R1.L) (IS) || R5.L=W[I1--] || R7=[I3++];
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
255 /*
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
256 * Y0 = Y0 + Y6.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
257 * Y4 = Y4 + Y2.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
258 * Y2 = Y4 - Y2.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
259 * Y6 = Y0 - Y6.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
260 * R3 is saved
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
261 * R6.l=Y3
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
262 * note: R3: Y0, R2: Y4, R1: Y2, R0: Y6
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
263 */
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
264 R3=R3+R0, R0=R3-R0;
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
265 R2=R2+R1, R1=R2-R1 || [TMP0]=R3 || R6.L=W[I0--];
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
266 /*
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
267 * Compute the odd portion (1,3,5,7) even is done.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
268 *
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
269 * Y1 = C7 * Y1 - C1 * Y7 + C3 * Y5 - C5 * Y3.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
270 * Y7 = C1 * Y1 + C7 * Y7 + C5 * Y5 + C3 * Y3.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
271 * Y5 = C5 * Y1 + C3 * Y7 + C7 * Y5 - C1 * Y3.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
272 * Y3 = C3 * Y1 - C5 * Y7 - C1 * Y5 - C7 * Y3.
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
273 */
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
274 // R5=(Y1,Y7) R6=(Y5,Y3) // R7=(C1,C7)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
275 A1 =R7.L*R5.H, A0 =R7.H*R5.H (IS) || [TMP1]=R2 || R6.H=W[I2--];
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
276 A1-=R7.H*R5.L, A0+=R7.L*R5.L (IS) || I0-=4 || R7=[I3++];
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
277 A1+=R7.H*R6.H, A0+=R7.L*R6.H (IS) || I0+=M1; // R7=(C3,C5)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
278 R3 =(A1-=R7.L*R6.L), R2 =(A0+=R7.H*R6.L) (IS);
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
279 A1 =R7.L*R5.H, A0 =R7.H*R5.H (IS) || R4=[TMP0];
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
280 A1+=R7.H*R5.L, A0-=R7.L*R5.L (IS) || I1+=M1 || R7=[I3++]; // R7=(C1,C7)
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
281 A1+=R7.L*R6.H, A0-=R7.H*R6.H (IS);
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
282 R7 =(A1-=R7.H*R6.L), R6 =(A0-=R7.L*R6.L) (IS) || I2+=M1;
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
283 // R3=Y1, R2=Y7, R7=Y5, R6=Y3
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
284
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
285 /* Transpose write column. */
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
286 R5.H=R4+R2 (RND20); // Y0=Y0+Y7
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
287 R5.L=R4-R2 (RND20) || R4 = [TMP1]; // Y7=Y7-Y0
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
288 R2.H=R1+R7 (RND20) || W[P0++P3]=R5.H; // Y2=Y2+Y5 st Y0
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
289 R2.L=R1-R7 (RND20) || W[P1++P4]=R5.L || R7=[I3++]; // Y5=Y2-Y5 st Y7
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
290 R5.H=R0-R3 (RND20) || W[P0++P3]=R2.H || R1.L=W[I1++]; // Y1=Y6-Y1 st Y2
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
291 R5.L=R0+R3 (RND20) || W[P1++P4]=R2.L || R0.H=W[I0++]; // Y6=Y6+Y1 st Y5
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
292 R3.H=R4-R6 (RND20) || W[P0++P3]=R5.H || R0.L=W[I2++]; // Y3=Y3-Y4 st Y1
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
293 R3.L=R4+R6 (RND20) || W[P1++P4]=R5.L || R1.H=W[I0++]; // Y4=Y3+Y4 st Y6
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
294
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
295 /* pipeline loop start, + drain Y3, Y4 */
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
296 A1=R7.H*R0.H, A0=R7.H*R0.H (IS) || W[P0++P2]= R3.H || R1.H = W[I0--];
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
297 .3: R3=(A1+=R7.H*R0.L), R2=(A0-=R7.H*R0.L) (IS) || W[P1++P5]= R3.L || R7 = [I3++];
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
298
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
299 L3 = 0;
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
300 (R7:4,P5:3)=[SP++];
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
301 unlink;
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
302 RTS;
5001
75bf61c6c385 Blackfin DSP utilities: add DEFUN_END
gpoirier
parents: 4765
diff changeset
303 DEFUN_END(idct)
4765
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
304
85298e8c55c4 bfin dsputils, basic pixel operations sads, diffs, motion compensation
diego
parents:
diff changeset
305