annotate armv4l/simple_idct_arm.S @ 5707:c46509aca422 libavcodec

Remove check for input buffer size as it does not guarantee that decoder will not run out of output buffer bounds (and all suspected decoders have their own checks now).
author kostya
date Mon, 24 Sep 2007 16:50:32 +0000
parents d2a7fc14345c
children 15ed47af1838
rev   line source
2967
ef2149182f1c COSMETICS: Remove all trailing whitespace.
diego
parents: 1347
diff changeset
1 /*
1347
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
2 * simple_idct_arm.S
3 * Copyright (C) 2002 Frederic 'dilb' Boulay.
4 *
5 * Author: Frederic Boulay <dilb@handhelds.org>
6 *
5214
470601203f44 Group all copyright and author notices together.
diego
parents: 3947
diff changeset
7 * The function defined in this file is derived from the simple_idct function
8 * from the libavcodec library part of the FFmpeg project.
9 *
3947
c8c591fe26f8 Change license headers to say 'FFmpeg' instead of 'this program/this library'
diego
parents: 3683
diff changeset
10 * This file is part of FFmpeg.
11 *
12 * FFmpeg is free software; you can redistribute it and/or
3683
dc1e28564bb2 Switch license from GPL to LGPL. The original author agreed to this as
diego
parents: 3036
diff changeset
13 * modify it under the terms of the GNU Lesser General Public
14 * License as published by the Free Software Foundation; either
3947
15 * version 2.1 of the License, or (at your option) any later version.
1347
16 *
3947
17 * FFmpeg is distributed in the hope that it will be useful,
1347
18 * but WITHOUT ANY WARRANTY; without even the implied warranty of
3683
19 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
20 * Lesser General Public License for more details.
1347
21 *
3683
22 * You should have received a copy of the GNU Lesser General Public
3947
23 * License along with FFmpeg; if not, write to the Free Software
3036
0b546eab515d Update licensing information: The FSF changed postal address.
diego
parents: 2979
diff changeset
24 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
1347
25 */
26
27 /* useful constants for the algorithm, they are saved in __constant_ptr__ at */
28 /* the end of the source code.*/
29 #define W1 22725
30 #define W2 21407
31 #define W3 19266
32 #define W4 16383
33 #define W5 12873
34 #define W6 8867
35 #define W7 4520
36 #define MASK_MSHW 0xFFFF0000
37
38 /* offsets of the constants in the vector */
39 #define offW1 0
40 #define offW2 4
41 #define offW3 8
42 #define offW4 12
43 #define offW5 16
44 #define offW6 20
45 #define offW7 24
46 #define offMASK_MSHW 28
47
48 #define ROW_SHIFT 11
49 #define ROW_SHIFT2MSHW (16-11)
50 #define COL_SHIFT 20
51 #define ROW_SHIFTED_1 1024 /* 1<< (ROW_SHIFT-1) */
52 #define COL_SHIFTED_1 524288 /* 1<< (COL_SHIFT-1) */
53
54
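The constant definitions above can be cross-checked with a small C sketch. It assumes that `__constant_ptr__` (not shown in this part of the listing) holds eight 32-bit words in the order implied by the `offW*` byte offsets; `constant_table`, `load_const`, `row_shifted_1` and `col_shifted_1` are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the #define lines in the listing. */
enum { W1 = 22725, W2 = 21407, W3 = 19266, W4 = 16383,
       W5 = 12873, W6 = 8867,  W7 = 4520 };
#define MASK_MSHW 0xFFFF0000u

/* Byte offsets into the constant vector, as in the listing. */
enum { offW1 = 0, offW2 = 4, offW3 = 8, offW4 = 12,
       offW5 = 16, offW6 = 20, offW7 = 24, offMASK_MSHW = 28 };

#define ROW_SHIFT 11
#define COL_SHIFT 20

/* ASSUMPTION: the vector stores one 32-bit word per constant,
 * in the same order as the offsets above. */
static const uint32_t constant_table[8] = {
    W1, W2, W3, W4, W5, W6, W7, MASK_MSHW
};

/* Hypothetical helper: the value an "ldr rX, [r12, #off]" would fetch
 * if r12 pointed at constant_table. */
uint32_t load_const(unsigned byte_offset)
{
    assert(byte_offset % 4 == 0 && byte_offset < sizeof constant_table);
    return constant_table[byte_offset / 4];
}

/* The two *_SHIFTED_1 values, recomputed from the shifts as the
 * comments in the listing describe them. */
int32_t row_shifted_1(void) { return 1 << (ROW_SHIFT - 1); }
int32_t col_shifted_1(void) { return 1 << (COL_SHIFT - 1); }
```

With this layout, `load_const(offW3)` yields W3, and the recomputed shift values agree with the literals 1024 and 524288 used for `ROW_SHIFTED_1` and `COL_SHIFTED_1`.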
2979
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
55 .text
56 .align
57 .global simple_idct_ARM
1347
58
59 simple_idct_ARM:
60 @@ void simple_idct_ARM(int16_t *block)
61 @@ save on the stack the registers needed (take all of them),
62 @@ R0-R3 are scratch regs, so no need to save them, but R0 contains the pointer to block
63 @@ so it must not be overwritten, if it is not saved!!
64 @@ R12 is another scratch register, so it does not need to be saved either
65 @@ save all registers
66 stmfd sp!, {r4-r11, r14} @ R14 is also called LR
67 @@ at this point, R0=block, other registers are free.
68 add r14, r0, #112 @ R14=&block[8*7], better start from the last row, and decrease the value until row=0, i.e. R14=block.
69 add r12, pc, #(__constant_ptr__-.-8) @ R12=__constant_ptr__, the vector containing the constants, probably not necessary to reserve a register for it
70 @@ add 2 temporary variables in the stack: R0 and R14
71 sub sp, sp, #8 @ reserve space for 2 local variables
72 str r0, [sp, #0] @ save block in sp[0]
73 @@ stack status
74 @@ sp+4 free
75 @@ sp+0 R0 (block)
76
77
78 @@ at this point, R0=block, R14=&block[56], R12=__constant_ptr__, R1-R11 free
79
80
81 __row_loop:
82 @@ read the row and check whether it is null, almost null, or neither. According to the StrongARM specs, it is not necessary to optimise ldr accesses (i.e. split 32 bits into two 16-bit words); at least it gives more usable registers :)
83 ldr r1, [r14, #0] @ R1=(int32)(R14)[0]=ROWr32[0] (relative row cast to a 32b pointer)
84 ldr r2, [r14, #4] @ R2=(int32)(R14)[1]=ROWr32[1]
85 ldr r3, [r14, #8] @ R3=ROWr32[2]
86 ldr r4, [r14, #12] @ R4=ROWr32[3]
87 @@ check if the words are null, if all of them are null, then proceed with next row (branch __end_row_loop),
88 @@ if ROWr16[0] is the only one not null, then proceed with this special case (branch __almost_empty_row)
89 @@ else follow the complete algorithm.
90 @@ at this point, R0=block, R14=&block[n], R12=__constant_ptr__, R1=ROWr32[0], R2=ROWr32[1],
91 @@ R3=ROWr32[2], R4=ROWr32[3], R5-R11 free
92 orr r5, r4, r3 @ R5=R4 | R3
93 orr r5, r5, r2 @ R5=R4 | R3 | R2
94 orrs r6, r5, r1 @ Test R5 | R1 (the aim is to check if everything is null)
95 beq __end_row_loop
96 mov r7, r1, asr #16 @ R7=R1>>16=ROWr16[1] (evaluate it now, as it could be useful later)
97 ldrsh r6, [r14, #0] @ R6=ROWr16[0]
98 orrs r5, r5, r7 @ R5=R4 | R3 | R2 | R7
99 beq __almost_empty_row
100
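The three-way row test performed at the top of `__row_loop` can be sketched in C as follows. This is an illustrative restatement, not the author's code: it works on the eight 16-bit coefficients directly instead of on four 32-bit words, which sidesteps the little-endian packing that the `asr #16` comment implies. The enum and the function name `classify_row` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical labels for the three paths taken by the row loop. */
enum row_kind {
    ROW_EMPTY,        /* all coefficients zero: branch to __end_row_loop      */
    ROW_ALMOST_EMPTY, /* only row[0] non-zero: branch to __almost_empty_row   */
    ROW_FULL          /* anything else: fall through to __b_evaluation        */
};

enum row_kind classify_row(const int16_t row[8])
{
    assert(row != NULL);

    /* Equivalent of R5 = R4 | R3 | R2: OR of row[2]..row[7]. */
    int rest = row[2] | row[3] | row[4] | row[5] | row[6] | row[7];

    /* Equivalent of "orrs r6, r5, r1": is the whole row zero? */
    if ((rest | row[0] | row[1]) == 0)
        return ROW_EMPTY;

    /* Equivalent of "orrs r5, r5, r7" with R7 = row[1]:
     * everything except row[0] is zero. */
    if ((rest | row[1]) == 0)
        return ROW_ALMOST_EMPTY;

    return ROW_FULL;
}
```

OR-ing the values and testing the result once mirrors how the assembly avoids comparing each coefficient separately.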
101 __b_evaluation:
102 @@ at this point, R0=block (temp), R1(free), R2=ROWr32[1], R3=ROWr32[2], R4=ROWr32[3],
103 @@ R5=(temp), R6=ROWr16[0], R7=ROWr16[1], R8-R11 free,
104 @@ R12=__constant_ptr__, R14=&block[n]
105 @@ to save some registers/calls, proceed with b0-b3 first, followed by a0-a3
106
107 @@ MUL16(b0, W1, row[1]);
108 @@ MUL16(b1, W3, row[1]);
109 @@ MUL16(b2, W5, row[1]);
110 @@ MUL16(b3, W7, row[1]);
111 @@ MAC16(b0, W3, row[3]);
112 @@ MAC16(b1, -W7, row[3]);
113 @@ MAC16(b2, -W1, row[3]);
114 @@ MAC16(b3, -W5, row[3]);
115 ldr r8, [r12, #offW1] @ R8=W1
116 mov r2, r2, asr #16 @ R2=ROWr16[3]
117 mul r0, r8, r7 @ R0=W1*ROWr16[1]=b0 (ROWr16[1] must be the second arg, to have the possibility to save 1 cycle)
118 ldr r9, [r12, #offW3] @ R9=W3
119 ldr r10, [r12, #offW5] @ R10=W5
120 mul r1, r9, r7 @ R1=W3*ROWr16[1]=b1 (ROWr16[1] must be the second arg, to have the possibility to save 1 cycle)
121 ldr r11, [r12, #offW7] @ R11=W7
122 mul r5, r10, r7 @ R5=W5*ROWr16[1]=b2 (ROWr16[1] must be the second arg, to have the possibility to save 1 cycle)
123 mul r7, r11, r7 @ R7=W7*ROWr16[1]=b3 (ROWr16[1] must be the second arg, to have the possibility to save 1 cycle)
2979
124 teq r2, #0 @ if null avoid muls
125 mlane r0, r9, r2, r0 @ R0+=W3*ROWr16[3]=b0 (ROWr16[3] must be the second arg, to have the possibility to save 1 cycle)
1347
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
126 rsbne r2, r2, #0 @ R2=-ROWr16[3]
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
127 mlane r1, r11, r2, r1 @ R1-=W7*ROWr16[3]=b1 (ROWr16[3] must be the second arg, to have the possibility to save 1 cycle)
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
128 mlane r5, r8, r2, r5 @ R5-=W1*ROWr16[3]=b2 (ROWr16[3] must be the second arg, to have the possibility to save 1 cycle)
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
129 mlane r7, r10, r2, r7 @ R7-=W5*ROWr16[3]=b3 (ROWr16[3] must be the second arg, to have the possibility to save 1 cycle)
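The teq/mlane/rsbne sequence above folds the row[3] term into all four odd-part accumulators while negating the coefficient only once. A rough C rendering follows; it is a sketch, not the reference C code. The name `accumulate_row3` and the explicit weight parameters are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: adds the row[3] contribution to b[0..3].
 * w1, w3, w5, w7 stand in for the constants W1, W3, W5, W7. */
void accumulate_row3(int32_t b[4], int32_t row3,
                     int32_t w1, int32_t w3, int32_t w5, int32_t w7)
{
    if (row3 == 0)       /* teq r2, #0: every following op is conditional */
        return;
    b[0] += w3 * row3;   /* mlane r0, r9, r2, r0 */
    row3 = -row3;        /* rsbne r2, r2, #0: negate once, reuse three times */
    b[1] += w7 * row3;   /* b1 -= W7*row[3] */
    b[2] += w1 * row3;   /* b2 -= W1*row[3] */
    b[3] += w5 * row3;   /* b3 -= W5*row[3] */
}
```

Negating the sample rather than each weight keeps the three subtracting terms as plain multiply-accumulates, which is presumably why the assembly does the same.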

        @@ at this point, R0=b0, R1=b1, R2 (free), R3=ROWr32[2], R4=ROWr32[3],
        @@ R5=b2, R6=ROWr16[0], R7=b3, R8=W1, R9=W3, R10=W5, R11=W7,
        @@ R12=__const_ptr_, R14=&block[n]
        @@ temp = ((uint32_t*)row)[2] | ((uint32_t*)row)[3];
        @@ if (temp != 0) {}
        orrs r2, r3, r4 @ R2=ROWr32[2] | ROWr32[3]
        beq __end_b_evaluation

        @@ at this point, R0=b0, R1=b1, R2=ROWr32[2] | ROWr32[3] (tmp), R3=ROWr32[2], R4=ROWr32[3],
        @@ R5=b2, R6=ROWr16[0], R7=b3, R8=W1, R9=W3, R10=W5, R11=W7,
        @@ R12=__const_ptr_, R14=&block[n]
        @@ MAC16(b0, W5, row[5]);
        @@ MAC16(b2, W7, row[5]);
        @@ MAC16(b3, W3, row[5]);
        @@ MAC16(b1, -W1, row[5]);
        @@ MAC16(b0, W7, row[7]);
        @@ MAC16(b2, W3, row[7]);
        @@ MAC16(b3, -W1, row[7]);
        @@ MAC16(b1, -W5, row[7]);
        mov r3, r3, asr #16 @ R3=ROWr16[5]
        teq r3, #0 @ if null avoid muls
        mlane r0, r10, r3, r0 @ R0+=W5*ROWr16[5]=b0
        mov r4, r4, asr #16 @ R4=ROWr16[7]
        mlane r5, r11, r3, r5 @ R5+=W7*ROWr16[5]=b2
        mlane r7, r9, r3, r7 @ R7+=W3*ROWr16[5]=b3
        rsbne r3, r3, #0 @ R3=-ROWr16[5]
        mlane r1, r8, r3, r1 @ R1-=W1*ROWr16[5]=b1
        @@ R3 is free now
        teq r4, #0 @ if null avoid muls
        mlane r0, r11, r4, r0 @ R0+=W7*ROWr16[7]=b0
        mlane r5, r9, r4, r5 @ R5+=W3*ROWr16[7]=b2
        rsbne r4, r4, #0 @ R4=-ROWr16[7]
        mlane r7, r8, r4, r7 @ R7-=W1*ROWr16[7]=b3
        mlane r1, r10, r4, r1 @ R1-=W5*ROWr16[7]=b1
        @@ R4 is free now
__end_b_evaluation:
        @@ at this point, R0=b0, R1=b1, R2=ROWr32[2] | ROWr32[3] (tmp), R3 (free), R4 (free),
        @@ R5=b2, R6=ROWr16[0], R7=b3, R8 (free), R9 (free), R10 (free), R11 (free),
        @@ R12=__const_ptr_, R14=&block[n]
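The orrs/beq test and the two "asr #16" moves above treat row[4..7] as two 32-bit words: one OR tells whether all four coefficients are zero, and an arithmetic shift pulls the signed upper half-word (row[5] or row[7] on a little-endian target) out of each word. A minimal C sketch of that idea follows. The name `unpack_row5_row7` is an assumption for illustration and does not come from the source.

```c
#include <stdint.h>

/* Hypothetical helper: w2 and w3 are ((uint32_t*)row)[2] and [3] as loaded
 * on a little-endian machine. Returns 0 when row[4..7] are all zero
 * (the beq __end_b_evaluation case), otherwise extracts row[5] and row[7]. */
int unpack_row5_row7(uint32_t w2, uint32_t w3, int16_t *row5, int16_t *row7)
{
    if ((w2 | w3) == 0)            /* orrs r2, r3, r4 */
        return 0;
    /* Upper half-word, reinterpreted as signed (mov rX, rX, asr #16).
     * The narrowing conversion is implementation-defined before C23 but
     * wraps modulo 2^16 on the usual two's-complement compilers. */
    *row5 = (int16_t)(w2 >> 16);
    *row7 = (int16_t)(w3 >> 16);
    return 1;
}
```

Note that the test covers row[4] and row[6] as well, so a non-zero even coefficient still sends execution through the row[5]/row[7] code, where the per-coefficient teq skips the multiplies.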

__a_evaluation:
        @@ a0 = (W4 * row[0]) + (1 << (ROW_SHIFT - 1));
        @@ a1 = a0 + W6 * row[2];
        @@ a2 = a0 - W6 * row[2];
        @@ a3 = a0 - W2 * row[2];
        @@ a0 = a0 + W2 * row[2];
        ldr r9, [r12, #offW4] @ R9=W4
        mul r6, r9, r6 @ R6=W4*ROWr16[0]
        ldr r10, [r12, #offW6] @ R10=W6
        ldrsh r4, [r14, #4] @ R4=ROWr16[2] (a3 not defined yet)
        add r6, r6, #ROW_SHIFTED_1 @ R6=W4*ROWr16[0] + (1<<(ROW_SHIFT-1)) (a0)

        mul r11, r10, r4 @ R11=W6*ROWr16[2]
        ldr r8, [r12, #offW2] @ R8=W2
        sub r3, r6, r11 @ R3=a0-W6*ROWr16[2] (a2)
        @@ temp = ((uint32_t*)row)[2] | ((uint32_t*)row)[3];
        @@ if (temp != 0) {}
        teq r2, #0
        beq __end_bef_a_evaluation

        add r2, r6, r11 @ R2=a0+W6*ROWr16[2] (a1)
        mul r11, r8, r4 @ R11=W2*ROWr16[2]
        sub r4, r6, r11 @ R4=a0-W2*ROWr16[2] (a3)
        add r6, r6, r11 @ R6=a0+W2*ROWr16[2] (a0)


        @@ at this point, R0=b0, R1=b1, R2=a1, R3=a2, R4=a3,
        @@ R5=b2, R6=a0, R7=b3, R8=W2, R9=W4, R10=W6, R11 (free),
        @@ R12=__const_ptr_, R14=&block[n]


        @@ a0 += W4*row[4]
        @@ a1 -= W4*row[4]
        @@ a2 -= W4*row[4]
        @@ a3 += W4*row[4]
        ldrsh r11, [r14, #8] @ R11=ROWr16[4]
        teq r11, #0 @ if null avoid muls
        mulne r11, r9, r11 @ R11=W4*ROWr16[4]
        @@ R9 is free now
        ldrsh r9, [r14, #12] @ R9=ROWr16[6]
        addne r6, r6, r11 @ R6+=W4*ROWr16[4] (a0)
        subne r2, r2, r11 @ R2-=W4*ROWr16[4] (a1)
        subne r3, r3, r11 @ R3-=W4*ROWr16[4] (a2)
        addne r4, r4, r11 @ R4+=W4*ROWr16[4] (a3)
        @@ W6 alone is no longer needed after the next multiply; R10 is reused for W2*ROWr16[6]
        teq r9, #0 @ if null avoid muls
        mulne r11, r10, r9 @ R11=W6*ROWr16[6]
        addne r6, r6, r11 @ R6+=W6*ROWr16[6] (a0)
        mulne r10, r8, r9 @ R10=W2*ROWr16[6]
        @@ a0 += W6*row[6];
        @@ a3 -= W6*row[6];
        @@ a1 -= W2*row[6];
        @@ a2 += W2*row[6];
        subne r4, r4, r11 @ R4-=W6*ROWr16[6] (a3)
        subne r2, r2, r10 @ R2-=W2*ROWr16[6] (a1)
        addne r3, r3, r10 @ R3+=W2*ROWr16[6] (a2)
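Collected in one place, the even-part formulas quoted in the @@ comments above amount to the following C sketch. It is an illustration rather than the reference code. The name `even_part` and the weight parameters are assumptions, and ROW_SHIFT = 11 is inferred from the ">>11" comments later in the listing. The branch to __end_bef_a_evaluation is not modelled, because the per-coefficient zero tests already yield the same values.

```c
#include <stdint.h>

enum { ROW_SHIFT = 11 };   /* assumed, see lead-in */

/* Hypothetical helper: even part a[0..3] of one 8-sample row.
 * w2, w4, w6 stand in for the constants W2, W4, W6. */
void even_part(const int16_t row[8], int32_t a[4],
               int32_t w2, int32_t w4, int32_t w6)
{
    int32_t a0 = w4 * row[0] + (1 << (ROW_SHIFT - 1)); /* rounding bias */
    int32_t t  = w6 * row[2];
    a[1] = a0 + t;
    a[2] = a0 - t;
    t    = w2 * row[2];
    a[3] = a0 - t;
    a[0] = a0 + t;

    if (row[4] != 0) {             /* teq r11, #0 ; mulne/addne/subne */
        t = w4 * row[4];
        a[0] += t;  a[1] -= t;  a[2] -= t;  a[3] += t;
    }
    if (row[6] != 0) {             /* teq r9, #0 */
        t = w6 * row[6];
        a[0] += t;  a[3] -= t;
        t = w2 * row[6];
        a[1] -= t;  a[2] += t;
    }
}
```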

__end_a_evaluation:
        @@ at this point, R0=b0, R1=b1, R2=a1, R3=a2, R4=a3,
        @@ R5=b2, R6=a0, R7=b3, R8 (free), R9 (free), R10 (free), R11 (free),
        @@ R12=__const_ptr_, R14=&block[n]
        @@ row[0] = (a0 + b0) >> ROW_SHIFT;
        @@ row[1] = (a1 + b1) >> ROW_SHIFT;
        @@ row[2] = (a2 + b2) >> ROW_SHIFT;
        @@ row[3] = (a3 + b3) >> ROW_SHIFT;
        @@ row[4] = (a3 - b3) >> ROW_SHIFT;
        @@ row[5] = (a2 - b2) >> ROW_SHIFT;
        @@ row[6] = (a1 - b1) >> ROW_SHIFT;
        @@ row[7] = (a0 - b0) >> ROW_SHIFT;
        add r8, r6, r0 @ R8=a0+b0
        add r9, r2, r1 @ R9=a1+b1
        @@ pack two 16-bit half-words into one 32-bit word
        @@ ROWr32[0]=ROWr16[0] | (ROWr16[1]<<16) (little-endian only!)
        ldr r10, [r12, #offMASK_MSHW] @ R10=0xFFFF0000
        and r9, r10, r9, lsl #ROW_SHIFT2MSHW @ R9=0xFFFF0000 & ((a1+b1)<<5)
        mvn r11, r10 @ R11= NOT R10= 0x0000FFFF
        and r8, r11, r8, asr #ROW_SHIFT @ R8=0x0000FFFF & ((a0+b0)>>11)
        orr r8, r8, r9
        str r8, [r14, #0]
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
250
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
251 add r8, r3, r5 @ R8=a2+b2
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
252 add r9, r4, r7 @ R9=a3+b3
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
253 and r9, r10, r9, lsl #ROW_SHIFT2MSHW @ R9=0xFFFF0000 & ((a3+b3)<<5)
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
254 and r8, r11, r8, asr #ROW_SHIFT @ R8=0x0000FFFF & ((a2+b2)>>11)
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
255 orr r8, r8, r9
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
256 str r8, [r14, #4]
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
257
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
258 sub r8, r4, r7 @ R8=a3-b3
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
259 sub r9, r3, r5 @ R9=a2-b2
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
260 and r9, r10, r9, lsl #ROW_SHIFT2MSHW @ R9=0xFFFF0000 & ((a2-b2)<<5)
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
261 and r8, r11, r8, asr #ROW_SHIFT @ R8=0x0000FFFF & ((a3-b3)>>11)
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
262 orr r8, r8, r9
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
263 str r8, [r14, #8]
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
264
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
265 sub r8, r2, r1 @ R8=a1-b1
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
266 sub r9, r6, r0 @ R9=a0-b0
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
267 and r9, r10, r9, lsl #ROW_SHIFT2MSHW @ R9=0xFFFF0000 & ((a0-b0)<<5)
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
268 and r8, r11, r8, asr #ROW_SHIFT @ R8=0x0000FFFF & ((a1-b1)>>11)
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
269 orr r8, r8, r9
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
270 str r8, [r14, #12]
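The half-word packing used for each of the four stores above can be sketched in C. This is only an illustration: the helper name pack_row_pair is hypothetical, and the shift amounts are taken from the ">>11" and "<<5" notes in the comments.

```c
#include <stdint.h>

/* Shift amounts taken from the ">>11" and "<<5" notes in the comments. */
#define ROW_SHIFT       11
#define ROW_SHIFT2MSHW  (16 - ROW_SHIFT)

/* Hypothetical helper: builds the 32-bit word whose low half-word is
 * (lo >> ROW_SHIFT) and whose high half-word is (hi >> ROW_SHIFT).
 * Assumes ">>" on a negative int32_t is an arithmetic shift, like ARM "asr". */
static uint32_t pack_row_pair(int32_t lo, int32_t hi)
{
    /* and r8, r11, r8, asr #ROW_SHIFT */
    uint32_t low_half = (uint32_t)(lo >> ROW_SHIFT) & 0x0000FFFFu;
    /* and r9, r10, r9, lsl #ROW_SHIFT2MSHW */
    uint32_t high_half = ((uint32_t)hi << ROW_SHIFT2MSHW) & 0xFFFF0000u;
    /* orr r8, r8, r9 */
    return low_half | high_half;
}
```

Shifting the second value left by 5 and masking the top 16 bits gives the same half-word as shifting it right by 11, which is why a single instruction is enough for the high half. Storing the packed word puts the low half-word at the lower address only on a little-endian target, as the source comment warns.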
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
271
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
272 bal __end_row_loop
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
273
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
274 __almost_empty_row:
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
275 @@ the row is empty except for ROWr16[0]; this special case is handled here
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
276 @@ at this point, R0=block, R14=&block[n], R12=__const_ptr_, R1=ROWr32[0], R2=ROWr32[1],
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
277 @@ R3=ROWr32[2], R4=ROWr32[3], R5=(temp), R6=ROWr16[0], R7=ROWr16[1],
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
278 @@ R8=0xFFFF (temp), R9-R11 free
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
279 mov r8, #0x10000 @ R8=0x10000, first of the 2 steps needed to get 0xFFFF; this saves an ldr (because of delay run).
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
280 sub r8, r8, #1 @ R8 is now ready.
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
281 and r5, r8, r6, lsl #3 @ R5=R8 & (R6<<3)= (ROWr16[0]<<3) & 0xFFFF
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
282 orr r5, r5, r5, lsl #16 @ R5=R5 | (R5<<16)
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
283 str r5, [r14, #0] @ R14[0]=ROWr32[0]=R5
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
284 str r5, [r14, #4] @ R14[4]=ROWr32[1]=R5
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
285 str r5, [r14, #8] @ R14[8]=ROWr32[2]=R5
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
286 str r5, [r14, #12] @ R14[12]=ROWr32[3]=R5
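A C view of the shortcut above, as a minimal sketch: when only ROWr16[0] is non-zero, every output of the row gets the same value, (ROWr16[0] << 3) truncated to 16 bits. The function name fill_dc_only_row is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the __almost_empty_row path: all eight
 * 16-bit entries of the row receive (row[0] << 3) & 0xFFFF.
 * The final cast assumes the usual two's-complement wrap-around. */
static void fill_dc_only_row(int16_t row[8])
{
    /* and r5, r8, r6, lsl #3 */
    uint16_t dc = (uint16_t)(((uint32_t)(int32_t)row[0] << 3) & 0xFFFFu);

    /* The assembly duplicates dc into both halves of R5 and stores that
     * word four times; writing eight half-words has the same effect. */
    for (int i = 0; i < 8; i++)
        row[i] = (int16_t)dc;
}
```

Because both halves of each stored word are identical, this path does not depend on byte order, unlike the general row path.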
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
287
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
288 __end_row_loop:
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
289 @@ at this point, R0-R11 (free)
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
290 @@ R12=__const_ptr_, R14=&block[n]
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
291 ldr r0, [sp, #0] @ R0=block
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
292 teq r0, r14 @ compare current &block[8*n] to block, when block is reached, the loop is finished.
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
293 sub r14, r14, #16
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
294 bne __row_loop
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
295
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
296
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
297
2979
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
298 @@ at this point, R0=block, R1-R11 (free)
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
299 @@ R12=__const_ptr_, R14=&block[n]
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
300 add r14, r0, #14 @ R14=&block[7], better start from the last col, and decrease the value until col=0, i.e. R14=block.
1347
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
301 __col_loop:
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
302
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
303 __b_evaluation2:
2979
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
304 @@ at this point, R0=block (temp), R1-R11 (free)
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
305 @@ R12=__const_ptr_, R14=&block[n]
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
306 @@ proceed with b0-b3 first, followed by a0-a3
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
307 @@ MUL16(b0, W1, col[8x1]);
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
308 @@ MUL16(b1, W3, col[8x1]);
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
309 @@ MUL16(b2, W5, col[8x1]);
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
310 @@ MUL16(b3, W7, col[8x1]);
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
311 @@ MAC16(b0, W3, col[8x3]);
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
312 @@ MAC16(b1, -W7, col[8x3]);
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
313 @@ MAC16(b2, -W1, col[8x3]);
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
314 @@ MAC16(b3, -W5, col[8x3]);
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
315 ldr r8, [r12, #offW1] @ R8=W1
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
316 ldrsh r7, [r14, #16]
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
317 mul r0, r8, r7 @ R0=W1*ROWr16[1]=b0 (ROWr16[1] must be the second arg, to have the possibility to save 1 cycle)
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
318 ldr r9, [r12, #offW3] @ R9=W3
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
319 ldr r10, [r12, #offW5] @ R10=W5
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
320 mul r1, r9, r7 @ R1=W3*ROWr16[1]=b1 (ROWr16[1] must be the second arg, to have the possibility to save 1 cycle)
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
321 ldr r11, [r12, #offW7] @ R11=W7
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
322 mul r5, r10, r7 @ R5=W5*ROWr16[1]=b2 (ROWr16[1] must be the second arg, to have the possibility to save 1 cycle)
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
323 ldrsh r2, [r14, #48]
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
324 mul r7, r11, r7 @ R7=W7*ROWr16[1]=b3 (ROWr16[1] must be the second arg, to have the possibility to save 1 cycle)
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
325 teq r2, #0 @ if 0, then avoid muls
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
326 mlane r0, r9, r2, r0 @ R0+=W3*ROWr16[3]=b0 (ROWr16[3] must be the second arg, to have the possibility to save 1 cycle)
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
327 rsbne r2, r2, #0 @ R2=-ROWr16[3]
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
328 mlane r1, r11, r2, r1 @ R1-=W7*ROWr16[3]=b1 (ROWr16[3] must be the second arg, to have the possibility to save 1 cycle)
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
329 mlane r5, r8, r2, r5 @ R5-=W1*ROWr16[3]=b2 (ROWr16[3] must be the second arg, to have the possibility to save 1 cycle)
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
330 mlane r7, r10, r2, r7 @ R7-=W5*ROWr16[3]=b3 (ROWr16[3] must be the second arg, to have the possibility to save 1 cycle)
1347
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
331
2979
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
332 @@ at this point, R0=b0, R1=b1, R2 (free), R3 (free), R4 (free),
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
333 @@ R5=b2, R6 (free), R7=b3, R8=W1, R9=W3, R10=W5, R11=W7,
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
334 @@ R12=__const_ptr_, R14=&block[n]
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
335 @@ MAC16(b0, W5, col[5x8]);
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
336 @@ MAC16(b2, W7, col[5x8]);
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
337 @@ MAC16(b3, W3, col[5x8]);
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
338 @@ MAC16(b1, -W1, col[5x8]);
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
339 @@ MAC16(b0, W7, col[7x8]);
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
340 @@ MAC16(b2, W3, col[7x8]);
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
341 @@ MAC16(b3, -W1, col[7x8]);
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
342 @@ MAC16(b1, -W5, col[7x8]);
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
343 ldrsh r3, [r14, #80] @ R3=COLr16[5x8]
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
344 teq r3, #0 @ if 0 then avoid muls
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
345 mlane r0, r10, r3, r0 @ R0+=W5*ROWr16[5x8]=b0
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
346 mlane r5, r11, r3, r5 @ R5+=W7*ROWr16[5x8]=b2
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
347 mlane r7, r9, r3, r7 @ R7+=W3*ROWr16[5x8]=b3
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
348 rsbne r3, r3, #0 @ R3=-ROWr16[5x8]
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
349 ldrsh r4, [r14, #112] @ R4=COLr16[7x8]
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
350 mlane r1, r8, r3, r1 @ R1-=W1*ROWr16[5x8]=b1
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
351 @@ R3 is free now
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
352 teq r4, #0 @ if 0 then avoid muls
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
353 mlane r0, r11, r4, r0 @ R0+=W7*ROWr16[7x8]=b0
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
354 mlane r5, r9, r4, r5 @ R5+=W3*ROWr16[7x8]=b2
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
355 rsbne r4, r4, #0 @ R4=-ROWr16[7x8]
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
356 mlane r7, r8, r4, r7 @ R7-=W1*ROWr16[7x8]=b3
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
357 mlane r1, r10, r4, r1 @ R1-=W5*ROWr16[7x8]=b1
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
358 @@ R4 is free now
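The b0-b3 evaluation above can be written out in C as follows. This is a sketch under assumptions: the weights W1, W3, W5 and W7 are not defined in this part of the listing, so the values below are the ones used by the C simple_idct, and the function name col_odd_part is hypothetical.

```c
#include <stdint.h>

/* Assumed weights (values from the C simple_idct, not shown in this excerpt). */
enum { W1 = 22725, W3 = 19266, W5 = 12873, W7 = 4520 };

/* Hypothetical helper: odd part of one column. col points at the first
 * entry of the column; consecutive rows are 8 entries apart. */
static void col_odd_part(const int16_t *col, int32_t b[4])
{
    int32_t c1 = col[8 * 1], c3 = col[8 * 3], c5 = col[8 * 5], c7 = col[8 * 7];

    /* MUL16 step: always executed (the four "mul" instructions). */
    b[0] = W1 * c1;
    b[1] = W3 * c1;
    b[2] = W5 * c1;
    b[3] = W7 * c1;

    /* MAC16 steps: skipped when the coefficient is zero ("teq" + "mlane"). */
    if (c3 != 0) { b[0] += W3 * c3; b[1] -= W7 * c3; b[2] -= W1 * c3; b[3] -= W5 * c3; }
    if (c5 != 0) { b[0] += W5 * c5; b[1] -= W1 * c5; b[2] += W7 * c5; b[3] += W3 * c5; }
    if (c7 != 0) { b[0] += W7 * c7; b[1] -= W5 * c7; b[2] += W3 * c7; b[3] -= W1 * c7; }
}
```

The assembly gets the subtractions by negating the coefficient once ("rsbne") and then reusing the multiply-accumulate instruction, since ARMv4 has no multiply-subtract.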
1347
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
359 __end_b_evaluation2:
2979
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
360 @@ at this point, R0=b0, R1=b1, R2 (free), R3 (free), R4 (free),
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
361 @@ R5=b2, R6 (free), R7=b3, R8 (free), R9 (free), R10 (free), R11 (free),
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
362 @@ R12=__const_ptr_, R14=&block[n]
1347
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
363
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
364 __a_evaluation2:
2979
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
365 @@ a0 = (W4 * col[8x0]) + (1 << (COL_SHIFT - 1));
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
366 @@ a1 = a0 + W6 * row[2];
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
367 @@ a2 = a0 - W6 * row[2];
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
368 @@ a3 = a0 - W2 * row[2];
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
369 @@ a0 = a0 + W2 * row[2];
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
370 ldrsh r6, [r14, #0]
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
371 ldr r9, [r12, #offW4] @ R9=W4
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
372 mul r6, r9, r6 @ R6=W4*ROWr16[0]
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
373 ldr r10, [r12, #offW6] @ R10=W6
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
374 ldrsh r4, [r14, #32] @ R4=ROWr16[2] (a3 not defined yet)
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
375 add r6, r6, #COL_SHIFTED_1 @ R6=W4*ROWr16[0] + 1<<(COL_SHIFT-1) (a0)
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
376 mul r11, r10, r4 @ R11=W6*ROWr16[2]
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
377 ldr r8, [r12, #offW2] @ R8=W2
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
378 add r2, r6, r11 @ R2=a0+W6*ROWr16[2] (a1)
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
379 sub r3, r6, r11 @ R3=a0-W6*ROWr16[2] (a2)
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
380 mul r11, r8, r4 @ R11=W2*ROWr16[2]
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
381 sub r4, r6, r11 @ R4=a0-W2*ROWr16[2] (a3)
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
382 add r6, r6, r11 @ R6=a0+W2*ROWr16[2] (a0)
1347
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
383
2979
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
384 @@ at this point, R0=b0, R1=b1, R2=a1, R3=a2, R4=a3,
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
385 @@ R5=b2, R6=a0, R7=b3, R8=W2, R9=W4, R10=W6, R11 (free),
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
386 @@ R12=__const_ptr_, R14=&block[n]
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
387 @@ a0 += W4*row[4]
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
388 @@ a1 -= W4*row[4]
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
389 @@ a2 -= W4*row[4]
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
390 @@ a3 += W4*row[4]
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
391 ldrsh r11, [r14, #64] @ R11=ROWr16[4]
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
392 teq r11, #0 @ if null avoid muls
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
393 mulne r11, r9, r11 @ R11=W4*ROWr16[4]
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
394 @@ R9 is free now
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
395 addne r6, r6, r11 @ R6+=W4*ROWr16[4] (a0)
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
396 subne r2, r2, r11 @ R2-=W4*ROWr16[4] (a1)
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
397 subne r3, r3, r11 @ R3-=W4*ROWr16[4] (a2)
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
398 ldrsh r9, [r14, #96] @ R9=ROWr16[6]
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
399 addne r4, r4, r11 @ R4+=W4*ROWr16[4] (a3)
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
400 @@ once W6*ROWr16[6] is computed, W6 itself is no longer needed, so R10 is reused for W2*ROWr16[6]
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
401 teq r9, #0 @ if null avoid muls
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
402 mulne r11, r10, r9 @ R11=W6*ROWr16[6]
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
403 addne r6, r6, r11 @ R6+=W6*ROWr16[6] (a0)
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
404 mulne r10, r8, r9 @ R10=W2*ROWr16[6]
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
405 @@ a0 += W6*row[6];
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
406 @@ a3 -= W6*row[6];
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
407 @@ a1 -= W2*row[6];
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
408 @@ a2 += W2*row[6];
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
409 subne r4, r4, r11 @ R4-=W6*ROWr16[6] (a3)
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
410 subne r2, r2, r10 @ R2-=W2*ROWr16[6] (a1)
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
411 addne r3, r3, r10 @ R3+=W2*ROWr16[6] (a2)
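The a0-a3 evaluation above, written out in C as a sketch. The values of W2, W4, W6 and COL_SHIFT are assumptions taken from the C simple_idct (they are not defined in this part of the listing), and the function name col_even_part is hypothetical. The source comments say row[2], row[4] and row[6]; in the column pass these are the column entries at strides 2, 4 and 6.

```c
#include <stdint.h>

/* Assumed constants (values from the C simple_idct, not shown in this excerpt). */
enum { W2 = 21407, W4 = 16383, W6 = 8867, COL_SHIFT = 20 };

/* Hypothetical helper: even part of one column. col points at the first
 * entry of the column; consecutive rows are 8 entries apart. */
static void col_even_part(const int16_t *col, int32_t a[4])
{
    int32_t c0 = col[8 * 0], c2 = col[8 * 2], c4 = col[8 * 4], c6 = col[8 * 6];

    /* W4*col[0] plus the rounding term 1 << (COL_SHIFT - 1). */
    int32_t base = W4 * c0 + (1 << (COL_SHIFT - 1));

    a[0] = base + W2 * c2;
    a[1] = base + W6 * c2;
    a[2] = base - W6 * c2;
    a[3] = base - W2 * c2;

    /* Contributions skipped when the coefficient is zero ("teq" + "...ne"). */
    if (c4 != 0) {
        int32_t t = W4 * c4;
        a[0] += t; a[1] -= t; a[2] -= t; a[3] += t;
    }
    if (c6 != 0) {
        a[0] += W6 * c6; a[1] -= W2 * c6; a[2] += W2 * c6; a[3] -= W6 * c6;
    }
}
```

Adding the rounding term once to the shared base means all four sums are already rounded when they are later shifted right by COL_SHIFT.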
1347
cca26199ab17 Optimized simple idct for arm by Frederic 'dilb' Boulay <dilb@handhelds.org>. Currently licensed under the GPLv2, but the author allowed to license it under the LGPL, feel free to change
al3x
parents:
diff changeset
412 __end_a_evaluation2:
2979
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
413 @@ at this point, R0=b0, R1=b1, R2=a1, R3=a2, R4=a3,
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
414 @@ R5=b2, R6=a0, R7=b3, R8 (free), R9 (free), R10 (free), R11 (free),
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
415 @@ R12=__const_ptr_, R14=&block[n]
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
416 @@ col[0 ] = ((a0 + b0) >> COL_SHIFT);
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
417 @@ col[8 ] = ((a1 + b1) >> COL_SHIFT);
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
418 @@ col[16] = ((a2 + b2) >> COL_SHIFT);
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
419 @@ col[24] = ((a3 + b3) >> COL_SHIFT);
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
420 @@ col[32] = ((a3 - b3) >> COL_SHIFT);
bfabfdf9ce55 COSMETICS: tabs --> spaces, some prettyprinting
diego
parents: 2967
diff changeset
421 @@ col[40] = ((a2 - b2) >> COL_SHIFT);
422 @@ col[48] = ((a1 - b1) >> COL_SHIFT);
423 @@ col[56] = ((a0 - b0) >> COL_SHIFT);
424 @@@@@ no optimisation here @@@@@
425 add r8, r6, r0 @ R8=a0+b0
426 add r9, r2, r1 @ R9=a1+b1
427 mov r8, r8, asr #COL_SHIFT
428 mov r9, r9, asr #COL_SHIFT
429 strh r8, [r14, #0]
430 strh r9, [r14, #16]
431 add r8, r3, r5 @ R8=a2+b2
432 add r9, r4, r7 @ R9=a3+b3
433 mov r8, r8, asr #COL_SHIFT
434 mov r9, r9, asr #COL_SHIFT
435 strh r8, [r14, #32]
436 strh r9, [r14, #48]
437 sub r8, r4, r7 @ R8=a3-b3
438 sub r9, r3, r5 @ R9=a2-b2
439 mov r8, r8, asr #COL_SHIFT
440 mov r9, r9, asr #COL_SHIFT
441 strh r8, [r14, #64]
442 strh r9, [r14, #80]
443 sub r8, r2, r1 @ R8=a1-b1
444 sub r9, r6, r0 @ R9=a0-b0
445 mov r8, r8, asr #COL_SHIFT
446 mov r9, r9, asr #COL_SHIFT
447 strh r8, [r14, #96]
448 strh r9, [r14, #112]
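For reference, the store sequence in lines 425-448 can be read as the C sketch below. It is only an illustration under stated assumptions: the helper name `idct_col_store` is hypothetical, `COL_SHIFT` is assumed to be 20 as in the C simple_idct this file is derived from, and `>>` on a negative `int` is assumed to behave like the arithmetic shift `asr`. The byte offsets #0, #16, ... #112 passed to `strh` correspond to `col[0]`, `col[8]`, ... `col[56]` because each coefficient is a 16-bit value.

```c
#include <assert.h>
#include <stdint.h>

#define COL_SHIFT 20 /* assumption: value used by the C simple_idct */

/* Hypothetical helper: final butterfly of one column.
 * a[0..3] hold a0..a3 (even part), b[0..3] hold b0..b3 (odd part). */
static void idct_col_store(int16_t *col, const int a[4], const int b[4])
{
    assert(col != 0);
    for (int i = 0; i < 4; i++) {
        /* rows 0..3 take a+b, rows 7..4 take a-b (mirror image) */
        col[8 * i]       = (int16_t)((a[i] + b[i]) >> COL_SHIFT);
        col[8 * (7 - i)] = (int16_t)((a[i] - b[i]) >> COL_SHIFT);
    }
}
```

The assembly unrolls this loop and handles two outputs at a time in r8 and r9, which is why the registers are marked free in the comments above.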
1347
449
450 __end_col_loop:
2979
451 @@ at this point, R0-R11 (free)
452 @@ R12=__const_ptr_, R14=&block[n]
453 ldr r0, [sp, #0] @ R0=block
454 teq r0, r14 @ compare the current &block[n] with block; once block is reached, the loop is finished.
455 sub r14, r14, #2
456 bne __col_loop
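The loop control in lines 453-456 relies on `teq` setting the flags before the pointer is decremented, and on `sub` (without the `s` suffix) leaving them untouched, so `bne` still acts on the result of the comparison. A minimal C sketch of that control flow follows. The function name `count_columns` and the starting offset of 14 bytes (that is, `&block[7]`, the last column first) are assumptions made for illustration, since the loop setup is not shown in this part of the listing.

```c
#include <assert.h>

/* Hypothetical illustration of the loop control only. `offset` plays the
 * role of (r14 - block), measured in bytes. */
static int count_columns(long offset)
{
    int visited = 0;
    int done;

    /* The equality test only terminates for a non-negative multiple of 2. */
    assert(offset >= 0 && offset % 2 == 0);
    do {
        visited++;             /* one column would be transformed here */
        done = (offset == 0);  /* teq r0, r14: is &block[n] == block? */
        offset -= 2;           /* sub r14, r14, #2: flags are not updated */
    } while (!done);           /* bne __col_loop */
    return visited;
}
```

Under the assumed starting offset, `count_columns(14)` visits all 8 columns, and the column at `block` itself is still processed before the loop exits.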
1347
457
458
459
460
461 __end_simple_idct_ARM:
462 @@ restore the registers to their previous state
463 add sp, sp, #8 @@ release the local variables (8 bytes of stack)
464 ldmfd sp!, {r4-r11, r15} @@ restore r4-r11 and update PC with the saved LR content (return).
465
466
467
468 @@ a kind of sub-function, kept out of line here so that it does not burden the common case.
469 __end_bef_a_evaluation:
2979
470 add r2, r6, r11 @ R2=a0+W6*ROWr16[2] (a1)
1347
471 mul r11, r8, r4 @ R11=W2*ROWr16[2]
472 sub r4, r6, r11 @ R4=a0-W2*ROWr16[2] (a3)
473 add r6, r6, r11 @ R6=a0+W2*ROWr16[2] (a0)
2979
474 bal __end_a_evaluation
1347
475
476
477 __constant_ptr__: @@ see #defines at the beginning of the source code for values.
2979
478 .align
1347
479 .word W1
480 .word W2
481 .word W3
482 .word W4
483 .word W5
484 .word W6
485 .word W7
486 .word MASK_MSHW