annotate arm/dsputil_vfp.S @ 11032:01bd040f8607 libavcodec

Unroll main loop so the edge==0 case is separate. This allows many things to be simplified away. The h264 decoder is overall 1% faster with an mbaff sample and 0.1% slower with the cathedral sample, probably because the slow loop filter code must be loaded into the code cache for the first MB of each row but isn't used for the following MBs.
author michael
date Thu, 28 Jan 2010 01:24:25 +0000
parents bdcc1c52f223
children 361a5fcb4393
rev   line source
rev 8071: 2487a9db02a0 ARM: move VFP DSP functions to dsputils_vfp.S (mru)
1 /*
2  * Copyright (c) 2008 Siarhei Siamashka <ssvb@users.sourceforge.net>
3  *
4  * This file is part of FFmpeg.
5  *
6  * FFmpeg is free software; you can redistribute it and/or
7  * modify it under the terms of the GNU Lesser General Public
8  * License as published by the Free Software Foundation; either
9  * version 2.1 of the License, or (at your option) any later version.
10  *
11  * FFmpeg is distributed in the hope that it will be useful,
12  * but WITHOUT ANY WARRANTY; without even the implied warranty of
13  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
14  * Lesser General Public License for more details.
15  *
16  * You should have received a copy of the GNU Lesser General Public
17  * License along with FFmpeg; if not, write to the Free Software
18  * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
19  */
20
21 #include "config.h"
22 #include "asm.S"
23
rev 10348: bdcc1c52f223 ARM: use undocumented .syntax directive to enable UAL syntax (mru; parents: 8590)
24         .syntax unified
rev 8071: 2487a9db02a0 ARM: move VFP DSP functions to dsputils_vfp.S (mru)
25 /*
26  * VFP is a floating-point coprocessor used in some ARM cores. VFP11 has a
27  * throughput of 1 cycle for almost all instructions (except double-precision
28  * arithmetic), but rather high latency: 4 cycles for loads and 8 cycles
29  * for arithmetic operations. Scheduling code to avoid pipeline stalls is
30  * therefore very important for performance. VFP also has independent
31  * load/store and arithmetic pipelines, so the two can be made to work
32  * simultaneously and achieve more than 1 operation per cycle. The load/store
33  * pipeline can process 2 single-precision floating-point values per cycle and
34  * supports bulk loads and stores of large register sets. Arithmetic operations
35  * can be done on vectors, which keeps the arithmetic pipeline busy
36  * while the processor issues and executes other instructions. Detailed
37  * optimization manuals can be found at http://www.arm.com
38  */
39
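The comment above mentions vector arithmetic. In VFP short-vector mode the vector length is held in the LEN field of FPSCR (bits 18:16, encoded as length minus one), which is what the `#(3 << 16)` and `#(7 << 16)` constants in `ff_vector_fmul_vfp` manipulate. The C sketch below only illustrates that bit arithmetic on a plain integer. The helper names `fpscr_set_vector_len` and `fpscr_get_vector_len` are hypothetical, are not part of FFmpeg, and do not touch the real register.

```c
#include <assert.h>
#include <stdint.h>

/* FPSCR.LEN occupies bits 18:16 and stores (vector length - 1). */
#define FPSCR_LEN_SHIFT 16
#define FPSCR_LEN_MASK  (UINT32_C(7) << FPSCR_LEN_SHIFT)

/* Hypothetical helper: return fpscr with the vector length set to n (1..8). */
uint32_t fpscr_set_vector_len(uint32_t fpscr, unsigned n)
{
    assert(n >= 1 && n <= 8);
    return (fpscr & ~FPSCR_LEN_MASK) | ((uint32_t)(n - 1) << FPSCR_LEN_SHIFT);
}

/* Hypothetical helper: extract the vector length (1..8) from an FPSCR value. */
unsigned fpscr_get_vector_len(uint32_t fpscr)
{
    return (unsigned)((fpscr & FPSCR_LEN_MASK) >> FPSCR_LEN_SHIFT) + 1;
}
```

Unlike the `orr` in the assembly, which relies on LEN being 0 on entry, this sketch clears the field before setting it. Setting the length back to 1 corresponds to the `bic r12, r12, #(7 << 16)` at the end of the function.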
rev 8071: 2487a9db02a0 ARM: move VFP DSP functions to dsputils_vfp.S (mru)
40 /**
41  * ARM VFP optimized implementation of the 'vector_fmul_c' function.
42  * Assumes that len is positive and a multiple of 8.
43  */
44 @ void ff_vector_fmul_vfp(float *dst, const float *src, int len)
45 function ff_vector_fmul_vfp, export=1
46         vpush {d8-d15}
47         mov r3, r0
48         fmrx r12, fpscr
49         orr r12, r12, #(3 << 16) /* set vector size to 4 */
50         fmxr fpscr, r12
51
rev 8252: 92008e82ce6c ARM: convert VFP code to UAL syntax (mru; parents: 8071)
52         vldmia r3!, {s0-s3}
53         vldmia r1!, {s8-s11}
54         vldmia r3!, {s4-s7}
55         vldmia r1!, {s12-s15}
56         vmul.f32 s8, s0, s8
rev 8071: 2487a9db02a0 ARM: move VFP DSP functions to dsputils_vfp.S (mru)
57 1:
58         subs r2, r2, #16
rev 8252: 92008e82ce6c ARM: convert VFP code to UAL syntax (mru; parents: 8071)
59         vmul.f32 s12, s4, s12
60         vldmiage r3!, {s16-s19}
61         vldmiage r1!, {s24-s27}
62         vldmiage r3!, {s20-s23}
63         vldmiage r1!, {s28-s31}
64         vmulge.f32 s24, s16, s24
65         vstmia r0!, {s8-s11}
66         vstmia r0!, {s12-s15}
67         vmulge.f32 s28, s20, s28
68         vldmiagt r3!, {s0-s3}
69         vldmiagt r1!, {s8-s11}
70         vldmiagt r3!, {s4-s7}
71         vldmiagt r1!, {s12-s15}
72         vmulge.f32 s8, s0, s8
73         vstmiage r0!, {s24-s27}
74         vstmiage r0!, {s28-s31}
rev 8071: 2487a9db02a0 ARM: move VFP DSP functions to dsputils_vfp.S (mru)
75         bgt 1b
76
77         bic r12, r12, #(7 << 16) /* set vector size back to 1 */
78         fmxr fpscr, r12
79         vpop {d8-d15}
80         bx lr
81         .endfunc
82
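For orientation, the scalar operation that `ff_vector_fmul_vfp` software-pipelines can be sketched in C as follows. This is a minimal reference inferred from the assembly (both operands are loaded through pointers derived from `dst` and `src`, and the product is stored back to `dst`), not a copy of FFmpeg's C code. The name `vector_fmul_ref` is hypothetical.

```c
#include <assert.h>

/* Hypothetical scalar reference: multiply dst element-wise by src, in place. */
void vector_fmul_ref(float *dst, const float *src, int len)
{
    /* Precondition taken from the comment on the assembly routine. */
    assert(len > 0 && len % 8 == 0);
    for (int i = 0; i < len; i++)
        dst[i] *= src[i];
}
```

The assembly obtains the same result while overlapping the loads for the next group of elements with the multiplies and stores of the current one.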
rev 8071: 2487a9db02a0 ARM: move VFP DSP functions to dsputils_vfp.S (mru)
83 /**
84  * ARM VFP optimized implementation of the 'vector_fmul_reverse_c' function.
85  * Assumes that len is positive and a multiple of 8.
86  */
87 @ void ff_vector_fmul_reverse_vfp(float *dst, const float *src0,
88 @ const float *src1, int len)
89 function ff_vector_fmul_reverse_vfp, export=1
90         vpush {d8-d15}
91         add r2, r2, r3, lsl #2
rev 8252: 92008e82ce6c ARM: convert VFP code to UAL syntax (mru; parents: 8071)
92         vldmdb r2!, {s0-s3}
93         vldmia r1!, {s8-s11}
94         vldmdb r2!, {s4-s7}
95         vldmia r1!, {s12-s15}
96         vmul.f32 s8, s3, s8
97         vmul.f32 s9, s2, s9
98         vmul.f32 s10, s1, s10
99         vmul.f32 s11, s0, s11
rev 8071: 2487a9db02a0 ARM: move VFP DSP functions to dsputils_vfp.S (mru)
100 1:
101         subs r3, r3, #16
rev 8252: 92008e82ce6c ARM: convert VFP code to UAL syntax (mru; parents: 8071)
102         vldmdbge r2!, {s16-s19}
103         vmul.f32 s12, s7, s12
104         vldmiage r1!, {s24-s27}
105         vmul.f32 s13, s6, s13
106         vldmdbge r2!, {s20-s23}
107         vmul.f32 s14, s5, s14
108         vldmiage r1!, {s28-s31}
109         vmul.f32 s15, s4, s15
110         vmulge.f32 s24, s19, s24
111         vldmdbgt r2!, {s0-s3}
112         vmulge.f32 s25, s18, s25
113         vstmia r0!, {s8-s13}
114         vmulge.f32 s26, s17, s26
115         vldmiagt r1!, {s8-s11}
116         vmulge.f32 s27, s16, s27
117         vmulge.f32 s28, s23, s28
118         vldmdbgt r2!, {s4-s7}
119         vmulge.f32 s29, s22, s29
120         vstmia r0!, {s14-s15}
121         vmulge.f32 s30, s21, s30
122         vmulge.f32 s31, s20, s31
123         vmulge.f32 s8, s3, s8
124         vldmiagt r1!, {s12-s15}
125         vmulge.f32 s9, s2, s9
126         vmulge.f32 s10, s1, s10
127         vstmiage r0!, {s24-s27}
128         vmulge.f32 s11, s0, s11
129         vstmiage r0!, {s28-s31}
rev 8071: 2487a9db02a0 ARM: move VFP DSP functions to dsputils_vfp.S (mru)
130         bgt 1b
131
132         vpop {d8-d15}
133         bx lr
134         .endfunc
135
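The reversed indexing in `ff_vector_fmul_reverse_vfp` is easier to see in scalar form. The routine points `r2` at the end of `src1` (`add r2, r2, r3, lsl #2`), reads it backwards with `vldmdb`, and pairs the registers in reverse order (`s8` with `s3`, `s9` with `s2`, and so on). A minimal C sketch of the result this produces, with the hypothetical name `vector_fmul_reverse_ref`, might look like this:

```c
#include <assert.h>

/* Hypothetical scalar reference: dst[i] = src0[i] * src1[len - 1 - i]. */
void vector_fmul_reverse_ref(float *dst, const float *src0,
                             const float *src1, int len)
{
    /* Precondition taken from the comment on the assembly routine. */
    assert(len > 0 && len % 8 == 0);
    for (int i = 0; i < len; i++)
        dst[i] = src0[i] * src1[len - 1 - i];
}
```

This routine does not change the FPSCR vector length. Every multiply is issued as a scalar instruction, which is what lets the register pairing be reversed.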
rev 8590: 7a463923ecd1 Change semantic of CONFIG_*, HAVE_* and ARCH_*. (aurel; parents: 8359)
136 #if HAVE_ARMV6
rev 8071: 2487a9db02a0 ARM: move VFP DSP functions to dsputils_vfp.S (mru)
137 /**
138  * ARM VFP optimized float to int16 conversion.
139  * Assumes that len is positive and a multiple of 8, that the destination
140  * buffer is at least 4-byte aligned (8-byte alignment is better for
141  * performance), and little-endian byte order.
142  */
143 @ void ff_float_to_int16_vfp(int16_t *dst, const float *src, int len)
144 function ff_float_to_int16_vfp, export=1
145         push {r4-r8,lr}
146         vpush {d8-d11}
rev 8252: 92008e82ce6c ARM: convert VFP code to UAL syntax (mru; parents: 8071)
147         vldmia r1!, {s16-s23}
148         vcvt.s32.f32 s0, s16
149         vcvt.s32.f32 s1, s17
150         vcvt.s32.f32 s2, s18
151         vcvt.s32.f32 s3, s19
152         vcvt.s32.f32 s4, s20
153         vcvt.s32.f32 s5, s21
154         vcvt.s32.f32 s6, s22
155         vcvt.s32.f32 s7, s23
rev 8071: 2487a9db02a0 ARM: move VFP DSP functions to dsputils_vfp.S (mru)
156 1:
157         subs r2, r2, #8
rev 8252: 92008e82ce6c ARM: convert VFP code to UAL syntax (mru; parents: 8071)
158         vmov r3, r4, s0, s1
159         vmov r5, r6, s2, s3
160         vmov r7, r8, s4, s5
161         vmov ip, lr, s6, s7
162         vldmiagt r1!, {s16-s23}
rev 8071: 2487a9db02a0 ARM: move VFP DSP functions to dsputils_vfp.S (mru)
163         ssat r4, #16, r4
164         ssat r3, #16, r3
165         ssat r6, #16, r6
166         ssat r5, #16, r5
167         pkhbt r3, r3, r4, lsl #16
168         pkhbt r4, r5, r6, lsl #16
rev 8252: 92008e82ce6c ARM: convert VFP code to UAL syntax (mru; parents: 8071)
169         vcvtgt.s32.f32 s0, s16
170         vcvtgt.s32.f32 s1, s17
171         vcvtgt.s32.f32 s2, s18
172         vcvtgt.s32.f32 s3, s19
173         vcvtgt.s32.f32 s4, s20
174         vcvtgt.s32.f32 s5, s21
175         vcvtgt.s32.f32 s6, s22
176         vcvtgt.s32.f32 s7, s23
rev 8071: 2487a9db02a0 ARM: move VFP DSP functions to dsputils_vfp.S (mru)
177         ssat r8, #16, r8
178         ssat r7, #16, r7
179         ssat lr, #16, lr
180         ssat ip, #16, ip
181         pkhbt r5, r7, r8, lsl #16
182         pkhbt r6, ip, lr, lsl #16
183         stmia r0!, {r3-r6}
184         bgt 1b
185
186         vpop {d8-d11}
187         pop {r4-r8,pc}
188         .endfunc
189 #endif
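The conversion loop in `ff_float_to_int16_vfp` has three steps per sample: convert the float to a 32-bit integer (`vcvt.s32.f32`), saturate it to 16 bits (`ssat rX, #16, rX`), and pack two samples into one word (`pkhbt ..., lsl #16`) so that four words can be stored at once. A rough C sketch of those steps follows. The names `sat16`, `pack16` and `float_to_int16_ref` are hypothetical. The C cast truncates toward zero, and whether that matches the rounding of the assembly's conversion instruction is not asserted here. Mapping NaN to 0 is likewise an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Saturate a 32-bit value to the int16_t range, as `ssat rX, #16, rX` does. */
int32_t sat16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return v;
}

/* Pack two samples into one word, lo in bits 15:0 and hi in bits 31:16,
 * as `pkhbt lo, lo, hi, lsl #16` does. On a little-endian machine the
 * low half lands at the lower address, i.e. it is the earlier sample. */
uint32_t pack16(int32_t lo, int32_t hi)
{
    return ((uint32_t)lo & 0xFFFFu) | ((uint32_t)hi << 16);
}

/* Hypothetical scalar reference for the whole routine. */
void float_to_int16_ref(int16_t *dst, const float *src, int len)
{
    /* Precondition taken from the comment on the assembly routine. */
    assert(len > 0 && len % 8 == 0);
    for (int i = 0; i < len; i++) {
        float v = src[i];
        if (v != v)
            v = 0.0f;                       /* NaN: assumption of this sketch */
        /* Clamp so that the cast below is always defined in C. */
        if (v > 2147483520.0f)              /* largest float below 2^31 */
            v = 2147483520.0f;
        if (v < -2147483648.0f)
            v = -2147483648.0f;
        dst[i] = (int16_t)sat16((int32_t)v);
    }
}
```

The reference writes one `int16_t` at a time and so does not depend on byte order. The assembly's requirements of little-endian byte order and a 4-byte aligned destination come from storing packed words with `stmia`.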