annotate ppc/dsputil_ppc.c @ 1708:dea5b2946999 libavcodec

interlaced motion estimation interlaced mpeg2 encoding P & B frames rate distored interlaced mb decission alternate scantable support 4mv encoding fixes (thats also why the regression tests change) passing height to most dsp functions interlaced mpeg4 encoding (no direct mode MBs yet) various related cleanups disabled old motion estimaton algorithms (log, full, ...) they will either be fixed or removed
author michael
date Tue, 30 Dec 2003 16:07:57 +0000
parents 6a4cfc5f9f96
children b370288f004d
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
828
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents: 748
diff changeset
1 /*
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents: 748
diff changeset
2 * Copyright (c) 2002 Brian Foley
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents: 748
diff changeset
3 * Copyright (c) 2002 Dieter Shirley
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents: 748
diff changeset
4 *
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents: 748
diff changeset
5 * This library is free software; you can redistribute it and/or
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents: 748
diff changeset
6 * modify it under the terms of the GNU Lesser General Public
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents: 748
diff changeset
7 * License as published by the Free Software Foundation; either
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents: 748
diff changeset
8 * version 2 of the License, or (at your option) any later version.
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents: 748
diff changeset
9 *
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents: 748
diff changeset
10 * This library is distributed in the hope that it will be useful,
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents: 748
diff changeset
11 * but WITHOUT ANY WARRANTY; without even the implied warranty of
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents: 748
diff changeset
12 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents: 748
diff changeset
13 * Lesser General Public License for more details.
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents: 748
diff changeset
14 *
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents: 748
diff changeset
15 * You should have received a copy of the GNU Lesser General Public
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents: 748
diff changeset
16 * License along with this library; if not, write to the Free Software
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents: 748
diff changeset
17 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents: 748
diff changeset
18 */
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents: 748
diff changeset
19
638
0012f75c92bb altivec build tidyup patch by (Brian Foley <bfoley at compsoc dot nuigalway dot ie>)
michaelni
parents:
diff changeset
20 #include "../dsputil.h"
0012f75c92bb altivec build tidyup patch by (Brian Foley <bfoley at compsoc dot nuigalway dot ie>)
michaelni
parents:
diff changeset
21
1015
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
22 #include "dsputil_ppc.h"
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
23
638
0012f75c92bb altivec build tidyup patch by (Brian Foley <bfoley at compsoc dot nuigalway dot ie>)
michaelni
parents:
diff changeset
24 #ifdef HAVE_ALTIVEC
0012f75c92bb altivec build tidyup patch by (Brian Foley <bfoley at compsoc dot nuigalway dot ie>)
michaelni
parents:
diff changeset
25 #include "dsputil_altivec.h"
0012f75c92bb altivec build tidyup patch by (Brian Foley <bfoley at compsoc dot nuigalway dot ie>)
michaelni
parents:
diff changeset
26 #endif
0012f75c92bb altivec build tidyup patch by (Brian Foley <bfoley at compsoc dot nuigalway dot ie>)
michaelni
parents:
diff changeset
27
1578
6a4cfc5f9f96 AltiVec optimized fdct patch by (James Klicman <james at klicman dot org>)
michael
parents: 1511
diff changeset
28 extern void fdct_altivec(int16_t *block);
1092
f59c3f66363b MpegEncContext.(i)dct_* -> DspContext.(i)dct_*
michaelni
parents: 1033
diff changeset
29 extern void idct_put_altivec(uint8_t *dest, int line_size, int16_t *block);
f59c3f66363b MpegEncContext.(i)dct_* -> DspContext.(i)dct_*
michaelni
parents: 1033
diff changeset
30 extern void idct_add_altivec(uint8_t *dest, int line_size, int16_t *block);
f59c3f66363b MpegEncContext.(i)dct_* -> DspContext.(i)dct_*
michaelni
parents: 1033
diff changeset
31
894
a408778eff87 altivec accelerated v-resample patch by (Brian Foley <bfoley at compsoc dot nuigalway dot ie>)
michaelni
parents: 884
diff changeset
32 int mm_flags = 0;
a408778eff87 altivec accelerated v-resample patch by (Brian Foley <bfoley at compsoc dot nuigalway dot ie>)
michaelni
parents: 884
diff changeset
33
995
edc10966b081 altivec jumbo patch by (Romain Dolbeau <dolbeaur at club-internet dot fr>)
michaelni
parents: 981
diff changeset
34 int mm_support(void)
edc10966b081 altivec jumbo patch by (Romain Dolbeau <dolbeaur at club-internet dot fr>)
michaelni
parents: 981
diff changeset
35 {
edc10966b081 altivec jumbo patch by (Romain Dolbeau <dolbeaur at club-internet dot fr>)
michaelni
parents: 981
diff changeset
36 int result = 0;
1511
587258262aa5 recommit (of patch, as cvslog msg didnt apply cleanly)
michael
parents: 1352
diff changeset
37 #ifdef HAVE_ALTIVEC
995
edc10966b081 altivec jumbo patch by (Romain Dolbeau <dolbeaur at club-internet dot fr>)
michaelni
parents: 981
diff changeset
38 if (has_altivec()) {
edc10966b081 altivec jumbo patch by (Romain Dolbeau <dolbeaur at club-internet dot fr>)
michaelni
parents: 981
diff changeset
39 result |= MM_ALTIVEC;
edc10966b081 altivec jumbo patch by (Romain Dolbeau <dolbeaur at club-internet dot fr>)
michaelni
parents: 981
diff changeset
40 }
edc10966b081 altivec jumbo patch by (Romain Dolbeau <dolbeaur at club-internet dot fr>)
michaelni
parents: 981
diff changeset
41 #endif /* result */
edc10966b081 altivec jumbo patch by (Romain Dolbeau <dolbeaur at club-internet dot fr>)
michaelni
parents: 981
diff changeset
42 return result;
edc10966b081 altivec jumbo patch by (Romain Dolbeau <dolbeaur at club-internet dot fr>)
michaelni
parents: 981
diff changeset
43 }
edc10966b081 altivec jumbo patch by (Romain Dolbeau <dolbeaur at club-internet dot fr>)
michaelni
parents: 981
diff changeset
44
1352
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
45 #ifdef POWERPC_PERFORMANCE_REPORT
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
46 unsigned long long perfdata[POWERPC_NUM_PMC_ENABLED][powerpc_perf_total][powerpc_data_total];
1024
9cc1031e1864 More AltiVec MC functions patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1015
diff changeset
47 /* list below must match enum in dsputil_ppc.h */
1015
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
48 static unsigned char* perfname[] = {
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
49 "fft_calc_altivec",
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
50 "gmc1_altivec",
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
51 "dct_unquantize_h263_altivec",
1578
6a4cfc5f9f96 AltiVec optimized fdct patch by (James Klicman <james at klicman dot org>)
michael
parents: 1511
diff changeset
52 "fdct_altivec",
1015
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
53 "idct_add_altivec",
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
54 "idct_put_altivec",
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
55 "put_pixels16_altivec",
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
56 "avg_pixels16_altivec",
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
57 "avg_pixels8_altivec",
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
58 "put_pixels8_xy2_altivec",
1024
9cc1031e1864 More AltiVec MC functions patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1015
diff changeset
59 "put_no_rnd_pixels8_xy2_altivec",
9cc1031e1864 More AltiVec MC functions patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1015
diff changeset
60 "put_pixels16_xy2_altivec",
9cc1031e1864 More AltiVec MC functions patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1015
diff changeset
61 "put_no_rnd_pixels16_xy2_altivec",
1334
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
62 "clear_blocks_dcbz32_ppc",
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
63 "clear_blocks_dcbz128_ppc"
1015
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
64 };
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
65 #include <stdio.h>
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
66 #endif
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
67
1352
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
68 #ifdef POWERPC_PERFORMANCE_REPORT
1015
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
69 void powerpc_display_perf_report(void)
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
70 {
1352
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
71 int i, j;
1024
9cc1031e1864 More AltiVec MC functions patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1015
diff changeset
72 fprintf(stderr, "PowerPC performance report\n Values are from the PMC registers, and represent whatever the registers are set to record.\n");
1015
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
73 for(i = 0 ; i < powerpc_perf_total ; i++)
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
74 {
1352
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
75 for (j = 0; j < POWERPC_NUM_PMC_ENABLED ; j++)
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
76 {
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
77 if (perfdata[j][i][powerpc_data_num] != (unsigned long long)0)
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
78 fprintf(stderr,
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
79 " Function \"%s\" (pmc%d):\n\tmin: %llu\n\tmax: %llu\n\tavg: %1.2lf (%llu)\n",
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
80 perfname[i],
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
81 j+1,
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
82 perfdata[j][i][powerpc_data_min],
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
83 perfdata[j][i][powerpc_data_max],
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
84 (double)perfdata[j][i][powerpc_data_sum] /
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
85 (double)perfdata[j][i][powerpc_data_num],
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
86 perfdata[j][i][powerpc_data_num]);
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
87 }
1015
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
88 }
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
89 }
1352
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
90 #endif /* POWERPC_PERFORMANCE_REPORT */
1015
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
91
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
92 /* ***** WARNING ***** WARNING ***** WARNING ***** */
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
93 /*
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
94 clear_blocks_dcbz32_ppc will not work properly
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
95 on PowerPC processors with a cache line size
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
96 not equal to 32 bytes.
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
97 Fortunately all processor used by Apple up to
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
98 at least the 7450 (aka second generation G4)
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
99 use 32 bytes cache line.
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
100 This is due to the use of the 'dcbz' instruction.
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
101 It simply clear to zero a single cache line,
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
102 so you need to know the cache line size to use it !
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
103 It's absurd, but it's fast...
1334
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
104
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
105 update 24/06/2003 : Apple released yesterday the G5,
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
106 with a PPC970. cache line size : 128 bytes. Oups.
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
107 The semantic of dcbz was changed, it always clear
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
108 32 bytes. so the function below will work, but will
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
109 be slow. So I fixed check_dcbz_effect to use dcbzl,
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
110 which is defined to clear a cache line (as dcbz before).
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
111 So we still can distinguish, and use dcbz (32 bytes)
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
112 or dcbzl (one cache line) as required.
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
113
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
114 see <http://developer.apple.com/technotes/tn/tn2087.html>
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
115 and <http://developer.apple.com/technotes/tn/tn2086.html>
1015
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
116 */
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
117 void clear_blocks_dcbz32_ppc(DCTELEM *blocks)
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
118 {
1352
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
119 POWERPC_PERF_DECLARE(powerpc_clear_blocks_dcbz32, 1);
1015
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
120 register int misal = ((unsigned long)blocks & 0x00000010);
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
121 register int i = 0;
1352
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
122 POWERPC_PERF_START_COUNT(powerpc_clear_blocks_dcbz32, 1);
1015
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
123 #if 1
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
124 if (misal) {
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
125 ((unsigned long*)blocks)[0] = 0L;
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
126 ((unsigned long*)blocks)[1] = 0L;
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
127 ((unsigned long*)blocks)[2] = 0L;
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
128 ((unsigned long*)blocks)[3] = 0L;
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
129 i += 16;
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
130 }
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
131 for ( ; i < sizeof(DCTELEM)*6*64 ; i += 32) {
1340
09b8fe0f0139 PPC fixes & clean-up patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1334
diff changeset
132 asm volatile("dcbz %0,%1" : : "b" (blocks), "r" (i) : "memory");
1015
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
133 }
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
134 if (misal) {
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
135 ((unsigned long*)blocks)[188] = 0L;
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
136 ((unsigned long*)blocks)[189] = 0L;
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
137 ((unsigned long*)blocks)[190] = 0L;
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
138 ((unsigned long*)blocks)[191] = 0L;
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
139 i += 16;
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
140 }
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
141 #else
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
142 memset(blocks, 0, sizeof(DCTELEM)*6*64);
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
143 #endif
1352
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
144 POWERPC_PERF_STOP_COUNT(powerpc_clear_blocks_dcbz32, 1);
1015
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
145 }
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
146
1334
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
147 /* same as above, when dcbzl clear a whole 128B cache line
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
148 i.e. the PPC970 aka G5 */
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
149 #ifndef NO_DCBZL
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
150 void clear_blocks_dcbz128_ppc(DCTELEM *blocks)
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
151 {
1352
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
152 POWERPC_PERF_DECLARE(powerpc_clear_blocks_dcbz128, 1);
1334
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
153 register int misal = ((unsigned long)blocks & 0x0000007f);
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
154 register int i = 0;
1352
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
155 POWERPC_PERF_START_COUNT(powerpc_clear_blocks_dcbz128, 1);
1334
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
156 #if 1
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
157 if (misal) {
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
158 // we could probably also optimize this case,
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
159 // but there's not much point as the machines
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
160 // aren't available yet (2003-06-26)
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
161 memset(blocks, 0, sizeof(DCTELEM)*6*64);
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
162 }
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
163 else
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
164 for ( ; i < sizeof(DCTELEM)*6*64 ; i += 128) {
1340
09b8fe0f0139 PPC fixes & clean-up patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1334
diff changeset
165 asm volatile("dcbzl %0,%1" : : "b" (blocks), "r" (i) : "memory");
1334
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
166 }
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
167 #else
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
168 memset(blocks, 0, sizeof(DCTELEM)*6*64);
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
169 #endif
1352
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
170 POWERPC_PERF_STOP_COUNT(powerpc_clear_blocks_dcbz128, 1);
1334
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
171 }
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
172 #else
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
173 void clear_blocks_dcbz128_ppc(DCTELEM *blocks)
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
174 {
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
175 memset(blocks, 0, sizeof(DCTELEM)*6*64);
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
176 }
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
177 #endif
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
178
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
179 #ifndef NO_DCBZL
1015
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
180 /* check dcbz report how many bytes are set to 0 by dcbz */
1334
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
181 /* update 24/06/2003 : replace dcbz by dcbzl to get
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
182 the intended effect (Apple "fixed" dcbz)
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
183 unfortunately this cannot be used unless the assembler
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
184 knows about dcbzl ... */
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
185 long check_dcbzl_effect(void)
1015
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
186 {
1033
b4172ff70d27 Altivec on non darwin systems patch by Romain Dolbeau
bellard
parents: 1024
diff changeset
187 register char *fakedata = (char*)av_malloc(1024);
1015
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
188 register char *fakedata_middle;
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
189 register long zero = 0;
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
190 register long i = 0;
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
191 long count = 0;
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
192
1033
b4172ff70d27 Altivec on non darwin systems patch by Romain Dolbeau
bellard
parents: 1024
diff changeset
193 if (!fakedata)
1015
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
194 {
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
195 return 0L;
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
196 }
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
197
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
198 fakedata_middle = (fakedata + 512);
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
199
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
200 memset(fakedata, 0xFF, 1024);
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
201
1340
09b8fe0f0139 PPC fixes & clean-up patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1334
diff changeset
202 /* below the constraint "b" seems to mean "Address base register"
09b8fe0f0139 PPC fixes & clean-up patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1334
diff changeset
203 in gcc-3.3 / RS/6000 speaks. seems to avoid using r0, so.... */
09b8fe0f0139 PPC fixes & clean-up patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1334
diff changeset
204 asm volatile("dcbzl %0, %1" : : "b" (fakedata_middle), "r" (zero));
1015
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
205
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
206 for (i = 0; i < 1024 ; i ++)
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
207 {
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
208 if (fakedata[i] == (char)0)
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
209 count++;
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
210 }
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
211
1033
b4172ff70d27 Altivec on non darwin systems patch by Romain Dolbeau
bellard
parents: 1024
diff changeset
212 av_free(fakedata);
1015
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
213
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
214 return count;
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
215 }
1334
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
216 #else
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
217 long check_dcbzl_effect(void)
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
218 {
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
219 return 0;
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
220 }
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
221 #endif
1015
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
222
1092
f59c3f66363b MpegEncContext.(i)dct_* -> DspContext.(i)dct_*
michaelni
parents: 1033
diff changeset
223 void dsputil_init_ppc(DSPContext* c, AVCodecContext *avctx)
638
0012f75c92bb altivec build tidyup patch by (Brian Foley <bfoley at compsoc dot nuigalway dot ie>)
michaelni
parents:
diff changeset
224 {
1334
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
225 // Common optimizations whether Altivec is available or not
828
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents: 748
diff changeset
226
1334
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
227 switch (check_dcbzl_effect()) {
1015
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
228 case 32:
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
229 c->clear_blocks = clear_blocks_dcbz32_ppc;
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
230 break;
1334
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
231 case 128:
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
232 c->clear_blocks = clear_blocks_dcbz128_ppc;
80c46c310a91 PPC970 patch + cpu-specific tuning support by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1092
diff changeset
233 break;
1015
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
234 default:
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
235 break;
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
236 }
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
237
1511
587258262aa5 recommit (of patch, as cvslog msg didnt apply cleanly)
michael
parents: 1352
diff changeset
238 #ifdef HAVE_ALTIVEC
638
0012f75c92bb altivec build tidyup patch by (Brian Foley <bfoley at compsoc dot nuigalway dot ie>)
michaelni
parents:
diff changeset
239 if (has_altivec()) {
894
a408778eff87 altivec accelerated v-resample patch by (Brian Foley <bfoley at compsoc dot nuigalway dot ie>)
michaelni
parents: 884
diff changeset
240 mm_flags |= MM_ALTIVEC;
a408778eff87 altivec accelerated v-resample patch by (Brian Foley <bfoley at compsoc dot nuigalway dot ie>)
michaelni
parents: 884
diff changeset
241
828
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents: 748
diff changeset
242 // Altivec specific optimisations
1708
dea5b2946999 interlaced motion estimation
michael
parents: 1578
diff changeset
243 c->pix_abs[0][1] = sad16_x2_altivec;
dea5b2946999 interlaced motion estimation
michael
parents: 1578
diff changeset
244 c->pix_abs[0][2] = sad16_y2_altivec;
dea5b2946999 interlaced motion estimation
michael
parents: 1578
diff changeset
245 c->pix_abs[0][3] = sad16_xy2_altivec;
dea5b2946999 interlaced motion estimation
michael
parents: 1578
diff changeset
246 c->pix_abs[0][0] = sad16_altivec;
dea5b2946999 interlaced motion estimation
michael
parents: 1578
diff changeset
247 c->pix_abs[1][0] = sad8_altivec;
dea5b2946999 interlaced motion estimation
michael
parents: 1578
diff changeset
248 c->sad[0]= sad16_altivec;
dea5b2946999 interlaced motion estimation
michael
parents: 1578
diff changeset
249 c->sad[1]= sad8_altivec;
878
6ea69518e5f7 altivec optimizations patch by (Brian Foley <bfoley at compsoc dot nuigalway dot ie>)
michaelni
parents: 856
diff changeset
250 c->pix_norm1 = pix_norm1_altivec;
981
8bec850dc9c7 altivec patches by Romain Dolbeau
bellard
parents: 978
diff changeset
251 c->sse[1]= sse8_altivec;
8bec850dc9c7 altivec patches by Romain Dolbeau
bellard
parents: 978
diff changeset
252 c->sse[0]= sse16_altivec;
856
3c6df37177dd * using DSPContext - so each codec could use its local (sub)set of CPU extension
kabi
parents: 828
diff changeset
253 c->pix_sum = pix_sum_altivec;
3c6df37177dd * using DSPContext - so each codec could use its local (sub)set of CPU extension
kabi
parents: 828
diff changeset
254 c->diff_pixels = diff_pixels_altivec;
3c6df37177dd * using DSPContext - so each codec could use its local (sub)set of CPU extension
kabi
parents: 828
diff changeset
255 c->get_pixels = get_pixels_altivec;
1024
9cc1031e1864 More AltiVec MC functions patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1015
diff changeset
256 // next one disabled as it's untested.
995
edc10966b081 altivec jumbo patch by (Romain Dolbeau <dolbeaur at club-internet dot fr>)
michaelni
parents: 981
diff changeset
257 #if 0
edc10966b081 altivec jumbo patch by (Romain Dolbeau <dolbeaur at club-internet dot fr>)
michaelni
parents: 981
diff changeset
258 c->add_bytes= add_bytes_altivec;
1024
9cc1031e1864 More AltiVec MC functions patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1015
diff changeset
259 #endif /* 0 */
1009
3b7cc8e4b83f AltiVec perf (take 2), plus a couple AltiVec functions by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 995
diff changeset
260 c->put_pixels_tab[0][0] = put_pixels16_altivec;
1352
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
261 /* the tow functions do the same thing, so use the same code */
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
262 c->put_no_rnd_pixels_tab[0][0] = put_pixels16_altivec;
1009
3b7cc8e4b83f AltiVec perf (take 2), plus a couple AltiVec functions by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 995
diff changeset
263 c->avg_pixels_tab[0][0] = avg_pixels16_altivec;
1015
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
264 // next one disabled as it's untested.
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
265 #if 0
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
266 c->avg_pixels_tab[1][0] = avg_pixels8_altivec;
1024
9cc1031e1864 More AltiVec MC functions patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1015
diff changeset
267 #endif /* 0 */
1015
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
268 c->put_pixels_tab[1][3] = put_pixels8_xy2_altivec;
1024
9cc1031e1864 More AltiVec MC functions patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1015
diff changeset
269 c->put_no_rnd_pixels_tab[1][3] = put_no_rnd_pixels8_xy2_altivec;
9cc1031e1864 More AltiVec MC functions patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1015
diff changeset
270 c->put_pixels_tab[0][3] = put_pixels16_xy2_altivec;
9cc1031e1864 More AltiVec MC functions patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1015
diff changeset
271 c->put_no_rnd_pixels_tab[0][3] = put_no_rnd_pixels16_xy2_altivec;
1015
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
272
995
edc10966b081 altivec jumbo patch by (Romain Dolbeau <dolbeaur at club-internet dot fr>)
michaelni
parents: 981
diff changeset
273 c->gmc1 = gmc1_altivec;
1092
f59c3f66363b MpegEncContext.(i)dct_* -> DspContext.(i)dct_*
michaelni
parents: 1033
diff changeset
274
1578
6a4cfc5f9f96 AltiVec optimized fdct patch by (James Klicman <james at klicman dot org>)
michael
parents: 1511
diff changeset
275 #ifdef CONFIG_ENCODERS
6a4cfc5f9f96 AltiVec optimized fdct patch by (James Klicman <james at klicman dot org>)
michael
parents: 1511
diff changeset
276 if (avctx->dct_algo == FF_DCT_AUTO ||
6a4cfc5f9f96 AltiVec optimized fdct patch by (James Klicman <james at klicman dot org>)
michael
parents: 1511
diff changeset
277 avctx->dct_algo == FF_DCT_ALTIVEC)
6a4cfc5f9f96 AltiVec optimized fdct patch by (James Klicman <james at klicman dot org>)
michael
parents: 1511
diff changeset
278 {
6a4cfc5f9f96 AltiVec optimized fdct patch by (James Klicman <james at klicman dot org>)
michael
parents: 1511
diff changeset
279 c->fdct = fdct_altivec;
6a4cfc5f9f96 AltiVec optimized fdct patch by (James Klicman <james at klicman dot org>)
michael
parents: 1511
diff changeset
280 }
6a4cfc5f9f96 AltiVec optimized fdct patch by (James Klicman <james at klicman dot org>)
michael
parents: 1511
diff changeset
281 #endif //CONFIG_ENCODERS
6a4cfc5f9f96 AltiVec optimized fdct patch by (James Klicman <james at klicman dot org>)
michael
parents: 1511
diff changeset
282
1092
f59c3f66363b MpegEncContext.(i)dct_* -> DspContext.(i)dct_*
michaelni
parents: 1033
diff changeset
283 if ((avctx->idct_algo == FF_IDCT_AUTO) ||
f59c3f66363b MpegEncContext.(i)dct_* -> DspContext.(i)dct_*
michaelni
parents: 1033
diff changeset
284 (avctx->idct_algo == FF_IDCT_ALTIVEC))
f59c3f66363b MpegEncContext.(i)dct_* -> DspContext.(i)dct_*
michaelni
parents: 1033
diff changeset
285 {
f59c3f66363b MpegEncContext.(i)dct_* -> DspContext.(i)dct_*
michaelni
parents: 1033
diff changeset
286 c->idct_put = idct_put_altivec;
f59c3f66363b MpegEncContext.(i)dct_* -> DspContext.(i)dct_*
michaelni
parents: 1033
diff changeset
287 c->idct_add = idct_add_altivec;
f59c3f66363b MpegEncContext.(i)dct_* -> DspContext.(i)dct_*
michaelni
parents: 1033
diff changeset
288 #ifndef ALTIVEC_USE_REFERENCE_C_CODE
f59c3f66363b MpegEncContext.(i)dct_* -> DspContext.(i)dct_*
michaelni
parents: 1033
diff changeset
289 c->idct_permutation_type = FF_TRANSPOSE_IDCT_PERM;
f59c3f66363b MpegEncContext.(i)dct_* -> DspContext.(i)dct_*
michaelni
parents: 1033
diff changeset
290 #else /* ALTIVEC_USE_REFERENCE_C_CODE */
f59c3f66363b MpegEncContext.(i)dct_* -> DspContext.(i)dct_*
michaelni
parents: 1033
diff changeset
291 c->idct_permutation_type = FF_NO_IDCT_PERM;
f59c3f66363b MpegEncContext.(i)dct_* -> DspContext.(i)dct_*
michaelni
parents: 1033
diff changeset
292 #endif /* ALTIVEC_USE_REFERENCE_C_CODE */
f59c3f66363b MpegEncContext.(i)dct_* -> DspContext.(i)dct_*
michaelni
parents: 1033
diff changeset
293 }
1024
9cc1031e1864 More AltiVec MC functions patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1015
diff changeset
294
1352
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
295 #ifdef POWERPC_PERFORMANCE_REPORT
1009
3b7cc8e4b83f AltiVec perf (take 2), plus a couple AltiVec functions by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 995
diff changeset
296 {
1352
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
297 int i, j;
1015
35cf2f4a0f8c PPC perf, PPC clear_block, AltiVec put_pixels8_xy2 patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1009
diff changeset
298 for (i = 0 ; i < powerpc_perf_total ; i++)
1009
3b7cc8e4b83f AltiVec perf (take 2), plus a couple AltiVec functions by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 995
diff changeset
299 {
1352
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
300 for (j = 0; j < POWERPC_NUM_PMC_ENABLED ; j++)
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
301 {
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
302 perfdata[j][i][powerpc_data_min] = (unsigned long long)0xFFFFFFFFFFFFFFFF;
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
303 perfdata[j][i][powerpc_data_max] = (unsigned long long)0x0000000000000000;
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
304 perfdata[j][i][powerpc_data_sum] = (unsigned long long)0x0000000000000000;
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
305 perfdata[j][i][powerpc_data_num] = (unsigned long long)0x0000000000000000;
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
306 }
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
307 }
1009
3b7cc8e4b83f AltiVec perf (take 2), plus a couple AltiVec functions by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 995
diff changeset
308 }
1352
e8ff4783f188 1) remove TBL support in PPC performance. It's much more useful to use the
michaelni
parents: 1340
diff changeset
309 #endif /* POWERPC_PERFORMANCE_REPORT */
638
0012f75c92bb altivec build tidyup patch by (Brian Foley <bfoley at compsoc dot nuigalway dot ie>)
michaelni
parents:
diff changeset
310 } else
1024
9cc1031e1864 More AltiVec MC functions patch by (Romain Dolbeau <dolbeau at irisa dot fr>)
michaelni
parents: 1015
diff changeset
311 #endif /* HAVE_ALTIVEC */
638
0012f75c92bb altivec build tidyup patch by (Brian Foley <bfoley at compsoc dot nuigalway dot ie>)
michaelni
parents:
diff changeset
312 {
828
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents: 748
diff changeset
313 // Non-AltiVec PPC optimisations
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents: 748
diff changeset
314
ace3ccd18dd2 Altivec Patch (Mark III) by (Dieter Shirley <dieters at schemasoft dot com>)
michaelni
parents: 748
diff changeset
315 // ... pending ...
638
0012f75c92bb altivec build tidyup patch by (Brian Foley <bfoley at compsoc dot nuigalway dot ie>)
michaelni
parents:
diff changeset
316 }
0012f75c92bb altivec build tidyup patch by (Brian Foley <bfoley at compsoc dot nuigalway dot ie>)
michaelni
parents:
diff changeset
317 }