Mercurial > audlegacy-plugins
annotate src/madplug/SFMT-sse2.h @ 2206:7c7e6f5c494e
added myself to authors
author | Eugene Zagidullin <e.asphyx@gmail.com> |
---|---|
date | Tue, 04 Dec 2007 02:11:02 +0300 |
parents | b8dd67ad7b86 |
children |
rev | changeset | author | parents | description |
---|---|---|---|---|
922 | 7e14701aef54 | yaz | | [svn] - replace random number generator in dithering code with SIMD-oriented Fast Mersenne Twister (SFMT). it reduces CPU load on SSE2 or AltiVec capable platform. |
1386 | b8dd67ad7b86 | Yoshiki Yazawa <yaz@cc.rim.or.jp> | 922 | update SFMT files to version 1.3. please let me know if it break on altivec box. |

/**
 * @file SFMT-sse2.h
 * @brief SIMD-oriented Fast Mersenne Twister (SFMT) for Intel SSE2
 *
 * @author Mutsuo Saito (Hiroshima University)
 * @author Makoto Matsumoto (Hiroshima University)
 *
 * @note We assume LITTLE ENDIAN in this file
 *
 * Copyright (C) 2006, 2007 Mutsuo Saito, Makoto Matsumoto and Hiroshima
 * University. All rights reserved.
 *
 * The new BSD License is applied to this software, see LICENSE.txt
 */
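The little-endian note matters because the code below reaches the same 128-bit words through a `.si` member, which suggests they are also viewed as 32-bit lanes elsewhere. A minimal sketch of a runtime byte-order check follows; `is_little_endian` is a hypothetical helper for illustration and is not part of SFMT.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of SFMT): returns 1 when the least
 * significant byte of a 32-bit word is stored at the lowest address. */
static int is_little_endian(void) {
    const uint32_t probe = 0x01020304u;
    unsigned char bytes[sizeof probe];

    /* memcpy avoids aliasing problems when inspecting the raw bytes. */
    memcpy(bytes, &probe, sizeof probe);
    return bytes[0] == 0x04;
}
```

A caller could assert on this once at start-up before relying on the lane layout.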

#ifndef SFMT_SSE2_H
#define SFMT_SSE2_H

/* SSE2 intrinsics (__m128i, _mm_*); harmless if the includer already has it. */
#include <emmintrin.h>

inline static __m128i mm_recursion(__m128i *a, __m128i *b, __m128i c,
                                   __m128i d, __m128i mask) ALWAYSINLINE;

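The prototype above carries an `ALWAYSINLINE` macro that this header does not define, so it has to come from the including file. The sketch below shows one common way such a macro is written; the definition is an assumption for illustration, the real one may differ, and `add_one` is a hypothetical function.

```c
#include <assert.h>

/* Assumption: a typical definition; the including file may use another. */
#if defined(__GNUC__)
#define ALWAYSINLINE __attribute__((always_inline))
#else
#define ALWAYSINLINE
#endif

/* The attribute sits on a separate prototype, as in the header above:
 * GCC rejects an attribute placed after the parameter list of a
 * function definition, but accepts it on a declaration. */
inline static int add_one(int x) ALWAYSINLINE;

inline static int add_one(int x) {
    return x + 1;
}
```

On compilers without the GCC attribute syntax the macro expands to nothing and the function is an ordinary `static inline`.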
/**
 * This function represents the recursion formula.
 * @param a a 128-bit part of the internal state array
 * @param b a 128-bit part of the internal state array
 * @param c a 128-bit part of the internal state array
 * @param d a 128-bit part of the internal state array
 * @param mask 128-bit mask
 * @return the newly generated 128-bit word
 */
inline static __m128i mm_recursion(__m128i *a, __m128i *b,
                                   __m128i c, __m128i d, __m128i mask) {
    __m128i v, x, y, z;

    x = _mm_load_si128(a);          /* a must be 16-byte aligned */
    y = _mm_srli_epi32(*b, SR1);    /* each 32-bit lane >> SR1 bits */
    z = _mm_srli_si128(c, SR2);     /* whole 128 bits >> SR2 bytes */
    v = _mm_slli_epi32(d, SL1);     /* each 32-bit lane << SL1 bits */
    z = _mm_xor_si128(z, x);
    z = _mm_xor_si128(z, v);
    x = _mm_slli_si128(x, SL2);     /* whole 128 bits << SL2 bytes */
    y = _mm_and_si128(y, mask);
    z = _mm_xor_si128(z, x);
    z = _mm_xor_si128(z, y);
    return z;
}
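To make the data flow of `mm_recursion` easier to follow, here is a portable scalar sketch of the same formula, r = a ^ (a << SL2 bytes) ^ ((b >> SR1 per lane) & mask) ^ (c >> SR2 bytes) ^ (d << SL1 per lane). The type `w128_ref` and the helpers `shl128`, `shr128` and `recursion_ref` are hypothetical names, and the parameter values are assumed to be the ones SFMT ships for the period 2^19937-1; the header itself only names the constants.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed parameter values (SFMT for period 2^19937-1). */
enum { SL1 = 18, SL2 = 1, SR1 = 11, SR2 = 1 };
/* MSK[0] corresponds to MSK1, the lowest lane of the SSE2 mask. */
static const uint32_t MSK[4] = {
    0xdfffffefU, 0xddfecb7fU, 0xbffaffffU, 0xbffffff6U
};

/* One 128-bit word as four 32-bit lanes, lane 0 least significant. */
typedef struct { uint32_t u[4]; } w128_ref;

/* Shift the whole 128-bit value left by 1..3 bytes, filling with zeros. */
static w128_ref shl128(w128_ref in, int bytes) {
    w128_ref out;
    int bits = bytes * 8;
    out.u[0] = in.u[0] << bits;
    for (int i = 1; i < 4; i++) {
        out.u[i] = (in.u[i] << bits) | (in.u[i - 1] >> (32 - bits));
    }
    return out;
}

/* Shift the whole 128-bit value right by 1..3 bytes, filling with zeros. */
static w128_ref shr128(w128_ref in, int bytes) {
    w128_ref out;
    int bits = bytes * 8;
    out.u[3] = in.u[3] >> bits;
    for (int i = 0; i < 3; i++) {
        out.u[i] = (in.u[i] >> bits) | (in.u[i + 1] << (32 - bits));
    }
    return out;
}

/* Scalar counterpart of mm_recursion, lane by lane. */
static w128_ref recursion_ref(w128_ref a, w128_ref b, w128_ref c, w128_ref d) {
    w128_ref x = shl128(a, SL2);
    w128_ref y = shr128(c, SR2);
    w128_ref r;
    for (int i = 0; i < 4; i++) {
        r.u[i] = a.u[i] ^ x.u[i] ^ ((b.u[i] >> SR1) & MSK[i])
               ^ y.u[i] ^ (d.u[i] << SL1);
    }
    return r;
}
```

The two byte shifts carry bits across lane boundaries, while the two bit shifts stay inside each 32-bit lane; that difference is why the SSE2 code mixes `_si128` and `_epi32` intrinsics.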

/**
 * This function fills the internal state array with pseudorandom
 * integers.
 */
inline static void gen_rand_all(void) {
    int i;
    __m128i r, r1, r2, mask;
    mask = _mm_set_epi32(MSK4, MSK3, MSK2, MSK1);

    r1 = _mm_load_si128(&sfmt[N - 2].si);
    r2 = _mm_load_si128(&sfmt[N - 1].si);
    for (i = 0; i < N - POS1; i++) {
        r = mm_recursion(&sfmt[i].si, &sfmt[i + POS1].si, r1, r2, mask);
        _mm_store_si128(&sfmt[i].si, r);
        r1 = r2;
        r2 = r;
    }
    for (; i < N; i++) {
        r = mm_recursion(&sfmt[i].si, &sfmt[i + POS1 - N].si, r1, r2, mask);
        _mm_store_si128(&sfmt[i].si, r);
        r1 = r2;
        r2 = r;
    }
}
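`gen_rand_all` treats the state as a circular buffer and splits the update into two loops so that the index `i + POS1` never needs a modulo. The sketch below checks that idea on a toy model: `N` and `POS1` are small made-up sizes, and `mix` is a placeholder standing in for `mm_recursion`, so this illustrates the loop structure only, not the real generator.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { N = 8, POS1 = 5 };   /* toy sizes, not SFMT's real parameters */

/* Placeholder mixing step standing in for mm_recursion. */
static uint32_t mix(uint32_t a, uint32_t b, uint32_t c, uint32_t d) {
    return a ^ (a << 8) ^ (b >> 11) ^ (c >> 8) ^ (d << 18);
}

/* Single loop with an explicit modulo on the second index. */
static void update_modulo(uint32_t s[N]) {
    uint32_t r1 = s[N - 2], r2 = s[N - 1];
    for (int i = 0; i < N; i++) {
        uint32_t r = mix(s[i], s[(i + POS1) % N], r1, r2);
        s[i] = r;
        r1 = r2;    /* r1, r2 track the two most recent outputs */
        r2 = r;
    }
}

/* Same update split into two loops, following gen_rand_all. */
static void update_split(uint32_t s[N]) {
    uint32_t r1 = s[N - 2], r2 = s[N - 1];
    int i;
    for (i = 0; i < N - POS1; i++) {        /* i + POS1 stays below N */
        uint32_t r = mix(s[i], s[i + POS1], r1, r2);
        s[i] = r;
        r1 = r2;
        r2 = r;
    }
    for (; i < N; i++) {                    /* index wraps to the front */
        uint32_t r = mix(s[i], s[i + POS1 - N], r1, r2);
        s[i] = r;
        r1 = r2;
        r2 = r;
    }
}
```

Both functions produce identical states; the split form simply trades one loop with a modulo for two loops with plain additions.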

/**
 * This function fills the user-specified array with pseudorandom
 * integers.
 *
 * @param array an array of 128-bit words to be filled with pseudorandom numbers.
 * @param size number of 128-bit pseudorandom numbers to be generated.
 */
inline static void gen_rand_array(w128_t *array, int size) {
    int i, j;
    __m128i r, r1, r2, mask;
    mask = _mm_set_epi32(MSK4, MSK3, MSK2, MSK1);

    r1 = _mm_load_si128(&sfmt[N - 2].si);
    r2 = _mm_load_si128(&sfmt[N - 1].si);
    /* first N words: inputs still come from the old state */
    for (i = 0; i < N - POS1; i++) {
        r = mm_recursion(&sfmt[i].si, &sfmt[i + POS1].si, r1, r2, mask);
        _mm_store_si128(&array[i].si, r);
        r1 = r2;
        r2 = r;
    }
    for (; i < N; i++) {
        r = mm_recursion(&sfmt[i].si, &array[i + POS1 - N].si, r1, r2, mask);
        _mm_store_si128(&array[i].si, r);
        r1 = r2;
        r2 = r;
    }
    /* main loop: inputs come only from earlier entries of array */
    for (; i < size - N; i++) {
        r = mm_recursion(&array[i - N].si, &array[i + POS1 - N].si, r1, r2,
                         mask);
        _mm_store_si128(&array[i].si, r);
        r1 = r2;
        r2 = r;
    }
    /* when size < 2 * N, part of the new state is already in array */
    for (j = 0; j < 2 * N - size; j++) {
        r = _mm_load_si128(&array[j + size - N].si);
        _mm_store_si128(&sfmt[j].si, r);
    }
    /* remaining words go to array and to the state */
    for (; i < size; i++) {
        r = mm_recursion(&array[i - N].si, &array[i + POS1 - N].si, r1, r2,
                         mask);
        _mm_store_si128(&array[i].si, r);
        _mm_store_si128(&sfmt[j++].si, r);
        r1 = r2;
        r2 = r;
    }
}
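Two properties of `gen_rand_array` can be read off its loops: the first two loops always write `array[0]` to `array[N - 1]`, so `size` must be at least `N`, and after the call the state holds the last `N` words written to `array`. Since `_mm_store_si128` requires 16-byte aligned addresses, `array` must be aligned as well. The sketch below checks the first two points on a scalar toy model; `N`, `POS1`, `state`, `mix`, `refresh_state` and `fill_array` are hypothetical stand-ins, not SFMT's real parameters or functions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { N = 8, POS1 = 5 };   /* toy sizes, not SFMT's real parameters */
static uint32_t state[N];   /* stands in for the sfmt[] array */

/* Placeholder mixing step standing in for mm_recursion. */
static uint32_t mix(uint32_t a, uint32_t b, uint32_t c, uint32_t d) {
    return a ^ (a << 8) ^ (b >> 11) ^ (c >> 8) ^ (d << 18);
}

/* In-place refresh of the whole state (the job of gen_rand_all). */
static void refresh_state(void) {
    uint32_t r1 = state[N - 2], r2 = state[N - 1];
    for (int i = 0; i < N; i++) {
        uint32_t r = mix(state[i], state[(i + POS1) % N], r1, r2);
        state[i] = r; r1 = r2; r2 = r;
    }
}

/* Bulk fill with the loop structure of gen_rand_array; needs size >= N. */
static void fill_array(uint32_t *array, int size) {
    int i, j;
    uint32_t r, r1 = state[N - 2], r2 = state[N - 1];
    for (i = 0; i < N - POS1; i++) {    /* inputs from the old state */
        r = mix(state[i], state[i + POS1], r1, r2);
        array[i] = r; r1 = r2; r2 = r;
    }
    for (; i < N; i++) {
        r = mix(state[i], array[i + POS1 - N], r1, r2);
        array[i] = r; r1 = r2; r2 = r;
    }
    for (; i < size - N; i++) {         /* inputs from array only */
        r = mix(array[i - N], array[i + POS1 - N], r1, r2);
        array[i] = r; r1 = r2; r2 = r;
    }
    for (j = 0; j < 2 * N - size; j++) {
        state[j] = array[j + size - N]; /* tail already generated */
    }
    for (; i < size; i++) {             /* written to array and state */
        r = mix(array[i - N], array[i + POS1 - N], r1, r2);
        array[i] = r; state[j++] = r; r1 = r2; r2 = r;
    }
}
```

In this model a bulk fill of `2 * N` or more words yields the same sequence as repeated state refreshes laid end to end, which is the behaviour the SSE2 loops are structured to preserve.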

#endif