# HG changeset patch
# User reimar
# Date 1350821438 0
# Node ID b924f0df5a1df6dc123f529eee19b8359d759fb1
# Parent bc0898c7399b967cdb9f8db6f88da8977ba90ee3
Remove our internal mp3lib copy.
We have FFmpeg as integrated decoder, and the external mpg123 library should include all important improvements from our mp3lib and is actually properly maintained.

diff -r bc0898c7399b -r b924f0df5a1d Makefile
--- a/Makefile Sun Oct 21 11:14:13 2012 +0000
+++ b/Makefile Sun Oct 21 12:10:38 2012 +0000
@@ -164,21 +164,6 @@
 SRCS_COMMON-$(MNG) += libmpdemux/demux_mng.c
 SRCS_COMMON-$(MPG123) += libmpcodecs/ad_mpg123.c

-SRCS_MP3LIB-X86-$(HAVE_AMD3DNOW) += mp3lib/dct36_3dnow.c \
-                                    mp3lib/dct64_3dnow.c
-SRCS_MP3LIB-X86-$(HAVE_AMD3DNOWEXT) += mp3lib/dct36_k7.c \
-                                       mp3lib/dct64_k7.c
-SRCS_MP3LIB-X86-$(HAVE_MMX) += mp3lib/dct64_mmx.c
-SRCS_MP3LIB-$(ARCH_X86_32) += mp3lib/decode_i586.c \
-                              $(SRCS_MP3LIB-X86-yes)
-SRCS_MP3LIB-$(HAVE_ALTIVEC) += mp3lib/dct64_altivec.c
-SRCS_MP3LIB-$(HAVE_MMX) += mp3lib/decode_mmx.c
-SRCS_MP3LIB-$(HAVE_SSE) += mp3lib/dct64_sse.c
-SRCS_MP3LIB += mp3lib/sr1.c \
-               $(SRCS_MP3LIB-yes)
-SRCS_COMMON-$(MP3LIB) += libmpcodecs/ad_mp3lib.c \
-                         $(SRCS_MP3LIB)
-
 SRCS_COMMON-$(MUSEPACK) += libmpcodecs/ad_mpc.c \
                            libmpdemux/demux_mpc.c
 SRCS_COMMON-$(NATIVE_RTSP) += stream/stream_rtsp.c \
@@ -1035,9 +1020,7 @@
 loader/qtx/list$(EXESUF) loader/qtx/qtxload$(EXESUF): CFLAGS += -g
 loader/qtx/list$(EXESUF) loader/qtx/qtxload$(EXESUF): $(LOADER_TEST_OBJS)

-mp3lib/test$(EXESUF) mp3lib/test2$(EXESUF): $(SRCS_MP3LIB:.c=.o) libvo/aclib.o cpudetect.o $(TEST_OBJS)
-
-TESTS = codecs2html codec-cfg-test libvo/aspecttest mp3lib/test mp3lib/test2
+TESTS = codecs2html codec-cfg-test libvo/aspecttest

 ifdef ARCH_X86_32
 TESTS += loader/qtx/list loader/qtx/qtxload
diff -r bc0898c7399b -r b924f0df5a1d configure
--- a/configure Sun Oct 21 11:14:13 2012 +0000
+++ b/configure Sun Oct 21 12:10:38 2012 +0000
@@ -449,7 +449,6 @@
   --disable-twolame         disable Twolame (MPEG layer 2) encoding [autodetect]
   --enable-xmms             enable XMMS input plugin support [disabled]
   --enable-libdca           enable libdca support [autodetect]
-  --disable-mp3lib          disable builtin mp3lib [autodetect]
   --disable-liba52          disable liba52 [autodetect]
   --disable-libmpeg2        disable libmpeg2 [autodetect]
   --disable-libmpeg2-internal disable builtin libmpeg2 [autodetect]
@@ -757,7 +756,6 @@
 _libgsm=auto
 _theora=auto
 _mpg123=auto
-_mp3lib=auto
 _liba52=auto
 _libdca=auto
 _libmpeg2=auto
@@ -1156,8 +1154,6 @@
   --disable-theora) _theora=no ;;
   --enable-mpg123) _mpg123=yes ;;
   --disable-mpg123) _mpg123=no ;;
-  --enable-mp3lib) _mp3lib=yes ;;
-  --disable-mp3lib) _mp3lib=no ;;
   --enable-liba52) _liba52=yes ;;
   --disable-liba52) _liba52=no ;;
   --enable-libdca) _libdca=yes ;;
@@ -6410,19 +6406,6 @@
 fi
 echores "$_theora"

-echocheck "mp3lib support"
-if test "$_mp3lib" = auto ; then
-  test "$cc_vendor" = intel && test "$_cc_major" -le 10 -o "$_cc_major" -eq 11 -a "$_cc_minor" -eq 0 && _mp3lib=no || _mp3lib=yes
-fi
-if test "$_mp3lib" = yes ; then
-  def_mp3lib='#define CONFIG_MP3LIB 1'
-  codecmodules="mp3lib(internal) $codecmodules"
-else
-  def_mp3lib='#undef CONFIG_MP3LIB'
-  nocodecmodules="mp3lib(internal) $nocodecmodules"
-fi
-echores "$_mp3lib"
-
 # Any version of libmpg123 that knows MPG123_RESYNC_LIMIT shall be fine.
 # That is, 1.2.0 onwards. Recommened is 1.14 onwards, though.
 echocheck "mpg123 support"
@@ -8251,7 +8234,6 @@
 MGA = $_mga
 MNG = $_mng
 MP3LAME = $_mp3lame
-MP3LIB = $_mp3lib
 MPG123 = $_mpg123
 MUSEPACK = $_musepack
 NAS = $_nas
@@ -8636,7 +8618,6 @@
 $def_mp3lame
 $def_mp3lame_preset
 $def_mp3lame_preset_medium
-$def_mp3lib
 $def_mpg123
 $def_musepack
 $def_speex
diff -r bc0898c7399b -r b924f0df5a1d etc/codecs.conf
--- a/etc/codecs.conf Sun Oct 21 11:14:13 2012 +0000
+++ b/etc/codecs.conf Sun Oct 21 12:10:38 2012 +0000
@@ -5048,25 +5048,10 @@
   driver ffmpeg
   dll "sonic"

-audiocodec mp3
-  ; this is preferred over ffmp2/ffmp3 since it is faster due to using
-  ; floating point and there are even broken mkv files where the audio
-  ; needs to be parsed, making this codec work more reliably
-  info "mp3lib MPEG layer-2, layer-3"
-  status buggy
-  comment "Barely maintained, miscompiles with newer gcc versions"
-  format 0x50 ; layer-1 && layer-2
-  format 0x55 ; layer-3
-  format 0x5500736d ; "ms\0\x55" older mp3 fcc (MOV files)
-  format 0x5000736d ; "ms\0\x50" older mp2 fcc (MOV files)
-  format 0x55005354 ; broken file
-  fourcc ".mp3" ; CBR/VBR MP3 (MOV files)
-  fourcc "MP3 " ; used in .nsv files
-  fourcc "LAME" ; used in mythtv .nuv files
-  driver mp3lib
-
 audiocodec mpg123
   ; this is preferred over ffmp2/ffmp3 since it is faster, generally
+  ; and there are even broken mkv files where the audio
+  ; needs to be parsed, making this codec work more reliably
   info "MPEG 1.0/2.0/2.5 layers I, II, III"
   status working
   comment "High-performance decoder using libmpg123."
diff -r bc0898c7399b -r b924f0df5a1d libmpcodecs/ad.c
--- a/libmpcodecs/ad.c Sun Oct 21 11:14:13 2012 +0000
+++ b/libmpcodecs/ad.c Sun Oct 21 12:10:38 2012 +0000
@@ -66,9 +66,6 @@
 #ifdef CONFIG_MPG123
   &mpcodecs_ad_mpg123,
 #endif
-#ifdef CONFIG_MP3LIB
-  &mpcodecs_ad_mp3lib,
-#endif
 #ifdef CONFIG_LIBA52
   &mpcodecs_ad_liba52,
 #endif
diff -r bc0898c7399b -r b924f0df5a1d libmpcodecs/ad_mp3lib.c
--- a/libmpcodecs/ad_mp3lib.c Sun Oct 21 11:14:13 2012 +0000
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,99 +0,0 @@
-/*
- * This file is part of MPlayer.
- *
- * MPlayer is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- *
- * MPlayer is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License along
- * with MPlayer; if not, write to the Free Software Foundation, Inc.,
- * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
- */
-
-#include
-#include
-#include
-
-#include "config.h"
-
-#include "ad_internal.h"
-#include "dec_audio.h"
-#include "ad_mp3lib.h"
-
-static const ad_info_t info =
-{
-  "MPEG layer-2, layer-3",
-  "mp3lib",
-  "Nick Kurshev",
-  "mpg123",
-  "Optimized to MMX/SSE/3Dnow!"
-};
-
-LIBAD_EXTERN(mp3lib)
-
-#include "mp3lib/mp3.h"
-
-static sh_audio_t* dec_audio_sh=NULL;
-
-// MP3 decoder buffer callback:
-int mplayer_audio_read(char *buf,int size){
-  return demux_read_data(dec_audio_sh->ds,buf,size);
-}
-
-static int preinit(sh_audio_t *sh)
-{
-  sh->audio_out_minsize=32*36*2*2; //4608;
-  return 1;
-}
-
-static int init(sh_audio_t *sh)
-{
-  // MPEG Audio:
-  dec_audio_sh=sh; // save sh_audio for the callback:
-//  MP3_Init(fakemono,mplayer_accel,&mplayer_audio_read); // TODO!!!
-#ifdef CONFIG_FAKE_MONO
-  MP3_Init(fakemono);
-#else
-  MP3_Init();
-#endif
-  MP3_samplerate=MP3_channels=0;
-  sh->a_buffer_len=MP3_DecodeFrame(sh->a_buffer,-1);
-  if(!sh->a_buffer_len) return 0; // unsupported layer/format
-  sh->channels=2; // hack
-  sh->samplesize=2;
-  sh->samplerate=MP3_samplerate;
-  sh->i_bps=MP3_bitrate*(1000/8);
-  MP3_PrintHeader();
-  return 1;
-}
-
-static void uninit(sh_audio_t *sh)
-{
-}
-
-static int control(sh_audio_t *sh,int cmd,void* arg, ...)
-{
-  switch(cmd)
-  {
-    case ADCTRL_RESYNC_STREAM:
-      MP3_DecodeFrame(NULL,-2); // resync
-      MP3_DecodeFrame(NULL,-2); // resync
-      MP3_DecodeFrame(NULL,-2); // resync
-      return CONTROL_TRUE;
-    case ADCTRL_SKIP_FRAME:
-      MP3_DecodeFrame(NULL,-2); // skip MPEG frame
-      return CONTROL_TRUE;
-  }
-  return CONTROL_UNKNOWN;
-}
-
-static int decode_audio(sh_audio_t *sh_audio,unsigned char *buf,int minlen,int maxlen)
-{
-  return MP3_DecodeFrame(buf,-1);
-}
diff -r bc0898c7399b -r b924f0df5a1d libmpcodecs/ad_mp3lib.h
--- a/libmpcodecs/ad_mp3lib.h Sun Oct 21 11:14:13 2012 +0000
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,24 +0,0 @@
-/*
- * This file is part of MPlayer.
- *
- * MPlayer is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- * - * MPlayer is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. - * - * You should have received a copy of the GNU General Public License along - * with MPlayer; if not, write to the Free Software Foundation, Inc., - * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. - */ - -#ifndef MPLAYER_AD_MP3LIB_H -#define MPLAYER_AD_MP3LIB_H - -int mplayer_audio_read(char *buf, int size); - -#endif /* MPLAYER_AD_MP3LIB_H */ diff -r bc0898c7399b -r b924f0df5a1d mp3lib/dct12.c --- a/mp3lib/dct12.c Sun Oct 21 11:14:13 2012 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,139 +0,0 @@ -/* - * new DCT12 - */ -static void dct12(real *in,real *rawout1,real *rawout2,register real *wi,register real *ts) -{ -#define DCT12_PART1 \ - in5 = in[5*3]; \ - in5 += (in4 = in[4*3]); \ - in4 += (in3 = in[3*3]); \ - in3 += (in2 = in[2*3]); \ - in2 += (in1 = in[1*3]); \ - in1 += (in0 = in[0*3]); \ - \ - in5 += in3; in3 += in1; \ - \ - in2 *= COS6_1; \ - in3 *= COS6_1; \ - -#define DCT12_PART2 \ - in0 += in4 * COS6_2; \ - \ - in4 = in0 + in2; \ - in0 -= in2; \ - \ - in1 += in5 * COS6_2; \ - \ - in5 = (in1 + in3) * tfcos12[0]; \ - in1 = (in1 - in3) * tfcos12[2]; \ - \ - in3 = in4 + in5; \ - in4 -= in5; \ - \ - in2 = in0 + in1; \ - in0 -= in1; - - - { - real in0,in1,in2,in3,in4,in5; - register real *out1 = rawout1; - ts[SBLIMIT*0] = out1[0]; ts[SBLIMIT*1] = out1[1]; ts[SBLIMIT*2] = out1[2]; - ts[SBLIMIT*3] = out1[3]; ts[SBLIMIT*4] = out1[4]; ts[SBLIMIT*5] = out1[5]; - - DCT12_PART1 - - { - real tmp0,tmp1 = (in0 - in4); - { - real tmp2 = (in1 - in5) * tfcos12[1]; - tmp0 = tmp1 + tmp2; - tmp1 -= tmp2; - } - ts[(17-1)*SBLIMIT] = out1[17-1] + tmp0 * wi[11-1]; - ts[(12+1)*SBLIMIT] = out1[12+1] + tmp0 * wi[6+1]; - ts[(6 +1)*SBLIMIT] = out1[6 +1] + tmp1 * wi[1]; - ts[(11-1)*SBLIMIT] = out1[11-1] + tmp1 * 
wi[5-1]; - } - - DCT12_PART2 - - ts[(17-0)*SBLIMIT] = out1[17-0] + in2 * wi[11-0]; - ts[(12+0)*SBLIMIT] = out1[12+0] + in2 * wi[6+0]; - ts[(12+2)*SBLIMIT] = out1[12+2] + in3 * wi[6+2]; - ts[(17-2)*SBLIMIT] = out1[17-2] + in3 * wi[11-2]; - - ts[(6+0)*SBLIMIT] = out1[6+0] + in0 * wi[0]; - ts[(11-0)*SBLIMIT] = out1[11-0] + in0 * wi[5-0]; - ts[(6+2)*SBLIMIT] = out1[6+2] + in4 * wi[2]; - ts[(11-2)*SBLIMIT] = out1[11-2] + in4 * wi[5-2]; - } - - in++; - - { - real in0,in1,in2,in3,in4,in5; - register real *out2 = rawout2; - - DCT12_PART1 - - { - real tmp0,tmp1 = (in0 - in4); - { - real tmp2 = (in1 - in5) * tfcos12[1]; - tmp0 = tmp1 + tmp2; - tmp1 -= tmp2; - } - out2[5-1] = tmp0 * wi[11-1]; - out2[0+1] = tmp0 * wi[6+1]; - ts[(12+1)*SBLIMIT] += tmp1 * wi[1]; - ts[(17-1)*SBLIMIT] += tmp1 * wi[5-1]; - } - - DCT12_PART2 - - out2[5-0] = in2 * wi[11-0]; - out2[0+0] = in2 * wi[6+0]; - out2[0+2] = in3 * wi[6+2]; - out2[5-2] = in3 * wi[11-2]; - - ts[(12+0)*SBLIMIT] += in0 * wi[0]; - ts[(17-0)*SBLIMIT] += in0 * wi[5-0]; - ts[(12+2)*SBLIMIT] += in4 * wi[2]; - ts[(17-2)*SBLIMIT] += in4 * wi[5-2]; - } - - in++; - - { - real in0,in1,in2,in3,in4,in5; - register real *out2 = rawout2; - out2[12]=out2[13]=out2[14]=out2[15]=out2[16]=out2[17]=0.0; - - DCT12_PART1 - - { - real tmp0,tmp1 = (in0 - in4); - { - real tmp2 = (in1 - in5) * tfcos12[1]; - tmp0 = tmp1 + tmp2; - tmp1 -= tmp2; - } - out2[11-1] = tmp0 * wi[11-1]; - out2[6 +1] = tmp0 * wi[6+1]; - out2[0+1] += tmp1 * wi[1]; - out2[5-1] += tmp1 * wi[5-1]; - } - - DCT12_PART2 - - out2[11-0] = in2 * wi[11-0]; - out2[6 +0] = in2 * wi[6+0]; - out2[6 +2] = in3 * wi[6+2]; - out2[11-2] = in3 * wi[11-2]; - - out2[0+0] += in0 * wi[0]; - out2[5-0] += in0 * wi[5-0]; - out2[0+2] += in4 * wi[2]; - out2[5-2] += in4 * wi[5-2]; - } -} diff -r bc0898c7399b -r b924f0df5a1d mp3lib/dct36.c --- a/mp3lib/dct36.c Sun Oct 21 11:14:13 2012 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,269 +0,0 @@ -/* - * Modified for use with MPlayer, for details see the 
changelog at - * http://svn.mplayerhq.hu/mplayer/trunk/ - * $Id$ - */ - -/* -// This is an optimized DCT from Jeff Tsay's maplay 1.2+ package. -// Saved one multiplication by doing the 'twiddle factor' stuff -// together with the window mul. (MH) -// -// This uses Byeong Gi Lee's Fast Cosine Transform algorithm, but the -// 9 point IDCT needs to be reduced further. Unfortunately, I don't -// know how to do that, because 9 is not an even number. - Jeff. -// -////////////////////////////////////////////////////////////////// -// -// 9 Point Inverse Discrete Cosine Transform -// -// This piece of code is Copyright 1997 Mikko Tommila and is freely usable -// by anybody. The algorithm itself is of course in the public domain. -// -// Again derived heuristically from the 9-point WFTA. -// -// The algorithm is optimized (?) for speed, not for small rounding errors or -// good readability. -// -// 36 additions, 11 multiplications -// -// Again this is very likely sub-optimal. -// -// The code is optimized to use a minimum number of temporary variables, -// so it should compile quite well even on 8-register Intel x86 processors. -// This makes the code quite obfuscated and very difficult to understand. -// -// References: -// [1] S. 
Winograd: "On Computing the Discrete Fourier Transform", -// Mathematics of Computation, Volume 32, Number 141, January 1978, -// Pages 175-199 -*/ - -/*------------------------------------------------------------------*/ -/* */ -/* Function: Calculation of the inverse MDCT */ -/* */ -/*------------------------------------------------------------------*/ - -static void dct36(real *inbuf,real *o1,real *o2,real *wintab,real *tsbuf) -{ -#ifdef NEW_DCT9 - real tmp[18]; -#endif - - { - register real *in = inbuf; - - in[17]+=in[16]; in[16]+=in[15]; in[15]+=in[14]; - in[14]+=in[13]; in[13]+=in[12]; in[12]+=in[11]; - in[11]+=in[10]; in[10]+=in[9]; in[9] +=in[8]; - in[8] +=in[7]; in[7] +=in[6]; in[6] +=in[5]; - in[5] +=in[4]; in[4] +=in[3]; in[3] +=in[2]; - in[2] +=in[1]; in[1] +=in[0]; - - in[17]+=in[15]; in[15]+=in[13]; in[13]+=in[11]; in[11]+=in[9]; - in[9] +=in[7]; in[7] +=in[5]; in[5] +=in[3]; in[3] +=in[1]; - - -#ifdef NEW_DCT9 - { - real t0, t1, t2, t3, t4, t5, t6, t7; - - t1 = COS6_2 * in[12]; - t2 = COS6_2 * (in[8] + in[16] - in[4]); - - t3 = in[0] + t1; - t4 = in[0] - t1 - t1; - t5 = t4 - t2; - - t0 = cos9[0] * (in[4] + in[8]); - t1 = cos9[1] * (in[8] - in[16]); - - tmp[4] = t4 + t2 + t2; - t2 = cos9[2] * (in[4] + in[16]); - - t6 = t3 - t0 - t2; - t0 += t3 + t1; - t3 += t2 - t1; - - t2 = cos18[0] * (in[2] + in[10]); - t4 = cos18[1] * (in[10] - in[14]); - t7 = COS6_1 * in[6]; - - t1 = t2 + t4 + t7; - tmp[0] = t0 + t1; - tmp[8] = t0 - t1; - t1 = cos18[2] * (in[2] + in[14]); - t2 += t1 - t7; - - tmp[3] = t3 + t2; - t0 = COS6_1 * (in[10] + in[14] - in[2]); - tmp[5] = t3 - t2; - - t4 -= t1 + t7; - - tmp[1] = t5 - t0; - tmp[7] = t5 + t0; - tmp[2] = t6 + t4; - tmp[6] = t6 - t4; - } - - { - real t0, t1, t2, t3, t4, t5, t6, t7; - - t1 = COS6_2 * in[13]; - t2 = COS6_2 * (in[9] + in[17] - in[5]); - - t3 = in[1] + t1; - t4 = in[1] - t1 - t1; - t5 = t4 - t2; - - t0 = cos9[0] * (in[5] + in[9]); - t1 = cos9[1] * (in[9] - in[17]); - - tmp[13] = (t4 + t2 + t2) * tfcos36[17-13]; - 
t2 = cos9[2] * (in[5] + in[17]); - - t6 = t3 - t0 - t2; - t0 += t3 + t1; - t3 += t2 - t1; - - t2 = cos18[0] * (in[3] + in[11]); - t4 = cos18[1] * (in[11] - in[15]); - t7 = COS6_1 * in[7]; - - t1 = t2 + t4 + t7; - tmp[17] = (t0 + t1) * tfcos36[17-17]; - tmp[9] = (t0 - t1) * tfcos36[17-9]; - t1 = cos18[2] * (in[3] + in[15]); - t2 += t1 - t7; - - tmp[14] = (t3 + t2) * tfcos36[17-14]; - t0 = COS6_1 * (in[11] + in[15] - in[3]); - tmp[12] = (t3 - t2) * tfcos36[17-12]; - - t4 -= t1 + t7; - - tmp[16] = (t5 - t0) * tfcos36[17-16]; - tmp[10] = (t5 + t0) * tfcos36[17-10]; - tmp[15] = (t6 + t4) * tfcos36[17-15]; - tmp[11] = (t6 - t4) * tfcos36[17-11]; - } - -#define MACRO(v) { \ - real tmpval; \ - real sum0 = tmp[(v)]; \ - real sum1 = tmp[17-(v)]; \ - out2[9+(v)] = (tmpval = sum0 + sum1) * w[27+(v)]; \ - out2[8-(v)] = tmpval * w[26-(v)]; \ - sum0 -= sum1; \ - ts[SBLIMIT*(8-(v))] = out1[8-(v)] + sum0 * w[8-(v)]; \ - ts[SBLIMIT*(9+(v))] = out1[9+(v)] + sum0 * w[9+(v)]; } - -{ - register real *out2 = o2; - register real *w = wintab; - register real *out1 = o1; - register real *ts = tsbuf; - - MACRO(0); - MACRO(1); - MACRO(2); - MACRO(3); - MACRO(4); - MACRO(5); - MACRO(6); - MACRO(7); - MACRO(8); -} - -#else - - { - -#define MACRO0(v) { \ - real tmp; \ - out2[9+(v)] = (tmp = sum0 + sum1) * w[27+(v)]; \ - out2[8-(v)] = tmp * w[26-(v)]; } \ - sum0 -= sum1; \ - ts[SBLIMIT*(8-(v))] = out1[8-(v)] + sum0 * w[8-(v)]; \ - ts[SBLIMIT*(9+(v))] = out1[9+(v)] + sum0 * w[9+(v)]; -#define MACRO1(v) { \ - real sum0, sum1; \ - sum0 = tmp1a + tmp2a; \ - sum1 = (tmp1b + tmp2b) * tfcos36[(v)]; \ - MACRO0(v); } -#define MACRO2(v) { \ - real sum0, sum1; \ - sum0 = tmp2a - tmp1a; \ - sum1 = (tmp2b - tmp1b) * tfcos36[(v)]; \ - MACRO0(v); } - - register const real *c = COS9; - register real *out2 = o2; - register real *w = wintab; - register real *out1 = o1; - register real *ts = tsbuf; - - real ta33,ta66,tb33,tb66; - - ta33 = in[2*3+0] * c[3]; - ta66 = in[2*6+0] * c[6]; - tb33 = in[2*3+1] * c[3]; - 
tb66 = in[2*6+1] * c[6]; - - { - real tmp1a,tmp2a,tmp1b,tmp2b; - tmp1a = in[2*1+0] * c[1] + ta33 + in[2*5+0] * c[5] + in[2*7+0] * c[7]; - tmp1b = in[2*1+1] * c[1] + tb33 + in[2*5+1] * c[5] + in[2*7+1] * c[7]; - tmp2a = in[2*0+0] + in[2*2+0] * c[2] + in[2*4+0] * c[4] + ta66 + in[2*8+0] * c[8]; - tmp2b = in[2*0+1] + in[2*2+1] * c[2] + in[2*4+1] * c[4] + tb66 + in[2*8+1] * c[8]; - - MACRO1(0); - MACRO2(8); - } - - { - real tmp1a,tmp2a,tmp1b,tmp2b; - tmp1a = ( in[2*1+0] - in[2*5+0] - in[2*7+0] ) * c[3]; - tmp1b = ( in[2*1+1] - in[2*5+1] - in[2*7+1] ) * c[3]; - tmp2a = ( in[2*2+0] - in[2*4+0] - in[2*8+0] ) * c[6] - in[2*6+0] + in[2*0+0]; - tmp2b = ( in[2*2+1] - in[2*4+1] - in[2*8+1] ) * c[6] - in[2*6+1] + in[2*0+1]; - - MACRO1(1); - MACRO2(7); - } - - { - real tmp1a,tmp2a,tmp1b,tmp2b; - tmp1a = in[2*1+0] * c[5] - ta33 - in[2*5+0] * c[7] + in[2*7+0] * c[1]; - tmp1b = in[2*1+1] * c[5] - tb33 - in[2*5+1] * c[7] + in[2*7+1] * c[1]; - tmp2a = in[2*0+0] - in[2*2+0] * c[8] - in[2*4+0] * c[2] + ta66 + in[2*8+0] * c[4]; - tmp2b = in[2*0+1] - in[2*2+1] * c[8] - in[2*4+1] * c[2] + tb66 + in[2*8+1] * c[4]; - - MACRO1(2); - MACRO2(6); - } - - { - real tmp1a,tmp2a,tmp1b,tmp2b; - tmp1a = in[2*1+0] * c[7] - ta33 + in[2*5+0] * c[1] - in[2*7+0] * c[5]; - tmp1b = in[2*1+1] * c[7] - tb33 + in[2*5+1] * c[1] - in[2*7+1] * c[5]; - tmp2a = in[2*0+0] - in[2*2+0] * c[4] + in[2*4+0] * c[8] + ta66 - in[2*8+0] * c[2]; - tmp2b = in[2*0+1] - in[2*2+1] * c[4] + in[2*4+1] * c[8] + tb66 - in[2*8+1] * c[2]; - - MACRO1(3); - MACRO2(5); - } - - { - real sum0,sum1; - sum0 = in[2*0+0] - in[2*2+0] + in[2*4+0] - in[2*6+0] + in[2*8+0]; - sum1 = (in[2*0+1] - in[2*2+1] + in[2*4+1] - in[2*6+1] + in[2*8+1] ) * tfcos36[4]; - MACRO0(4); - } - } -#endif - - } -} diff -r bc0898c7399b -r b924f0df5a1d mp3lib/dct36_3dnow.c --- a/mp3lib/dct36_3dnow.c Sun Oct 21 11:14:13 2012 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,502 +0,0 @@ -/* - * dct36_3dnow.c - 3DNow! 
optimized dct36() - * - * This code based 'dct36_3dnow.s' by Syuuhei Kashiyama - * , only two types of changes have been made: - * - * - removed PREFETCH instruction for speedup - * - changed function name for support 3DNow! automatic detection - * - * You can find Kashiyama's original 3dnow! support patch - * (for mpg123-0.59o) at - * http://user.ecc.u-tokyo.ac.jp/~g810370/linux-simd/ (Japanese). - * - * by KIMURA Takuhiro - until 31.Mar.1999 - * - after 1.Apr.1999 - * - * Modified for use with MPlayer, for details see the changelog at - * http://svn.mplayerhq.hu/mplayer/trunk/ - * $Id$ - * - * Original disclaimer: - * The author of this program disclaim whole expressed or implied - * warranties with regard to this program, and in no event shall the - * author of this program liable to whatever resulted from the use of - * this program. Use it at your own risk. - * - * 2003/06/21: Moved to GCC inline assembly - Alex Beregszaszi - */ - -#include "config.h" -#include "mangle.h" -#include "mpg123.h" -#include "libavutil/x86_cpu.h" - -#ifdef DCT36_OPTIMIZE_FOR_K7 -void dct36_3dnowex(real *inbuf, real *o1, - real *o2, real *wintab, real *tsbuf) -#else -void dct36_3dnow(real *inbuf, real *o1, - real *o2, real *wintab, real *tsbuf) -#endif -{ - __asm__ volatile( - "movq (%%"REG_a"),%%mm0\n\t" - "movq 4(%%"REG_a"),%%mm1\n\t" - "pfadd %%mm1,%%mm0\n\t" - "movq %%mm0,4(%%"REG_a")\n\t" - "psrlq $32,%%mm1\n\t" - "movq 12(%%"REG_a"),%%mm2\n\t" - "punpckldq %%mm2,%%mm1\n\t" - "pfadd %%mm2,%%mm1\n\t" - "movq %%mm1,12(%%"REG_a")\n\t" - "psrlq $32,%%mm2\n\t" - "movq 20(%%"REG_a"),%%mm3\n\t" - "punpckldq %%mm3,%%mm2\n\t" - "pfadd %%mm3,%%mm2\n\t" - "movq %%mm2,20(%%"REG_a")\n\t" - "psrlq $32,%%mm3\n\t" - "movq 28(%%"REG_a"),%%mm4\n\t" - "punpckldq %%mm4,%%mm3\n\t" - "pfadd %%mm4,%%mm3\n\t" - "movq %%mm3,28(%%"REG_a")\n\t" - "psrlq $32,%%mm4\n\t" - "movq 36(%%"REG_a"),%%mm5\n\t" - "punpckldq %%mm5,%%mm4\n\t" - "pfadd %%mm5,%%mm4\n\t" - "movq %%mm4,36(%%"REG_a")\n\t" - "psrlq 
$32,%%mm5\n\t" - "movq 44(%%"REG_a"),%%mm6\n\t" - "punpckldq %%mm6,%%mm5\n\t" - "pfadd %%mm6,%%mm5\n\t" - "movq %%mm5,44(%%"REG_a")\n\t" - "psrlq $32,%%mm6\n\t" - "movq 52(%%"REG_a"),%%mm7\n\t" - "punpckldq %%mm7,%%mm6\n\t" - "pfadd %%mm7,%%mm6\n\t" - "movq %%mm6,52(%%"REG_a")\n\t" - "psrlq $32,%%mm7\n\t" - "movq 60(%%"REG_a"),%%mm0\n\t" - "punpckldq %%mm0,%%mm7\n\t" - "pfadd %%mm0,%%mm7\n\t" - "movq %%mm7,60(%%"REG_a")\n\t" - "psrlq $32,%%mm0\n\t" - "movd 68(%%"REG_a"),%%mm1\n\t" - "pfadd %%mm1,%%mm0\n\t" - "movd %%mm0,68(%%"REG_a")\n\t" - "movd 4(%%"REG_a"),%%mm0\n\t" - "movd 12(%%"REG_a"),%%mm1\n\t" - "punpckldq %%mm1,%%mm0\n\t" - "punpckldq 20(%%"REG_a"),%%mm1\n\t" - "pfadd %%mm1,%%mm0\n\t" - "movd %%mm0,12(%%"REG_a")\n\t" - "psrlq $32,%%mm0\n\t" - "movd %%mm0,20(%%"REG_a")\n\t" - "psrlq $32,%%mm1\n\t" - "movd 28(%%"REG_a"),%%mm2\n\t" - "punpckldq %%mm2,%%mm1\n\t" - "punpckldq 36(%%"REG_a"),%%mm2\n\t" - "pfadd %%mm2,%%mm1\n\t" - "movd %%mm1,28(%%"REG_a")\n\t" - "psrlq $32,%%mm1\n\t" - "movd %%mm1,36(%%"REG_a")\n\t" - "psrlq $32,%%mm2\n\t" - "movd 44(%%"REG_a"),%%mm3\n\t" - "punpckldq %%mm3,%%mm2\n\t" - "punpckldq 52(%%"REG_a"),%%mm3\n\t" - "pfadd %%mm3,%%mm2\n\t" - "movd %%mm2,44(%%"REG_a")\n\t" - "psrlq $32,%%mm2\n\t" - "movd %%mm2,52(%%"REG_a")\n\t" - "psrlq $32,%%mm3\n\t" - "movd 60(%%"REG_a"),%%mm4\n\t" - "punpckldq %%mm4,%%mm3\n\t" - "punpckldq 68(%%"REG_a"),%%mm4\n\t" - "pfadd %%mm4,%%mm3\n\t" - "movd %%mm3,60(%%"REG_a")\n\t" - "psrlq $32,%%mm3\n\t" - "movd %%mm3,68(%%"REG_a")\n\t" - - "movq 24(%%"REG_a"),%%mm0\n\t" - "movq 48(%%"REG_a"),%%mm1\n\t" - "movd "MANGLE(COS9)"+12,%%mm2\n\t" - "punpckldq %%mm2,%%mm2\n\t" - "movd "MANGLE(COS9)"+24,%%mm3\n\t" - "punpckldq %%mm3,%%mm3\n\t" - "pfmul %%mm2,%%mm0\n\t" - "pfmul %%mm3,%%mm1\n\t" - "push %%"REG_a"\n\t" - "movl $1,%%eax\n\t" - "movd %%eax,%%mm7\n\t" - "pi2fd %%mm7,%%mm7\n\t" - "pop %%"REG_a"\n\t" - "movq 8(%%"REG_a"),%%mm2\n\t" - "movd "MANGLE(COS9)"+4,%%mm3\n\t" - "punpckldq %%mm3,%%mm3\n\t" - "pfmul 
%%mm3,%%mm2\n\t" - "pfadd %%mm0,%%mm2\n\t" - "movq 40(%%"REG_a"),%%mm3\n\t" - "movd "MANGLE(COS9)"+20,%%mm4\n\t" - "punpckldq %%mm4,%%mm4\n\t" - "pfmul %%mm4,%%mm3\n\t" - "pfadd %%mm3,%%mm2\n\t" - "movq 56(%%"REG_a"),%%mm3\n\t" - "movd "MANGLE(COS9)"+28,%%mm4\n\t" - "punpckldq %%mm4,%%mm4\n\t" - "pfmul %%mm4,%%mm3\n\t" - "pfadd %%mm3,%%mm2\n\t" - "movq (%%"REG_a"),%%mm3\n\t" - "movq 16(%%"REG_a"),%%mm4\n\t" - "movd "MANGLE(COS9)"+8,%%mm5\n\t" - "punpckldq %%mm5,%%mm5\n\t" - "pfmul %%mm5,%%mm4\n\t" - "pfadd %%mm4,%%mm3\n\t" - "movq 32(%%"REG_a"),%%mm4\n\t" - "movd "MANGLE(COS9)"+16,%%mm5\n\t" - "punpckldq %%mm5,%%mm5\n\t" - "pfmul %%mm5,%%mm4\n\t" - "pfadd %%mm4,%%mm3\n\t" - "pfadd %%mm1,%%mm3\n\t" - "movq 64(%%"REG_a"),%%mm4\n\t" - "movd "MANGLE(COS9)"+32,%%mm5\n\t" - "punpckldq %%mm5,%%mm5\n\t" - "pfmul %%mm5,%%mm4\n\t" - "pfadd %%mm4,%%mm3\n\t" - "movq %%mm2,%%mm4\n\t" - "pfadd %%mm3,%%mm4\n\t" - "movq %%mm7,%%mm5\n\t" - "punpckldq "MANGLE(tfcos36)"+0,%%mm5\n\t" - "pfmul %%mm5,%%mm4\n\t" - "movq %%mm4,%%mm5\n\t" - "pfacc %%mm5,%%mm5\n\t" - "movd 108(%%"REG_d"),%%mm6\n\t" - "punpckldq 104(%%"REG_d"),%%mm6\n\t" - "pfmul %%mm6,%%mm5\n\t" -#ifdef DCT36_OPTIMIZE_FOR_K7 - "pswapd %%mm5,%%mm5\n\t" - "movq %%mm5,32(%%"REG_c")\n\t" -#else - "movd %%mm5,36(%%"REG_c")\n\t" - "psrlq $32,%%mm5\n\t" - "movd %%mm5,32(%%"REG_c")\n\t" -#endif - "movq %%mm4,%%mm6\n\t" - "punpckldq %%mm6,%%mm5\n\t" - "pfsub %%mm6,%%mm5\n\t" - "punpckhdq %%mm5,%%mm5\n\t" - "movd 32(%%"REG_d"),%%mm6\n\t" - "punpckldq 36(%%"REG_d"),%%mm6\n\t" - "pfmul %%mm6,%%mm5\n\t" - "movd 32(%%"REG_S"),%%mm6\n\t" - "punpckldq 36(%%"REG_S"),%%mm6\n\t" - "pfadd %%mm6,%%mm5\n\t" - "movd %%mm5,1024(%%"REG_D")\n\t" - "psrlq $32,%%mm5\n\t" - "movd %%mm5,1152(%%"REG_D")\n\t" - "movq %%mm3,%%mm4\n\t" - "pfsub %%mm2,%%mm4\n\t" - "movq %%mm7,%%mm5\n\t" - "punpckldq "MANGLE(tfcos36)"+32,%%mm5\n\t" - "pfmul %%mm5,%%mm4\n\t" - "movq %%mm4,%%mm5\n\t" - "pfacc %%mm5,%%mm5\n\t" - "movd 140(%%"REG_d"),%%mm6\n\t" - "punpckldq 
72(%%"REG_d"),%%mm6\n\t" - "pfmul %%mm6,%%mm5\n\t" - "movd %%mm5,68(%%"REG_c")\n\t" - "psrlq $32,%%mm5\n\t" - "movd %%mm5,0(%%"REG_c")\n\t" - "movq %%mm4,%%mm6\n\t" - "punpckldq %%mm6,%%mm5\n\t" - "pfsub %%mm6,%%mm5\n\t" - "punpckhdq %%mm5,%%mm5\n\t" - "movd 0(%%"REG_d"),%%mm6\n\t" - "punpckldq 68(%%"REG_d"),%%mm6\n\t" - "pfmul %%mm6,%%mm5\n\t" - "movd 0(%%"REG_S"),%%mm6\n\t" - "punpckldq 68(%%"REG_S"),%%mm6\n\t" - "pfadd %%mm6,%%mm5\n\t" - "movd %%mm5,0(%%"REG_D")\n\t" - "psrlq $32,%%mm5\n\t" - "movd %%mm5,2176(%%"REG_D")\n\t" - "movq 8(%%"REG_a"),%%mm2\n\t" - "movq 40(%%"REG_a"),%%mm3\n\t" - "pfsub %%mm3,%%mm2\n\t" - "movq 56(%%"REG_a"),%%mm3\n\t" - "pfsub %%mm3,%%mm2\n\t" - "movd "MANGLE(COS9)"+12,%%mm3\n\t" - "punpckldq %%mm3,%%mm3\n\t" - "pfmul %%mm3,%%mm2\n\t" - "movq 16(%%"REG_a"),%%mm3\n\t" - "movq 32(%%"REG_a"),%%mm4\n\t" - "pfsub %%mm4,%%mm3\n\t" - "movq 64(%%"REG_a"),%%mm4\n\t" - "pfsub %%mm4,%%mm3\n\t" - "movd "MANGLE(COS9)"+24,%%mm4\n\t" - "punpckldq %%mm4,%%mm4\n\t" - "pfmul %%mm4,%%mm3\n\t" - "movq 48(%%"REG_a"),%%mm4\n\t" - "pfsub %%mm4,%%mm3\n\t" - "movq (%%"REG_a"),%%mm4\n\t" - "pfadd %%mm4,%%mm3\n\t" - "movq %%mm2,%%mm4\n\t" - "pfadd %%mm3,%%mm4\n\t" - "movq %%mm7,%%mm5\n\t" - "punpckldq "MANGLE(tfcos36)"+4,%%mm5\n\t" - "pfmul %%mm5,%%mm4\n\t" - "movq %%mm4,%%mm5\n\t" - "pfacc %%mm5,%%mm5\n\t" - "movd 112(%%"REG_d"),%%mm6\n\t" - "punpckldq 100(%%"REG_d"),%%mm6\n\t" - "pfmul %%mm6,%%mm5\n\t" - "movd %%mm5,40(%%"REG_c")\n\t" - "psrlq $32,%%mm5\n\t" - "movd %%mm5,28(%%"REG_c")\n\t" - "movq %%mm4,%%mm6\n\t" - "punpckldq %%mm6,%%mm5\n\t" - "pfsub %%mm6,%%mm5\n\t" - "punpckhdq %%mm5,%%mm5\n\t" - "movd 28(%%"REG_d"),%%mm6\n\t" - "punpckldq 40(%%"REG_d"),%%mm6\n\t" - "pfmul %%mm6,%%mm5\n\t" - "movd 28(%%"REG_S"),%%mm6\n\t" - "punpckldq 40(%%"REG_S"),%%mm6\n\t" - "pfadd %%mm6,%%mm5\n\t" - "movd %%mm5,896(%%"REG_D")\n\t" - "psrlq $32,%%mm5\n\t" - "movd %%mm5,1280(%%"REG_D")\n\t" - "movq %%mm3,%%mm4\n\t" - "pfsub %%mm2,%%mm4\n\t" - "movq %%mm7,%%mm5\n\t" - 
"punpckldq "MANGLE(tfcos36)"+28,%%mm5\n\t" - "pfmul %%mm5,%%mm4\n\t" - "movq %%mm4,%%mm5\n\t" - "pfacc %%mm5,%%mm5\n\t" - "movd 136(%%"REG_d"),%%mm6\n\t" - "punpckldq 76(%%"REG_d"),%%mm6\n\t" - "pfmul %%mm6,%%mm5\n\t" - "movd %%mm5,64(%%"REG_c")\n\t" - "psrlq $32,%%mm5\n\t" - "movd %%mm5,4(%%"REG_c")\n\t" - "movq %%mm4,%%mm6\n\t" - "punpckldq %%mm6,%%mm5\n\t" - "pfsub %%mm6,%%mm5\n\t" - "punpckhdq %%mm5,%%mm5\n\t" - "movd 4(%%"REG_d"),%%mm6\n\t" - "punpckldq 64(%%"REG_d"),%%mm6\n\t" - "pfmul %%mm6,%%mm5\n\t" - "movd 4(%%"REG_S"),%%mm6\n\t" - "punpckldq 64(%%"REG_S"),%%mm6\n\t" - "pfadd %%mm6,%%mm5\n\t" - "movd %%mm5,128(%%"REG_D")\n\t" - "psrlq $32,%%mm5\n\t" - "movd %%mm5,2048(%%"REG_D")\n\t" - - "movq 8(%%"REG_a"),%%mm2\n\t" - "movd "MANGLE(COS9)"+20,%%mm3\n\t" - "punpckldq %%mm3,%%mm3\n\t" - "pfmul %%mm3,%%mm2\n\t" - "pfsub %%mm0,%%mm2\n\t" - "movq 40(%%"REG_a"),%%mm3\n\t" - "movd "MANGLE(COS9)"+28,%%mm4\n\t" - "punpckldq %%mm4,%%mm4\n\t" - "pfmul %%mm4,%%mm3\n\t" - "pfsub %%mm3,%%mm2\n\t" - "movq 56(%%"REG_a"),%%mm3\n\t" - "movd "MANGLE(COS9)"+4,%%mm4\n\t" - "punpckldq %%mm4,%%mm4\n\t" - "pfmul %%mm4,%%mm3\n\t" - "pfadd %%mm3,%%mm2\n\t" - "movq (%%"REG_a"),%%mm3\n\t" - "movq 16(%%"REG_a"),%%mm4\n\t" - "movd "MANGLE(COS9)"+32,%%mm5\n\t" - "punpckldq %%mm5,%%mm5\n\t" - "pfmul %%mm5,%%mm4\n\t" - "pfsub %%mm4,%%mm3\n\t" - "movq 32(%%"REG_a"),%%mm4\n\t" - "movd "MANGLE(COS9)"+8,%%mm5\n\t" - "punpckldq %%mm5,%%mm5\n\t" - "pfmul %%mm5,%%mm4\n\t" - "pfsub %%mm4,%%mm3\n\t" - "pfadd %%mm1,%%mm3\n\t" - "movq 64(%%"REG_a"),%%mm4\n\t" - "movd "MANGLE(COS9)"+16,%%mm5\n\t" - "punpckldq %%mm5,%%mm5\n\t" - "pfmul %%mm5,%%mm4\n\t" - "pfadd %%mm4,%%mm3\n\t" - "movq %%mm2,%%mm4\n\t" - "pfadd %%mm3,%%mm4\n\t" - "movq %%mm7,%%mm5\n\t" - "punpckldq "MANGLE(tfcos36)"+8,%%mm5\n\t" - "pfmul %%mm5,%%mm4\n\t" - "movq %%mm4,%%mm5\n\t" - "pfacc %%mm5,%%mm5\n\t" - "movd 116(%%"REG_d"),%%mm6\n\t" - "punpckldq 96(%%"REG_d"),%%mm6\n\t" - "pfmul %%mm6,%%mm5\n\t" - "movd %%mm5,44(%%"REG_c")\n\t" 
- "psrlq $32,%%mm5\n\t" - "movd %%mm5,24(%%"REG_c")\n\t" - "movq %%mm4,%%mm6\n\t" - "punpckldq %%mm6,%%mm5\n\t" - "pfsub %%mm6,%%mm5\n\t" - "punpckhdq %%mm5,%%mm5\n\t" - "movd 24(%%"REG_d"),%%mm6\n\t" - "punpckldq 44(%%"REG_d"),%%mm6\n\t" - "pfmul %%mm6,%%mm5\n\t" - "movd 24(%%"REG_S"),%%mm6\n\t" - "punpckldq 44(%%"REG_S"),%%mm6\n\t" - "pfadd %%mm6,%%mm5\n\t" - "movd %%mm5,768(%%"REG_D")\n\t" - "psrlq $32,%%mm5\n\t" - "movd %%mm5,1408(%%"REG_D")\n\t" - "movq %%mm3,%%mm4\n\t" - "pfsub %%mm2,%%mm4\n\t" - "movq %%mm7,%%mm5\n\t" - "punpckldq "MANGLE(tfcos36)"+24,%%mm5\n\t" - "pfmul %%mm5,%%mm4\n\t" - "movq %%mm4,%%mm5\n\t" - "pfacc %%mm5,%%mm5\n\t" - "movd 132(%%"REG_d"),%%mm6\n\t" - "punpckldq 80(%%"REG_d"),%%mm6\n\t" - "pfmul %%mm6,%%mm5\n\t" - "movd %%mm5,60(%%"REG_c")\n\t" - "psrlq $32,%%mm5\n\t" - "movd %%mm5,8(%%"REG_c")\n\t" - "movq %%mm4,%%mm6\n\t" - "punpckldq %%mm6,%%mm5\n\t" - "pfsub %%mm6,%%mm5\n\t" - "punpckhdq %%mm5,%%mm5\n\t" - "movd 8(%%"REG_d"),%%mm6\n\t" - "punpckldq 60(%%"REG_d"),%%mm6\n\t" - "pfmul %%mm6,%%mm5\n\t" - "movd 8(%%"REG_S"),%%mm6\n\t" - "punpckldq 60(%%"REG_S"),%%mm6\n\t" - "pfadd %%mm6,%%mm5\n\t" - "movd %%mm5,256(%%"REG_D")\n\t" - "psrlq $32,%%mm5\n\t" - "movd %%mm5,1920(%%"REG_D")\n\t" - "movq 8(%%"REG_a"),%%mm2\n\t" - "movd "MANGLE(COS9)"+28,%%mm3\n\t" - "punpckldq %%mm3,%%mm3\n\t" - "pfmul %%mm3,%%mm2\n\t" - "pfsub %%mm0,%%mm2\n\t" - "movq 40(%%"REG_a"),%%mm3\n\t" - "movd "MANGLE(COS9)"+4,%%mm4\n\t" - "punpckldq %%mm4,%%mm4\n\t" - "pfmul %%mm4,%%mm3\n\t" - "pfadd %%mm3,%%mm2\n\t" - "movq 56(%%"REG_a"),%%mm3\n\t" - "movd "MANGLE(COS9)"+20,%%mm4\n\t" - "punpckldq %%mm4,%%mm4\n\t" - "pfmul %%mm4,%%mm3\n\t" - "pfsub %%mm3,%%mm2\n\t" - "movq (%%"REG_a"),%%mm3\n\t" - "movq 16(%%"REG_a"),%%mm4\n\t" - "movd "MANGLE(COS9)"+16,%%mm5\n\t" - "punpckldq %%mm5,%%mm5\n\t" - "pfmul %%mm5,%%mm4\n\t" - "pfsub %%mm4,%%mm3\n\t" - "movq 32(%%"REG_a"),%%mm4\n\t" - "movd "MANGLE(COS9)"+32,%%mm5\n\t" - "punpckldq %%mm5,%%mm5\n\t" - "pfmul %%mm5,%%mm4\n\t" 
- "pfadd %%mm4,%%mm3\n\t" - "pfadd %%mm1,%%mm3\n\t" - "movq 64(%%"REG_a"),%%mm4\n\t" - "movd "MANGLE(COS9)"+8,%%mm5\n\t" - "punpckldq %%mm5,%%mm5\n\t" - "pfmul %%mm5,%%mm4\n\t" - "pfsub %%mm4,%%mm3\n\t" - "movq %%mm2,%%mm4\n\t" - "pfadd %%mm3,%%mm4\n\t" - "movq %%mm7,%%mm5\n\t" - "punpckldq "MANGLE(tfcos36)"+12,%%mm5\n\t" - "pfmul %%mm5,%%mm4\n\t" - "movq %%mm4,%%mm5\n\t" - "pfacc %%mm5,%%mm5\n\t" - "movd 120(%%"REG_d"),%%mm6\n\t" - "punpckldq 92(%%"REG_d"),%%mm6\n\t" - "pfmul %%mm6,%%mm5\n\t" - "movd %%mm5,48(%%"REG_c")\n\t" - "psrlq $32,%%mm5\n\t" - "movd %%mm5,20(%%"REG_c")\n\t" - "movq %%mm4,%%mm6\n\t" - "punpckldq %%mm6,%%mm5\n\t" - "pfsub %%mm6,%%mm5\n\t" - "punpckhdq %%mm5,%%mm5\n\t" - "movd 20(%%"REG_d"),%%mm6\n\t" - "punpckldq 48(%%"REG_d"),%%mm6\n\t" - "pfmul %%mm6,%%mm5\n\t" - "movd 20(%%"REG_S"),%%mm6\n\t" - "punpckldq 48(%%"REG_S"),%%mm6\n\t" - "pfadd %%mm6,%%mm5\n\t" - "movd %%mm5,640(%%"REG_D")\n\t" - "psrlq $32,%%mm5\n\t" - "movd %%mm5,1536(%%"REG_D")\n\t" - "movq %%mm3,%%mm4\n\t" - "pfsub %%mm2,%%mm4\n\t" - "movq %%mm7,%%mm5\n\t" - "punpckldq "MANGLE(tfcos36)"+20,%%mm5\n\t" - "pfmul %%mm5,%%mm4\n\t" - "movq %%mm4,%%mm5\n\t" - "pfacc %%mm5,%%mm5\n\t" - "movd 128(%%"REG_d"),%%mm6\n\t" - "punpckldq 84(%%"REG_d"),%%mm6\n\t" - "pfmul %%mm6,%%mm5\n\t" - "movd %%mm5,56(%%"REG_c")\n\t" - "psrlq $32,%%mm5\n\t" - "movd %%mm5,12(%%"REG_c")\n\t" - "movq %%mm4,%%mm6\n\t" - "punpckldq %%mm6,%%mm5\n\t" - "pfsub %%mm6,%%mm5\n\t" - "punpckhdq %%mm5,%%mm5\n\t" - "movd 12(%%"REG_d"),%%mm6\n\t" - "punpckldq 56(%%"REG_d"),%%mm6\n\t" - "pfmul %%mm6,%%mm5\n\t" - "movd 12(%%"REG_S"),%%mm6\n\t" - "punpckldq 56(%%"REG_S"),%%mm6\n\t" - "pfadd %%mm6,%%mm5\n\t" - "movd %%mm5,384(%%"REG_D")\n\t" - "psrlq $32,%%mm5\n\t" - "movd %%mm5,1792(%%"REG_D")\n\t" - - "movq (%%"REG_a"),%%mm4\n\t" - "movq 16(%%"REG_a"),%%mm3\n\t" - "pfsub %%mm3,%%mm4\n\t" - "movq 32(%%"REG_a"),%%mm3\n\t" - "pfadd %%mm3,%%mm4\n\t" - "movq 48(%%"REG_a"),%%mm3\n\t" - "pfsub %%mm3,%%mm4\n\t" - "movq 
64(%%"REG_a"),%%mm3\n\t" - "pfadd %%mm3,%%mm4\n\t" - "movq %%mm7,%%mm5\n\t" - "punpckldq "MANGLE(tfcos36)"+16,%%mm5\n\t" - "pfmul %%mm5,%%mm4\n\t" - "movq %%mm4,%%mm5\n\t" - "pfacc %%mm5,%%mm5\n\t" - "movd 124(%%"REG_d"),%%mm6\n\t" - "punpckldq 88(%%"REG_d"),%%mm6\n\t" - "pfmul %%mm6,%%mm5\n\t" - "movd %%mm5,52(%%"REG_c")\n\t" - "psrlq $32,%%mm5\n\t" - "movd %%mm5,16(%%"REG_c")\n\t" - "movq %%mm4,%%mm6\n\t" - "punpckldq %%mm6,%%mm5\n\t" - "pfsub %%mm6,%%mm5\n\t" - "punpckhdq %%mm5,%%mm5\n\t" - "movd 16(%%"REG_d"),%%mm6\n\t" - "punpckldq 52(%%"REG_d"),%%mm6\n\t" - "pfmul %%mm6,%%mm5\n\t" - "movd 16(%%"REG_S"),%%mm6\n\t" - "punpckldq 52(%%"REG_S"),%%mm6\n\t" - "pfadd %%mm6,%%mm5\n\t" - "movd %%mm5,512(%%"REG_D")\n\t" - "psrlq $32,%%mm5\n\t" - "movd %%mm5,1664(%%"REG_D")\n\t" - - "femms\n\t" - : - : "a" (inbuf), "S" (o1), "c" (o2), "d" (wintab), "D" (tsbuf) - : "memory"); -} diff -r bc0898c7399b -r b924f0df5a1d mp3lib/dct36_k7.c --- a/mp3lib/dct36_k7.c Sun Oct 21 11:14:13 2012 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,34 +0,0 @@ -/* - * dct36_k7.c - 3DNowEx(DSP)! optimized dct36() - * - * This code based 'dct36_3dnow.s' by Syuuhei Kashiyama - * , only two types of changes have been made: - * - * - added new opcode PSWAPD - * - removed PREFETCH instruction for speedup - * - changed function name for support 3DNowEx! automatic detection - * - * note: because K7 processors are an aggresive out-of-order three-way - * superscalar ones instruction order is not significand for them. - * - * You can find Kashiyama's original 3dnow! support patch - * (for mpg123-0.59o) at - * http://user.ecc.u-tokyo.ac.jp/~g810370/linux-simd/ (Japanese). - * - * by KIMURA Takuhiro - until 31.Mar.1999 - * - after 1.Apr.1999 - * - * Original disclaimer: - * The author of this program disclaim whole expressed or implied - * warranties with regard to this program, and in no event shall the - * author of this program liable to whatever resulted from the use of - * this program. 
Use it at your own risk. - * - * Modified by Nick Kurshev - * - * 2003/06/21: Moved to GCC inline assembly - Alex Beregszaszi - */ - -#define DCT36_OPTIMIZE_FOR_K7 - -#include "dct36_3dnow.c" diff -r bc0898c7399b -r b924f0df5a1d mp3lib/dct64.c --- a/mp3lib/dct64.c Sun Oct 21 11:14:13 2012 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,323 +0,0 @@ -/* - * Modified for use with MPlayer, for details see the changelog at - * http://svn.mplayerhq.hu/mplayer/trunk/ - * $Id$ - */ - -/* - * Discrete Cosine Tansform (DCT) for subband synthesis - * optimized for machines with no auto-increment. - * The performance is highly compiler dependend. Maybe - * the dct64.c version for 'normal' processor may be faster - * even for Intel processors. - */ - -static void dct64_1(real *out0,real *out1,real *b1,real *b2,real *samples) -{ - - { - register real *costab = mp3lib_pnts[0]; - - b1[0x00] = samples[0x00] + samples[0x1F]; - b1[0x1F] = (samples[0x00] - samples[0x1F]) * costab[0x0]; - - b1[0x01] = samples[0x01] + samples[0x1E]; - b1[0x1E] = (samples[0x01] - samples[0x1E]) * costab[0x1]; - - b1[0x02] = samples[0x02] + samples[0x1D]; - b1[0x1D] = (samples[0x02] - samples[0x1D]) * costab[0x2]; - - b1[0x03] = samples[0x03] + samples[0x1C]; - b1[0x1C] = (samples[0x03] - samples[0x1C]) * costab[0x3]; - - b1[0x04] = samples[0x04] + samples[0x1B]; - b1[0x1B] = (samples[0x04] - samples[0x1B]) * costab[0x4]; - - b1[0x05] = samples[0x05] + samples[0x1A]; - b1[0x1A] = (samples[0x05] - samples[0x1A]) * costab[0x5]; - - b1[0x06] = samples[0x06] + samples[0x19]; - b1[0x19] = (samples[0x06] - samples[0x19]) * costab[0x6]; - - b1[0x07] = samples[0x07] + samples[0x18]; - b1[0x18] = (samples[0x07] - samples[0x18]) * costab[0x7]; - - b1[0x08] = samples[0x08] + samples[0x17]; - b1[0x17] = (samples[0x08] - samples[0x17]) * costab[0x8]; - - b1[0x09] = samples[0x09] + samples[0x16]; - b1[0x16] = (samples[0x09] - samples[0x16]) * costab[0x9]; - - b1[0x0A] = samples[0x0A] + samples[0x15]; - 
b1[0x15] = (samples[0x0A] - samples[0x15]) * costab[0xA]; - - b1[0x0B] = samples[0x0B] + samples[0x14]; - b1[0x14] = (samples[0x0B] - samples[0x14]) * costab[0xB]; - - b1[0x0C] = samples[0x0C] + samples[0x13]; - b1[0x13] = (samples[0x0C] - samples[0x13]) * costab[0xC]; - - b1[0x0D] = samples[0x0D] + samples[0x12]; - b1[0x12] = (samples[0x0D] - samples[0x12]) * costab[0xD]; - - b1[0x0E] = samples[0x0E] + samples[0x11]; - b1[0x11] = (samples[0x0E] - samples[0x11]) * costab[0xE]; - - b1[0x0F] = samples[0x0F] + samples[0x10]; - b1[0x10] = (samples[0x0F] - samples[0x10]) * costab[0xF]; - } - - - { - register real *costab = mp3lib_pnts[1]; - - b2[0x00] = b1[0x00] + b1[0x0F]; - b2[0x0F] = (b1[0x00] - b1[0x0F]) * costab[0]; - b2[0x01] = b1[0x01] + b1[0x0E]; - b2[0x0E] = (b1[0x01] - b1[0x0E]) * costab[1]; - b2[0x02] = b1[0x02] + b1[0x0D]; - b2[0x0D] = (b1[0x02] - b1[0x0D]) * costab[2]; - b2[0x03] = b1[0x03] + b1[0x0C]; - b2[0x0C] = (b1[0x03] - b1[0x0C]) * costab[3]; - b2[0x04] = b1[0x04] + b1[0x0B]; - b2[0x0B] = (b1[0x04] - b1[0x0B]) * costab[4]; - b2[0x05] = b1[0x05] + b1[0x0A]; - b2[0x0A] = (b1[0x05] - b1[0x0A]) * costab[5]; - b2[0x06] = b1[0x06] + b1[0x09]; - b2[0x09] = (b1[0x06] - b1[0x09]) * costab[6]; - b2[0x07] = b1[0x07] + b1[0x08]; - b2[0x08] = (b1[0x07] - b1[0x08]) * costab[7]; - - b2[0x10] = b1[0x10] + b1[0x1F]; - b2[0x1F] = (b1[0x1F] - b1[0x10]) * costab[0]; - b2[0x11] = b1[0x11] + b1[0x1E]; - b2[0x1E] = (b1[0x1E] - b1[0x11]) * costab[1]; - b2[0x12] = b1[0x12] + b1[0x1D]; - b2[0x1D] = (b1[0x1D] - b1[0x12]) * costab[2]; - b2[0x13] = b1[0x13] + b1[0x1C]; - b2[0x1C] = (b1[0x1C] - b1[0x13]) * costab[3]; - b2[0x14] = b1[0x14] + b1[0x1B]; - b2[0x1B] = (b1[0x1B] - b1[0x14]) * costab[4]; - b2[0x15] = b1[0x15] + b1[0x1A]; - b2[0x1A] = (b1[0x1A] - b1[0x15]) * costab[5]; - b2[0x16] = b1[0x16] + b1[0x19]; - b2[0x19] = (b1[0x19] - b1[0x16]) * costab[6]; - b2[0x17] = b1[0x17] + b1[0x18]; - b2[0x18] = (b1[0x18] - b1[0x17]) * costab[7]; - } - - { - register real *costab = 
mp3lib_pnts[2]; - - b1[0x00] = b2[0x00] + b2[0x07]; - b1[0x07] = (b2[0x00] - b2[0x07]) * costab[0]; - b1[0x01] = b2[0x01] + b2[0x06]; - b1[0x06] = (b2[0x01] - b2[0x06]) * costab[1]; - b1[0x02] = b2[0x02] + b2[0x05]; - b1[0x05] = (b2[0x02] - b2[0x05]) * costab[2]; - b1[0x03] = b2[0x03] + b2[0x04]; - b1[0x04] = (b2[0x03] - b2[0x04]) * costab[3]; - - b1[0x08] = b2[0x08] + b2[0x0F]; - b1[0x0F] = (b2[0x0F] - b2[0x08]) * costab[0]; - b1[0x09] = b2[0x09] + b2[0x0E]; - b1[0x0E] = (b2[0x0E] - b2[0x09]) * costab[1]; - b1[0x0A] = b2[0x0A] + b2[0x0D]; - b1[0x0D] = (b2[0x0D] - b2[0x0A]) * costab[2]; - b1[0x0B] = b2[0x0B] + b2[0x0C]; - b1[0x0C] = (b2[0x0C] - b2[0x0B]) * costab[3]; - - b1[0x10] = b2[0x10] + b2[0x17]; - b1[0x17] = (b2[0x10] - b2[0x17]) * costab[0]; - b1[0x11] = b2[0x11] + b2[0x16]; - b1[0x16] = (b2[0x11] - b2[0x16]) * costab[1]; - b1[0x12] = b2[0x12] + b2[0x15]; - b1[0x15] = (b2[0x12] - b2[0x15]) * costab[2]; - b1[0x13] = b2[0x13] + b2[0x14]; - b1[0x14] = (b2[0x13] - b2[0x14]) * costab[3]; - - b1[0x18] = b2[0x18] + b2[0x1F]; - b1[0x1F] = (b2[0x1F] - b2[0x18]) * costab[0]; - b1[0x19] = b2[0x19] + b2[0x1E]; - b1[0x1E] = (b2[0x1E] - b2[0x19]) * costab[1]; - b1[0x1A] = b2[0x1A] + b2[0x1D]; - b1[0x1D] = (b2[0x1D] - b2[0x1A]) * costab[2]; - b1[0x1B] = b2[0x1B] + b2[0x1C]; - b1[0x1C] = (b2[0x1C] - b2[0x1B]) * costab[3]; - } - - { - register real const cos0 = mp3lib_pnts[3][0]; - register real const cos1 = mp3lib_pnts[3][1]; - - b2[0x00] = b1[0x00] + b1[0x03]; - b2[0x03] = (b1[0x00] - b1[0x03]) * cos0; - b2[0x01] = b1[0x01] + b1[0x02]; - b2[0x02] = (b1[0x01] - b1[0x02]) * cos1; - - b2[0x04] = b1[0x04] + b1[0x07]; - b2[0x07] = (b1[0x07] - b1[0x04]) * cos0; - b2[0x05] = b1[0x05] + b1[0x06]; - b2[0x06] = (b1[0x06] - b1[0x05]) * cos1; - - b2[0x08] = b1[0x08] + b1[0x0B]; - b2[0x0B] = (b1[0x08] - b1[0x0B]) * cos0; - b2[0x09] = b1[0x09] + b1[0x0A]; - b2[0x0A] = (b1[0x09] - b1[0x0A]) * cos1; - - b2[0x0C] = b1[0x0C] + b1[0x0F]; - b2[0x0F] = (b1[0x0F] - b1[0x0C]) * cos0; - b2[0x0D] 
= b1[0x0D] + b1[0x0E]; - b2[0x0E] = (b1[0x0E] - b1[0x0D]) * cos1; - - b2[0x10] = b1[0x10] + b1[0x13]; - b2[0x13] = (b1[0x10] - b1[0x13]) * cos0; - b2[0x11] = b1[0x11] + b1[0x12]; - b2[0x12] = (b1[0x11] - b1[0x12]) * cos1; - - b2[0x14] = b1[0x14] + b1[0x17]; - b2[0x17] = (b1[0x17] - b1[0x14]) * cos0; - b2[0x15] = b1[0x15] + b1[0x16]; - b2[0x16] = (b1[0x16] - b1[0x15]) * cos1; - - b2[0x18] = b1[0x18] + b1[0x1B]; - b2[0x1B] = (b1[0x18] - b1[0x1B]) * cos0; - b2[0x19] = b1[0x19] + b1[0x1A]; - b2[0x1A] = (b1[0x19] - b1[0x1A]) * cos1; - - b2[0x1C] = b1[0x1C] + b1[0x1F]; - b2[0x1F] = (b1[0x1F] - b1[0x1C]) * cos0; - b2[0x1D] = b1[0x1D] + b1[0x1E]; - b2[0x1E] = (b1[0x1E] - b1[0x1D]) * cos1; - } - - { - register real const cos0 = mp3lib_pnts[4][0]; - - b1[0x00] = b2[0x00] + b2[0x01]; - b1[0x01] = (b2[0x00] - b2[0x01]) * cos0; - b1[0x02] = b2[0x02] + b2[0x03]; - b1[0x03] = (b2[0x03] - b2[0x02]) * cos0; - b1[0x02] += b1[0x03]; - - b1[0x04] = b2[0x04] + b2[0x05]; - b1[0x05] = (b2[0x04] - b2[0x05]) * cos0; - b1[0x06] = b2[0x06] + b2[0x07]; - b1[0x07] = (b2[0x07] - b2[0x06]) * cos0; - b1[0x06] += b1[0x07]; - b1[0x04] += b1[0x06]; - b1[0x06] += b1[0x05]; - b1[0x05] += b1[0x07]; - - b1[0x08] = b2[0x08] + b2[0x09]; - b1[0x09] = (b2[0x08] - b2[0x09]) * cos0; - b1[0x0A] = b2[0x0A] + b2[0x0B]; - b1[0x0B] = (b2[0x0B] - b2[0x0A]) * cos0; - b1[0x0A] += b1[0x0B]; - - b1[0x0C] = b2[0x0C] + b2[0x0D]; - b1[0x0D] = (b2[0x0C] - b2[0x0D]) * cos0; - b1[0x0E] = b2[0x0E] + b2[0x0F]; - b1[0x0F] = (b2[0x0F] - b2[0x0E]) * cos0; - b1[0x0E] += b1[0x0F]; - b1[0x0C] += b1[0x0E]; - b1[0x0E] += b1[0x0D]; - b1[0x0D] += b1[0x0F]; - - b1[0x10] = b2[0x10] + b2[0x11]; - b1[0x11] = (b2[0x10] - b2[0x11]) * cos0; - b1[0x12] = b2[0x12] + b2[0x13]; - b1[0x13] = (b2[0x13] - b2[0x12]) * cos0; - b1[0x12] += b1[0x13]; - - b1[0x14] = b2[0x14] + b2[0x15]; - b1[0x15] = (b2[0x14] - b2[0x15]) * cos0; - b1[0x16] = b2[0x16] + b2[0x17]; - b1[0x17] = (b2[0x17] - b2[0x16]) * cos0; - b1[0x16] += b1[0x17]; - b1[0x14] += b1[0x16]; - 
b1[0x16] += b1[0x15]; - b1[0x15] += b1[0x17]; - - b1[0x18] = b2[0x18] + b2[0x19]; - b1[0x19] = (b2[0x18] - b2[0x19]) * cos0; - b1[0x1A] = b2[0x1A] + b2[0x1B]; - b1[0x1B] = (b2[0x1B] - b2[0x1A]) * cos0; - b1[0x1A] += b1[0x1B]; - - b1[0x1C] = b2[0x1C] + b2[0x1D]; - b1[0x1D] = (b2[0x1C] - b2[0x1D]) * cos0; - b1[0x1E] = b2[0x1E] + b2[0x1F]; - b1[0x1F] = (b2[0x1F] - b2[0x1E]) * cos0; - b1[0x1E] += b1[0x1F]; - b1[0x1C] += b1[0x1E]; - b1[0x1E] += b1[0x1D]; - b1[0x1D] += b1[0x1F]; - } - - out0[0x10*16] = b1[0x00]; - out0[0x10*12] = b1[0x04]; - out0[0x10* 8] = b1[0x02]; - out0[0x10* 4] = b1[0x06]; - out0[0x10* 0] = b1[0x01]; - out1[0x10* 0] = b1[0x01]; - out1[0x10* 4] = b1[0x05]; - out1[0x10* 8] = b1[0x03]; - out1[0x10*12] = b1[0x07]; - - b1[0x08] += b1[0x0C]; - out0[0x10*14] = b1[0x08]; - b1[0x0C] += b1[0x0a]; - out0[0x10*10] = b1[0x0C]; - b1[0x0A] += b1[0x0E]; - out0[0x10* 6] = b1[0x0A]; - b1[0x0E] += b1[0x09]; - out0[0x10* 2] = b1[0x0E]; - b1[0x09] += b1[0x0D]; - out1[0x10* 2] = b1[0x09]; - b1[0x0D] += b1[0x0B]; - out1[0x10* 6] = b1[0x0D]; - b1[0x0B] += b1[0x0F]; - out1[0x10*10] = b1[0x0B]; - out1[0x10*14] = b1[0x0F]; - - b1[0x18] += b1[0x1C]; - out0[0x10*15] = b1[0x10] + b1[0x18]; - out0[0x10*13] = b1[0x18] + b1[0x14]; - b1[0x1C] += b1[0x1a]; - out0[0x10*11] = b1[0x14] + b1[0x1C]; - out0[0x10* 9] = b1[0x1C] + b1[0x12]; - b1[0x1A] += b1[0x1E]; - out0[0x10* 7] = b1[0x12] + b1[0x1A]; - out0[0x10* 5] = b1[0x1A] + b1[0x16]; - b1[0x1E] += b1[0x19]; - out0[0x10* 3] = b1[0x16] + b1[0x1E]; - out0[0x10* 1] = b1[0x1E] + b1[0x11]; - b1[0x19] += b1[0x1D]; - out1[0x10* 1] = b1[0x11] + b1[0x19]; - out1[0x10* 3] = b1[0x19] + b1[0x15]; - b1[0x1D] += b1[0x1B]; - out1[0x10* 5] = b1[0x15] + b1[0x1D]; - out1[0x10* 7] = b1[0x1D] + b1[0x13]; - b1[0x1B] += b1[0x1F]; - out1[0x10* 9] = b1[0x13] + b1[0x1B]; - out1[0x10*11] = b1[0x1B] + b1[0x17]; - out1[0x10*13] = b1[0x17] + b1[0x1F]; - out1[0x10*15] = b1[0x1F]; -} - -/* - * the call via dct64 is a trick to force GCC to use - * (new) registers for 
the b1,b2 pointer to the bufs[xx] field - */ -static void dct64(real *a,real *b,real *c) -{ - real bufs[0x40]; - dct64_1(a,b,bufs,bufs+0x20,c); -} - -void mp3lib_dct64(real *a,real *b,real *c) -{ - real bufs[0x40]; - dct64_1(a,b,bufs,bufs+0x20,c); -} diff -r bc0898c7399b -r b924f0df5a1d mp3lib/dct64_3dnow.c --- a/mp3lib/dct64_3dnow.c Sun Oct 21 11:14:13 2012 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,929 +0,0 @@ -/* -* This code was taken from http://www.mpg123.org -* See ChangeLog of mpg123-0.59s-pre.1 for detail -* Applied to mplayer by Nick Kurshev -* Partial 3dnow! optimization by Nick Kurshev -* -* TODO: optimize scalar 3dnow! code -* Warning: Phases 7 & 8 are not tested -*/ - -#include "config.h" -#include "mangle.h" -#include "mpg123.h" -#include "libavutil/x86_cpu.h" - -static unsigned long long int attribute_used __attribute__((aligned(8))) x_plus_minus_3dnow = 0x8000000000000000ULL; -static float attribute_used plus_1f = 1.0; - -void dct64_MMX_3dnow(short *a,short *b,real *c) -{ - char tmp[256]; - __asm__ volatile( -" mov %2,%%"REG_a"\n\t" - -" lea 128+%3,%%"REG_d"\n\t" -" mov %0,%%"REG_S"\n\t" -" mov %1,%%"REG_D"\n\t" -" mov $"MANGLE(costab_mmx)",%%"REG_b"\n\t" -" lea %3,%%"REG_c"\n\t" - -/* Phase 1*/ -" movq (%%"REG_a"), %%mm0\n\t" -" movq 8(%%"REG_a"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" movq 120(%%"REG_a"), %%mm1\n\t" -" movq 112(%%"REG_a"), %%mm5\n\t" -/* n.b.: pswapd*/ -" movq %%mm1, %%mm2\n\t" -" movq %%mm5, %%mm6\n\t" -" psrlq $32, %%mm1\n\t" -" psrlq $32, %%mm5\n\t" -" punpckldq %%mm2, %%mm1\n\t" -" punpckldq %%mm6, %%mm5\n\t" -/**/ -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, (%%"REG_d")\n\t" -" movq %%mm4, 8(%%"REG_d")\n\t" -" pfsub %%mm1, %%mm3\n\t" -" pfsub %%mm5, %%mm7\n\t" -" pfmul (%%"REG_b"), %%mm3\n\t" -" pfmul 8(%%"REG_b"), %%mm7\n\t" -" movd %%mm3, 124(%%"REG_d")\n\t" -" movd %%mm7, 116(%%"REG_d")\n\t" -" psrlq $32, %%mm3\n\t" -" psrlq $32, %%mm7\n\t" -" movd %%mm3, 
120(%%"REG_d")\n\t" -" movd %%mm7, 112(%%"REG_d")\n\t" - -" movq 16(%%"REG_a"), %%mm0\n\t" -" movq 24(%%"REG_a"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" movq 104(%%"REG_a"), %%mm1\n\t" -" movq 96(%%"REG_a"), %%mm5\n\t" -/* n.b.: pswapd*/ -" movq %%mm1, %%mm2\n\t" -" movq %%mm5, %%mm6\n\t" -" psrlq $32, %%mm1\n\t" -" psrlq $32, %%mm5\n\t" -" punpckldq %%mm2, %%mm1\n\t" -" punpckldq %%mm6, %%mm5\n\t" -/**/ -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, 16(%%"REG_d")\n\t" -" movq %%mm4, 24(%%"REG_d")\n\t" -" pfsub %%mm1, %%mm3\n\t" -" pfsub %%mm5, %%mm7\n\t" -" pfmul 16(%%"REG_b"), %%mm3\n\t" -" pfmul 24(%%"REG_b"), %%mm7\n\t" -" movd %%mm3, 108(%%"REG_d")\n\t" -" movd %%mm7, 100(%%"REG_d")\n\t" -" psrlq $32, %%mm3\n\t" -" psrlq $32, %%mm7\n\t" -" movd %%mm3, 104(%%"REG_d")\n\t" -" movd %%mm7, 96(%%"REG_d")\n\t" - -" movq 32(%%"REG_a"), %%mm0\n\t" -" movq 40(%%"REG_a"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" movq 88(%%"REG_a"), %%mm1\n\t" -" movq 80(%%"REG_a"), %%mm5\n\t" -/* n.b.: pswapd*/ -" movq %%mm1, %%mm2\n\t" -" movq %%mm5, %%mm6\n\t" -" psrlq $32, %%mm1\n\t" -" psrlq $32, %%mm5\n\t" -" punpckldq %%mm2, %%mm1\n\t" -" punpckldq %%mm6, %%mm5\n\t" -/**/ -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, 32(%%"REG_d")\n\t" -" movq %%mm4, 40(%%"REG_d")\n\t" -" pfsub %%mm1, %%mm3\n\t" -" pfsub %%mm5, %%mm7\n\t" -" pfmul 32(%%"REG_b"), %%mm3\n\t" -" pfmul 40(%%"REG_b"), %%mm7\n\t" -" movd %%mm3, 92(%%"REG_d")\n\t" -" movd %%mm7, 84(%%"REG_d")\n\t" -" psrlq $32, %%mm3\n\t" -" psrlq $32, %%mm7\n\t" -" movd %%mm3, 88(%%"REG_d")\n\t" -" movd %%mm7, 80(%%"REG_d")\n\t" - -" movq 48(%%"REG_a"), %%mm0\n\t" -" movq 56(%%"REG_a"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" movq 72(%%"REG_a"), %%mm1\n\t" -" movq 64(%%"REG_a"), %%mm5\n\t" -/* n.b.: pswapd*/ -" movq %%mm1, %%mm2\n\t" -" movq %%mm5, %%mm6\n\t" -" psrlq $32, %%mm1\n\t" -" psrlq $32, %%mm5\n\t" -" 
punpckldq %%mm2, %%mm1\n\t" -" punpckldq %%mm6, %%mm5\n\t" -/**/ -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, 48(%%"REG_d")\n\t" -" movq %%mm4, 56(%%"REG_d")\n\t" -" pfsub %%mm1, %%mm3\n\t" -" pfsub %%mm5, %%mm7\n\t" -" pfmul 48(%%"REG_b"), %%mm3\n\t" -" pfmul 56(%%"REG_b"), %%mm7\n\t" -" movd %%mm3, 76(%%"REG_d")\n\t" -" movd %%mm7, 68(%%"REG_d")\n\t" -" psrlq $32, %%mm3\n\t" -" psrlq $32, %%mm7\n\t" -" movd %%mm3, 72(%%"REG_d")\n\t" -" movd %%mm7, 64(%%"REG_d")\n\t" - -/* Phase 2*/ - -" movq (%%"REG_d"), %%mm0\n\t" -" movq 8(%%"REG_d"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" movq 56(%%"REG_d"), %%mm1\n\t" -" movq 48(%%"REG_d"), %%mm5\n\t" -/* n.b.: pswapd*/ -" movq %%mm1, %%mm2\n\t" -" movq %%mm5, %%mm6\n\t" -" psrlq $32, %%mm1\n\t" -" psrlq $32, %%mm5\n\t" -" punpckldq %%mm2, %%mm1\n\t" -" punpckldq %%mm6, %%mm5\n\t" -/**/ -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, (%%"REG_c")\n\t" -" movq %%mm4, 8(%%"REG_c")\n\t" -" pfsub %%mm1, %%mm3\n\t" -" pfsub %%mm5, %%mm7\n\t" -" pfmul 64(%%"REG_b"), %%mm3\n\t" -" pfmul 72(%%"REG_b"), %%mm7\n\t" -" movd %%mm3, 60(%%"REG_c")\n\t" -" movd %%mm7, 52(%%"REG_c")\n\t" -" psrlq $32, %%mm3\n\t" -" psrlq $32, %%mm7\n\t" -" movd %%mm3, 56(%%"REG_c")\n\t" -" movd %%mm7, 48(%%"REG_c")\n\t" - -" movq 16(%%"REG_d"), %%mm0\n\t" -" movq 24(%%"REG_d"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" movq 40(%%"REG_d"), %%mm1\n\t" -" movq 32(%%"REG_d"), %%mm5\n\t" -/* n.b.: pswapd*/ -" movq %%mm1, %%mm2\n\t" -" movq %%mm5, %%mm6\n\t" -" psrlq $32, %%mm1\n\t" -" psrlq $32, %%mm5\n\t" -" punpckldq %%mm2, %%mm1\n\t" -" punpckldq %%mm6, %%mm5\n\t" -/**/ -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, 16(%%"REG_c")\n\t" -" movq %%mm4, 24(%%"REG_c")\n\t" -" pfsub %%mm1, %%mm3\n\t" -" pfsub %%mm5, %%mm7\n\t" -" pfmul 80(%%"REG_b"), %%mm3\n\t" -" pfmul 88(%%"REG_b"), %%mm7\n\t" -" movd %%mm3, 44(%%"REG_c")\n\t" -" movd %%mm7, 
36(%%"REG_c")\n\t" -" psrlq $32, %%mm3\n\t" -" psrlq $32, %%mm7\n\t" -" movd %%mm3, 40(%%"REG_c")\n\t" -" movd %%mm7, 32(%%"REG_c")\n\t" - -/* Phase 3*/ - -" movq 64(%%"REG_d"), %%mm0\n\t" -" movq 72(%%"REG_d"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" movq 120(%%"REG_d"), %%mm1\n\t" -" movq 112(%%"REG_d"), %%mm5\n\t" -/* n.b.: pswapd*/ -" movq %%mm1, %%mm2\n\t" -" movq %%mm5, %%mm6\n\t" -" psrlq $32, %%mm1\n\t" -" psrlq $32, %%mm5\n\t" -" punpckldq %%mm2, %%mm1\n\t" -" punpckldq %%mm6, %%mm5\n\t" -/**/ -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, 64(%%"REG_c")\n\t" -" movq %%mm4, 72(%%"REG_c")\n\t" -" pfsubr %%mm1, %%mm3\n\t" -" pfsubr %%mm5, %%mm7\n\t" -" pfmul 64(%%"REG_b"), %%mm3\n\t" -" pfmul 72(%%"REG_b"), %%mm7\n\t" -" movd %%mm3, 124(%%"REG_c")\n\t" -" movd %%mm7, 116(%%"REG_c")\n\t" -" psrlq $32, %%mm3\n\t" -" psrlq $32, %%mm7\n\t" -" movd %%mm3, 120(%%"REG_c")\n\t" -" movd %%mm7, 112(%%"REG_c")\n\t" - -" movq 80(%%"REG_d"), %%mm0\n\t" -" movq 88(%%"REG_d"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" movq 104(%%"REG_d"), %%mm1\n\t" -" movq 96(%%"REG_d"), %%mm5\n\t" -/* n.b.: pswapd*/ -" movq %%mm1, %%mm2\n\t" -" movq %%mm5, %%mm6\n\t" -" psrlq $32, %%mm1\n\t" -" psrlq $32, %%mm5\n\t" -" punpckldq %%mm2, %%mm1\n\t" -" punpckldq %%mm6, %%mm5\n\t" -/**/ -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, 80(%%"REG_c")\n\t" -" movq %%mm4, 88(%%"REG_c")\n\t" -" pfsubr %%mm1, %%mm3\n\t" -" pfsubr %%mm5, %%mm7\n\t" -" pfmul 80(%%"REG_b"), %%mm3\n\t" -" pfmul 88(%%"REG_b"), %%mm7\n\t" -" movd %%mm3, 108(%%"REG_c")\n\t" -" movd %%mm7, 100(%%"REG_c")\n\t" -" psrlq $32, %%mm3\n\t" -" psrlq $32, %%mm7\n\t" -" movd %%mm3, 104(%%"REG_c")\n\t" -" movd %%mm7, 96(%%"REG_c")\n\t" - -/* Phase 4*/ - -" movq (%%"REG_c"), %%mm0\n\t" -" movq 8(%%"REG_c"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" movq 24(%%"REG_c"), %%mm1\n\t" -" movq 16(%%"REG_c"), %%mm5\n\t" -/* 
n.b.: pswapd*/ -" movq %%mm1, %%mm2\n\t" -" movq %%mm5, %%mm6\n\t" -" psrlq $32, %%mm1\n\t" -" psrlq $32, %%mm5\n\t" -" punpckldq %%mm2, %%mm1\n\t" -" punpckldq %%mm6, %%mm5\n\t" -/**/ -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, (%%"REG_d")\n\t" -" movq %%mm4, 8(%%"REG_d")\n\t" -" pfsub %%mm1, %%mm3\n\t" -" pfsub %%mm5, %%mm7\n\t" -" pfmul 96(%%"REG_b"), %%mm3\n\t" -" pfmul 104(%%"REG_b"), %%mm7\n\t" -" movd %%mm3, 28(%%"REG_d")\n\t" -" movd %%mm7, 20(%%"REG_d")\n\t" -" psrlq $32, %%mm3\n\t" -" psrlq $32, %%mm7\n\t" -" movd %%mm3, 24(%%"REG_d")\n\t" -" movd %%mm7, 16(%%"REG_d")\n\t" - -" movq 32(%%"REG_c"), %%mm0\n\t" -" movq 40(%%"REG_c"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" movq 56(%%"REG_c"), %%mm1\n\t" -" movq 48(%%"REG_c"), %%mm5\n\t" -/* n.b.: pswapd*/ -" movq %%mm1, %%mm2\n\t" -" movq %%mm5, %%mm6\n\t" -" psrlq $32, %%mm1\n\t" -" psrlq $32, %%mm5\n\t" -" punpckldq %%mm2, %%mm1\n\t" -" punpckldq %%mm6, %%mm5\n\t" -/**/ -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, 32(%%"REG_d")\n\t" -" movq %%mm4, 40(%%"REG_d")\n\t" -" pfsubr %%mm1, %%mm3\n\t" -" pfsubr %%mm5, %%mm7\n\t" -" pfmul 96(%%"REG_b"), %%mm3\n\t" -" pfmul 104(%%"REG_b"), %%mm7\n\t" -" movd %%mm3, 60(%%"REG_d")\n\t" -" movd %%mm7, 52(%%"REG_d")\n\t" -" psrlq $32, %%mm3\n\t" -" psrlq $32, %%mm7\n\t" -" movd %%mm3, 56(%%"REG_d")\n\t" -" movd %%mm7, 48(%%"REG_d")\n\t" - -" movq 64(%%"REG_c"), %%mm0\n\t" -" movq 72(%%"REG_c"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" movq 88(%%"REG_c"), %%mm1\n\t" -" movq 80(%%"REG_c"), %%mm5\n\t" -/* n.b.: pswapd*/ -" movq %%mm1, %%mm2\n\t" -" movq %%mm5, %%mm6\n\t" -" psrlq $32, %%mm1\n\t" -" psrlq $32, %%mm5\n\t" -" punpckldq %%mm2, %%mm1\n\t" -" punpckldq %%mm6, %%mm5\n\t" -/**/ -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, 64(%%"REG_d")\n\t" -" movq %%mm4, 72(%%"REG_d")\n\t" -" pfsub %%mm1, %%mm3\n\t" -" pfsub %%mm5, %%mm7\n\t" -" pfmul 
96(%%"REG_b"), %%mm3\n\t" -" pfmul 104(%%"REG_b"), %%mm7\n\t" -" movd %%mm3, 92(%%"REG_d")\n\t" -" movd %%mm7, 84(%%"REG_d")\n\t" -" psrlq $32, %%mm3\n\t" -" psrlq $32, %%mm7\n\t" -" movd %%mm3, 88(%%"REG_d")\n\t" -" movd %%mm7, 80(%%"REG_d")\n\t" - -" movq 96(%%"REG_c"), %%mm0\n\t" -" movq 104(%%"REG_c"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" movq 120(%%"REG_c"), %%mm1\n\t" -" movq 112(%%"REG_c"), %%mm5\n\t" -/* n.b.: pswapd*/ -" movq %%mm1, %%mm2\n\t" -" movq %%mm5, %%mm6\n\t" -" psrlq $32, %%mm1\n\t" -" psrlq $32, %%mm5\n\t" -" punpckldq %%mm2, %%mm1\n\t" -" punpckldq %%mm6, %%mm5\n\t" -/**/ -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, 96(%%"REG_d")\n\t" -" movq %%mm4, 104(%%"REG_d")\n\t" -" pfsubr %%mm1, %%mm3\n\t" -" pfsubr %%mm5, %%mm7\n\t" -" pfmul 96(%%"REG_b"), %%mm3\n\t" -" pfmul 104(%%"REG_b"), %%mm7\n\t" -" movd %%mm3, 124(%%"REG_d")\n\t" -" movd %%mm7, 116(%%"REG_d")\n\t" -" psrlq $32, %%mm3\n\t" -" psrlq $32, %%mm7\n\t" -" movd %%mm3, 120(%%"REG_d")\n\t" -" movd %%mm7, 112(%%"REG_d")\n\t" - -/* Phase 5 */ - -" movq (%%"REG_d"), %%mm0\n\t" -" movq 16(%%"REG_d"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" movq 8(%%"REG_d"), %%mm1\n\t" -" movq 24(%%"REG_d"), %%mm5\n\t" -/* n.b.: pswapd*/ -" movq %%mm1, %%mm2\n\t" -" movq %%mm5, %%mm6\n\t" -" psrlq $32, %%mm1\n\t" -" psrlq $32, %%mm5\n\t" -" punpckldq %%mm2, %%mm1\n\t" -" punpckldq %%mm6, %%mm5\n\t" -/**/ -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, (%%"REG_c")\n\t" -" movq %%mm4, 16(%%"REG_c")\n\t" -" pfsub %%mm1, %%mm3\n\t" -" pfsubr %%mm5, %%mm7\n\t" -" pfmul 112(%%"REG_b"), %%mm3\n\t" -" pfmul 112(%%"REG_b"), %%mm7\n\t" -" movd %%mm3, 12(%%"REG_c")\n\t" -" movd %%mm7, 28(%%"REG_c")\n\t" -" psrlq $32, %%mm3\n\t" -" psrlq $32, %%mm7\n\t" -" movd %%mm3, 8(%%"REG_c")\n\t" -" movd %%mm7, 24(%%"REG_c")\n\t" - -" movq 32(%%"REG_d"), %%mm0\n\t" -" movq 48(%%"REG_d"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq 
%%mm4, %%mm7\n\t" -" movq 40(%%"REG_d"), %%mm1\n\t" -" movq 56(%%"REG_d"), %%mm5\n\t" -/* n.b.: pswapd*/ -" movq %%mm1, %%mm2\n\t" -" movq %%mm5, %%mm6\n\t" -" psrlq $32, %%mm1\n\t" -" psrlq $32, %%mm5\n\t" -" punpckldq %%mm2, %%mm1\n\t" -" punpckldq %%mm6, %%mm5\n\t" -/**/ -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, 32(%%"REG_c")\n\t" -" movq %%mm4, 48(%%"REG_c")\n\t" -" pfsub %%mm1, %%mm3\n\t" -" pfsubr %%mm5, %%mm7\n\t" -" pfmul 112(%%"REG_b"), %%mm3\n\t" -" pfmul 112(%%"REG_b"), %%mm7\n\t" -" movd %%mm3, 44(%%"REG_c")\n\t" -" movd %%mm7, 60(%%"REG_c")\n\t" -" psrlq $32, %%mm3\n\t" -" psrlq $32, %%mm7\n\t" -" movd %%mm3, 40(%%"REG_c")\n\t" -" movd %%mm7, 56(%%"REG_c")\n\t" - -" movq 64(%%"REG_d"), %%mm0\n\t" -" movq 80(%%"REG_d"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" movq 72(%%"REG_d"), %%mm1\n\t" -" movq 88(%%"REG_d"), %%mm5\n\t" -/* n.b.: pswapd*/ -" movq %%mm1, %%mm2\n\t" -" movq %%mm5, %%mm6\n\t" -" psrlq $32, %%mm1\n\t" -" psrlq $32, %%mm5\n\t" -" punpckldq %%mm2, %%mm1\n\t" -" punpckldq %%mm6, %%mm5\n\t" -/**/ -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, 64(%%"REG_c")\n\t" -" movq %%mm4, 80(%%"REG_c")\n\t" -" pfsub %%mm1, %%mm3\n\t" -" pfsubr %%mm5, %%mm7\n\t" -" pfmul 112(%%"REG_b"), %%mm3\n\t" -" pfmul 112(%%"REG_b"), %%mm7\n\t" -" movd %%mm3, 76(%%"REG_c")\n\t" -" movd %%mm7, 92(%%"REG_c")\n\t" -" psrlq $32, %%mm3\n\t" -" psrlq $32, %%mm7\n\t" -" movd %%mm3, 72(%%"REG_c")\n\t" -" movd %%mm7, 88(%%"REG_c")\n\t" - -" movq 96(%%"REG_d"), %%mm0\n\t" -" movq 112(%%"REG_d"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" movq 104(%%"REG_d"), %%mm1\n\t" -" movq 120(%%"REG_d"), %%mm5\n\t" -/* n.b.: pswapd*/ -" movq %%mm1, %%mm2\n\t" -" movq %%mm5, %%mm6\n\t" -" psrlq $32, %%mm1\n\t" -" psrlq $32, %%mm5\n\t" -" punpckldq %%mm2, %%mm1\n\t" -" punpckldq %%mm6, %%mm5\n\t" -/**/ -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, 96(%%"REG_c")\n\t" -" 
movq %%mm4, 112(%%"REG_c")\n\t" -" pfsub %%mm1, %%mm3\n\t" -" pfsubr %%mm5, %%mm7\n\t" -" pfmul 112(%%"REG_b"), %%mm3\n\t" -" pfmul 112(%%"REG_b"), %%mm7\n\t" -" movd %%mm3, 108(%%"REG_c")\n\t" -" movd %%mm7, 124(%%"REG_c")\n\t" -" psrlq $32, %%mm3\n\t" -" psrlq $32, %%mm7\n\t" -" movd %%mm3, 104(%%"REG_c")\n\t" -" movd %%mm7, 120(%%"REG_c")\n\t" - -/* Phase 6. This is the end of easy road. */ -/* Code below is coded in scalar mode. Should be optimized */ - -" movd "MANGLE(plus_1f)", %%mm6\n\t" -" punpckldq 120(%%"REG_b"), %%mm6\n\t" /* mm6 = 1.0 | 120(%%"REG_b")*/ -" movq "MANGLE(x_plus_minus_3dnow)", %%mm7\n\t" /* mm7 = +1 | -1 */ - -" movq 32(%%"REG_c"), %%mm0\n\t" -" movq 64(%%"REG_c"), %%mm2\n\t" -" movq %%mm0, %%mm1\n\t" -" movq %%mm2, %%mm3\n\t" -" pxor %%mm7, %%mm1\n\t" -" pxor %%mm7, %%mm3\n\t" -" pfacc %%mm1, %%mm0\n\t" -" pfacc %%mm3, %%mm2\n\t" -" pfmul %%mm6, %%mm0\n\t" -" pfmul %%mm6, %%mm2\n\t" -" movq %%mm0, 32(%%"REG_d")\n\t" -" movq %%mm2, 64(%%"REG_d")\n\t" - -" movd 44(%%"REG_c"), %%mm0\n\t" -" movd 40(%%"REG_c"), %%mm2\n\t" -" movd 120(%%"REG_b"), %%mm3\n\t" -" punpckldq 76(%%"REG_c"), %%mm0\n\t" -" punpckldq 72(%%"REG_c"), %%mm2\n\t" -" punpckldq %%mm3, %%mm3\n\t" -" movq %%mm0, %%mm4\n\t" -" movq %%mm2, %%mm5\n\t" -" pfsub %%mm2, %%mm0\n\t" -" pfmul %%mm3, %%mm0\n\t" -" movq %%mm0, %%mm1\n\t" -" pfadd %%mm5, %%mm0\n\t" -" pfadd %%mm4, %%mm0\n\t" -" movq %%mm0, %%mm2\n\t" -" punpckldq %%mm1, %%mm0\n\t" -" punpckhdq %%mm1, %%mm2\n\t" -" movq %%mm0, 40(%%"REG_d")\n\t" -" movq %%mm2, 72(%%"REG_d")\n\t" - -" movd 48(%%"REG_c"), %%mm3\n\t" -" movd 60(%%"REG_c"), %%mm2\n\t" -" pfsub 52(%%"REG_c"), %%mm3\n\t" -" pfsub 56(%%"REG_c"), %%mm2\n\t" -" pfmul 120(%%"REG_b"), %%mm3\n\t" -" pfmul 120(%%"REG_b"), %%mm2\n\t" -" movq %%mm2, %%mm1\n\t" - -" pfadd 56(%%"REG_c"), %%mm1\n\t" -" pfadd 60(%%"REG_c"), %%mm1\n\t" -" movq %%mm1, %%mm0\n\t" - -" pfadd 48(%%"REG_c"), %%mm0\n\t" -" pfadd 52(%%"REG_c"), %%mm0\n\t" -" pfadd %%mm3, %%mm1\n\t" -" punpckldq 
%%mm2, %%mm1\n\t" -" pfadd %%mm3, %%mm2\n\t" -" punpckldq %%mm2, %%mm0\n\t" -" movq %%mm1, 56(%%"REG_d")\n\t" -" movq %%mm0, 48(%%"REG_d")\n\t" - -/*---*/ - -" movd 92(%%"REG_c"), %%mm1\n\t" -" pfsub 88(%%"REG_c"), %%mm1\n\t" -" pfmul 120(%%"REG_b"), %%mm1\n\t" -" movd %%mm1, 92(%%"REG_d")\n\t" -" pfadd 92(%%"REG_c"), %%mm1\n\t" -" pfadd 88(%%"REG_c"), %%mm1\n\t" -" movq %%mm1, %%mm0\n\t" - -" pfadd 80(%%"REG_c"), %%mm0\n\t" -" pfadd 84(%%"REG_c"), %%mm0\n\t" -" movd %%mm0, 80(%%"REG_d")\n\t" - -" movd 80(%%"REG_c"), %%mm0\n\t" -" pfsub 84(%%"REG_c"), %%mm0\n\t" -" pfmul 120(%%"REG_b"), %%mm0\n\t" -" pfadd %%mm0, %%mm1\n\t" -" pfadd 92(%%"REG_d"), %%mm0\n\t" -" punpckldq %%mm1, %%mm0\n\t" -" movq %%mm0, 84(%%"REG_d")\n\t" - -" movq 96(%%"REG_c"), %%mm0\n\t" -" movq %%mm0, %%mm1\n\t" -" pxor %%mm7, %%mm1\n\t" -" pfacc %%mm1, %%mm0\n\t" -" pfmul %%mm6, %%mm0\n\t" -" movq %%mm0, 96(%%"REG_d")\n\t" - -" movd 108(%%"REG_c"), %%mm0\n\t" -" pfsub 104(%%"REG_c"), %%mm0\n\t" -" pfmul 120(%%"REG_b"), %%mm0\n\t" -" movd %%mm0, 108(%%"REG_d")\n\t" -" pfadd 104(%%"REG_c"), %%mm0\n\t" -" pfadd 108(%%"REG_c"), %%mm0\n\t" -" movd %%mm0, 104(%%"REG_d")\n\t" - -" movd 124(%%"REG_c"), %%mm1\n\t" -" pfsub 120(%%"REG_c"), %%mm1\n\t" -" pfmul 120(%%"REG_b"), %%mm1\n\t" -" movd %%mm1, 124(%%"REG_d")\n\t" -" pfadd 120(%%"REG_c"), %%mm1\n\t" -" pfadd 124(%%"REG_c"), %%mm1\n\t" -" movq %%mm1, %%mm0\n\t" - -" pfadd 112(%%"REG_c"), %%mm0\n\t" -" pfadd 116(%%"REG_c"), %%mm0\n\t" -" movd %%mm0, 112(%%"REG_d")\n\t" - -" movd 112(%%"REG_c"), %%mm0\n\t" -" pfsub 116(%%"REG_c"), %%mm0\n\t" -" pfmul 120(%%"REG_b"), %%mm0\n\t" -" pfadd %%mm0,%%mm1\n\t" -" pfadd 124(%%"REG_d"), %%mm0\n\t" -" punpckldq %%mm1, %%mm0\n\t" -" movq %%mm0, 116(%%"REG_d")\n\t" - -// this code is broken, there is nothing modifying the z flag above. -#if 0 -" jnz .L01\n\t" - -/* Phase 7*/ -/* Code below is coded in scalar mode. 
Should be optimized */ - -" movd (%%"REG_c"), %%mm0\n\t" -" pfadd 4(%%"REG_c"), %%mm0\n\t" -" movd %%mm0, 1024(%%"REG_S")\n\t" - -" movd (%%"REG_c"), %%mm0\n\t" -" pfsub 4(%%"REG_c"), %%mm0\n\t" -" pfmul 120(%%"REG_b"), %%mm0\n\t" -" movd %%mm0, (%%"REG_S")\n\t" -" movd %%mm0, (%%"REG_D")\n\t" - -" movd 12(%%"REG_c"), %%mm0\n\t" -" pfsub 8(%%"REG_c"), %%mm0\n\t" -" pfmul 120(%%"REG_b"), %%mm0\n\t" -" movd %%mm0, 512(%%"REG_D")\n\t" -" pfadd 12(%%"REG_c"), %%mm0\n\t" -" pfadd 8(%%"REG_c"), %%mm0\n\t" -" movd %%mm0, 512(%%"REG_S")\n\t" - -" movd 16(%%"REG_c"), %%mm0\n\t" -" pfsub 20(%%"REG_c"), %%mm0\n\t" -" pfmul 120(%%"REG_b"), %%mm0\n\t" -" movq %%mm0, %%mm3\n\t" - -" movd 28(%%"REG_c"), %%mm0\n\t" -" pfsub 24(%%"REG_c"), %%mm0\n\t" -" pfmul 120(%%"REG_b"), %%mm0\n\t" -" movd %%mm0, 768(%%"REG_D")\n\t" -" movq %%mm0, %%mm2\n\t" - -" pfadd 24(%%"REG_c"), %%mm0\n\t" -" pfadd 28(%%"REG_c"), %%mm0\n\t" -" movq %%mm0, %%mm1\n\t" - -" pfadd 16(%%"REG_c"), %%mm0\n\t" -" pfadd 20(%%"REG_c"), %%mm0\n\t" -" movd %%mm0, 768(%%"REG_S")\n\t" -" pfadd %%mm3, %%mm1\n\t" -" movd %%mm1, 256(%%"REG_S")\n\t" -" pfadd %%mm3, %%mm2\n\t" -" movd %%mm2, 256(%%"REG_D")\n\t" - -/* Phase 8*/ - -" movq 32(%%"REG_d"), %%mm0\n\t" -" movq 48(%%"REG_d"), %%mm1\n\t" -" pfadd 48(%%"REG_d"), %%mm0\n\t" -" pfadd 40(%%"REG_d"), %%mm1\n\t" -" movd %%mm0, 896(%%"REG_S")\n\t" -" movd %%mm1, 640(%%"REG_S")\n\t" -" psrlq $32, %%mm0\n\t" -" psrlq $32, %%mm1\n\t" -" movd %%mm0, 128(%%"REG_D")\n\t" -" movd %%mm1, 384(%%"REG_D")\n\t" - -" movd 40(%%"REG_d"), %%mm0\n\t" -" pfadd 56(%%"REG_d"), %%mm0\n\t" -" movd %%mm0, 384(%%"REG_S")\n\t" - -" movd 56(%%"REG_d"), %%mm0\n\t" -" pfadd 36(%%"REG_d"), %%mm0\n\t" -" movd %%mm0, 128(%%"REG_S")\n\t" - -" movd 60(%%"REG_d"), %%mm0\n\t" -" movd %%mm0, 896(%%"REG_D")\n\t" -" pfadd 44(%%"REG_d"), %%mm0\n\t" -" movd %%mm0, 640(%%"REG_D")\n\t" - -" movq 96(%%"REG_d"), %%mm0\n\t" -" movq 112(%%"REG_d"), %%mm2\n\t" -" movq 104(%%"REG_d"), %%mm4\n\t" -" pfadd 112(%%"REG_d"), 
%%mm0\n\t" -" pfadd 104(%%"REG_d"), %%mm2\n\t" -" pfadd 120(%%"REG_d"), %%mm4\n\t" -" movq %%mm0, %%mm1\n\t" -" movq %%mm2, %%mm3\n\t" -" movq %%mm4, %%mm5\n\t" -" pfadd 64(%%"REG_d"), %%mm0\n\t" -" pfadd 80(%%"REG_d"), %%mm2\n\t" -" pfadd 72(%%"REG_d"), %%mm4\n\t" -" movd %%mm0, 960(%%"REG_S")\n\t" -" movd %%mm2, 704(%%"REG_S")\n\t" -" movd %%mm4, 448(%%"REG_S")\n\t" -" psrlq $32, %%mm0\n\t" -" psrlq $32, %%mm2\n\t" -" psrlq $32, %%mm4\n\t" -" movd %%mm0, 64(%%"REG_D")\n\t" -" movd %%mm2, 320(%%"REG_D")\n\t" -" movd %%mm4, 576(%%"REG_D")\n\t" -" pfadd 80(%%"REG_d"), %%mm1\n\t" -" pfadd 72(%%"REG_d"), %%mm3\n\t" -" pfadd 88(%%"REG_d"), %%mm5\n\t" -" movd %%mm1, 832(%%"REG_S")\n\t" -" movd %%mm3, 576(%%"REG_S")\n\t" -" movd %%mm5, 320(%%"REG_S")\n\t" -" psrlq $32, %%mm1\n\t" -" psrlq $32, %%mm3\n\t" -" psrlq $32, %%mm5\n\t" -" movd %%mm1, 192(%%"REG_D")\n\t" -" movd %%mm3, 448(%%"REG_D")\n\t" -" movd %%mm5, 704(%%"REG_D")\n\t" - -" movd 120(%%"REG_d"), %%mm0\n\t" -" pfadd 100(%%"REG_d"), %%mm0\n\t" -" movq %%mm0, %%mm1\n\t" -" pfadd 88(%%"REG_d"), %%mm0\n\t" -" movd %%mm0, 192(%%"REG_S")\n\t" -" pfadd 68(%%"REG_d"), %%mm1\n\t" -" movd %%mm1, 64(%%"REG_S")\n\t" - -" movd 124(%%"REG_d"), %%mm0\n\t" -" movd %%mm0, 960(%%"REG_D")\n\t" -" pfadd 92(%%"REG_d"), %%mm0\n\t" -" movd %%mm0, 832(%%"REG_D")\n\t" - -" jmp .L_bye\n\t" -".L01:\n\t" -#endif -/* Phase 9*/ - -" movq (%%"REG_c"), %%mm0\n\t" -" movq %%mm0, %%mm1\n\t" -" pxor %%mm7, %%mm1\n\t" -" pfacc %%mm1, %%mm0\n\t" -" pfmul %%mm6, %%mm0\n\t" -" pf2id %%mm0, %%mm0\n\t" -" packssdw %%mm0, %%mm0\n\t" -" movd %%mm0, %%"REG_a"\n\t" -" movw %%ax, 512(%%"REG_S")\n\t" -" shr $16, %%"REG_a"\n\t" -" movw %%ax, (%%"REG_S")\n\t" - -" movd 12(%%"REG_c"), %%mm0\n\t" -" pfsub 8(%%"REG_c"), %%mm0\n\t" -" pfmul 120(%%"REG_b"), %%mm0\n\t" -" pf2id %%mm0, %%mm7\n\t" -" packssdw %%mm7, %%mm7\n\t" -" movd %%mm7, %%"REG_a"\n\t" -" movw %%ax, 256(%%"REG_D")\n\t" -" pfadd 12(%%"REG_c"), %%mm0\n\t" -" pfadd 8(%%"REG_c"), %%mm0\n\t" -" pf2id 
%%mm0, %%mm0\n\t" -" packssdw %%mm0, %%mm0\n\t" -" movd %%mm0, %%"REG_a"\n\t" -" movw %%ax, 256(%%"REG_S")\n\t" - -" movd 16(%%"REG_c"), %%mm3\n\t" -" pfsub 20(%%"REG_c"), %%mm3\n\t" -" pfmul 120(%%"REG_b"), %%mm3\n\t" -" movq %%mm3, %%mm2\n\t" - -" movd 28(%%"REG_c"), %%mm2\n\t" -" pfsub 24(%%"REG_c"), %%mm2\n\t" -" pfmul 120(%%"REG_b"), %%mm2\n\t" -" movq %%mm2, %%mm1\n\t" - -" pf2id %%mm2, %%mm7\n\t" -" packssdw %%mm7, %%mm7\n\t" -" movd %%mm7, %%"REG_a"\n\t" -" movw %%ax, 384(%%"REG_D")\n\t" - -" pfadd 24(%%"REG_c"), %%mm1\n\t" -" pfadd 28(%%"REG_c"), %%mm1\n\t" -" movq %%mm1, %%mm0\n\t" - -" pfadd 16(%%"REG_c"), %%mm0\n\t" -" pfadd 20(%%"REG_c"), %%mm0\n\t" -" pf2id %%mm0, %%mm0\n\t" -" packssdw %%mm0, %%mm0\n\t" -" movd %%mm0, %%"REG_a"\n\t" -" movw %%ax, 384(%%"REG_S")\n\t" -" pfadd %%mm3, %%mm1\n\t" -" pf2id %%mm1, %%mm1\n\t" -" packssdw %%mm1, %%mm1\n\t" -" movd %%mm1, %%"REG_a"\n\t" -" movw %%ax, 128(%%"REG_S")\n\t" -" pfadd %%mm3, %%mm2\n\t" -" pf2id %%mm2, %%mm2\n\t" -" packssdw %%mm2, %%mm2\n\t" -" movd %%mm2, %%"REG_a"\n\t" -" movw %%ax, 128(%%"REG_D")\n\t" - -/* Phase 10*/ - -" movq 32(%%"REG_d"), %%mm0\n\t" -" movq 48(%%"REG_d"), %%mm1\n\t" -" pfadd 48(%%"REG_d"), %%mm0\n\t" -" pfadd 40(%%"REG_d"), %%mm1\n\t" -" pf2id %%mm0, %%mm0\n\t" -" pf2id %%mm1, %%mm1\n\t" -" packssdw %%mm0, %%mm0\n\t" -" packssdw %%mm1, %%mm1\n\t" -" movd %%mm0, %%"REG_a"\n\t" -" movd %%mm1, %%"REG_c"\n\t" -" movw %%ax, 448(%%"REG_S")\n\t" -" movw %%cx, 320(%%"REG_S")\n\t" -" shr $16, %%"REG_a"\n\t" -" shr $16, %%"REG_c"\n\t" -" movw %%ax, 64(%%"REG_D")\n\t" -" movw %%cx, 192(%%"REG_D")\n\t" - -" movd 40(%%"REG_d"), %%mm3\n\t" -" movd 56(%%"REG_d"), %%mm4\n\t" -" movd 60(%%"REG_d"), %%mm0\n\t" -" movd 44(%%"REG_d"), %%mm2\n\t" -" movd 120(%%"REG_d"), %%mm5\n\t" -" punpckldq %%mm4, %%mm3\n\t" -" punpckldq 124(%%"REG_d"), %%mm0\n\t" -" pfadd 100(%%"REG_d"), %%mm5\n\t" -" punpckldq 36(%%"REG_d"), %%mm4\n\t" -" punpckldq 92(%%"REG_d"), %%mm2\n\t" -" movq %%mm5, %%mm6\n\t" -" 
pfadd %%mm4, %%mm3\n\t" -" pf2id %%mm0, %%mm1\n\t" -" pf2id %%mm3, %%mm3\n\t" -" packssdw %%mm1, %%mm1\n\t" -" packssdw %%mm3, %%mm3\n\t" -" pfadd 88(%%"REG_d"), %%mm5\n\t" -" movd %%mm1, %%"REG_a"\n\t" -" movd %%mm3, %%"REG_c"\n\t" -" movw %%ax, 448(%%"REG_D")\n\t" -" movw %%cx, 192(%%"REG_S")\n\t" -" pf2id %%mm5, %%mm5\n\t" -" packssdw %%mm5, %%mm5\n\t" -" shr $16, %%"REG_a"\n\t" -" shr $16, %%"REG_c"\n\t" -" movd %%mm5, %%"REG_b"\n\t" -" movw %%bx, 96(%%"REG_S")\n\t" -" movw %%ax, 480(%%"REG_D")\n\t" -" movw %%cx, 64(%%"REG_S")\n\t" -" pfadd %%mm2, %%mm0\n\t" -" pf2id %%mm0, %%mm0\n\t" -" packssdw %%mm0, %%mm0\n\t" -" movd %%mm0, %%"REG_a"\n\t" -" pfadd 68(%%"REG_d"), %%mm6\n\t" -" movw %%ax, 320(%%"REG_D")\n\t" -" shr $16, %%"REG_a"\n\t" -" pf2id %%mm6, %%mm6\n\t" -" packssdw %%mm6, %%mm6\n\t" -" movd %%mm6, %%"REG_b"\n\t" -" movw %%ax, 416(%%"REG_D")\n\t" -" movw %%bx, 32(%%"REG_S")\n\t" - -" movq 96(%%"REG_d"), %%mm0\n\t" -" movq 112(%%"REG_d"), %%mm2\n\t" -" movq 104(%%"REG_d"), %%mm4\n\t" -" pfadd %%mm2, %%mm0\n\t" -" pfadd %%mm4, %%mm2\n\t" -" pfadd 120(%%"REG_d"), %%mm4\n\t" -" movq %%mm0, %%mm1\n\t" -" movq %%mm2, %%mm3\n\t" -" movq %%mm4, %%mm5\n\t" -" pfadd 64(%%"REG_d"), %%mm0\n\t" -" pfadd 80(%%"REG_d"), %%mm2\n\t" -" pfadd 72(%%"REG_d"), %%mm4\n\t" -" pf2id %%mm0, %%mm0\n\t" -" pf2id %%mm2, %%mm2\n\t" -" pf2id %%mm4, %%mm4\n\t" -" packssdw %%mm0, %%mm0\n\t" -" packssdw %%mm2, %%mm2\n\t" -" packssdw %%mm4, %%mm4\n\t" -" movd %%mm0, %%"REG_a"\n\t" -" movd %%mm2, %%"REG_c"\n\t" -" movd %%mm4, %%"REG_b"\n\t" -" movw %%ax, 480(%%"REG_S")\n\t" -" movw %%cx, 352(%%"REG_S")\n\t" -" movw %%bx, 224(%%"REG_S")\n\t" -" shr $16, %%"REG_a"\n\t" -" shr $16, %%"REG_c"\n\t" -" shr $16, %%"REG_b"\n\t" -" movw %%ax, 32(%%"REG_D")\n\t" -" movw %%cx, 160(%%"REG_D")\n\t" -" movw %%bx, 288(%%"REG_D")\n\t" -" pfadd 80(%%"REG_d"), %%mm1\n\t" -" pfadd 72(%%"REG_d"), %%mm3\n\t" -" pfadd 88(%%"REG_d"), %%mm5\n\t" -" pf2id %%mm1, %%mm1\n\t" -" pf2id %%mm3, %%mm3\n\t" -" pf2id 
%%mm5, %%mm5\n\t" -" packssdw %%mm1, %%mm1\n\t" -" packssdw %%mm3, %%mm3\n\t" -" packssdw %%mm5, %%mm5\n\t" -" movd %%mm1, %%"REG_a"\n\t" -" movd %%mm3, %%"REG_c"\n\t" -" movd %%mm5, %%"REG_b"\n\t" -" movw %%ax, 416(%%"REG_S")\n\t" -" movw %%cx, 288(%%"REG_S")\n\t" -" movw %%bx, 160(%%"REG_S")\n\t" -" shr $16, %%"REG_a"\n\t" -" shr $16, %%"REG_c"\n\t" -" shr $16, %%"REG_b"\n\t" -" movw %%ax, 96(%%"REG_D")\n\t" -" movw %%cx, 224(%%"REG_D")\n\t" -" movw %%bx, 352(%%"REG_D")\n\t" - -" movsw\n\t" - -".L_bye:\n\t" -" femms\n\t" - : - :"m"(a),"m"(b),"m"(c),"m"(tmp[0]) - :"memory","%eax","%ebx","%ecx","%edx","%esi","%edi"); -} diff -r bc0898c7399b -r b924f0df5a1d mp3lib/dct64_altivec.c --- a/mp3lib/dct64_altivec.c Sun Oct 21 11:14:13 2012 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,524 +0,0 @@ - -/* - * Discrete Cosine Tansform (DCT) for Altivec - * Copyright (c) 2004 Romain Dolbeau - * based upon code from "mp3lib/dct64.c" - * This file is free software; you can redistribute it and/or - * modify it under the terms of the GNU Lesser General Public License - */ - -#include <stdio.h> -#include "mpg123.h" - -#ifdef HAVE_ALTIVEC_H -#include <altivec.h> -#endif - -// used to build registers permutation vectors (vcprm) -// the 's' are for words in the _s_econd vector -#define WORD_0 0x00,0x01,0x02,0x03 -#define WORD_1 0x04,0x05,0x06,0x07 -#define WORD_2 0x08,0x09,0x0a,0x0b -#define WORD_3 0x0c,0x0d,0x0e,0x0f -#define WORD_s0 0x10,0x11,0x12,0x13 -#define WORD_s1 0x14,0x15,0x16,0x17 -#define WORD_s2 0x18,0x19,0x1a,0x1b -#define WORD_s3 0x1c,0x1d,0x1e,0x1f - -#define vcprm(a,b,c,d) (const vector unsigned char){WORD_ ## a, WORD_ ## b, WORD_ ## c, WORD_ ## d} -#define vcii(a,b,c,d) (const vector float){FLOAT_ ## a, FLOAT_ ## b, FLOAT_ ## c, FLOAT_ ## d} - -#define FOUROF(a) {a,a,a,a} - -// vcprmle is used to keep the same index as in the SSE version. 
-// it's the same as vcprm, with the index inversed -// ('le' is Little Endian) -#define vcprmle(a,b,c,d) vcprm(d,c,b,a) - -// used to build inverse/identity vectors (vcii) -// n is _n_egative, p is _p_ositive -#define FLOAT_n -1. -#define FLOAT_p 1. - -void dct64_altivec(real *a,real *b,real *c) -{ - real __attribute__ ((aligned(16))) b1[0x20]; - real __attribute__ ((aligned(16))) b2[0x20]; - - real *out0 = a; - real *out1 = b; - real *samples = c; - - const vector float vczero = (const vector float)FOUROF(0.); - const vector unsigned char reverse = (const vector unsigned char)vcprm(3,2,1,0); - - - if (((unsigned long)b1 & 0x0000000F) || - ((unsigned long)b2 & 0x0000000F)) - - { - printf("MISALIGNED:\t%p\t%p\t%p\t%p\t%p\n", - b1, b2, a, b, samples); - } - - -#ifdef ALTIVEC_USE_REFERENCE_C_CODE - - { - register real *costab = mp3lib_pnts[0]; - - b1[0x00] = samples[0x00] + samples[0x1F]; - b1[0x01] = samples[0x01] + samples[0x1E]; - b1[0x02] = samples[0x02] + samples[0x1D]; - b1[0x03] = samples[0x03] + samples[0x1C]; - b1[0x04] = samples[0x04] + samples[0x1B]; - b1[0x05] = samples[0x05] + samples[0x1A]; - b1[0x06] = samples[0x06] + samples[0x19]; - b1[0x07] = samples[0x07] + samples[0x18]; - b1[0x08] = samples[0x08] + samples[0x17]; - b1[0x09] = samples[0x09] + samples[0x16]; - b1[0x0A] = samples[0x0A] + samples[0x15]; - b1[0x0B] = samples[0x0B] + samples[0x14]; - b1[0x0C] = samples[0x0C] + samples[0x13]; - b1[0x0D] = samples[0x0D] + samples[0x12]; - b1[0x0E] = samples[0x0E] + samples[0x11]; - b1[0x0F] = samples[0x0F] + samples[0x10]; - b1[0x10] = (samples[0x0F] - samples[0x10]) * costab[0xF]; - b1[0x11] = (samples[0x0E] - samples[0x11]) * costab[0xE]; - b1[0x12] = (samples[0x0D] - samples[0x12]) * costab[0xD]; - b1[0x13] = (samples[0x0C] - samples[0x13]) * costab[0xC]; - b1[0x14] = (samples[0x0B] - samples[0x14]) * costab[0xB]; - b1[0x15] = (samples[0x0A] - samples[0x15]) * costab[0xA]; - b1[0x16] = (samples[0x09] - samples[0x16]) * costab[0x9]; - b1[0x17] = 
(samples[0x08] - samples[0x17]) * costab[0x8]; - b1[0x18] = (samples[0x07] - samples[0x18]) * costab[0x7]; - b1[0x19] = (samples[0x06] - samples[0x19]) * costab[0x6]; - b1[0x1A] = (samples[0x05] - samples[0x1A]) * costab[0x5]; - b1[0x1B] = (samples[0x04] - samples[0x1B]) * costab[0x4]; - b1[0x1C] = (samples[0x03] - samples[0x1C]) * costab[0x3]; - b1[0x1D] = (samples[0x02] - samples[0x1D]) * costab[0x2]; - b1[0x1E] = (samples[0x01] - samples[0x1E]) * costab[0x1]; - b1[0x1F] = (samples[0x00] - samples[0x1F]) * costab[0x0]; - - } - { - register real *costab = mp3lib_pnts[1]; - - b2[0x00] = b1[0x00] + b1[0x0F]; - b2[0x01] = b1[0x01] + b1[0x0E]; - b2[0x02] = b1[0x02] + b1[0x0D]; - b2[0x03] = b1[0x03] + b1[0x0C]; - b2[0x04] = b1[0x04] + b1[0x0B]; - b2[0x05] = b1[0x05] + b1[0x0A]; - b2[0x06] = b1[0x06] + b1[0x09]; - b2[0x07] = b1[0x07] + b1[0x08]; - b2[0x08] = (b1[0x07] - b1[0x08]) * costab[7]; - b2[0x09] = (b1[0x06] - b1[0x09]) * costab[6]; - b2[0x0A] = (b1[0x05] - b1[0x0A]) * costab[5]; - b2[0x0B] = (b1[0x04] - b1[0x0B]) * costab[4]; - b2[0x0C] = (b1[0x03] - b1[0x0C]) * costab[3]; - b2[0x0D] = (b1[0x02] - b1[0x0D]) * costab[2]; - b2[0x0E] = (b1[0x01] - b1[0x0E]) * costab[1]; - b2[0x0F] = (b1[0x00] - b1[0x0F]) * costab[0]; - b2[0x10] = b1[0x10] + b1[0x1F]; - b2[0x11] = b1[0x11] + b1[0x1E]; - b2[0x12] = b1[0x12] + b1[0x1D]; - b2[0x13] = b1[0x13] + b1[0x1C]; - b2[0x14] = b1[0x14] + b1[0x1B]; - b2[0x15] = b1[0x15] + b1[0x1A]; - b2[0x16] = b1[0x16] + b1[0x19]; - b2[0x17] = b1[0x17] + b1[0x18]; - b2[0x18] = (b1[0x18] - b1[0x17]) * costab[7]; - b2[0x19] = (b1[0x19] - b1[0x16]) * costab[6]; - b2[0x1A] = (b1[0x1A] - b1[0x15]) * costab[5]; - b2[0x1B] = (b1[0x1B] - b1[0x14]) * costab[4]; - b2[0x1C] = (b1[0x1C] - b1[0x13]) * costab[3]; - b2[0x1D] = (b1[0x1D] - b1[0x12]) * costab[2]; - b2[0x1E] = (b1[0x1E] - b1[0x11]) * costab[1]; - b2[0x1F] = (b1[0x1F] - b1[0x10]) * costab[0]; - - } - - { - register real *costab = mp3lib_pnts[2]; - - b1[0x00] = b2[0x00] + b2[0x07]; - b1[0x01] = 
b2[0x01] + b2[0x06]; - b1[0x02] = b2[0x02] + b2[0x05]; - b1[0x03] = b2[0x03] + b2[0x04]; - b1[0x04] = (b2[0x03] - b2[0x04]) * costab[3]; - b1[0x05] = (b2[0x02] - b2[0x05]) * costab[2]; - b1[0x06] = (b2[0x01] - b2[0x06]) * costab[1]; - b1[0x07] = (b2[0x00] - b2[0x07]) * costab[0]; - b1[0x08] = b2[0x08] + b2[0x0F]; - b1[0x09] = b2[0x09] + b2[0x0E]; - b1[0x0A] = b2[0x0A] + b2[0x0D]; - b1[0x0B] = b2[0x0B] + b2[0x0C]; - b1[0x0C] = (b2[0x0C] - b2[0x0B]) * costab[3]; - b1[0x0D] = (b2[0x0D] - b2[0x0A]) * costab[2]; - b1[0x0E] = (b2[0x0E] - b2[0x09]) * costab[1]; - b1[0x0F] = (b2[0x0F] - b2[0x08]) * costab[0]; - b1[0x10] = b2[0x10] + b2[0x17]; - b1[0x11] = b2[0x11] + b2[0x16]; - b1[0x12] = b2[0x12] + b2[0x15]; - b1[0x13] = b2[0x13] + b2[0x14]; - b1[0x14] = (b2[0x13] - b2[0x14]) * costab[3]; - b1[0x15] = (b2[0x12] - b2[0x15]) * costab[2]; - b1[0x16] = (b2[0x11] - b2[0x16]) * costab[1]; - b1[0x17] = (b2[0x10] - b2[0x17]) * costab[0]; - b1[0x18] = b2[0x18] + b2[0x1F]; - b1[0x19] = b2[0x19] + b2[0x1E]; - b1[0x1A] = b2[0x1A] + b2[0x1D]; - b1[0x1B] = b2[0x1B] + b2[0x1C]; - b1[0x1C] = (b2[0x1C] - b2[0x1B]) * costab[3]; - b1[0x1D] = (b2[0x1D] - b2[0x1A]) * costab[2]; - b1[0x1E] = (b2[0x1E] - b2[0x19]) * costab[1]; - b1[0x1F] = (b2[0x1F] - b2[0x18]) * costab[0]; - } - -#else /* ALTIVEC_USE_REFERENCE_C_CODE */ - - // How does it work ? - // the first three passes are reproducted in the three block below - // all computations are done on a 4 elements vector - // 'reverse' is a special perumtation vector used to reverse - // the order of the elements inside a vector. 
- // note that all loads/stores to b1 (b2) between passes 1 and 2 (2 and 3) - // have been removed, all elements are stored inside b1vX (b2vX) - { - register vector float - b1v0, b1v1, b1v2, b1v3, - b1v4, b1v5, b1v6, b1v7; - register vector float - temp1, temp2; - - { - register real *costab = mp3lib_pnts[0]; - - register vector float - samplesv1, samplesv2, samplesv3, samplesv4, - samplesv5, samplesv6, samplesv7, samplesv8, - samplesv9; - register vector unsigned char samples_perm = vec_lvsl(0, samples); - register vector float costabv1, costabv2, costabv3, costabv4, costabv5; - register vector unsigned char costab_perm = vec_lvsl(0, costab); - - samplesv1 = vec_ld(0, samples); - samplesv2 = vec_ld(16, samples); - samplesv1 = vec_perm(samplesv1, samplesv2, samples_perm); - samplesv3 = vec_ld(32, samples); - samplesv2 = vec_perm(samplesv2, samplesv3, samples_perm); - samplesv4 = vec_ld(48, samples); - samplesv3 = vec_perm(samplesv3, samplesv4, samples_perm); - samplesv5 = vec_ld(64, samples); - samplesv4 = vec_perm(samplesv4, samplesv5, samples_perm); - samplesv6 = vec_ld(80, samples); - samplesv5 = vec_perm(samplesv5, samplesv6, samples_perm); - samplesv7 = vec_ld(96, samples); - samplesv6 = vec_perm(samplesv6, samplesv7, samples_perm); - samplesv8 = vec_ld(112, samples); - samplesv7 = vec_perm(samplesv7, samplesv8, samples_perm); - samplesv9 = vec_ld(128, samples); - samplesv8 = vec_perm(samplesv8, samplesv9, samples_perm); - - temp1 = vec_add(samplesv1, - vec_perm(samplesv8, samplesv8, reverse)); - //vec_st(temp1, 0, b1); - b1v0 = temp1; - temp1 = vec_add(samplesv2, - vec_perm(samplesv7, samplesv7, reverse)); - //vec_st(temp1, 16, b1); - b1v1 = temp1; - temp1 = vec_add(samplesv3, - vec_perm(samplesv6, samplesv6, reverse)); - //vec_st(temp1, 32, b1); - b1v2 = temp1; - temp1 = vec_add(samplesv4, - vec_perm(samplesv5, samplesv5, reverse)); - //vec_st(temp1, 48, b1); - b1v3 = temp1; - - costabv1 = vec_ld(0, costab); - costabv2 = vec_ld(16, costab); - costabv1 = 
vec_perm(costabv1, costabv2, costab_perm); - costabv3 = vec_ld(32, costab); - costabv2 = vec_perm(costabv2, costabv3, costab_perm); - costabv4 = vec_ld(48, costab); - costabv3 = vec_perm(costabv3, costabv4, costab_perm); - costabv5 = vec_ld(64, costab); - costabv4 = vec_perm(costabv4, costabv5, costab_perm); - - temp1 = vec_sub(vec_perm(samplesv4, samplesv4, reverse), - samplesv5); - temp2 = vec_madd(temp1, - vec_perm(costabv4, costabv4, reverse), - vczero); - //vec_st(temp2, 64, b1); - b1v4 = temp2; - - temp1 = vec_sub(vec_perm(samplesv3, samplesv3, reverse), - samplesv6); - temp2 = vec_madd(temp1, - vec_perm(costabv3, costabv3, reverse), - vczero); - //vec_st(temp2, 80, b1); - b1v5 = temp2; - temp1 = vec_sub(vec_perm(samplesv2, samplesv2, reverse), - samplesv7); - temp2 = vec_madd(temp1, - vec_perm(costabv2, costabv2, reverse), - vczero); - //vec_st(temp2, 96, b1); - b1v6 = temp2; - - temp1 = vec_sub(vec_perm(samplesv1, samplesv1, reverse), - samplesv8); - temp2 = vec_madd(temp1, - vec_perm(costabv1, costabv1, reverse), - vczero); - //vec_st(temp2, 112, b1); - b1v7 = temp2; - - } - - { - register vector float - b2v0, b2v1, b2v2, b2v3, - b2v4, b2v5, b2v6, b2v7; - { - register real *costab = mp3lib_pnts[1]; - register vector float costabv1r, costabv2r, costabv1, costabv2, costabv3; - register vector unsigned char costab_perm = vec_lvsl(0, costab); - - costabv1 = vec_ld(0, costab); - costabv2 = vec_ld(16, costab); - costabv1 = vec_perm(costabv1, costabv2, costab_perm); - costabv3 = vec_ld(32, costab); - costabv2 = vec_perm(costabv2, costabv3 , costab_perm); - costabv1r = vec_perm(costabv1, costabv1, reverse); - costabv2r = vec_perm(costabv2, costabv2, reverse); - - temp1 = vec_add(b1v0, vec_perm(b1v3, b1v3, reverse)); - //vec_st(temp1, 0, b2); - b2v0 = temp1; - temp1 = vec_add(b1v1, vec_perm(b1v2, b1v2, reverse)); - //vec_st(temp1, 16, b2); - b2v1 = temp1; - temp2 = vec_sub(vec_perm(b1v1, b1v1, reverse), b1v2); - temp1 = vec_madd(temp2, costabv2r, vczero); - 
//vec_st(temp1, 32, b2); - b2v2 = temp1; - temp2 = vec_sub(vec_perm(b1v0, b1v0, reverse), b1v3); - temp1 = vec_madd(temp2, costabv1r, vczero); - //vec_st(temp1, 48, b2); - b2v3 = temp1; - temp1 = vec_add(b1v4, vec_perm(b1v7, b1v7, reverse)); - //vec_st(temp1, 64, b2); - b2v4 = temp1; - temp1 = vec_add(b1v5, vec_perm(b1v6, b1v6, reverse)); - //vec_st(temp1, 80, b2); - b2v5 = temp1; - temp2 = vec_sub(b1v6, vec_perm(b1v5, b1v5, reverse)); - temp1 = vec_madd(temp2, costabv2r, vczero); - //vec_st(temp1, 96, b2); - b2v6 = temp1; - temp2 = vec_sub(b1v7, vec_perm(b1v4, b1v4, reverse)); - temp1 = vec_madd(temp2, costabv1r, vczero); - //vec_st(temp1, 112, b2); - b2v7 = temp1; - } - - { - register real *costab = mp3lib_pnts[2]; - - - vector float costabv1r, costabv1, costabv2; - vector unsigned char costab_perm = vec_lvsl(0, costab); - - costabv1 = vec_ld(0, costab); - costabv2 = vec_ld(16, costab); - costabv1 = vec_perm(costabv1, costabv2, costab_perm); - costabv1r = vec_perm(costabv1, costabv1, reverse); - - temp1 = vec_add(b2v0, vec_perm(b2v1, b2v1, reverse)); - vec_st(temp1, 0, b1); - temp2 = vec_sub(vec_perm(b2v0, b2v0, reverse), b2v1); - temp1 = vec_madd(temp2, costabv1r, vczero); - vec_st(temp1, 16, b1); - - temp1 = vec_add(b2v2, vec_perm(b2v3, b2v3, reverse)); - vec_st(temp1, 32, b1); - temp2 = vec_sub(b2v3, vec_perm(b2v2, b2v2, reverse)); - temp1 = vec_madd(temp2, costabv1r, vczero); - vec_st(temp1, 48, b1); - - temp1 = vec_add(b2v4, vec_perm(b2v5, b2v5, reverse)); - vec_st(temp1, 64, b1); - temp2 = vec_sub(vec_perm(b2v4, b2v4, reverse), b2v5); - temp1 = vec_madd(temp2, costabv1r, vczero); - vec_st(temp1, 80, b1); - - temp1 = vec_add(b2v6, vec_perm(b2v7, b2v7, reverse)); - vec_st(temp1, 96, b1); - temp2 = vec_sub(b2v7, vec_perm(b2v6, b2v6, reverse)); - temp1 = vec_madd(temp2, costabv1r, vczero); - vec_st(temp1, 112, b1); - - } - } - } - -#endif /* ALTIVEC_USE_REFERENCE_C_CODE */ - - { - register real const cos0 = mp3lib_pnts[3][0]; - register real const cos1 = 
mp3lib_pnts[3][1]; - - b2[0x00] = b1[0x00] + b1[0x03]; - b2[0x01] = b1[0x01] + b1[0x02]; - b2[0x02] = (b1[0x01] - b1[0x02]) * cos1; - b2[0x03] = (b1[0x00] - b1[0x03]) * cos0; - b2[0x04] = b1[0x04] + b1[0x07]; - b2[0x05] = b1[0x05] + b1[0x06]; - b2[0x06] = (b1[0x06] - b1[0x05]) * cos1; - b2[0x07] = (b1[0x07] - b1[0x04]) * cos0; - b2[0x08] = b1[0x08] + b1[0x0B]; - b2[0x09] = b1[0x09] + b1[0x0A]; - b2[0x0A] = (b1[0x09] - b1[0x0A]) * cos1; - b2[0x0B] = (b1[0x08] - b1[0x0B]) * cos0; - b2[0x0C] = b1[0x0C] + b1[0x0F]; - b2[0x0D] = b1[0x0D] + b1[0x0E]; - b2[0x0E] = (b1[0x0E] - b1[0x0D]) * cos1; - b2[0x0F] = (b1[0x0F] - b1[0x0C]) * cos0; - b2[0x10] = b1[0x10] + b1[0x13]; - b2[0x11] = b1[0x11] + b1[0x12]; - b2[0x12] = (b1[0x11] - b1[0x12]) * cos1; - b2[0x13] = (b1[0x10] - b1[0x13]) * cos0; - b2[0x14] = b1[0x14] + b1[0x17]; - b2[0x15] = b1[0x15] + b1[0x16]; - b2[0x16] = (b1[0x16] - b1[0x15]) * cos1; - b2[0x17] = (b1[0x17] - b1[0x14]) * cos0; - b2[0x18] = b1[0x18] + b1[0x1B]; - b2[0x19] = b1[0x19] + b1[0x1A]; - b2[0x1A] = (b1[0x19] - b1[0x1A]) * cos1; - b2[0x1B] = (b1[0x18] - b1[0x1B]) * cos0; - b2[0x1C] = b1[0x1C] + b1[0x1F]; - b2[0x1D] = b1[0x1D] + b1[0x1E]; - b2[0x1E] = (b1[0x1E] - b1[0x1D]) * cos1; - b2[0x1F] = (b1[0x1F] - b1[0x1C]) * cos0; - } - - { - register real const cos0 = mp3lib_pnts[4][0]; - - b1[0x00] = b2[0x00] + b2[0x01]; - b1[0x01] = (b2[0x00] - b2[0x01]) * cos0; - b1[0x02] = b2[0x02] + b2[0x03]; - b1[0x03] = (b2[0x03] - b2[0x02]) * cos0; - b1[0x02] += b1[0x03]; - - b1[0x04] = b2[0x04] + b2[0x05]; - b1[0x05] = (b2[0x04] - b2[0x05]) * cos0; - b1[0x06] = b2[0x06] + b2[0x07]; - b1[0x07] = (b2[0x07] - b2[0x06]) * cos0; - b1[0x06] += b1[0x07]; - b1[0x04] += b1[0x06]; - b1[0x06] += b1[0x05]; - b1[0x05] += b1[0x07]; - - b1[0x08] = b2[0x08] + b2[0x09]; - b1[0x09] = (b2[0x08] - b2[0x09]) * cos0; - b1[0x0A] = b2[0x0A] + b2[0x0B]; - b1[0x0B] = (b2[0x0B] - b2[0x0A]) * cos0; - b1[0x0A] += b1[0x0B]; - - b1[0x0C] = b2[0x0C] + b2[0x0D]; - b1[0x0D] = (b2[0x0C] - b2[0x0D]) * 
cos0; - b1[0x0E] = b2[0x0E] + b2[0x0F]; - b1[0x0F] = (b2[0x0F] - b2[0x0E]) * cos0; - b1[0x0E] += b1[0x0F]; - b1[0x0C] += b1[0x0E]; - b1[0x0E] += b1[0x0D]; - b1[0x0D] += b1[0x0F]; - - b1[0x10] = b2[0x10] + b2[0x11]; - b1[0x11] = (b2[0x10] - b2[0x11]) * cos0; - b1[0x12] = b2[0x12] + b2[0x13]; - b1[0x13] = (b2[0x13] - b2[0x12]) * cos0; - b1[0x12] += b1[0x13]; - - b1[0x14] = b2[0x14] + b2[0x15]; - b1[0x15] = (b2[0x14] - b2[0x15]) * cos0; - b1[0x16] = b2[0x16] + b2[0x17]; - b1[0x17] = (b2[0x17] - b2[0x16]) * cos0; - b1[0x16] += b1[0x17]; - b1[0x14] += b1[0x16]; - b1[0x16] += b1[0x15]; - b1[0x15] += b1[0x17]; - - b1[0x18] = b2[0x18] + b2[0x19]; - b1[0x19] = (b2[0x18] - b2[0x19]) * cos0; - b1[0x1A] = b2[0x1A] + b2[0x1B]; - b1[0x1B] = (b2[0x1B] - b2[0x1A]) * cos0; - b1[0x1A] += b1[0x1B]; - - b1[0x1C] = b2[0x1C] + b2[0x1D]; - b1[0x1D] = (b2[0x1C] - b2[0x1D]) * cos0; - b1[0x1E] = b2[0x1E] + b2[0x1F]; - b1[0x1F] = (b2[0x1F] - b2[0x1E]) * cos0; - b1[0x1E] += b1[0x1F]; - b1[0x1C] += b1[0x1E]; - b1[0x1E] += b1[0x1D]; - b1[0x1D] += b1[0x1F]; - } - - out0[0x10*16] = b1[0x00]; - out0[0x10*12] = b1[0x04]; - out0[0x10* 8] = b1[0x02]; - out0[0x10* 4] = b1[0x06]; - out0[0x10* 0] = b1[0x01]; - out1[0x10* 0] = b1[0x01]; - out1[0x10* 4] = b1[0x05]; - out1[0x10* 8] = b1[0x03]; - out1[0x10*12] = b1[0x07]; - - b1[0x08] += b1[0x0C]; - out0[0x10*14] = b1[0x08]; - b1[0x0C] += b1[0x0a]; - out0[0x10*10] = b1[0x0C]; - b1[0x0A] += b1[0x0E]; - out0[0x10* 6] = b1[0x0A]; - b1[0x0E] += b1[0x09]; - out0[0x10* 2] = b1[0x0E]; - b1[0x09] += b1[0x0D]; - out1[0x10* 2] = b1[0x09]; - b1[0x0D] += b1[0x0B]; - out1[0x10* 6] = b1[0x0D]; - b1[0x0B] += b1[0x0F]; - out1[0x10*10] = b1[0x0B]; - out1[0x10*14] = b1[0x0F]; - - b1[0x18] += b1[0x1C]; - out0[0x10*15] = b1[0x10] + b1[0x18]; - out0[0x10*13] = b1[0x18] + b1[0x14]; - b1[0x1C] += b1[0x1a]; - out0[0x10*11] = b1[0x14] + b1[0x1C]; - out0[0x10* 9] = b1[0x1C] + b1[0x12]; - b1[0x1A] += b1[0x1E]; - out0[0x10* 7] = b1[0x12] + b1[0x1A]; - out0[0x10* 5] = b1[0x1A] + 
b1[0x16]; - b1[0x1E] += b1[0x19]; - out0[0x10* 3] = b1[0x16] + b1[0x1E]; - out0[0x10* 1] = b1[0x1E] + b1[0x11]; - b1[0x19] += b1[0x1D]; - out1[0x10* 1] = b1[0x11] + b1[0x19]; - out1[0x10* 3] = b1[0x19] + b1[0x15]; - b1[0x1D] += b1[0x1B]; - out1[0x10* 5] = b1[0x15] + b1[0x1D]; - out1[0x10* 7] = b1[0x1D] + b1[0x13]; - b1[0x1B] += b1[0x1F]; - out1[0x10* 9] = b1[0x13] + b1[0x1B]; - out1[0x10*11] = b1[0x1B] + b1[0x17]; - out1[0x10*13] = b1[0x17] + b1[0x1F]; - out1[0x10*15] = b1[0x1F]; -} diff -r bc0898c7399b -r b924f0df5a1d mp3lib/dct64_i386.c --- a/mp3lib/dct64_i386.c Sun Oct 21 11:14:13 2012 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,319 +0,0 @@ -/* - * Modified for use with MPlayer, for details see the changelog at - * http://svn.mplayerhq.hu/mplayer/trunk/ - * $Id$ - */ - -/* - * Discrete Cosine Tansform (DCT) for subband synthesis - * optimized for machines with no auto-increment. - * The performance is highly compiler dependend. Maybe - * the mpg123_dct64.c version for 'normal' processor may be faster - * even for Intel processors. 
- */ - -#include "mpg123.h" - -static void mpg123_dct64_1(real * out0, real * out1, real * b1, real * b2, real * samples) -{ - - { - register real *costab = mp3lib_pnts[0]; - - b1[0x00] = samples[0x00] + samples[0x1F]; - b1[0x1F] = (samples[0x00] - samples[0x1F]) * costab[0x0]; - - b1[0x01] = samples[0x01] + samples[0x1E]; - b1[0x1E] = (samples[0x01] - samples[0x1E]) * costab[0x1]; - - b1[0x02] = samples[0x02] + samples[0x1D]; - b1[0x1D] = (samples[0x02] - samples[0x1D]) * costab[0x2]; - - b1[0x03] = samples[0x03] + samples[0x1C]; - b1[0x1C] = (samples[0x03] - samples[0x1C]) * costab[0x3]; - - b1[0x04] = samples[0x04] + samples[0x1B]; - b1[0x1B] = (samples[0x04] - samples[0x1B]) * costab[0x4]; - - b1[0x05] = samples[0x05] + samples[0x1A]; - b1[0x1A] = (samples[0x05] - samples[0x1A]) * costab[0x5]; - - b1[0x06] = samples[0x06] + samples[0x19]; - b1[0x19] = (samples[0x06] - samples[0x19]) * costab[0x6]; - - b1[0x07] = samples[0x07] + samples[0x18]; - b1[0x18] = (samples[0x07] - samples[0x18]) * costab[0x7]; - - b1[0x08] = samples[0x08] + samples[0x17]; - b1[0x17] = (samples[0x08] - samples[0x17]) * costab[0x8]; - - b1[0x09] = samples[0x09] + samples[0x16]; - b1[0x16] = (samples[0x09] - samples[0x16]) * costab[0x9]; - - b1[0x0A] = samples[0x0A] + samples[0x15]; - b1[0x15] = (samples[0x0A] - samples[0x15]) * costab[0xA]; - - b1[0x0B] = samples[0x0B] + samples[0x14]; - b1[0x14] = (samples[0x0B] - samples[0x14]) * costab[0xB]; - - b1[0x0C] = samples[0x0C] + samples[0x13]; - b1[0x13] = (samples[0x0C] - samples[0x13]) * costab[0xC]; - - b1[0x0D] = samples[0x0D] + samples[0x12]; - b1[0x12] = (samples[0x0D] - samples[0x12]) * costab[0xD]; - - b1[0x0E] = samples[0x0E] + samples[0x11]; - b1[0x11] = (samples[0x0E] - samples[0x11]) * costab[0xE]; - - b1[0x0F] = samples[0x0F] + samples[0x10]; - b1[0x10] = (samples[0x0F] - samples[0x10]) * costab[0xF]; - } - - { - register real *costab = mp3lib_pnts[1]; - - b2[0x00] = b1[0x00] + b1[0x0F]; - b2[0x0F] = (b1[0x00] - b1[0x0F]) * 
costab[0]; - b2[0x01] = b1[0x01] + b1[0x0E]; - b2[0x0E] = (b1[0x01] - b1[0x0E]) * costab[1]; - b2[0x02] = b1[0x02] + b1[0x0D]; - b2[0x0D] = (b1[0x02] - b1[0x0D]) * costab[2]; - b2[0x03] = b1[0x03] + b1[0x0C]; - b2[0x0C] = (b1[0x03] - b1[0x0C]) * costab[3]; - b2[0x04] = b1[0x04] + b1[0x0B]; - b2[0x0B] = (b1[0x04] - b1[0x0B]) * costab[4]; - b2[0x05] = b1[0x05] + b1[0x0A]; - b2[0x0A] = (b1[0x05] - b1[0x0A]) * costab[5]; - b2[0x06] = b1[0x06] + b1[0x09]; - b2[0x09] = (b1[0x06] - b1[0x09]) * costab[6]; - b2[0x07] = b1[0x07] + b1[0x08]; - b2[0x08] = (b1[0x07] - b1[0x08]) * costab[7]; - - b2[0x10] = b1[0x10] + b1[0x1F]; - b2[0x1F] = (b1[0x1F] - b1[0x10]) * costab[0]; - b2[0x11] = b1[0x11] + b1[0x1E]; - b2[0x1E] = (b1[0x1E] - b1[0x11]) * costab[1]; - b2[0x12] = b1[0x12] + b1[0x1D]; - b2[0x1D] = (b1[0x1D] - b1[0x12]) * costab[2]; - b2[0x13] = b1[0x13] + b1[0x1C]; - b2[0x1C] = (b1[0x1C] - b1[0x13]) * costab[3]; - b2[0x14] = b1[0x14] + b1[0x1B]; - b2[0x1B] = (b1[0x1B] - b1[0x14]) * costab[4]; - b2[0x15] = b1[0x15] + b1[0x1A]; - b2[0x1A] = (b1[0x1A] - b1[0x15]) * costab[5]; - b2[0x16] = b1[0x16] + b1[0x19]; - b2[0x19] = (b1[0x19] - b1[0x16]) * costab[6]; - b2[0x17] = b1[0x17] + b1[0x18]; - b2[0x18] = (b1[0x18] - b1[0x17]) * costab[7]; - } - - { - register real *costab = mp3lib_pnts[2]; - - b1[0x00] = b2[0x00] + b2[0x07]; - b1[0x07] = (b2[0x00] - b2[0x07]) * costab[0]; - b1[0x01] = b2[0x01] + b2[0x06]; - b1[0x06] = (b2[0x01] - b2[0x06]) * costab[1]; - b1[0x02] = b2[0x02] + b2[0x05]; - b1[0x05] = (b2[0x02] - b2[0x05]) * costab[2]; - b1[0x03] = b2[0x03] + b2[0x04]; - b1[0x04] = (b2[0x03] - b2[0x04]) * costab[3]; - - b1[0x08] = b2[0x08] + b2[0x0F]; - b1[0x0F] = (b2[0x0F] - b2[0x08]) * costab[0]; - b1[0x09] = b2[0x09] + b2[0x0E]; - b1[0x0E] = (b2[0x0E] - b2[0x09]) * costab[1]; - b1[0x0A] = b2[0x0A] + b2[0x0D]; - b1[0x0D] = (b2[0x0D] - b2[0x0A]) * costab[2]; - b1[0x0B] = b2[0x0B] + b2[0x0C]; - b1[0x0C] = (b2[0x0C] - b2[0x0B]) * costab[3]; - - b1[0x10] = b2[0x10] + b2[0x17]; - 
b1[0x17] = (b2[0x10] - b2[0x17]) * costab[0]; - b1[0x11] = b2[0x11] + b2[0x16]; - b1[0x16] = (b2[0x11] - b2[0x16]) * costab[1]; - b1[0x12] = b2[0x12] + b2[0x15]; - b1[0x15] = (b2[0x12] - b2[0x15]) * costab[2]; - b1[0x13] = b2[0x13] + b2[0x14]; - b1[0x14] = (b2[0x13] - b2[0x14]) * costab[3]; - - b1[0x18] = b2[0x18] + b2[0x1F]; - b1[0x1F] = (b2[0x1F] - b2[0x18]) * costab[0]; - b1[0x19] = b2[0x19] + b2[0x1E]; - b1[0x1E] = (b2[0x1E] - b2[0x19]) * costab[1]; - b1[0x1A] = b2[0x1A] + b2[0x1D]; - b1[0x1D] = (b2[0x1D] - b2[0x1A]) * costab[2]; - b1[0x1B] = b2[0x1B] + b2[0x1C]; - b1[0x1C] = (b2[0x1C] - b2[0x1B]) * costab[3]; - } - - { - register real const cos0 = mp3lib_pnts[3][0]; - register real const cos1 = mp3lib_pnts[3][1]; - - b2[0x00] = b1[0x00] + b1[0x03]; - b2[0x03] = (b1[0x00] - b1[0x03]) * cos0; - b2[0x01] = b1[0x01] + b1[0x02]; - b2[0x02] = (b1[0x01] - b1[0x02]) * cos1; - - b2[0x04] = b1[0x04] + b1[0x07]; - b2[0x07] = (b1[0x07] - b1[0x04]) * cos0; - b2[0x05] = b1[0x05] + b1[0x06]; - b2[0x06] = (b1[0x06] - b1[0x05]) * cos1; - - b2[0x08] = b1[0x08] + b1[0x0B]; - b2[0x0B] = (b1[0x08] - b1[0x0B]) * cos0; - b2[0x09] = b1[0x09] + b1[0x0A]; - b2[0x0A] = (b1[0x09] - b1[0x0A]) * cos1; - - b2[0x0C] = b1[0x0C] + b1[0x0F]; - b2[0x0F] = (b1[0x0F] - b1[0x0C]) * cos0; - b2[0x0D] = b1[0x0D] + b1[0x0E]; - b2[0x0E] = (b1[0x0E] - b1[0x0D]) * cos1; - - b2[0x10] = b1[0x10] + b1[0x13]; - b2[0x13] = (b1[0x10] - b1[0x13]) * cos0; - b2[0x11] = b1[0x11] + b1[0x12]; - b2[0x12] = (b1[0x11] - b1[0x12]) * cos1; - - b2[0x14] = b1[0x14] + b1[0x17]; - b2[0x17] = (b1[0x17] - b1[0x14]) * cos0; - b2[0x15] = b1[0x15] + b1[0x16]; - b2[0x16] = (b1[0x16] - b1[0x15]) * cos1; - - b2[0x18] = b1[0x18] + b1[0x1B]; - b2[0x1B] = (b1[0x18] - b1[0x1B]) * cos0; - b2[0x19] = b1[0x19] + b1[0x1A]; - b2[0x1A] = (b1[0x19] - b1[0x1A]) * cos1; - - b2[0x1C] = b1[0x1C] + b1[0x1F]; - b2[0x1F] = (b1[0x1F] - b1[0x1C]) * cos0; - b2[0x1D] = b1[0x1D] + b1[0x1E]; - b2[0x1E] = (b1[0x1E] - b1[0x1D]) * cos1; - } - - { - register 
real const cos0 = mp3lib_pnts[4][0]; - - b1[0x00] = b2[0x00] + b2[0x01]; - b1[0x01] = (b2[0x00] - b2[0x01]) * cos0; - b1[0x02] = b2[0x02] + b2[0x03]; - b1[0x03] = (b2[0x03] - b2[0x02]) * cos0; - b1[0x02] += b1[0x03]; - - b1[0x04] = b2[0x04] + b2[0x05]; - b1[0x05] = (b2[0x04] - b2[0x05]) * cos0; - b1[0x06] = b2[0x06] + b2[0x07]; - b1[0x07] = (b2[0x07] - b2[0x06]) * cos0; - b1[0x06] += b1[0x07]; - b1[0x04] += b1[0x06]; - b1[0x06] += b1[0x05]; - b1[0x05] += b1[0x07]; - - b1[0x08] = b2[0x08] + b2[0x09]; - b1[0x09] = (b2[0x08] - b2[0x09]) * cos0; - b1[0x0A] = b2[0x0A] + b2[0x0B]; - b1[0x0B] = (b2[0x0B] - b2[0x0A]) * cos0; - b1[0x0A] += b1[0x0B]; - - b1[0x0C] = b2[0x0C] + b2[0x0D]; - b1[0x0D] = (b2[0x0C] - b2[0x0D]) * cos0; - b1[0x0E] = b2[0x0E] + b2[0x0F]; - b1[0x0F] = (b2[0x0F] - b2[0x0E]) * cos0; - b1[0x0E] += b1[0x0F]; - b1[0x0C] += b1[0x0E]; - b1[0x0E] += b1[0x0D]; - b1[0x0D] += b1[0x0F]; - - b1[0x10] = b2[0x10] + b2[0x11]; - b1[0x11] = (b2[0x10] - b2[0x11]) * cos0; - b1[0x12] = b2[0x12] + b2[0x13]; - b1[0x13] = (b2[0x13] - b2[0x12]) * cos0; - b1[0x12] += b1[0x13]; - - b1[0x14] = b2[0x14] + b2[0x15]; - b1[0x15] = (b2[0x14] - b2[0x15]) * cos0; - b1[0x16] = b2[0x16] + b2[0x17]; - b1[0x17] = (b2[0x17] - b2[0x16]) * cos0; - b1[0x16] += b1[0x17]; - b1[0x14] += b1[0x16]; - b1[0x16] += b1[0x15]; - b1[0x15] += b1[0x17]; - - b1[0x18] = b2[0x18] + b2[0x19]; - b1[0x19] = (b2[0x18] - b2[0x19]) * cos0; - b1[0x1A] = b2[0x1A] + b2[0x1B]; - b1[0x1B] = (b2[0x1B] - b2[0x1A]) * cos0; - b1[0x1A] += b1[0x1B]; - - b1[0x1C] = b2[0x1C] + b2[0x1D]; - b1[0x1D] = (b2[0x1C] - b2[0x1D]) * cos0; - b1[0x1E] = b2[0x1E] + b2[0x1F]; - b1[0x1F] = (b2[0x1F] - b2[0x1E]) * cos0; - b1[0x1E] += b1[0x1F]; - b1[0x1C] += b1[0x1E]; - b1[0x1E] += b1[0x1D]; - b1[0x1D] += b1[0x1F]; - } - - out0[0x10 * 16] = b1[0x00]; - out0[0x10 * 12] = b1[0x04]; - out0[0x10 * 8] = b1[0x02]; - out0[0x10 * 4] = b1[0x06]; - out0[0x10 * 0] = b1[0x01]; - out1[0x10 * 0] = b1[0x01]; - out1[0x10 * 4] = b1[0x05]; - out1[0x10 * 8] = 
b1[0x03]; - out1[0x10 * 12] = b1[0x07]; - - b1[0x08] += b1[0x0C]; - out0[0x10 * 14] = b1[0x08]; - b1[0x0C] += b1[0x0a]; - out0[0x10 * 10] = b1[0x0C]; - b1[0x0A] += b1[0x0E]; - out0[0x10 * 6] = b1[0x0A]; - b1[0x0E] += b1[0x09]; - out0[0x10 * 2] = b1[0x0E]; - b1[0x09] += b1[0x0D]; - out1[0x10 * 2] = b1[0x09]; - b1[0x0D] += b1[0x0B]; - out1[0x10 * 6] = b1[0x0D]; - b1[0x0B] += b1[0x0F]; - out1[0x10 * 10] = b1[0x0B]; - out1[0x10 * 14] = b1[0x0F]; - - b1[0x18] += b1[0x1C]; - out0[0x10 * 15] = b1[0x10] + b1[0x18]; - out0[0x10 * 13] = b1[0x18] + b1[0x14]; - b1[0x1C] += b1[0x1a]; - out0[0x10 * 11] = b1[0x14] + b1[0x1C]; - out0[0x10 * 9] = b1[0x1C] + b1[0x12]; - b1[0x1A] += b1[0x1E]; - out0[0x10 * 7] = b1[0x12] + b1[0x1A]; - out0[0x10 * 5] = b1[0x1A] + b1[0x16]; - b1[0x1E] += b1[0x19]; - out0[0x10 * 3] = b1[0x16] + b1[0x1E]; - out0[0x10 * 1] = b1[0x1E] + b1[0x11]; - b1[0x19] += b1[0x1D]; - out1[0x10 * 1] = b1[0x11] + b1[0x19]; - out1[0x10 * 3] = b1[0x19] + b1[0x15]; - b1[0x1D] += b1[0x1B]; - out1[0x10 * 5] = b1[0x15] + b1[0x1D]; - out1[0x10 * 7] = b1[0x1D] + b1[0x13]; - b1[0x1B] += b1[0x1F]; - out1[0x10 * 9] = b1[0x13] + b1[0x1B]; - out1[0x10 * 11] = b1[0x1B] + b1[0x17]; - out1[0x10 * 13] = b1[0x17] + b1[0x1F]; - out1[0x10 * 15] = b1[0x1F]; -} - -/* - * the call via mpg123_dct64 is a trick to force GCC to use - * (new) registers for the b1,b2 pointer to the bufs[xx] field - */ -void mpg123_dct64(real * a, real * b, real * c) -{ - real bufs[0x40]; - - mpg123_dct64_1(a, b, bufs, bufs + 0x20, c); -} diff -r bc0898c7399b -r b924f0df5a1d mp3lib/dct64_k7.c --- a/mp3lib/dct64_k7.c Sun Oct 21 11:14:13 2012 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,767 +0,0 @@ -/* -* This code was taken from http://www.mpg123.org -* See ChangeLog of mpg123-0.59s-pre.1 for detail -* Applied to mplayer by Nick Kurshev -* Partial 3dnowex-DSP! optimization by Nick Kurshev -* -* TODO: optimize scalar 3dnow! 
code -* Warning: Phases 7 & 8 are not tested -*/ - -#include "config.h" -#include "mangle.h" -#include "mpg123.h" -#include "libavutil/x86_cpu.h" - -static unsigned long long int attribute_used __attribute__((aligned(8))) x_plus_minus_3dnow = 0x8000000000000000ULL; -static float attribute_used plus_1f = 1.0; - -void dct64_MMX_3dnowex(short *a,short *b,real *c) -{ - char tmp[256]; - __asm__ volatile( -" mov %2,%%"REG_a"\n\t" - -" lea 128+%3,%%"REG_d"\n\t" -" mov %0,%%"REG_S"\n\t" -" mov %1,%%"REG_D"\n\t" -" mov $"MANGLE(costab_mmx)",%%"REG_b"\n\t" -" lea %3,%%"REG_c"\n\t" - -/* Phase 1*/ -" movq (%%"REG_a"), %%mm0\n\t" -" movq 8(%%"REG_a"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" pswapd 120(%%"REG_a"), %%mm1\n\t" -" pswapd 112(%%"REG_a"), %%mm5\n\t" -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, (%%"REG_d")\n\t" -" movq %%mm4, 8(%%"REG_d")\n\t" -" pfsub %%mm1, %%mm3\n\t" -" pfsub %%mm5, %%mm7\n\t" -" pfmul (%%"REG_b"), %%mm3\n\t" -" pfmul 8(%%"REG_b"), %%mm7\n\t" -" pswapd %%mm3, %%mm3\n\t" -" pswapd %%mm7, %%mm7\n\t" -" movq %%mm3, 120(%%"REG_d")\n\t" -" movq %%mm7, 112(%%"REG_d")\n\t" - -" movq 16(%%"REG_a"), %%mm0\n\t" -" movq 24(%%"REG_a"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" pswapd 104(%%"REG_a"), %%mm1\n\t" -" pswapd 96(%%"REG_a"), %%mm5\n\t" -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, 16(%%"REG_d")\n\t" -" movq %%mm4, 24(%%"REG_d")\n\t" -" pfsub %%mm1, %%mm3\n\t" -" pfsub %%mm5, %%mm7\n\t" -" pfmul 16(%%"REG_b"), %%mm3\n\t" -" pfmul 24(%%"REG_b"), %%mm7\n\t" -" pswapd %%mm3, %%mm3\n\t" -" pswapd %%mm7, %%mm7\n\t" -" movq %%mm3, 104(%%"REG_d")\n\t" -" movq %%mm7, 96(%%"REG_d")\n\t" - -" movq 32(%%"REG_a"), %%mm0\n\t" -" movq 40(%%"REG_a"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" pswapd 88(%%"REG_a"), %%mm1\n\t" -" pswapd 80(%%"REG_a"), %%mm5\n\t" -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, 32(%%"REG_d")\n\t" 
-" movq %%mm4, 40(%%"REG_d")\n\t" -" pfsub %%mm1, %%mm3\n\t" -" pfsub %%mm5, %%mm7\n\t" -" pfmul 32(%%"REG_b"), %%mm3\n\t" -" pfmul 40(%%"REG_b"), %%mm7\n\t" -" pswapd %%mm3, %%mm3\n\t" -" pswapd %%mm7, %%mm7\n\t" -" movq %%mm3, 88(%%"REG_d")\n\t" -" movq %%mm7, 80(%%"REG_d")\n\t" - -" movq 48(%%"REG_a"), %%mm0\n\t" -" movq 56(%%"REG_a"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" pswapd 72(%%"REG_a"), %%mm1\n\t" -" pswapd 64(%%"REG_a"), %%mm5\n\t" -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, 48(%%"REG_d")\n\t" -" movq %%mm4, 56(%%"REG_d")\n\t" -" pfsub %%mm1, %%mm3\n\t" -" pfsub %%mm5, %%mm7\n\t" -" pfmul 48(%%"REG_b"), %%mm3\n\t" -" pfmul 56(%%"REG_b"), %%mm7\n\t" -" pswapd %%mm3, %%mm3\n\t" -" pswapd %%mm7, %%mm7\n\t" -" movq %%mm3, 72(%%"REG_d")\n\t" -" movq %%mm7, 64(%%"REG_d")\n\t" - -/* Phase 2*/ - -" movq (%%"REG_d"), %%mm0\n\t" -" movq 8(%%"REG_d"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" pswapd 56(%%"REG_d"), %%mm1\n\t" -" pswapd 48(%%"REG_d"), %%mm5\n\t" -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, (%%"REG_c")\n\t" -" movq %%mm4, 8(%%"REG_c")\n\t" -" pfsub %%mm1, %%mm3\n\t" -" pfsub %%mm5, %%mm7\n\t" -" pfmul 64(%%"REG_b"), %%mm3\n\t" -" pfmul 72(%%"REG_b"), %%mm7\n\t" -" pswapd %%mm3, %%mm3\n\t" -" pswapd %%mm7, %%mm7\n\t" -" movq %%mm3, 56(%%"REG_c")\n\t" -" movq %%mm7, 48(%%"REG_c")\n\t" - -" movq 16(%%"REG_d"), %%mm0\n\t" -" movq 24(%%"REG_d"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" pswapd 40(%%"REG_d"), %%mm1\n\t" -" pswapd 32(%%"REG_d"), %%mm5\n\t" -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, 16(%%"REG_c")\n\t" -" movq %%mm4, 24(%%"REG_c")\n\t" -" pfsub %%mm1, %%mm3\n\t" -" pfsub %%mm5, %%mm7\n\t" -" pfmul 80(%%"REG_b"), %%mm3\n\t" -" pfmul 88(%%"REG_b"), %%mm7\n\t" -" pswapd %%mm3, %%mm3\n\t" -" pswapd %%mm7, %%mm7\n\t" -" movq %%mm3, 40(%%"REG_c")\n\t" -" movq %%mm7, 32(%%"REG_c")\n\t" - -/* Phase 3*/ 
- -" movq 64(%%"REG_d"), %%mm0\n\t" -" movq 72(%%"REG_d"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" pswapd 120(%%"REG_d"), %%mm1\n\t" -" pswapd 112(%%"REG_d"), %%mm5\n\t" -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, 64(%%"REG_c")\n\t" -" movq %%mm4, 72(%%"REG_c")\n\t" -" pfsubr %%mm1, %%mm3\n\t" -" pfsubr %%mm5, %%mm7\n\t" -" pfmul 64(%%"REG_b"), %%mm3\n\t" -" pfmul 72(%%"REG_b"), %%mm7\n\t" -" pswapd %%mm3, %%mm3\n\t" -" pswapd %%mm7, %%mm7\n\t" -" movq %%mm3, 120(%%"REG_c")\n\t" -" movq %%mm7, 112(%%"REG_c")\n\t" - -" movq 80(%%"REG_d"), %%mm0\n\t" -" movq 88(%%"REG_d"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" pswapd 104(%%"REG_d"), %%mm1\n\t" -" pswapd 96(%%"REG_d"), %%mm5\n\t" -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, 80(%%"REG_c")\n\t" -" movq %%mm4, 88(%%"REG_c")\n\t" -" pfsubr %%mm1, %%mm3\n\t" -" pfsubr %%mm5, %%mm7\n\t" -" pfmul 80(%%"REG_b"), %%mm3\n\t" -" pfmul 88(%%"REG_b"), %%mm7\n\t" -" pswapd %%mm3, %%mm3\n\t" -" pswapd %%mm7, %%mm7\n\t" -" movq %%mm3, 104(%%"REG_c")\n\t" -" movq %%mm7, 96(%%"REG_c")\n\t" - -/* Phase 4*/ - -" movq 96(%%"REG_b"), %%mm2\n\t" -" movq 104(%%"REG_b"), %%mm6\n\t" - -" movq (%%"REG_c"), %%mm0\n\t" -" movq 8(%%"REG_c"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" pswapd 24(%%"REG_c"), %%mm1\n\t" -" pswapd 16(%%"REG_c"), %%mm5\n\t" -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, (%%"REG_d")\n\t" -" movq %%mm4, 8(%%"REG_d")\n\t" -" pfsub %%mm1, %%mm3\n\t" -" pfsub %%mm5, %%mm7\n\t" -" pfmul %%mm2, %%mm3\n\t" -" pfmul %%mm6, %%mm7\n\t" -" pswapd %%mm3, %%mm3\n\t" -" pswapd %%mm7, %%mm7\n\t" -" movq %%mm3, 24(%%"REG_d")\n\t" -" movq %%mm7, 16(%%"REG_d")\n\t" - -" movq 32(%%"REG_c"), %%mm0\n\t" -" movq 40(%%"REG_c"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" pswapd 56(%%"REG_c"), %%mm1\n\t" -" pswapd 48(%%"REG_c"), %%mm5\n\t" -" pfadd %%mm1, %%mm0\n\t" -" pfadd 
%%mm5, %%mm4\n\t" -" movq %%mm0, 32(%%"REG_d")\n\t" -" movq %%mm4, 40(%%"REG_d")\n\t" -" pfsubr %%mm1, %%mm3\n\t" -" pfsubr %%mm5, %%mm7\n\t" -" pfmul %%mm2, %%mm3\n\t" -" pfmul %%mm6, %%mm7\n\t" -" pswapd %%mm3, %%mm3\n\t" -" pswapd %%mm7, %%mm7\n\t" -" movq %%mm3, 56(%%"REG_d")\n\t" -" movq %%mm7, 48(%%"REG_d")\n\t" - -" movq 64(%%"REG_c"), %%mm0\n\t" -" movq 72(%%"REG_c"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" pswapd 88(%%"REG_c"), %%mm1\n\t" -" pswapd 80(%%"REG_c"), %%mm5\n\t" -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, 64(%%"REG_d")\n\t" -" movq %%mm4, 72(%%"REG_d")\n\t" -" pfsub %%mm1, %%mm3\n\t" -" pfsub %%mm5, %%mm7\n\t" -" pfmul %%mm2, %%mm3\n\t" -" pfmul %%mm6, %%mm7\n\t" -" pswapd %%mm3, %%mm3\n\t" -" pswapd %%mm7, %%mm7\n\t" -" movq %%mm3, 88(%%"REG_d")\n\t" -" movq %%mm7, 80(%%"REG_d")\n\t" - -" movq 96(%%"REG_c"), %%mm0\n\t" -" movq 104(%%"REG_c"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" pswapd 120(%%"REG_c"), %%mm1\n\t" -" pswapd 112(%%"REG_c"), %%mm5\n\t" -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, 96(%%"REG_d")\n\t" -" movq %%mm4, 104(%%"REG_d")\n\t" -" pfsubr %%mm1, %%mm3\n\t" -" pfsubr %%mm5, %%mm7\n\t" -" pfmul %%mm2, %%mm3\n\t" -" pfmul %%mm6, %%mm7\n\t" -" pswapd %%mm3, %%mm3\n\t" -" pswapd %%mm7, %%mm7\n\t" -" movq %%mm3, 120(%%"REG_d")\n\t" -" movq %%mm7, 112(%%"REG_d")\n\t" - -/* Phase 5 */ - -" movq 112(%%"REG_b"), %%mm2\n\t" - -" movq (%%"REG_d"), %%mm0\n\t" -" movq 16(%%"REG_d"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" pswapd 8(%%"REG_d"), %%mm1\n\t" -" pswapd 24(%%"REG_d"), %%mm5\n\t" -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, (%%"REG_c")\n\t" -" movq %%mm4, 16(%%"REG_c")\n\t" -" pfsub %%mm1, %%mm3\n\t" -" pfsubr %%mm5, %%mm7\n\t" -" pfmul %%mm2, %%mm3\n\t" -" pfmul %%mm2, %%mm7\n\t" -" pswapd %%mm3, %%mm3\n\t" -" pswapd %%mm7, %%mm7\n\t" -" movq %%mm3, 8(%%"REG_c")\n\t" -" movq 
%%mm7, 24(%%"REG_c")\n\t" - -" movq 32(%%"REG_d"), %%mm0\n\t" -" movq 48(%%"REG_d"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" pswapd 40(%%"REG_d"), %%mm1\n\t" -" pswapd 56(%%"REG_d"), %%mm5\n\t" -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, 32(%%"REG_c")\n\t" -" movq %%mm4, 48(%%"REG_c")\n\t" -" pfsub %%mm1, %%mm3\n\t" -" pfsubr %%mm5, %%mm7\n\t" -" pfmul %%mm2, %%mm3\n\t" -" pfmul %%mm2, %%mm7\n\t" -" pswapd %%mm3, %%mm3\n\t" -" pswapd %%mm7, %%mm7\n\t" -" movq %%mm3, 40(%%"REG_c")\n\t" -" movq %%mm7, 56(%%"REG_c")\n\t" - -" movq 64(%%"REG_d"), %%mm0\n\t" -" movq 80(%%"REG_d"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" pswapd 72(%%"REG_d"), %%mm1\n\t" -" pswapd 88(%%"REG_d"), %%mm5\n\t" -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, 64(%%"REG_c")\n\t" -" movq %%mm4, 80(%%"REG_c")\n\t" -" pfsub %%mm1, %%mm3\n\t" -" pfsubr %%mm5, %%mm7\n\t" -" pfmul %%mm2, %%mm3\n\t" -" pfmul %%mm2, %%mm7\n\t" -" pswapd %%mm3, %%mm3\n\t" -" pswapd %%mm7, %%mm7\n\t" -" movq %%mm3, 72(%%"REG_c")\n\t" -" movq %%mm7, 88(%%"REG_c")\n\t" - -" movq 96(%%"REG_d"), %%mm0\n\t" -" movq 112(%%"REG_d"), %%mm4\n\t" -" movq %%mm0, %%mm3\n\t" -" movq %%mm4, %%mm7\n\t" -" pswapd 104(%%"REG_d"), %%mm1\n\t" -" pswapd 120(%%"REG_d"), %%mm5\n\t" -" pfadd %%mm1, %%mm0\n\t" -" pfadd %%mm5, %%mm4\n\t" -" movq %%mm0, 96(%%"REG_c")\n\t" -" movq %%mm4, 112(%%"REG_c")\n\t" -" pfsub %%mm1, %%mm3\n\t" -" pfsubr %%mm5, %%mm7\n\t" -" pfmul %%mm2, %%mm3\n\t" -" pfmul %%mm2, %%mm7\n\t" -" pswapd %%mm3, %%mm3\n\t" -" pswapd %%mm7, %%mm7\n\t" -" movq %%mm3, 104(%%"REG_c")\n\t" -" movq %%mm7, 120(%%"REG_c")\n\t" - - -/* Phase 6. This is the end of easy road. */ -/* Code below is coded in scalar mode. 
Should be optimized */ - -" movd "MANGLE(plus_1f)", %%mm6\n\t" -" punpckldq 120(%%"REG_b"), %%mm6\n\t" /* mm6 = 1.0 | 120(%%"REG_b")*/ -" movq "MANGLE(x_plus_minus_3dnow)", %%mm7\n\t" /* mm7 = +1 | -1 */ - -" movq 32(%%"REG_c"), %%mm0\n\t" -" movq 64(%%"REG_c"), %%mm2\n\t" -" movq %%mm0, %%mm1\n\t" -" movq %%mm2, %%mm3\n\t" -" pxor %%mm7, %%mm1\n\t" -" pxor %%mm7, %%mm3\n\t" -" pfacc %%mm1, %%mm0\n\t" -" pfacc %%mm3, %%mm2\n\t" -" pfmul %%mm6, %%mm0\n\t" -" pfmul %%mm6, %%mm2\n\t" -" movq %%mm0, 32(%%"REG_d")\n\t" -" movq %%mm2, 64(%%"REG_d")\n\t" - -" movd 44(%%"REG_c"), %%mm0\n\t" -" movd 40(%%"REG_c"), %%mm2\n\t" -" movd 120(%%"REG_b"), %%mm3\n\t" -" punpckldq 76(%%"REG_c"), %%mm0\n\t" -" punpckldq 72(%%"REG_c"), %%mm2\n\t" -" punpckldq %%mm3, %%mm3\n\t" -" movq %%mm0, %%mm4\n\t" -" movq %%mm2, %%mm5\n\t" -" pfsub %%mm2, %%mm0\n\t" -" pfmul %%mm3, %%mm0\n\t" -" movq %%mm0, %%mm1\n\t" -" pfadd %%mm5, %%mm0\n\t" -" pfadd %%mm4, %%mm0\n\t" -" movq %%mm0, %%mm2\n\t" -" punpckldq %%mm1, %%mm0\n\t" -" punpckhdq %%mm1, %%mm2\n\t" -" movq %%mm0, 40(%%"REG_d")\n\t" -" movq %%mm2, 72(%%"REG_d")\n\t" - -" movd 48(%%"REG_c"), %%mm3\n\t" -" movd 60(%%"REG_c"), %%mm2\n\t" -" pfsub 52(%%"REG_c"), %%mm3\n\t" -" pfsub 56(%%"REG_c"), %%mm2\n\t" -" pfmul 120(%%"REG_b"), %%mm3\n\t" -" pfmul 120(%%"REG_b"), %%mm2\n\t" -" movq %%mm2, %%mm1\n\t" - -" pfadd 56(%%"REG_c"), %%mm1\n\t" -" pfadd 60(%%"REG_c"), %%mm1\n\t" -" movq %%mm1, %%mm0\n\t" - -" pfadd 48(%%"REG_c"), %%mm0\n\t" -" pfadd 52(%%"REG_c"), %%mm0\n\t" -" pfadd %%mm3, %%mm1\n\t" -" punpckldq %%mm2, %%mm1\n\t" -" pfadd %%mm3, %%mm2\n\t" -" punpckldq %%mm2, %%mm0\n\t" -" movq %%mm1, 56(%%"REG_d")\n\t" -" movq %%mm0, 48(%%"REG_d")\n\t" - -/*---*/ - -" movd 92(%%"REG_c"), %%mm1\n\t" -" pfsub 88(%%"REG_c"), %%mm1\n\t" -" pfmul 120(%%"REG_b"), %%mm1\n\t" -" movd %%mm1, 92(%%"REG_d")\n\t" -" pfadd 92(%%"REG_c"), %%mm1\n\t" -" pfadd 88(%%"REG_c"), %%mm1\n\t" -" movq %%mm1, %%mm0\n\t" - -" pfadd 80(%%"REG_c"), %%mm0\n\t" -" pfadd 
84(%%"REG_c"), %%mm0\n\t" -" movd %%mm0, 80(%%"REG_d")\n\t" - -" movd 80(%%"REG_c"), %%mm0\n\t" -" pfsub 84(%%"REG_c"), %%mm0\n\t" -" pfmul 120(%%"REG_b"), %%mm0\n\t" -" pfadd %%mm0, %%mm1\n\t" -" pfadd 92(%%"REG_d"), %%mm0\n\t" -" punpckldq %%mm1, %%mm0\n\t" -" movq %%mm0, 84(%%"REG_d")\n\t" - -" movq 96(%%"REG_c"), %%mm0\n\t" -" movq %%mm0, %%mm1\n\t" -" pxor %%mm7, %%mm1\n\t" -" pfacc %%mm1, %%mm0\n\t" -" pfmul %%mm6, %%mm0\n\t" -" movq %%mm0, 96(%%"REG_d")\n\t" - -" movd 108(%%"REG_c"), %%mm0\n\t" -" pfsub 104(%%"REG_c"), %%mm0\n\t" -" pfmul 120(%%"REG_b"), %%mm0\n\t" -" movd %%mm0, 108(%%"REG_d")\n\t" -" pfadd 104(%%"REG_c"), %%mm0\n\t" -" pfadd 108(%%"REG_c"), %%mm0\n\t" -" movd %%mm0, 104(%%"REG_d")\n\t" - -" movd 124(%%"REG_c"), %%mm1\n\t" -" pfsub 120(%%"REG_c"), %%mm1\n\t" -" pfmul 120(%%"REG_b"), %%mm1\n\t" -" movd %%mm1, 124(%%"REG_d")\n\t" -" pfadd 120(%%"REG_c"), %%mm1\n\t" -" pfadd 124(%%"REG_c"), %%mm1\n\t" -" movq %%mm1, %%mm0\n\t" - -" pfadd 112(%%"REG_c"), %%mm0\n\t" -" pfadd 116(%%"REG_c"), %%mm0\n\t" -" movd %%mm0, 112(%%"REG_d")\n\t" - -" movd 112(%%"REG_c"), %%mm0\n\t" -" pfsub 116(%%"REG_c"), %%mm0\n\t" -" pfmul 120(%%"REG_b"), %%mm0\n\t" -" pfadd %%mm0,%%mm1\n\t" -" pfadd 124(%%"REG_d"), %%mm0\n\t" -" punpckldq %%mm1, %%mm0\n\t" -" movq %%mm0, 116(%%"REG_d")\n\t" - -// this code is broken, there is nothing modifying the z flag above. -#if 0 -" jnz .L01\n\t" - -/* Phase 7*/ -/* Code below is coded in scalar mode. 
Should be optimized */ - -" movd (%%"REG_c"), %%mm0\n\t" -" pfadd 4(%%"REG_c"), %%mm0\n\t" -" movd %%mm0, 1024(%%"REG_S")\n\t" - -" movd (%%"REG_c"), %%mm0\n\t" -" pfsub 4(%%"REG_c"), %%mm0\n\t" -" pfmul 120(%%"REG_b"), %%mm0\n\t" -" movd %%mm0, (%%"REG_S")\n\t" -" movd %%mm0, (%%"REG_D")\n\t" - -" movd 12(%%"REG_c"), %%mm0\n\t" -" pfsub 8(%%"REG_c"), %%mm0\n\t" -" pfmul 120(%%"REG_b"), %%mm0\n\t" -" movd %%mm0, 512(%%"REG_D")\n\t" -" pfadd 12(%%"REG_c"), %%mm0\n\t" -" pfadd 8(%%"REG_c"), %%mm0\n\t" -" movd %%mm0, 512(%%"REG_S")\n\t" - -" movd 16(%%"REG_c"), %%mm0\n\t" -" pfsub 20(%%"REG_c"), %%mm0\n\t" -" pfmul 120(%%"REG_b"), %%mm0\n\t" -" movq %%mm0, %%mm3\n\t" - -" movd 28(%%"REG_c"), %%mm0\n\t" -" pfsub 24(%%"REG_c"), %%mm0\n\t" -" pfmul 120(%%"REG_b"), %%mm0\n\t" -" movd %%mm0, 768(%%"REG_D")\n\t" -" movq %%mm0, %%mm2\n\t" - -" pfadd 24(%%"REG_c"), %%mm0\n\t" -" pfadd 28(%%"REG_c"), %%mm0\n\t" -" movq %%mm0, %%mm1\n\t" - -" pfadd 16(%%"REG_c"), %%mm0\n\t" -" pfadd 20(%%"REG_c"), %%mm0\n\t" -" movd %%mm0, 768(%%"REG_S")\n\t" -" pfadd %%mm3, %%mm1\n\t" -" movd %%mm1, 256(%%"REG_S")\n\t" -" pfadd %%mm3, %%mm2\n\t" -" movd %%mm2, 256(%%"REG_D")\n\t" - -/* Phase 8*/ - -" movq 32(%%"REG_d"), %%mm0\n\t" -" movq 48(%%"REG_d"), %%mm1\n\t" -" pfadd 48(%%"REG_d"), %%mm0\n\t" -" pfadd 40(%%"REG_d"), %%mm1\n\t" -" movd %%mm0, 896(%%"REG_S")\n\t" -" movd %%mm1, 640(%%"REG_S")\n\t" -" psrlq $32, %%mm0\n\t" -" psrlq $32, %%mm1\n\t" -" movd %%mm0, 128(%%"REG_D")\n\t" -" movd %%mm1, 384(%%"REG_D")\n\t" - -" movd 40(%%"REG_d"), %%mm0\n\t" -" pfadd 56(%%"REG_d"), %%mm0\n\t" -" movd %%mm0, 384(%%"REG_S")\n\t" - -" movd 56(%%"REG_d"), %%mm0\n\t" -" pfadd 36(%%"REG_d"), %%mm0\n\t" -" movd %%mm0, 128(%%"REG_S")\n\t" - -" movd 60(%%"REG_d"), %%mm0\n\t" -" movd %%mm0, 896(%%"REG_D")\n\t" -" pfadd 44(%%"REG_d"), %%mm0\n\t" -" movd %%mm0, 640(%%"REG_D")\n\t" - -" movq 96(%%"REG_d"), %%mm0\n\t" -" movq 112(%%"REG_d"), %%mm2\n\t" -" movq 104(%%"REG_d"), %%mm4\n\t" -" pfadd 112(%%"REG_d"), 
%%mm0\n\t" -" pfadd 104(%%"REG_d"), %%mm2\n\t" -" pfadd 120(%%"REG_d"), %%mm4\n\t" -" movq %%mm0, %%mm1\n\t" -" movq %%mm2, %%mm3\n\t" -" movq %%mm4, %%mm5\n\t" -" pfadd 64(%%"REG_d"), %%mm0\n\t" -" pfadd 80(%%"REG_d"), %%mm2\n\t" -" pfadd 72(%%"REG_d"), %%mm4\n\t" -" movd %%mm0, 960(%%"REG_S")\n\t" -" movd %%mm2, 704(%%"REG_S")\n\t" -" movd %%mm4, 448(%%"REG_S")\n\t" -" psrlq $32, %%mm0\n\t" -" psrlq $32, %%mm2\n\t" -" psrlq $32, %%mm4\n\t" -" movd %%mm0, 64(%%"REG_D")\n\t" -" movd %%mm2, 320(%%"REG_D")\n\t" -" movd %%mm4, 576(%%"REG_D")\n\t" -" pfadd 80(%%"REG_d"), %%mm1\n\t" -" pfadd 72(%%"REG_d"), %%mm3\n\t" -" pfadd 88(%%"REG_d"), %%mm5\n\t" -" movd %%mm1, 832(%%"REG_S")\n\t" -" movd %%mm3, 576(%%"REG_S")\n\t" -" movd %%mm5, 320(%%"REG_S")\n\t" -" psrlq $32, %%mm1\n\t" -" psrlq $32, %%mm3\n\t" -" psrlq $32, %%mm5\n\t" -" movd %%mm1, 192(%%"REG_D")\n\t" -" movd %%mm3, 448(%%"REG_D")\n\t" -" movd %%mm5, 704(%%"REG_D")\n\t" - -" movd 120(%%"REG_d"), %%mm0\n\t" -" pfadd 100(%%"REG_d"), %%mm0\n\t" -" movq %%mm0, %%mm1\n\t" -" pfadd 88(%%"REG_d"), %%mm0\n\t" -" movd %%mm0, 192(%%"REG_S")\n\t" -" pfadd 68(%%"REG_d"), %%mm1\n\t" -" movd %%mm1, 64(%%"REG_S")\n\t" - -" movd 124(%%"REG_d"), %%mm0\n\t" -" movd %%mm0, 960(%%"REG_D")\n\t" -" pfadd 92(%%"REG_d"), %%mm0\n\t" -" movd %%mm0, 832(%%"REG_D")\n\t" - -" jmp .L_bye\n\t" -".L01: \n\t" -#endif -/* Phase 9*/ - -" movq (%%"REG_c"), %%mm0\n\t" -" movq %%mm0, %%mm1\n\t" -" pxor %%mm7, %%mm1\n\t" -" pfacc %%mm1, %%mm0\n\t" -" pfmul %%mm6, %%mm0\n\t" -" pf2iw %%mm0, %%mm0\n\t" -" movd %%mm0, %%"REG_a"\n\t" -" movw %%ax, 512(%%"REG_S")\n\t" -" psrlq $32, %%mm0\n\t" -" movd %%mm0, %%"REG_a"\n\t" -" movw %%ax, (%%"REG_S")\n\t" - -" movd 12(%%"REG_c"), %%mm0\n\t" -" pfsub 8(%%"REG_c"), %%mm0\n\t" -" pfmul 120(%%"REG_b"), %%mm0\n\t" -" pf2iw %%mm0, %%mm7\n\t" -" movd %%mm7, %%"REG_a"\n\t" -" movw %%ax, 256(%%"REG_D")\n\t" -" pfadd 12(%%"REG_c"), %%mm0\n\t" -" pfadd 8(%%"REG_c"), %%mm0\n\t" -" pf2iw %%mm0, %%mm0\n\t" -" movd 
%%mm0, %%"REG_a"\n\t" -" movw %%ax, 256(%%"REG_S")\n\t" - -" movd 16(%%"REG_c"), %%mm3\n\t" -" pfsub 20(%%"REG_c"), %%mm3\n\t" -" pfmul 120(%%"REG_b"), %%mm3\n\t" -" movq %%mm3, %%mm2\n\t" - -" movd 28(%%"REG_c"), %%mm2\n\t" -" pfsub 24(%%"REG_c"), %%mm2\n\t" -" pfmul 120(%%"REG_b"), %%mm2\n\t" -" movq %%mm2, %%mm1\n\t" - -" pf2iw %%mm2, %%mm7\n\t" -" movd %%mm7, %%"REG_a"\n\t" -" movw %%ax, 384(%%"REG_D")\n\t" - -" pfadd 24(%%"REG_c"), %%mm1\n\t" -" pfadd 28(%%"REG_c"), %%mm1\n\t" -" movq %%mm1, %%mm0\n\t" - -" pfadd 16(%%"REG_c"), %%mm0\n\t" -" pfadd 20(%%"REG_c"), %%mm0\n\t" -" pf2iw %%mm0, %%mm0\n\t" -" movd %%mm0, %%"REG_a"\n\t" -" movw %%ax, 384(%%"REG_S")\n\t" -" pfadd %%mm3, %%mm1\n\t" -" pf2iw %%mm1, %%mm1\n\t" -" movd %%mm1, %%"REG_a"\n\t" -" movw %%ax, 128(%%"REG_S")\n\t" -" pfadd %%mm3, %%mm2\n\t" -" pf2iw %%mm2, %%mm2\n\t" -" movd %%mm2, %%"REG_a"\n\t" -" movw %%ax, 128(%%"REG_D")\n\t" - -/* Phase 10*/ - -" movq 32(%%"REG_d"), %%mm0\n\t" -" movq 48(%%"REG_d"), %%mm1\n\t" -" pfadd 48(%%"REG_d"), %%mm0\n\t" -" pfadd 40(%%"REG_d"), %%mm1\n\t" -" pf2iw %%mm0, %%mm0\n\t" -" pf2iw %%mm1, %%mm1\n\t" -" movd %%mm0, %%"REG_a"\n\t" -" movd %%mm1, %%"REG_c"\n\t" -" movw %%ax, 448(%%"REG_S")\n\t" -" movw %%cx, 320(%%"REG_S")\n\t" -" psrlq $32, %%mm0\n\t" -" psrlq $32, %%mm1\n\t" -" movd %%mm0, %%"REG_a"\n\t" -" movd %%mm1, %%"REG_c"\n\t" -" movw %%ax, 64(%%"REG_D")\n\t" -" movw %%cx, 192(%%"REG_D")\n\t" - -" movd 40(%%"REG_d"), %%mm3\n\t" -" movd 56(%%"REG_d"), %%mm4\n\t" -" movd 60(%%"REG_d"), %%mm0\n\t" -" movd 44(%%"REG_d"), %%mm2\n\t" -" movd 120(%%"REG_d"), %%mm5\n\t" -" punpckldq %%mm4, %%mm3\n\t" -" punpckldq 124(%%"REG_d"), %%mm0\n\t" -" pfadd 100(%%"REG_d"), %%mm5\n\t" -" punpckldq 36(%%"REG_d"), %%mm4\n\t" -" punpckldq 92(%%"REG_d"), %%mm2\n\t" -" movq %%mm5, %%mm6\n\t" -" pfadd %%mm4, %%mm3\n\t" -" pf2iw %%mm0, %%mm1\n\t" -" pf2iw %%mm3, %%mm3\n\t" -" pfadd 88(%%"REG_d"), %%mm5\n\t" -" movd %%mm1, %%"REG_a"\n\t" -" movd %%mm3, %%"REG_c"\n\t" -" movw 
%%ax, 448(%%"REG_D")\n\t" -" movw %%cx, 192(%%"REG_S")\n\t" -" pf2iw %%mm5, %%mm5\n\t" -" psrlq $32, %%mm1\n\t" -" psrlq $32, %%mm3\n\t" -" movd %%mm5, %%"REG_b"\n\t" -" movd %%mm1, %%"REG_a"\n\t" -" movd %%mm3, %%"REG_c"\n\t" -" movw %%bx, 96(%%"REG_S")\n\t" -" movw %%ax, 480(%%"REG_D")\n\t" -" movw %%cx, 64(%%"REG_S")\n\t" -" pfadd %%mm2, %%mm0\n\t" -" pf2iw %%mm0, %%mm0\n\t" -" movd %%mm0, %%"REG_a"\n\t" -" pfadd 68(%%"REG_d"), %%mm6\n\t" -" movw %%ax, 320(%%"REG_D")\n\t" -" psrlq $32, %%mm0\n\t" -" pf2iw %%mm6, %%mm6\n\t" -" movd %%mm0, %%"REG_a"\n\t" -" movd %%mm6, %%"REG_b"\n\t" -" movw %%ax, 416(%%"REG_D")\n\t" -" movw %%bx, 32(%%"REG_S")\n\t" - -" movq 96(%%"REG_d"), %%mm0\n\t" -" movq 112(%%"REG_d"), %%mm2\n\t" -" movq 104(%%"REG_d"), %%mm4\n\t" -" pfadd %%mm2, %%mm0\n\t" -" pfadd %%mm4, %%mm2\n\t" -" pfadd 120(%%"REG_d"), %%mm4\n\t" -" movq %%mm0, %%mm1\n\t" -" movq %%mm2, %%mm3\n\t" -" movq %%mm4, %%mm5\n\t" -" pfadd 64(%%"REG_d"), %%mm0\n\t" -" pfadd 80(%%"REG_d"), %%mm2\n\t" -" pfadd 72(%%"REG_d"), %%mm4\n\t" -" pf2iw %%mm0, %%mm0\n\t" -" pf2iw %%mm2, %%mm2\n\t" -" pf2iw %%mm4, %%mm4\n\t" -" movd %%mm0, %%"REG_a"\n\t" -" movd %%mm2, %%"REG_c"\n\t" -" movd %%mm4, %%"REG_b"\n\t" -" movw %%ax, 480(%%"REG_S")\n\t" -" movw %%cx, 352(%%"REG_S")\n\t" -" movw %%bx, 224(%%"REG_S")\n\t" -" psrlq $32, %%mm0\n\t" -" psrlq $32, %%mm2\n\t" -" psrlq $32, %%mm4\n\t" -" movd %%mm0, %%"REG_a"\n\t" -" movd %%mm2, %%"REG_c"\n\t" -" movd %%mm4, %%"REG_b"\n\t" -" movw %%ax, 32(%%"REG_D")\n\t" -" movw %%cx, 160(%%"REG_D")\n\t" -" movw %%bx, 288(%%"REG_D")\n\t" -" pfadd 80(%%"REG_d"), %%mm1\n\t" -" pfadd 72(%%"REG_d"), %%mm3\n\t" -" pfadd 88(%%"REG_d"), %%mm5\n\t" -" pf2iw %%mm1, %%mm1\n\t" -" pf2iw %%mm3, %%mm3\n\t" -" pf2iw %%mm5, %%mm5\n\t" -" movd %%mm1, %%"REG_a"\n\t" -" movd %%mm3, %%"REG_c"\n\t" -" movd %%mm5, %%"REG_b"\n\t" -" movw %%ax, 416(%%"REG_S")\n\t" -" movw %%cx, 288(%%"REG_S")\n\t" -" movw %%bx, 160(%%"REG_S")\n\t" -" psrlq $32, %%mm1\n\t" -" psrlq $32, 
%%mm3\n\t" -" psrlq $32, %%mm5\n\t" -" movd %%mm1, %%"REG_a"\n\t" -" movd %%mm3, %%"REG_c"\n\t" -" movd %%mm5, %%"REG_b"\n\t" -" movw %%ax, 96(%%"REG_D")\n\t" -" movw %%cx, 224(%%"REG_D")\n\t" -" movw %%bx, 352(%%"REG_D")\n\t" - -" movsw\n\t" - -".L_bye:\n\t" -" femms\n\t" - : - :"m"(a),"m"(b),"m"(c),"m"(tmp[0]) - :"memory","%eax","%ebx","%ecx","%edx","%esi","%edi"); -} diff -r bc0898c7399b -r b924f0df5a1d mp3lib/dct64_mmx.c --- a/mp3lib/dct64_mmx.c Sun Oct 21 11:14:13 2012 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,987 +0,0 @@ -/* -* This code was taken from http://www.mpg123.org -* See ChangeLog of mpg123-0.59s-pre.1 for detail -* Applied to mplayer by Nick Kurshev -*/ -#include "config.h" -#include "mangle.h" -#include "mpg123.h" -#include "libavutil/x86_cpu.h" - -void dct64_MMX(short *a,short *b,real *c) -{ - char tmp[256]; - __asm__ volatile( -" mov %2,%%"REG_a"\n\t" -/* Phase 1*/ -" flds (%%"REG_a")\n\t" -" lea 128+%3,%%"REG_d"\n\t" -" fadds 124(%%"REG_a")\n\t" -" mov %0,%%"REG_S"\n\t" -" fstps (%%"REG_d")\n\t" -" mov %1,%%"REG_D"\n\t" - -" flds 4(%%"REG_a")\n\t" -" mov $"MANGLE(costab_mmx)",%%"REG_b"\n\t" -" fadds 120(%%"REG_a")\n\t" -" or %%"REG_c",%%"REG_c"\n\t" -" fstps 4(%%"REG_d")\n\t" - -" flds (%%"REG_a")\n\t" -" lea %3,%%"REG_c"\n\t" -" fsubs 124(%%"REG_a")\n\t" -" fmuls (%%"REG_b")\n\t" -" fstps 124(%%"REG_d")\n\t" - -" flds 4(%%"REG_a")\n\t" -" fsubs 120(%%"REG_a")\n\t" -" fmuls 4(%%"REG_b")\n\t" -" fstps 120(%%"REG_d")\n\t" - -" flds 8(%%"REG_a")\n\t" -" fadds 116(%%"REG_a")\n\t" -" fstps 8(%%"REG_d")\n\t" - -" flds 12(%%"REG_a")\n\t" -" fadds 112(%%"REG_a")\n\t" -" fstps 12(%%"REG_d")\n\t" - -" flds 8(%%"REG_a")\n\t" -" fsubs 116(%%"REG_a")\n\t" -" fmuls 8(%%"REG_b")\n\t" -" fstps 116(%%"REG_d")\n\t" - -" flds 12(%%"REG_a")\n\t" -" fsubs 112(%%"REG_a")\n\t" -" fmuls 12(%%"REG_b")\n\t" -" fstps 112(%%"REG_d")\n\t" - -" flds 16(%%"REG_a")\n\t" -" fadds 108(%%"REG_a")\n\t" -" fstps 16(%%"REG_d")\n\t" - -" flds 20(%%"REG_a")\n\t" -" 
fadds 104(%%"REG_a")\n\t" -" fstps 20(%%"REG_d")\n\t" - -" flds 16(%%"REG_a")\n\t" -" fsubs 108(%%"REG_a")\n\t" -" fmuls 16(%%"REG_b")\n\t" -" fstps 108(%%"REG_d")\n\t" - -" flds 20(%%"REG_a")\n\t" -" fsubs 104(%%"REG_a")\n\t" -" fmuls 20(%%"REG_b")\n\t" -" fstps 104(%%"REG_d")\n\t" - -" flds 24(%%"REG_a")\n\t" -" fadds 100(%%"REG_a")\n\t" -" fstps 24(%%"REG_d")\n\t" - -" flds 28(%%"REG_a")\n\t" -" fadds 96(%%"REG_a")\n\t" -" fstps 28(%%"REG_d")\n\t" - -" flds 24(%%"REG_a")\n\t" -" fsubs 100(%%"REG_a")\n\t" -" fmuls 24(%%"REG_b")\n\t" -" fstps 100(%%"REG_d")\n\t" - -" flds 28(%%"REG_a")\n\t" -" fsubs 96(%%"REG_a")\n\t" -" fmuls 28(%%"REG_b")\n\t" -" fstps 96(%%"REG_d")\n\t" - -" flds 32(%%"REG_a")\n\t" -" fadds 92(%%"REG_a")\n\t" -" fstps 32(%%"REG_d")\n\t" - -" flds 36(%%"REG_a")\n\t" -" fadds 88(%%"REG_a")\n\t" -" fstps 36(%%"REG_d")\n\t" - -" flds 32(%%"REG_a")\n\t" -" fsubs 92(%%"REG_a")\n\t" -" fmuls 32(%%"REG_b")\n\t" -" fstps 92(%%"REG_d")\n\t" - -" flds 36(%%"REG_a")\n\t" -" fsubs 88(%%"REG_a")\n\t" -" fmuls 36(%%"REG_b")\n\t" -" fstps 88(%%"REG_d")\n\t" - -" flds 40(%%"REG_a")\n\t" -" fadds 84(%%"REG_a")\n\t" -" fstps 40(%%"REG_d")\n\t" - -" flds 44(%%"REG_a")\n\t" -" fadds 80(%%"REG_a")\n\t" -" fstps 44(%%"REG_d")\n\t" - -" flds 40(%%"REG_a")\n\t" -" fsubs 84(%%"REG_a")\n\t" -" fmuls 40(%%"REG_b")\n\t" -" fstps 84(%%"REG_d")\n\t" - -" flds 44(%%"REG_a")\n\t" -" fsubs 80(%%"REG_a")\n\t" -" fmuls 44(%%"REG_b")\n\t" -" fstps 80(%%"REG_d")\n\t" - -" flds 48(%%"REG_a")\n\t" -" fadds 76(%%"REG_a")\n\t" -" fstps 48(%%"REG_d")\n\t" - -" flds 52(%%"REG_a")\n\t" -" fadds 72(%%"REG_a")\n\t" -" fstps 52(%%"REG_d")\n\t" - -" flds 48(%%"REG_a")\n\t" -" fsubs 76(%%"REG_a")\n\t" -" fmuls 48(%%"REG_b")\n\t" -" fstps 76(%%"REG_d")\n\t" - -" flds 52(%%"REG_a")\n\t" -" fsubs 72(%%"REG_a")\n\t" -" fmuls 52(%%"REG_b")\n\t" -" fstps 72(%%"REG_d")\n\t" - -" flds 56(%%"REG_a")\n\t" -" fadds 68(%%"REG_a")\n\t" -" fstps 56(%%"REG_d")\n\t" - -" flds 60(%%"REG_a")\n\t" -" fadds 
64(%%"REG_a")\n\t" -" fstps 60(%%"REG_d")\n\t" - -" flds 56(%%"REG_a")\n\t" -" fsubs 68(%%"REG_a")\n\t" -" fmuls 56(%%"REG_b")\n\t" -" fstps 68(%%"REG_d")\n\t" - -" flds 60(%%"REG_a")\n\t" -" fsubs 64(%%"REG_a")\n\t" -" fmuls 60(%%"REG_b")\n\t" -" fstps 64(%%"REG_d")\n\t" - -/* Phase 2*/ - -" flds (%%"REG_d")\n\t" -" fadds 60(%%"REG_d")\n\t" -" fstps (%%"REG_c")\n\t" - -" flds 4(%%"REG_d")\n\t" -" fadds 56(%%"REG_d")\n\t" -" fstps 4(%%"REG_c")\n\t" - -" flds (%%"REG_d")\n\t" -" fsubs 60(%%"REG_d")\n\t" -" fmuls 64(%%"REG_b")\n\t" -" fstps 60(%%"REG_c")\n\t" - -" flds 4(%%"REG_d")\n\t" -" fsubs 56(%%"REG_d")\n\t" -" fmuls 68(%%"REG_b")\n\t" -" fstps 56(%%"REG_c")\n\t" - -" flds 8(%%"REG_d")\n\t" -" fadds 52(%%"REG_d")\n\t" -" fstps 8(%%"REG_c")\n\t" - -" flds 12(%%"REG_d")\n\t" -" fadds 48(%%"REG_d")\n\t" -" fstps 12(%%"REG_c")\n\t" - -" flds 8(%%"REG_d")\n\t" -" fsubs 52(%%"REG_d")\n\t" -" fmuls 72(%%"REG_b")\n\t" -" fstps 52(%%"REG_c")\n\t" - -" flds 12(%%"REG_d")\n\t" -" fsubs 48(%%"REG_d")\n\t" -" fmuls 76(%%"REG_b")\n\t" -" fstps 48(%%"REG_c")\n\t" - -" flds 16(%%"REG_d")\n\t" -" fadds 44(%%"REG_d")\n\t" -" fstps 16(%%"REG_c")\n\t" - -" flds 20(%%"REG_d")\n\t" -" fadds 40(%%"REG_d")\n\t" -" fstps 20(%%"REG_c")\n\t" - -" flds 16(%%"REG_d")\n\t" -" fsubs 44(%%"REG_d")\n\t" -" fmuls 80(%%"REG_b")\n\t" -" fstps 44(%%"REG_c")\n\t" - -" flds 20(%%"REG_d")\n\t" -" fsubs 40(%%"REG_d")\n\t" -" fmuls 84(%%"REG_b")\n\t" -" fstps 40(%%"REG_c")\n\t" - -" flds 24(%%"REG_d")\n\t" -" fadds 36(%%"REG_d")\n\t" -" fstps 24(%%"REG_c")\n\t" - -" flds 28(%%"REG_d")\n\t" -" fadds 32(%%"REG_d")\n\t" -" fstps 28(%%"REG_c")\n\t" - -" flds 24(%%"REG_d")\n\t" -" fsubs 36(%%"REG_d")\n\t" -" fmuls 88(%%"REG_b")\n\t" -" fstps 36(%%"REG_c")\n\t" - -" flds 28(%%"REG_d")\n\t" -" fsubs 32(%%"REG_d")\n\t" -" fmuls 92(%%"REG_b")\n\t" -" fstps 32(%%"REG_c")\n\t" - -/* Phase 3*/ - -" flds 64(%%"REG_d")\n\t" -" fadds 124(%%"REG_d")\n\t" -" fstps 64(%%"REG_c")\n\t" - -" flds 68(%%"REG_d")\n\t" -" 
fadds 120(%%"REG_d")\n\t" -" fstps 68(%%"REG_c")\n\t" - -" flds 124(%%"REG_d")\n\t" -" fsubs 64(%%"REG_d")\n\t" -" fmuls 64(%%"REG_b")\n\t" -" fstps 124(%%"REG_c")\n\t" - -" flds 120(%%"REG_d")\n\t" -" fsubs 68(%%"REG_d")\n\t" -" fmuls 68(%%"REG_b")\n\t" -" fstps 120(%%"REG_c")\n\t" - -" flds 72(%%"REG_d")\n\t" -" fadds 116(%%"REG_d")\n\t" -" fstps 72(%%"REG_c")\n\t" - -" flds 76(%%"REG_d")\n\t" -" fadds 112(%%"REG_d")\n\t" -" fstps 76(%%"REG_c")\n\t" - -" flds 116(%%"REG_d")\n\t" -" fsubs 72(%%"REG_d")\n\t" -" fmuls 72(%%"REG_b")\n\t" -" fstps 116(%%"REG_c")\n\t" - -" flds 112(%%"REG_d")\n\t" -" fsubs 76(%%"REG_d")\n\t" -" fmuls 76(%%"REG_b")\n\t" -" fstps 112(%%"REG_c")\n\t" - -" flds 80(%%"REG_d")\n\t" -" fadds 108(%%"REG_d")\n\t" -" fstps 80(%%"REG_c")\n\t" - -" flds 84(%%"REG_d")\n\t" -" fadds 104(%%"REG_d")\n\t" -" fstps 84(%%"REG_c")\n\t" - -" flds 108(%%"REG_d")\n\t" -" fsubs 80(%%"REG_d")\n\t" -" fmuls 80(%%"REG_b")\n\t" -" fstps 108(%%"REG_c")\n\t" - -" flds 104(%%"REG_d")\n\t" -" fsubs 84(%%"REG_d")\n\t" -" fmuls 84(%%"REG_b")\n\t" -" fstps 104(%%"REG_c")\n\t" - -" flds 88(%%"REG_d")\n\t" -" fadds 100(%%"REG_d")\n\t" -" fstps 88(%%"REG_c")\n\t" - -" flds 92(%%"REG_d")\n\t" -" fadds 96(%%"REG_d")\n\t" -" fstps 92(%%"REG_c")\n\t" - -" flds 100(%%"REG_d")\n\t" -" fsubs 88(%%"REG_d")\n\t" -" fmuls 88(%%"REG_b")\n\t" -" fstps 100(%%"REG_c")\n\t" - -" flds 96(%%"REG_d")\n\t" -" fsubs 92(%%"REG_d")\n\t" -" fmuls 92(%%"REG_b")\n\t" -" fstps 96(%%"REG_c")\n\t" - -/* Phase 4*/ - -" flds (%%"REG_c")\n\t" -" fadds 28(%%"REG_c")\n\t" -" fstps (%%"REG_d")\n\t" - -" flds (%%"REG_c")\n\t" -" fsubs 28(%%"REG_c")\n\t" -" fmuls 96(%%"REG_b")\n\t" -" fstps 28(%%"REG_d")\n\t" - -" flds 4(%%"REG_c")\n\t" -" fadds 24(%%"REG_c")\n\t" -" fstps 4(%%"REG_d")\n\t" - -" flds 4(%%"REG_c")\n\t" -" fsubs 24(%%"REG_c")\n\t" -" fmuls 100(%%"REG_b")\n\t" -" fstps 24(%%"REG_d")\n\t" - -" flds 8(%%"REG_c")\n\t" -" fadds 20(%%"REG_c")\n\t" -" fstps 8(%%"REG_d")\n\t" - -" flds 
8(%%"REG_c")\n\t" -" fsubs 20(%%"REG_c")\n\t" -" fmuls 104(%%"REG_b")\n\t" -" fstps 20(%%"REG_d")\n\t" - -" flds 12(%%"REG_c")\n\t" -" fadds 16(%%"REG_c")\n\t" -" fstps 12(%%"REG_d")\n\t" - -" flds 12(%%"REG_c")\n\t" -" fsubs 16(%%"REG_c")\n\t" -" fmuls 108(%%"REG_b")\n\t" -" fstps 16(%%"REG_d")\n\t" - -" flds 32(%%"REG_c")\n\t" -" fadds 60(%%"REG_c")\n\t" -" fstps 32(%%"REG_d")\n\t" - -" flds 60(%%"REG_c")\n\t" -" fsubs 32(%%"REG_c")\n\t" -" fmuls 96(%%"REG_b")\n\t" -" fstps 60(%%"REG_d")\n\t" - -" flds 36(%%"REG_c")\n\t" -" fadds 56(%%"REG_c")\n\t" -" fstps 36(%%"REG_d")\n\t" - -" flds 56(%%"REG_c")\n\t" -" fsubs 36(%%"REG_c")\n\t" -" fmuls 100(%%"REG_b")\n\t" -" fstps 56(%%"REG_d")\n\t" - -" flds 40(%%"REG_c")\n\t" -" fadds 52(%%"REG_c")\n\t" -" fstps 40(%%"REG_d")\n\t" - -" flds 52(%%"REG_c")\n\t" -" fsubs 40(%%"REG_c")\n\t" -" fmuls 104(%%"REG_b")\n\t" -" fstps 52(%%"REG_d")\n\t" - -" flds 44(%%"REG_c")\n\t" -" fadds 48(%%"REG_c")\n\t" -" fstps 44(%%"REG_d")\n\t" - -" flds 48(%%"REG_c")\n\t" -" fsubs 44(%%"REG_c")\n\t" -" fmuls 108(%%"REG_b")\n\t" -" fstps 48(%%"REG_d")\n\t" - -" flds 64(%%"REG_c")\n\t" -" fadds 92(%%"REG_c")\n\t" -" fstps 64(%%"REG_d")\n\t" - -" flds 64(%%"REG_c")\n\t" -" fsubs 92(%%"REG_c")\n\t" -" fmuls 96(%%"REG_b")\n\t" -" fstps 92(%%"REG_d")\n\t" - -" flds 68(%%"REG_c")\n\t" -" fadds 88(%%"REG_c")\n\t" -" fstps 68(%%"REG_d")\n\t" - -" flds 68(%%"REG_c")\n\t" -" fsubs 88(%%"REG_c")\n\t" -" fmuls 100(%%"REG_b")\n\t" -" fstps 88(%%"REG_d")\n\t" - -" flds 72(%%"REG_c")\n\t" -" fadds 84(%%"REG_c")\n\t" -" fstps 72(%%"REG_d")\n\t" - -" flds 72(%%"REG_c")\n\t" -" fsubs 84(%%"REG_c")\n\t" -" fmuls 104(%%"REG_b")\n\t" -" fstps 84(%%"REG_d")\n\t" - -" flds 76(%%"REG_c")\n\t" -" fadds 80(%%"REG_c")\n\t" -" fstps 76(%%"REG_d")\n\t" - -" flds 76(%%"REG_c")\n\t" -" fsubs 80(%%"REG_c")\n\t" -" fmuls 108(%%"REG_b")\n\t" -" fstps 80(%%"REG_d")\n\t" - -" flds 96(%%"REG_c")\n\t" -" fadds 124(%%"REG_c")\n\t" -" fstps 96(%%"REG_d")\n\t" - -" flds 
124(%%"REG_c")\n\t" -" fsubs 96(%%"REG_c")\n\t" -" fmuls 96(%%"REG_b")\n\t" -" fstps 124(%%"REG_d")\n\t" - -" flds 100(%%"REG_c")\n\t" -" fadds 120(%%"REG_c")\n\t" -" fstps 100(%%"REG_d")\n\t" - -" flds 120(%%"REG_c")\n\t" -" fsubs 100(%%"REG_c")\n\t" -" fmuls 100(%%"REG_b")\n\t" -" fstps 120(%%"REG_d")\n\t" - -" flds 104(%%"REG_c")\n\t" -" fadds 116(%%"REG_c")\n\t" -" fstps 104(%%"REG_d")\n\t" - -" flds 116(%%"REG_c")\n\t" -" fsubs 104(%%"REG_c")\n\t" -" fmuls 104(%%"REG_b")\n\t" -" fstps 116(%%"REG_d")\n\t" - -" flds 108(%%"REG_c")\n\t" -" fadds 112(%%"REG_c")\n\t" -" fstps 108(%%"REG_d")\n\t" - -" flds 112(%%"REG_c")\n\t" -" fsubs 108(%%"REG_c")\n\t" -" fmuls 108(%%"REG_b")\n\t" -" fstps 112(%%"REG_d")\n\t" - -" flds (%%"REG_d")\n\t" -" fadds 12(%%"REG_d")\n\t" -" fstps (%%"REG_c")\n\t" - -" flds (%%"REG_d")\n\t" -" fsubs 12(%%"REG_d")\n\t" -" fmuls 112(%%"REG_b")\n\t" -" fstps 12(%%"REG_c")\n\t" - -" flds 4(%%"REG_d")\n\t" -" fadds 8(%%"REG_d")\n\t" -" fstps 4(%%"REG_c")\n\t" - -" flds 4(%%"REG_d")\n\t" -" fsubs 8(%%"REG_d")\n\t" -" fmuls 116(%%"REG_b")\n\t" -" fstps 8(%%"REG_c")\n\t" - -" flds 16(%%"REG_d")\n\t" -" fadds 28(%%"REG_d")\n\t" -" fstps 16(%%"REG_c")\n\t" - -" flds 28(%%"REG_d")\n\t" -" fsubs 16(%%"REG_d")\n\t" -" fmuls 112(%%"REG_b")\n\t" -" fstps 28(%%"REG_c")\n\t" - -" flds 20(%%"REG_d")\n\t" -" fadds 24(%%"REG_d")\n\t" -" fstps 20(%%"REG_c")\n\t" - -" flds 24(%%"REG_d")\n\t" -" fsubs 20(%%"REG_d")\n\t" -" fmuls 116(%%"REG_b")\n\t" -" fstps 24(%%"REG_c")\n\t" - -" flds 32(%%"REG_d")\n\t" -" fadds 44(%%"REG_d")\n\t" -" fstps 32(%%"REG_c")\n\t" - -" flds 32(%%"REG_d")\n\t" -" fsubs 44(%%"REG_d")\n\t" -" fmuls 112(%%"REG_b")\n\t" -" fstps 44(%%"REG_c")\n\t" - -" flds 36(%%"REG_d")\n\t" -" fadds 40(%%"REG_d")\n\t" -" fstps 36(%%"REG_c")\n\t" - -" flds 36(%%"REG_d")\n\t" -" fsubs 40(%%"REG_d")\n\t" -" fmuls 116(%%"REG_b")\n\t" -" fstps 40(%%"REG_c")\n\t" - -" flds 48(%%"REG_d")\n\t" -" fadds 60(%%"REG_d")\n\t" -" fstps 48(%%"REG_c")\n\t" - -" flds 
60(%%"REG_d")\n\t" -" fsubs 48(%%"REG_d")\n\t" -" fmuls 112(%%"REG_b")\n\t" -" fstps 60(%%"REG_c")\n\t" - -" flds 52(%%"REG_d")\n\t" -" fadds 56(%%"REG_d")\n\t" -" fstps 52(%%"REG_c")\n\t" - -" flds 56(%%"REG_d")\n\t" -" fsubs 52(%%"REG_d")\n\t" -" fmuls 116(%%"REG_b")\n\t" -" fstps 56(%%"REG_c")\n\t" - -" flds 64(%%"REG_d")\n\t" -" fadds 76(%%"REG_d")\n\t" -" fstps 64(%%"REG_c")\n\t" - -" flds 64(%%"REG_d")\n\t" -" fsubs 76(%%"REG_d")\n\t" -" fmuls 112(%%"REG_b")\n\t" -" fstps 76(%%"REG_c")\n\t" - -" flds 68(%%"REG_d")\n\t" -" fadds 72(%%"REG_d")\n\t" -" fstps 68(%%"REG_c")\n\t" - -" flds 68(%%"REG_d")\n\t" -" fsubs 72(%%"REG_d")\n\t" -" fmuls 116(%%"REG_b")\n\t" -" fstps 72(%%"REG_c")\n\t" - -" flds 80(%%"REG_d")\n\t" -" fadds 92(%%"REG_d")\n\t" -" fstps 80(%%"REG_c")\n\t" - -" flds 92(%%"REG_d")\n\t" -" fsubs 80(%%"REG_d")\n\t" -" fmuls 112(%%"REG_b")\n\t" -" fstps 92(%%"REG_c")\n\t" - -" flds 84(%%"REG_d")\n\t" -" fadds 88(%%"REG_d")\n\t" -" fstps 84(%%"REG_c")\n\t" - -" flds 88(%%"REG_d")\n\t" -" fsubs 84(%%"REG_d")\n\t" -" fmuls 116(%%"REG_b")\n\t" -" fstps 88(%%"REG_c")\n\t" - -" flds 96(%%"REG_d")\n\t" -" fadds 108(%%"REG_d")\n\t" -" fstps 96(%%"REG_c")\n\t" - -" flds 96(%%"REG_d")\n\t" -" fsubs 108(%%"REG_d")\n\t" -" fmuls 112(%%"REG_b")\n\t" -" fstps 108(%%"REG_c")\n\t" - -" flds 100(%%"REG_d")\n\t" -" fadds 104(%%"REG_d")\n\t" -" fstps 100(%%"REG_c")\n\t" - -" flds 100(%%"REG_d")\n\t" -" fsubs 104(%%"REG_d")\n\t" -" fmuls 116(%%"REG_b")\n\t" -" fstps 104(%%"REG_c")\n\t" - -" flds 112(%%"REG_d")\n\t" -" fadds 124(%%"REG_d")\n\t" -" fstps 112(%%"REG_c")\n\t" - -" flds 124(%%"REG_d")\n\t" -" fsubs 112(%%"REG_d")\n\t" -" fmuls 112(%%"REG_b")\n\t" -" fstps 124(%%"REG_c")\n\t" - -" flds 116(%%"REG_d")\n\t" -" fadds 120(%%"REG_d")\n\t" -" fstps 116(%%"REG_c")\n\t" - -" flds 120(%%"REG_d")\n\t" -" fsubs 116(%%"REG_d")\n\t" -" fmuls 116(%%"REG_b")\n\t" -" fstps 120(%%"REG_c")\n\t" - -/* Phase 5*/ - -" flds 32(%%"REG_c")\n\t" -" fadds 36(%%"REG_c")\n\t" -" fstps 
32(%%"REG_d")\n\t" - -" flds 32(%%"REG_c")\n\t" -" fsubs 36(%%"REG_c")\n\t" -" fmuls 120(%%"REG_b")\n\t" -" fstps 36(%%"REG_d")\n\t" - -" flds 44(%%"REG_c")\n\t" -" fsubs 40(%%"REG_c")\n\t" -" fmuls 120(%%"REG_b")\n\t" -" fsts 44(%%"REG_d")\n\t" -" fadds 40(%%"REG_c")\n\t" -" fadds 44(%%"REG_c")\n\t" -" fstps 40(%%"REG_d")\n\t" - -" flds 48(%%"REG_c")\n\t" -" fsubs 52(%%"REG_c")\n\t" -" fmuls 120(%%"REG_b")\n\t" - -" flds 60(%%"REG_c")\n\t" -" fsubs 56(%%"REG_c")\n\t" -" fmuls 120(%%"REG_b")\n\t" -" fld %%st(0)\n\t" -" fadds 56(%%"REG_c")\n\t" -" fadds 60(%%"REG_c")\n\t" -" fld %%st(0)\n\t" -" fadds 48(%%"REG_c")\n\t" -" fadds 52(%%"REG_c")\n\t" -" fstps 48(%%"REG_d")\n\t" -" fadd %%st(2)\n\t" -" fstps 56(%%"REG_d")\n\t" -" fsts 60(%%"REG_d")\n\t" -" faddp %%st(1)\n\t" -" fstps 52(%%"REG_d")\n\t" - -" flds 64(%%"REG_c")\n\t" -" fadds 68(%%"REG_c")\n\t" -" fstps 64(%%"REG_d")\n\t" - -" flds 64(%%"REG_c")\n\t" -" fsubs 68(%%"REG_c")\n\t" -" fmuls 120(%%"REG_b")\n\t" -" fstps 68(%%"REG_d")\n\t" - -" flds 76(%%"REG_c")\n\t" -" fsubs 72(%%"REG_c")\n\t" -" fmuls 120(%%"REG_b")\n\t" -" fsts 76(%%"REG_d")\n\t" -" fadds 72(%%"REG_c")\n\t" -" fadds 76(%%"REG_c")\n\t" -" fstps 72(%%"REG_d")\n\t" - -" flds 92(%%"REG_c")\n\t" -" fsubs 88(%%"REG_c")\n\t" -" fmuls 120(%%"REG_b")\n\t" -" fsts 92(%%"REG_d")\n\t" -" fadds 92(%%"REG_c")\n\t" -" fadds 88(%%"REG_c")\n\t" -" fld %%st(0)\n\t" -" fadds 80(%%"REG_c")\n\t" -" fadds 84(%%"REG_c")\n\t" -" fstps 80(%%"REG_d")\n\t" - -" flds 80(%%"REG_c")\n\t" -" fsubs 84(%%"REG_c")\n\t" -" fmuls 120(%%"REG_b")\n\t" -" fadd %%st(0), %%st(1)\n\t" -" fadds 92(%%"REG_d")\n\t" -" fstps 84(%%"REG_d")\n\t" -" fstps 88(%%"REG_d")\n\t" - -" flds 96(%%"REG_c")\n\t" -" fadds 100(%%"REG_c")\n\t" -" fstps 96(%%"REG_d")\n\t" - -" flds 96(%%"REG_c")\n\t" -" fsubs 100(%%"REG_c")\n\t" -" fmuls 120(%%"REG_b")\n\t" -" fstps 100(%%"REG_d")\n\t" - -" flds 108(%%"REG_c")\n\t" -" fsubs 104(%%"REG_c")\n\t" -" fmuls 120(%%"REG_b")\n\t" -" fsts 108(%%"REG_d")\n\t" -" 
fadds 104(%%"REG_c")\n\t" -" fadds 108(%%"REG_c")\n\t" -" fstps 104(%%"REG_d")\n\t" - -" flds 124(%%"REG_c")\n\t" -" fsubs 120(%%"REG_c")\n\t" -" fmuls 120(%%"REG_b")\n\t" -" fsts 124(%%"REG_d")\n\t" -" fadds 120(%%"REG_c")\n\t" -" fadds 124(%%"REG_c")\n\t" -" fld %%st(0)\n\t" -" fadds 112(%%"REG_c")\n\t" -" fadds 116(%%"REG_c")\n\t" -" fstps 112(%%"REG_d")\n\t" - -" flds 112(%%"REG_c")\n\t" -" fsubs 116(%%"REG_c")\n\t" -" fmuls 120(%%"REG_b")\n\t" -" fadd %%st(0),%%st(1)\n\t" -" fadds 124(%%"REG_d")\n\t" -" fstps 116(%%"REG_d")\n\t" -" fstps 120(%%"REG_d")\n\t" -" jnz .L01\n\t" - -/* Phase 6*/ - -" flds (%%"REG_c")\n\t" -" fadds 4(%%"REG_c")\n\t" -" fstps 1024(%%"REG_S")\n\t" - -" flds (%%"REG_c")\n\t" -" fsubs 4(%%"REG_c")\n\t" -" fmuls 120(%%"REG_b")\n\t" -" fsts (%%"REG_S")\n\t" -" fstps (%%"REG_D")\n\t" - -" flds 12(%%"REG_c")\n\t" -" fsubs 8(%%"REG_c")\n\t" -" fmuls 120(%%"REG_b")\n\t" -" fsts 512(%%"REG_D")\n\t" -" fadds 12(%%"REG_c")\n\t" -" fadds 8(%%"REG_c")\n\t" -" fstps 512(%%"REG_S")\n\t" - -" flds 16(%%"REG_c")\n\t" -" fsubs 20(%%"REG_c")\n\t" -" fmuls 120(%%"REG_b")\n\t" - -" flds 28(%%"REG_c")\n\t" -" fsubs 24(%%"REG_c")\n\t" -" fmuls 120(%%"REG_b")\n\t" -" fsts 768(%%"REG_D")\n\t" -" fld %%st(0)\n\t" -" fadds 24(%%"REG_c")\n\t" -" fadds 28(%%"REG_c")\n\t" -" fld %%st(0)\n\t" -" fadds 16(%%"REG_c")\n\t" -" fadds 20(%%"REG_c")\n\t" -" fstps 768(%%"REG_S")\n\t" -" fadd %%st(2)\n\t" -" fstps 256(%%"REG_S")\n\t" -" faddp %%st(1)\n\t" -" fstps 256(%%"REG_D")\n\t" - -/* Phase 7*/ - -" flds 32(%%"REG_d")\n\t" -" fadds 48(%%"REG_d")\n\t" -" fstps 896(%%"REG_S")\n\t" - -" flds 48(%%"REG_d")\n\t" -" fadds 40(%%"REG_d")\n\t" -" fstps 640(%%"REG_S")\n\t" - -" flds 40(%%"REG_d")\n\t" -" fadds 56(%%"REG_d")\n\t" -" fstps 384(%%"REG_S")\n\t" - -" flds 56(%%"REG_d")\n\t" -" fadds 36(%%"REG_d")\n\t" -" fstps 128(%%"REG_S")\n\t" - -" flds 36(%%"REG_d")\n\t" -" fadds 52(%%"REG_d")\n\t" -" fstps 128(%%"REG_D")\n\t" - -" flds 52(%%"REG_d")\n\t" -" fadds 
44(%%"REG_d")\n\t" -" fstps 384(%%"REG_D")\n\t" - -" flds 60(%%"REG_d")\n\t" -" fsts 896(%%"REG_D")\n\t" -" fadds 44(%%"REG_d")\n\t" -" fstps 640(%%"REG_D")\n\t" - -" flds 96(%%"REG_d")\n\t" -" fadds 112(%%"REG_d")\n\t" -" fld %%st(0)\n\t" -" fadds 64(%%"REG_d")\n\t" -" fstps 960(%%"REG_S")\n\t" -" fadds 80(%%"REG_d")\n\t" -" fstps 832(%%"REG_S")\n\t" - -" flds 112(%%"REG_d")\n\t" -" fadds 104(%%"REG_d")\n\t" -" fld %%st(0)\n\t" -" fadds 80(%%"REG_d")\n\t" -" fstps 704(%%"REG_S")\n\t" -" fadds 72(%%"REG_d")\n\t" -" fstps 576(%%"REG_S")\n\t" - -" flds 104(%%"REG_d")\n\t" -" fadds 120(%%"REG_d")\n\t" -" fld %%st(0)\n\t" -" fadds 72(%%"REG_d")\n\t" -" fstps 448(%%"REG_S")\n\t" -" fadds 88(%%"REG_d")\n\t" -" fstps 320(%%"REG_S")\n\t" - -" flds 120(%%"REG_d")\n\t" -" fadds 100(%%"REG_d")\n\t" -" fld %%st(0)\n\t" -" fadds 88(%%"REG_d")\n\t" -" fstps 192(%%"REG_S")\n\t" -" fadds 68(%%"REG_d")\n\t" -" fstps 64(%%"REG_S")\n\t" - -" flds 100(%%"REG_d")\n\t" -" fadds 116(%%"REG_d")\n\t" -" fld %%st(0)\n\t" -" fadds 68(%%"REG_d")\n\t" -" fstps 64(%%"REG_D")\n\t" -" fadds 84(%%"REG_d")\n\t" -" fstps 192(%%"REG_D")\n\t" - -" flds 116(%%"REG_d")\n\t" -" fadds 108(%%"REG_d")\n\t" -" fld %%st(0)\n\t" -" fadds 84(%%"REG_d")\n\t" -" fstps 320(%%"REG_D")\n\t" -" fadds 76(%%"REG_d")\n\t" -" fstps 448(%%"REG_D")\n\t" - -" flds 108(%%"REG_d")\n\t" -" fadds 124(%%"REG_d")\n\t" -" fld %%st(0)\n\t" -" fadds 76(%%"REG_d")\n\t" -" fstps 576(%%"REG_D")\n\t" -" fadds 92(%%"REG_d")\n\t" -" fstps 704(%%"REG_D")\n\t" - -" flds 124(%%"REG_d")\n\t" -" fsts 960(%%"REG_D")\n\t" -" fadds 92(%%"REG_d")\n\t" -" fstps 832(%%"REG_D")\n\t" -" jmp .L_bye\n\t" -".L01:\n\t" -/* Phase 8*/ - -" flds (%%"REG_c")\n\t" -" fadds 4(%%"REG_c")\n\t" -" fistp 512(%%"REG_S")\n\t" - -" flds (%%"REG_c")\n\t" -" fsubs 4(%%"REG_c")\n\t" -" fmuls 120(%%"REG_b")\n\t" - -" fistp (%%"REG_S")\n\t" - - -" flds 12(%%"REG_c")\n\t" -" fsubs 8(%%"REG_c")\n\t" -" fmuls 120(%%"REG_b")\n\t" -" fist 256(%%"REG_D")\n\t" -" fadds 
12(%%"REG_c")\n\t" -" fadds 8(%%"REG_c")\n\t" -" fistp 256(%%"REG_S")\n\t" - -" flds 16(%%"REG_c")\n\t" -" fsubs 20(%%"REG_c")\n\t" -" fmuls 120(%%"REG_b")\n\t" - -" flds 28(%%"REG_c")\n\t" -" fsubs 24(%%"REG_c")\n\t" -" fmuls 120(%%"REG_b")\n\t" -" fist 384(%%"REG_D")\n\t" -" fld %%st(0)\n\t" -" fadds 24(%%"REG_c")\n\t" -" fadds 28(%%"REG_c")\n\t" -" fld %%st(0)\n\t" -" fadds 16(%%"REG_c")\n\t" -" fadds 20(%%"REG_c")\n\t" -" fistp 384(%%"REG_S")\n\t" -" fadd %%st(2)\n\t" -" fistp 128(%%"REG_S")\n\t" -" faddp %%st(1)\n\t" -" fistp 128(%%"REG_D")\n\t" - -/* Phase 9*/ - -" flds 32(%%"REG_d")\n\t" -" fadds 48(%%"REG_d")\n\t" -" fistp 448(%%"REG_S")\n\t" - -" flds 48(%%"REG_d")\n\t" -" fadds 40(%%"REG_d")\n\t" -" fistp 320(%%"REG_S")\n\t" - -" flds 40(%%"REG_d")\n\t" -" fadds 56(%%"REG_d")\n\t" -" fistp 192(%%"REG_S")\n\t" - -" flds 56(%%"REG_d")\n\t" -" fadds 36(%%"REG_d")\n\t" -" fistp 64(%%"REG_S")\n\t" - -" flds 36(%%"REG_d")\n\t" -" fadds 52(%%"REG_d")\n\t" -" fistp 64(%%"REG_D")\n\t" - -" flds 52(%%"REG_d")\n\t" -" fadds 44(%%"REG_d")\n\t" -" fistp 192(%%"REG_D")\n\t" - -" flds 60(%%"REG_d")\n\t" -" fist 448(%%"REG_D")\n\t" -" fadds 44(%%"REG_d")\n\t" -" fistp 320(%%"REG_D")\n\t" - -" flds 96(%%"REG_d")\n\t" -" fadds 112(%%"REG_d")\n\t" -" fld %%st(0)\n\t" -" fadds 64(%%"REG_d")\n\t" -" fistp 480(%%"REG_S")\n\t" -" fadds 80(%%"REG_d")\n\t" -" fistp 416(%%"REG_S")\n\t" - -" flds 112(%%"REG_d")\n\t" -" fadds 104(%%"REG_d")\n\t" -" fld %%st(0)\n\t" -" fadds 80(%%"REG_d")\n\t" -" fistp 352(%%"REG_S")\n\t" -" fadds 72(%%"REG_d")\n\t" -" fistp 288(%%"REG_S")\n\t" - -" flds 104(%%"REG_d")\n\t" -" fadds 120(%%"REG_d")\n\t" -" fld %%st(0)\n\t" -" fadds 72(%%"REG_d")\n\t" -" fistp 224(%%"REG_S")\n\t" -" fadds 88(%%"REG_d")\n\t" -" fistp 160(%%"REG_S")\n\t" - -" flds 120(%%"REG_d")\n\t" -" fadds 100(%%"REG_d")\n\t" -" fld %%st(0)\n\t" -" fadds 88(%%"REG_d")\n\t" -" fistp 96(%%"REG_S")\n\t" -" fadds 68(%%"REG_d")\n\t" -" fistp 32(%%"REG_S")\n\t" - -" flds 100(%%"REG_d")\n\t" 
-" fadds 116(%%"REG_d")\n\t" -" fld %%st(0)\n\t" -" fadds 68(%%"REG_d")\n\t" -" fistp 32(%%"REG_D")\n\t" -" fadds 84(%%"REG_d")\n\t" -" fistp 96(%%"REG_D")\n\t" - -" flds 116(%%"REG_d")\n\t" -" fadds 108(%%"REG_d")\n\t" -" fld %%st(0)\n\t" -" fadds 84(%%"REG_d")\n\t" -" fistp 160(%%"REG_D")\n\t" -" fadds 76(%%"REG_d")\n\t" -" fistp 224(%%"REG_D")\n\t" - -" flds 108(%%"REG_d")\n\t" -" fadds 124(%%"REG_d")\n\t" -" fld %%st(0)\n\t" -" fadds 76(%%"REG_d")\n\t" -" fistp 288(%%"REG_D")\n\t" -" fadds 92(%%"REG_d")\n\t" -" fistp 352(%%"REG_D")\n\t" - -" flds 124(%%"REG_d")\n\t" -" fist 480(%%"REG_D")\n\t" -" fadds 92(%%"REG_d")\n\t" -" fistp 416(%%"REG_D")\n\t" -" movsw\n\t" -".L_bye:" - : - :"m"(a),"m"(b),"m"(c),"m"(tmp[0]) - :"memory","%eax","%ebx","%ecx","%edx","%esi","%edi"); -} diff -r bc0898c7399b -r b924f0df5a1d mp3lib/dct64_sse.c --- a/mp3lib/dct64_sse.c Sun Oct 21 11:14:13 2012 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,421 +0,0 @@ -/* - * Discrete Cosine Tansform (DCT) for SSE - * Copyright (c) 2006 Zuxy MENG - * based upon code from mp3lib/dct64.c, mp3lib/dct64_altivec.c - * and mp3lib/dct64_mmx.c - */ - -#include "libavutil/mem.h" -#include "mpg123.h" - -extern float __attribute__((aligned(16))) costab_mmx[]; - -static const int ppnn[4] __attribute__((aligned(16))) = -{ 0, 0, 1 << 31, 1 << 31 }; - -static const int pnpn[4] __attribute__((aligned(16))) = -{ 0, 1 << 31, 0, 1 << 31 }; - -static const int nnnn[4] __attribute__((aligned(16))) = -{ 1 << 31, 1 << 31, 1 << 31, 1 << 31 }; - -void dct64_sse(short *out0,short *out1,real *c) -{ - DECLARE_ALIGNED(16, real, b1[0x20]); - DECLARE_ALIGNED(16, real, b2[0x20]); - static real const one = 1.f; - - { - real *costab = costab_mmx; - int i; - - for (i = 0; i < 0x20 / 2; i += 4) - { - __asm__( - "movaps %2, %%xmm3\n\t" - "shufps $27, %%xmm3, %%xmm3\n\t" - "movaps %3, %%xmm1\n\t" - "movaps %%xmm1, %%xmm4\n\t" - "movaps %4, %%xmm2\n\t" - "shufps $27, %%xmm4, %%xmm4\n\t" - "movaps %%xmm2, %%xmm0\n\t" - 
"shufps $27, %%xmm0, %%xmm0\n\t" - "addps %%xmm0, %%xmm1\n\t" - "movaps %%xmm1, %0\n\t" - "subps %%xmm2, %%xmm4\n\t" - "mulps %%xmm3, %%xmm4\n\t" - "movaps %%xmm4, %1\n\t" - :"=m"(*(b1 + i)), "=m"(*(b1 + 0x1c - i)) - :"m"(*(costab + i)), "m"(*(c + i)), "m"(*(c + 0x1c - i)) - ); - } - } - - { - int i; - - for (i = 0; i < 0x20; i += 0x10) - { - __asm__( - "movaps %4, %%xmm1\n\t" - "movaps %5, %%xmm3\n\t" - "movaps %6, %%xmm4\n\t" - "movaps %7, %%xmm6\n\t" - "movaps %%xmm1, %%xmm7\n\t" - "shufps $27, %%xmm7, %%xmm7\n\t" - "movaps %%xmm3, %%xmm5\n\t" - "shufps $27, %%xmm5, %%xmm5\n\t" - "movaps %%xmm4, %%xmm2\n\t" - "shufps $27, %%xmm2, %%xmm2\n\t" - "movaps %%xmm6, %%xmm0\n\t" - "shufps $27, %%xmm0, %%xmm0\n\t" - "addps %%xmm0, %%xmm1\n\t" - "movaps %%xmm1, %0\n\t" - "addps %%xmm2, %%xmm3\n\t" - "movaps %%xmm3, %1\n\t" - "subps %%xmm4, %%xmm5\n\t" - "movaps %%xmm5, %2\n\t" - "subps %%xmm6, %%xmm7\n\t" - "movaps %%xmm7, %3\n\t" - :"=m"(*(b2 + i)), "=m"(*(b2 + i + 4)), "=m"(*(b2 + i + 8)), "=m"(*(b2 + i + 12)) - :"m"(*(b1 + i)), "m"(*(b1 + i + 4)), "m"(*(b1 + i + 8)), "m"(*(b1 + i + 12)) - ); - } - } - - { - real *costab = costab_mmx + 16; - __asm__( - "movaps %4, %%xmm0\n\t" - "movaps %5, %%xmm1\n\t" - "movaps %8, %%xmm4\n\t" - "xorps %%xmm6, %%xmm6\n\t" - "shufps $27, %%xmm4, %%xmm4\n\t" - "mulps %%xmm4, %%xmm1\n\t" - "movaps %9, %%xmm2\n\t" - "xorps %%xmm7, %%xmm7\n\t" - "shufps $27, %%xmm2, %%xmm2\n\t" - "mulps %%xmm2, %%xmm0\n\t" - "movaps %%xmm0, %0\n\t" - "movaps %%xmm1, %1\n\t" - "movaps %6, %%xmm3\n\t" - "mulps %%xmm2, %%xmm3\n\t" - "subps %%xmm3, %%xmm6\n\t" - "movaps %%xmm6, %2\n\t" - "movaps %7, %%xmm5\n\t" - "mulps %%xmm4, %%xmm5\n\t" - "subps %%xmm5, %%xmm7\n\t" - "movaps %%xmm7, %3\n\t" - :"=m"(*(b2 + 8)), "=m"(*(b2 + 0xc)), "=m"(*(b2 + 0x18)), "=m"(*(b2 + 0x1c)) - :"m"(*(b2 + 8)), "m"(*(b2 + 0xc)), "m"(*(b2 + 0x18)), "m"(*(b2 + 0x1c)), "m"(*costab), "m"(*(costab + 4)) - ); - } - - { - int i; - - __asm__( - "movaps %0, %%xmm0\n\t" - "shufps $27, %%xmm0, 
%%xmm0\n\t" - "movaps %1, %%xmm5\n\t" - "movaps %%xmm5, %%xmm6\n\t" - : - :"m"(costab_mmx[24]), "m"(*nnnn) - ); - - for (i = 0; i < 0x20; i += 8) - { - __asm__( - "movaps %2, %%xmm2\n\t" - "movaps %3, %%xmm3\n\t" - "movaps %%xmm2, %%xmm4\n\t" - "xorps %%xmm5, %%xmm6\n\t" - "shufps $27, %%xmm4, %%xmm4\n\t" - "movaps %%xmm3, %%xmm1\n\t" - "shufps $27, %%xmm1, %%xmm1\n\t" - "addps %%xmm1, %%xmm2\n\t" - "movaps %%xmm2, %0\n\t" - "subps %%xmm3, %%xmm4\n\t" - "xorps %%xmm6, %%xmm4\n\t" - "mulps %%xmm0, %%xmm4\n\t" - "movaps %%xmm4, %1\n\t" - :"=m"(*(b1 + i)), "=m"(*(b1 + i + 4)) - :"m"(*(b2 + i)), "m"(*(b2 + i + 4)) - ); - } - } - - { - int i; - - __asm__( - "movss %0, %%xmm1\n\t" - "movss %1, %%xmm0\n\t" - "movaps %%xmm1, %%xmm3\n\t" - "unpcklps %%xmm0, %%xmm3\n\t" - "movss %2, %%xmm2\n\t" - "movaps %%xmm1, %%xmm0\n\t" - "unpcklps %%xmm2, %%xmm0\n\t" - "unpcklps %%xmm3, %%xmm0\n\t" - "movaps %3, %%xmm2\n\t" - : - :"m"(one), "m"(costab_mmx[28]), "m"(costab_mmx[29]), "m"(*ppnn) - ); - - for (i = 0; i < 0x20; i += 8) - { - __asm__( - "movaps %2, %%xmm3\n\t" - "movaps %%xmm3, %%xmm4\n\t" - "shufps $20, %%xmm4, %%xmm4\n\t" - "shufps $235, %%xmm3, %%xmm3\n\t" - "xorps %%xmm2, %%xmm3\n\t" - "addps %%xmm3, %%xmm4\n\t" - "mulps %%xmm0, %%xmm4\n\t" - "movaps %%xmm4, %0\n\t" - "movaps %3, %%xmm6\n\t" - "movaps %%xmm6, %%xmm5\n\t" - "shufps $27, %%xmm5, %%xmm5\n\t" - "xorps %%xmm2, %%xmm5\n\t" - "addps %%xmm5, %%xmm6\n\t" - "mulps %%xmm0, %%xmm6\n\t" - "movaps %%xmm6, %1\n\t" - :"=m"(*(b2 + i)), "=m"(*(b2 + i + 4)) - :"m"(*(b1 + i)), "m"(*(b1 + i + 4)) - ); - } - } - - { - int i; - __asm__( - "movss %0, %%xmm0\n\t" - "movaps %%xmm1, %%xmm2\n\t" - "movaps %%xmm0, %%xmm7\n\t" - "unpcklps %%xmm1, %%xmm2\n\t" - "unpcklps %%xmm0, %%xmm7\n\t" - "movaps %1, %%xmm0\n\t" - "unpcklps %%xmm7, %%xmm2\n\t" - : - :"m"(costab_mmx[30]), "m"(*pnpn) - ); - - for (i = 0x8; i < 0x20; i += 8) - { - __asm__ volatile ( - "movaps %2, %%xmm1\n\t" - "movaps %%xmm1, %%xmm3\n\t" - "shufps $224, %%xmm3, 
%%xmm3\n\t" - "shufps $181, %%xmm1, %%xmm1\n\t" - "xorps %%xmm0, %%xmm1\n\t" - "addps %%xmm1, %%xmm3\n\t" - "mulps %%xmm2, %%xmm3\n\t" - "movaps %%xmm3, %0\n\t" - "movaps %3, %%xmm4\n\t" - "movaps %%xmm4, %%xmm5\n\t" - "shufps $224, %%xmm5, %%xmm5\n\t" - "shufps $181, %%xmm4, %%xmm4\n\t" - "xorps %%xmm0, %%xmm4\n\t" - "addps %%xmm4, %%xmm5\n\t" - "mulps %%xmm2, %%xmm5\n\t" - "movaps %%xmm5, %1\n\t" - :"=m"(*(b1 + i)), "=m"(*(b1 + i + 4)) - :"m"(*(b2 + i)), "m"(*(b2 + i + 4)) - :"memory" - ); - } - for (i = 0x8; i < 0x20; i += 8) - { - b1[i + 2] += b1[i + 3]; - b1[i + 6] += b1[i + 7]; - b1[i + 4] += b1[i + 6]; - b1[i + 6] += b1[i + 5]; - b1[i + 5] += b1[i + 7]; - } - } - -#if 0 - /* Reference C code */ - - /* - Should run faster than x87 asm, given that the compiler is sane. - However, the C code dosen't round with saturation (0x7fff for too - large positive float, 0x8000 for too small negative float). You - can hear the difference if you listen carefully. - */ - - out0[256] = (short)(b2[0] + b2[1]); - out0[0] = (short)((b2[0] - b2[1]) * costab_mmx[30]); - out1[128] = (short)((b2[3] - b2[2]) * costab_mmx[30]); - out0[128] = (short)((b2[3] - b2[2]) * costab_mmx[30] + b2[3] + b2[2]); - out1[192] = (short)((b2[7] - b2[6]) * costab_mmx[30]); - out0[192] = (short)((b2[7] - b2[6]) * costab_mmx[30] + b2[6] + b2[7] + b2[4] + b2[5]); - out0[64] = (short)((b2[7] - b2[6]) * costab_mmx[30] + b2[6] + b2[7] + (b2[4] - b2[5]) * costab_mmx[30]); - out1[64] = (short)((b2[7] - b2[6]) * costab_mmx[30] + (b2[4] - b2[5]) * costab_mmx[30]); - - out0[224] = (short)(b1[8] + b1[12]); - out0[160] = (short)(b1[12] + b1[10]); - out0[96] = (short)(b1[10] + b1[14]); - out0[32] = (short)(b1[14] + b1[9]); - out1[32] = (short)(b1[9] + b1[13]); - out1[96] = (short)(b1[13] + b1[11]); - out1[224] = (short)b1[15]; - out1[160] = (short)(b1[15] + b1[11]); - out0[240] = (short)(b1[24] + b1[28] + b1[16]); - out0[208] = (short)(b1[24] + b1[28] + b1[20]); - out0[176] = (short)(b1[28] + b1[26] + b1[20]); - 
out0[144] = (short)(b1[28] + b1[26] + b1[18]); - out0[112] = (short)(b1[26] + b1[30] + b1[18]); - out0[80] = (short)(b1[26] + b1[30] + b1[22]); - out0[48] = (short)(b1[30] + b1[25] + b1[22]); - out0[16] = (short)(b1[30] + b1[25] + b1[17]); - out1[16] = (short)(b1[25] + b1[29] + b1[17]); - out1[48] = (short)(b1[25] + b1[29] + b1[21]); - out1[80] = (short)(b1[29] + b1[27] + b1[21]); - out1[112] = (short)(b1[29] + b1[27] + b1[19]); - out1[144] = (short)(b1[27] + b1[31] + b1[19]); - out1[176] = (short)(b1[27] + b1[31] + b1[23]); - out1[240] = (short)(b1[31]); - out1[208] = (short)(b1[31] + b1[23]); - -#else - /* - To do saturation efficiently in x86 we can use fist(p)s, - pf2iw, or packssdw. We use fist(p)s here. - */ - __asm__( - "flds %0\n\t" - "flds (%2)\n\t" - "fadds 4(%2)\n\t" - "fistps 512(%3)\n\t" - - "flds (%2)\n\t" - "fsubs 4(%2)\n\t" - "fmul %%st(1)\n\t" - "fistps (%3)\n\t" - - "flds 12(%2)\n\t" - "fsubs 8(%2)\n\t" - "fmul %%st(1)\n\t" - "fists 256(%4)\n\t" - "fadds 12(%2)\n\t" - "fadds 8(%2)\n\t" - "fistps 256(%3)\n\t" - - "flds 16(%2)\n\t" - "fsubs 20(%2)\n\t" - "fmul %%st(1)\n\t" - - "flds 28(%2)\n\t" - "fsubs 24(%2)\n\t" - "fmul %%st(2)\n\t" - "fists 384(%4)\n\t" - "fld %%st(0)\n\t" - "fadds 24(%2)\n\t" - "fadds 28(%2)\n\t" - "fld %%st(0)\n\t" - "fadds 16(%2)\n\t" - "fadds 20(%2)\n\t" - "fistps 384(%3)\n\t" - "fadd %%st(2)\n\t" - "fistps 128(%3)\n\t" - "faddp %%st(1)\n\t" - "fistps 128(%4)\n\t" - - "flds 32(%1)\n\t" - "fadds 48(%1)\n\t" - "fistps 448(%3)\n\t" - - "flds 48(%1)\n\t" - "fadds 40(%1)\n\t" - "fistps 320(%3)\n\t" - - "flds 40(%1)\n\t" - "fadds 56(%1)\n\t" - "fistps 192(%3)\n\t" - - "flds 56(%1)\n\t" - "fadds 36(%1)\n\t" - "fistps 64(%3)\n\t" - - "flds 36(%1)\n\t" - "fadds 52(%1)\n\t" - "fistps 64(%4)\n\t" - - "flds 52(%1)\n\t" - "fadds 44(%1)\n\t" - "fistps 192(%4)\n\t" - - "flds 60(%1)\n\t" - "fists 448(%4)\n\t" - "fadds 44(%1)\n\t" - "fistps 320(%4)\n\t" - - "flds 96(%1)\n\t" - "fadds 112(%1)\n\t" - "fld %%st(0)\n\t" - "fadds 64(%1)\n\t" - 
"fistps 480(%3)\n\t" - "fadds 80(%1)\n\t" - "fistps 416(%3)\n\t" - - "flds 112(%1)\n\t" - "fadds 104(%1)\n\t" - "fld %%st(0)\n\t" - "fadds 80(%1)\n\t" - "fistps 352(%3)\n\t" - "fadds 72(%1)\n\t" - "fistps 288(%3)\n\t" - - "flds 104(%1)\n\t" - "fadds 120(%1)\n\t" - "fld %%st(0)\n\t" - "fadds 72(%1)\n\t" - "fistps 224(%3)\n\t" - "fadds 88(%1)\n\t" - "fistps 160(%3)\n\t" - - "flds 120(%1)\n\t" - "fadds 100(%1)\n\t" - "fld %%st(0)\n\t" - "fadds 88(%1)\n\t" - "fistps 96(%3)\n\t" - "fadds 68(%1)\n\t" - "fistps 32(%3)\n\t" - - "flds 100(%1)\n\t" - "fadds 116(%1)\n\t" - "fld %%st(0)\n\t" - "fadds 68(%1)\n\t" - "fistps 32(%4)\n\t" - "fadds 84(%1)\n\t" - "fistps 96(%4)\n\t" - - "flds 116(%1)\n\t" - "fadds 108(%1)\n\t" - "fld %%st(0)\n\t" - "fadds 84(%1)\n\t" - "fistps 160(%4)\n\t" - "fadds 76(%1)\n\t" - "fistps 224(%4)\n\t" - - "flds 108(%1)\n\t" - "fadds 124(%1)\n\t" - "fld %%st(0)\n\t" - "fadds 76(%1)\n\t" - "fistps 288(%4)\n\t" - "fadds 92(%1)\n\t" - "fistps 352(%4)\n\t" - - "flds 124(%1)\n\t" - "fists 480(%4)\n\t" - "fadds 92(%1)\n\t" - "fistps 416(%4)\n\t" - ".byte 0xdf, 0xc0\n\t" // ffreep %%st(0) - : - :"m"(costab_mmx[30]), "r"(b1), "r"(b2), "r"(out0), "r"(out1) - :"memory" - ); -#endif - out1[0] = out0[0]; -} diff -r bc0898c7399b -r b924f0df5a1d mp3lib/decod386.c --- a/mp3lib/decod386.c Sun Oct 21 11:14:13 2012 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,253 +0,0 @@ -/* - * Modified for use with MPlayer, for details see the changelog at - * http://svn.mplayerhq.hu/mplayer/trunk/ - * $Id$ - */ - -/* - * Mpeg Layer-1,2,3 audio decoder - * ------------------------------ - * copyright (c) 1995,1996,1997 by Michael Hipp, All rights reserved. - * See also 'README' - * - * slighlty optimized for machines without autoincrement/decrement. - * The performance is highly compiler dependend. Maybe - * the decode.c version for 'normal' processor may be faster - * even for Intel processors. 
- */ - - -#include "config.h" - -#if 0 - /* old WRITE_SAMPLE */ - /* is portable */ -#define WRITE_SAMPLE(samples,sum,clip) { \ - if( (sum) > 32767.0) { *(samples) = 0x7fff; (clip)++; } \ - else if( (sum) < -32768.0) { *(samples) = -0x8000; (clip)++; }\ - else { *(samples) = sum; } \ -} -#else - /* new WRITE_SAMPLE */ - -/* - * should be the same as the "old WRITE_SAMPLE" macro above, but uses - * some tricks to avoid double->int conversions and floating point compares. - * - * Here's how it works: - * ((((65536.0 * 65536.0 * 16)+(65536.0 * 0.5))* 65536.0)) is - * 0x0010000080000000LL in hex. It computes 0x0010000080000000LL + sum - * as a double IEEE fp value and extracts the low-order 32-bits from the - * IEEE fp representation stored in memory. The 2^56 bit in the constant - * is intended to force the bits of "sum" into the least significant bits - * of the double mantissa. After an integer substraction of 0x80000000 - * we have the original double value "sum" converted to an 32-bit int value. - * - * (Is that really faster than the clean and simple old version of the macro?) - */ - -/* - * On a SPARC cpu, we fetch the low-order 32-bit from the second 32-bit - * word of the double fp value stored in memory. On an x86 cpu, we fetch it - * from the first 32-bit word. - * I'm not sure if the HAVE_BIGENDIAN feature test covers all possible memory - * layouts of double floating point values an all cpu architectures. If - * it doesn't work for you, just enable the "old WRITE_SAMPLE" macro. 
- */ -#if HAVE_BIGENDIAN -#define MANTISSA_OFFSET 1 -#else -#define MANTISSA_OFFSET 0 -#endif - - /* sizeof(int) == 4 */ -#define WRITE_SAMPLE(samples,sum,clip) { \ - union { double dtemp; int itemp[2]; } u; int v; \ - u.dtemp = ((((65536.0 * 65536.0 * 16)+(65536.0 * 0.5))* 65536.0)) + (sum);\ - v = u.itemp[MANTISSA_OFFSET] - 0x80000000; \ - if( v > 32767) { *(samples) = 0x7fff; (clip)++; } \ - else if( v < -32768) { *(samples) = -0x8000; (clip)++; } \ - else { *(samples) = v; } \ -} -#endif - - -/* -#define WRITE_SAMPLE(samples,sum,clip) { \ - double dtemp; int v; \ - dtemp = ((((65536.0 * 65536.0 * 16)+(65536.0 * 0.5))* 65536.0)) + (sum);\ - v = ((*(int *)&dtemp) - 0x80000000); \ - if( v > 32767) { *(samples) = 0x7fff; (clip)++; } \ - else if( v < -32768) { *(samples) = -0x8000; (clip)++; } \ - else { *(samples) = v; } \ -} -*/ - -static int synth_1to1(real *bandPtr,int channel,unsigned char *out,int *pnt); - -static int synth_1to1_mono2stereo(real *bandPtr,unsigned char *samples,int *pnt) -{ - int i,ret; - - ret = synth_1to1(bandPtr,0,samples,pnt); - samples = samples + *pnt - 128; - - for(i=0;i<32;i++) { - ((short *)samples)[1] = ((short *)samples)[0]; - samples+=4; - } - - return ret; -} - -static synth_func_t synth_func; - -#if HAVE_ALTIVEC -#define dct64_base(a,b,c) if(gCpuCaps.hasAltiVec) dct64_altivec(a,b,c); else dct64(a,b,c) -#else /* HAVE_ALTIVEC */ -#define dct64_base(a,b,c) dct64(a,b,c) -#endif /* HAVE_ALTIVEC */ - -static int synth_1to1(real *bandPtr,int channel,unsigned char *out,int *pnt) -{ - static real buffs[2][2][0x110]; - static const int step = 2; - static int bo = 1; - short *samples = (short *) (out + *pnt); - real *b0,(*buf)[0x110]; - int clip = 0; - int bo1; - - *pnt += 128; - -/* optimized for x86 */ -#if ARCH_X86 - if ( synth_func ) - { -// printf("Calling %p, bandPtr=%p channel=%d samples=%p\n",synth_func,bandPtr,channel,samples); - // FIXME: synth_func() may destroy EBP, don't rely on stack contents!!! 
- return (*synth_func)( bandPtr,channel,samples); - } -#endif - if(!channel) { /* channel=0 */ - bo--; - bo &= 0xf; - buf = buffs[0]; - } - else { - samples++; - buf = buffs[1]; - } - - if(bo & 0x1) { - b0 = buf[0]; - bo1 = bo; - dct64_base(buf[1]+((bo+1)&0xf),buf[0]+bo,bandPtr); - } - else { - b0 = buf[1]; - bo1 = bo+1; - dct64_base(buf[0]+bo,buf[1]+bo+1,bandPtr); - } - - { - register int j; - real *window = mp3lib_decwin + 16 - bo1; - - for (j=16;j;j--,b0+=0x10,window+=0x20,samples+=step) - { - real sum; - sum = window[0x0] * b0[0x0]; - sum -= window[0x1] * b0[0x1]; - sum += window[0x2] * b0[0x2]; - sum -= window[0x3] * b0[0x3]; - sum += window[0x4] * b0[0x4]; - sum -= window[0x5] * b0[0x5]; - sum += window[0x6] * b0[0x6]; - sum -= window[0x7] * b0[0x7]; - sum += window[0x8] * b0[0x8]; - sum -= window[0x9] * b0[0x9]; - sum += window[0xA] * b0[0xA]; - sum -= window[0xB] * b0[0xB]; - sum += window[0xC] * b0[0xC]; - sum -= window[0xD] * b0[0xD]; - sum += window[0xE] * b0[0xE]; - sum -= window[0xF] * b0[0xF]; - - WRITE_SAMPLE(samples,sum,clip); - } - - { - real sum; - sum = window[0x0] * b0[0x0]; - sum += window[0x2] * b0[0x2]; - sum += window[0x4] * b0[0x4]; - sum += window[0x6] * b0[0x6]; - sum += window[0x8] * b0[0x8]; - sum += window[0xA] * b0[0xA]; - sum += window[0xC] * b0[0xC]; - sum += window[0xE] * b0[0xE]; - WRITE_SAMPLE(samples,sum,clip); - b0-=0x10,window-=0x20,samples+=step; - } - window += bo1<<1; - - for (j=15;j;j--,b0-=0x10,window-=0x20,samples+=step) - { - real sum; - sum = -window[-0x1] * b0[0x0]; - sum -= window[-0x2] * b0[0x1]; - sum -= window[-0x3] * b0[0x2]; - sum -= window[-0x4] * b0[0x3]; - sum -= window[-0x5] * b0[0x4]; - sum -= window[-0x6] * b0[0x5]; - sum -= window[-0x7] * b0[0x6]; - sum -= window[-0x8] * b0[0x7]; - sum -= window[-0x9] * b0[0x8]; - sum -= window[-0xA] * b0[0x9]; - sum -= window[-0xB] * b0[0xA]; - sum -= window[-0xC] * b0[0xB]; - sum -= window[-0xD] * b0[0xC]; - sum -= window[-0xE] * b0[0xD]; - sum -= window[-0xF] * 
b0[0xE]; - sum -= window[-0x0] * b0[0xF]; - - WRITE_SAMPLE(samples,sum,clip); - } - } - - return clip; - -} - -#ifdef CONFIG_FAKE_MONO -static int synth_1to1_l(real *bandPtr,int channel,unsigned char *out,int *pnt) -{ - int i,ret; - - ret = synth_1to1(bandPtr,channel,out,pnt); - out = out + *pnt - 128; - - for(i=0;i<32;i++) { - ((short *)out)[1] = ((short *)out)[0]; - out+=4; - } - - return ret; -} - -static int synth_1to1_r(real *bandPtr,int channel,unsigned char *out,int *pnt) -{ - int i,ret; - - ret = synth_1to1(bandPtr,channel,out,pnt); - out = out + *pnt - 128; - - for(i=0;i<32;i++) { - ((short *)out)[0] = ((short *)out)[1]; - out+=4; - } - - return ret; -} -#endif diff -r bc0898c7399b -r b924f0df5a1d mp3lib/decode_i586.c --- a/mp3lib/decode_i586.c Sun Oct 21 11:14:13 2012 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,318 +0,0 @@ -/* - * Modified for use with MPlayer, for details see the changelog at - * http://svn.mplayerhq.hu/mplayer/trunk/ - * $Id$ - */ - -/* -* mpg123_synth_1to1 works the same way as the c version of this -* file. only two types of changes have been made: -* - reordered floating point instructions to -* prevent pipline stalls -* - made WRITE_SAMPLE use integer instead of -* (slower) floating point -* all kinds of x86 processors should benefit from these -* modifications. 
-* -* useful sources of information on optimizing x86 code include: -* -* Intel Architecture Optimization Manual -* http://www.intel.com/design/pentium/manuals/242816.htm -* -* Cyrix 6x86 Instruction Set Summary -* ftp://ftp.cyrix.com/6x86/6x-dbch6.pdf -* -* AMD-K5 Processor Software Development -* http://www.amd.com/products/cpg/techdocs/appnotes/20007e.pdf -* -* Stefan Bieschewski -* -* $Id$ -*/ -#include "config.h" -#include "mangle.h" -#include "mpg123.h" - -static int attribute_used buffs[1088]={0}; -static int attribute_used bo=1; -static int attribute_used saved_ebp=0; - -int synth_1to1_pent(real *bandPtr, int channel, short *samples) -{ - real tmp[3]; - register int retval; - __asm__ volatile( -" movl %%ebp,"MANGLE(saved_ebp)"\n\t" -" movl %1,%%eax\n\t"/*bandPtr*/ -" movl %3,%%esi\n\t" -" xorl %%edi,%%edi\n\t" -" movl "MANGLE(bo)",%%ebp\n\t" -" cmpl %%edi,%2\n\t" -" jne .L48\n\t" -" decl %%ebp\n\t" -" andl $15,%%ebp\n\t" -" movl %%ebp,"MANGLE(bo)"\n\t" -" movl $"MANGLE(buffs)",%%ecx\n\t" -" jmp .L49\n\t" -".L48:\n\t" -" addl $2,%%esi\n\t" -" movl $"MANGLE(buffs)"+2176,%%ecx\n\t" -".L49:\n\t" -" testl $1,%%ebp\n\t" -" je .L50\n\t" -" movl %%ecx,%%ebx\n\t" -" movl %%ebp,%4\n\t" -" pushl %%eax\n\t" -" movl 4+%4,%%edx\n\t" -" leal (%%ebx,%%edx,4),%%eax\n\t" -" pushl %%eax\n\t" -" movl 8+%4,%%eax\n\t" -" incl %%eax\n\t" -" andl $15,%%eax\n\t" -" leal 1088(,%%eax,4),%%eax\n\t" -" addl %%ebx,%%eax\n\t" -" jmp .L74\n\t" -".L50:\n\t" -" leal 1088(%%ecx),%%ebx\n\t" -" leal 1(%%ebp),%%edx\n\t" -" movl %%edx,%4\n\t" -" pushl %%eax\n\t" -" leal 1092(%%ecx,%%ebp,4),%%eax\n\t" -" pushl %%eax\n\t" -" leal (%%ecx,%%ebp,4),%%eax\n\t" -".L74:\n\t" -" pushl %%eax\n\t" -" call "MANGLE(mp3lib_dct64)"\n\t" -" addl $12,%%esp\n\t" -" movl %4,%%edx\n\t" -" leal 0(,%%edx,4),%%edx\n\t" -" movl $"MANGLE(mp3lib_decwin)"+64,%%eax\n\t" -" movl %%eax,%%ecx\n\t" -" subl %%edx,%%ecx\n\t" -" movl $16,%%ebp\n\t" -".L55:\n\t" -" flds (%%ecx)\n\t" -" fmuls (%%ebx)\n\t" -" flds 4(%%ecx)\n\t" -" 
fmuls 4(%%ebx)\n\t" -" fxch %%st(1)\n\t" -" flds 8(%%ecx)\n\t" -" fmuls 8(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" fsubrp %%st,%%st(1)\n\t" -" flds 12(%%ecx)\n\t" -" fmuls 12(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" faddp %%st,%%st(1)\n\t" -" flds 16(%%ecx)\n\t" -" fmuls 16(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" fsubrp %%st,%%st(1)\n\t" -" flds 20(%%ecx)\n\t" -" fmuls 20(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" faddp %%st,%%st(1)\n\t" -" flds 24(%%ecx)\n\t" -" fmuls 24(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" fsubrp %%st,%%st(1)\n\t" -" flds 28(%%ecx)\n\t" -" fmuls 28(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" faddp %%st,%%st(1)\n\t" -" flds 32(%%ecx)\n\t" -" fmuls 32(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" fsubrp %%st,%%st(1)\n\t" -" flds 36(%%ecx)\n\t" -" fmuls 36(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" faddp %%st,%%st(1)\n\t" -" flds 40(%%ecx)\n\t" -" fmuls 40(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" fsubrp %%st,%%st(1)\n\t" -" flds 44(%%ecx)\n\t" -" fmuls 44(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" faddp %%st,%%st(1)\n\t" -" flds 48(%%ecx)\n\t" -" fmuls 48(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" fsubrp %%st,%%st(1)\n\t" -" flds 52(%%ecx)\n\t" -" fmuls 52(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" faddp %%st,%%st(1)\n\t" -" flds 56(%%ecx)\n\t" -" fmuls 56(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" fsubrp %%st,%%st(1)\n\t" -" flds 60(%%ecx)\n\t" -" fmuls 60(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" subl $4,%%esp\n\t" -" faddp %%st,%%st(1)\n\t" -" fxch %%st(1)\n\t" -" fsubrp %%st,%%st(1)\n\t" -" fistpl (%%esp)\n\t" -" popl %%eax\n\t" -" cmpl $32767,%%eax\n\t" -" jg 1f\n\t" -" cmpl $-32768,%%eax\n\t" -" jl 2f\n\t" -" movw %%ax,(%%esi)\n\t" -" jmp 4f\n\t" -"1: movw $32767,(%%esi)\n\t" -" jmp 3f\n\t" -"2: movw $-32768,(%%esi)\n\t" -"3: incl %%edi\n\t" -"4:\n\t" -".L54:\n\t" -" addl $64,%%ebx\n\t" -" subl $-128,%%ecx\n\t" -" addl $4,%%esi\n\t" -" decl %%ebp\n\t" -" jnz .L55\n\t" -" flds (%%ecx)\n\t" -" fmuls (%%ebx)\n\t" -" flds 8(%%ecx)\n\t" -" fmuls 8(%%ebx)\n\t" -" flds 16(%%ecx)\n\t" -" fmuls 16(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" 
faddp %%st,%%st(1)\n\t" -" flds 24(%%ecx)\n\t" -" fmuls 24(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" faddp %%st,%%st(1)\n\t" -" flds 32(%%ecx)\n\t" -" fmuls 32(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" faddp %%st,%%st(1)\n\t" -" flds 40(%%ecx)\n\t" -" fmuls 40(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" faddp %%st,%%st(1)\n\t" -" flds 48(%%ecx)\n\t" -" fmuls 48(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" faddp %%st,%%st(1)\n\t" -" flds 56(%%ecx)\n\t" -" fmuls 56(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" subl $4,%%esp\n\t" -" faddp %%st,%%st(1)\n\t" -" fxch %%st(1)\n\t" -" faddp %%st,%%st(1)\n\t" -" fistpl (%%esp)\n\t" -" popl %%eax\n\t" -" cmpl $32767,%%eax\n\t" -" jg 1f\n\t" -" cmpl $-32768,%%eax\n\t" -" jl 2f\n\t" -" movw %%ax,(%%esi)\n\t" -" jmp 4f\n\t" -"1: movw $32767,(%%esi)\n\t" -" jmp 3f\n\t" -"2: movw $-32768,(%%esi)\n\t" -"3: incl %%edi\n\t" -"4:\n\t" -".L62:\n\t" -" addl $-64,%%ebx\n\t" -" addl $4,%%esi\n\t" -" movl %4,%%edx\n\t" -" leal -128(%%ecx,%%edx,8),%%ecx\n\t" -" movl $15,%%ebp\n\t" -".L68:\n\t" -" flds -4(%%ecx)\n\t" -" fchs\n\t" -" fmuls (%%ebx)\n\t" -" flds -8(%%ecx)\n\t" -" fmuls 4(%%ebx)\n\t" -" fxch %%st(1)\n\t" -" flds -12(%%ecx)\n\t" -" fmuls 8(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" fsubrp %%st,%%st(1)\n\t" -" flds -16(%%ecx)\n\t" -" fmuls 12(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" fsubrp %%st,%%st(1)\n\t" -" flds -20(%%ecx)\n\t" -" fmuls 16(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" fsubrp %%st,%%st(1)\n\t" -" flds -24(%%ecx)\n\t" -" fmuls 20(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" fsubrp %%st,%%st(1)\n\t" -" flds -28(%%ecx)\n\t" -" fmuls 24(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" fsubrp %%st,%%st(1)\n\t" -" flds -32(%%ecx)\n\t" -" fmuls 28(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" fsubrp %%st,%%st(1)\n\t" -" flds -36(%%ecx)\n\t" -" fmuls 32(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" fsubrp %%st,%%st(1)\n\t" -" flds -40(%%ecx)\n\t" -" fmuls 36(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" fsubrp %%st,%%st(1)\n\t" -" flds -44(%%ecx)\n\t" -" fmuls 40(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" fsubrp %%st,%%st(1)\n\t" -" 
flds -48(%%ecx)\n\t" -" fmuls 44(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" fsubrp %%st,%%st(1)\n\t" -" flds -52(%%ecx)\n\t" -" fmuls 48(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" fsubrp %%st,%%st(1)\n\t" -" flds -56(%%ecx)\n\t" -" fmuls 52(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" fsubrp %%st,%%st(1)\n\t" -" flds -60(%%ecx)\n\t" -" fmuls 56(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" fsubrp %%st,%%st(1)\n\t" -" flds (%%ecx)\n\t" -" fmuls 60(%%ebx)\n\t" -" fxch %%st(2)\n\t" -" subl $4,%%esp\n\t" -" fsubrp %%st,%%st(1)\n\t" -" fxch %%st(1)\n\t" -" fsubrp %%st,%%st(1)\n\t" -" fistpl (%%esp)\n\t" -" popl %%eax\n\t" -" cmpl $32767,%%eax\n\t" -" jg 1f\n\t" -" cmpl $-32768,%%eax\n\t" -" jl 2f\n\t" -" movw %%ax,(%%esi)\n\t" -" jmp 4f\n\t" -"1: movw $32767,(%%esi)\n\t" -" jmp 3f\n\t" -"2: movw $-32768,(%%esi)\n\t" -"3: incl %%edi\n\t" -"4:\n\t" -".L67:\n\t" -" addl $-64,%%ebx\n\t" -" addl $-128,%%ecx\n\t" -" addl $4,%%esi\n\t" -" decl %%ebp\n\t" -" jnz .L68\n\t" -" movl %%edi,%%eax\n\t" -" movl "MANGLE(saved_ebp)",%%ebp\n\t" - :"=a"(retval) - :"m"(bandPtr),"m"(channel),"m"(samples),"m"(tmp[0]) - :"memory","%edi","%esi","%ebx","%ecx","%edx"); - return retval; -} diff -r bc0898c7399b -r b924f0df5a1d mp3lib/decode_mmx.c --- a/mp3lib/decode_mmx.c Sun Oct 21 11:14:13 2012 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,369 +0,0 @@ -/* - * this code comes under GPL - * This code was taken from http://www.mpg123.org - * See ChangeLog of mpg123-0.59s-pre.1 for detail - * Applied to mplayer by Nick Kurshev - * - * Local ChangeLog: - * - Partial loops unrolling and removing MOVW insn from loops -*/ -#include "config.h" -#include "mangle.h" -#include "mpg123.h" - -static const unsigned long long attribute_used __attribute__((aligned(8))) null_one = 0x0000ffff0000ffffULL; -static const unsigned long long attribute_used __attribute__((aligned(8))) one_null = 0xffff0000ffff0000ULL; -const unsigned int __attribute__((aligned(16))) costab_mmx[] = -{ - 1056974725, - 1057056395, - 1057223771, - 1057485416, - 
1057855544, - 1058356026, - 1059019886, - 1059897405, - 1061067246, - 1062657950, - 1064892987, - 1066774581, - 1069414683, - 1073984175, - 1079645762, - 1092815430, - 1057005197, - 1057342072, - 1058087743, - 1059427869, - 1061799040, - 1065862217, - 1071413542, - 1084439708, - 1057128951, - 1058664893, - 1063675095, - 1076102863, - 1057655764, - 1067924853, - 1060439283, -}; - -/** - This array of magic numbers were calculated by the pure function - make_decode_tables_MMX(32768), which had been implemented in (deleted since - r23383) tabinit_MMX.c. - */ -static const short __attribute__((aligned(8))) mp3lib_decwins[] = -{ - 0, 7, 54, 114, 510, 1288, 1644, 9372, - 18760, -9373, 1644, -1289, 510, -115, 54, -8, - 0, 7, 54, 114, 510, 1288, 1644, 9372, - 18760, -9373, 1644, -1289, 510, -115, 54, -8, - 0, 7, 55, 129, 500, 1379, 1490, 9834, - 18748, -8910, 1784, -1197, 516, -101, 52, -7, - 0, 7, 55, 129, 500, 1379, 1490, 9834, - 18748, -8910, 1784, -1197, 516, -101, 52, -7, - 0, 8, 56, 145, 488, 1469, 1322, 10294, - 18714, -8448, 1910, -1107, 520, -87, 51, -6, - 0, 8, 56, 145, 488, 1469, 1322, 10294, - 18714, -8448, 1910, -1107, 520, -87, 51, -6, - 0, 9, 57, 161, 474, 1559, 1141, 10751, - 18658, -7987, 2023, -1016, 522, -74, 49, -6, - 0, 9, 57, 161, 474, 1559, 1141, 10751, - 18658, -7987, 2023, -1016, 522, -74, 49, -6, - 0, 10, 57, 177, 456, 1647, 944, 11205, - 18579, -7528, 2123, -927, 522, -61, 48, -5, - 0, 10, 57, 177, 456, 1647, 944, 11205, - 18579, -7528, 2123, -927, 522, -61, 48, -5, - 0, 11, 57, 194, 435, 1733, 734, 11654, - 18477, -7073, 2210, -838, 519, -50, 46, -5, - 0, 11, 57, 194, 435, 1733, 734, 11654, - 18477, -7073, 2210, -838, 519, -50, 46, -5, - 0, 12, 57, 212, 411, 1817, 510, 12097, - 18354, -6621, 2285, -751, 515, -39, 44, -4, - 0, 12, 57, 212, 411, 1817, 510, 12097, - 18354, -6621, 2285, -751, 515, -39, 44, -4, - 0, 13, 57, 229, 384, 1899, 271, 12534, - 18209, -6174, 2348, -666, 508, -28, 43, -4, - 0, 13, 57, 229, 384, 1899, 271, 12534, - 18209, 
-6174, 2348, -666, 508, -28, 43, -4, - 0, 14, 56, 247, 354, 1977, 18, 12963, - 18043, -5733, 2398, -583, 501, -18, 41, -4, - 0, 14, 56, 247, 354, 1977, 18, 12963, - 18043, -5733, 2398, -583, 501, -18, 41, -4, - 0, 15, 56, 266, 320, 2052, -249, 13383, - 17855, -5298, 2438, -502, 491, -9, 39, -3, - 0, 15, 56, 266, 320, 2052, -249, 13383, - 17855, -5298, 2438, -502, 491, -9, 39, -3, - 0, 17, 54, 284, 283, 2122, -530, 13794, - 17648, -4870, 2466, -423, 480, -1, 37, -3, - 0, 17, 54, 284, 283, 2122, -530, 13794, - 17648, -4870, 2466, -423, 480, -1, 37, -3, - 0, 18, 52, 302, 243, 2188, -825, 14194, - 17420, -4450, 2484, -347, 468, 7, 35, -3, - 0, 18, 52, 302, 243, 2188, -825, 14194, - 17420, -4450, 2484, -347, 468, 7, 35, -3, - 0, 19, 50, 320, 199, 2249, -1133, 14583, - 17173, -4039, 2492, -274, 455, 14, 33, -2, - 0, 19, 50, 320, 199, 2249, -1133, 14583, - 17173, -4039, 2492, -274, 455, 14, 33, -2, - -1, 21, 48, 339, 152, 2304, -1454, 14959, - 16908, -3637, 2490, -204, 440, 20, 32, -2, - -1, 21, 48, 339, 152, 2304, -1454, 14959, - 16908, -3637, 2490, -204, 440, 20, 32, -2, - -1, 22, 45, 357, 101, 2354, -1788, 15322, - 16624, -3245, 2479, -137, 425, 26, 30, -2, - -1, 22, 45, 357, 101, 2354, -1788, 15322, - 16624, -3245, 2479, -137, 425, 26, 30, -2, - -1, 24, 41, 374, 47, 2396, -2135, 15671, - 16323, -2864, 2460, -72, 409, 31, 28, -2, - -1, 24, 41, 374, 47, 2396, -2135, 15671, - 16323, -2864, 2460, -72, 409, 31, 28, -2, - -1, 26, 37, 391, -11, 2431, -2493, 16004, - 16005, -2494, 2432, -12, 392, 36, 26, -2, - -1, 26, 37, 391, -11, 2431, -2493, 16004, - 16005, -2494, 2432, -12, 392, 36, 26, -2, - -2, -28, 31, -409, -72, -2460, -2864, -16323, - 15671, 2135, 2396, -47, 374, -41, 24, 1, - -2, -28, 31, -409, -72, -2460, -2864, -16323, - 15671, 2135, 2396, -47, 374, -41, 24, 1, - -2, -30, 26, -425, -137, -2479, -3245, -16624, - 15322, 1788, 2354, -101, 357, -45, 22, 1, - -2, -30, 26, -425, -137, -2479, -3245, -16624, - 15322, 1788, 2354, -101, 357, -45, 22, 1, - -2, -32, 20, -440, 
-204, -2490, -3637, -16908, - 14959, 1454, 2304, -152, 339, -48, 21, 1, - -2, -32, 20, -440, -204, -2490, -3637, -16908, - 14959, 1454, 2304, -152, 339, -48, 21, 1, - -2, -33, 14, -455, -274, -2492, -4039, -17173, - 14583, 1133, 2249, -199, 320, -50, 19, 0, - -2, -33, 14, -455, -274, -2492, -4039, -17173, - 14583, 1133, 2249, -199, 320, -50, 19, 0, - -3, -35, 7, -468, -347, -2484, -4450, -17420, - 14194, 825, 2188, -243, 302, -52, 18, 0, - -3, -35, 7, -468, -347, -2484, -4450, -17420, - 14194, 825, 2188, -243, 302, -52, 18, 0, - -3, -37, -1, -480, -423, -2466, -4870, -17648, - 13794, 530, 2122, -283, 284, -54, 17, 0, - -3, -37, -1, -480, -423, -2466, -4870, -17648, - 13794, 530, 2122, -283, 284, -54, 17, 0, - -3, -39, -9, -491, -502, -2438, -5298, -17855, - 13383, 249, 2052, -320, 266, -56, 15, 0, - -3, -39, -9, -491, -502, -2438, -5298, -17855, - 13383, 249, 2052, -320, 266, -56, 15, 0, - -4, -41, -18, -501, -583, -2398, -5733, -18043, - 12963, -18, 1977, -354, 247, -56, 14, 0, - -4, -41, -18, -501, -583, -2398, -5733, -18043, - 12963, -18, 1977, -354, 247, -56, 14, 0, - -4, -43, -28, -508, -666, -2348, -6174, -18209, - 12534, -271, 1899, -384, 229, -57, 13, 0, - -4, -43, -28, -508, -666, -2348, -6174, -18209, - 12534, -271, 1899, -384, 229, -57, 13, 0, - -4, -44, -39, -515, -751, -2285, -6621, -18354, - 12097, -510, 1817, -411, 212, -57, 12, 0, - -4, -44, -39, -515, -751, -2285, -6621, -18354, - 12097, -510, 1817, -411, 212, -57, 12, 0, - -5, -46, -50, -519, -838, -2210, -7073, -18477, - 11654, -734, 1733, -435, 194, -57, 11, 0, - -5, -46, -50, -519, -838, -2210, -7073, -18477, - 11654, -734, 1733, -435, 194, -57, 11, 0, - -5, -48, -61, -522, -927, -2123, -7528, -18579, - 11205, -944, 1647, -456, 177, -57, 10, 0, - -5, -48, -61, -522, -927, -2123, -7528, -18579, - 11205, -944, 1647, -456, 177, -57, 10, 0, - -6, -49, -74, -522, -1016, -2023, -7987, -18658, - 10751, -1141, 1559, -474, 161, -57, 9, 0, - -6, -49, -74, -522, -1016, -2023, -7987, -18658, - 10751, 
-1141, 1559, -474, 161, -57, 9, 0, - -6, -51, -87, -520, -1107, -1910, -8448, -18714, - 10294, -1322, 1469, -488, 145, -56, 8, 0, - -6, -51, -87, -520, -1107, -1910, -8448, -18714, - 10294, -1322, 1469, -488, 145, -56, 8, 0, - -7, -52, -101, -516, -1197, -1784, -8910, -18748, - 9834, -1490, 1379, -500, 129, -55, 7, 0, - -7, -52, -101, -516, -1197, -1784, -8910, -18748, - 9834, -1490, 1379, -500, 129, -55, 7, 0, -}; - -int synth_1to1_MMX(real *bandPtr, int channel, short *samples) -{ - static short buffs[2][2][0x110] __attribute__((aligned(8))); - static int bo = 1; - short *b0, (*buf)[0x110], *a, *b; - const short* window; - int bo1, i = 8; - - if (channel == 0) { - bo = (bo - 1) & 0xf; - buf = buffs[1]; - } else { - samples++; - buf = buffs[0]; - } - - if (bo & 1) { - b0 = buf[1]; - bo1 = bo + 1; - a = buf[0] + bo; - b = buf[1] + ((bo + 1) & 0xf); - } else { - b0 = buf[0]; - bo1 = bo; - b = buf[0] + bo; - a = buf[1] + ((bo + 1) & 0xf); - } - - dct64_MMX_func(a, b, bandPtr); - window = mp3lib_decwins + 16 - bo1; - //printf("DEBUG: channel %d, bo %d, off %d\n", channel, bo, 16 - bo1); -__asm__ volatile( -ASMALIGN(4) -"0:\n\t" - "movq (%1),%%mm0\n\t" - "movq 64(%1),%%mm4\n\t" - "pmaddwd (%2),%%mm0\n\t" - "pmaddwd 32(%2),%%mm4\n\t" - "movq 8(%1),%%mm1\n\t" - "movq 72(%1),%%mm5\n\t" - "pmaddwd 8(%2),%%mm1\n\t" - "pmaddwd 40(%2),%%mm5\n\t" - "movq 16(%1),%%mm2\n\t" - "movq 80(%1),%%mm6\n\t" - "pmaddwd 16(%2),%%mm2\n\t" - "pmaddwd 48(%2),%%mm6\n\t" - "movq 24(%1),%%mm3\n\t" - "movq 88(%1),%%mm7\n\t" - "pmaddwd 24(%2),%%mm3\n\t" - "pmaddwd 56(%2),%%mm7\n\t" - "paddd %%mm1,%%mm0\n\t" - "paddd %%mm5,%%mm4\n\t" - "paddd %%mm2,%%mm0\n\t" - "paddd %%mm6,%%mm4\n\t" - "paddd %%mm3,%%mm0\n\t" - "paddd %%mm7,%%mm4\n\t" - "movq %%mm0,%%mm1\n\t" - "movq %%mm4,%%mm5\n\t" - "psrlq $32,%%mm1\n\t" - "psrlq $32,%%mm5\n\t" - "paddd %%mm1,%%mm0\n\t" - "paddd %%mm5,%%mm4\n\t" - "psrad $13,%%mm0\n\t" - "psrad $13,%%mm4\n\t" - "packssdw %%mm0,%%mm0\n\t" - "packssdw %%mm4,%%mm4\n\t" - - "movq 
(%3), %%mm1\n\t" - "punpckldq %%mm4, %%mm0\n\t" - "pand "MANGLE(one_null)", %%mm1\n\t" - "pand "MANGLE(null_one)", %%mm0\n\t" - "por %%mm0, %%mm1\n\t" - "movq %%mm1,(%3)\n\t" - - "add $64,%2\n\t" - "add $128,%1\n\t" - "add $8,%3\n\t" - - "decl %0\n\t" - "jnz 0b\n\t" - - "movq (%1),%%mm0\n\t" - "pmaddwd (%2),%%mm0\n\t" - "movq 8(%1),%%mm1\n\t" - "pmaddwd 8(%2),%%mm1\n\t" - "movq 16(%1),%%mm2\n\t" - "pmaddwd 16(%2),%%mm2\n\t" - "movq 24(%1),%%mm3\n\t" - "pmaddwd 24(%2),%%mm3\n\t" - "paddd %%mm1,%%mm0\n\t" - "paddd %%mm2,%%mm0\n\t" - "paddd %%mm3,%%mm0\n\t" - "movq %%mm0,%%mm1\n\t" - "psrlq $32,%%mm1\n\t" - "paddd %%mm1,%%mm0\n\t" - "psrad $13,%%mm0\n\t" - "packssdw %%mm0,%%mm0\n\t" - "movd %%mm0,%%eax\n\t" - "movw %%ax, (%3)\n\t" - "sub $32,%2\n\t" - "add $64,%1\n\t" - "add $4,%3\n\t" - - "movl $7,%0\n\t" -ASMALIGN(4) -"1:\n\t" - "movq (%1),%%mm0\n\t" - "movq 64(%1),%%mm4\n\t" - "pmaddwd (%2),%%mm0\n\t" - "pmaddwd -32(%2),%%mm4\n\t" - "movq 8(%1),%%mm1\n\t" - "movq 72(%1),%%mm5\n\t" - "pmaddwd 8(%2),%%mm1\n\t" - "pmaddwd -24(%2),%%mm5\n\t" - "movq 16(%1),%%mm2\n\t" - "movq 80(%1),%%mm6\n\t" - "pmaddwd 16(%2),%%mm2\n\t" - "pmaddwd -16(%2),%%mm6\n\t" - "movq 24(%1),%%mm3\n\t" - "movq 88(%1),%%mm7\n\t" - "pmaddwd 24(%2),%%mm3\n\t" - "pmaddwd -8(%2),%%mm7\n\t" - "paddd %%mm1,%%mm0\n\t" - "paddd %%mm5,%%mm4\n\t" - "paddd %%mm2,%%mm0\n\t" - "paddd %%mm6,%%mm4\n\t" - "paddd %%mm3,%%mm0\n\t" - "paddd %%mm7,%%mm4\n\t" - "movq %%mm0,%%mm1\n\t" - "movq %%mm4,%%mm5\n\t" - "psrlq $32,%%mm1\n\t" - "psrlq $32,%%mm5\n\t" - "paddd %%mm0,%%mm1\n\t" - "paddd %%mm4,%%mm5\n\t" - "psrad $13,%%mm1\n\t" - "psrad $13,%%mm5\n\t" - "packssdw %%mm1,%%mm1\n\t" - "packssdw %%mm5,%%mm5\n\t" - "psubd %%mm0,%%mm0\n\t" - "psubd %%mm4,%%mm4\n\t" - "psubsw %%mm1,%%mm0\n\t" - "psubsw %%mm5,%%mm4\n\t" - - "movq (%3), %%mm1\n\t" - "punpckldq %%mm4, %%mm0\n\t" - "pand "MANGLE(one_null)", %%mm1\n\t" - "pand "MANGLE(null_one)", %%mm0\n\t" - "por %%mm0, %%mm1\n\t" - "movq %%mm1,(%3)\n\t" - - "sub $64,%2\n\t" 
- "add $128,%1\n\t" - "add $8,%3\n\t" - "decl %0\n\t" - "jnz 1b\n\t" - - "movq (%1),%%mm0\n\t" - "pmaddwd (%2),%%mm0\n\t" - "movq 8(%1),%%mm1\n\t" - "pmaddwd 8(%2),%%mm1\n\t" - "movq 16(%1),%%mm2\n\t" - "pmaddwd 16(%2),%%mm2\n\t" - "movq 24(%1),%%mm3\n\t" - "pmaddwd 24(%2),%%mm3\n\t" - "paddd %%mm1,%%mm0\n\t" - "paddd %%mm2,%%mm0\n\t" - "paddd %%mm3,%%mm0\n\t" - "movq %%mm0,%%mm1\n\t" - "psrlq $32,%%mm1\n\t" - "paddd %%mm0,%%mm1\n\t" - "psrad $13,%%mm1\n\t" - "packssdw %%mm1,%%mm1\n\t" - "psubd %%mm0,%%mm0\n\t" - "psubsw %%mm1,%%mm0\n\t" - "movd %%mm0,%%eax\n\t" - "movw %%ax,(%3)\n\t" - "emms\n\t" - :"+r"(i), "+r"(window), "+r"(b0), "+r"(samples) - : - :"memory", "%eax"); - return 0; -} diff -r bc0898c7399b -r b924f0df5a1d mp3lib/equalizer.c --- a/mp3lib/equalizer.c Sun Oct 21 11:14:13 2012 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,78 +0,0 @@ -#include "mpg123.h" - -void init_spline(float *x, float *y, int n, float *y2) -{ - int i, k; - float p, qn, sig, un, *u; - u = malloc(n * sizeof (float)); - - y2[0] = u[0] = 0.0; - - for (i = 1; i < n - 1; i++) - { - sig = ((float) x[i] - x[i - 1]) / ((float) x[i + 1] - x[i - 1]); - p = sig * y2[i - 1] + 2.0; - y2[i] = (sig - 1.0) / p; - u[i] = (((float) y[i + 1] - y[i]) / (x[i + 1] - x[i])) - - (((float) y[i] - y[i - 1]) / (x[i] - x[i - 1])); - u[i] = (6.0 * u[i] / (x[i + 1] - x[i - 1]) - sig * u[i - 1]) / p; - } - qn = un = 0.0; - - y2[n - 1] = (un - qn * u[n - 2]) / (qn * y2[n - 2] + 1.0); - for (k = n - 2; k >= 0; k--) - y2[k] = y2[k] * y2[k + 1] + u[k]; - free(u); -} - -float eval_spline(float xa[], float ya[], float y2a[], int n, float x) -{ - int klo, khi, k; - float h, b, a; - - klo = 0; - khi = n - 1; - while (khi - klo > 1) - { - k = (khi + klo) >> 1; - if (xa[k] > x) - khi = k; - else - klo = k; - } - h = xa[khi] - xa[klo]; - a = (xa[khi] - x) / h; - b = (x - xa[klo]) / h; - return (a * ya[klo] + b * ya[khi] + ((a * a * a - a) * y2a[klo] + (b * b * b - b) * y2a[khi]) - * (h * h) / 6.0); -} - -void 
mpg123_set_eq(int on, float preamp, float *b) -{ - float x[] = - {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, yf[10], val, band[10]; - int bands[] = - {0, 4, 8, 16, 26, 78, 157, 313, 366, 418}; - int i, j; - - mpg123_info->eq_active = on; - if (mpg123_info->eq_active) - { - for (i = 0; i < 10; i++) - { - band[i] = b[i] + preamp; - } - - init_spline(x, band, 10, yf); - for (i = 0; i < 9; i++) - { - for (j = bands[i]; j < bands[i + 1]; j++) - { - val = eval_spline(x, band, yf, 10, i + ((float) (j - bands[i]) * (1.0 / (bands[i + 1] - bands[i])))); - mpg123_info->eq_mul[j] = pow(2, val / 10.0); - } - } - for (i = bands[9]; i < 576; i++) - mpg123_info->eq_mul[i] = mpg123_info->eq_mul[bands[9] - 1]; - } -} diff -r bc0898c7399b -r b924f0df5a1d mp3lib/huffman.h --- a/mp3lib/huffman.h Sun Oct 21 11:14:13 2012 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,335 +0,0 @@ -/* - * huffman tables ... recalcualted to work with my optimzed - * decoder scheme (MH) - * - * probably we could save a few bytes of memory, because the - * smaller tables are often the part of a bigger table - */ - -#ifndef MPLAYER_MP3LIB_HUFFMAN_H -#define MPLAYER_MP3LIB_HUFFMAN_H - -struct newhuff -{ - unsigned int linbits; - short *table; -}; - -static short tab0[] = -{ - 0 -}; - -static short tab1[] = -{ - -5, -3, -1, 17, 1, 16, 0 -}; - -static short tab2[] = -{ - -15, -11, -9, -5, -3, -1, 34, 2, 18, -1, 33, 32, 17, -1, 1, - 16, 0 -}; - -static short tab3[] = -{ - -13, -11, -9, -5, -3, -1, 34, 2, 18, -1, 33, 32, 16, 17, -1, - 1, 0 -}; - -static short tab5[] = -{ - -29, -25, -23, -15, -7, -5, -3, -1, 51, 35, 50, 49, -3, -1, 19, - 3, -1, 48, 34, -3, -1, 18, 33, -1, 2, 32, 17, -1, 1, 16, - 0 -}; - -static short tab6[] = -{ - -25, -19, -13, -9, -5, -3, -1, 51, 3, 35, -1, 50, 48, -1, 19, - 49, -3, -1, 34, 2, 18, -3, -1, 33, 32, 1, -1, 17, -1, 16, - 0 -}; - -static short tab7[] = -{ - -69, -65, -57, -39, -29, -17, -11, -7, -3, -1, 85, 69, -1, 84, 83, - -1, 53, 68, -3, -1, 37, 82, 21, -5, -1, 81, -1, 5, 52, -1, - 
80, -1, 67, 51, -5, -3, -1, 36, 66, 20, -1, 65, 64, -11, -7, - -3, -1, 4, 35, -1, 50, 3, -1, 19, 49, -3, -1, 48, 34, 18, - -5, -1, 33, -1, 2, 32, 17, -1, 1, 16, 0 -}; - -static short tab8[] = -{ - -65, -63, -59, -45, -31, -19, -13, -7, -5, -3, -1, 85, 84, 69, 83, - -3, -1, 53, 68, 37, -3, -1, 82, 5, 21, -5, -1, 81, -1, 52, - 67, -3, -1, 80, 51, 36, -5, -3, -1, 66, 20, 65, -3, -1, 4, - 64, -1, 35, 50, -9, -7, -3, -1, 19, 49, -1, 3, 48, 34, -1, - 2, 32, -1, 18, 33, 17, -3, -1, 1, 16, 0 -}; - -static short tab9[] = -{ - -63, -53, -41, -29, -19, -11, -5, -3, -1, 85, 69, 53, -1, 83, -1, - 84, 5, -3, -1, 68, 37, -1, 82, 21, -3, -1, 81, 52, -1, 67, - -1, 80, 4, -7, -3, -1, 36, 66, -1, 51, 64, -1, 20, 65, -5, - -3, -1, 35, 50, 19, -1, 49, -1, 3, 48, -5, -3, -1, 34, 2, - 18, -1, 33, 32, -3, -1, 17, 1, -1, 16, 0 -}; - -static short tab10[] = -{ --125,-121,-111, -83, -55, -35, -21, -13, -7, -3, -1, 119, 103, -1, 118, - 87, -3, -1, 117, 102, 71, -3, -1, 116, 86, -1, 101, 55, -9, -3, - -1, 115, 70, -3, -1, 85, 84, 99, -1, 39, 114, -11, -5, -3, -1, - 100, 7, 112, -1, 98, -1, 69, 53, -5, -1, 6, -1, 83, 68, 23, - -17, -5, -1, 113, -1, 54, 38, -5, -3, -1, 37, 82, 21, -1, 81, - -1, 52, 67, -3, -1, 22, 97, -1, 96, -1, 5, 80, -19, -11, -7, - -3, -1, 36, 66, -1, 51, 4, -1, 20, 65, -3, -1, 64, 35, -1, - 50, 3, -3, -1, 19, 49, -1, 48, 34, -7, -3, -1, 18, 33, -1, - 2, 32, 17, -1, 1, 16, 0 -}; - -static short tab11[] = -{ --121,-113, -89, -59, -43, -27, -17, -7, -3, -1, 119, 103, -1, 118, 117, - -3, -1, 102, 71, -1, 116, -1, 87, 85, -5, -3, -1, 86, 101, 55, - -1, 115, 70, -9, -7, -3, -1, 69, 84, -1, 53, 83, 39, -1, 114, - -1, 100, 7, -5, -1, 113, -1, 23, 112, -3, -1, 54, 99, -1, 96, - -1, 68, 37, -13, -7, -5, -3, -1, 82, 5, 21, 98, -3, -1, 38, - 6, 22, -5, -1, 97, -1, 81, 52, -5, -1, 80, -1, 67, 51, -1, - 36, 66, -15, -11, -7, -3, -1, 20, 65, -1, 4, 64, -1, 35, 50, - -1, 19, 49, -5, -3, -1, 3, 48, 34, 33, -5, -1, 18, -1, 2, - 32, 17, -3, -1, 1, 16, 0 -}; - -static short tab12[] = -{ --115, 
-99, -73, -45, -27, -17, -9, -5, -3, -1, 119, 103, 118, -1, 87, - 117, -3, -1, 102, 71, -1, 116, 101, -3, -1, 86, 55, -3, -1, 115, - 85, 39, -7, -3, -1, 114, 70, -1, 100, 23, -5, -1, 113, -1, 7, - 112, -1, 54, 99, -13, -9, -3, -1, 69, 84, -1, 68, -1, 6, 5, - -1, 38, 98, -5, -1, 97, -1, 22, 96, -3, -1, 53, 83, -1, 37, - 82, -17, -7, -3, -1, 21, 81, -1, 52, 67, -5, -3, -1, 80, 4, - 36, -1, 66, 20, -3, -1, 51, 65, -1, 35, 50, -11, -7, -5, -3, - -1, 64, 3, 48, 19, -1, 49, 34, -1, 18, 33, -7, -5, -3, -1, - 2, 32, 0, 17, -1, 1, 16 -}; - -static short tab13[] = -{ --509,-503,-475,-405,-333,-265,-205,-153,-115, -83, -53, -35, -21, -13, -9, - -7, -5, -3, -1, 254, 252, 253, 237, 255, -1, 239, 223, -3, -1, 238, - 207, -1, 222, 191, -9, -3, -1, 251, 206, -1, 220, -1, 175, 233, -1, - 236, 221, -9, -5, -3, -1, 250, 205, 190, -1, 235, 159, -3, -1, 249, - 234, -1, 189, 219, -17, -9, -3, -1, 143, 248, -1, 204, -1, 174, 158, - -5, -1, 142, -1, 127, 126, 247, -5, -1, 218, -1, 173, 188, -3, -1, - 203, 246, 111, -15, -7, -3, -1, 232, 95, -1, 157, 217, -3, -1, 245, - 231, -1, 172, 187, -9, -3, -1, 79, 244, -3, -1, 202, 230, 243, -1, - 63, -1, 141, 216, -21, -9, -3, -1, 47, 242, -3, -1, 110, 156, 15, - -5, -3, -1, 201, 94, 171, -3, -1, 125, 215, 78, -11, -5, -3, -1, - 200, 214, 62, -1, 185, -1, 155, 170, -1, 31, 241, -23, -13, -5, -1, - 240, -1, 186, 229, -3, -1, 228, 140, -1, 109, 227, -5, -1, 226, -1, - 46, 14, -1, 30, 225, -15, -7, -3, -1, 224, 93, -1, 213, 124, -3, - -1, 199, 77, -1, 139, 184, -7, -3, -1, 212, 154, -1, 169, 108, -1, - 198, 61, -37, -21, -9, -5, -3, -1, 211, 123, 45, -1, 210, 29, -5, - -1, 183, -1, 92, 197, -3, -1, 153, 122, 195, -7, -5, -3, -1, 167, - 151, 75, 209, -3, -1, 13, 208, -1, 138, 168, -11, -7, -3, -1, 76, - 196, -1, 107, 182, -1, 60, 44, -3, -1, 194, 91, -3, -1, 181, 137, - 28, -43, -23, -11, -5, -1, 193, -1, 152, 12, -1, 192, -1, 180, 106, - -5, -3, -1, 166, 121, 59, -1, 179, -1, 136, 90, -11, -5, -1, 43, - -1, 165, 105, -1, 164, -1, 120, 135, -5, -1, 
148, -1, 119, 118, 178, - -11, -3, -1, 27, 177, -3, -1, 11, 176, -1, 150, 74, -7, -3, -1, - 58, 163, -1, 89, 149, -1, 42, 162, -47, -23, -9, -3, -1, 26, 161, - -3, -1, 10, 104, 160, -5, -3, -1, 134, 73, 147, -3, -1, 57, 88, - -1, 133, 103, -9, -3, -1, 41, 146, -3, -1, 87, 117, 56, -5, -1, - 131, -1, 102, 71, -3, -1, 116, 86, -1, 101, 115, -11, -3, -1, 25, - 145, -3, -1, 9, 144, -1, 72, 132, -7, -5, -1, 114, -1, 70, 100, - 40, -1, 130, 24, -41, -27, -11, -5, -3, -1, 55, 39, 23, -1, 113, - -1, 85, 7, -7, -3, -1, 112, 54, -1, 99, 69, -3, -1, 84, 38, - -1, 98, 53, -5, -1, 129, -1, 8, 128, -3, -1, 22, 97, -1, 6, - 96, -13, -9, -5, -3, -1, 83, 68, 37, -1, 82, 5, -1, 21, 81, - -7, -3, -1, 52, 67, -1, 80, 36, -3, -1, 66, 51, 20, -19, -11, - -5, -1, 65, -1, 4, 64, -3, -1, 35, 50, 19, -3, -1, 49, 3, - -1, 48, 34, -3, -1, 18, 33, -1, 2, 32, -3, -1, 17, 1, 16, - 0 -}; - -static short tab15[] = -{ --495,-445,-355,-263,-183,-115, -77, -43, -27, -13, -7, -3, -1, 255, 239, - -1, 254, 223, -1, 238, -1, 253, 207, -7, -3, -1, 252, 222, -1, 237, - 191, -1, 251, -1, 206, 236, -7, -3, -1, 221, 175, -1, 250, 190, -3, - -1, 235, 205, -1, 220, 159, -15, -7, -3, -1, 249, 234, -1, 189, 219, - -3, -1, 143, 248, -1, 204, 158, -7, -3, -1, 233, 127, -1, 247, 173, - -3, -1, 218, 188, -1, 111, -1, 174, 15, -19, -11, -3, -1, 203, 246, - -3, -1, 142, 232, -1, 95, 157, -3, -1, 245, 126, -1, 231, 172, -9, - -3, -1, 202, 187, -3, -1, 217, 141, 79, -3, -1, 244, 63, -1, 243, - 216, -33, -17, -9, -3, -1, 230, 47, -1, 242, -1, 110, 240, -3, -1, - 31, 241, -1, 156, 201, -7, -3, -1, 94, 171, -1, 186, 229, -3, -1, - 125, 215, -1, 78, 228, -15, -7, -3, -1, 140, 200, -1, 62, 109, -3, - -1, 214, 227, -1, 155, 185, -7, -3, -1, 46, 170, -1, 226, 30, -5, - -1, 225, -1, 14, 224, -1, 93, 213, -45, -25, -13, -7, -3, -1, 124, - 199, -1, 77, 139, -1, 212, -1, 184, 154, -7, -3, -1, 169, 108, -1, - 198, 61, -1, 211, 210, -9, -5, -3, -1, 45, 13, 29, -1, 123, 183, - -5, -1, 209, -1, 92, 208, -1, 197, 138, -17, -7, -3, -1, 
168, 76, - -1, 196, 107, -5, -1, 182, -1, 153, 12, -1, 60, 195, -9, -3, -1, - 122, 167, -1, 166, -1, 192, 11, -1, 194, -1, 44, 91, -55, -29, -15, - -7, -3, -1, 181, 28, -1, 137, 152, -3, -1, 193, 75, -1, 180, 106, - -5, -3, -1, 59, 121, 179, -3, -1, 151, 136, -1, 43, 90, -11, -5, - -1, 178, -1, 165, 27, -1, 177, -1, 176, 105, -7, -3, -1, 150, 74, - -1, 164, 120, -3, -1, 135, 58, 163, -17, -7, -3, -1, 89, 149, -1, - 42, 162, -3, -1, 26, 161, -3, -1, 10, 160, 104, -7, -3, -1, 134, - 73, -1, 148, 57, -5, -1, 147, -1, 119, 9, -1, 88, 133, -53, -29, - -13, -7, -3, -1, 41, 103, -1, 118, 146, -1, 145, -1, 25, 144, -7, - -3, -1, 72, 132, -1, 87, 117, -3, -1, 56, 131, -1, 102, 71, -7, - -3, -1, 40, 130, -1, 24, 129, -7, -3, -1, 116, 8, -1, 128, 86, - -3, -1, 101, 55, -1, 115, 70, -17, -7, -3, -1, 39, 114, -1, 100, - 23, -3, -1, 85, 113, -3, -1, 7, 112, 54, -7, -3, -1, 99, 69, - -1, 84, 38, -3, -1, 98, 22, -3, -1, 6, 96, 53, -33, -19, -9, - -5, -1, 97, -1, 83, 68, -1, 37, 82, -3, -1, 21, 81, -3, -1, - 5, 80, 52, -7, -3, -1, 67, 36, -1, 66, 51, -1, 65, -1, 20, - 4, -9, -3, -1, 35, 50, -3, -1, 64, 3, 19, -3, -1, 49, 48, - 34, -9, -7, -3, -1, 18, 33, -1, 2, 32, 17, -3, -1, 1, 16, - 0 -}; - -static short tab16[] = -{ --509,-503,-461,-323,-103, -37, -27, -15, -7, -3, -1, 239, 254, -1, 223, - 253, -3, -1, 207, 252, -1, 191, 251, -5, -1, 175, -1, 250, 159, -3, - -1, 249, 248, 143, -7, -3, -1, 127, 247, -1, 111, 246, 255, -9, -5, - -3, -1, 95, 245, 79, -1, 244, 243, -53, -1, 240, -1, 63, -29, -19, - -13, -7, -5, -1, 206, -1, 236, 221, 222, -1, 233, -1, 234, 217, -1, - 238, -1, 237, 235, -3, -1, 190, 205, -3, -1, 220, 219, 174, -11, -5, - -1, 204, -1, 173, 218, -3, -1, 126, 172, 202, -5, -3, -1, 201, 125, - 94, 189, 242, -93, -5, -3, -1, 47, 15, 31, -1, 241, -49, -25, -13, - -5, -1, 158, -1, 188, 203, -3, -1, 142, 232, -1, 157, 231, -7, -3, - -1, 187, 141, -1, 216, 110, -1, 230, 156, -13, -7, -3, -1, 171, 186, - -1, 229, 215, -1, 78, -1, 228, 140, -3, -1, 200, 62, -1, 109, -1, - 214, 
155, -19, -11, -5, -3, -1, 185, 170, 225, -1, 212, -1, 184, 169, - -5, -1, 123, -1, 183, 208, 227, -7, -3, -1, 14, 224, -1, 93, 213, - -3, -1, 124, 199, -1, 77, 139, -75, -45, -27, -13, -7, -3, -1, 154, - 108, -1, 198, 61, -3, -1, 92, 197, 13, -7, -3, -1, 138, 168, -1, - 153, 76, -3, -1, 182, 122, 60, -11, -5, -3, -1, 91, 137, 28, -1, - 192, -1, 152, 121, -1, 226, -1, 46, 30, -15, -7, -3, -1, 211, 45, - -1, 210, 209, -5, -1, 59, -1, 151, 136, 29, -7, -3, -1, 196, 107, - -1, 195, 167, -1, 44, -1, 194, 181, -23, -13, -7, -3, -1, 193, 12, - -1, 75, 180, -3, -1, 106, 166, 179, -5, -3, -1, 90, 165, 43, -1, - 178, 27, -13, -5, -1, 177, -1, 11, 176, -3, -1, 105, 150, -1, 74, - 164, -5, -3, -1, 120, 135, 163, -3, -1, 58, 89, 42, -97, -57, -33, - -19, -11, -5, -3, -1, 149, 104, 161, -3, -1, 134, 119, 148, -5, -3, - -1, 73, 87, 103, 162, -5, -1, 26, -1, 10, 160, -3, -1, 57, 147, - -1, 88, 133, -9, -3, -1, 41, 146, -3, -1, 118, 9, 25, -5, -1, - 145, -1, 144, 72, -3, -1, 132, 117, -1, 56, 131, -21, -11, -5, -3, - -1, 102, 40, 130, -3, -1, 71, 116, 24, -3, -1, 129, 128, -3, -1, - 8, 86, 55, -9, -5, -1, 115, -1, 101, 70, -1, 39, 114, -5, -3, - -1, 100, 85, 7, 23, -23, -13, -5, -1, 113, -1, 112, 54, -3, -1, - 99, 69, -1, 84, 38, -3, -1, 98, 22, -1, 97, -1, 6, 96, -9, - -5, -1, 83, -1, 53, 68, -1, 37, 82, -1, 81, -1, 21, 5, -33, - -23, -13, -7, -3, -1, 52, 67, -1, 80, 36, -3, -1, 66, 51, 20, - -5, -1, 65, -1, 4, 64, -1, 35, 50, -3, -1, 19, 49, -3, -1, - 3, 48, 34, -3, -1, 18, 33, -1, 2, 32, -3, -1, 17, 1, 16, - 0 -}; - -static short tab24[] = -{ --451,-117, -43, -25, -15, -7, -3, -1, 239, 254, -1, 223, 253, -3, -1, - 207, 252, -1, 191, 251, -5, -1, 250, -1, 175, 159, -1, 249, 248, -9, - -5, -3, -1, 143, 127, 247, -1, 111, 246, -3, -1, 95, 245, -1, 79, - 244, -71, -7, -3, -1, 63, 243, -1, 47, 242, -5, -1, 241, -1, 31, - 240, -25, -9, -1, 15, -3, -1, 238, 222, -1, 237, 206, -7, -3, -1, - 236, 221, -1, 190, 235, -3, -1, 205, 220, -1, 174, 234, -15, -7, -3, - -1, 189, 219, -1, 204, 
158, -3, -1, 233, 173, -1, 218, 188, -7, -3, - -1, 203, 142, -1, 232, 157, -3, -1, 217, 126, -1, 231, 172, 255,-235, --143, -77, -45, -25, -15, -7, -3, -1, 202, 187, -1, 141, 216, -5, -3, - -1, 14, 224, 13, 230, -5, -3, -1, 110, 156, 201, -1, 94, 186, -9, - -5, -1, 229, -1, 171, 125, -1, 215, 228, -3, -1, 140, 200, -3, -1, - 78, 46, 62, -15, -7, -3, -1, 109, 214, -1, 227, 155, -3, -1, 185, - 170, -1, 226, 30, -7, -3, -1, 225, 93, -1, 213, 124, -3, -1, 199, - 77, -1, 139, 184, -31, -15, -7, -3, -1, 212, 154, -1, 169, 108, -3, - -1, 198, 61, -1, 211, 45, -7, -3, -1, 210, 29, -1, 123, 183, -3, - -1, 209, 92, -1, 197, 138, -17, -7, -3, -1, 168, 153, -1, 76, 196, - -3, -1, 107, 182, -3, -1, 208, 12, 60, -7, -3, -1, 195, 122, -1, - 167, 44, -3, -1, 194, 91, -1, 181, 28, -57, -35, -19, -7, -3, -1, - 137, 152, -1, 193, 75, -5, -3, -1, 192, 11, 59, -3, -1, 176, 10, - 26, -5, -1, 180, -1, 106, 166, -3, -1, 121, 151, -3, -1, 160, 9, - 144, -9, -3, -1, 179, 136, -3, -1, 43, 90, 178, -7, -3, -1, 165, - 27, -1, 177, 105, -1, 150, 164, -17, -9, -5, -3, -1, 74, 120, 135, - -1, 58, 163, -3, -1, 89, 149, -1, 42, 162, -7, -3, -1, 161, 104, - -1, 134, 119, -3, -1, 73, 148, -1, 57, 147, -63, -31, -15, -7, -3, - -1, 88, 133, -1, 41, 103, -3, -1, 118, 146, -1, 25, 145, -7, -3, - -1, 72, 132, -1, 87, 117, -3, -1, 56, 131, -1, 102, 40, -17, -7, - -3, -1, 130, 24, -1, 71, 116, -5, -1, 129, -1, 8, 128, -1, 86, - 101, -7, -5, -1, 23, -1, 7, 112, 115, -3, -1, 55, 39, 114, -15, - -7, -3, -1, 70, 100, -1, 85, 113, -3, -1, 54, 99, -1, 69, 84, - -7, -3, -1, 38, 98, -1, 22, 97, -5, -3, -1, 6, 96, 53, -1, - 83, 68, -51, -37, -23, -15, -9, -3, -1, 37, 82, -1, 21, -1, 5, - 80, -1, 81, -1, 52, 67, -3, -1, 36, 66, -1, 51, 20, -9, -5, - -1, 65, -1, 4, 64, -1, 35, 50, -1, 19, 49, -7, -5, -3, -1, - 3, 48, 34, 18, -1, 33, -1, 2, 32, -3, -1, 17, 1, -1, 16, - 0 -}; - -static short tab_c0[] = -{ - -29, -21, -13, -7, -3, -1, 11, 15, -1, 13, 14, -3, -1, 7, 5, - 9, -3, -1, 6, 3, -1, 10, 12, -3, -1, 2, 1, -1, 4, 
8, - 0 -}; - -static short tab_c1[] = -{ - -15, -7, -3, -1, 15, 14, -1, 13, 12, -3, -1, 11, 10, -1, 9, - 8, -7, -3, -1, 7, 6, -1, 5, 4, -3, -1, 3, 2, -1, 1, - 0 -}; - - - -static struct newhuff ht[] = -{ - { /* 0 */ 0 , tab0 } , - { /* 2 */ 0 , tab1 } , - { /* 3 */ 0 , tab2 } , - { /* 3 */ 0 , tab3 } , - { /* 0 */ 0 , tab0 } , - { /* 4 */ 0 , tab5 } , - { /* 4 */ 0 , tab6 } , - { /* 6 */ 0 , tab7 } , - { /* 6 */ 0 , tab8 } , - { /* 6 */ 0 , tab9 } , - { /* 8 */ 0 , tab10 } , - { /* 8 */ 0 , tab11 } , - { /* 8 */ 0 , tab12 } , - { /* 16 */ 0 , tab13 } , - { /* 0 */ 0 , tab0 } , - { /* 16 */ 0 , tab15 } , - - { /* 16 */ 1 , tab16 } , - { /* 16 */ 2 , tab16 } , - { /* 16 */ 3 , tab16 } , - { /* 16 */ 4 , tab16 } , - { /* 16 */ 6 , tab16 } , - { /* 16 */ 8 , tab16 } , - { /* 16 */ 10, tab16 } , - { /* 16 */ 13, tab16 } , - { /* 16 */ 4 , tab24 } , - { /* 16 */ 5 , tab24 } , - { /* 16 */ 6 , tab24 } , - { /* 16 */ 7 , tab24 } , - { /* 16 */ 8 , tab24 } , - { /* 16 */ 9 , tab24 } , - { /* 16 */ 11, tab24 } , - { /* 16 */ 13, tab24 } -}; - -static struct newhuff htc[] = -{ - { /* 1 , 1 , */ 0 , tab_c0 } , - { /* 1 , 1 , */ 0 , tab_c1 } -}; - -#endif /* MPLAYER_MP3LIB_HUFFMAN_H */ diff -r bc0898c7399b -r b924f0df5a1d mp3lib/l2tables.h --- a/mp3lib/l2tables.h Sun Oct 21 11:14:13 2012 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,166 +0,0 @@ -/* - * Modified for use with MPlayer, for details see the changelog at - * http://svn.mplayerhq.hu/mplayer/trunk/ - * $Id$ - */ - -#ifndef MPLAYER_MP3LIB_L2TABLES_H -#define MPLAYER_MP3LIB_L2TABLES_H - -#include "mpg123.h" - -/* - * Layer 2 Alloc tables .. - * most other tables are calculated on program start (which is (of course) - * not ISO-conform) .. 
- * Layer-3 huffman table is in huffman.h - */ - -static struct al_table alloc_0[] = { - {4,0},{5,3},{3,-3},{4,-7},{5,-15},{6,-31},{7,-63},{8,-127},{9,-255},{10,-511}, - {11,-1023},{12,-2047},{13,-4095},{14,-8191},{15,-16383},{16,-32767}, - {4,0},{5,3},{3,-3},{4,-7},{5,-15},{6,-31},{7,-63},{8,-127},{9,-255},{10,-511}, - {11,-1023},{12,-2047},{13,-4095},{14,-8191},{15,-16383},{16,-32767}, - {4,0},{5,3},{3,-3},{4,-7},{5,-15},{6,-31},{7,-63},{8,-127},{9,-255},{10,-511}, - {11,-1023},{12,-2047},{13,-4095},{14,-8191},{15,-16383},{16,-32767}, - {4,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{6,-31},{7,-63},{8,-127}, - {9,-255},{10,-511},{11,-1023},{12,-2047},{13,-4095},{16,-32767}, - {4,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{6,-31},{7,-63},{8,-127}, - {9,-255},{10,-511},{11,-1023},{12,-2047},{13,-4095},{16,-32767}, - {4,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{6,-31},{7,-63},{8,-127}, - {9,-255},{10,-511},{11,-1023},{12,-2047},{13,-4095},{16,-32767}, - {4,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{6,-31},{7,-63},{8,-127}, - {9,-255},{10,-511},{11,-1023},{12,-2047},{13,-4095},{16,-32767}, - {4,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{6,-31},{7,-63},{8,-127}, - {9,-255},{10,-511},{11,-1023},{12,-2047},{13,-4095},{16,-32767}, - {4,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{6,-31},{7,-63},{8,-127}, - {9,-255},{10,-511},{11,-1023},{12,-2047},{13,-4095},{16,-32767}, - {4,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{6,-31},{7,-63},{8,-127}, - {9,-255},{10,-511},{11,-1023},{12,-2047},{13,-4095},{16,-32767}, - {4,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{6,-31},{7,-63},{8,-127}, - {9,-255},{10,-511},{11,-1023},{12,-2047},{13,-4095},{16,-32767}, - {3,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{16,-32767}, - {3,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{16,-32767}, - {3,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{16,-32767}, - {3,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{16,-32767}, - {3,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{16,-32767}, - 
{3,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{16,-32767}, - {3,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{16,-32767}, - {3,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{16,-32767}, - {3,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{16,-32767}, - {3,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{16,-32767}, - {3,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{16,-32767}, - {3,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{16,-32767}, - {2,0},{5,3},{7,5},{16,-32767}, - {2,0},{5,3},{7,5},{16,-32767}, - {2,0},{5,3},{7,5},{16,-32767}, - {2,0},{5,3},{7,5},{16,-32767} }; - -static struct al_table alloc_1[] = { - {4,0},{5,3},{3,-3},{4,-7},{5,-15},{6,-31},{7,-63},{8,-127},{9,-255},{10,-511}, - {11,-1023},{12,-2047},{13,-4095},{14,-8191},{15,-16383},{16,-32767}, - {4,0},{5,3},{3,-3},{4,-7},{5,-15},{6,-31},{7,-63},{8,-127},{9,-255},{10,-511}, - {11,-1023},{12,-2047},{13,-4095},{14,-8191},{15,-16383},{16,-32767}, - {4,0},{5,3},{3,-3},{4,-7},{5,-15},{6,-31},{7,-63},{8,-127},{9,-255},{10,-511}, - {11,-1023},{12,-2047},{13,-4095},{14,-8191},{15,-16383},{16,-32767}, - {4,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{6,-31},{7,-63},{8,-127}, - {9,-255},{10,-511},{11,-1023},{12,-2047},{13,-4095},{16,-32767}, - {4,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{6,-31},{7,-63},{8,-127}, - {9,-255},{10,-511},{11,-1023},{12,-2047},{13,-4095},{16,-32767}, - {4,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{6,-31},{7,-63},{8,-127}, - {9,-255},{10,-511},{11,-1023},{12,-2047},{13,-4095},{16,-32767}, - {4,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{6,-31},{7,-63},{8,-127}, - {9,-255},{10,-511},{11,-1023},{12,-2047},{13,-4095},{16,-32767}, - {4,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{6,-31},{7,-63},{8,-127}, - {9,-255},{10,-511},{11,-1023},{12,-2047},{13,-4095},{16,-32767}, - {4,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{6,-31},{7,-63},{8,-127}, - {9,-255},{10,-511},{11,-1023},{12,-2047},{13,-4095},{16,-32767}, - {4,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{6,-31},{7,-63},{8,-127}, - 
{9,-255},{10,-511},{11,-1023},{12,-2047},{13,-4095},{16,-32767}, - {4,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{6,-31},{7,-63},{8,-127}, - {9,-255},{10,-511},{11,-1023},{12,-2047},{13,-4095},{16,-32767}, - {3,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{16,-32767}, - {3,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{16,-32767}, - {3,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{16,-32767}, - {3,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{16,-32767}, - {3,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{16,-32767}, - {3,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{16,-32767}, - {3,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{16,-32767}, - {3,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{16,-32767}, - {3,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{16,-32767}, - {3,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{16,-32767}, - {3,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{16,-32767}, - {3,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{16,-32767}, - {2,0},{5,3},{7,5},{16,-32767}, - {2,0},{5,3},{7,5},{16,-32767}, - {2,0},{5,3},{7,5},{16,-32767}, - {2,0},{5,3},{7,5},{16,-32767}, - {2,0},{5,3},{7,5},{16,-32767}, - {2,0},{5,3},{7,5},{16,-32767}, - {2,0},{5,3},{7,5},{16,-32767} }; - -static struct al_table alloc_2[] = { - {4,0},{5,3},{7,5},{10,9},{4,-7},{5,-15},{6,-31},{7,-63},{8,-127},{9,-255}, - {10,-511},{11,-1023},{12,-2047},{13,-4095},{14,-8191},{15,-16383}, - {4,0},{5,3},{7,5},{10,9},{4,-7},{5,-15},{6,-31},{7,-63},{8,-127},{9,-255}, - {10,-511},{11,-1023},{12,-2047},{13,-4095},{14,-8191},{15,-16383}, - {3,0},{5,3},{7,5},{10,9},{4,-7},{5,-15},{6,-31},{7,-63}, - {3,0},{5,3},{7,5},{10,9},{4,-7},{5,-15},{6,-31},{7,-63}, - {3,0},{5,3},{7,5},{10,9},{4,-7},{5,-15},{6,-31},{7,-63}, - {3,0},{5,3},{7,5},{10,9},{4,-7},{5,-15},{6,-31},{7,-63}, - {3,0},{5,3},{7,5},{10,9},{4,-7},{5,-15},{6,-31},{7,-63}, - {3,0},{5,3},{7,5},{10,9},{4,-7},{5,-15},{6,-31},{7,-63} }; - -static struct al_table alloc_3[] = { - {4,0},{5,3},{7,5},{10,9},{4,-7},{5,-15},{6,-31},{7,-63},{8,-127},{9,-255}, - 
{10,-511},{11,-1023},{12,-2047},{13,-4095},{14,-8191},{15,-16383}, - {4,0},{5,3},{7,5},{10,9},{4,-7},{5,-15},{6,-31},{7,-63},{8,-127},{9,-255}, - {10,-511},{11,-1023},{12,-2047},{13,-4095},{14,-8191},{15,-16383}, - {3,0},{5,3},{7,5},{10,9},{4,-7},{5,-15},{6,-31},{7,-63}, - {3,0},{5,3},{7,5},{10,9},{4,-7},{5,-15},{6,-31},{7,-63}, - {3,0},{5,3},{7,5},{10,9},{4,-7},{5,-15},{6,-31},{7,-63}, - {3,0},{5,3},{7,5},{10,9},{4,-7},{5,-15},{6,-31},{7,-63}, - {3,0},{5,3},{7,5},{10,9},{4,-7},{5,-15},{6,-31},{7,-63}, - {3,0},{5,3},{7,5},{10,9},{4,-7},{5,-15},{6,-31},{7,-63}, - {3,0},{5,3},{7,5},{10,9},{4,-7},{5,-15},{6,-31},{7,-63}, - {3,0},{5,3},{7,5},{10,9},{4,-7},{5,-15},{6,-31},{7,-63}, - {3,0},{5,3},{7,5},{10,9},{4,-7},{5,-15},{6,-31},{7,-63}, - {3,0},{5,3},{7,5},{10,9},{4,-7},{5,-15},{6,-31},{7,-63} }; - -static struct al_table alloc_4[] = { - {4,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{6,-31},{7,-63},{8,-127}, - {9,-255},{10,-511},{11,-1023},{12,-2047},{13,-4095},{14,-8191}, - {4,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{6,-31},{7,-63},{8,-127}, - {9,-255},{10,-511},{11,-1023},{12,-2047},{13,-4095},{14,-8191}, - {4,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{6,-31},{7,-63},{8,-127}, - {9,-255},{10,-511},{11,-1023},{12,-2047},{13,-4095},{14,-8191}, - {4,0},{5,3},{7,5},{3,-3},{10,9},{4,-7},{5,-15},{6,-31},{7,-63},{8,-127}, - {9,-255},{10,-511},{11,-1023},{12,-2047},{13,-4095},{14,-8191}, - {3,0},{5,3},{7,5},{10,9},{4,-7},{5,-15},{6,-31},{7,-63}, - {3,0},{5,3},{7,5},{10,9},{4,-7},{5,-15},{6,-31},{7,-63}, - {3,0},{5,3},{7,5},{10,9},{4,-7},{5,-15},{6,-31},{7,-63}, - {3,0},{5,3},{7,5},{10,9},{4,-7},{5,-15},{6,-31},{7,-63}, - {3,0},{5,3},{7,5},{10,9},{4,-7},{5,-15},{6,-31},{7,-63}, - {3,0},{5,3},{7,5},{10,9},{4,-7},{5,-15},{6,-31},{7,-63}, - {3,0},{5,3},{7,5},{10,9},{4,-7},{5,-15},{6,-31},{7,-63}, - {2,0},{5,3},{7,5},{10,9}, - {2,0},{5,3},{7,5},{10,9}, - {2,0},{5,3},{7,5},{10,9}, - {2,0},{5,3},{7,5},{10,9}, - {2,0},{5,3},{7,5},{10,9}, - {2,0},{5,3},{7,5},{10,9}, - 
{2,0},{5,3},{7,5},{10,9}, - {2,0},{5,3},{7,5},{10,9}, - {2,0},{5,3},{7,5},{10,9}, - {2,0},{5,3},{7,5},{10,9}, - {2,0},{5,3},{7,5},{10,9}, - {2,0},{5,3},{7,5},{10,9}, - {2,0},{5,3},{7,5},{10,9}, - {2,0},{5,3},{7,5},{10,9}, - {2,0},{5,3},{7,5},{10,9}, - {2,0},{5,3},{7,5},{10,9}, - {2,0},{5,3},{7,5},{10,9}, - {2,0},{5,3},{7,5},{10,9}, - {2,0},{5,3},{7,5},{10,9} }; - -#endif /* MPLAYER_MP3LIB_L2TABLES_H */ diff -r bc0898c7399b -r b924f0df5a1d mp3lib/layer1.c --- a/mp3lib/layer1.c Sun Oct 21 11:14:13 2012 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,165 +0,0 @@ -/* - * Mpeg Layer-1 audio decoder - * -------------------------- - * copyright (c) 1995 by Michael Hipp, All rights reserved. See also 'README' - * near unoptimzed ... - * - * may have a few bugs after last optimization ... - * - */ - -/* - * Modified for use with MPlayer, for details see the changelog at - * http://svn.mplayerhq.hu/mplayer/trunk/ - * $Id$ - * - * The above-mentioned README file has the following to say about licensing: - * - * COPYING: you may use this source under LGPL terms! 
- */ - -#include "mpg123.h" - -static void I_step_one(unsigned int balloc[], unsigned int scale_index[2][SBLIMIT],struct frame *fr) -{ - unsigned int *ba=balloc; - unsigned int *sca = (unsigned int *) scale_index; - - if(fr->stereo == 2) { - int i; - int jsbound = fr->jsbound; - for (i=0;istereo == 2) { - int jsbound = fr->jsbound; - register real *f0 = fraction[0]; - register real *f1 = fraction[1]; - ba = balloc; - for (sample=smpb,i=0;idown_sample_sblimit;i<32;i++) - fraction[0][i] = fraction[1][i] = 0.0; - } - else { - register real *f0 = fraction[0]; - ba = balloc; - for (sample=smpb,i=0;idown_sample_sblimit;i<32;i++) - fraction[0][i] = 0.0; - } -} - -static int do_layer1(struct frame *fr,int single) -{ - int clip=0; - int i,stereo = fr->stereo; - unsigned int balloc[2*SBLIMIT]; - unsigned int scale_index[2][SBLIMIT]; - DECLARE_ALIGNED(16, real, fraction[2][SBLIMIT]); -// int single = fr->single; - -// printf("do_layer1(0x%02X 0x%02X 0x%02X 0x%02X 0x%02X 0x%02X 0x%02X 0x%02X )\n", -// wordpointer[0],wordpointer[1],wordpointer[2],wordpointer[3],wordpointer[4],wordpointer[5],wordpointer[6],wordpointer[7]); - - fr->jsbound = (fr->mode == MPG_MD_JOINT_STEREO) ? 
- (fr->mode_ext<<2)+4 : 32; - - if(stereo == 1 || single == 3) - single = 0; - - I_step_one(balloc,scale_index,fr); - - for (i=0;i= 0) - { - clip += (fr->synth_mono)( (real *) fraction[single],pcm_sample,&pcm_point); - } - else { - int p1 = pcm_point; - clip += (fr->synth)( (real *) fraction[0],0,pcm_sample,&p1); - clip += (fr->synth)( (real *) fraction[1],1,pcm_sample,&pcm_point); - } - - } - - return clip; -} diff -r bc0898c7399b -r b924f0df5a1d mp3lib/layer2.c --- a/mp3lib/layer2.c Sun Oct 21 11:14:13 2012 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,322 +0,0 @@ -/* - * Modified for use with MPlayer, for details see the changelog at - * http://svn.mplayerhq.hu/mplayer/trunk/ - * $Id$ - */ - -/* - * Mpeg Layer-2 audio decoder - * -------------------------- - * copyright (c) 1995 by Michael Hipp, All rights reserved. See also 'README' - * - */ - -#include "mpg123.h" -#include "l2tables.h" - -static int grp_3tab[32 * 3] = { 0, }; /* used: 27 */ -static int grp_5tab[128 * 3] = { 0, }; /* used: 125 */ -static int grp_9tab[1024 * 3] = { 0, }; /* used: 729 */ - -static real muls[27][64]; /* also used by layer 1 */ - -static void init_layer2(void) -{ - static double mulmul[27] = { - 0.0 , -2.0/3.0 , 2.0/3.0 , - 2.0/7.0 , 2.0/15.0 , 2.0/31.0, 2.0/63.0 , 2.0/127.0 , 2.0/255.0 , - 2.0/511.0 , 2.0/1023.0 , 2.0/2047.0 , 2.0/4095.0 , 2.0/8191.0 , - 2.0/16383.0 , 2.0/32767.0 , 2.0/65535.0 , - -4.0/5.0 , -2.0/5.0 , 2.0/5.0, 4.0/5.0 , - -8.0/9.0 , -4.0/9.0 , -2.0/9.0 , 2.0/9.0 , 4.0/9.0 , 8.0/9.0 }; - static int base[3][9] = { - { 1 , 0, 2 , } , - { 17, 18, 0 , 19, 20 , } , - { 21, 1, 22, 23, 0, 24, 25, 2, 26 } }; - int i,j,k,l,len; - real *table; - static int tablen[3] = { 3 , 5 , 9 }; - static int *itable,*tables[3] = { grp_3tab , grp_5tab , grp_9tab }; - - for(i=0;i<3;i++) - { - itable = tables[i]; - len = tablen[i]; - for(j=0;jstereo-1; - int sblimit = fr->II_sblimit; - int jsbound = fr->jsbound; - int sblimit2 = fr->II_sblimit<alloc; - int i; - static unsigned 
int scfsi_buf[64]; - unsigned int *scfsi,*bita; - int sc,step; - - bita = bit_alloc; - if(stereo) - { - for (i=jsbound;i>0;i--,alloc1+=(1<bits); - *bita++ = (char) getbits(step); - } - for (i=sblimit-jsbound;i>0;i--,alloc1+=(1<bits); - bita[1] = bita[0]; - bita+=2; - } - bita = bit_alloc; - scfsi=scfsi_buf; - for (i=sblimit2;i>0;i--) - if (*bita++) - *scfsi++ = (char) getbits_fast(2); - } - else /* mono */ - { - for (i=sblimit;i>0;i--,alloc1+=(1<bits); - bita = bit_alloc; - scfsi=scfsi_buf; - for (i=sblimit;i>0;i--) - if (*bita++) - *scfsi++ = (char) getbits_fast(2); - } - - bita = bit_alloc; - scfsi=scfsi_buf; - for (i=sblimit2;i>0;i--) - if (*bita++) - switch (*scfsi++) - { - case 0: - *scale++ = getbits_fast(6); - *scale++ = getbits_fast(6); - *scale++ = getbits_fast(6); - break; - case 1 : - *scale++ = sc = getbits_fast(6); - *scale++ = sc; - *scale++ = getbits_fast(6); - break; - case 2: - *scale++ = sc = getbits_fast(6); - *scale++ = sc; - *scale++ = sc; - break; - default: /* case 3 */ - *scale++ = getbits_fast(6); - *scale++ = sc = getbits_fast(6); - *scale++ = sc; - break; - } - -} - -static void II_step_two(unsigned int *bit_alloc,real fraction[2][4][SBLIMIT],int *scale,struct frame *fr,int x1) -{ - int i,j,k,ba; - int stereo = fr->stereo; - int sblimit = fr->II_sblimit; - int jsbound = fr->jsbound; - struct al_table *alloc2,*alloc1 = fr->alloc; - unsigned int *bita=bit_alloc; - int d1,step; - - for (i=0;ibits; - for (j=0;jbits; - if( (d1=alloc2->d) < 0) - { - real cm=muls[k][scale[x1]]; - fraction[j][0][i] = ((real) ((int)getbits(k) + d1)) * cm; - fraction[j][1][i] = ((real) ((int)getbits(k) + d1)) * cm; - fraction[j][2][i] = ((real) ((int)getbits(k) + d1)) * cm; - } - else - { - static int *table[] = { 0,0,0,grp_3tab,0,grp_5tab,0,0,0,grp_9tab }; - unsigned int idx,*tab,m=scale[x1]; - idx = (unsigned int) getbits(k); - tab = (unsigned int *) (table[d1] + idx + idx + idx); - fraction[j][0][i] = muls[*tab++][m]; - fraction[j][1][i] = muls[*tab++][m]; - 
fraction[j][2][i] = muls[*tab][m]; - } - scale+=3; - } - else - fraction[j][0][i] = fraction[j][1][i] = fraction[j][2][i] = 0.0; - } - } - - for (i=jsbound;ibits; - bita++; /* channel 1 and channel 2 bitalloc are the same */ - if ( (ba=*bita++) ) - { - k=(alloc2 = alloc1+ba)->bits; - if( (d1=alloc2->d) < 0) - { - real cm; - cm=muls[k][scale[x1+3]]; - fraction[1][0][i] = (fraction[0][0][i] = (real) ((int)getbits(k) + d1) ) * cm; - fraction[1][1][i] = (fraction[0][1][i] = (real) ((int)getbits(k) + d1) ) * cm; - fraction[1][2][i] = (fraction[0][2][i] = (real) ((int)getbits(k) + d1) ) * cm; - cm=muls[k][scale[x1]]; - fraction[0][0][i] *= cm; fraction[0][1][i] *= cm; fraction[0][2][i] *= cm; - } - else - { - static int *table[] = { 0,0,0,grp_3tab,0,grp_5tab,0,0,0,grp_9tab }; - unsigned int idx,*tab,m1,m2; - m1 = scale[x1]; m2 = scale[x1+3]; - idx = (unsigned int) getbits(k); - tab = (unsigned int *) (table[d1] + idx + idx + idx); - fraction[0][0][i] = muls[*tab][m1]; fraction[1][0][i] = muls[*tab++][m2]; - fraction[0][1][i] = muls[*tab][m1]; fraction[1][1][i] = muls[*tab++][m2]; - fraction[0][2][i] = muls[*tab][m1]; fraction[1][2][i] = muls[*tab][m2]; - } - scale+=6; - } - else { - fraction[0][0][i] = fraction[0][1][i] = fraction[0][2][i] = - fraction[1][0][i] = fraction[1][1][i] = fraction[1][2][i] = 0.0; - } -/* - should we use individual scalefac for channel 2 or - is the current way the right one , where we just copy channel 1 to - channel 2 ?? - The current 'strange' thing is, that we throw away the scalefac - values for the second channel ...!! --> changed .. now we use the scalefac values of channel one !! 
-*/ - } - - if(sblimit > (fr->down_sample_sblimit) ) - sblimit = fr->down_sample_sblimit; - - for(i=sblimit;ilsf) - table = 4; - else - table = translate[fr->sampling_frequency][2-fr->stereo][fr->bitrate_index]; - sblim = sblims[table]; - - fr->alloc = tables[table]; - fr->II_sblimit = sblim; -} - - -static int do_layer2(struct frame *fr,int outmode) -{ - int clip=0; - int i,j; - int stereo = fr->stereo; - DECLARE_ALIGNED(16, real, fraction[2][4][SBLIMIT]); /* pick_table clears unused subbands */ - unsigned int bit_alloc[64]; - int scale[192]; - int single = fr->single; - - II_select_table(fr); - fr->jsbound = (fr->mode == MPG_MD_JOINT_STEREO) ? - (fr->mode_ext<<2)+4 : fr->II_sblimit; - - if(stereo == 1 || single == 3) - single = 0; - - II_step_one(bit_alloc, scale, fr); - - for (i=0;i>2); - for (j=0;j<3;j++) - { - if(single >= 0) - { - clip += (fr->synth_mono) (fraction[single][j],pcm_sample,&pcm_point); - } - else { - int p1 = pcm_point; - clip += (fr->synth) (fraction[0][j],0,pcm_sample,&p1); - clip += (fr->synth) (fraction[1][j],1,pcm_sample,&pcm_point); - } - -// if(pcm_point >= audiobufsize) audio_flush(outmode,ai); - } - } - - return clip; -} diff -r bc0898c7399b -r b924f0df5a1d mp3lib/layer3.c --- a/mp3lib/layer3.c Sun Oct 21 11:14:13 2012 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,1349 +0,0 @@ -/* - * Modified for use with MPlayer, for details see the changelog at - * http://svn.mplayerhq.hu/mplayer/trunk/ - * $Id$ - */ - -/* - * Mpeg Layer-3 audio decoder - * -------------------------- - * copyright (c) 1995-1999 by Michael Hipp. - * All rights reserved. 
See also 'README' - * - * Optimize-TODO: put short bands into the band-field without the stride - * of 3 reals - * Length-optimze: unify long and short band code where it is possible - */ - -#include "mpg123.h" - -#if 0 -#define L3_DEBUG 1 -#endif - -#if 0 -#define CUT_HF -#endif - -#define REAL_MUL(x, y) ((x) * (y)) - -static real ispow[8207]; -static real aa_ca[8],aa_cs[8]; -static real COS1[12][6]; -static real win[4][36]; -static real win1[4][36]; -static real gainpow2[256+118+4]; - -/* non static for external 3dnow functions */ -real COS9[9]; -static real COS6_1,COS6_2; -real tfcos36[9]; - -static real tfcos12[3]; -#define NEW_DCT9 -#ifdef NEW_DCT9 -static real cos9[3],cos18[3]; -#endif - -struct bandInfoStruct { - uint16_t longIdx[23]; - uint8_t longDiff[22]; - uint16_t shortIdx[14]; - uint8_t shortDiff[13]; -}; - -static int longLimit[9][23]; -static int shortLimit[9][14]; - -static const struct bandInfoStruct bandInfo[9] = { - -/* MPEG 1.0 */ - { {0,4,8,12,16,20,24,30,36,44,52,62,74, 90,110,134,162,196,238,288,342,418,576}, - {4,4,4,4,4,4,6,6,8, 8,10,12,16,20,24,28,34,42,50,54, 76,158}, - {0,4*3,8*3,12*3,16*3,22*3,30*3,40*3,52*3,66*3, 84*3,106*3,136*3,192*3}, - {4,4,4,4,6,8,10,12,14,18,22,30,56} } , - - { {0,4,8,12,16,20,24,30,36,42,50,60,72, 88,106,128,156,190,230,276,330,384,576}, - {4,4,4,4,4,4,6,6,6, 8,10,12,16,18,22,28,34,40,46,54, 54,192}, - {0,4*3,8*3,12*3,16*3,22*3,28*3,38*3,50*3,64*3, 80*3,100*3,126*3,192*3}, - {4,4,4,4,6,6,10,12,14,16,20,26,66} } , - - { {0,4,8,12,16,20,24,30,36,44,54,66,82,102,126,156,194,240,296,364,448,550,576} , - {4,4,4,4,4,4,6,6,8,10,12,16,20,24,30,38,46,56,68,84,102, 26} , - {0,4*3,8*3,12*3,16*3,22*3,30*3,42*3,58*3,78*3,104*3,138*3,180*3,192*3} , - {4,4,4,4,6,8,12,16,20,26,34,42,12} } , - -/* MPEG 2.0 */ - { {0,6,12,18,24,30,36,44,54,66,80,96,116,140,168,200,238,284,336,396,464,522,576}, - {6,6,6,6,6,6,8,10,12,14,16,20,24,28,32,38,46,52,60,68,58,54 } , - {0,4*3,8*3,12*3,18*3,24*3,32*3,42*3,56*3,74*3,100*3,132*3,174*3,192*3} 
, - {4,4,4,6,6,8,10,14,18,26,32,42,18 } } , -/* changed 19th value fropm 330 to 332 */ - { {0,6,12,18,24,30,36,44,54,66,80,96,114,136,162,194,232,278,332,394,464,540,576}, - {6,6,6,6,6,6,8,10,12,14,16,18,22,26,32,38,46,54,62,70,76,36 } , - {0,4*3,8*3,12*3,18*3,26*3,36*3,48*3,62*3,80*3,104*3,136*3,180*3,192*3} , - {4,4,4,6,8,10,12,14,18,24,32,44,12 } } , - - { {0,6,12,18,24,30,36,44,54,66,80,96,116,140,168,200,238,284,336,396,464,522,576}, - {6,6,6,6,6,6,8,10,12,14,16,20,24,28,32,38,46,52,60,68,58,54 }, - {0,4*3,8*3,12*3,18*3,26*3,36*3,48*3,62*3,80*3,104*3,134*3,174*3,192*3}, - {4,4,4,6,8,10,12,14,18,24,30,40,18 } } , -/* MPEG 2.5 */ - { {0,6,12,18,24,30,36,44,54,66,80,96,116,140,168,200,238,284,336,396,464,522,576} , - {6,6,6,6,6,6,8,10,12,14,16,20,24,28,32,38,46,52,60,68,58,54}, - {0,12,24,36,54,78,108,144,186,240,312,402,522,576}, - {4,4,4,6,8,10,12,14,18,24,30,40,18} }, - { {0,6,12,18,24,30,36,44,54,66,80,96,116,140,168,200,238,284,336,396,464,522,576} , - {6,6,6,6,6,6,8,10,12,14,16,20,24,28,32,38,46,52,60,68,58,54}, - {0,12,24,36,54,78,108,144,186,240,312,402,522,576}, - {4,4,4,6,8,10,12,14,18,24,30,40,18} }, - { {0,12,24,36,48,60,72,88,108,132,160,192,232,280,336,400,476,566,568,570,572,574,576}, - {12,12,12,12,12,12,16,20,24,28,32,40,48,56,64,76,90,2,2,2,2,2}, - {0, 24, 48, 72,108,156,216,288,372,480,486,492,498,576}, - {8,8,8,12,16,20,24,28,36,2,2,2,26} } , -}; - -static int mapbuf0[9][152]; -static int mapbuf1[9][156]; -static int mapbuf2[9][44]; -static int *map[9][3]; -static int *mapend[9][3]; - -static unsigned int n_slen2[512]; /* MPEG 2.0 slen for 'normal' mode */ -static unsigned int i_slen2[256]; /* MPEG 2.0 slen for intensity stereo */ - -static real tan1_1[16],tan2_1[16],tan1_2[16],tan2_2[16]; -static real pow1_1[2][16],pow2_1[2][16],pow1_2[2][16],pow2_2[2][16]; - -/* - * init tables for layer-3 - */ -static void init_layer3(int down_sample_sblimit) -{ - int i,j,k,l; - - for(i=-256;i<118+4;i++) - { - if(_has_mmx) - gainpow2[i+256] = 16384.0 * 
pow((double)2.0,-0.25 * (double) (i+210) ); - else - gainpow2[i+256] = pow((double)2.0,-0.25 * (double) (i+210) ); - } - for(i=0;i<8207;i++) - ispow[i] = pow((double)i,(double)4.0/3.0); - - for (i=0;i<8;i++) - { - static const double Ci[8]={-0.6,-0.535,-0.33,-0.185,-0.095,-0.041,-0.0142,-0.0037}; - double sq=sqrt(1.0+Ci[i]*Ci[i]); - aa_cs[i] = 1.0/sq; - aa_ca[i] = Ci[i]/sq; - } - - for(i=0;i<18;i++) - { - win[0][i] = win[1][i] = 0.5 * sin( M_PI / 72.0 * (double) (2*(i+0) +1) ) / cos ( M_PI * (double) (2*(i+0) +19) / 72.0 ); - win[0][i+18] = win[3][i+18] = 0.5 * sin( M_PI / 72.0 * (double) (2*(i+18)+1) ) / cos ( M_PI * (double) (2*(i+18)+19) / 72.0 ); - } - for(i=0;i<6;i++) - { - win[1][i+18] = 0.5 / cos ( M_PI * (double) (2*(i+18)+19) / 72.0 ); - win[3][i+12] = 0.5 / cos ( M_PI * (double) (2*(i+12)+19) / 72.0 ); - win[1][i+24] = 0.5 * sin( M_PI / 24.0 * (double) (2*i+13) ) / cos ( M_PI * (double) (2*(i+24)+19) / 72.0 ); - win[1][i+30] = win[3][i] = 0.0; - win[3][i+6 ] = 0.5 * sin( M_PI / 24.0 * (double) (2*i+1) ) / cos ( M_PI * (double) (2*(i+6 )+19) / 72.0 ); - } - - for(i=0;i<9;i++) - COS9[i] = cos( M_PI / 18.0 * (double) i); - - for(i=0;i<9;i++) - tfcos36[i] = 0.5 / cos ( M_PI * (double) (i*2+1) / 36.0 ); - for(i=0;i<3;i++) - tfcos12[i] = 0.5 / cos ( M_PI * (double) (i*2+1) / 12.0 ); - - COS6_1 = cos( M_PI / 6.0 * (double) 1); - COS6_2 = cos( M_PI / 6.0 * (double) 2); - -#ifdef NEW_DCT9 - cos9[0] = cos(1.0*M_PI/9.0); - cos9[1] = cos(5.0*M_PI/9.0); - cos9[2] = cos(7.0*M_PI/9.0); - cos18[0] = cos(1.0*M_PI/18.0); - cos18[1] = cos(11.0*M_PI/18.0); - cos18[2] = cos(13.0*M_PI/18.0); -#endif - - for(i=0;i<12;i++) - { - win[2][i] = 0.5 * sin( M_PI / 24.0 * (double) (2*i+1) ) / cos ( M_PI * (double) (2*i+7) / 24.0 ); - for(j=0;j<6;j++) - COS1[i][j] = cos( M_PI / 24.0 * (double) ((2*i+7)*(2*j+1)) ); - } - - for(j=0;j<4;j++) { - static const int len[4] = { 36,36,12,36 }; - for(i=0;i 0) { - if( i & 1 ) - p1 = pow(base,(i+1.0)*0.5); - else - p2 = pow(base,i*0.5); - } - 
pow1_1[j][i] = p1; - pow2_1[j][i] = p2; - pow1_2[j][i] = M_SQRT2 * p1; - pow2_2[j][i] = M_SQRT2 * p2; - } - } - - for(j=0;j<9;j++) - { - const struct bandInfoStruct *bi = &bandInfo[j]; - int *mp; - int cb,lwin; - const uint8_t *bdf; - - mp = map[j][0] = mapbuf0[j]; - bdf = bi->longDiff; - for(i=0,cb = 0; cb < 8 ; cb++,i+=*bdf++) { - *mp++ = (*bdf) >> 1; - *mp++ = i; - *mp++ = 3; - *mp++ = cb; - } - bdf = bi->shortDiff+3; - for(cb=3;cb<13;cb++) { - int l = (*bdf++) >> 1; - for(lwin=0;lwin<3;lwin++) { - *mp++ = l; - *mp++ = i + lwin; - *mp++ = lwin; - *mp++ = cb; - } - i += 6*l; - } - mapend[j][0] = mp; - - mp = map[j][1] = mapbuf1[j]; - bdf = bi->shortDiff+0; - for(i=0,cb=0;cb<13;cb++) { - int l = (*bdf++) >> 1; - for(lwin=0;lwin<3;lwin++) { - *mp++ = l; - *mp++ = i + lwin; - *mp++ = lwin; - *mp++ = cb; - } - i += 6*l; - } - mapend[j][1] = mp; - - mp = map[j][2] = mapbuf2[j]; - bdf = bi->longDiff; - for(cb = 0; cb < 22 ; cb++) { - *mp++ = (*bdf++) >> 1; - *mp++ = cb; - } - mapend[j][2] = mp; - - } - - for(j=0;j<9;j++) { - for(i=0;i<23;i++) { - longLimit[j][i] = (bandInfo[j].longIdx[i] - 1 + 8) / 18 + 1; - if(longLimit[j][i] > (down_sample_sblimit) ) - longLimit[j][i] = down_sample_sblimit; - } - for(i=0;i<14;i++) { - shortLimit[j][i] = (bandInfo[j].shortIdx[i] - 1) / 18 + 1; - if(shortLimit[j][i] > (down_sample_sblimit) ) - shortLimit[j][i] = down_sample_sblimit; - } - } - - for(i=0;i<5;i++) { - for(j=0;j<6;j++) { - for(k=0;k<6;k++) { - int n = k + j * 6 + i * 36; - i_slen2[n] = i|(j<<3)|(k<<6)|(3<<12); - } - } - } - for(i=0;i<4;i++) { - for(j=0;j<4;j++) { - for(k=0;k<4;k++) { - int n = k + j * 4 + i * 16; - i_slen2[n+180] = i|(j<<3)|(k<<6)|(4<<12); - } - } - } - for(i=0;i<4;i++) { - for(j=0;j<3;j++) { - int n = j + i * 3; - i_slen2[n+244] = i|(j<<3) | (5<<12); - n_slen2[n+500] = i|(j<<3) | (2<<12) | (1<<15); - } - } - - for(i=0;i<5;i++) { - for(j=0;j<5;j++) { - for(k=0;k<4;k++) { - for(l=0;l<4;l++) { - int n = l + k * 4 + j * 16 + i * 80; - n_slen2[n] = 
i|(j<<3)|(k<<6)|(l<<9)|(0<<12); - } - } - } - } - for(i=0;i<5;i++) { - for(j=0;j<5;j++) { - for(k=0;k<4;k++) { - int n = k + j * 4 + i * 20; - n_slen2[n+400] = i|(j<<3)|(k<<6)|(1<<12); - } - } - } -} - -/* - * read additional side information (for MPEG 1 and MPEG 2) - */ -static int III_get_side_info(struct III_sideinfo *si,int stereo, - int ms_stereo,int sfreq,int single,int lsf) -{ - int ch, gr; - int powdiff = (single == 3) ? 4 : 0; - - static const int tabs[2][5] = { { 2,9,5,3,4 } , { 1,8,1,2,9 } }; - const int *tab = tabs[lsf]; - - si->main_data_begin = getbits(tab[1]); - if (stereo == 1) - si->private_bits = getbits_fast(tab[2]); - else - si->private_bits = getbits_fast(tab[3]); - - if(!lsf) { - for (ch=0; chch[ch].gr[0].scfsi = -1; - si->ch[ch].gr[1].scfsi = getbits_fast(4); - } - } - - for (gr=0; grch[ch].gr[gr]); - - gr_info->part2_3_length = getbits(12); - gr_info->big_values = getbits(9); - if(gr_info->big_values > 288) { - fprintf(stderr,"big_values too large!\n"); - gr_info->big_values = 288; - } - gr_info->pow2gain = gainpow2+256 - getbits_fast(8) + powdiff; - if(ms_stereo) - gr_info->pow2gain += 2; - gr_info->scalefac_compress = getbits(tab[4]); - - if(get1bit()) { /* window switch flag */ - int i; -#ifdef L3_DEBUG -if(2*gr_info->big_values > bandInfo[sfreq].shortIdx[12]) - fprintf(stderr,"L3: BigValues too large, doesn't make sense %d %d\n",2*gr_info->big_values,bandInfo[sfreq].shortIdx[12]); -#endif - - gr_info->block_type = getbits_fast(2); - gr_info->mixed_block_flag = get1bit(); - gr_info->table_select[0] = getbits_fast(5); - gr_info->table_select[1] = getbits_fast(5); - /* - * table_select[2] not needed, because there is no region2, - * but to satisfy some verifications tools we set it either. 
- */ - gr_info->table_select[2] = 0; - for(i=0;i<3;i++) - gr_info->full_gain[i] = gr_info->pow2gain + (getbits_fast(3)<<3); - - if(gr_info->block_type == 0) { - fprintf(stderr,"Blocktype == 0 and window-switching == 1 not allowed.\n"); - return 0; - } - - /* region_count/start parameters are implicit in this case. */ - if(!lsf || gr_info->block_type == 2) - gr_info->region1start = 36>>1; - else { -/* check this again for 2.5 and sfreq=8 */ - if(sfreq == 8) - gr_info->region1start = 108>>1; - else - gr_info->region1start = 54>>1; - } - gr_info->region2start = 576>>1; - } - else { - int i,r0c,r1c; -#ifdef L3_DEBUG -if(2*gr_info->big_values > bandInfo[sfreq].longIdx[21]) - fprintf(stderr,"L3: BigValues too large, doesn't make sense %d %d\n",2*gr_info->big_values,bandInfo[sfreq].longIdx[21]); -#endif - for (i=0; i<3; i++) - gr_info->table_select[i] = getbits_fast(5); - r0c = getbits_fast(4); - r1c = getbits_fast(3); - gr_info->region1start = bandInfo[sfreq].longIdx[r0c+1] >> 1 ; - if(r0c + r1c + 2 > 22) - gr_info->region2start = 576>>1; - else - gr_info->region2start = bandInfo[sfreq].longIdx[r0c+1+r1c+1] >> 1; - gr_info->block_type = 0; - gr_info->mixed_block_flag = 0; - } - if(!lsf) - gr_info->preflag = get1bit(); - gr_info->scalefac_scale = get1bit(); - gr_info->count1table_select = get1bit(); - } - } - - return !0; -} - -/* - * read scalefactors - */ -static int III_get_scale_factors_1(int *scf,struct gr_info_s *gr_info) -{ - static const unsigned char slen[2][16] = { - {0, 0, 0, 0, 3, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4}, - {0, 1, 2, 3, 0, 1, 2, 3, 1, 2, 3, 1, 2, 3, 2, 3} - }; - int numbits; - int num0 = slen[0][gr_info->scalefac_compress]; - int num1 = slen[1][gr_info->scalefac_compress]; - - if (gr_info->block_type == 2) { - int i=18; - numbits = (num0 + num1) * 18; - - if (gr_info->mixed_block_flag) { - for (i=8;i;i--) - *scf++ = getbits_fast(num0); - i = 9; - numbits -= num0; /* num0 * 17 + num1 * 18 */ - } - - for (;i;i--) - *scf++ = getbits_fast(num0); - for (i 
= 18; i; i--) - *scf++ = getbits_fast(num1); - *scf++ = 0; *scf++ = 0; *scf++ = 0; /* short[13][0..2] = 0 */ - } - else { - int i; - int scfsi = gr_info->scfsi; - - if(scfsi < 0) { /* scfsi < 0 => granule == 0 */ - for(i=11;i;i--) - *scf++ = getbits_fast(num0); - for(i=10;i;i--) - *scf++ = getbits_fast(num1); - numbits = (num0 + num1) * 10 + num0; - *scf++ = 0; - } - else { - numbits = 0; - if(!(scfsi & 0x8)) { - for (i=0;i<6;i++) - *scf++ = getbits_fast(num0); - numbits += num0 * 6; - } - else { - scf += 6; - } - - if(!(scfsi & 0x4)) { - for (i=0;i<5;i++) - *scf++ = getbits_fast(num0); - numbits += num0 * 5; - } - else { - scf += 5; - } - - if(!(scfsi & 0x2)) { - for(i=0;i<5;i++) - *scf++ = getbits_fast(num1); - numbits += num1 * 5; - } - else { - scf += 5; - } - - if(!(scfsi & 0x1)) { - for (i=0;i<5;i++) - *scf++ = getbits_fast(num1); - numbits += num1 * 5; - } - else { - scf += 5; - } - *scf++ = 0; /* no l[21] in original sources */ - } - } - return numbits; -} - -static int III_get_scale_factors_2(int *scf,struct gr_info_s *gr_info,int i_stereo) -{ - unsigned char *pnt; - int i,j; - unsigned int slen; - int n = 0; - int numbits = 0; - - static unsigned char stab[3][6][4] = { - { { 6, 5, 5,5 } , { 6, 5, 7,3 } , { 11,10,0,0} , - { 7, 7, 7,0 } , { 6, 6, 6,3 } , { 8, 8,5,0} } , - { { 9, 9, 9,9 } , { 9, 9,12,6 } , { 18,18,0,0} , - {12,12,12,0 } , {12, 9, 9,6 } , { 15,12,9,0} } , - { { 6, 9, 9,9 } , { 6, 9,12,6 } , { 15,18,0,0} , - { 6,15,12,0 } , { 6,12, 9,6 } , { 6,18,9,0} } }; - - if(i_stereo) /* i_stereo AND second channel -> do_layer3() checks this */ - slen = i_slen2[gr_info->scalefac_compress>>1]; - else - slen = n_slen2[gr_info->scalefac_compress]; - - gr_info->preflag = (slen>>15) & 0x1; - - n = 0; - if( gr_info->block_type == 2 ) { - n++; - if(gr_info->mixed_block_flag) n++; - } - - pnt = stab[n][(slen>>12)&0x7]; - - for(i=0;i<4;i++) { - int num = slen & 0x7; - slen >>= 3; - if(num) { - for(j=0;j<(int)(pnt[i]);j++) *scf++ = getbits_fast(num); - numbits += 
pnt[i] * num; - } - else { - for(j=0;j<(int)(pnt[i]);j++) *scf++ = 0; - } - } - - n = (n << 1) + 1; - for(i=0;iscalefac_scale; - real *xrpnt = (real *) xr; - int l[3],l3; - int part2remain = gr_info->part2_3_length - part2bits; - int *me; - - int num=getbitoffset(); - long mask; - /* we must split this, because for num==0 the shift is undefined if you do it in one step */ - mask = ((unsigned long) getbits(num))<big_values; - int region1 = gr_info->region1start; - int region2 = gr_info->region2start; - - l3 = ((576>>1)-bv)>>1; -/* - * we may lose the 'odd' bit here !! - * check this later again - */ - if(bv <= region1) { - l[0] = bv; l[1] = l[2] = 0; - } - else { - l[0] = region1; - if(bv <= region2) { - l[1] = bv - l[0]; l[2] = 0; - } - else { - l[1] = region2 - l[0]; l[2] = bv - region2; - } - } - } - - if(gr_info->block_type == 2) { - /* - * decoding with short or mixed mode BandIndex table - */ - int i,max[4]; - int step=0,lwin=3,cb=0; - register real v = 0.0; - register int *m,mc; - - if(gr_info->mixed_block_flag) { - max[3] = -1; - max[0] = max[1] = max[2] = 2; - m = map[sfreq][0]; - me = mapend[sfreq][0]; - } - else { - max[0] = max[1] = max[2] = max[3] = -1; - /* max[3] not really needed in this case */ - m = map[sfreq][1]; - me = mapend[sfreq][1]; - } - - mc = 0; - for(i=0;i<2;i++) { - int lp = l[i]; - struct newhuff *h = ht+gr_info->table_select[i]; - for(;lp;lp--,mc--) { - register int x,y; - if( (!mc) ) { - mc = *m++; - xrpnt = ((real *) xr) + (*m++); - lwin = *m++; - cb = *m++; - if(lwin == 3) { - v = gr_info->pow2gain[(*scf++) << shift]; - step = 1; - } - else { - v = gr_info->full_gain[lwin][(*scf++) << shift]; - step = 3; - } - } - { - register short *val = h->table; - REFRESH_MASK; - while((y=*val++)<0) { - if (mask < 0) - val -= y; - num--; - mask <<= 1; - } - x = y >> 4; - y &= 0xf; - } - if(x == 15 && h->linbits) { - max[lwin] = cb; - REFRESH_MASK; - x += ((unsigned long) mask) >> (BITSHIFT+8-h->linbits); - num -= h->linbits+1; - mask <<= 
h->linbits; - if(mask < 0) - *xrpnt = REAL_MUL(-ispow[x], v); - else - *xrpnt = REAL_MUL(ispow[x], v); - mask <<= 1; - } - else if(x) { - max[lwin] = cb; - if(mask < 0) - *xrpnt = REAL_MUL(-ispow[x], v); - else - *xrpnt = REAL_MUL(ispow[x], v); - num--; - mask <<= 1; - } - else - *xrpnt = 0.0; - xrpnt += step; - if(y == 15 && h->linbits) { - max[lwin] = cb; - REFRESH_MASK; - y += ((unsigned long) mask) >> (BITSHIFT+8-h->linbits); - num -= h->linbits+1; - mask <<= h->linbits; - if(mask < 0) - *xrpnt = REAL_MUL(-ispow[y], v); - else - *xrpnt = REAL_MUL(ispow[y], v); - mask <<= 1; - } - else if(y) { - max[lwin] = cb; - if(mask < 0) - *xrpnt = REAL_MUL(-ispow[y], v); - else - *xrpnt = REAL_MUL(ispow[y], v); - num--; - mask <<= 1; - } - else - *xrpnt = 0.0; - xrpnt += step; - } - } - - for(;l3 && (part2remain+num > 0);l3--) { - struct newhuff *h = htc+gr_info->count1table_select; - register short *val = h->table,a; - - REFRESH_MASK; - while((a=*val++)<0) { - if (mask < 0) - val -= a; - num--; - mask <<= 1; - } - if(part2remain+num <= 0) { - num -= part2remain+num; - break; - } - - for(i=0;i<4;i++) { - if(!(i & 1)) { - if(!mc) { - mc = *m++; - xrpnt = ((real *) xr) + (*m++); - lwin = *m++; - cb = *m++; - if(lwin == 3) { - v = gr_info->pow2gain[(*scf++) << shift]; - step = 1; - } - else { - v = gr_info->full_gain[lwin][(*scf++) << shift]; - step = 3; - } - } - mc--; - } - if( (a & (0x8>>i)) ) { - max[lwin] = cb; - if(part2remain+num <= 0) { - break; - } - if(mask < 0) - *xrpnt = -v; - else - *xrpnt = v; - num--; - mask <<= 1; - } - else - *xrpnt = 0.0; - xrpnt += step; - } - } - - if(lwin < 3) { /* short band? 
*/ - while(1) { - for(;mc > 0;mc--) { - *xrpnt = 0.0; xrpnt += 3; /* short band -> step=3 */ - *xrpnt = 0.0; xrpnt += 3; - } - if(m >= me) - break; - mc = *m++; - xrpnt = ((real *) xr) + *m++; - if(*m++ == 0) - break; /* optimize: field will be set to zero at the end of the function */ - m++; /* cb */ - } - } - - gr_info->maxband[0] = max[0]+1; - gr_info->maxband[1] = max[1]+1; - gr_info->maxband[2] = max[2]+1; - gr_info->maxbandl = max[3]+1; - - { - int rmax = max[0] > max[1] ? max[0] : max[1]; - rmax = (rmax > max[2] ? rmax : max[2]) + 1; - gr_info->maxb = rmax ? shortLimit[sfreq][rmax] : longLimit[sfreq][max[3]+1]; - } - - } - else { - /* - * decoding with 'long' BandIndex table (block_type != 2) - */ - int *pretab = gr_info->preflag ? pretab1 : pretab2; - int i,max = -1; - int cb = 0; - int *m = map[sfreq][2]; - register real v = 0.0; - int mc = 0; - - /* - * long hash table values - */ - for(i=0;i<3;i++) { - int lp = l[i]; - struct newhuff *h = ht+gr_info->table_select[i]; - - for(;lp;lp--,mc--) { - int x,y; - - if(!mc) { - mc = *m++; - cb = *m++; -#ifdef CUT_HF - if(cb == 21) { - fprintf(stderr,"c"); - v = 0.0; - } - else -#endif - v = gr_info->pow2gain[((*scf++) + (*pretab++)) << shift]; - - } - { - register short *val = h->table; - REFRESH_MASK; - while((y=*val++)<0) { - if (mask < 0) - val -= y; - num--; - mask <<= 1; - } - x = y >> 4; - y &= 0xf; - } - - if (x == 15 && h->linbits) { - max = cb; - REFRESH_MASK; - x += ((unsigned long) mask) >> (BITSHIFT+8-h->linbits); - num -= h->linbits+1; - mask <<= h->linbits; - if(mask < 0) - *xrpnt++ = REAL_MUL(-ispow[x], v); - else - *xrpnt++ = REAL_MUL(ispow[x], v); - mask <<= 1; - } - else if(x) { - max = cb; - if(mask < 0) - *xrpnt++ = REAL_MUL(-ispow[x], v); - else - *xrpnt++ = REAL_MUL(ispow[x], v); - num--; - mask <<= 1; - } - else - *xrpnt++ = 0.0; - - if (y == 15 && h->linbits) { - max = cb; - REFRESH_MASK; - y += ((unsigned long) mask) >> (BITSHIFT+8-h->linbits); - num -= h->linbits+1; - mask <<= h->linbits; 
- if(mask < 0) - *xrpnt++ = REAL_MUL(-ispow[y], v); - else - *xrpnt++ = REAL_MUL(ispow[y], v); - mask <<= 1; - } - else if(y) { - max = cb; - if(mask < 0) - *xrpnt++ = REAL_MUL(-ispow[y], v); - else - *xrpnt++ = REAL_MUL(ispow[y], v); - num--; - mask <<= 1; - } - else - *xrpnt++ = 0.0; - } - } - - /* - * short (count1table) values - */ - for(;l3 && (part2remain+num > 0);l3--) { - struct newhuff *h = htc+gr_info->count1table_select; - register short *val = h->table,a; - - REFRESH_MASK; - while((a=*val++)<0) { - if (mask < 0) - val -= a; - num--; - mask <<= 1; - } - if(part2remain+num <= 0) { - num -= part2remain+num; - break; - } - - for(i=0;i<4;i++) { - if(!(i & 1)) { - if(!mc) { - mc = *m++; - cb = *m++; -#ifdef CUT_HF - if(cb == 21) { - fprintf(stderr,"c"); - v = 0.0; - } - else -#endif - v = gr_info->pow2gain[((*scf++) + (*pretab++)) << shift]; - } - mc--; - } - if ( (a & (0x8>>i)) ) { - max = cb; - if(part2remain+num <= 0) { - break; - } - if(mask < 0) - *xrpnt++ = -v; - else - *xrpnt++ = v; - num--; - mask <<= 1; - } - else - *xrpnt++ = 0.0; - } - } - - gr_info->maxbandl = max+1; - gr_info->maxb = longLimit[sfreq][gr_info->maxbandl]; - } - - part2remain += num; -// backbits(num); - bitindex -= num; wordpointer += (bitindex>>3); bitindex &= 0x7; - num = 0; - - while(xrpnt < &xr[SBLIMIT][0]) - *xrpnt++ = 0.0; - - while( part2remain > 16 ) { - getbits(16); /* Dismiss stuffing Bits */ - part2remain -= 16; - } - if(part2remain > 0) - getbits(part2remain); - else if(part2remain < 0) { - fprintf(stderr,"mpg123: Can't rewind stream by %d bits!\n",-part2remain); - return 1; /* -> error */ - } - return 0; -} - - - - -/* - * III_stereo: calculate real channel values for Joint-I-Stereo-mode - */ -static void III_i_stereo(real xr_buf[2][SBLIMIT][SSLIMIT],int *scalefac, - struct gr_info_s *gr_info,int sfreq,int ms_stereo,int lsf) -{ - real (*xr)[SBLIMIT*SSLIMIT] = (real (*)[SBLIMIT*SSLIMIT] ) xr_buf; - const struct bandInfoStruct *bi = &bandInfo[sfreq]; - - const real 
*tab1,*tab2; - - int tab; - static const real *tabs[3][2][2] = { - { { tan1_1,tan2_1 } , { tan1_2,tan2_2 } }, - { { pow1_1[0],pow2_1[0] } , { pow1_2[0],pow2_2[0] } } , - { { pow1_1[1],pow2_1[1] } , { pow1_2[1],pow2_2[1] } } - }; - - tab = lsf + (gr_info->scalefac_compress & lsf); - tab1 = tabs[tab][ms_stereo][0]; - tab2 = tabs[tab][ms_stereo][1]; -#if 0 - if(lsf) { - int p = gr_info->scalefac_compress & 0x1; - if(ms_stereo) { - tab1 = pow1_2[p]; tab2 = pow2_2[p]; - } - else { - tab1 = pow1_1[p]; tab2 = pow2_1[p]; - } - } - else { - if(ms_stereo) { - tab1 = tan1_2; tab2 = tan2_2; - } - else { - tab1 = tan1_1; tab2 = tan2_1; - } - } -#endif - -// printf("III_i_st: tab1=%p tab2=%p tab=%d ms=%d \n", tab1, tab2, tab, ms_stereo); - - if (gr_info->block_type == 2) { - int lwin,do_l = 0; - if( gr_info->mixed_block_flag ) - do_l = 1; - - for (lwin=0;lwin<3;lwin++) { /* process each window */ - /* get first band with zero values */ - int is_p,sb,idx,sfb = gr_info->maxband[lwin]; /* sfb is minimal 3 for mixed mode */ - if(sfb > 3) - do_l = 0; - - for(;sfb<12;sfb++) { - is_p = scalefac[sfb*3+lwin-gr_info->mixed_block_flag]; /* scale: 0-15 */ - if(is_p != 7) { - real t1,t2; - sb = bi->shortDiff[sfb]; - idx = bi->shortIdx[sfb] + lwin; - t1 = tab1[is_p]; t2 = tab2[is_p]; - for (; sb > 0; sb--,idx+=3) { - real v = xr[0][idx]; - xr[0][idx] = REAL_MUL(v, t1); - xr[1][idx] = REAL_MUL(v, t2); - } - } - } - -#if 1 -/* in the original: copy 10 to 11 , here: copy 11 to 12 -maybe still wrong??? (copy 12 to 13?) 
*/ - is_p = scalefac[11*3+lwin-gr_info->mixed_block_flag]; /* scale: 0-15 */ - sb = bi->shortDiff[12]; - idx = bi->shortIdx[12] + lwin; -#else - is_p = scalefac[10*3+lwin-gr_info->mixed_block_flag]; /* scale: 0-15 */ - sb = bi->shortDiff[11]; - idx = bi->shortIdx[11] + lwin; -#endif - if(is_p != 7) { - real t1,t2; - t1 = tab1[is_p]; t2 = tab2[is_p]; - for ( ; sb > 0; sb--,idx+=3 ) { - real v = xr[0][idx]; - xr[0][idx] = REAL_MUL(v, t1); - xr[1][idx] = REAL_MUL(v, t2); - } - } - } /* end for(lwin; .. ; . ) */ - -/* also check l-part, if ALL bands in the three windows are 'empty' - * and mode = mixed_mode - */ - if (do_l) { - int sfb = gr_info->maxbandl; - int idx = bi->longIdx[sfb]; - - for ( ; sfb<8; sfb++ ) { - int sb = bi->longDiff[sfb]; - int is_p = scalefac[sfb]; /* scale: 0-15 */ - if(is_p != 7) { - real t1,t2; - t1 = tab1[is_p]; t2 = tab2[is_p]; - for ( ; sb > 0; sb--,idx++) { - real v = xr[0][idx]; - xr[0][idx] = REAL_MUL(v, t1); - xr[1][idx] = REAL_MUL(v, t2); - } - } - else - idx += sb; - } - } - } - else { /* ((gr_info->block_type != 2)) */ - int sfb = gr_info->maxbandl; - int is_p,idx = bi->longIdx[sfb]; - -/* hmm ... maybe the maxbandl stuff for i-stereo is buggy? */ - if(sfb <= 21) { - for ( ; sfb<21; sfb++) { - int sb = bi->longDiff[sfb]; - is_p = scalefac[sfb]; /* scale: 0-15 */ - if(is_p != 7) { - real t1,t2; - t1 = tab1[is_p]; t2 = tab2[is_p]; - for ( ; sb > 0; sb--,idx++) { - real v = xr[0][idx]; - xr[0][idx] = REAL_MUL(v, t1); - xr[1][idx] = REAL_MUL(v, t2); - } - } - else - idx += sb; - } - - is_p = scalefac[20]; - if(is_p != 7) { /* copy l-band 20 to l-band 21 */ - int sb; - real t1 = tab1[is_p],t2 = tab2[is_p]; - - for ( sb = bi->longDiff[21]; sb > 0; sb--,idx++ ) { - real v = xr[0][idx]; - xr[0][idx] = REAL_MUL(v, t1); - xr[1][idx] = REAL_MUL(v, t2); - } - } - } /* end: if(sfb <= 21) */ - } /* ... 
*/ -} - -static void III_antialias(real xr[SBLIMIT][SSLIMIT],struct gr_info_s *gr_info) { - int sblim; - - if(gr_info->block_type == 2) { - if(!gr_info->mixed_block_flag) - return; - sblim = 1; - } - else { - sblim = gr_info->maxb-1; - } - - /* 31 alias-reduction operations between each pair of sub-bands */ - /* with 8 butterflies between each pair */ - - { - int sb; - real *xr1=(real *) xr[1]; - - for(sb=sblim;sb;sb--,xr1+=10) { - int ss; - real *cs=aa_cs,*ca=aa_ca; - real *xr2 = xr1; - - for(ss=7;ss>=0;ss--) { /* upper and lower butterfly inputs */ - register real bu = *--xr2,bd = *xr1; - *xr2 = (bu * (*cs) ) - (bd * (*ca) ); - *xr1++ = (bd * (*cs++) ) + (bu * (*ca++) ); - } - } - - } -} - -#include "dct64.c" -#include "dct36.c" -#include "dct12.c" - -#include "decod386.c" - -/* - * III_hybrid - */ - -static dct36_func_t dct36_func; - -static void III_hybrid(real fsIn[SBLIMIT][SSLIMIT],real tsOut[SSLIMIT][SBLIMIT], - int ch,struct gr_info_s *gr_info) -{ - real *tspnt = (real *) tsOut; - static real block[2][2][SBLIMIT*SSLIMIT] = { { { 0, } } }; - static int blc[2]={0,0}; - real *rawout1,*rawout2; - int bt; - int sb = 0; - - { - int b = blc[ch]; - rawout1=block[b][ch]; - b=-b+1; - rawout2=block[b][ch]; - blc[ch] = b; - } - - if(gr_info->mixed_block_flag) { - sb = 2; - (*dct36_func)(fsIn[0],rawout1,rawout2,win[0],tspnt); - (*dct36_func)(fsIn[1],rawout1+18,rawout2+18,win1[0],tspnt+1); - rawout1 += 36; rawout2 += 36; tspnt += 2; - } - - bt = gr_info->block_type; - if(bt == 2) { - for (; sbmaxb; sb+=2,tspnt+=2,rawout1+=36,rawout2+=36) { - dct12(fsIn[sb],rawout1,rawout2,win[2],tspnt); - dct12(fsIn[sb+1],rawout1+18,rawout2+18,win1[2],tspnt+1); - } - } - else { - for (; sbmaxb; sb+=2,tspnt+=2,rawout1+=36,rawout2+=36) { - (*dct36_func)(fsIn[sb],rawout1,rawout2,win[bt],tspnt); - (*dct36_func)(fsIn[sb+1],rawout1+18,rawout2+18,win1[bt],tspnt+1); - } - } - - for(;sbstereo; - int ms_stereo,i_stereo; - int sfreq = fr->sampling_frequency; - int stereo1,granules; - -// if 
(fr->error_protection) getbits(16); /* skip crc */ - - if(stereo == 1) { /* stream is mono */ - stereo1 = 1; - single = 0; - } else - if(single >= 0) /* stream is stereo, but force to mono */ - stereo1 = 1; - else - stereo1 = 2; - - if(fr->mode == MPG_MD_JOINT_STEREO) { - ms_stereo = (fr->mode_ext & 0x2)>>1; - i_stereo = fr->mode_ext & 0x1; - } else - ms_stereo = i_stereo = 0; - - if(!III_get_side_info(&sideinfo,stereo,ms_stereo,sfreq,single,fr->lsf)) - return -1; - - set_pointer(sideinfo.main_data_begin); - - granules = (fr->lsf) ? 1 : 2; - for (gr=0;grlsf) - part2bits = III_get_scale_factors_2(scalefacs[0],gr_info,0); - else - part2bits = III_get_scale_factors_1(scalefacs[0],gr_info); - if(III_dequantize_sample(hybridIn[0], scalefacs[0],gr_info,sfreq,part2bits)) - return clip; - } - - if(stereo == 2) { - struct gr_info_s *gr_info = &(sideinfo.ch[1].gr[gr]); - - int part2bits; - if(fr->lsf) - part2bits = III_get_scale_factors_2(scalefacs[1],gr_info,i_stereo); - else - part2bits = III_get_scale_factors_1(scalefacs[1],gr_info); - - if(III_dequantize_sample(hybridIn[1],scalefacs[1],gr_info,sfreq,part2bits)) - return clip; - - if(ms_stereo) { - int i; - int maxb = sideinfo.ch[0].gr[gr].maxb; - if(sideinfo.ch[1].gr[gr].maxb > maxb) - maxb = sideinfo.ch[1].gr[gr].maxb; - for(i=0;ilsf); - - if(ms_stereo || i_stereo || (single == 3) ) { - if(gr_info->maxb > sideinfo.ch[0].gr[gr].maxb) - sideinfo.ch[0].gr[gr].maxb = gr_info->maxb; - else - gr_info->maxb = sideinfo.ch[0].gr[gr].maxb; - } - - switch(single) { - case 3: { - register int i; - register real *in0 = (real *) hybridIn[0],*in1 = (real *) hybridIn[1]; - for(i=0;imaxb;i++,in0++) - *in0 = (*in0 + *in1++); /* *0.5 done by pow-scale */ - break; } - case 1: { - register int i; - register real *in0 = (real *) hybridIn[0],*in1 = (real *) hybridIn[1]; - for(i=0;imaxb;i++) - *in0++ = *in1++; - break; } - } - - } // if(stereo == 2) - - for(ch=0;ch= 0) { - clip += (fr->synth_mono)(hybridOut[0][ss],pcm_sample,&pcm_point); - } 
else { - int p1 = pcm_point; - clip += (fr->synth)(hybridOut[0][ss],0,pcm_sample,&p1); - clip += (fr->synth)(hybridOut[1][ss],1,pcm_sample,&pcm_point); - } - } - - } - - return clip; -} diff -r bc0898c7399b -r b924f0df5a1d mp3lib/mp3.h --- a/mp3lib/mp3.h Sun Oct 21 11:14:13 2012 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,39 +0,0 @@ -/* MP3 Player Library 2.0 (C) 1999 A'rpi/Astral&ESP-team */ - -#ifndef MPLAYER_MP3LIB_MP3_H -#define MPLAYER_MP3LIB_MP3_H - -/* decoder level: */ -#ifdef CONFIG_FAKE_MONO -void MP3_Init(int fakemono); -#else -void MP3_Init(void); -#endif -int MP3_Open(char *filename, int buffsize); -void MP3_SeekFrame(int num, int dir); -void MP3_SeekForward(int num); -int MP3_PrintTAG(void); -int MP3_DecodeFrame(unsigned char *hova, short single); -int MP3_FillBuffers(void); -void MP3_PrintHeader(void); -void MP3_Close(void); -/* public variables: */ -extern int MP3_eof; // set if EOF reached -extern int MP3_pause; // lock playing -/* informational: */ -extern int MP3_filesize; // filesize -extern int MP3_frames; // current frame no -extern int MP3_fpos; // current file pos -extern int MP3_framesize; // current framesize in bytes (including header) -extern int MP3_bitrate; // current bitrate (kbits) -extern int MP3_samplerate; // current sampling freq (Hz) -extern int MP3_channels; -extern int MP3_bps; - -/* player level: */ -int MP3_OpenDevice(char *devname); /* devname can be NULL for default) */ -void MP3_Play(void); -void MP3_Stop(void); -void MP3_CloseDevice(void); - -#endif /* MPLAYER_MP3LIB_MP3_H */ diff -r bc0898c7399b -r b924f0df5a1d mp3lib/mpg123.h --- a/mp3lib/mpg123.h Sun Oct 21 11:14:13 2012 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,144 +0,0 @@ -/* - * Modified for use with MPlayer, for details see the changelog at - * http://svn.mplayerhq.hu/mplayer/trunk/ - * $Id$ - */ - -/* - * mpg123 defines - * used source: musicout.h from mpegaudio package - */ - -#ifndef MPLAYER_MP3LIB_MPG123_H -#define 
MPLAYER_MP3LIB_MPG123_H - -#include "config.h" - -#ifndef M_PI -#define M_PI 3.141592653589793238462 -#endif -#ifndef M_SQRT2 -#define M_SQRT2 1.414213562373095048802 -#endif -#define REAL_IS_FLOAT -#define NEW_DCT9 - -#undef MPG123_REMOTE /* Get rid of this stuff for Win32 */ - -typedef float real; - -/* -# define real float -# define real long double -# define real double -#include "audio.h" - -// #define AUDIOBUFSIZE 4096 -*/ - -#define FALSE 0 -#define TRUE 1 - -#define MAX_NAME_SIZE 81 -#define SBLIMIT 32 -#define SCALE_BLOCK 12 -#define SSLIMIT 18 - -#define MPG_MD_STEREO 0 -#define MPG_MD_JOINT_STEREO 1 -#define MPG_MD_DUAL_CHANNEL 2 -#define MPG_MD_MONO 3 - -/* #define MAXOUTBURST 32768 */ - -/* Pre Shift fo 16 to 8 bit converter table */ -#define AUSHIFT (3) - -struct al_table -{ - short bits; - short d; -}; - -struct frame { - struct al_table *alloc; - int (*synth)(real *,int,unsigned char *,int *); - int (*synth_mono)(real *,unsigned char *,int *); - int stereo; - int jsbound; - int single; - int II_sblimit; - int down_sample_sblimit; - int lsf; - int mpeg25; - int down_sample; - int header_change; - int lay; - int error_protection; - int bitrate_index; - int sampling_frequency; - int padding; - int extension; - int mode; - int mode_ext; - int copyright; - int original; - int emphasis; - int framesize; /* computed framesize */ -}; - - -struct gr_info_s { - int scfsi; - unsigned part2_3_length; - unsigned big_values; - unsigned scalefac_compress; - unsigned block_type; - unsigned mixed_block_flag; - unsigned table_select[3]; - unsigned subblock_gain[3]; - unsigned maxband[3]; - unsigned maxbandl; - unsigned maxb; - unsigned region1start; - unsigned region2start; - unsigned preflag; - unsigned scalefac_scale; - unsigned count1table_select; - real *full_gain[3]; - real *pow2gain; -}; - -struct III_sideinfo -{ - unsigned main_data_begin; - unsigned private_bits; - struct { - struct gr_info_s gr[2]; - } ch[2]; -}; - -extern real mp3lib_decwin[(512+32)]; 
-extern real *mp3lib_pnts[]; - -int synth_1to1_pent( real *, int, short * ); -int synth_1to1_MMX( real *, int, short * ); -int synth_1to1_MMX_s(real *, int, short *, short *, int *); - -void dct36_3dnow(real *, real *, real *, real *, real *); -void dct36_3dnowex(real *, real *, real *, real *, real *); -void dct36_sse(real *, real *, real *, real *, real *); - -void dct64_MMX(short *, short *, real *); -void dct64_MMX_3dnow(short *, short *, real *); -void dct64_MMX_3dnowex(short *, short *, real *); -void dct64_sse(short *, short *, real *); -void dct64_altivec(real *, real *, real *); -extern void (*dct64_MMX_func)(short *, short *, real *); - -void mp3lib_dct64(real *, real *, real *); - -typedef int (*synth_func_t)( real *,int,short * ); -typedef void (*dct36_func_t)(real *,real *,real *,real *,real *); - -#endif /* MPLAYER_MP3LIB_MPG123_H */ diff -r bc0898c7399b -r b924f0df5a1d mp3lib/sr1.c --- a/mp3lib/sr1.c Sun Oct 21 11:14:13 2012 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,605 +0,0 @@ -// #define NEWBUFFERING -//#define DEBUG_RESYNC - -/* 1 frame = 4608 byte PCM */ - -#define LOCAL static inline - -//#undef LOCAL -//#define LOCAL - -#include -#include -#include -#include - -#include "mpg123.h" -#include "huffman.h" -#include "mp3.h" -#include "mpbswap.h" -#include "cpudetect.h" -#include "mp_msg.h" -#include "libmpcodecs/ad_mp3lib.h" -#include "libvo/fastmemcpy.h" - -#include "libavutil/common.h" - -#if ARCH_X86_64 -// 3DNow! 
and 3DNow!Ext routines don't compile under AMD64 -#undef HAVE_AMD3DNOW -#undef HAVE_AMD3DNOWEXT -#define HAVE_AMD3DNOW 0 -#define HAVE_AMD3DNOWEXT 0 -#endif - -//static FILE* mp3_file=NULL; - -int MP3_frames=0; -int MP3_eof=0; -int MP3_pause=0; -int MP3_filesize=0; -int MP3_fpos=0; // current file position -int MP3_framesize=0; // current framesize -int MP3_bitrate=0; // current bitrate -int MP3_samplerate=0; // current samplerate -int MP3_resync=0; -int MP3_channels=0; -int MP3_bps=2; - -static long outscale = 32768; -#include "tabinit.c" - -#if 1 -LOCAL int mp3_read(char *buf,int size){ -// int len=fread(buf,1,size,mp3_file); - int len=mplayer_audio_read(buf,size); - if(len>0) MP3_fpos+=len; -// if(len!=size) MP3_eof=1; - return len; -} -#else -int mp3_read(char *buf,int size); -#endif -/* - * Modified for use with MPlayer, for details see the changelog at - * http://svn.mplayerhq.hu/mplayer/trunk/ - * $Id$ - */ - - -//void mp3_seek(int pos){ -// fseek(mp3_file,pos,SEEK_SET); -// return MP3_fpos = ftell(mp3_file); -//} - -/* Frame reader */ - -#define MAXFRAMESIZE 1280 -#define MAXFRAMESIZE2 (512+MAXFRAMESIZE) - -static int fsizeold=0,ssize=0; -static unsigned char bsspace[2][MAXFRAMESIZE2]; /* !!!!! 
*/ -static unsigned char *bsbufold=bsspace[0]+512; -static unsigned char *bsbuf=bsspace[1]+512; -static int bsnum=0; - -static int bitindex; -static unsigned char *wordpointer; -static int bitsleft; - -static unsigned char *pcm_sample; /* outbuffer address */ -static int pcm_point = 0; /* outbuffer offset */ - -static struct frame fr; - -static int tabsel_123[2][3][16] = { - { {0,32,64,96,128,160,192,224,256,288,320,352,384,416,448,}, - {0,32,48,56, 64, 80, 96,112,128,160,192,224,256,320,384,}, - {0,32,40,48, 56, 64, 80, 96,112,128,160,192,224,256,320,} }, - - { {0,32,48,56,64,80,96,112,128,144,160,176,192,224,256,}, - {0,8,16,24,32,40,48,56,64,80,96,112,128,144,160,}, - {0,8,16,24,32,40,48,56,64,80,96,112,128,144,160,} } -}; - -static int freqs[9] = { 44100, 48000, 32000, 22050, 24000, 16000 , 11025 , 12000 , 8000 }; - -LOCAL unsigned int getbits(short number_of_bits) -{ - unsigned rval; -// if(MP3_frames>=7741) printf("getbits: bits=%d bitsleft=%d wordptr=%x\n",number_of_bits,bitsleft,wordpointer); - if((bitsleft-=number_of_bits)<0) return 0; - if(!number_of_bits) return 0; - rval = wordpointer[0]; - rval <<= 8; - rval |= wordpointer[1]; - rval <<= 8; - rval |= wordpointer[2]; - rval <<= bitindex; - rval &= 0xffffff; - bitindex += number_of_bits; - rval >>= (24-number_of_bits); - wordpointer += (bitindex>>3); - bitindex &= 7; - return rval; -} - - -LOCAL unsigned int getbits_fast(short number_of_bits) -{ - unsigned rval; -// if(MP3_frames>=7741) printf("getbits_fast: bits=%d bitsleft=%d wordptr=%x\n",number_of_bits,bitsleft,wordpointer); - if((bitsleft-=number_of_bits)<0) return 0; - if(!number_of_bits) return 0; -#if ARCH_X86 - rval = bswap_16(*((uint16_t *)wordpointer)); -#else - /* - * we may not be able to address unaligned 16-bit data on non-x86 cpus. - * Fall back to some portable code. 
- */ - rval = wordpointer[0] << 8 | wordpointer[1]; -#endif - rval <<= bitindex; - rval &= 0xffff; - bitindex += number_of_bits; - rval >>= (16-number_of_bits); - wordpointer += (bitindex>>3); - bitindex &= 7; - return rval; -} - -LOCAL unsigned int get1bit(void) -{ - unsigned char rval; -// if(MP3_frames>=7741) printf("get1bit: bitsleft=%d wordptr=%x\n",bitsleft,wordpointer); - if((--bitsleft)<0) return 0; - rval = *wordpointer << bitindex; - bitindex++; - wordpointer += (bitindex>>3); - bitindex &= 7; - return (rval >> 7) & 1; -} - -LOCAL void set_pointer(int backstep) -{ -// if(backstep!=512 && backstep>fsizeold) -// printf("\rWarning! backstep (%d>%d) \n",backstep,fsizeold); - wordpointer = bsbuf + ssize - backstep; - if (backstep) fast_memcpy(wordpointer,bsbufold+fsizeold-backstep,backstep); - bitindex = 0; - bitsleft+=8*backstep; -// printf("Backstep %d (bitsleft=%d)\n",backstep,bitsleft); -} - -LOCAL int stream_head_read(unsigned char *hbuf,uint32_t *newhead){ - if(mp3_read(hbuf,4) != 4) return FALSE; -#if ARCH_X86 - *newhead = bswap_32(*((uint32_t*)hbuf)); -#else - /* - * we may not be able to address unaligned 32-bit data on non-x86 cpus. - * Fall back to some portable code. - */ - *newhead = - hbuf[0] << 24 | - hbuf[1] << 16 | - hbuf[2] << 8 | - hbuf[3]; -#endif - return TRUE; -} - -LOCAL int stream_head_shift(unsigned char *hbuf,uint32_t *head){ - *((uint32_t*)hbuf) >>= 8; - if(mp3_read(hbuf+3,1) != 1) return 0; - *head <<= 8; - *head |= hbuf[3]; - return 1; -} - -/* - * decode a header and write the information - * into the frame structure - */ -LOCAL int decode_header(struct frame *fr,uint32_t newhead){ - - // head_check: - if( (newhead & 0xffe00000) != 0xffe00000 || - (newhead & 0x0000fc00) == 0x0000fc00) return FALSE; - - fr->lay = 4-((newhead>>17)&3); -// if(fr->lay!=3) return FALSE; - - if( newhead & (1<<20) ) { - fr->lsf = (newhead & (1<<19)) ? 
0x0 : 0x1; - fr->mpeg25 = 0; - } else { - fr->lsf = 1; - fr->mpeg25 = 1; - } - - if(fr->mpeg25) - fr->sampling_frequency = 6 + ((newhead>>10)&0x3); - else - fr->sampling_frequency = ((newhead>>10)&0x3) + (fr->lsf*3); - - if(fr->sampling_frequency>8) return FALSE; // valid: 0..8 - - fr->error_protection = ((newhead>>16)&0x1)^0x1; - fr->bitrate_index = ((newhead>>12)&0xf); - fr->padding = ((newhead>>9)&0x1); - fr->extension = ((newhead>>8)&0x1); - fr->mode = ((newhead>>6)&0x3); - fr->mode_ext = ((newhead>>4)&0x3); - fr->copyright = ((newhead>>3)&0x1); - fr->original = ((newhead>>2)&0x1); - fr->emphasis = newhead & 0x3; - - MP3_channels = fr->stereo = (fr->mode == MPG_MD_MONO) ? 1 : 2; - - if(!fr->bitrate_index){ -// fprintf(stderr,"Free format not supported.\n"); - return FALSE; - } - -switch(fr->lay){ - case 2: - MP3_bitrate=tabsel_123[fr->lsf][1][fr->bitrate_index]; - MP3_samplerate=freqs[fr->sampling_frequency]; - fr->framesize = MP3_bitrate * 144000; - fr->framesize /= MP3_samplerate; - MP3_framesize=fr->framesize; - fr->framesize += fr->padding - 4; - break; - case 3: - if(fr->lsf) - ssize = (fr->stereo == 1) ? 9 : 17; - else - ssize = (fr->stereo == 1) ? 17 : 32; - if(fr->error_protection) ssize += 2; - - MP3_bitrate=tabsel_123[fr->lsf][2][fr->bitrate_index]; - MP3_samplerate=freqs[fr->sampling_frequency]; - fr->framesize = MP3_bitrate * 144000; - fr->framesize /= MP3_samplerate<<(fr->lsf); - MP3_framesize=fr->framesize; - fr->framesize += fr->padding - 4; - break; - case 1: -// fr->jsbound = (fr->mode == MPG_MD_JOINT_STEREO) ? 
(fr->mode_ext<<2)+4 : 32; - MP3_bitrate=tabsel_123[fr->lsf][0][fr->bitrate_index]; - MP3_samplerate=freqs[fr->sampling_frequency]; - fr->framesize = MP3_bitrate * 12000; - fr->framesize /= MP3_samplerate; - MP3_framesize = ((fr->framesize+fr->padding)<<2); - fr->framesize = MP3_framesize-4; -// printf("framesize=%d\n",fr->framesize); - break; - default: - MP3_framesize=fr->framesize=0; -// fprintf(stderr,"Sorry, unsupported layer type.\n"); - return 0; -} - if(fr->framesize<=0 || fr->framesize>MAXFRAMESIZE) return FALSE; - - return 1; -} - - -LOCAL int stream_read_frame_body(int size){ - - /* flip/init buffer for Layer 3 */ - bsbufold = bsbuf; - bsbuf = bsspace[bsnum]+512; - bsnum = (bsnum + 1) & 1; - - if( mp3_read(bsbuf,size) != size) return 0; // broken frame - - bitindex = 0; - wordpointer = (unsigned char *) bsbuf; - bitsleft=8*size; - - return 1; -} - - -/***************************************************************** - * read next frame return number of frames read. - */ -LOCAL int read_frame(struct frame *fr){ - uint32_t newhead; - union { - unsigned char buf[8]; - unsigned long dummy; // for alignment - } hbuf; - int skipped,resyncpos; - int frames=0; - -resync: - skipped=MP3_fpos; - resyncpos=MP3_fpos; - - set_pointer(512); - fsizeold=fr->framesize; /* for Layer3 */ - if(!stream_head_read(hbuf.buf,&newhead)) return 0; - if(!decode_header(fr,newhead)){ - // invalid header! try to resync stream! -#ifdef DEBUG_RESYNC - printf("ReSync: searching for a valid header... 
(pos=%X)\n",MP3_fpos); -#endif -retry1: - while(!decode_header(fr,newhead)){ - if(!stream_head_shift(hbuf.buf,&newhead)) return 0; - } - resyncpos=MP3_fpos-4; - // found valid header -#ifdef DEBUG_RESYNC - printf("ReSync: found valid hdr at %X fsize=%ld ",resyncpos,fr->framesize); -#endif - if(!stream_read_frame_body(fr->framesize)) return 0; // read body - set_pointer(512); - fsizeold=fr->framesize; /* for Layer3 */ - if(!stream_head_read(hbuf.buf,&newhead)) return 0; - if(!decode_header(fr,newhead)){ - // invalid hdr! go back... -#ifdef DEBUG_RESYNC - printf("INVALID\n"); -#endif -// mp3_seek(resyncpos+1); - if(!stream_head_read(hbuf.buf,&newhead)) return 0; - goto retry1; - } -#ifdef DEBUG_RESYNC - printf("OK!\n"); - ++frames; -#endif - } - - skipped=resyncpos-skipped; -// if(skipped && !MP3_resync) printf("\r%d bad bytes skipped (resync at 0x%X) \n",skipped,resyncpos); - -// printf("%8X [%08X] %d %d (%d)%s%s\n",MP3_fpos-4,newhead,fr->framesize,fr->mode,fr->mode_ext,fr->error_protection?" CRC":"",fr->padding?" PAD":""); - - /* read main data into memory */ - if(!stream_read_frame_body(fr->framesize)){ - printf("\nBroken frame at 0x%X \n",resyncpos); - return 0; - } - ++frames; - - if(MP3_resync){ - MP3_resync=0; - if(frames==1) goto resync; - } - - return frames; -} - -static int _has_mmx = 0; // used by layer2.c, layer3.c to pre-scale coeffs - -/******************************************************************************/ -/* PUBLIC FUNCTIONS */ -/******************************************************************************/ - -void (*dct64_MMX_func)(short *, short *, real *); - -#include "layer2.c" -#include "layer3.c" -#include "layer1.c" - -#include "cpudetect.h" - -// Init decoder tables. Call first, once! -#ifdef CONFIG_FAKE_MONO -void MP3_Init(int fakemono){ -#else -void MP3_Init(void){ -#endif - -//gCpuCaps.hasMMX=gCpuCaps.hasMMX2=gCpuCaps.hasSSE=0; // for testing! 
- - _has_mmx = 0; - dct36_func = dct36; - - make_decode_tables(outscale); - -#if HAVE_MMX - if (gCpuCaps.hasMMX) - { - _has_mmx = 1; - synth_func = synth_1to1_MMX; - } -#endif - -#if HAVE_AMD3DNOWEXT - if (gCpuCaps.has3DNowExt) - { - dct36_func=dct36_3dnowex; - dct64_MMX_func= dct64_MMX_3dnowex; - mp_msg(MSGT_DECAUDIO,MSGL_V,"mp3lib: using 3DNow!Ex optimized decore!\n"); - } - else -#endif -#if HAVE_AMD3DNOW - if (gCpuCaps.has3DNow) - { - dct36_func = dct36_3dnow; - dct64_MMX_func = dct64_MMX_3dnow; - mp_msg(MSGT_DECAUDIO,MSGL_V,"mp3lib: using 3DNow! optimized decore!\n"); - } - else -#endif -#if HAVE_SSE - if (gCpuCaps.hasSSE) - { - dct64_MMX_func = dct64_sse; - mp_msg(MSGT_DECAUDIO,MSGL_V,"mp3lib: using SSE optimized decore!\n"); - } - else -#endif -#if ARCH_X86_32 -#if HAVE_MMX - if (gCpuCaps.hasMMX) - { - dct64_MMX_func = dct64_MMX; - mp_msg(MSGT_DECAUDIO,MSGL_V,"mp3lib: using MMX optimized decore!\n"); - } - else -#endif - if (gCpuCaps.cpuType >= CPUTYPE_I586) - { - synth_func = synth_1to1_pent; - mp_msg(MSGT_DECAUDIO,MSGL_V,"mp3lib: using Pentium optimized decore!\n"); - } - else -#endif /* ARCH_X86_32 */ -#if HAVE_ALTIVEC - if (gCpuCaps.hasAltiVec) - { - mp_msg(MSGT_DECAUDIO,MSGL_V,"mp3lib: using AltiVec optimized decore!\n"); - } - else -#endif - { - synth_func = NULL; /* use default c version */ - mp_msg(MSGT_DECAUDIO,MSGL_V,"mp3lib: using generic C decore!\n"); - } - -#ifdef CONFIG_FAKE_MONO - if (fakemono == 1) - fr.synth=synth_1to1_l; - else if (fakemono == 2) - fr.synth=synth_1to1_r; - else - fr.synth=synth_1to1; -#else - fr.synth=synth_1to1; -#endif - fr.synth_mono=synth_1to1_mono2stereo; - fr.down_sample=0; - fr.down_sample_sblimit = SBLIMIT>>(fr.down_sample); - - init_layer2(); - init_layer3(fr.down_sample_sblimit); - mp_msg(MSGT_DECAUDIO,MSGL_V,"MP3lib: init layer2&3 finished, tables done\n"); -} - -#if 0 - -void MP3_Close(void){ - MP3_eof=1; - if(mp3_file) fclose(mp3_file); - mp3_file=NULL; -} - -// Open a file, init buffers. Call once per file! 
-int MP3_Open(char *filename,int buffsize){ - MP3_eof=1; // lock decoding - MP3_pause=1; // lock playing - if(mp3_file) MP3_Close(); // close prev. file - MP3_frames=0; - - mp3_file=fopen(filename,"rb"); -// printf("MP3_Open: file='%s'",filename); -// if(!mp3_file){ printf(" not found!\n"); return 0;} else printf("Ok!\n"); - if(!mp3_file) return 0; - - MP3_filesize=MP3_PrintTAG(); - fseek(mp3_file,0,SEEK_SET); - - MP3_InitBuffers(buffsize); - if(!tables_done_flag) MP3_Init(); - MP3_eof=0; // allow decoding - MP3_pause=0; // allow playing - return MP3_filesize; -} - -#endif - -// Read & decode a single frame. Called by sound driver. -int MP3_DecodeFrame(unsigned char *hova,short single){ - pcm_sample = hova; - pcm_point = 0; - if(!read_frame(&fr)) return 0; - if(single==-2){ set_pointer(512); return 1; } - if(fr.error_protection) getbits(16); /* skip crc */ - fr.single=single; - switch(fr.lay){ - case 2: do_layer2(&fr,single);break; - case 3: do_layer3(&fr,single);break; - case 1: do_layer1(&fr,single);break; - default: - return 0; // unsupported - } -// ++MP3_frames; - return pcm_point ? pcm_point : 2; -} - -// Prints last frame header in ascii. -void MP3_PrintHeader(void){ - static char *modes[4] = { "Stereo", "Joint-Stereo", "Dual-Channel", "Single-Channel" }; - static char *layers[4] = { "???" , "I", "II", "III" }; - - mp_msg(MSGT_DECAUDIO,MSGL_V,"\rMPEG %s, Layer %s, %d Hz %d kbit %s, BPF: %d\n", - fr.mpeg25 ? "2.5" : (fr.lsf ? "2.0" : "1.0"), - layers[fr.lay],freqs[fr.sampling_frequency], - tabsel_123[fr.lsf][fr.lay-1][fr.bitrate_index], - modes[fr.mode],fr.framesize+4); - mp_msg(MSGT_DECAUDIO,MSGL_V,"Channels: %d, copyright: %s, original: %s, CRC: %s, emphasis: %d\n", - fr.stereo,fr.copyright?"Yes":"No", - fr.original?"Yes":"No",fr.error_protection?"Yes":"No", - fr.emphasis); -} - -#if 0 -#include "genre.h" - -// Read & print ID3 TAG. Do not call when playing!!! returns filesize. 
-int MP3_PrintTAG(void){ - struct id3tag { - char tag[3]; - char title[30]; - char artist[30]; - char album[30]; - char year[4]; - char comment[30]; - unsigned char genre; - }; - struct id3tag tag; - char title[31]={0,}; - char artist[31]={0,}; - char album[31]={0,}; - char year[5]={0,}; - char comment[31]={0,}; - char genre[31]={0,}; - int fsize; - int ret; - - fseek(mp3_file,0,SEEK_END); - fsize=ftell(mp3_file); - if(fseek(mp3_file,-128,SEEK_END)) return fsize; - ret=fread(&tag,128,1,mp3_file); - if(ret!=1 || tag.tag[0]!='T' || tag.tag[1]!='A' || tag.tag[2]!='G') return fsize; - - strncpy(title,tag.title,30); - strncpy(artist,tag.artist,30); - strncpy(album,tag.album,30); - strncpy(year,tag.year,4); - strncpy(comment,tag.comment,30); - - if ( tag.genre <= sizeof(genre_table)/sizeof(*genre_table) ) { - strncpy(genre, genre_table[tag.genre], 30); - } else { - strncpy(genre,"Unknown",30); - } - -// printf("\n"); - printf("Title : %30s Artist: %s\n",title,artist); - printf("Album : %30s Year : %4s\n",album,year); - printf("Comment: %30s Genre : %s\n",comment,genre); - printf("\n"); - return fsize-128; -} - -#endif diff -r bc0898c7399b -r b924f0df5a1d mp3lib/tabinit.c --- a/mp3lib/tabinit.c Sun Oct 21 11:14:13 2012 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,75 +0,0 @@ -/* - * Modified for use with MPlayer, for details see the changelog at - * http://svn.mplayerhq.hu/mplayer/trunk/ - * $Id$ - */ - -#include "mpg123.h" - -real mp3lib_decwin[(512+32)]; -static real cos64[32], cos32[16], cos16[8], cos8[4], cos4[2]; -real *mp3lib_pnts[]={ cos64,cos32,cos16,cos8,cos4 }; - -static int intwinbase[] = { - 0, -1, -1, -1, -1, -1, -1, -2, -2, -2, - -2, -3, -3, -4, -4, -5, -5, -6, -7, -7, - -8, -9, -10, -11, -13, -14, -16, -17, -19, -21, - -24, -26, -29, -31, -35, -38, -41, -45, -49, -53, - -58, -63, -68, -73, -79, -85, -91, -97, -104, -111, - -117, -125, -132, -139, -147, -154, -161, -169, -176, -183, - -190, -196, -202, -208, -213, -218, -222, -225, -227, -228, - 
-228, -227, -224, -221, -215, -208, -200, -189, -177, -163, - -146, -127, -106, -83, -57, -29, 2, 36, 72, 111, - 153, 197, 244, 294, 347, 401, 459, 519, 581, 645, - 711, 779, 848, 919, 991, 1064, 1137, 1210, 1283, 1356, - 1428, 1498, 1567, 1634, 1698, 1759, 1817, 1870, 1919, 1962, - 2001, 2032, 2057, 2075, 2085, 2087, 2080, 2063, 2037, 2000, - 1952, 1893, 1822, 1739, 1644, 1535, 1414, 1280, 1131, 970, - 794, 605, 402, 185, -45, -288, -545, -814, -1095, -1388, - -1692, -2006, -2330, -2663, -3004, -3351, -3705, -4063, -4425, -4788, - -5153, -5517, -5879, -6237, -6589, -6935, -7271, -7597, -7910, -8209, - -8491, -8755, -8998, -9219, -9416, -9585, -9727, -9838, -9916, -9959, - -9966, -9935, -9863, -9750, -9592, -9389, -9139, -8840, -8492, -8092, - -7640, -7134, -6574, -5959, -5288, -4561, -3776, -2935, -2037, -1082, - -70, 998, 2122, 3300, 4533, 5818, 7154, 8540, 9975, 11455, - 12980, 14548, 16155, 17799, 19478, 21189, 22929, 24694, 26482, 28289, - 30112, 31947, 33791, 35640, 37489, 39336, 41176, 43006, 44821, 46617, - 48390, 50137, 51853, 53534, 55178, 56778, 58333, 59838, 61289, 62684, - 64019, 65290, 66494, 67629, 68692, 69679, 70590, 71420, 72169, 72835, - 73415, 73908, 74313, 74630, 74856, 74992, 75038 }; - -static void make_decode_tables(long scaleval) -{ - int i,j,k,kr,divv; - real *table,*costab; - - - for(i=0;i<5;i++) - { - kr=0x10>>i; divv=0x40>>i; - costab = mp3lib_pnts[i]; - for(k=0;k -#include - -#include -#include - -#include "config.h" -#include "libmpcodecs/ad_mp3lib.h" -#include "mp3lib/mp3.h" -#include "cpudetect.h" - -static inline unsigned int GetTimer(void){ - struct timeval tv; - struct timezone tz; -// float s; - gettimeofday(&tv,&tz); -// s=tv.tv_usec;s*=0.000001;s+=tv.tv_sec; - return tv.tv_sec * 1000000 + tv.tv_usec; -} - -static FILE* mp3file=NULL; - -int mplayer_audio_read(char *buf, int size) -{ - return fread(buf,1,size,mp3file); -} - -#define BUFFLEN 4608 -static unsigned char buffer[BUFFLEN]; - -int main(int argc,char* argv[]){ - int 
len; - int total=0; - unsigned int time1; - float length; -#ifdef DUMP_PCM - FILE *f=NULL; - f=fopen("test.pcm","wb"); -#endif - - mp3file=fopen((argc>1)?argv[1]:"test.mp3","rb"); - if(!mp3file){ printf("file not found\n"); exit(1); } - - GetCpuCaps(&gCpuCaps); - - // MPEG Audio: -#ifdef CONFIG_FAKE_MONO - MP3_Init(0); -#else - MP3_Init(); -#endif - MP3_samplerate=MP3_channels=0; - - time1=GetTimer(); - while((len=MP3_DecodeFrame(buffer,-1))>0 && total<2000000){ - total+=len; - // play it -#ifdef DUMP_PCM - fwrite(buffer,len,1,f); -#endif - //putchar('.');fflush(stdout); - } - time1=GetTimer()-time1; - length=(float)total/(float)(MP3_samplerate*MP3_channels*2); - printf("\nDecoding time: %8.6f\n",(float)time1*0.000001f); - printf("Uncompressed size: %d bytes (%8.3f secs)\n",total,length); - printf("CPU usage at normal playback: %5.2f %%\n",time1*0.0001f/length); - - fclose(mp3file); - return 0; -} diff -r bc0898c7399b -r b924f0df5a1d mp3lib/test2.c --- a/mp3lib/test2.c Sun Oct 21 11:14:13 2012 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,89 +0,0 @@ -/* - * This file is part of MPlayer. - * - * MPlayer is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; either version 2 of the License, or - * (at your option) any later version. - * - * MPlayer is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. - * - * You should have received a copy of the GNU General Public License along - * with MPlayer; if not, write to the Free Software Foundation, Inc., - * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. 
- */ - -#include -#include -#include - -#include -#include -#include -#include - -#include "config.h" -#include "libmpcodecs/ad_mp3lib.h" -#include "mp3lib/mp3.h" -#include "cpudetect.h" - -static FILE* mp3file=NULL; - -int mplayer_audio_read(char *buf, int size) -{ - return fread(buf,1,size,mp3file); -} - -#define BUFFLEN 4608 -static unsigned char buffer[BUFFLEN]; - - -int main(int argc,char* argv[]){ - int len; - int total=0; - int r; - int audio_fd; - - mp3file=fopen((argc>1)?argv[1]:"test.mp3","rb"); - if(!mp3file){ printf("file not found\n"); exit(1); } - - GetCpuCaps(&gCpuCaps); - - // MPEG Audio: -#ifdef CONFIG_FAKE_MONO - MP3_Init(0); -#else - MP3_Init(); -#endif - MP3_samplerate=MP3_channels=0; - len=MP3_DecodeFrame(buffer,-1); - - audio_fd=open("/dev/dsp", O_WRONLY); - if(audio_fd<0){ printf("Can't open audio device\n");exit(1); } - r=AFMT_S16_LE;ioctl (audio_fd, SNDCTL_DSP_SETFMT, &r); - r=MP3_channels-1;ioctl (audio_fd, SNDCTL_DSP_STEREO, &r); - r=MP3_samplerate;ioctl (audio_fd, SNDCTL_DSP_SPEED, &r); - printf("audio_setup: using %d Hz samplerate (requested: %d)\n",r,MP3_samplerate); - - while(1){ - int len2; - if(len==0) len=MP3_DecodeFrame(buffer,-1); - if(len<=0) break; // EOF - - // play it - len2=write(audio_fd,buffer,len); - if(len2<0) break; // ERROR? - len-=len2; total+=len2; - if(len>0){ - // this shouldn't happen... - memcpy(buffer,buffer+len2,len); - putchar('!');fflush(stdout); - } - } - - fclose(mp3file); - return 0; -}