[3/3] dca: Integer core decoder.

Message ID 1447434662-11846-3-git-send-email-alexandra@khirnov.net
State New

Commit Message

Alexandra Hájková Nov. 13, 2015, 5:11 p.m.
Provide a bit-exact reconstruction.

The dca core decoder converts integer coefficients read from the
bitstream to floats just after reading them (along with dequantization).
All the other steps of the audio reconstruction are done with floats
which makes the output for the DTS lossless extension (XLL)
actually lossy.
This patch transforms the dca core to work with integer coefficients
till the point QMF filters are called. The integer coefficients are
transformed to floats at this point if needed, the new bitexact QMF
filters are used for the XLL streams or for the case the new
-force_fixed option is set. This option forces fixed-point
reconstruction for any kind of input.
---
 doc/decoders.texi     |   9 +
 libavcodec/dca.h      |  13 +-
 libavcodec/dca_exss.c |   7 +
 libavcodec/dcadec.c   | 253 +++++++++++++++++--------
 libavcodec/dcadsp.c   | 500 +++++++++++++++++++++++++++++++++++++++++++++++++-
 libavcodec/dcadsp.h   |  15 +-
 6 files changed, 711 insertions(+), 86 deletions(-)
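
For readers following the commit message, the integer dequantization step can be sketched in isolation. This is a minimal standalone illustration modeled on the `dequantize` helper added in this patch, not the patch code itself. `norm_round`, `clip23` and `ilog2` are hypothetical stand-ins for `dca_norm`, `dca_clip23` and `av_log2`, whose definitions are not shown here, and the sketch assumes that right-shifting a negative 64-bit value is an arithmetic shift, as it is on common compilers.

```c
#include <stdint.h>

/* Hypothetical stand-in for dca_norm(): right shift with rounding. */
static int32_t norm_round(int64_t a, int bits)
{
    if (bits > 0)
        return (int32_t)((a + ((int64_t)1 << (bits - 1))) >> bits);
    return (int32_t)a;
}

/* Hypothetical stand-in for dca_clip23(): clamp to the signed 24-bit range. */
static int32_t clip23(int32_t a)
{
    if (a < -(1 << 23))
        return -(1 << 23);
    if (a > (1 << 23) - 1)
        return (1 << 23) - 1;
    return a;
}

/* floor(log2(v)) for v > 0; stands in for av_log2(). */
static int ilog2(uint64_t v)
{
    int n = 0;

    while (v >>= 1)
        n++;
    return n;
}

/* Scale n quantized samples in place by step_size * scale, treating the
 * product as a fixed-point factor with 22 fractional bits. The factor is
 * pre-shifted when it exceeds 2^23 so that the 64-bit product stays small. */
static void dequantize(int32_t *samples, int n, int step_size, int scale)
{
    int64_t step = (int64_t)step_size * scale;
    int32_t step_scale;
    int shift = 0, i;

    if (step > (1 << 23))
        shift = ilog2((uint64_t)(step >> 23)) + 1;
    step_scale = (int32_t)(step >> shift);

    for (i = 0; i < n; i++)
        samples[i] = clip23(norm_round((int64_t)samples[i] * step_scale,
                                       22 - shift));
}
```

With `step_size = 1` and `scale = 1 << 22` the factor is exactly 1.0 and the samples pass through unchanged; the same inputs always give the same integers, which is the property the float path cannot guarantee across platforms.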

Comments

Derek Buitenhuis Nov. 14, 2015, 9:24 p.m. | #1
Random lazy questions incoming.

On 11/13/2015 5:11 PM, Alexandra Hájková wrote:
> Provide a bit-exact reconstruction.
> 
> The dca core decoder converts integer coefficients read from the
> bitstream to floats just after reading them (along with dequantization).
> All the other steps of the audio reconstruction are done with floats
> which makes the output for the DTS lossless extension (XLL)
> actually lossy.
> This patch transforms the dca core to work with integer coefficients
> till the point QMF filters are called. The integer coefficients are
> transformed to floats at this point if needed, the new bitexact QMF
> filters are used for the XLL streams or for the case the new
> -force_fixed option is set. This option forces fixed-point
> reconstruction for any kind of input.
> ---
>  doc/decoders.texi     |   9 +
>  libavcodec/dca.h      |  13 +-
>  libavcodec/dca_exss.c |   7 +
>  libavcodec/dcadec.c   | 253 +++++++++++++++++--------
>  libavcodec/dcadsp.c   | 500 +++++++++++++++++++++++++++++++++++++++++++++++++-
>  libavcodec/dcadsp.h   |  15 +-
>  6 files changed, 711 insertions(+), 86 deletions(-)

Was this tested against any known sample input/output?

> + * The functions idct_perform32_fixed, qmf_32_subbands_fixed, idct_perform64_fixed,
> + * qmf_64_subbands_fixed and the auxiliary functions they are using are adapted
> + * from libdcadec, https://github.com/foo86/dcadec/tree/master/libdcadec.

You need to fix up these functions so they do not use for loops with variable
declarations in them, and also add him to the copyright.

- Derek
Alexandra Hájková Nov. 15, 2015, 8:30 a.m. | #2
On Sat, Nov 14, 2015 at 10:24 PM, Derek Buitenhuis
<derek.buitenhuis@gmail.com> wrote:
> Random lazy questions incoming.
>
> On 11/13/2015 5:11 PM, Alexandra Hájková wrote:
>> Provide a bit-exact reconstruction.
>>
>> The dca core decoder converts integer coefficients read from the
>> bitstream to floats just after reading them (along with dequantization).
>> All the other steps of the audio reconstruction are done with floats
>> which makes the output for the DTS lossless extension (XLL)
>> actually lossy.
>> This patch transforms the dca core to work with integer coefficients
>> till the point QMF filters are called. The integer coefficients are
>> transformed to floats at this point if needed, the new bitexact QMF
>> filters are used for the XLL streams or for the case the new
>> -force_fixed option is set. This option forces fixed-point
>> reconstruction for any kind of input.
>> ---
>>  doc/decoders.texi     |   9 +
>>  libavcodec/dca.h      |  13 +-
>>  libavcodec/dca_exss.c |   7 +
>>  libavcodec/dcadec.c   | 253 +++++++++++++++++--------
>>  libavcodec/dcadsp.c   | 500 +++++++++++++++++++++++++++++++++++++++++++++++++-
>>  libavcodec/dcadsp.h   |  15 +-
>>  6 files changed, 711 insertions(+), 86 deletions(-)
>
> Was this tested against any known sample input/output?

It was, but the results are ambiguous. I hope this will be clearer by the next
version of this patch.
>
>> + * The functions idct_perform32_fixed, qmf_32_subbands_fixed, idct_perform64_fixed,
>> + * qmf_64_subbands_fixed and the auxiliary functions they are using are adapted
>> + * from libdcadec, https://github.com/foo86/dcadec/tree/master/libdcadec.
>
> You need to fix up these functions so they do not use for loops with variable
> declarations in them, and also add him to the copyright.

Will do.
>
> - Derek

Thank you for the review.
Niels Möller Nov. 15, 2015, 1:10 p.m. | #3
"Alexandra Hájková" <alexandra.khirnova@gmail.com> writes:

> This patch transforms the dca core to work with integer coefficients
> till the point QMF filters are called. The integer coefficients are
> transformed to floats at this point if needed, the new bitexact QMF
> filters are used for the XLL streams or for the case the new
> -force_fixed option is set. This option forces fixed-point
> reconstruction for any kind of input.

I just want to say that it's great to see progress on this. Benjamin and
I made a half-hearted attempt at integrating the libdca bitexact
transform code a while back, without getting very far.

Your approach to handle everything as integers until you get to call the
filters seems like the right thing to do.

Happy hacking,
/Niels
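
As an illustration of what keeping everything in integers means before the filters, the ADPCM prediction in this patch replaces a float multiply by `1.0f / 8192` with a rounded 13-bit shift. The sketch below is a simplified standalone version under stated assumptions: `adpcm_predict` is a hypothetical name, the history is assumed to be ordered most recent sample first, and right-shifting a negative 64-bit value is assumed to be an arithmetic shift.

```c
#include <stdint.h>

/* Predict one sample from four history samples using coefficients scaled
 * by 8192 (13 fractional bits). The products are accumulated in 64 bits,
 * then rounded and shifted back to the sample scale. */
static int32_t adpcm_predict(const int16_t coeff[4], const int32_t hist[4])
{
    int64_t sum = 0;
    int n;

    for (n = 0; n < 4; n++)
        sum += (int64_t)coeff[n] * hist[n];

    return (int32_t)((sum + (1 << 12)) >> 13);
}
```

Adding `1 << 12` before the shift rounds to nearest with halves rounded up, so a prediction of 3.5 becomes 4 and -3.5 becomes -3 on every platform.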

Patch

diff --git a/doc/decoders.texi b/doc/decoders.texi
index 99d2008..e513540 100644
--- a/doc/decoders.texi
+++ b/doc/decoders.texi
@@ -53,4 +53,13 @@  Loud sounds are fully compressed.  Soft sounds are enhanced.
 
 @end table
 
+@section dca
+
+@table @option
+
+@item -force_fixed 1
+Force fixed-point reconstruction for any kind of input.
+
+@end table
+
 @c man end AUDIO DECODERS
diff --git a/libavcodec/dca.h b/libavcodec/dca.h
index 6548d75..1477c50 100644
--- a/libavcodec/dca.h
+++ b/libavcodec/dca.h
@@ -139,7 +139,7 @@  typedef struct DCAAudioHeader {
     int scalefactor_huffman[DCA_PRIM_CHANNELS_MAX]; ///< scale factor code book
     int bitalloc_huffman[DCA_PRIM_CHANNELS_MAX];    ///< bit allocation quantizer select
     int quant_index_huffman[DCA_PRIM_CHANNELS_MAX][DCA_ABITS_MAX]; ///< quantization index codebook select
-    float scalefactor_adj[DCA_PRIM_CHANNELS_MAX][DCA_ABITS_MAX];   ///< scale factor adjustment
+    int scalefactor_adj[DCA_PRIM_CHANNELS_MAX][DCA_ABITS_MAX];   ///< scale factor adjustment
 
     int subframes;              ///< number of subframes
     int total_channels;         ///< number of channels including extensions
@@ -147,15 +147,16 @@  typedef struct DCAAudioHeader {
 } DCAAudioHeader;
 
 typedef struct DCAChan {
-    DECLARE_ALIGNED(32, float, subband_samples)[DCA_BLOCKS_MAX][DCA_SUBBANDS][8];
+    DECLARE_ALIGNED(32, int, subband_samples)[DCA_BLOCKS_MAX][DCA_SUBBANDS][8];
 
     /* Subband samples history (for ADPCM) */
-    DECLARE_ALIGNED(16, float, subband_samples_hist)[DCA_SUBBANDS][4];
+    DECLARE_ALIGNED(16, int, subband_samples_hist)[DCA_SUBBANDS][4];
     int hist_index;
 
     /* Half size is sufficient for core decoding, but for 96 kHz data
      * we need QMF with 64 subbands and 1024 samples. */
     DECLARE_ALIGNED(32, float, subband_fir_hist)[1024];
+    DECLARE_ALIGNED(32, int, subband_hist)[1024];
     DECLARE_ALIGNED(32, float, subband_fir_noidea)[64];
 
     /* Primary audio coding side information */
@@ -220,7 +221,7 @@  typedef struct DCAContext {
     uint16_t core_downmix_codes[DCA_PRIM_CHANNELS_MAX + 1][4];   ///< embedded downmix coefficients (9-bit codes)
 
 
-    float lfe_data[2 * DCA_LFE_MAX * (DCA_BLOCKS_MAX + 4)];      ///< Low frequency effect data
+    int lfe_data[2 * DCA_LFE_MAX * (DCA_BLOCKS_MAX + 4)];      ///< Low frequency effect data
     int lfe_scale_factor;
 
     /* Subband samples history (for ADPCM) */
@@ -230,7 +231,7 @@  typedef struct DCAContext {
 
     int output;                 ///< type of output
 
-    float *samples_chanptr[DCA_PRIM_CHANNELS_MAX + 1];
+    void *samples_chanptr[DCA_PRIM_CHANNELS_MAX + 1];
     float *extra_channels[DCA_PRIM_CHANNELS_MAX + 1];
     uint8_t *extra_channels_buffer;
     unsigned int extra_channels_buffer_size;
@@ -247,6 +248,8 @@  typedef struct DCAContext {
     int core_ext_mask;          ///< present extensions in the core substream
     int exss_ext_mask;          ///< Non-core extensions
 
+    int fixed;            ///< force using fixedpoint QMF
+
     /* XCh extension information */
     int xch_present;            ///< XCh extension present and valid
     int xch_base_channel;       ///< index of first (only) channel containing XCH data
diff --git a/libavcodec/dca_exss.c b/libavcodec/dca_exss.c
index 2895e20..55afa88 100644
--- a/libavcodec/dca_exss.c
+++ b/libavcodec/dca_exss.c
@@ -23,7 +23,9 @@ 
 
 #include "dca.h"
 #include "dca_syncwords.h"
+#include "dcadata.h"
 #include "get_bits.h"
+#include "avcodec.h"
 
 /* extensions that reside in core substream */
 #define DCA_CORE_EXTS (DCA_EXT_XCH | DCA_EXT_XXCH | DCA_EXT_X96)
@@ -343,6 +345,11 @@  void ff_dca_exss_parse_header(DCAContext *s)
                            "DTS-XLL: ignoring XLL extension\n");
                     break;
                 }
+                av_log(s->avctx, AV_LOG_ERROR,
+                           "bps = %d\n", ff_dca_bits_per_sample[s->source_pcm_res]);
+
+                s->avctx->sample_fmt = AV_SAMPLE_FMT_S32P;
+                s->avctx->bits_per_raw_sample = ff_dca_bits_per_sample[s->source_pcm_res];
                 av_log(s->avctx, AV_LOG_DEBUG,
                        "DTS-XLL: decoding XLL extension\n");
                 if (ff_dca_xll_decode_header(s)        == 0 &&
diff --git a/libavcodec/dcadec.c b/libavcodec/dcadec.c
index 610857d..2e57736 100644
--- a/libavcodec/dcadec.c
+++ b/libavcodec/dcadec.c
@@ -44,6 +44,7 @@ 
 #include "dcadata.h"
 #include "dcadsp.h"
 #include "dcahuff.h"
+#include "dcamath.h"
 #include "fft.h"
 #include "fmtconvert.h"
 #include "get_bits.h"
@@ -225,7 +226,7 @@  static inline void get_array(GetBitContext *gb, int *dst, int len, int bits)
 static int dca_parse_audio_coding_header(DCAContext *s, int base_channel)
 {
     int i, j;
-    static const float adj_table[4] = { 1.0, 1.1250, 1.2500, 1.4375 };
+    static const int adj_table[4] = { 16, 18, 20, 23 };
     static const int bitlen[11] = { 0, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3 };
     static const int thr[11]    = { 0, 1, 3, 3, 3, 3, 7, 7, 7, 7, 7 };
 
@@ -641,7 +642,7 @@  static void qmf_64_subbands(DCAContext *s, int chans, float samples_in[64][SAMPL
     }
 }
 
-static void lfe_interpolation_fir(DCAContext *s, const float *samples_in,
+static void lfe_interpolation_fir(DCAContext *s, const int *samples_in,
                                   float *samples_out)
 {
     /* samples_in: An array holding decimated samples.
@@ -697,13 +698,16 @@  static void lfe_interpolation_fir(DCAContext *s, const float *samples_in,
         op2                                     \
     }
 
-static void dca_downmix(float **samples, int srcfmt, int lfe_present,
-                        float coef[DCA_PRIM_CHANNELS_MAX + 1][2],
-                        const int8_t *channel_mapping)
+static void dca_downmix(DCAContext *dca)
 {
     int c, l, r, sl, sr, s;
     int i;
     float t, u, v;
+    int srcfmt = dca->amode;
+    int lfe_present = !!dca->lfe;
+    float (*coef)[2] = dca->downmix_coef;
+    const int8_t *channel_mapping = dca->channel_order_tab;
+    int **samples = (int **)dca->samples_chanptr;
 
     switch (srcfmt) {
     case DCA_MONO:
@@ -785,14 +789,27 @@  static int decode_blockcodes(int code1, int code2, int levels, int32_t *values)
 static const uint8_t abits_sizes[7]  = { 7, 10, 12, 13, 15, 17, 19 };
 static const uint8_t abits_levels[7] = { 3,  5,  7,  9, 13, 17, 25 };
 
+static void dequantize(int *samples, int step_size, int scale) {
+    int64_t step = (int64_t)step_size * scale;
+    int shift, i;
+    int32_t step_scale;
+
+    if (step > (1 << 23))
+        shift = av_log2(step >> 23) + 1;
+    else
+        shift = 0;
+    step_scale = (int32_t)(step >> shift);
+
+    for (i = 0; i < SAMPLES_PER_SUBBAND; i++) {
+        samples[i] = dca_clip23(dca_norm((int64_t)samples[i] * step_scale, 22 - shift));
+    }
+}
+
 static int dca_subsubframe(DCAContext *s, int base_channel, int block_index)
 {
     int k, l;
     int subsubframe = s->current_subsubframe;
-
-    const float *quant_step_table;
-
-    LOCAL_ALIGNED_16(int32_t, block, [SAMPLES_PER_SUBBAND * DCA_SUBBANDS]);
+    const int *quant_step_table;
 
     /*
      * Audio data
@@ -800,13 +817,13 @@  static int dca_subsubframe(DCAContext *s, int base_channel, int block_index)
 
     /* Select quantization step size table */
     if (s->bit_rate_index == 0x1f)
-        quant_step_table = ff_dca_lossless_quant_d;
+        quant_step_table = ff_dca_lossless_quant;
     else
-        quant_step_table = ff_dca_lossy_quant_d;
+        quant_step_table = ff_dca_lossy_quant;
 
     for (k = base_channel; k < s->audio_header.prim_channels; k++) {
-        float (*subband_samples)[8] = s->dca_chan[k].subband_samples[block_index];
-        float rscale[DCA_SUBBANDS];
+        int (*subband_samples)[8] = s->dca_chan[k].subband_samples[block_index];
+        int64_t rscale[DCA_SUBBANDS];
 
         if (get_bits_left(&s->gb) < 0)
             return AVERROR_INVALIDDATA;
@@ -817,7 +834,7 @@  static int dca_subsubframe(DCAContext *s, int base_channel, int block_index)
             /* Select the mid-tread linear quantizer */
             int abits = s->dca_chan[k].bitalloc[l];
 
-            float quant_step_size = quant_step_table[abits];
+            int quant_step_size = quant_step_table[abits];
 
             /*
              * Determine quantization index code book and its type
@@ -831,14 +848,16 @@  static int dca_subsubframe(DCAContext *s, int base_channel, int block_index)
              */
             if (!abits) {
                 rscale[l] = 0;
-                memset(block + SAMPLES_PER_SUBBAND * l, 0, SAMPLES_PER_SUBBAND * sizeof(block[0]));
+                memset(subband_samples[l], 0, SAMPLES_PER_SUBBAND *
+                       sizeof(subband_samples[l][0]));
             } else {
                 /* Deal with transients */
                 int sfi = s->dca_chan[k].transition_mode[l] &&
                     subsubframe >= s->dca_chan[k].transition_mode[l];
-                rscale[l] = quant_step_size * s->dca_chan[k].scale_factor[l][sfi] *
+                rscale[l] = s->dca_chan[k].scale_factor[l][sfi] *
                             s->audio_header.scalefactor_adj[k][sel];
 
+
                 if (abits >= 11 || !dca_smpl_bitalloc[abits].vlc[sel].table) {
                     if (abits <= 7) {
                         /* Block code */
@@ -850,7 +869,7 @@  static int dca_subsubframe(DCAContext *s, int base_channel, int block_index)
                         block_code1 = get_bits(&s->gb, size);
                         block_code2 = get_bits(&s->gb, size);
                         err         = decode_blockcodes(block_code1, block_code2,
-                                                        levels, block + SAMPLES_PER_SUBBAND * l);
+                                                        levels, subband_samples[l]);
                         if (err) {
                             av_log(s->avctx, AV_LOG_ERROR,
                                    "ERROR: block code look-up failed\n");
@@ -859,20 +878,18 @@  static int dca_subsubframe(DCAContext *s, int base_channel, int block_index)
                     } else {
                         /* no coding */
                         for (m = 0; m < SAMPLES_PER_SUBBAND; m++)
-                            block[SAMPLES_PER_SUBBAND * l + m] = get_sbits(&s->gb, abits - 3);
+                            subband_samples[l][m] = get_sbits(&s->gb, abits - 3);
                     }
                 } else {
                     /* Huffman coded */
                     for (m = 0; m < SAMPLES_PER_SUBBAND; m++)
-                        block[SAMPLES_PER_SUBBAND * l + m] = get_bitalloc(&s->gb,
-                                                        &dca_smpl_bitalloc[abits], sel);
+                        subband_samples[l][m] = get_bitalloc(&s->gb,
+                                                             &dca_smpl_bitalloc[abits], sel);
                 }
             }
+            dequantize(subband_samples[l], quant_step_size, rscale[l]);
         }
 
-        s->fmt_conv.int32_to_float_fmul_array8(&s->fmt_conv, subband_samples[0],
-                                               block, rscale, SAMPLES_PER_SUBBAND * s->audio_header.vq_start_subband[k]);
-
         for (l = 0; l < s->audio_header.vq_start_subband[k]; l++) {
             int m;
             /*
@@ -882,49 +899,52 @@  static int dca_subsubframe(DCAContext *s, int base_channel, int block_index)
                 int n;
                 if (s->predictor_history)
                     subband_samples[l][0] += (ff_dca_adpcm_vb[s->dca_chan[k].prediction_vq[l]][0] *
-                                                 s->dca_chan[k].subband_samples_hist[l][3] +
-                                                 ff_dca_adpcm_vb[s->dca_chan[k].prediction_vq[l]][1] *
-                                                 s->dca_chan[k].subband_samples_hist[l][2] +
-                                                 ff_dca_adpcm_vb[s->dca_chan[k].prediction_vq[l]][2] *
-                                                 s->dca_chan[k].subband_samples_hist[l][1] +
-                                                 ff_dca_adpcm_vb[s->dca_chan[k].prediction_vq[l]][3] *
-                                                 s->dca_chan[k].subband_samples_hist[l][0]) *
-                                                (1.0f / 8192);
+                                              (int64_t)s->dca_chan[k].subband_samples_hist[l][3] +
+                                              ff_dca_adpcm_vb[s->dca_chan[k].prediction_vq[l]][1] *
+                                              (int64_t)s->dca_chan[k].subband_samples_hist[l][2] +
+                                              ff_dca_adpcm_vb[s->dca_chan[k].prediction_vq[l]][2] *
+                                              (int64_t)s->dca_chan[k].subband_samples_hist[l][1] +
+                                              ff_dca_adpcm_vb[s->dca_chan[k].prediction_vq[l]][3] *
+                                              (int64_t)s->dca_chan[k].subband_samples_hist[l][0]) +
+                                              (1 << 12) >> 13;
                 for (m = 1; m < SAMPLES_PER_SUBBAND; m++) {
-                    float sum = ff_dca_adpcm_vb[s->dca_chan[k].prediction_vq[l]][0] *
-                                subband_samples[l][m - 1];
+                    int64_t sum = ff_dca_adpcm_vb[s->dca_chan[k].prediction_vq[l]][0] *
+                                  (int64_t)subband_samples[l][m - 1];
                     for (n = 2; n <= 4; n++)
                         if (m >= n)
                             sum += ff_dca_adpcm_vb[s->dca_chan[k].prediction_vq[l]][n - 1] *
-                                   subband_samples[l][m - n];
+                                   (int64_t)subband_samples[l][m - n];
                         else if (s->predictor_history)
                             sum += ff_dca_adpcm_vb[s->dca_chan[k].prediction_vq[l]][n - 1] *
-                                   s->dca_chan[k].subband_samples_hist[l][m - n + 4];
-                    subband_samples[l][m] += sum * 1.0f / 8192;
+                                   (int64_t)s->dca_chan[k].subband_samples_hist[l][m - n + 4];
+                    subband_samples[l][m] += (int)(sum + (1 << 12) >> 13);
                 }
             }
-
         }
         /* Backup predictor history for adpcm */
         for (l = 0; l < DCA_SUBBANDS; l++)
             AV_COPY128(s->dca_chan[k].subband_samples_hist[l], &subband_samples[l][4]);
 
-
         /*
          * Decode VQ encoded high frequencies
          */
         if (s->audio_header.subband_activity[k] > s->audio_header.vq_start_subband[k]) {
+            int i, j;
+
             if (!s->debug_flag & 0x01) {
                 av_log(s->avctx, AV_LOG_DEBUG,
                        "Stream with high frequencies VQ coding\n");
                 s->debug_flag |= 0x01;
             }
 
-            s->dcadsp.decode_hf(subband_samples, s->dca_chan[k].high_freq_vq,
-                                ff_dca_high_freq_vq, subsubframe * SAMPLES_PER_SUBBAND,
-                                s->dca_chan[k].scale_factor,
-                                s->audio_header.vq_start_subband[k],
-                                s->audio_header.subband_activity[k]);
+            // this should be SIMDified
+            for (j = s->audio_header.vq_start_subband[k]; j < s->audio_header.subband_activity[k]; j++) {
+                /* 1 vector -> 32 samples but we only need the 8 samples
+                 * for this subsubframe. */
+                const int8_t *ptr = &ff_dca_high_freq_vq[s->dca_chan[k].high_freq_vq[j]][subsubframe * SAMPLES_PER_SUBBAND];
+                for (i = 0; i < 8; i++)
+                    subband_samples[j][i] = ptr[i] * s->dca_chan[k].scale_factor[j][0] + 8 >> 4;
+            }
         }
     }
 
@@ -942,31 +962,72 @@  static int dca_subsubframe(DCAContext *s, int base_channel, int block_index)
 static int dca_filter_channels(DCAContext *s, int block_index, int upsample)
 {
     int k;
+    float param[DCA_SUBBANDS];
+
+    for (k = 0; k < DCA_SUBBANDS; k++)
+        param[k] = 16;
+
+    // for the 96 kHz lossless
+    if (s->fixed && upsample) {
+        int **subband_samples_hi = NULL;
+
+        for (k = 0; k < s->audio_header.prim_channels; k++) {
+            int (*subband_samples)[SAMPLES_PER_SUBBAND] =
+                s->dca_chan[k].subband_samples[block_index];
+            int *samples_out = s->samples_chanptr[s->channel_order_tab[k]];
+
+            qmf_64_subbands_fixed(subband_samples, subband_samples_hi,
+                                  s->dca_chan[k].subband_hist, samples_out, 8);
+        }
+      // X96 extension is not supported now
+    } else if (0) { //(s->core_ext_mask & DCA_EXT_X96) {
+        LOCAL_ALIGNED_16(float, samples, [64], [SAMPLES_PER_SUBBAND]);
 
-    if (upsample) {
         if (!s->qmf64_table) {
             s->qmf64_table = qmf64_precompute();
             if (!s->qmf64_table)
                 return AVERROR(ENOMEM);
         }
 
-        /* 64 subbands QMF */
         for (k = 0; k < s->audio_header.prim_channels; k++) {
-            float (*subband_samples)[SAMPLES_PER_SUBBAND] = s->dca_chan[k].subband_samples[block_index];
+            int (*subband_samples)[SAMPLES_PER_SUBBAND] =
+                s->dca_chan[k].subband_samples[block_index];
+
+            s->fmt_conv.int32_to_float_fmul_array8(&s->fmt_conv, samples[0],
+                                                   subband_samples[0], param,
+                                                   64 * SAMPLES_PER_SUBBAND);
 
             if (s->channel_order_tab[k] >= 0)
-                qmf_64_subbands(s, k, subband_samples,
+                qmf_64_subbands(s, k, samples,
                                 s->samples_chanptr[s->channel_order_tab[k]],
                                 /* Upsampling needs a factor 2 here. */
                                 M_SQRT2 / 32768.0);
         }
+      // for the 48 kHz lossless
+    } else if (s->fixed) {
+        for (k = 0; k < s->audio_header.prim_channels; k++) {
+            int (*subband_samples)[SAMPLES_PER_SUBBAND] =
+                s->dca_chan[k].subband_samples[block_index];
+            int **subband_samples_hi = NULL;
+            int *samples_out = s->samples_chanptr[s->channel_order_tab[k]];
+
+            qmf_32_subbands_fixed(subband_samples, subband_samples_hi,
+                                  s->dca_chan[k].subband_hist,
+                                  samples_out, 8, s->multirate_inter);
+        }
     } else {
-        /* 32 subbands QMF */
+        LOCAL_ALIGNED_16(float, samples, [32], [SAMPLES_PER_SUBBAND]);
+
         for (k = 0; k < s->audio_header.prim_channels; k++) {
-            float (*subband_samples)[SAMPLES_PER_SUBBAND] = s->dca_chan[k].subband_samples[block_index];
+            int (*subband_samples)[SAMPLES_PER_SUBBAND] =
+                s->dca_chan[k].subband_samples[block_index];
+
+            s->fmt_conv.int32_to_float_fmul_array8(&s->fmt_conv, samples[0],
+                                                   subband_samples[0], param,
+                                                   32 * SAMPLES_PER_SUBBAND);
 
             if (s->channel_order_tab[k] >= 0)
-                qmf_32_subbands(s, k, subband_samples,
+                qmf_32_subbands(s, k, samples,
                                 s->samples_chanptr[s->channel_order_tab[k]],
                                 M_SQRT1_2 / 32768.0);
         }
@@ -974,29 +1035,44 @@  static int dca_filter_channels(DCAContext *s, int block_index, int upsample)
 
     /* Generate LFE samples for this subsubframe FIXME!!! */
     if (s->lfe) {
-        float *samples = s->samples_chanptr[ff_dca_lfe_index[s->amode]];
-        lfe_interpolation_fir(s,
-                              s->lfe_data + 2 * s->lfe * (block_index + 4),
-                              samples);
-        if (upsample) {
-            unsigned i;
-            /* Should apply the filter in Table 6-11 when upsampling. For
-             * now, just duplicate. */
-            for (i = 511; i > 0; i--) {
-                samples[2 * i]     =
-                samples[2 * i + 1] = samples[i];
+        if (s->fixed) {
+            int *samples = s->samples_chanptr[ff_dca_lfe_index[s->amode]];
+            int synth_x96 = 0; // X96 synthesis flag should be set if X96 would be implemented
+
+            lfe_interpolation_fir_fixed(samples,
+                                        s->lfe_data + 2 * s->lfe * (block_index + 4),
+                                        2 * s->lfe, synth_x96);
+        } else {
+            float *samples = s->samples_chanptr[ff_dca_lfe_index[s->amode]];
+            lfe_interpolation_fir(s,
+                                  s->lfe_data + 2 * s->lfe * (block_index + 4),
+                                  samples);
+            if (upsample) {
+                unsigned i;
+                /* Should apply the filter in Table 6-11 when upsampling. For
+                 * now, just duplicate. */
+                for (i = 511; i > 0; i--) {
+                    samples[2 * i]     =
+                        samples[2 * i + 1] = samples[i];
+                }
+                samples[1] = samples[0];
             }
-            samples[1] = samples[0];
         }
     }
 
+    for (k = 0; k < s->audio_header.prim_channels; k++)
+        if (s->fixed) {
+            int *samples = s->samples_chanptr[k];
+            for (int i = 0; i < 8 * 32; i++)
+                samples[i] <<= 8;
+        }
+
     /* FIXME: This downmixing is probably broken with upsample.
-     * Probably totally broken also with XLL in general. */
-    /* Downmixing to Stereo */
-    if (s->audio_header.prim_channels + !!s->lfe > 2 &&
-        s->avctx->request_channel_layout == AV_CH_LAYOUT_STEREO) {
-        dca_downmix(s->samples_chanptr, s->amode, !!s->lfe, s->downmix_coef,
-                    s->channel_order_tab);
+       Downmixing to Stereo */
+    if ((s->audio_header.prim_channels + !!s->lfe > 2 &&
+        s->avctx->request_channel_layout == AV_CH_LAYOUT_STEREO) &&
+        !(s->core_ext_mask & DCA_EXT_EXSS_XLL)) {
+        dca_downmix(s);
     }
 
     return 0;
@@ -1350,6 +1426,15 @@  static int set_channel_layout(AVCodecContext *avctx, int channels, int num_core_
     return 0;
 }
 
+// multiply int vector src with scalar mul and add it to destination vector dst
+static void vector_by_scalar(int *dst, const int *src, int mul, int len)
+{
+    int i;
+
+    for (i = 0; i < len; i++)
+        dst[i] += src[i] * (int64_t)mul + 0x8000 >> 16;
+}
+
 /**
  * Main frame decoding function
  * FIXME add arguments
@@ -1364,7 +1449,6 @@  static int dca_decode_frame(AVCodecContext *avctx, void *data,
     int lfe_samples;
     int num_core_channels = 0;
     int i, ret;
-    float  **samples_flt;
     DCAContext *s = avctx->priv_data;
     int channels, full_channels;
     int upsample = 0;
@@ -1432,6 +1516,7 @@  static int dca_decode_frame(AVCodecContext *avctx, void *data,
                    xll_nb_samples, frame->nb_samples);
             s->exss_ext_mask &= ~DCA_EXT_EXSS_XLL;
         } else {
+            s->fixed = 1;
             if (2 * frame->nb_samples == xll_nb_samples) {
                 av_log(s->avctx, AV_LOG_INFO,
                        "XLL: upsampling core channels by a factor of 2\n");
@@ -1458,7 +1543,6 @@  static int dca_decode_frame(AVCodecContext *avctx, void *data,
         av_log(avctx, AV_LOG_ERROR, "get_buffer() failed\n");
         return ret;
     }
-    samples_flt = (float **) frame->extended_data;
 
     /* allocate buffer for extra channels if downmixing */
     if (avctx->channels < full_channels) {
@@ -1486,7 +1570,7 @@  static int dca_decode_frame(AVCodecContext *avctx, void *data,
         int ch;
         unsigned block = upsample ? 512 : 256;
         for (ch = 0; ch < channels; ch++)
-            s->samples_chanptr[ch] = samples_flt[ch] + i * block;
+            s->samples_chanptr[ch] = (int *)frame->extended_data[ch] + i * block;
         for (; ch < full_channels; ch++)
             s->samples_chanptr[ch] = s->extra_channels[ch - channels] + i * block;
 
@@ -1495,11 +1579,21 @@  static int dca_decode_frame(AVCodecContext *avctx, void *data,
         /* If this was marked as a DTS-ES stream we need to subtract back- */
         /* channel from SL & SR to remove matrixed back-channel signal */
         if ((s->source_pcm_res & 1) && s->xch_present) {
-            float *back_chan = s->samples_chanptr[s->channel_order_tab[s->xch_base_channel]];
-            float *lt_chan   = s->samples_chanptr[s->channel_order_tab[s->xch_base_channel - 2]];
-            float *rt_chan   = s->samples_chanptr[s->channel_order_tab[s->xch_base_channel - 1]];
-            s->fdsp.vector_fmac_scalar(lt_chan, back_chan, -M_SQRT1_2, 256);
-            s->fdsp.vector_fmac_scalar(rt_chan, back_chan, -M_SQRT1_2, 256);
+            if (s->fixed) {
+                int *back_chan = s->samples_chanptr[s->channel_order_tab[s->xch_base_channel]];
+                int *lt_chan   = s->samples_chanptr[s->channel_order_tab[s->xch_base_channel - 2]];
+                int *rt_chan   = s->samples_chanptr[s->channel_order_tab[s->xch_base_channel - 1]];
+                vector_by_scalar(lt_chan, back_chan,
+                                 (int)(M_SQRT1_2 * -0x10000), 256);
+                vector_by_scalar(rt_chan, back_chan,
+                                 (int)(M_SQRT1_2 * -0x10000), 256);
+            } else {
+                float *back_chan = s->samples_chanptr[s->channel_order_tab[s->xch_base_channel]];
+                float *lt_chan   = s->samples_chanptr[s->channel_order_tab[s->xch_base_channel - 2]];
+                float *rt_chan   = s->samples_chanptr[s->channel_order_tab[s->xch_base_channel - 1]];
+                s->fdsp.vector_fmac_scalar(lt_chan, back_chan, -M_SQRT1_2, 256);
+                s->fdsp.vector_fmac_scalar(rt_chan, back_chan, -M_SQRT1_2, 256);
+            }
         }
     }
 
@@ -1546,7 +1640,10 @@  static av_cold int dca_decode_init(AVCodecContext *avctx)
     ff_dcadsp_init(&s->dcadsp);
     ff_fmt_convert_init(&s->fmt_conv, avctx);
 
-    avctx->sample_fmt = AV_SAMPLE_FMT_FLTP;
+    if (s->fixed)
+        avctx->sample_fmt = AV_SAMPLE_FMT_S32P;
+    else
+        avctx->sample_fmt = AV_SAMPLE_FMT_FLTP;
 
     /* allow downmixing to stereo */
     if (avctx->channels > 2 &&
@@ -1578,6 +1675,7 @@  static const AVProfile profiles[] = {
 static const AVOption options[] = {
     { "disable_xch", "disable decoding of the XCh extension", offsetof(DCAContext, xch_disable), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, 1, AV_OPT_FLAG_DECODING_PARAM | AV_OPT_FLAG_AUDIO_PARAM },
     { "disable_xll", "disable decoding of the XLL extension", offsetof(DCAContext, xll_disable), AV_OPT_TYPE_INT, { .i64 = 1 }, 0, 1, AV_OPT_FLAG_DECODING_PARAM | AV_OPT_FLAG_AUDIO_PARAM },
+    { "force_fixed", "force fixed-point decoding",           offsetof(DCAContext, fixed), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, 1, AV_OPT_FLAG_DECODING_PARAM | AV_OPT_FLAG_AUDIO_PARAM },
     { NULL },
 };
 
@@ -1599,6 +1697,7 @@  AVCodec ff_dca_decoder = {
     .close           = dca_decode_end,
     .capabilities    = AV_CODEC_CAP_CHANNEL_CONF | AV_CODEC_CAP_DR1,
     .sample_fmts     = (const enum AVSampleFormat[]) { AV_SAMPLE_FMT_FLTP,
+                                                       AV_SAMPLE_FMT_S32P,
                                                        AV_SAMPLE_FMT_NONE },
     .profiles        = NULL_IF_CONFIG_SMALL(profiles),
     .priv_class      = &dca_decoder_class,
diff --git a/libavcodec/dcadsp.c b/libavcodec/dcadsp.c
index 34b5da2..9a0e35e 100644
--- a/libavcodec/dcadsp.c
+++ b/libavcodec/dcadsp.c
@@ -17,14 +17,22 @@ 
  * You should have received a copy of the GNU Lesser General Public
  * License along with Libav; if not, write to the Free Software
  * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ *
+ * The functions idct_perform32_fixed, qmf_32_subbands_fixed, idct_perform64_fixed,
+ * qmf_64_subbands_fixed, lfe_interpolation_fir_fixed and the auxiliary functions
+ * they use (sum*, clp*, dct*, mod*) are adapted from libdcadec,
+ * https://github.com/foo86/dcadec/tree/master/libdcadec.
  */
 
+#include <stdlib.h>
 #include "config.h"
 
 #include "libavutil/attributes.h"
 #include "libavutil/intreadwrite.h"
 
 #include "dcadsp.h"
+#include "dcamath.h"
+#include "dcadata.h"
 
 static void decode_hf_c(float dst[DCA_SUBBANDS][8],
                         const int32_t vq_num[DCA_SUBBANDS],
@@ -44,7 +52,7 @@  static void decode_hf_c(float dst[DCA_SUBBANDS][8],
     }
 }
 
-static inline void dca_lfe_fir(float *out, const float *in, const float *coefs,
+static inline void dca_lfe_fir(float *out, const int *in, const float *coefs,
                                int decifactor)
 {
     float *out2    = out + 2 * decifactor - 1;
@@ -93,12 +101,12 @@  static void dca_qmf_32_subbands(float samples_in[32][8], int sb_act,
     }
 }
 
-static void dca_lfe_fir0_c(float *out, const float *in, const float *coefs)
+static void dca_lfe_fir0_c(float *out, const int *in, const float *coefs)
 {
     dca_lfe_fir(out, in, coefs, 32);
 }
 
-static void dca_lfe_fir1_c(float *out, const float *in, const float *coefs)
+static void dca_lfe_fir1_c(float *out, const int *in, const float *coefs)
 {
     dca_lfe_fir(out, in, coefs, 64);
 }
@@ -115,3 +123,489 @@  av_cold void ff_dcadsp_init(DCADSPContext *s)
     if (ARCH_X86)
         ff_dcadsp_init_x86(s);
 }
+
+static void sum_a(const int * restrict input, int * restrict output, int len)
+{
+    int i;
+
+    for (i = 0; i < len; i++)
+        output[i] = input[2 * i] + input[2 * i + 1];
+}
+
+static void sum_b(const int * restrict input, int * restrict output, int len)
+{
+    int i;
+
+    output[0] = input[0];
+    for (i = 1; i < len; i++)
+        output[i] = input[2 * i] + input[2 * i - 1];
+}
+
+static void sum_c(const int * restrict input, int * restrict output, int len)
+{
+    int i;
+
+    for (i = 0; i < len; i++)
+        output[i] = input[2 * i];
+}
+
+static void sum_d(const int * restrict input, int * restrict output, int len)
+{
+    int i;
+
+    output[0] = input[1];
+    for (i = 1; i < len; i++)
+        output[i] = input[2 * i - 1] + input[2 * i + 1];
+}
+
+static void clp_v(int *input, int len)
+{
+    int i;
+
+    for (i = 0; i < len; i++)
+        input[i] = dca_clip23(input[i]);
+}
+
+static void dct_a(const int * restrict input, int * restrict output)
+{
+    int i, j;
+    static const int cos_mod[8][8] = {
+        { 8348215,  8027397,  7398092,  6484482,  5321677,  3954362,  2435084,   822227 },
+        { 8027397,  5321677,   822227, -3954362, -7398092, -8348215, -6484482, -2435084 },
+        { 7398092,   822227, -6484482, -8027397, -2435084,  5321677,  8348215,  3954362 },
+        { 6484482, -3954362, -8027397,   822227,  8348215,  2435084, -7398092, -5321677 },
+        { 5321677, -7398092, -2435084,  8348215,  -822227, -8027397,  3954362,  6484482 },
+        { 3954362, -8348215,  5321677,  2435084, -8027397,  6484482,   822227, -7398092 },
+        { 2435084, -6484482,  8348215, -7398092,  3954362,   822227, -5321677,  8027397 },
+        {  822227, -2435084,  3954362, -5321677,  6484482, -7398092,  8027397, -8348215 }
+    };
+
+    for (i = 0; i < 8; i++) {
+        int64_t res = INT64_C(0);
+        for (j = 0; j < 8; j++)
+            res += (int64_t)cos_mod[i][j] * input[j];
+        output[i] = dca_norm(res, 23);
+    }
+}
+
+static void dct_b(const int * restrict input, int * restrict output)
+{
+    int i, j;
+    static const int cos_mod[8][7] = {
+        {  8227423,  7750063,  6974873,  5931642,  4660461,  3210181,  1636536 },
+        {  6974873,  3210181, -1636536, -5931642, -8227423, -7750063, -4660461 },
+        {  4660461, -3210181, -8227423, -5931642,  1636536,  7750063,  6974873 },
+        {  1636536, -7750063, -4660461,  5931642,  6974873, -3210181, -8227423 },
+        { -1636536, -7750063,  4660461,  5931642, -6974873, -3210181,  8227423 },
+        { -4660461, -3210181,  8227423, -5931642, -1636536,  7750063, -6974873 },
+        { -6974873,  3210181,  1636536, -5931642,  8227423, -7750063,  4660461 },
+        { -8227423,  7750063, -6974873,  5931642, -4660461,  3210181, -1636536 }
+    };
+
+    for (i = 0; i < 8; i++) {
+        int64_t res = (int64_t)input[0] * (1 << 23);
+        for (j = 0; j < 7; j++)
+            res += (int64_t)cos_mod[i][j] * input[1 + j];
+        output[i] = dca_norm(res, 23);
+    }
+}
+
+static void mod_a(const int * restrict input, int * restrict output)
+{
+    int i, k;
+    static const int cos_mod[16] = {
+        4199362,   4240198,   4323885,   4454708,
+        4639772,   4890013,   5221943,   5660703,
+        -6245623,  -7040975,  -8158494,  -9809974,
+        -12450076, -17261920, -28585092, -85479984
+    };
+
+    for (i = 0; i < 8; i++)
+        output[i] = dca_norm((int64_t)cos_mod[i] * (input[i] + input[8 + i]), 23);
+
+    for (i = 8, k = 7; i < 16; i++, k--)
+        output[i] = dca_norm((int64_t)cos_mod[i] * (input[k] - input[8 + k]), 23);
+}
+
+static void mod_b(int * restrict input, int * restrict output)
+{
+    int i, k;
+    static const int cos_mod[8] = {
+        4214598,  4383036,  4755871,  5425934,
+        6611520,  8897610, 14448934, 42791536
+    };
+
+    for (i = 0; i < 8; i++)
+        input[8 + i] = dca_norm((int64_t)cos_mod[i] * input[8 + i], 23);
+
+    for (i = 0; i < 8; i++)
+        output[i] = input[i] + input[8 + i];
+
+    for (i = 8, k = 7; i < 16; i++, k--)
+        output[i] = input[k] - input[8 + k];
+}
+
+static void mod_c(const int * restrict input, int * restrict output)
+{
+    int i, k;
+    static const int cos_mod[32] = {
+        1048892,  1051425,   1056522,   1064244,
+        1074689,  1087987,   1104313,   1123884,
+        1146975,  1173922,   1205139,   1241133,
+        1282529,  1330095,   1384791,   1447815,
+        -1520688, -1605358,  -1704360,  -1821051,
+        -1959964, -2127368,  -2332183,  -2587535,
+        -2913561, -3342802,  -3931480,  -4785806,
+        -6133390, -8566050, -14253820, -42727120
+    };
+
+    for (i = 0; i < 16; i++)
+        output[i] = dca_norm((int64_t)cos_mod[i] * (input[i] + input[16 + i]), 23);
+
+    for (i = 16, k = 15; i < 32; i++, k--)
+        output[i] = dca_norm((int64_t)cos_mod[i] * (input[k] - input[16 + k]), 23);
+}
+
+void idct_perform32_fixed(int * restrict input, int * restrict output)
+{
+    int mag = 0;
+    int shift, round;
+    int i;
+
+    for (i = 0; i < 32; i++)
+        mag += abs(input[i]);
+
+    shift = mag > 0x400000 ? 2 : 0;
+    round = shift > 0 ? 1 << (shift - 1) : 0;
+
+    for (i = 0; i < 32; i++)
+        input[i] = (input[i] + round) >> shift;
+
+    sum_a(input, output +  0, 16);
+    sum_b(input, output + 16, 16);
+    clp_v(output, 32);
+
+    sum_a(output +  0, input +  0, 8);
+    sum_b(output +  0, input +  8, 8);
+    sum_c(output + 16, input + 16, 8);
+    sum_d(output + 16, input + 24, 8);
+    clp_v(input, 32);
+
+    dct_a(input +  0, output +  0);
+    dct_b(input +  8, output +  8);
+    dct_b(input + 16, output + 16);
+    dct_b(input + 24, output + 24);
+    clp_v(output, 32);
+
+    mod_a(output +  0, input +  0);
+    mod_b(output + 16, input + 16);
+    clp_v(input, 32);
+
+    mod_c(input, output);
+
+    for (i = 0; i < 32; i++)
+        output[i] = dca_clip23(output[i] * (1 << shift));
+}
+
+void qmf_32_subbands_fixed(int subband_samples[32][8], int **subband_samples_hi, int *history,
+                           int *pcm_samples, int nb_samples, int perfect)
+{
+    const int32_t *filter_coeff;
+    int input[32];
+    int output[32];
+
+    // Select the perfect or non-perfect reconstruction filter
+    if (!perfect)
+        filter_coeff = ff_dca_fir_32bands_nonperfect_fixed;
+    else
+        filter_coeff = ff_dca_fir_32bands_perfect_fixed;
+
+    for (int sample = 0; sample < nb_samples; sample++) {
+        int i, j, k;
+
+        // Load in one sample from each subband
+        for (i = 0; i < 32; i++) {
+            input[i] = subband_samples[i][sample];
+            //printf("%d\n", input[i]);
+        }
+
+        // Inverse DCT
+        idct_perform32_fixed(input, output);
+
+        // Store history
+        for (i = 0, k = 31; i < 16; i++, k--) {
+            history[     i] = dca_clip23(output[i] - output[k]);
+            history[16 + i] = dca_clip23(output[i] + output[k]);
+        }
+
+        // One subband sample generates 32 interpolated ones
+        for (i = 0; i < 16; i++) {
+            // Clear accumulation
+            int64_t res = INT64_C(0);
+
+            // Accumulate
+            for (j = 32; j < 512; j += 64)
+                res += (int64_t)history[16 + i + j] * filter_coeff[i + j];
+            res = dca_round(res, 21);
+            for (j =  0; j < 512; j += 64)
+                res += (int64_t)history[     i + j] * filter_coeff[i + j];
+
+            // Save interpolated samples
+            pcm_samples[sample * 32 + i] = dca_clip23(dca_norm(res, 21));
+           // printf("%lf \n", (float)dca_clip23(dca_norm(res, 21)));
+
+        }
+
+        for (i = 16, k = 15; i < 32; i++, k--) {
+            // Clear accumulation
+            int64_t res = INT64_C(0);
+
+            // Accumulate
+            for (j = 32; j < 512; j += 64)
+                res += (int64_t)history[16 + k + j] * filter_coeff[i + j];
+            res = dca_round(res, 21);
+            for (j =  0; j < 512; j += 64)
+                res += (int64_t)history[     k + j] * filter_coeff[i + j];
+
+            // Save interpolated samples
+            pcm_samples[sample * 32 + i] = dca_clip23(dca_norm(res, 21));
+           // printf("%lf \n", (float)dca_clip23(dca_norm(res, 21)));
+            //printf("%d\n", dca_clip23(dca_norm(res, 21)));
+        }
+
+        // Shift history
+        for (i = 511; i >= 32; i--)
+            history[i] = history[i - 32];
+    }
+}
+
+static void mod64_a(const int * restrict input, int * restrict output)
+{
+    int i, k;
+    static const int cos_mod[32] = {
+        4195568,   4205700,   4226086,    4256977,
+        4298755,   4351949,   4417251,    4495537,
+        4587901,   4695690,   4820557,    4964534,
+        5130115,   5320382,   5539164,    5791261,
+        -6082752,  -6421430,  -6817439,   -7284203,
+        -7839855,  -8509474,  -9328732,  -10350140,
+        -11654242, -13371208, -15725922,  -19143224,
+        -24533560, -34264200, -57015280, -170908480
+    };
+
+    for (i = 0; i < 16; i++)
+        output[i] = dca_norm((int64_t)cos_mod[i] * (input[i] + input[16 + i]), 23);
+
+    for (i = 16, k = 15; i < 32; i++, k--)
+        output[i] = dca_norm((int64_t)cos_mod[i] * (input[k] - input[16 + k]), 23);
+}
+
+static void mod64_b(int * restrict input, int * restrict output)
+{
+    int i, k;
+    static const int cos_mod[16] = {
+         4199362,  4240198,  4323885,  4454708,
+         4639772,  4890013,  5221943,  5660703,
+         6245623,  7040975,  8158494,  9809974,
+        12450076, 17261920, 28585092, 85479984
+    };
+
+    for (i = 0; i < 16; i++)
+        input[16 + i] = dca_norm((int64_t)cos_mod[i] * input[16 + i], 23);
+
+    for (i = 0; i < 16; i++)
+        output[i] = input[i] + input[16 + i];
+
+    for (i = 16, k = 15; i < 32; i++, k--)
+        output[i] = input[k] - input[16 + k];
+}
+
+static void mod64_c(const int * restrict input, int * restrict output)
+{
+    int i, k;
+    static const int cos_mod[64] = {
+          741511,    741958,    742853,    744199,
+          746001,    748262,    750992,    754197,
+          757888,    762077,    766777,    772003,
+          777772,    784105,    791021,    798546,
+          806707,    815532,    825054,    835311,
+          846342,    858193,    870912,    884554,
+          899181,    914860,    931667,    949686,
+          969011,    989747,   1012012,   1035941,
+        -1061684,  -1089412,  -1119320,  -1151629,
+        -1186595,  -1224511,  -1265719,  -1310613,
+        -1359657,  -1413400,  -1472490,  -1537703,
+        -1609974,  -1690442,  -1780506,  -1881904,
+        -1996824,  -2128058,  -2279225,  -2455101,
+        -2662128,  -2909200,  -3208956,  -3579983,
+        -4050785,  -4667404,  -5509372,  -6726913,
+        -8641940, -12091426, -20144284, -60420720
+    };
+
+    for (i = 0; i < 32; i++)
+        output[i] = dca_norm((int64_t)cos_mod[i] * (input[i] + input[32 + i]), 23);
+
+    for (i = 32, k = 31; i < 64; i++, k--)
+        output[i] = dca_norm((int64_t)cos_mod[i] * (input[k] - input[32 + k]), 23);
+}
+
+void idct_perform64_fixed(int * restrict input, int * restrict output)
+{
+    int mag = 0;
+    int shift;
+    int round;
+    int i;
+
+    for (i = 0; i < 64; i++)
+        mag += abs(input[i]);
+
+    shift = mag > 0x400000 ? 2 : 0;
+    round = shift > 0 ? 1 << (shift - 1) : 0;
+
+    for (i = 0; i < 64; i++)
+        input[i] = (input[i] + round) >> shift;
+
+    sum_a(input, output +  0, 32);
+    sum_b(input, output + 32, 32);
+    clp_v(output, 64);
+
+    sum_a(output +  0, input +  0, 16);
+    sum_b(output +  0, input + 16, 16);
+    sum_c(output + 32, input + 32, 16);
+    sum_d(output + 32, input + 48, 16);
+    clp_v(input, 64);
+
+    sum_a(input +  0, output +  0, 8);
+    sum_b(input +  0, output +  8, 8);
+    sum_c(input + 16, output + 16, 8);
+    sum_d(input + 16, output + 24, 8);
+    sum_c(input + 32, output + 32, 8);
+    sum_d(input + 32, output + 40, 8);
+    sum_c(input + 48, output + 48, 8);
+    sum_d(input + 48, output + 56, 8);
+    clp_v(output, 64);
+
+    dct_a(output +  0, input +  0);
+    dct_b(output +  8, input +  8);
+    dct_b(output + 16, input + 16);
+    dct_b(output + 24, input + 24);
+    dct_b(output + 32, input + 32);
+    dct_b(output + 40, input + 40);
+    dct_b(output + 48, input + 48);
+    dct_b(output + 56, input + 56);
+    clp_v(input, 64);
+
+    mod_a(input +  0, output +  0);
+    mod_b(input + 16, output + 16);
+    mod_b(input + 32, output + 32);
+    mod_b(input + 48, output + 48);
+    clp_v(output, 64);
+
+    mod64_a(output +  0, input +  0);
+    mod64_b(output + 32, input + 32);
+    clp_v(input, 64);
+
+    mod64_c(input, output);
+
+    for (i = 0; i < 64; i++)
+        output[i] = dca_clip23(output[i] * (1 << shift));
+}
+
+void qmf_64_subbands_fixed(int subband_samples[64][8], int **subband_samples_hi, int *history,
+                           int *pcm_samples, int nb_samples)
+{
+    int output[64];
+    int sample;
+
+    // Interpolation begins
+    for (sample = 0; sample < nb_samples; sample++) {
+        int i, j, k;
+
+        // Load in one sample from each subband
+        int input[64];
+        if (subband_samples_hi) {
+            // Full 64 subbands, first 32 are residual coded
+            for (i =  0; i < 32; i++)
+                input[i] = subband_samples[i][sample] + subband_samples_hi[i][sample];
+            for (i = 32; i < 64; i++)
+                input[i] = subband_samples_hi[i][sample];
+        } else {
+            // Only first 32 subbands
+            for (i =  0; i < 32; i++)
+                input[i] = subband_samples[i][sample];
+            for (i = 32; i < 64; i++)
+                input[i] = 0;
+        }
+
+        // Inverse DCT
+        idct_perform64_fixed(input, output);
+
+        // Store history
+        for (i = 0, k = 63; i < 32; i++, k--) {
+            history[     i] = dca_clip23(output[i] - output[k]);
+            history[32 + i] = dca_clip23(output[i] + output[k]);
+        }
+
+        // One subband sample generates 64 interpolated ones
+        for (i = 0; i < 32; i++) {
+            // Clear accumulation
+            int64_t res = INT64_C(0);
+
+            // Accumulate
+            for (j = 64; j < 1024; j += 128)
+                res += (int64_t)history[32 + i + j] * ff_dca_band_fir_x96[i + j];
+            res = dca_round(res, 20);
+            for (j =  0; j < 1024; j += 128)
+                res += (int64_t)history[     i + j] * ff_dca_band_fir_x96[i + j];
+
+            // Save interpolated samples
+            pcm_samples[sample * 64 + i] = dca_clip23(dca_norm(res, 20));
+        }
+
+        for (i = 32, k = 31; i < 64; i++, k--) {
+            // Clear accumulation
+            int64_t res = INT64_C(0);
+
+            // Accumulate
+            for (j = 64; j < 1024; j += 128)
+                res += (int64_t)history[32 + k + j] * ff_dca_band_fir_x96[i + j];
+            res = dca_round(res, 20);
+            for (j =  0; j < 1024; j += 128)
+                res += (int64_t)history[     k + j] * ff_dca_band_fir_x96[i + j];
+
+            // Save interpolated samples
+            pcm_samples[sample * 64 + i] = dca_clip23(dca_norm(res, 20));
+        }
+
+        // Shift history
+        for (i = 1023; i >= 64; i--)
+            history[i] = history[i - 64];
+    }
+}
+
+void lfe_interpolation_fir_fixed(int *pcm_samples, int *lfe_samples,
+                                 int nb_samples, int synth_x96)
+{
+    int dec_factor = 64;
+
+    // Interpolation
+    for (int i = 0; i < nb_samples; i++) {
+        // One decimated sample generates 64 or 128 interpolated ones
+        for (int j = 0; j < dec_factor; j++) {
+            // Clear accumulation
+            int64_t res = INT64_C(0);
+
+            // Accumulate
+            for (int k = 0; k < 512 / dec_factor; k++)
+                res += (int64_t)ff_dca_lfe_fir_64[k * dec_factor + j] *
+                    lfe_samples[12 + i - k];
+
+            // Save interpolated samples
+            pcm_samples[(i * dec_factor + j) << synth_x96] = dca_clip23(dca_norm(res, 23));
+        }
+    }
+
+    // Update history
+    for (int n = 12 - 1; n >= 0; n--)
+        lfe_samples[n] = lfe_samples[nb_samples + n];
+}
diff --git a/libavcodec/dcadsp.h b/libavcodec/dcadsp.h
index 0fa75a5..8eff7de 100644
--- a/libavcodec/dcadsp.h
+++ b/libavcodec/dcadsp.h
@@ -14,6 +14,10 @@ 
  * You should have received a copy of the GNU Lesser General Public
  * License along with Libav; if not, write to the Free Software
  * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ *
+ * The functions idct_perform32_fixed, qmf_32_subbands_fixed, idct_perform64_fixed,
+ * qmf_64_subbands_fixed, lfe_interpolation_fir_fixed and the auxiliary functions
+ * they use are adapted from libdcadec, https://github.com/foo86/dcadec/tree/master/libdcadec.
  */
 
 #ifndef AVCODEC_DCADSP_H
@@ -25,7 +29,7 @@ 
 #define DCA_SUBBANDS 32
 
 typedef struct DCADSPContext {
-    void (*lfe_fir[2])(float *out, const float *in, const float *coefs);
+    void (*lfe_fir[2])(void *out, const int *in, const float *coefs);
     void (*qmf_32_subbands)(float samples_in[32][8], int sb_act,
                             SynthFilterContext *synth, FFTContext *imdct,
                             float synth_buf_ptr[512],
@@ -43,4 +47,13 @@  void ff_dcadsp_init(DCADSPContext *s);
 void ff_dcadsp_init_arm(DCADSPContext *s);
 void ff_dcadsp_init_x86(DCADSPContext *s);
 
+void idct_perform32_fixed(int * restrict input, int * restrict output);
+void qmf_32_subbands_fixed(int subband_samples[32][8], int **subband_samples_hi,
+                           int *history, int *pcm_samples, int nb_samples, int perfect);
+void idct_perform64_fixed(int * restrict input, int * restrict output);
+void qmf_64_subbands_fixed(int subband_samples[64][8], int **subband_samples_hi,
+                           int *history, int *pcm_samples, int nb_samples);
+void lfe_interpolation_fir_fixed(int *pcm_samples, int *lfe_samples,
+                                 int nb_samples, int synth_x96);
+
 #endif /* AVCODEC_DCADSP_H */
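
The fixed-point routines above lean on dca_norm() and dca_clip23() from dcamath.h, which is not part of this excerpt. The cos_mod tables look like cosine values scaled by 2^23 (for example 8348215 / 2^23 is roughly 0.9952), so each 64-bit accumulation is presumably rounded back by 23 bits and then saturated to 24 bits. A minimal sketch of that pattern follows. It assumes the helpers behave like their libdcadec counterparts; the sketch_* names are hypothetical stand-ins, not the actual dcamath.h definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for dca_norm(): add half an LSB, then drop `bits`
 * fractional bits. Assumes >> on a negative int64_t is an arithmetic shift,
 * which is implementation-defined in C but holds on common compilers. */
static inline int32_t sketch_norm(int64_t a, int bits)
{
    assert(bits > 0 && bits < 63);
    return (int32_t)((a + (INT64_C(1) << (bits - 1))) >> bits);
}

/* Hypothetical stand-in for dca_clip23(): saturate to the signed 24-bit
 * range [-2^23, 2^23 - 1]. */
static inline int32_t sketch_clip23(int32_t a)
{
    const int32_t max = (1 << 23) - 1;
    const int32_t min = -(1 << 23);

    return a > max ? max : a < min ? min : a;
}

/* One output of a Q23 dot product, in the style of a dct_a() row followed
 * by clp_v(): accumulate in 64 bits, normalize by 23 bits, then clip. */
static int32_t sketch_dot_q23(const int32_t *coef, const int32_t *in, int n)
{
    int64_t res = 0;
    int i;

    for (i = 0; i < n; i++)
        res += (int64_t)coef[i] * in[i];
    return sketch_clip23(sketch_norm(res, 23));
}
```

Accumulating in int64_t before normalizing keeps the intermediate products exact, which is what makes the result reproducible bit for bit across platforms, unlike the float path.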