bug 1176196 - update libjpeg-turbo to 1.4.2. r=jrmuizel
author Ted Mielczarek <ted@mielczarek.org>
Mon, 05 Oct 2015 09:31:36 -0400
changeset 266415 8543b8749c4e20418ea17649088324e264ff4d18
parent 266414 f641c7662bddcf35262c483455f45e1f798e1b09
child 266416 bab7142595b4e4f1266eab4f612b610d40f47905
push id 29493
push user kwierso@gmail.com
push date Wed, 07 Oct 2015 17:31:17 +0000
treeherder mozilla-central@49d87bbe0122
reviewers jrmuizel
bugs 1176196
milestone 44.0a1
media/libjpeg/MOZCHANGES
media/libjpeg/README-turbo.txt
media/libjpeg/jccolor.c
media/libjpeg/jcdctmgr.c
media/libjpeg/jchuff.c
media/libjpeg/jconfig.h
media/libjpeg/jdarith.c
media/libjpeg/jdcolor.c
media/libjpeg/jddctmgr.c
media/libjpeg/jdhuff.c
media/libjpeg/jdhuff.h
media/libjpeg/jdphuff.c
media/libjpeg/jfdctint.c
media/libjpeg/jidctint.c
media/libjpeg/jidctred.c
media/libjpeg/jmorecfg.h
media/libjpeg/jpegint.h
media/libjpeg/jversion.h
media/libjpeg/mozilla.diff
media/libjpeg/simd/jccolext-sse2-64.asm
media/libjpeg/simd/jcgryext-sse2-64.asm
media/libjpeg/simd/jcsample-sse2-64.asm
media/libjpeg/simd/jdcolext-sse2-64.asm
media/libjpeg/simd/jdmrgext-sse2-64.asm
media/libjpeg/simd/jdsample-sse2-64.asm
media/libjpeg/simd/jidctflt-sse2-64.asm
media/libjpeg/simd/jidctfst-sse2-64.asm
media/libjpeg/simd/jidctint-sse2-64.asm
media/libjpeg/simd/jidctred-sse2-64.asm
media/libjpeg/simd/jquantf-sse2-64.asm
media/libjpeg/simd/jquanti-sse2-64.asm
media/libjpeg/simd/jsimd_mips.c
media/libjpeg/simd/jsimd_mips_dspr2.S
media/update-libjpeg.sh
--- a/media/libjpeg/MOZCHANGES
+++ b/media/libjpeg/MOZCHANGES
@@ -1,21 +1,17 @@
 To upgrade to a new revision of libjpeg-turbo, do the following:
 
-* Check out libjpeg-turbo from SVN:
+* Check out libjpeg-turbo from git:
 
-    $ svn co https://libjpeg-turbo.svn.sourceforge.net/svnroot/libjpeg-turbo/trunk libjpeg-turbo
-
-* In a clean clone of mozilla-central, run the following commands
+    $ git clone https://github.com/libjpeg-turbo/libjpeg-turbo.git
 
-    $ rm -rf media/libjpeg
-    $ svn export --ignore-externals /path/to/libjpeg-turbo media/libjpeg
-    $ cd media/libjpeg
+* In a clean clone of mozilla-central, run the update script (tag defaults to HEAD):
 
-* Copy win/jsimdcfg.inc to simd/.
+    $ ./media/update-libjpeg.sh /path/to/libjpeg-turbo [tag]
 
 * Since libjpeg-turbo normally creates jconfig.h and jconfigint.h at build time
   and we use pre-generated versions, changes to jconfig.h.in and jconfigint.h.in
   should be looked for and noted for later inclusion.
 
 * Now look through the new files and rm any which are npotb.  When I upgraded
   to libjpeg-turbo 1.1.0, the only files I kept which didn't match
 
@@ -36,33 +32,28 @@ To upgrade to a new revision of libjpeg-
   A helpful command for finding the *.c files which aren't *currently* part of
   the build is
 
     diff <(ls *.c | sort) <(grep -o '\w*\.c' Makefile.in | sort)
 
   Of course, libjpeg-turbo might have added some new source files, so you'll
   have to look though and figure out which of these files to keep.
 
-* Restore files modified in the Mozilla repository.
-
-    $ hg revert --no-backup jconfig.h jconfigint.h Makefile.in MOZCHANGES \
-      mozilla.diff simd/Makefile.in genTables.py
-
 * Update jconfig.h and jconfigint.h as noted previously.
 
-* Apply Mozilla-specific changes to upstream files.
-
-    $ patch -p0 -i mozilla.diff
-
-* Update Makefile.in to build any new files.
+* Update moz.build to build any new files.
 
 * Finally, tell hg that we've added or removed some files:
 
     $ hg addremove
 
+== October 5, 2015 (libjpeg-turbo v1.4.2 d8da49effe6460d55239c4c009c57f42d8e4a494 2015-09-21) ==
+
+* Updated to v1.4.2 release.
+
 == January 15, 2015 (libjpeg-turbo v1.4.0 r1481 2015-01-07) ==
 
 * Updated to v1.4.0 release.
 
 == March 24, 2014 (libjpeg-turbo v1.3.1 r1205 2014-03-22) ==
 
 * Updated to v1.3.1 release.
 
--- a/media/libjpeg/README-turbo.txt
+++ b/media/libjpeg/README-turbo.txt
@@ -23,43 +23,18 @@ early 2010, libjpeg-turbo spun off into 
 of making high-speed JPEG compression/decompression technology available to a
 broader range of users and developers.
 
 
 *******************************************************************************
 **     License
 *******************************************************************************
 
-Most of libjpeg-turbo inherits the non-restrictive, BSD-style license used by
-libjpeg (see README.)  The TurboJPEG wrapper (both C and Java versions) and
-associated test programs bear a similar license, which is reproduced below:
-
-Redistribution and use in source and binary forms, with or without
-modification, are permitted provided that the following conditions are met:
-
-- Redistributions of source code must retain the above copyright notice,
-  this list of conditions and the following disclaimer.
-- Redistributions in binary form must reproduce the above copyright notice,
-  this list of conditions and the following disclaimer in the documentation
-  and/or other materials provided with the distribution.
-- Neither the name of the libjpeg-turbo Project nor the names of its
-  contributors may be used to endorse or promote products derived from this
-  software without specific prior written permission.
-
-THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS",
-AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
-IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
-ARE DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR CONTRIBUTORS BE
-LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
-CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
-SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
-INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
-CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
-ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
-POSSIBILITY OF SUCH DAMAGE.
+libjpeg-turbo is covered by three compatible BSD-style open source licenses.
+Refer to LICENSE.txt for a roll-up of license terms.
 
 
 *******************************************************************************
 **     Using libjpeg-turbo
 *******************************************************************************
 
 libjpeg-turbo includes two APIs that can be used to compress and decompress
 JPEG images:
@@ -306,18 +281,19 @@ following reasons:
    v8a as opposed to the algorithm used in libjpeg v6b.  It should be noted,
    however, that this algorithm basically brings the accuracy of the floating
    point IDCT in line with the accuracy of the slow integer IDCT.  The floating
    point DCT/IDCT algorithms are mainly a legacy feature, and they do not
    produce significantly more accuracy than the slow integer algorithms (to put
    numbers on this, the typical difference in PNSR between the two algorithms
    is less than 0.10 dB, whereas changing the quality level by 1 in the upper
    range of the quality scale is typically more like a 1.0 dB difference.)
--- When not using the SIMD extensions, then the accuracy of the floating point
-   DCT/IDCT can depend on the compiler and compiler settings.
+-- If the floating point algorithms in libjpeg-turbo are not implemented using
+   SIMD instructions on a particular platform, then the accuracy of the
+   floating point DCT/IDCT can depend on the compiler settings.
 
 While libjpeg-turbo does emulate the libjpeg v8 API/ABI, under the hood, it is
 still using the same algorithms as libjpeg v6b, so there are several specific
 cases in which libjpeg-turbo cannot be expected to produce the same output as
 libjpeg v8:
 
 -- When decompressing using scaling factors of 1/2 and 1/4, because libjpeg v8
    implements those scaling algorithms differently than libjpeg v6b does, and
--- a/media/libjpeg/jccolor.c
+++ b/media/libjpeg/jccolor.c
@@ -1,16 +1,16 @@
 /*
  * jccolor.c
  *
  * This file was part of the Independent JPEG Group's software:
  * Copyright (C) 1991-1996, Thomas G. Lane.
  * libjpeg-turbo Modifications:
  * Copyright 2009 Pierre Ossman <ossman@cendio.se> for Cendio AB
- * Copyright (C) 2009-2012, D. R. Commander.
+ * Copyright (C) 2009-2012, 2015 D. R. Commander.
  * Copyright (C) 2014, MIPS Technologies, Inc., California
  * For conditions of distribution and use, see the accompanying README file.
  *
  * This file contains input colorspace conversion routines.
  */
 
 #define JPEG_INTERNALS
 #include "jinclude.h"
@@ -459,34 +459,64 @@ grayscale_convert (j_compress_ptr cinfo,
  */
 
 METHODDEF(void)
 null_convert (j_compress_ptr cinfo,
               JSAMPARRAY input_buf, JSAMPIMAGE output_buf,
               JDIMENSION output_row, int num_rows)
 {
   register JSAMPROW inptr;
-  register JSAMPROW outptr;
+  register JSAMPROW outptr, outptr0, outptr1, outptr2, outptr3;
   register JDIMENSION col;
   register int ci;
   int nc = cinfo->num_components;
   JDIMENSION num_cols = cinfo->image_width;
 
-  while (--num_rows >= 0) {
-    /* It seems fastest to make a separate pass for each component. */
-    for (ci = 0; ci < nc; ci++) {
-      inptr = *input_buf;
-      outptr = output_buf[ci][output_row];
+  if (nc == 3) {
+    while (--num_rows >= 0) {
+      inptr = *input_buf++;
+      outptr0 = output_buf[0][output_row];
+      outptr1 = output_buf[1][output_row];
+      outptr2 = output_buf[2][output_row];
+      output_row++;
       for (col = 0; col < num_cols; col++) {
-        outptr[col] = inptr[ci]; /* don't need GETJSAMPLE() here */
-        inptr += nc;
+        outptr0[col] = *inptr++;
+        outptr1[col] = *inptr++;
+        outptr2[col] = *inptr++;
       }
     }
-    input_buf++;
-    output_row++;
+  } else if (nc == 4) {
+    while (--num_rows >= 0) {
+      inptr = *input_buf++;
+      outptr0 = output_buf[0][output_row];
+      outptr1 = output_buf[1][output_row];
+      outptr2 = output_buf[2][output_row];
+      outptr3 = output_buf[3][output_row];
+      output_row++;
+      for (col = 0; col < num_cols; col++) {
+        outptr0[col] = *inptr++;
+        outptr1[col] = *inptr++;
+        outptr2[col] = *inptr++;
+        outptr3[col] = *inptr++;
+      }
+    }
+  } else {
+    while (--num_rows >= 0) {
+      /* It seems fastest to make a separate pass for each component. */
+      for (ci = 0; ci < nc; ci++) {
+        inptr = *input_buf;
+        outptr = output_buf[ci][output_row];
+        for (col = 0; col < num_cols; col++) {
+          outptr[col] = inptr[ci]; /* don't need GETJSAMPLE() here */
+          inptr += nc;
+        }
+      }
+      input_buf++;
+      output_row++;
+    }
   }
 }
 
 
 /*
  * Empty method for start_pass.
  */
 
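Note on the null_convert change above: the new nc == 3 and nc == 4 branches read each interleaved input row once and write all planes in the same pass, instead of making one pass per component. A minimal sketch of the three-component case follows. The function name split_row3 and the plain unsigned char buffers are assumptions made for illustration and are not part of libjpeg-turbo.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a libjpeg-turbo function): split one row of
 * interleaved 3-component pixels into three planar rows in a single pass,
 * in the same way as the nc == 3 branch of null_convert. */
void split_row3(const unsigned char *in, unsigned char *plane0,
                unsigned char *plane1, unsigned char *plane2,
                size_t num_cols)
{
  size_t col;

  assert(in != NULL && plane0 != NULL && plane1 != NULL && plane2 != NULL);

  for (col = 0; col < num_cols; col++) {
    /* Each pixel contributes one sample to each plane. */
    plane0[col] = *in++;
    plane1[col] = *in++;
    plane2[col] = *in++;
  }
}
```

The generic fallback in the patch keeps the older one-pass-per-component loop, so component counts other than 3 and 4 behave as before.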
--- a/media/libjpeg/jcdctmgr.c
+++ b/media/libjpeg/jcdctmgr.c
@@ -1,17 +1,17 @@
 /*
  * jcdctmgr.c
  *
  * This file was part of the Independent JPEG Group's software:
  * Copyright (C) 1994-1996, Thomas G. Lane.
  * libjpeg-turbo Modifications:
  * Copyright (C) 1999-2006, MIYASAKA Masaru.
  * Copyright 2009 Pierre Ossman <ossman@cendio.se> for Cendio AB
- * Copyright (C) 2011, 2014 D. R. Commander
+ * Copyright (C) 2011, 2014-2015 D. R. Commander
  * For conditions of distribution and use, see the accompanying README file.
  *
  * This file contains the forward-DCT management logic.
  * This code selects a particular DCT implementation to be used,
  * and it performs related housekeeping chores including coefficient
  * quantization.
  */
 
@@ -170,16 +170,29 @@ flss (UINT16 val)
 
 LOCAL(int)
 compute_reciprocal (UINT16 divisor, DCTELEM * dtbl)
 {
   UDCTELEM2 fq, fr;
   UDCTELEM c;
   int b, r;
 
+  if (divisor == 1) {
+    /* divisor == 1 means unquantized, so these reciprocal/correction/shift
+     * values will cause the C quantization algorithm to act like the
+     * identity function.  Since only the C quantization algorithm is used in
+     * these cases, the scale value is irrelevant.
+     */
+    dtbl[DCTSIZE2 * 0] = (DCTELEM) 1;                       /* reciprocal */
+    dtbl[DCTSIZE2 * 1] = (DCTELEM) 0;                       /* correction */
+    dtbl[DCTSIZE2 * 2] = (DCTELEM) 1;                       /* scale */
+    dtbl[DCTSIZE2 * 3] = (DCTELEM) (-sizeof(DCTELEM) * 8);  /* shift */
+    return 0;
+  }
+
   b = flss(divisor) - 1;
   r  = sizeof(DCTELEM) * 8 + b;
 
   fq = ((UDCTELEM2)1 << r) / divisor;
   fr = ((UDCTELEM2)1 << r) % divisor;
 
   c = divisor / 2; /* for rounding */
 
@@ -390,17 +403,18 @@ METHODDEF(void)
 quantize (JCOEFPTR coef_block, DCTELEM * divisors, DCTELEM * workspace)
 {
   int i;
   DCTELEM temp;
   JCOEFPTR output_ptr = coef_block;
 
 #if BITS_IN_JSAMPLE == 8
 
-  UDCTELEM recip, corr, shift;
+  UDCTELEM recip, corr;
+  int shift;
   UDCTELEM2 product;
 
   for (i = 0; i < DCTSIZE2; i++) {
     temp = workspace[i];
     recip = divisors[i + DCTSIZE2 * 0];
     corr =  divisors[i + DCTSIZE2 * 1];
     shift = divisors[i + DCTSIZE2 * 3];
 
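Note on the divisor == 1 case above: the stored triple (reciprocal 1, correction 0, shift -sizeof(DCTELEM) * 8) is meant to turn the C quantization step into the identity function, and the shift variable in quantize() becomes a signed int because the stored shift can now be negative. The sketch below assumes that the quantization loop computes (value + correction) * reciprocal and then shifts right by shift + sizeof(DCTELEM) * 8; that formula is not visible in this hunk. The type names and the function quantize_magnitude are illustrative stand-ins only.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins for DCTELEM, UDCTELEM and UDCTELEM2 (assumed
 * to be 16-bit, 16-bit and 32-bit types for this sketch). */
typedef int16_t  dct_elem;
typedef uint16_t udct_elem;
typedef uint32_t udct_elem2;

/* Hypothetical helper: quantize a non-negative magnitude using a
 * (reciprocal, correction, shift) triple.  The total shift is assumed to
 * be shift + sizeof(dct_elem) * 8, so a stored shift of -16 cancels out. */
udct_elem quantize_magnitude(udct_elem value, udct_elem recip,
                             udct_elem corr, int shift)
{
  udct_elem2 product;
  int total_shift = shift + (int) (sizeof(dct_elem) * 8);

  /* A negative or oversized shift count would be undefined behavior. */
  assert(total_shift >= 0 && total_shift < 32);

  product = (udct_elem2) (value + corr) * recip;
  return (udct_elem) (product >> total_shift);
}
```

With recip = 1, corr = 0 and shift = -16, the total shift is 0 and the function returns its input unchanged, which is the identity behavior the comment in the patch describes.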
--- a/media/libjpeg/jchuff.c
+++ b/media/libjpeg/jchuff.c
@@ -1,15 +1,15 @@
 /*
  * jchuff.c
  *
  * This file was part of the Independent JPEG Group's software:
  * Copyright (C) 1991-1997, Thomas G. Lane.
  * libjpeg-turbo Modifications:
- * Copyright (C) 2009-2011, 2014 D. R. Commander.
+ * Copyright (C) 2009-2011, 2014-2015 D. R. Commander.
  * For conditions of distribution and use, see the accompanying README file.
  *
  * This file contains Huffman entropy encoding routines.
  *
  * Much of the complexity here has to do with supporting output suspension.
  * If the data destination module demands suspension, we want to be able to
  * back up to the start of the current MCU.  To do this, we copy state
  * variables into local working storage, and update them back to the
@@ -371,17 +371,21 @@ dump_buffer (working_state * state)
     EMIT_BYTE() \
     EMIT_BYTE() \
     EMIT_BYTE() \
     EMIT_BYTE() \
     EMIT_BYTE() \
   } \
 }
 
-#if __WORDSIZE==64 || defined(_WIN64)
+#if !defined(_WIN32) && !defined(SIZEOF_SIZE_T)
+#error Cannot determine word size
+#endif
+
+#if SIZEOF_SIZE_T==8 || defined(_WIN64)
 
 #define EMIT_BITS(code, size) { \
   CHECKBUF47() \
   PUT_BITS(code, size) \
 }
 
 #define EMIT_CODE(code, size) { \
   temp2 &= (((INT32) 1)<<nbits) - 1; \
@@ -508,26 +512,24 @@ encode_one_block (working_state * state,
   temp2 += temp3;
 
   /* Find the number of bits needed for the magnitude of the coefficient */
   nbits = JPEG_NBITS(temp);
 
   /* Emit the Huffman-coded symbol for the number of bits */
   code = dctbl->ehufco[nbits];
   size = dctbl->ehufsi[nbits];
-  PUT_BITS(code, size)
-  CHECKBUF15()
+  EMIT_BITS(code, size)
 
   /* Mask off any extra bits in code */
   temp2 &= (((INT32) 1)<<nbits) - 1;
 
   /* Emit that number of bits of the value, if positive, */
   /* or the complement of its magnitude, if negative. */
-  PUT_BITS(temp2, nbits)
-  CHECKBUF15()
+  EMIT_BITS(temp2, nbits)
 
   /* Encode the AC coefficients per section F.1.2.2 */
 
   r = 0;                        /* r = run length of zeros */
 
 /* Manually unroll the k loop to eliminate the counter variable.  This
  * improves performance greatly on systems with a limited number of
  * registers (such as x86.)
--- a/media/libjpeg/jconfig.h
+++ b/media/libjpeg/jconfig.h
@@ -61,8 +61,15 @@
 /* # undef __CHAR_UNSIGNED__ */
 #endif
 
 /* Define to empty if `const' does not conform to ANSI C. */
 /* #undef const */
 
 /* Define to `unsigned int' if <sys/types.h> does not define. */
 /* #undef size_t */
+
+/* The size of `size_t', as computed by sizeof. */
+#ifdef HAVE_64BIT_BUILD
+#define SIZEOF_SIZE_T 8
+#else
+#define SIZEOF_SIZE_T 4
+#endif
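Note on SIZEOF_SIZE_T above: jchuff.c, jdhuff.c and jdhuff.h now select the 64-bit or 32-bit bit buffer from this macro instead of __WORDSIZE, so the pre-generated value has to match the real size of size_t. A minimal sketch of a compile-time cross-check follows. It derives the macro from SIZE_MAX rather than from HAVE_64BIT_BUILD, which is an assumption made so the example compiles on its own; the typedef name and bit_buf_size() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* For this sketch only: derive SIZEOF_SIZE_T from SIZE_MAX.  The patched
 * jconfig.h derives it from HAVE_64BIT_BUILD instead. */
#if SIZE_MAX > 0xFFFFFFFFu
#define SIZEOF_SIZE_T 8
#else
#define SIZEOF_SIZE_T 4
#endif

/* C89-style static assertion: the array size is negative, and the build
 * fails, if the configured value disagrees with the compiler. */
typedef char sizeof_size_t_matches[(SIZEOF_SIZE_T == sizeof(size_t)) ? 1 : -1];

/* Hypothetical helper mirroring how BIT_BUF_SIZE is chosen in jdhuff.h. */
unsigned bit_buf_size(void)
{
  assert(sizeof(size_t) == SIZEOF_SIZE_T);
  return (SIZEOF_SIZE_T == 8) ? 64u : 32u;
}
```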
--- a/media/libjpeg/jdarith.c
+++ b/media/libjpeg/jdarith.c
@@ -301,17 +301,17 @@ decode_mcu_DC_first (j_decompress_ptr ci
       st += 14;
       while (m >>= 1)
         if (arith_decode(cinfo, st)) v |= m;
       v += 1; if (sign) v = -v;
       entropy->last_dc_val[ci] += v;
     }
 
     /* Scale and output the DC coefficient (assumes jpeg_natural_order[0]=0) */
-    (*block)[0] = (JCOEF) (entropy->last_dc_val[ci] << cinfo->Al);
+    (*block)[0] = (JCOEF) LEFT_SHIFT(entropy->last_dc_val[ci], cinfo->Al);
   }
 
   return TRUE;
 }
 
 
 /*
  * MCU decoding for AC initial scan (either spectral selection,
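Note on LEFT_SHIFT above: in C, left-shifting a negative signed integer is undefined behavior, and last_dc_val can be negative. The macro itself is defined elsewhere in this changeset (jpegint.h is in the file list) and is not shown in this hunk, so the sketch below only illustrates the usual remedy, which is to shift in an unsigned type and convert back. The function name left_shift_i32 is an assumption for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not the actual LEFT_SHIFT macro): shift a signed
 * 32-bit value left without invoking undefined behavior.  The shift is
 * done on the unsigned representation; converting the result back to
 * int32_t is implementation-defined, and yields the expected value on
 * two's complement platforms. */
int32_t left_shift_i32(int32_t value, unsigned bits)
{
  assert(bits < 32);
  return (int32_t) ((uint32_t) value << bits);
}
```

The same substitution appears in jdphuff.c, jfdctint.c and jidctint.c later in this diff.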
--- a/media/libjpeg/jdcolor.c
+++ b/media/libjpeg/jdcolor.c
@@ -1,17 +1,17 @@
 /*
  * jdcolor.c
  *
  * This file was part of the Independent JPEG Group's software:
  * Copyright (C) 1991-1997, Thomas G. Lane.
  * Modified 2011 by Guido Vollbeding.
  * libjpeg-turbo Modifications:
  * Copyright 2009 Pierre Ossman <ossman@cendio.se> for Cendio AB
- * Copyright (C) 2009, 2011-2012, 2014, D. R. Commander.
+ * Copyright (C) 2009, 2011-2012, 2014-2015, D. R. Commander.
  * Copyright (C) 2013, Linaro Limited.
  * For conditions of distribution and use, see the accompanying README file.
  *
  * This file contains output colorspace conversion routines.
  */
 
 #define JPEG_INTERNALS
 #include "jinclude.h"
@@ -359,33 +359,63 @@ rgb_gray_convert (j_decompress_ptr cinfo
  * converting from separate-planes to interleaved representation.
  */
 
 METHODDEF(void)
 null_convert (j_decompress_ptr cinfo,
               JSAMPIMAGE input_buf, JDIMENSION input_row,
               JSAMPARRAY output_buf, int num_rows)
 {
-  register JSAMPROW inptr, outptr;
-  register JDIMENSION count;
+  register JSAMPROW inptr, inptr0, inptr1, inptr2, inptr3, outptr;
+  register JDIMENSION col;
   register int num_components = cinfo->num_components;
   JDIMENSION num_cols = cinfo->output_width;
   int ci;
 
-  while (--num_rows >= 0) {
-    for (ci = 0; ci < num_components; ci++) {
-      inptr = input_buf[ci][input_row];
-      outptr = output_buf[0] + ci;
-      for (count = num_cols; count > 0; count--) {
-        *outptr = *inptr++;     /* needn't bother with GETJSAMPLE() here */
-        outptr += num_components;
+  if (num_components == 3) {
+    while (--num_rows >= 0) {
+      inptr0 = input_buf[0][input_row];
+      inptr1 = input_buf[1][input_row];
+      inptr2 = input_buf[2][input_row];
+      input_row++;
+      outptr = *output_buf++;
+      for (col = 0; col < num_cols; col++) {
+        *outptr++ = inptr0[col];
+        *outptr++ = inptr1[col];
+        *outptr++ = inptr2[col];
       }
     }
-    input_row++;
-    output_buf++;
+  } else if (num_components == 4) {
+    while (--num_rows >= 0) {
+      inptr0 = input_buf[0][input_row];
+      inptr1 = input_buf[1][input_row];
+      inptr2 = input_buf[2][input_row];
+      inptr3 = input_buf[3][input_row];
+      input_row++;
+      outptr = *output_buf++;
+      for (col = 0; col < num_cols; col++) {
+        *outptr++ = inptr0[col];
+        *outptr++ = inptr1[col];
+        *outptr++ = inptr2[col];
+        *outptr++ = inptr3[col];
+      }
+    }
+  } else {
+    while (--num_rows >= 0) {
+      for (ci = 0; ci < num_components; ci++) {
+        inptr = input_buf[ci][input_row];
+        outptr = *output_buf;
+        for (col = 0; col < num_cols; col++) {
+          outptr[ci] = inptr[col];
+          outptr += num_components;
+        }
+      }
+      output_buf++;
+      input_row++;
+    }
   }
 }
 
 
 /*
  * Color conversion for grayscale: just copy the data.
  * This also works for YCbCr -> grayscale conversion, in which
  * we just copy the Y (luminance) component and ignore chrominance.
--- a/media/libjpeg/jddctmgr.c
+++ b/media/libjpeg/jddctmgr.c
@@ -176,16 +176,17 @@ start_pass (j_decompress_ptr cinfo)
         method = JDCT_FLOAT;
         break;
 #endif
       default:
         ERREXIT(cinfo, JERR_NOT_COMPILED);
         break;
       }
       break;
+#ifdef IDCT_SCALING_SUPPORTED
     case 9:
       method_ptr = jpeg_idct_9x9;
       method = JDCT_ISLOW;      /* jidctint uses islow-style table */
       break;
     case 10:
       method_ptr = jpeg_idct_10x10;
       method = JDCT_ISLOW;      /* jidctint uses islow-style table */
       break;
@@ -213,16 +214,17 @@ start_pass (j_decompress_ptr cinfo)
     case 15:
       method_ptr = jpeg_idct_15x15;
       method = JDCT_ISLOW;      /* jidctint uses islow-style table */
       break;
     case 16:
       method_ptr = jpeg_idct_16x16;
       method = JDCT_ISLOW;      /* jidctint uses islow-style table */
       break;
+#endif
     default:
       ERREXIT1(cinfo, JERR_BAD_DCTSIZE, compptr->_DCT_scaled_size);
       break;
     }
     idct->pub.inverse_DCT[ci] = method_ptr;
     /* Create multiplier table from quant table.
      * However, we can skip this if the component is uninteresting
      * or if we already built the table.  Also, if no quant table
--- a/media/libjpeg/jdhuff.c
+++ b/media/libjpeg/jdhuff.c
@@ -86,36 +86,37 @@ typedef huff_entropy_decoder * huff_entr
  * Initialize for a Huffman-compressed scan.
  */
 
 METHODDEF(void)
 start_pass_huff_decoder (j_decompress_ptr cinfo)
 {
   huff_entropy_ptr entropy = (huff_entropy_ptr) cinfo->entropy;
   int ci, blkn, dctbl, actbl;
+  d_derived_tbl **pdtbl;
   jpeg_component_info * compptr;
 
   /* Check that the scan parameters Ss, Se, Ah/Al are OK for sequential JPEG.
    * This ought to be an error condition, but we make it a warning because
    * there are some baseline files out there with all zeroes in these bytes.
    */
   if (cinfo->Ss != 0 || cinfo->Se != DCTSIZE2-1 ||
       cinfo->Ah != 0 || cinfo->Al != 0)
     WARNMS(cinfo, JWRN_NOT_SEQUENTIAL);
 
   for (ci = 0; ci < cinfo->comps_in_scan; ci++) {
     compptr = cinfo->cur_comp_info[ci];
     dctbl = compptr->dc_tbl_no;
     actbl = compptr->ac_tbl_no;
     /* Compute derived values for Huffman tables */
     /* We may do this more than once for a table, but it's not expensive */
-    jpeg_make_d_derived_tbl(cinfo, TRUE, dctbl,
-                            & entropy->dc_derived_tbls[dctbl]);
-    jpeg_make_d_derived_tbl(cinfo, FALSE, actbl,
-                            & entropy->ac_derived_tbls[actbl]);
+    pdtbl = entropy->dc_derived_tbls + dctbl;
+    jpeg_make_d_derived_tbl(cinfo, TRUE, dctbl, pdtbl);
+    pdtbl = entropy->ac_derived_tbls + actbl;
+    jpeg_make_d_derived_tbl(cinfo, FALSE, actbl, pdtbl);
     /* Initialize DC predictions to 0 */
     entropy->saved.last_dc_val[ci] = 0;
   }
 
   /* Precalculate decoding info for each block in an MCU of this scan */
   for (blkn = 0; blkn < cinfo->blocks_in_MCU; blkn++) {
     ci = cinfo->MCU_membership[blkn];
     compptr = cinfo->cur_comp_info[ci];
@@ -414,29 +415,29 @@ jpeg_fill_bit_buffer (bitread_working_st
       cinfo->unread_marker = c1; \
       /* Back out pre-execution and fill the buffer with zero bits */ \
       buffer -= 2; \
       get_buffer &= ~0xFF; \
     } \
   } \
 }
 
-#if __WORDSIZE == 64 || defined(_WIN64)
+#if SIZEOF_SIZE_T==8 || defined(_WIN64)
 
 /* Pre-fetch 48 bytes, because the holding register is 64-bit */
 #define FILL_BIT_BUFFER_FAST \
-  if (bits_left < 16) { \
+  if (bits_left <= 16) { \
     GET_BYTE GET_BYTE GET_BYTE GET_BYTE GET_BYTE GET_BYTE \
   }
 
 #else
 
 /* Pre-fetch 16 bytes, because the holding register is 32-bit */
 #define FILL_BIT_BUFFER_FAST \
-  if (bits_left < 16) { \
+  if (bits_left <= 16) { \
     GET_BYTE GET_BYTE \
   }
 
 #endif
 
 
 /*
  * Out-of-line code for Huffman code decoding.
@@ -485,17 +486,18 @@ jpeg_huff_decode (bitread_working_state 
 /*
  * Figure F.12: extend sign bit.
  * On some machines, a shift and add will be faster than a table lookup.
  */
 
 #define AVOID_TABLES
 #ifdef AVOID_TABLES
 
-#define HUFF_EXTEND(x,s)  ((x) + ((((x) - (1<<((s)-1))) >> 31) & (((-1)<<(s)) + 1)))
+#define NEG_1 ((unsigned int)-1)
+#define HUFF_EXTEND(x,s)  ((x) + ((((x) - (1<<((s)-1))) >> 31) & (((NEG_1)<<(s)) + 1)))
 
 #else
 
 #define HUFF_EXTEND(x,s)  ((x) < extend_test[s] ? (x) + extend_offset[s] : (x))
 
 static const int extend_test[16] =   /* entry n is 2**(n-1) */
   { 0, 0x0001, 0x0002, 0x0004, 0x0008, 0x0010, 0x0020, 0x0040, 0x0080,
     0x0100, 0x0200, 0x0400, 0x0800, 0x1000, 0x2000, 0x4000 };
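Note on NEG_1 above: HUFF_EXTEND implements the sign extension of Figure F.12, where an s-bit value x below 2^(s-1) stands for the negative number x - (2^s - 1). The old macro built that offset with (-1) << s, which left-shifts a negative value; the new one shifts an unsigned all-ones value instead. A minimal sketch in function form follows, using the comparison form that also appears in the jdphuff.c version of the macro. The function name huff_extend is an assumption for illustration.

```c
#include <assert.h>

/* Hypothetical function form of HUFF_EXTEND: x is the raw s-bit value
 * read from the bitstream, with 1 <= s <= 15. */
int huff_extend(int x, int s)
{
  assert(s >= 1 && s <= 15);
  assert(x >= 0 && x < (1 << s));

  if (x < (1 << (s - 1))) {
    /* (~0u << s) + 1u is -(2^s - 1) in unsigned arithmetic.  Converting
     * it to int is implementation-defined and gives the expected negative
     * offset on two's complement platforms. */
    return x + (int) ((~0u << s) + 1u);
  }
  return x;
}
```

For s = 3, raw values 0 to 3 map to -7 to -4, and raw values 4 to 7 are returned unchanged.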
--- a/media/libjpeg/jdhuff.h
+++ b/media/libjpeg/jdhuff.h
@@ -62,17 +62,21 @@ EXTERN(void) jpeg_make_d_derived_tbl
  * as full as possible (not just to the number of bits needed; this
  * prefetching reduces the overhead cost of calling jpeg_fill_bit_buffer).
  * Note that jpeg_fill_bit_buffer may return FALSE to indicate suspension.
  * On TRUE return, jpeg_fill_bit_buffer guarantees that get_buffer contains
  * at least the requested number of bits --- dummy zeroes are inserted if
  * necessary.
  */
 
-#if __WORDSIZE == 64 || defined(_WIN64)
+#if !defined(_WIN32) && !defined(SIZEOF_SIZE_T)
+#error Cannot determine word size
+#endif
+
+#if SIZEOF_SIZE_T==8 || defined(_WIN64)
 
 typedef size_t bit_buf_type;    /* type of bit-extraction buffer */
 #define BIT_BUF_SIZE  64                /* size of buffer in bits */
 
 #else
 
 typedef INT32 bit_buf_type;     /* type of bit-extraction buffer */
 #define BIT_BUF_SIZE  32                /* size of buffer in bits */
--- a/media/libjpeg/jdphuff.c
+++ b/media/libjpeg/jdphuff.c
@@ -1,15 +1,15 @@
 /*
  * jdphuff.c
  *
  * This file was part of the Independent JPEG Group's software:
  * Copyright (C) 1995-1997, Thomas G. Lane.
- * It was modified by The libjpeg-turbo Project to include only code relevant
- * to libjpeg-turbo.
+ * libjpeg-turbo Modifications:
+ * Copyright (C) 2015, D. R. Commander.
  * For conditions of distribution and use, see the accompanying README file.
  *
  * This file contains Huffman entropy decoding routines for progressive JPEG.
  *
  * Much of the complexity here has to do with supporting input suspension.
  * If the data source module demands suspension, we want to be able to back
  * up to the start of the current MCU.  To do this, we copy state variables
  * into local working storage, and update them back to the permanent
@@ -91,16 +91,17 @@ METHODDEF(boolean) decode_mcu_AC_refine 
  */
 
 METHODDEF(void)
 start_pass_phuff_decoder (j_decompress_ptr cinfo)
 {
   phuff_entropy_ptr entropy = (phuff_entropy_ptr) cinfo->entropy;
   boolean is_DC_band, bad;
   int ci, coefi, tbl;
+  d_derived_tbl **pdtbl;
   int *coef_bit_ptr;
   jpeg_component_info * compptr;
 
   is_DC_band = (cinfo->Ss == 0);
 
   /* Validate scan parameters */
   bad = FALSE;
   if (is_DC_band) {
@@ -163,23 +164,23 @@ start_pass_phuff_decoder (j_decompress_p
   for (ci = 0; ci < cinfo->comps_in_scan; ci++) {
     compptr = cinfo->cur_comp_info[ci];
     /* Make sure requested tables are present, and compute derived tables.
      * We may build same derived table more than once, but it's not expensive.
      */
     if (is_DC_band) {
       if (cinfo->Ah == 0) {     /* DC refinement needs no table */
         tbl = compptr->dc_tbl_no;
-        jpeg_make_d_derived_tbl(cinfo, TRUE, tbl,
-                                & entropy->derived_tbls[tbl]);
+        pdtbl = entropy->derived_tbls + tbl;
+        jpeg_make_d_derived_tbl(cinfo, TRUE, tbl, pdtbl);
       }
     } else {
       tbl = compptr->ac_tbl_no;
-      jpeg_make_d_derived_tbl(cinfo, FALSE, tbl,
-                              & entropy->derived_tbls[tbl]);
+      pdtbl = entropy->derived_tbls + tbl;
+      jpeg_make_d_derived_tbl(cinfo, FALSE, tbl, pdtbl);
       /* remember the single active table */
       entropy->ac_derived_tbl = entropy->derived_tbls[tbl];
     }
     /* Initialize DC predictions to 0 */
     entropy->saved.last_dc_val[ci] = 0;
   }
 
   /* Initialize bitread state variables */
@@ -198,17 +199,18 @@ start_pass_phuff_decoder (j_decompress_p
 /*
  * Figure F.12: extend sign bit.
  * On some machines, a shift and add will be faster than a table lookup.
  */
 
 #define AVOID_TABLES
 #ifdef AVOID_TABLES
 
-#define HUFF_EXTEND(x,s)  ((x) < (1<<((s)-1)) ? (x) + (((-1)<<(s)) + 1) : (x))
+#define NEG_1 ((unsigned)-1)
+#define HUFF_EXTEND(x,s)  ((x) < (1<<((s)-1)) ? (x) + (((NEG_1)<<(s)) + 1) : (x))
 
 #else
 
 #define HUFF_EXTEND(x,s)  ((x) < extend_test[s] ? (x) + extend_offset[s] : (x))
 
 static const int extend_test[16] =   /* entry n is 2**(n-1) */
   { 0, 0x0001, 0x0002, 0x0004, 0x0008, 0x0010, 0x0020, 0x0040, 0x0080,
     0x0100, 0x0200, 0x0400, 0x0800, 0x1000, 0x2000, 0x4000 };
@@ -331,17 +333,17 @@ decode_mcu_DC_first (j_decompress_ptr ci
         r = GET_BITS(s);
         s = HUFF_EXTEND(r, s);
       }
 
       /* Convert DC difference to actual value, update last_dc_val */
       s += state.last_dc_val[ci];
       state.last_dc_val[ci] = s;
       /* Scale and output the coefficient (assumes jpeg_natural_order[0]=0) */
-      (*block)[0] = (JCOEF) (s << Al);
+      (*block)[0] = (JCOEF) LEFT_SHIFT(s, Al);
     }
 
     /* Completed MCU, so update state */
     BITREAD_SAVE_STATE(cinfo,entropy->bitstate);
     ASSIGN_STATE(entropy->saved, state);
   }
 
   /* Account for restart interval (no-op if not using restarts) */
@@ -399,17 +401,17 @@ decode_mcu_AC_first (j_decompress_ptr ci
         r = s >> 4;
         s &= 15;
         if (s) {
           k += r;
           CHECK_BIT_BUFFER(br_state, s, return FALSE);
           r = GET_BITS(s);
           s = HUFF_EXTEND(r, s);
           /* Scale and output coefficient in natural (dezigzagged) order */
-          (*block)[jpeg_natural_order[k]] = (JCOEF) (s << Al);
+          (*block)[jpeg_natural_order[k]] = (JCOEF) LEFT_SHIFT(s, Al);
         } else {
           if (r == 15) {        /* ZRL */
             k += 15;            /* skip 15 zeroes in band */
           } else {              /* EOBr, run length is 2^r + appended bits */
             EOBRUN = 1 << r;
             if (r) {            /* EOBr, r > 0 */
               CHECK_BIT_BUFFER(br_state, r, return FALSE);
               r = GET_BITS(r);
@@ -490,18 +492,18 @@ decode_mcu_DC_refine (j_decompress_ptr c
  * MCU decoding for AC successive approximation refinement scan.
  */
 
 METHODDEF(boolean)
 decode_mcu_AC_refine (j_decompress_ptr cinfo, JBLOCKROW *MCU_data)
 {
   phuff_entropy_ptr entropy = (phuff_entropy_ptr) cinfo->entropy;
   int Se = cinfo->Se;
-  int p1 = 1 << cinfo->Al;      /* 1 in the bit position being coded */
-  int m1 = (-1) << cinfo->Al;   /* -1 in the bit position being coded */
+  int p1 = 1 << cinfo->Al;        /* 1 in the bit position being coded */
+  int m1 = (NEG_1) << cinfo->Al;  /* -1 in the bit position being coded */
   register int s, k, r;
   unsigned int EOBRUN;
   JBLOCKROW block;
   JCOEFPTR thiscoef;
   BITREAD_STATE_VARS;
   d_derived_tbl * tbl;
   int num_newnz;
   int newnz_pos[DCTSIZE2];
--- a/media/libjpeg/jfdctint.c
+++ b/media/libjpeg/jfdctint.c
@@ -1,13 +1,15 @@
 /*
  * jfdctint.c
  *
+ * This file was part of the Independent JPEG Group's software.
  * Copyright (C) 1991-1996, Thomas G. Lane.
- * This file is part of the Independent JPEG Group's software.
+ * libjpeg-turbo Modifications:
+ * Copyright (C) 2015, D. R. Commander
  * For conditions of distribution and use, see the accompanying README file.
  *
  * This file contains a slow-but-accurate integer implementation of the
  * forward DCT (Discrete Cosine Transform).
  *
  * A 2-D DCT can be done by 1-D DCT on each row followed by 1-D DCT
  * on each column.  Direct algorithms are also available, but they are
  * much more complex and seem not to be any faster when reduced to code.
@@ -165,18 +167,18 @@ jpeg_fdct_islow (DCTELEM * data)
      * rotator "sqrt(2)*c1" should be "sqrt(2)*c6".
      */
 
     tmp10 = tmp0 + tmp3;
     tmp13 = tmp0 - tmp3;
     tmp11 = tmp1 + tmp2;
     tmp12 = tmp1 - tmp2;
 
-    dataptr[0] = (DCTELEM) ((tmp10 + tmp11) << PASS1_BITS);
-    dataptr[4] = (DCTELEM) ((tmp10 - tmp11) << PASS1_BITS);
+    dataptr[0] = (DCTELEM) LEFT_SHIFT(tmp10 + tmp11, PASS1_BITS);
+    dataptr[4] = (DCTELEM) LEFT_SHIFT(tmp10 - tmp11, PASS1_BITS);
 
     z1 = MULTIPLY(tmp12 + tmp13, FIX_0_541196100);
     dataptr[2] = (DCTELEM) DESCALE(z1 + MULTIPLY(tmp13, FIX_0_765366865),
                                    CONST_BITS-PASS1_BITS);
     dataptr[6] = (DCTELEM) DESCALE(z1 + MULTIPLY(tmp12, - FIX_1_847759065),
                                    CONST_BITS-PASS1_BITS);
 
     /* Odd part per figure 8 --- note paper omits factor of sqrt(2).
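
Note on the `LEFT_SHIFT` replacements in this file and the ones that follow: every `x << n` on a possibly negative intermediate is rewritten as `LEFT_SHIFT(x, n)`. The macro itself lives in jpegint.h, which is not shown in this section, so the definition below is an assumption about its general shape (shift as unsigned, then cast back), not a copy of it. `LEFT_SHIFT_SKETCH`, `scale_pass1` and `to_fixed` are hypothetical names; the values 13 and 2 are the ones used for 8-bit samples.

```c
#include <stdint.h>

/* Assumed shape of the macro: do the shift on an unsigned operand so a
 * negative input never triggers undefined behavior, then cast back. */
#define LEFT_SHIFT_SKETCH(a, b) ((int32_t)((uint32_t)(a) << (b)))

/* Scaling constants as used for 8-bit samples. */
#define CONST_BITS 13
#define PASS1_BITS 2

/* Hypothetical helper: scale a first-pass value up by 2^PASS1_BITS,
 * as in dataptr[0] = LEFT_SHIFT(tmp10 + tmp11, PASS1_BITS). */
static int32_t scale_pass1(int32_t v)
{
  return LEFT_SHIFT_SKETCH(v, PASS1_BITS);
}

/* Hypothetical helper: convert to fixed point with CONST_BITS
 * fractional bits, as in tmp0 = LEFT_SHIFT(z2 + z3, CONST_BITS). */
static int32_t to_fixed(int32_t v)
{
  return LEFT_SHIFT_SKETCH(v, CONST_BITS);
}
```

On two's-complement targets the result is bit-for-bit what the plain shift produced in practice, so the rewrite should not change decoder output; it only removes the reliance on undefined behavior.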
--- a/media/libjpeg/jidctint.c
+++ b/media/libjpeg/jidctint.c
@@ -1,14 +1,16 @@
 /*
  * jidctint.c
  *
+ * This file was part of the Independent JPEG Group's software.
  * Copyright (C) 1991-1998, Thomas G. Lane.
  * Modification developed 2002-2009 by Guido Vollbeding.
- * This file is part of the Independent JPEG Group's software.
+ * libjpeg-turbo Modifications:
+ * Copyright (C) 2015, D. R. Commander
  * For conditions of distribution and use, see the accompanying README file.
  *
  * This file contains a slow-but-accurate integer implementation of the
  * inverse DCT (Discrete Cosine Transform).  In the IJG code, this routine
  * must also perform dequantization of the input coefficients.
  *
  * A 2-D IDCT can be done by 1-D IDCT on each column followed by 1-D IDCT
  * on each row (or vice versa, but it's more convenient to emit a row at
@@ -200,17 +202,18 @@ jpeg_idct_islow (j_decompress_ptr cinfo,
      * column DCT calculations can be simplified this way.
      */
 
     if (inptr[DCTSIZE*1] == 0 && inptr[DCTSIZE*2] == 0 &&
         inptr[DCTSIZE*3] == 0 && inptr[DCTSIZE*4] == 0 &&
         inptr[DCTSIZE*5] == 0 && inptr[DCTSIZE*6] == 0 &&
         inptr[DCTSIZE*7] == 0) {
       /* AC terms all zero */
-      int dcval = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]) << PASS1_BITS;
+      int dcval = LEFT_SHIFT(DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]),
+                             PASS1_BITS);
 
       wsptr[DCTSIZE*0] = dcval;
       wsptr[DCTSIZE*1] = dcval;
       wsptr[DCTSIZE*2] = dcval;
       wsptr[DCTSIZE*3] = dcval;
       wsptr[DCTSIZE*4] = dcval;
       wsptr[DCTSIZE*5] = dcval;
       wsptr[DCTSIZE*6] = dcval;
@@ -230,18 +233,18 @@ jpeg_idct_islow (j_decompress_ptr cinfo,
 
     z1 = MULTIPLY(z2 + z3, FIX_0_541196100);
     tmp2 = z1 + MULTIPLY(z3, - FIX_1_847759065);
     tmp3 = z1 + MULTIPLY(z2, FIX_0_765366865);
 
     z2 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
     z3 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
 
-    tmp0 = (z2 + z3) << CONST_BITS;
-    tmp1 = (z2 - z3) << CONST_BITS;
+    tmp0 = LEFT_SHIFT(z2 + z3, CONST_BITS);
+    tmp1 = LEFT_SHIFT(z2 - z3, CONST_BITS);
 
     tmp10 = tmp0 + tmp3;
     tmp13 = tmp0 - tmp3;
     tmp11 = tmp1 + tmp2;
     tmp12 = tmp1 - tmp2;
 
     /* Odd part per figure 8; the matrix is unitary and hence its
      * transpose is its inverse.  i0..i3 are y7,y5,y3,y1 respectively.
@@ -332,18 +335,18 @@ jpeg_idct_islow (j_decompress_ptr cinfo,
 
     z2 = (INT32) wsptr[2];
     z3 = (INT32) wsptr[6];
 
     z1 = MULTIPLY(z2 + z3, FIX_0_541196100);
     tmp2 = z1 + MULTIPLY(z3, - FIX_1_847759065);
     tmp3 = z1 + MULTIPLY(z2, FIX_0_765366865);
 
-    tmp0 = ((INT32) wsptr[0] + (INT32) wsptr[4]) << CONST_BITS;
-    tmp1 = ((INT32) wsptr[0] - (INT32) wsptr[4]) << CONST_BITS;
+    tmp0 = LEFT_SHIFT((INT32) wsptr[0] + (INT32) wsptr[4], CONST_BITS);
+    tmp1 = LEFT_SHIFT((INT32) wsptr[0] - (INT32) wsptr[4], CONST_BITS);
 
     tmp10 = tmp0 + tmp3;
     tmp13 = tmp0 - tmp3;
     tmp11 = tmp1 + tmp2;
     tmp12 = tmp1 - tmp2;
 
     /* Odd part per figure 8; the matrix is unitary and hence its
      * transpose is its inverse.  i0..i3 are y7,y5,y3,y1 respectively.
@@ -439,17 +442,17 @@ jpeg_idct_7x7 (j_decompress_ptr cinfo, j
 
   inptr = coef_block;
   quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
   wsptr = workspace;
   for (ctr = 0; ctr < 7; ctr++, inptr++, quantptr++, wsptr++) {
     /* Even part */
 
     tmp13 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
-    tmp13 <<= CONST_BITS;
+    tmp13 = LEFT_SHIFT(tmp13, CONST_BITS);
     /* Add fudge factor here for final descale. */
     tmp13 += ONE << (CONST_BITS-PASS1_BITS-1);
 
     z1 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
     z2 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
     z3 = DEQUANTIZE(inptr[DCTSIZE*6], quantptr[DCTSIZE*6]);
 
     tmp10 = MULTIPLY(z2 - z3, FIX(0.881747734));     /* c4 */
@@ -494,17 +497,17 @@ jpeg_idct_7x7 (j_decompress_ptr cinfo, j
   wsptr = workspace;
   for (ctr = 0; ctr < 7; ctr++) {
     outptr = output_buf[ctr] + output_col;
 
     /* Even part */
 
     /* Add fudge factor here for final descale. */
     tmp13 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
-    tmp13 <<= CONST_BITS;
+    tmp13 = LEFT_SHIFT(tmp13, CONST_BITS);
 
     z1 = (INT32) wsptr[2];
     z2 = (INT32) wsptr[4];
     z3 = (INT32) wsptr[6];
 
     tmp10 = MULTIPLY(z2 - z3, FIX(0.881747734));     /* c4 */
     tmp12 = MULTIPLY(z1 - z2, FIX(0.314692123));     /* c6 */
     tmp11 = tmp10 + tmp12 + tmp13 - MULTIPLY(z2, FIX(1.841218003)); /* c2+c4-c6 */
@@ -588,17 +591,17 @@ jpeg_idct_6x6 (j_decompress_ptr cinfo, j
 
   inptr = coef_block;
   quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
   wsptr = workspace;
   for (ctr = 0; ctr < 6; ctr++, inptr++, quantptr++, wsptr++) {
     /* Even part */
 
     tmp0 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
-    tmp0 <<= CONST_BITS;
+    tmp0 = LEFT_SHIFT(tmp0, CONST_BITS);
     /* Add fudge factor here for final descale. */
     tmp0 += ONE << (CONST_BITS-PASS1_BITS-1);
     tmp2 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
     tmp10 = MULTIPLY(tmp2, FIX(0.707106781));   /* c4 */
     tmp1 = tmp0 + tmp10;
     tmp11 = RIGHT_SHIFT(tmp0 - tmp10 - tmp10, CONST_BITS-PASS1_BITS);
     tmp10 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
     tmp0 = MULTIPLY(tmp10, FIX(1.224744871));   /* c2 */
@@ -606,19 +609,19 @@ jpeg_idct_6x6 (j_decompress_ptr cinfo, j
     tmp12 = tmp1 - tmp0;
 
     /* Odd part */
 
     z1 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]);
     z2 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]);
     z3 = DEQUANTIZE(inptr[DCTSIZE*5], quantptr[DCTSIZE*5]);
     tmp1 = MULTIPLY(z1 + z3, FIX(0.366025404)); /* c5 */
-    tmp0 = tmp1 + ((z1 + z2) << CONST_BITS);
-    tmp2 = tmp1 + ((z3 - z2) << CONST_BITS);
-    tmp1 = (z1 - z2 - z3) << PASS1_BITS;
+    tmp0 = tmp1 + LEFT_SHIFT(z1 + z2, CONST_BITS);
+    tmp2 = tmp1 + LEFT_SHIFT(z3 - z2, CONST_BITS);
+    tmp1 = LEFT_SHIFT(z1 - z2 - z3, PASS1_BITS);
 
     /* Final output stage */
 
     wsptr[6*0] = (int) RIGHT_SHIFT(tmp10 + tmp0, CONST_BITS-PASS1_BITS);
     wsptr[6*5] = (int) RIGHT_SHIFT(tmp10 - tmp0, CONST_BITS-PASS1_BITS);
     wsptr[6*1] = (int) (tmp11 + tmp1);
     wsptr[6*4] = (int) (tmp11 - tmp1);
     wsptr[6*2] = (int) RIGHT_SHIFT(tmp12 + tmp2, CONST_BITS-PASS1_BITS);
@@ -630,35 +633,35 @@ jpeg_idct_6x6 (j_decompress_ptr cinfo, j
   wsptr = workspace;
   for (ctr = 0; ctr < 6; ctr++) {
     outptr = output_buf[ctr] + output_col;
 
     /* Even part */
 
     /* Add fudge factor here for final descale. */
     tmp0 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
-    tmp0 <<= CONST_BITS;
+    tmp0 = LEFT_SHIFT(tmp0, CONST_BITS);
     tmp2 = (INT32) wsptr[4];
     tmp10 = MULTIPLY(tmp2, FIX(0.707106781));   /* c4 */
     tmp1 = tmp0 + tmp10;
     tmp11 = tmp0 - tmp10 - tmp10;
     tmp10 = (INT32) wsptr[2];
     tmp0 = MULTIPLY(tmp10, FIX(1.224744871));   /* c2 */
     tmp10 = tmp1 + tmp0;
     tmp12 = tmp1 - tmp0;
 
     /* Odd part */
 
     z1 = (INT32) wsptr[1];
     z2 = (INT32) wsptr[3];
     z3 = (INT32) wsptr[5];
     tmp1 = MULTIPLY(z1 + z3, FIX(0.366025404)); /* c5 */
-    tmp0 = tmp1 + ((z1 + z2) << CONST_BITS);
-    tmp2 = tmp1 + ((z3 - z2) << CONST_BITS);
-    tmp1 = (z1 - z2 - z3) << CONST_BITS;
+    tmp0 = tmp1 + LEFT_SHIFT(z1 + z2, CONST_BITS);
+    tmp2 = tmp1 + LEFT_SHIFT(z3 - z2, CONST_BITS);
+    tmp1 = LEFT_SHIFT(z1 - z2 - z3, CONST_BITS);
 
     /* Final output stage */
 
     outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp10 + tmp0,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
     outptr[5] = range_limit[(int) RIGHT_SHIFT(tmp10 - tmp0,
                                               CONST_BITS+PASS1_BITS+3)
@@ -709,27 +712,27 @@ jpeg_idct_5x5 (j_decompress_ptr cinfo, j
 
   inptr = coef_block;
   quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
   wsptr = workspace;
   for (ctr = 0; ctr < 5; ctr++, inptr++, quantptr++, wsptr++) {
     /* Even part */
 
     tmp12 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
-    tmp12 <<= CONST_BITS;
+    tmp12 = LEFT_SHIFT(tmp12, CONST_BITS);
     /* Add fudge factor here for final descale. */
     tmp12 += ONE << (CONST_BITS-PASS1_BITS-1);
     tmp0 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
     tmp1 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
     z1 = MULTIPLY(tmp0 + tmp1, FIX(0.790569415)); /* (c2+c4)/2 */
     z2 = MULTIPLY(tmp0 - tmp1, FIX(0.353553391)); /* (c2-c4)/2 */
     z3 = tmp12 + z2;
     tmp10 = z3 + z1;
     tmp11 = z3 - z1;
-    tmp12 -= z2 << 2;
+    tmp12 -= LEFT_SHIFT(z2, 2);
 
     /* Odd part */
 
     z2 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]);
     z3 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]);
 
     z1 = MULTIPLY(z2 + z3, FIX(0.831253876));     /* c3 */
     tmp0 = z1 + MULTIPLY(z2, FIX(0.513743148));   /* c1-c3 */
@@ -749,25 +752,25 @@ jpeg_idct_5x5 (j_decompress_ptr cinfo, j
   wsptr = workspace;
   for (ctr = 0; ctr < 5; ctr++) {
     outptr = output_buf[ctr] + output_col;
 
     /* Even part */
 
     /* Add fudge factor here for final descale. */
     tmp12 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
-    tmp12 <<= CONST_BITS;
+    tmp12 = LEFT_SHIFT(tmp12, CONST_BITS);
     tmp0 = (INT32) wsptr[2];
     tmp1 = (INT32) wsptr[4];
     z1 = MULTIPLY(tmp0 + tmp1, FIX(0.790569415)); /* (c2+c4)/2 */
     z2 = MULTIPLY(tmp0 - tmp1, FIX(0.353553391)); /* (c2-c4)/2 */
     z3 = tmp12 + z2;
     tmp10 = z3 + z1;
     tmp11 = z3 - z1;
-    tmp12 -= z2 << 2;
+    tmp12 -= LEFT_SHIFT(z2, 2);
 
     /* Odd part */
 
     z2 = (INT32) wsptr[1];
     z3 = (INT32) wsptr[3];
 
     z1 = MULTIPLY(z2 + z3, FIX(0.831253876));     /* c3 */
     tmp0 = z1 + MULTIPLY(z2, FIX(0.513743148));   /* c1-c3 */
@@ -823,17 +826,17 @@ jpeg_idct_3x3 (j_decompress_ptr cinfo, j
 
   inptr = coef_block;
   quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
   wsptr = workspace;
   for (ctr = 0; ctr < 3; ctr++, inptr++, quantptr++, wsptr++) {
     /* Even part */
 
     tmp0 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
-    tmp0 <<= CONST_BITS;
+    tmp0 = LEFT_SHIFT(tmp0, CONST_BITS);
     /* Add fudge factor here for final descale. */
     tmp0 += ONE << (CONST_BITS-PASS1_BITS-1);
     tmp2 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
     tmp12 = MULTIPLY(tmp2, FIX(0.707106781)); /* c2 */
     tmp10 = tmp0 + tmp12;
     tmp2 = tmp0 - tmp12 - tmp12;
 
     /* Odd part */
@@ -853,17 +856,17 @@ jpeg_idct_3x3 (j_decompress_ptr cinfo, j
   wsptr = workspace;
   for (ctr = 0; ctr < 3; ctr++) {
     outptr = output_buf[ctr] + output_col;
 
     /* Even part */
 
     /* Add fudge factor here for final descale. */
     tmp0 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
-    tmp0 <<= CONST_BITS;
+    tmp0 = LEFT_SHIFT(tmp0, CONST_BITS);
     tmp2 = (INT32) wsptr[2];
     tmp12 = MULTIPLY(tmp2, FIX(0.707106781)); /* c2 */
     tmp10 = tmp0 + tmp12;
     tmp2 = tmp0 - tmp12 - tmp12;
 
     /* Odd part */
 
     tmp12 = (INT32) wsptr[1];
@@ -914,17 +917,17 @@ jpeg_idct_9x9 (j_decompress_ptr cinfo, j
 
   inptr = coef_block;
   quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
   wsptr = workspace;
   for (ctr = 0; ctr < 8; ctr++, inptr++, quantptr++, wsptr++) {
     /* Even part */
 
     tmp0 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
-    tmp0 <<= CONST_BITS;
+    tmp0 = LEFT_SHIFT(tmp0, CONST_BITS);
     /* Add fudge factor here for final descale. */
     tmp0 += ONE << (CONST_BITS-PASS1_BITS-1);
 
     z1 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
     z2 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
     z3 = DEQUANTIZE(inptr[DCTSIZE*6], quantptr[DCTSIZE*6]);
 
     tmp3 = MULTIPLY(z3, FIX(0.707106781));      /* c6 */
@@ -978,17 +981,17 @@ jpeg_idct_9x9 (j_decompress_ptr cinfo, j
   wsptr = workspace;
   for (ctr = 0; ctr < 9; ctr++) {
     outptr = output_buf[ctr] + output_col;
 
     /* Even part */
 
     /* Add fudge factor here for final descale. */
     tmp0 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
-    tmp0 <<= CONST_BITS;
+    tmp0 = LEFT_SHIFT(tmp0, CONST_BITS);
 
     z1 = (INT32) wsptr[2];
     z2 = (INT32) wsptr[4];
     z3 = (INT32) wsptr[6];
 
     tmp3 = MULTIPLY(z3, FIX(0.707106781));      /* c6 */
     tmp1 = tmp0 + tmp3;
     tmp2 = tmp0 - tmp3 - tmp3;
@@ -1086,27 +1089,27 @@ jpeg_idct_10x10 (j_decompress_ptr cinfo,
 
   inptr = coef_block;
   quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
   wsptr = workspace;
   for (ctr = 0; ctr < 8; ctr++, inptr++, quantptr++, wsptr++) {
     /* Even part */
 
     z3 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
-    z3 <<= CONST_BITS;
+    z3 = LEFT_SHIFT(z3, CONST_BITS);
     /* Add fudge factor here for final descale. */
     z3 += ONE << (CONST_BITS-PASS1_BITS-1);
     z4 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
     z1 = MULTIPLY(z4, FIX(1.144122806));         /* c4 */
     z2 = MULTIPLY(z4, FIX(0.437016024));         /* c8 */
     tmp10 = z3 + z1;
     tmp11 = z3 - z2;
 
-    tmp22 = RIGHT_SHIFT(z3 - ((z1 - z2) << 1),   /* c0 = (c4-c8)*2 */
-                        CONST_BITS-PASS1_BITS);
+    tmp22 = RIGHT_SHIFT(z3 - LEFT_SHIFT(z1 - z2, 1),
+                        CONST_BITS-PASS1_BITS);  /* c0 = (c4-c8)*2 */
 
     z2 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
     z3 = DEQUANTIZE(inptr[DCTSIZE*6], quantptr[DCTSIZE*6]);
 
     z1 = MULTIPLY(z2 + z3, FIX(0.831253876));    /* c6 */
     tmp12 = z1 + MULTIPLY(z2, FIX(0.513743148)); /* c2-c6 */
     tmp13 = z1 - MULTIPLY(z3, FIX(2.176250899)); /* c2+c6 */
 
@@ -1121,28 +1124,28 @@ jpeg_idct_10x10 (j_decompress_ptr cinfo,
     z2 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]);
     z3 = DEQUANTIZE(inptr[DCTSIZE*5], quantptr[DCTSIZE*5]);
     z4 = DEQUANTIZE(inptr[DCTSIZE*7], quantptr[DCTSIZE*7]);
 
     tmp11 = z2 + z4;
     tmp13 = z2 - z4;
 
     tmp12 = MULTIPLY(tmp13, FIX(0.309016994));        /* (c3-c7)/2 */
-    z5 = z3 << CONST_BITS;
+    z5 = LEFT_SHIFT(z3, CONST_BITS);
 
     z2 = MULTIPLY(tmp11, FIX(0.951056516));           /* (c3+c7)/2 */
     z4 = z5 + tmp12;
 
     tmp10 = MULTIPLY(z1, FIX(1.396802247)) + z2 + z4; /* c1 */
     tmp14 = MULTIPLY(z1, FIX(0.221231742)) - z2 + z4; /* c9 */
 
     z2 = MULTIPLY(tmp11, FIX(0.587785252));           /* (c1-c9)/2 */
-    z4 = z5 - tmp12 - (tmp13 << (CONST_BITS - 1));
-
-    tmp12 = (z1 - tmp13 - z3) << PASS1_BITS;
+    z4 = z5 - tmp12 - LEFT_SHIFT(tmp13, CONST_BITS - 1);
+
+    tmp12 = LEFT_SHIFT(z1 - tmp13 - z3, PASS1_BITS);
 
     tmp11 = MULTIPLY(z1, FIX(1.260073511)) - z2 - z4; /* c3 */
     tmp13 = MULTIPLY(z1, FIX(0.642039522)) - z2 + z4; /* c7 */
 
     /* Final output stage */
 
     wsptr[8*0] = (int) RIGHT_SHIFT(tmp20 + tmp10, CONST_BITS-PASS1_BITS);
     wsptr[8*9] = (int) RIGHT_SHIFT(tmp20 - tmp10, CONST_BITS-PASS1_BITS);
@@ -1161,24 +1164,24 @@ jpeg_idct_10x10 (j_decompress_ptr cinfo,
   wsptr = workspace;
   for (ctr = 0; ctr < 10; ctr++) {
     outptr = output_buf[ctr] + output_col;
 
     /* Even part */
 
     /* Add fudge factor here for final descale. */
     z3 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
-    z3 <<= CONST_BITS;
+    z3 = LEFT_SHIFT(z3, CONST_BITS);
     z4 = (INT32) wsptr[4];
     z1 = MULTIPLY(z4, FIX(1.144122806));         /* c4 */
     z2 = MULTIPLY(z4, FIX(0.437016024));         /* c8 */
     tmp10 = z3 + z1;
     tmp11 = z3 - z2;
 
-    tmp22 = z3 - ((z1 - z2) << 1);               /* c0 = (c4-c8)*2 */
+    tmp22 = z3 - LEFT_SHIFT(z1 - z2, 1);         /* c0 = (c4-c8)*2 */
 
     z2 = (INT32) wsptr[2];
     z3 = (INT32) wsptr[6];
 
     z1 = MULTIPLY(z2 + z3, FIX(0.831253876));    /* c6 */
     tmp12 = z1 + MULTIPLY(z2, FIX(0.513743148)); /* c2-c6 */
     tmp13 = z1 - MULTIPLY(z3, FIX(2.176250899)); /* c2+c6 */
 
@@ -1187,34 +1190,34 @@ jpeg_idct_10x10 (j_decompress_ptr cinfo,
     tmp21 = tmp11 + tmp13;
     tmp23 = tmp11 - tmp13;
 
     /* Odd part */
 
     z1 = (INT32) wsptr[1];
     z2 = (INT32) wsptr[3];
     z3 = (INT32) wsptr[5];
-    z3 <<= CONST_BITS;
+    z3 = LEFT_SHIFT(z3, CONST_BITS);
     z4 = (INT32) wsptr[7];
 
     tmp11 = z2 + z4;
     tmp13 = z2 - z4;
 
     tmp12 = MULTIPLY(tmp13, FIX(0.309016994));        /* (c3-c7)/2 */
 
     z2 = MULTIPLY(tmp11, FIX(0.951056516));           /* (c3+c7)/2 */
     z4 = z3 + tmp12;
 
     tmp10 = MULTIPLY(z1, FIX(1.396802247)) + z2 + z4; /* c1 */
     tmp14 = MULTIPLY(z1, FIX(0.221231742)) - z2 + z4; /* c9 */
 
     z2 = MULTIPLY(tmp11, FIX(0.587785252));           /* (c1-c9)/2 */
-    z4 = z3 - tmp12 - (tmp13 << (CONST_BITS - 1));
-
-    tmp12 = ((z1 - tmp13) << CONST_BITS) - z3;
+    z4 = z3 - tmp12 - LEFT_SHIFT(tmp13, CONST_BITS - 1);
+
+    tmp12 = LEFT_SHIFT(z1 - tmp13, CONST_BITS) - z3;
 
     tmp11 = MULTIPLY(z1, FIX(1.260073511)) - z2 - z4; /* c3 */
     tmp13 = MULTIPLY(z1, FIX(0.642039522)) - z2 + z4; /* c7 */
 
     /* Final output stage */
 
     outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp20 + tmp10,
                                               CONST_BITS+PASS1_BITS+3)
@@ -1281,17 +1284,17 @@ jpeg_idct_11x11 (j_decompress_ptr cinfo,
 
   inptr = coef_block;
   quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
   wsptr = workspace;
   for (ctr = 0; ctr < 8; ctr++, inptr++, quantptr++, wsptr++) {
     /* Even part */
 
     tmp10 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
-    tmp10 <<= CONST_BITS;
+    tmp10 = LEFT_SHIFT(tmp10, CONST_BITS);
     /* Add fudge factor here for final descale. */
     tmp10 += ONE << (CONST_BITS-PASS1_BITS-1);
 
     z1 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
     z2 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
     z3 = DEQUANTIZE(inptr[DCTSIZE*6], quantptr[DCTSIZE*6]);
 
     tmp20 = MULTIPLY(z2 - z3, FIX(2.546640132));     /* c2+c4 */
@@ -1354,17 +1357,17 @@ jpeg_idct_11x11 (j_decompress_ptr cinfo,
   wsptr = workspace;
   for (ctr = 0; ctr < 11; ctr++) {
     outptr = output_buf[ctr] + output_col;
 
     /* Even part */
 
     /* Add fudge factor here for final descale. */
     tmp10 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
-    tmp10 <<= CONST_BITS;
+    tmp10 = LEFT_SHIFT(tmp10, CONST_BITS);
 
     z1 = (INT32) wsptr[2];
     z2 = (INT32) wsptr[4];
     z3 = (INT32) wsptr[6];
 
     tmp20 = MULTIPLY(z2 - z3, FIX(2.546640132));     /* c2+c4 */
     tmp23 = MULTIPLY(z2 - z1, FIX(0.430815045));     /* c2-c6 */
     z4 = z1 + z3;
@@ -1475,31 +1478,31 @@ jpeg_idct_12x12 (j_decompress_ptr cinfo,
 
   inptr = coef_block;
   quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
   wsptr = workspace;
   for (ctr = 0; ctr < 8; ctr++, inptr++, quantptr++, wsptr++) {
     /* Even part */
 
     z3 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
-    z3 <<= CONST_BITS;
+    z3 = LEFT_SHIFT(z3, CONST_BITS);
     /* Add fudge factor here for final descale. */
     z3 += ONE << (CONST_BITS-PASS1_BITS-1);
 
     z4 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
     z4 = MULTIPLY(z4, FIX(1.224744871)); /* c4 */
 
     tmp10 = z3 + z4;
     tmp11 = z3 - z4;
 
     z1 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
     z4 = MULTIPLY(z1, FIX(1.366025404)); /* c2 */
-    z1 <<= CONST_BITS;
+    z1 = LEFT_SHIFT(z1, CONST_BITS);
     z2 = DEQUANTIZE(inptr[DCTSIZE*6], quantptr[DCTSIZE*6]);
-    z2 <<= CONST_BITS;
+    z2 = LEFT_SHIFT(z2, CONST_BITS);
 
     tmp12 = z1 - z2;
 
     tmp21 = z3 + tmp12;
     tmp24 = z3 - tmp12;
 
     tmp12 = z4 + z2;
 
@@ -1558,29 +1561,29 @@ jpeg_idct_12x12 (j_decompress_ptr cinfo,
   wsptr = workspace;
   for (ctr = 0; ctr < 12; ctr++) {
     outptr = output_buf[ctr] + output_col;
 
     /* Even part */
 
     /* Add fudge factor here for final descale. */
     z3 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
-    z3 <<= CONST_BITS;
+    z3 = LEFT_SHIFT(z3, CONST_BITS);
 
     z4 = (INT32) wsptr[4];
     z4 = MULTIPLY(z4, FIX(1.224744871)); /* c4 */
 
     tmp10 = z3 + z4;
     tmp11 = z3 - z4;
 
     z1 = (INT32) wsptr[2];
     z4 = MULTIPLY(z1, FIX(1.366025404)); /* c2 */
-    z1 <<= CONST_BITS;
+    z1 = LEFT_SHIFT(z1, CONST_BITS);
     z2 = (INT32) wsptr[6];
-    z2 <<= CONST_BITS;
+    z2 = LEFT_SHIFT(z2, CONST_BITS);
 
     tmp12 = z1 - z2;
 
     tmp21 = z3 + tmp12;
     tmp24 = z3 - tmp12;
 
     tmp12 = z4 + z2;
 
@@ -1691,17 +1694,17 @@ jpeg_idct_13x13 (j_decompress_ptr cinfo,
 
   inptr = coef_block;
   quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
   wsptr = workspace;
   for (ctr = 0; ctr < 8; ctr++, inptr++, quantptr++, wsptr++) {
     /* Even part */
 
     z1 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
-    z1 <<= CONST_BITS;
+    z1 = LEFT_SHIFT(z1, CONST_BITS);
     /* Add fudge factor here for final descale. */
     z1 += ONE << (CONST_BITS-PASS1_BITS-1);
 
     z2 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
     z3 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
     z4 = DEQUANTIZE(inptr[DCTSIZE*6], quantptr[DCTSIZE*6]);
 
     tmp10 = z3 + z4;
@@ -1779,17 +1782,17 @@ jpeg_idct_13x13 (j_decompress_ptr cinfo,
   wsptr = workspace;
   for (ctr = 0; ctr < 13; ctr++) {
     outptr = output_buf[ctr] + output_col;
 
     /* Even part */
 
     /* Add fudge factor here for final descale. */
     z1 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
-    z1 <<= CONST_BITS;
+    z1 = LEFT_SHIFT(z1, CONST_BITS);
 
     z2 = (INT32) wsptr[2];
     z3 = (INT32) wsptr[4];
     z4 = (INT32) wsptr[6];
 
     tmp10 = z3 + z4;
     tmp11 = z3 - z4;
 
@@ -1919,30 +1922,30 @@ jpeg_idct_14x14 (j_decompress_ptr cinfo,
 
   inptr = coef_block;
   quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
   wsptr = workspace;
   for (ctr = 0; ctr < 8; ctr++, inptr++, quantptr++, wsptr++) {
     /* Even part */
 
     z1 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
-    z1 <<= CONST_BITS;
+    z1 = LEFT_SHIFT(z1, CONST_BITS);
     /* Add fudge factor here for final descale. */
     z1 += ONE << (CONST_BITS-PASS1_BITS-1);
     z4 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
     z2 = MULTIPLY(z4, FIX(1.274162392));         /* c4 */
     z3 = MULTIPLY(z4, FIX(0.314692123));         /* c12 */
     z4 = MULTIPLY(z4, FIX(0.881747734));         /* c8 */
 
     tmp10 = z1 + z2;
     tmp11 = z1 + z3;
     tmp12 = z1 - z4;
 
-    tmp23 = RIGHT_SHIFT(z1 - ((z2 + z3 - z4) << 1), /* c0 = (c4+c12-c8)*2 */
-                        CONST_BITS-PASS1_BITS);
+    tmp23 = RIGHT_SHIFT(z1 - LEFT_SHIFT(z2 + z3 - z4, 1),
+                        CONST_BITS-PASS1_BITS);  /* c0 = (c4+c12-c8)*2 */
 
     z1 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
     z2 = DEQUANTIZE(inptr[DCTSIZE*6], quantptr[DCTSIZE*6]);
 
     z3 = MULTIPLY(z1 + z2, FIX(1.105676686));    /* c6 */
 
     tmp13 = z3 + MULTIPLY(z1, FIX(0.273079590)); /* c2-c6 */
     tmp14 = z3 - MULTIPLY(z2, FIX(1.719280954)); /* c6+c10 */
@@ -1957,17 +1960,17 @@ jpeg_idct_14x14 (j_decompress_ptr cinfo,
     tmp24 = tmp12 - tmp15;
 
     /* Odd part */
 
     z1 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]);
     z2 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]);
     z3 = DEQUANTIZE(inptr[DCTSIZE*5], quantptr[DCTSIZE*5]);
     z4 = DEQUANTIZE(inptr[DCTSIZE*7], quantptr[DCTSIZE*7]);
-    tmp13 = z4 << CONST_BITS;
+    tmp13 = LEFT_SHIFT(z4, CONST_BITS);
 
     tmp14 = z1 + z3;
     tmp11 = MULTIPLY(z1 + z2, FIX(1.334852607));           /* c3 */
     tmp12 = MULTIPLY(tmp14, FIX(1.197448846));             /* c5 */
     tmp10 = tmp11 + tmp12 + tmp13 - MULTIPLY(z1, FIX(1.126980169)); /* c3+c5-c1 */
     tmp14 = MULTIPLY(tmp14, FIX(0.752406978));             /* c9 */
     tmp16 = tmp14 - MULTIPLY(z1, FIX(1.061150426));        /* c9+c11-c13 */
     z1    -= z2;
@@ -1976,17 +1979,17 @@ jpeg_idct_14x14 (j_decompress_ptr cinfo,
     z1    += z4;
     z4    = MULTIPLY(z2 + z3, - FIX(0.158341681)) - tmp13; /* -c13 */
     tmp11 += z4 - MULTIPLY(z2, FIX(0.424103948));          /* c3-c9-c13 */
     tmp12 += z4 - MULTIPLY(z3, FIX(2.373959773));          /* c3+c5-c13 */
     z4    = MULTIPLY(z3 - z2, FIX(1.405321284));           /* c1 */
     tmp14 += z4 + tmp13 - MULTIPLY(z3, FIX(1.6906431334)); /* c1+c9-c11 */
     tmp15 += z4 + MULTIPLY(z2, FIX(0.674957567));          /* c1+c11-c5 */
 
-    tmp13 = (z1 - z3) << PASS1_BITS;
+    tmp13 = LEFT_SHIFT(z1 - z3, PASS1_BITS);
 
     /* Final output stage */
 
     wsptr[8*0]  = (int) RIGHT_SHIFT(tmp20 + tmp10, CONST_BITS-PASS1_BITS);
     wsptr[8*13] = (int) RIGHT_SHIFT(tmp20 - tmp10, CONST_BITS-PASS1_BITS);
     wsptr[8*1]  = (int) RIGHT_SHIFT(tmp21 + tmp11, CONST_BITS-PASS1_BITS);
     wsptr[8*12] = (int) RIGHT_SHIFT(tmp21 - tmp11, CONST_BITS-PASS1_BITS);
     wsptr[8*2]  = (int) RIGHT_SHIFT(tmp22 + tmp12, CONST_BITS-PASS1_BITS);
@@ -2006,27 +2009,27 @@ jpeg_idct_14x14 (j_decompress_ptr cinfo,
   wsptr = workspace;
   for (ctr = 0; ctr < 14; ctr++) {
     outptr = output_buf[ctr] + output_col;
 
     /* Even part */
 
     /* Add fudge factor here for final descale. */
     z1 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
-    z1 <<= CONST_BITS;
+    z1 = LEFT_SHIFT(z1, CONST_BITS);
     z4 = (INT32) wsptr[4];
     z2 = MULTIPLY(z4, FIX(1.274162392));         /* c4 */
     z3 = MULTIPLY(z4, FIX(0.314692123));         /* c12 */
     z4 = MULTIPLY(z4, FIX(0.881747734));         /* c8 */
 
     tmp10 = z1 + z2;
     tmp11 = z1 + z3;
     tmp12 = z1 - z4;
 
-    tmp23 = z1 - ((z2 + z3 - z4) << 1);          /* c0 = (c4+c12-c8)*2 */
+    tmp23 = z1 - LEFT_SHIFT(z2 + z3 - z4, 1);    /* c0 = (c4+c12-c8)*2 */
 
     z1 = (INT32) wsptr[2];
     z2 = (INT32) wsptr[6];
 
     z3 = MULTIPLY(z1 + z2, FIX(1.105676686));    /* c6 */
 
     tmp13 = z3 + MULTIPLY(z1, FIX(0.273079590)); /* c2-c6 */
     tmp14 = z3 - MULTIPLY(z2, FIX(1.719280954)); /* c6+c10 */
@@ -2041,17 +2044,17 @@ jpeg_idct_14x14 (j_decompress_ptr cinfo,
     tmp24 = tmp12 - tmp15;
 
     /* Odd part */
 
     z1 = (INT32) wsptr[1];
     z2 = (INT32) wsptr[3];
     z3 = (INT32) wsptr[5];
     z4 = (INT32) wsptr[7];
-    z4 <<= CONST_BITS;
+    z4 = LEFT_SHIFT(z4, CONST_BITS);
 
     tmp14 = z1 + z3;
     tmp11 = MULTIPLY(z1 + z2, FIX(1.334852607));           /* c3 */
     tmp12 = MULTIPLY(tmp14, FIX(1.197448846));             /* c5 */
     tmp10 = tmp11 + tmp12 + z4 - MULTIPLY(z1, FIX(1.126980169)); /* c3+c5-c1 */
     tmp14 = MULTIPLY(tmp14, FIX(0.752406978));             /* c9 */
     tmp16 = tmp14 - MULTIPLY(z1, FIX(1.061150426));        /* c9+c11-c13 */
     z1    -= z2;
@@ -2059,17 +2062,17 @@ jpeg_idct_14x14 (j_decompress_ptr cinfo,
     tmp16 += tmp15;
     tmp13 = MULTIPLY(z2 + z3, - FIX(0.158341681)) - z4;    /* -c13 */
     tmp11 += tmp13 - MULTIPLY(z2, FIX(0.424103948));       /* c3-c9-c13 */
     tmp12 += tmp13 - MULTIPLY(z3, FIX(2.373959773));       /* c3+c5-c13 */
     tmp13 = MULTIPLY(z3 - z2, FIX(1.405321284));           /* c1 */
     tmp14 += tmp13 + z4 - MULTIPLY(z3, FIX(1.6906431334)); /* c1+c9-c11 */
     tmp15 += tmp13 + MULTIPLY(z2, FIX(0.674957567));       /* c1+c11-c5 */
 
-    tmp13 = ((z1 - z3) << CONST_BITS) + z4;
+    tmp13 = LEFT_SHIFT(z1 - z3, CONST_BITS) + z4;
 
     /* Final output stage */
 
     outptr[0]  = range_limit[(int) RIGHT_SHIFT(tmp20 + tmp10,
                                                CONST_BITS+PASS1_BITS+3)
                              & RANGE_MASK];
     outptr[13] = range_limit[(int) RIGHT_SHIFT(tmp20 - tmp10,
                                                CONST_BITS+PASS1_BITS+3)
@@ -2145,30 +2148,30 @@ jpeg_idct_15x15 (j_decompress_ptr cinfo,
 
   inptr = coef_block;
   quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
   wsptr = workspace;
   for (ctr = 0; ctr < 8; ctr++, inptr++, quantptr++, wsptr++) {
     /* Even part */
 
     z1 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
-    z1 <<= CONST_BITS;
+    z1 = LEFT_SHIFT(z1, CONST_BITS);
     /* Add fudge factor here for final descale. */
     z1 += ONE << (CONST_BITS-PASS1_BITS-1);
 
     z2 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
     z3 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
     z4 = DEQUANTIZE(inptr[DCTSIZE*6], quantptr[DCTSIZE*6]);
 
     tmp10 = MULTIPLY(z4, FIX(0.437016024)); /* c12 */
     tmp11 = MULTIPLY(z4, FIX(1.144122806)); /* c6 */
 
     tmp12 = z1 - tmp10;
     tmp13 = z1 + tmp11;
-    z1 -= (tmp11 - tmp10) << 1;             /* c0 = (c6-c12)*2 */
+    z1 -= LEFT_SHIFT(tmp11 - tmp10, 1);     /* c0 = (c6-c12)*2 */
 
     z4 = z2 - z3;
     z3 += z2;
     tmp10 = MULTIPLY(z3, FIX(1.337628990)); /* (c2+c4)/2 */
     tmp11 = MULTIPLY(z4, FIX(0.045680613)); /* (c2-c4)/2 */
     z2 = MULTIPLY(z2, FIX(1.439773946));    /* c4+c14 */
 
     tmp20 = tmp13 + tmp10 + tmp11;
@@ -2238,28 +2241,28 @@ jpeg_idct_15x15 (j_decompress_ptr cinfo,
   wsptr = workspace;
   for (ctr = 0; ctr < 15; ctr++) {
     outptr = output_buf[ctr] + output_col;
 
     /* Even part */
 
     /* Add fudge factor here for final descale. */
     z1 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
-    z1 <<= CONST_BITS;
+    z1 = LEFT_SHIFT(z1, CONST_BITS);
 
     z2 = (INT32) wsptr[2];
     z3 = (INT32) wsptr[4];
     z4 = (INT32) wsptr[6];
 
     tmp10 = MULTIPLY(z4, FIX(0.437016024)); /* c12 */
     tmp11 = MULTIPLY(z4, FIX(1.144122806)); /* c6 */
 
     tmp12 = z1 - tmp10;
     tmp13 = z1 + tmp11;
-    z1 -= (tmp11 - tmp10) << 1;             /* c0 = (c6-c12)*2 */
+    z1 -= LEFT_SHIFT(tmp11 - tmp10, 1);     /* c0 = (c6-c12)*2 */
 
     z4 = z2 - z3;
     z3 += z2;
     tmp10 = MULTIPLY(z3, FIX(1.337628990)); /* (c2+c4)/2 */
     tmp11 = MULTIPLY(z4, FIX(0.045680613)); /* (c2-c4)/2 */
     z2 = MULTIPLY(z2, FIX(1.439773946));    /* c4+c14 */
 
     tmp20 = tmp13 + tmp10 + tmp11;
@@ -2387,17 +2390,17 @@ jpeg_idct_16x16 (j_decompress_ptr cinfo,
 
   inptr = coef_block;
   quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
   wsptr = workspace;
   for (ctr = 0; ctr < 8; ctr++, inptr++, quantptr++, wsptr++) {
     /* Even part */
 
     tmp0 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
-    tmp0 <<= CONST_BITS;
+    tmp0 = LEFT_SHIFT(tmp0, CONST_BITS);
     /* Add fudge factor here for final descale. */
     tmp0 += 1 << (CONST_BITS-PASS1_BITS-1);
 
     z1 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
     tmp1 = MULTIPLY(z1, FIX(1.306562965));      /* c4[16] = c2[8] */
     tmp2 = MULTIPLY(z1, FIX_0_541196100);       /* c12[16] = c6[8] */
 
     tmp10 = tmp0 + tmp1;
@@ -2489,17 +2492,17 @@ jpeg_idct_16x16 (j_decompress_ptr cinfo,
   wsptr = workspace;
   for (ctr = 0; ctr < 16; ctr++) {
     outptr = output_buf[ctr] + output_col;
 
     /* Even part */
 
     /* Add fudge factor here for final descale. */
     tmp0 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
-    tmp0 <<= CONST_BITS;
+    tmp0 = LEFT_SHIFT(tmp0, CONST_BITS);
 
     z1 = (INT32) wsptr[4];
     tmp1 = MULTIPLY(z1, FIX(1.306562965));      /* c4[16] = c2[8] */
     tmp2 = MULTIPLY(z1, FIX_0_541196100);       /* c12[16] = c6[8] */
 
     tmp10 = tmp0 + tmp1;
     tmp11 = tmp0 - tmp1;
     tmp12 = tmp0 + tmp2;
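
Note on the recurring "Add fudge factor here for final descale" comments in the hunks above: the constant added before the shift is half of the eventual divisor, so the later right shift rounds to nearest instead of truncating. In the second pass, `ONE << (PASS1_BITS+2)` shifted left by `CONST_BITS` equals `ONE << (CONST_BITS+PASS1_BITS+2)`, which is half of the final divisor `2^(CONST_BITS+PASS1_BITS+3)`. The sketch below illustrates the idea only; `descale_sketch` is a hypothetical name, not the library's macro, and it is restricted to non-negative inputs because right-shifting a negative signed value is implementation-defined.

```c
#include <stdint.h>

/* Hypothetical helper: divide x by 2^n, rounding to nearest.
 * Adding 2^(n-1) first is the "fudge factor" referred to in the
 * comments above.  Assumes x >= 0 and 1 <= n <= 30. */
static int32_t descale_sketch(int32_t x, int n)
{
  return (x + ((int32_t)1 << (n - 1))) >> n;
}
```

For example, with `n` equal to 11, an input of 7168 (exactly 3.5 in units of 2048) rounds up to 4, while 7167 rounds down to 3.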
--- a/media/libjpeg/jidctred.c
+++ b/media/libjpeg/jidctred.c
@@ -1,13 +1,15 @@
 /*
  * jidctred.c
  *
+ * This file was part of the Independent JPEG Group's software.
  * Copyright (C) 1994-1998, Thomas G. Lane.
- * This file is part of the Independent JPEG Group's software.
+ * libjpeg-turbo Modifications:
+ * Copyright (C) 2015, D. R. Commander
  * For conditions of distribution and use, see the accompanying README file.
  *
  * This file contains inverse-DCT routines that produce reduced-size output:
  * either 4x4, 2x2, or 1x1 pixels from an 8x8 DCT block.
  *
  * The implementation is based on the Loeffler, Ligtenberg and Moschytz (LL&M)
  * algorithm used in jidctint.c.  We simply replace each 8-to-8 1-D IDCT step
  * with an 8-to-4 step that produces the four averages of two adjacent outputs
@@ -138,30 +140,31 @@ jpeg_idct_4x4 (j_decompress_ptr cinfo, j
   for (ctr = DCTSIZE; ctr > 0; inptr++, quantptr++, wsptr++, ctr--) {
     /* Don't bother to process column 4, because second pass won't use it */
     if (ctr == DCTSIZE-4)
       continue;
     if (inptr[DCTSIZE*1] == 0 && inptr[DCTSIZE*2] == 0 &&
         inptr[DCTSIZE*3] == 0 && inptr[DCTSIZE*5] == 0 &&
         inptr[DCTSIZE*6] == 0 && inptr[DCTSIZE*7] == 0) {
       /* AC terms all zero; we need not examine term 4 for 4x4 output */
-      int dcval = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]) << PASS1_BITS;
+      int dcval = LEFT_SHIFT(DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]),
+                             PASS1_BITS);
 
       wsptr[DCTSIZE*0] = dcval;
       wsptr[DCTSIZE*1] = dcval;
       wsptr[DCTSIZE*2] = dcval;
       wsptr[DCTSIZE*3] = dcval;
 
       continue;
     }
 
     /* Even part */
 
     tmp0 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
-    tmp0 <<= (CONST_BITS+1);
+    tmp0 = LEFT_SHIFT(tmp0, CONST_BITS+1);
 
     z2 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
     z3 = DEQUANTIZE(inptr[DCTSIZE*6], quantptr[DCTSIZE*6]);
 
     tmp2 = MULTIPLY(z2, FIX_1_847759065) + MULTIPLY(z3, - FIX_0_765366865);
 
     tmp10 = tmp0 + tmp2;
     tmp12 = tmp0 - tmp2;
@@ -212,17 +215,17 @@ jpeg_idct_4x4 (j_decompress_ptr cinfo, j
 
       wsptr += DCTSIZE;         /* advance pointer to next row */
       continue;
     }
 #endif
 
     /* Even part */
 
-    tmp0 = ((INT32) wsptr[0]) << (CONST_BITS+1);
+    tmp0 = LEFT_SHIFT((INT32) wsptr[0], CONST_BITS+1);
 
     tmp2 = MULTIPLY((INT32) wsptr[2], FIX_1_847759065)
          + MULTIPLY((INT32) wsptr[6], - FIX_0_765366865);
 
     tmp10 = tmp0 + tmp2;
     tmp12 = tmp0 - tmp2;
 
     /* Odd part */
@@ -289,28 +292,29 @@ jpeg_idct_2x2 (j_decompress_ptr cinfo, j
   wsptr = workspace;
   for (ctr = DCTSIZE; ctr > 0; inptr++, quantptr++, wsptr++, ctr--) {
     /* Don't bother to process columns 2,4,6 */
     if (ctr == DCTSIZE-2 || ctr == DCTSIZE-4 || ctr == DCTSIZE-6)
       continue;
     if (inptr[DCTSIZE*1] == 0 && inptr[DCTSIZE*3] == 0 &&
         inptr[DCTSIZE*5] == 0 && inptr[DCTSIZE*7] == 0) {
       /* AC terms all zero; we need not examine terms 2,4,6 for 2x2 output */
-      int dcval = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]) << PASS1_BITS;
+      int dcval = LEFT_SHIFT(DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]),
+                             PASS1_BITS);
 
       wsptr[DCTSIZE*0] = dcval;
       wsptr[DCTSIZE*1] = dcval;
 
       continue;
     }
 
     /* Even part */
 
     z1 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
-    tmp10 = z1 << (CONST_BITS+2);
+    tmp10 = LEFT_SHIFT(z1, CONST_BITS+2);
 
     /* Odd part */
 
     z1 = DEQUANTIZE(inptr[DCTSIZE*7], quantptr[DCTSIZE*7]);
     tmp0 = MULTIPLY(z1, - FIX_0_720959822); /* sqrt(2) * (c7-c5+c3-c1) */
     z1 = DEQUANTIZE(inptr[DCTSIZE*5], quantptr[DCTSIZE*5]);
     tmp0 += MULTIPLY(z1, FIX_0_850430095); /* sqrt(2) * (-c1+c3+c5+c7) */
     z1 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]);
@@ -342,17 +346,17 @@ jpeg_idct_2x2 (j_decompress_ptr cinfo, j
 
       wsptr += DCTSIZE;         /* advance pointer to next row */
       continue;
     }
 #endif
 
     /* Even part */
 
-    tmp10 = ((INT32) wsptr[0]) << (CONST_BITS+2);
+    tmp10 = LEFT_SHIFT((INT32) wsptr[0], CONST_BITS+2);
 
     /* Odd part */
 
     tmp0 = MULTIPLY((INT32) wsptr[7], - FIX_0_720959822) /* sqrt(2) * (c7-c5+c3-c1) */
          + MULTIPLY((INT32) wsptr[5], FIX_0_850430095) /* sqrt(2) * (-c1+c3+c5+c7) */
          + MULTIPLY((INT32) wsptr[3], - FIX_1_272758580) /* sqrt(2) * (-c1+c3-c5-c7) */
          + MULTIPLY((INT32) wsptr[1], FIX_3_624509785); /* sqrt(2) * (c1+c3+c5+c7) */
 
--- a/media/libjpeg/jmorecfg.h
+++ b/media/libjpeg/jmorecfg.h
@@ -1,15 +1,16 @@
 /*
  * jmorecfg.h
  *
  * This file was part of the Independent JPEG Group's software:
  * Copyright (C) 1991-1997, Thomas G. Lane.
+ * Modified 1997-2009 by Guido Vollbeding.
  * libjpeg-turbo Modifications:
- * Copyright (C) 2009, 2011, 2014, D. R. Commander.
+ * Copyright (C) 2009, 2011, 2014-2015, D. R. Commander.
  * For conditions of distribution and use, see the accompanying README file.
  *
  * This file contains additional configuration options that customize the
  * JPEG software for special applications or support machine-dependent
  * optimizations.  Most users will not need to touch this file.
  */
 
 #include <stdint.h>
@@ -135,17 +136,19 @@ typedef int16_t INT16;
 /* INT32 must hold at least signed 32-bit values. */
 
 typedef int32_t INT32;
 
 /* Datatype used for image dimensions.  The JPEG standard only supports
  * images up to 64K*64K due to 16-bit fields in SOF markers.  Therefore
  * "unsigned int" is sufficient on all machines.  However, if you need to
  * handle larger images and you don't mind deviating from the spec, you
- * can change this datatype.
+ * can change this datatype.  (Note that changing this datatype will
+ * potentially require modifying the SIMD code.  The x86-64 SIMD extensions,
+ * in particular, assume a 32-bit JDIMENSION.)
  */
 
 typedef unsigned int JDIMENSION;
 
 #define JPEG_MAX_DIMENSION  65500L  /* a tad under 64K to prevent overflows */
 
 
 /* These macros are used in all function definitions and extern declarations.
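
Note on the 32-bit `JDIMENSION` assumption mentioned in the comment above: the x86-64 SIMD hunks later in this changeset replace 64-bit moves such as `mov rcx, r10` with 32-bit moves such as `mov ecx, r10d`. A likely motivation, stated here as an assumption rather than taken from the commit message, is that a 32-bit argument passed in a 64-bit register may carry arbitrary upper bits, while a 32-bit register write on x86-64 zero-extends into the full register. The sketch below models both moves in plain C; `demo_mov64` and `demo_mov32` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Models "mov rcx, r10": copies all 64 bits, including whatever the caller
 * left in the upper half of the register. */
static uint64_t demo_mov64(uint64_t reg)
{
  return reg;
}

/* Models "mov ecx, r10d": keeps the low 32 bits and zero-extends them. */
static uint64_t demo_mov32(uint64_t reg)
{
  return (uint64_t)(uint32_t)reg;
}

/* Hypothetical self-check: a width of 640 passed with garbage in the upper
 * half is only recovered correctly by the 32-bit move. */
static void demo_width_check(void)
{
  uint64_t r10 = 0xDEADBEEF00000280ULL;   /* low half is 0x280 == 640 */
  assert(demo_mov32(r10) == 640);
  assert(demo_mov64(r10) != 640);
}
```

With the 64-bit move, the stale upper bits would flow into address arithmetic such as `[rdi+rcx*SIZEOF_JSAMPROW]`, which is why widening `JDIMENSION` beyond 32 bits would require revisiting the SIMD code.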
--- a/media/libjpeg/jpegint.h
+++ b/media/libjpeg/jpegint.h
@@ -1,16 +1,16 @@
 /*
  * jpegint.h
  *
  * This file was part of the Independent JPEG Group's software:
  * Copyright (C) 1991-1997, Thomas G. Lane.
  * Modified 1997-2009 by Guido Vollbeding.
- * It was modified by The libjpeg-turbo Project to include only code relevant
- * to libjpeg-turbo.
+ * libjpeg-turbo Modifications:
+ * Copyright (C) 2015, D. R. Commander
  * For conditions of distribution and use, see the accompanying README file.
  *
  * This file provides common declarations for the various JPEG modules.
  * These declarations are considered internal to the JPEG library; most
  * applications using the library shouldn't need to include this file.
  */
 
 
@@ -37,16 +37,28 @@ typedef enum {            /* Operating m
 #define DSTATE_SCANNING 205     /* start_decompress done, read_scanlines OK */
 #define DSTATE_RAW_OK   206     /* start_decompress done, read_raw_data OK */
 #define DSTATE_BUFIMAGE 207     /* expecting jpeg_start_output */
 #define DSTATE_BUFPOST  208     /* looking for SOS/EOI in jpeg_finish_output */
 #define DSTATE_RDCOEFS  209     /* reading file in jpeg_read_coefficients */
 #define DSTATE_STOPPING 210     /* looking for EOI in jpeg_finish_decompress */
 
 
+/*
+ * Left shift macro that handles a negative operand without causing any
+ * sanitizer warnings
+ */
+
+#ifdef __INT32_IS_ACTUALLY_LONG
+#define LEFT_SHIFT(a, b) ((INT32)((unsigned long)(a) << (b)))
+#else
+#define LEFT_SHIFT(a, b) ((INT32)((unsigned int)(a) << (b)))
+#endif
+
+
 /* Declarations for compression modules */
 
 /* Master control module */
 struct jpeg_comp_master {
   void (*prepare_for_pass) (j_compress_ptr cinfo);
   void (*pass_startup) (j_compress_ptr cinfo);
   void (*finish_pass) (j_compress_ptr cinfo);
 
--- a/media/libjpeg/jversion.h
+++ b/media/libjpeg/jversion.h
@@ -1,15 +1,15 @@
 /*
  * jversion.h
  *
  * This file was part of the Independent JPEG Group's software:
  * Copyright (C) 1991-2012, Thomas G. Lane, Guido Vollbeding.
  * libjpeg-turbo Modifications:
- * Copyright (C) 2010, 2012-2014, D. R. Commander.
+ * Copyright (C) 2010, 2012-2015, D. R. Commander.
  * For conditions of distribution and use, see the accompanying README file.
  *
  * This file contains software version identification.
  */
 
 
 #if JPEG_LIB_VERSION >= 80
 
@@ -23,14 +23,14 @@
 
 #define JVERSION        "6b  27-Mar-1998"
 
 #endif
 
 #define JCOPYRIGHT      "Copyright (C) 1991-2012 Thomas G. Lane, Guido Vollbeding\n" \
                         "Copyright (C) 1999-2006 MIYASAKA Masaru\n" \
                         "Copyright (C) 2009 Pierre Ossman for Cendio AB\n" \
-                        "Copyright (C) 2009-2014 D. R. Commander\n" \
+                        "Copyright (C) 2009-2015 D. R. Commander\n" \
                         "Copyright (C) 2009-2011 Nokia Corporation and/or its subsidiary(-ies)\n" \
                         "Copyright (C) 2013-2014 MIPS Technologies, Inc.\n" \
                         "Copyright (C) 2013 Linaro Limited"
 
-#define JCOPYRIGHT_SHORT "Copyright (C) 1991-2014 The libjpeg-turbo Project and many others"
+#define JCOPYRIGHT_SHORT "Copyright (C) 1991-2015 The libjpeg-turbo Project and many others"
--- a/media/libjpeg/mozilla.diff
+++ b/media/libjpeg/mozilla.diff
@@ -1,34 +1,21 @@
---- jmorecfg.h	2014-11-25 05:07:43.000000000 -0500
-+++ jmorecfg.h	2015-01-14 21:46:56.465050782 -0500
-@@ -7,16 +7,17 @@
-  * Copyright (C) 2009, 2011, 2014, D. R. Commander.
-  * For conditions of distribution and use, see the accompanying README file.
-  *
-  * This file contains additional configuration options that customize the
-  * JPEG software for special applications or support machine-dependent
+diff --git jmorecfg.h jmorecfg.h
+index be89189..3b0d05a 100644
+--- jmorecfg.h
++++ jmorecfg.h
+@@ -13,6 +13,7 @@
   * optimizations.  Most users will not need to touch this file.
   */
  
 +#include <stdint.h>
  
  /*
   * Maximum number of components (color channels) allowed in JPEG image.
-  * To meet the letter of the JPEG spec, set this to 255.  However, darn
-  * few applications need more than 4 channels (maybe 5 for CMYK + alpha
-  * mask).  We recommend 10 as a reasonable compromise; use 4 if you are
-  * really short on memory.  (Each allowed component costs a hundred or so
-  * bytes of storage, whether actually used in an image or not.)
-@@ -116,45 +117,29 @@ typedef char JOCTET;
-  * They must be at least as wide as specified; but making them too big
-  * won't cost a huge amount of memory, so we don't provide special
-  * extraction code like we did for JSAMPLE.  (In other words, these
-  * typedefs live at a different point on the speed/space tradeoff curve.)
-  */
+@@ -122,42 +123,19 @@ typedef char JOCTET;
  
  /* UINT8 must hold at least the values 0..255. */
  
 -#ifdef HAVE_UNSIGNED_CHAR
 -typedef unsigned char UINT8;
 -#else /* not HAVE_UNSIGNED_CHAR */
 -#ifdef __CHAR_UNSIGNED__
 -typedef char UINT8;
@@ -52,19 +39,21 @@
 -#ifndef XMD_H                   /* X11/xmd.h correctly defines INT16 */
 -typedef short INT16;
 -#endif
 +typedef int16_t INT16;
  
  /* INT32 must hold at least signed 32-bit values. */
  
 -#ifndef XMD_H                   /* X11/xmd.h correctly defines INT32 */
+-#ifndef _BASETSD_H_		/* Microsoft defines it in basetsd.h */
+-#ifndef _BASETSD_H		/* MinGW is slightly different */
+-#ifndef QGLOBAL_H		/* Qt defines it in qglobal.h */
+-#define __INT32_IS_ACTUALLY_LONG
 -typedef long INT32;
 -#endif
+-#endif
+-#endif
+-#endif
 +typedef int32_t INT32;
  
  /* Datatype used for image dimensions.  The JPEG standard only supports
   * images up to 64K*64K due to 16-bit fields in SOF markers.  Therefore
-  * "unsigned int" is sufficient on all machines.  However, if you need to
-  * handle larger images and you don't mind deviating from the spec, you
-  * can change this datatype.
-  */
- 
--- a/media/libjpeg/simd/jccolext-sse2-64.asm
+++ b/media/libjpeg/simd/jccolext-sse2-64.asm
@@ -45,24 +45,24 @@ EXTN(jsimd_rgb_ycc_convert_sse2):
         sub     rsp, byte 4
         and     rsp, byte (-SIZEOF_XMMWORD)     ; align to 128 bits
         mov     [rsp],rax
         mov     rbp,rsp                         ; rbp = aligned rbp
         lea     rsp, [wk(0)]
         collect_args
         push    rbx
 
-        mov     rcx, r10
+        mov     ecx, r10d
         test    rcx,rcx
         jz      near .return
 
         push    rcx
 
         mov rsi, r12
-        mov rcx, r13
+        mov ecx, r13d
         mov     rdi, JSAMPARRAY [rsi+0*SIZEOF_JSAMPARRAY]
         mov     rbx, JSAMPARRAY [rsi+1*SIZEOF_JSAMPARRAY]
         mov     rdx, JSAMPARRAY [rsi+2*SIZEOF_JSAMPARRAY]
         lea     rdi, [rdi+rcx*SIZEOF_JSAMPROW]
         lea     rbx, [rbx+rcx*SIZEOF_JSAMPROW]
         lea     rdx, [rdx+rcx*SIZEOF_JSAMPROW]
 
         pop     rcx
--- a/media/libjpeg/simd/jcgryext-sse2-64.asm
+++ b/media/libjpeg/simd/jcgryext-sse2-64.asm
@@ -45,24 +45,24 @@ EXTN(jsimd_rgb_gray_convert_sse2):
         sub     rsp, byte 4
         and     rsp, byte (-SIZEOF_XMMWORD)     ; align to 128 bits
         mov     [rsp],rax
         mov     rbp,rsp                         ; rbp = aligned rbp
         lea     rsp, [wk(0)]
         collect_args
         push    rbx
 
-        mov     rcx, r10
+        mov     ecx, r10d
         test    rcx,rcx
         jz      near .return
 
         push    rcx
 
         mov rsi, r12
-        mov rcx, r13
+        mov ecx, r13d
         mov     rdi, JSAMPARRAY [rsi+0*SIZEOF_JSAMPARRAY]
         lea     rdi, [rdi+rcx*SIZEOF_JSAMPROW]
 
         pop     rcx
 
         mov rsi, r11
         mov     eax, r14d
         test    rax,rax
--- a/media/libjpeg/simd/jcsample-sse2-64.asm
+++ b/media/libjpeg/simd/jcsample-sse2-64.asm
@@ -44,21 +44,21 @@
         global  EXTN(jsimd_h2v1_downsample_sse2)
 
 EXTN(jsimd_h2v1_downsample_sse2):
         push    rbp
         mov     rax,rsp
         mov     rbp,rsp
         collect_args
 
-        mov rcx, r13
+        mov ecx, r13d
         shl     rcx,3                   ; imul rcx,DCTSIZE (rcx = output_cols)
         jz      near .return
 
-        mov rdx, r10
+        mov edx, r10d
 
         ; -- expand_right_edge
 
         push    rcx
         shl     rcx,1                           ; output_cols * 2
         sub     rcx,rdx
         jle     short .expand_end
 
@@ -85,17 +85,17 @@ EXTN(jsimd_h2v1_downsample_sse2):
         dec     rax
         jg      short .expandloop
 
 .expand_end:
         pop     rcx                             ; output_cols
 
         ; -- h2v1_downsample
 
-        mov     rax, r12        ; rowctr
+        mov     eax, r12d        ; rowctr
         test    eax,eax
         jle     near .return
 
         mov     rdx, 0x00010000         ; bias pattern
         movd    xmm7,edx
         pcmpeqw xmm6,xmm6
         pshufd  xmm7,xmm7,0x00          ; xmm7={0, 1, 0, 1, 0, 1, 0, 1}
         psrlw   xmm6,BYTE_BIT           ; xmm6={0xFF 0x00 0xFF 0x00 ..}
@@ -188,21 +188,21 @@ EXTN(jsimd_h2v1_downsample_sse2):
         global  EXTN(jsimd_h2v2_downsample_sse2)
 
 EXTN(jsimd_h2v2_downsample_sse2):
         push    rbp
         mov     rax,rsp
         mov     rbp,rsp
         collect_args
 
-        mov     rcx, r13
+        mov     ecx, r13d
         shl     rcx,3                   ; imul rcx,DCTSIZE (rcx = output_cols)
         jz      near .return
 
-        mov     rdx, r10
+        mov     edx, r10d
 
         ; -- expand_right_edge
 
         push    rcx
         shl     rcx,1                           ; output_cols * 2
         sub     rcx,rdx
         jle     short .expand_end
 
@@ -229,17 +229,17 @@ EXTN(jsimd_h2v2_downsample_sse2):
         dec     rax
         jg      short .expandloop
 
 .expand_end:
         pop     rcx                             ; output_cols
 
         ; -- h2v2_downsample
 
-        mov     rax, r12        ; rowctr
+        mov     eax, r12d        ; rowctr
         test    rax,rax
         jle     near .return
 
         mov     rdx, 0x00020001         ; bias pattern
         movd    xmm7,edx
         pcmpeqw xmm6,xmm6
         pshufd  xmm7,xmm7,0x00          ; xmm7={1, 2, 1, 2, 1, 2, 1, 2}
         psrlw   xmm6,BYTE_BIT           ; xmm6={0xFF 0x00 0xFF 0x00 ..}
--- a/media/libjpeg/simd/jdcolext-sse2-64.asm
+++ b/media/libjpeg/simd/jdcolext-sse2-64.asm
@@ -47,24 +47,24 @@ EXTN(jsimd_ycc_rgb_convert_sse2):
         sub     rsp, byte 4
         and     rsp, byte (-SIZEOF_XMMWORD)     ; align to 128 bits
         mov     [rsp],rax
         mov     rbp,rsp                         ; rbp = aligned rbp
         lea     rsp, [wk(0)]
         collect_args
         push    rbx
 
-        mov     rcx, r10        ; num_cols
+        mov     ecx, r10d        ; num_cols
         test    rcx,rcx
         jz      near .return
 
         push    rcx
 
         mov     rdi, r11
-        mov     rcx, r12
+        mov     ecx, r12d
         mov     rsi, JSAMPARRAY [rdi+0*SIZEOF_JSAMPARRAY]
         mov     rbx, JSAMPARRAY [rdi+1*SIZEOF_JSAMPARRAY]
         mov     rdx, JSAMPARRAY [rdi+2*SIZEOF_JSAMPARRAY]
         lea     rsi, [rsi+rcx*SIZEOF_JSAMPROW]
         lea     rbx, [rbx+rcx*SIZEOF_JSAMPROW]
         lea     rdx, [rdx+rcx*SIZEOF_JSAMPROW]
 
         pop     rcx
--- a/media/libjpeg/simd/jdmrgext-sse2-64.asm
+++ b/media/libjpeg/simd/jdmrgext-sse2-64.asm
@@ -47,24 +47,24 @@ EXTN(jsimd_h2v1_merged_upsample_sse2):
         sub     rsp, byte 4
         and     rsp, byte (-SIZEOF_XMMWORD)     ; align to 128 bits
         mov     [rsp],rax
         mov     rbp,rsp                         ; rbp = aligned rbp
         lea     rsp, [wk(0)]
         collect_args
         push    rbx
 
-        mov     rcx, r10        ; col
+        mov     ecx, r10d        ; col
         test    rcx,rcx
         jz      near .return
 
         push    rcx
 
         mov     rdi, r11
-        mov     rcx, r12
+        mov     ecx, r12d
         mov     rsi, JSAMPARRAY [rdi+0*SIZEOF_JSAMPARRAY]
         mov     rbx, JSAMPARRAY [rdi+1*SIZEOF_JSAMPARRAY]
         mov     rdx, JSAMPARRAY [rdi+2*SIZEOF_JSAMPARRAY]
         mov     rdi, r13
         mov     rsi, JSAMPROW [rsi+rcx*SIZEOF_JSAMPROW]         ; inptr0
         mov     rbx, JSAMPROW [rbx+rcx*SIZEOF_JSAMPROW]         ; inptr1
         mov     rdx, JSAMPROW [rdx+rcx*SIZEOF_JSAMPROW]         ; inptr2
         mov     rdi, JSAMPROW [rdi]                             ; outptr
@@ -450,20 +450,20 @@ EXTN(jsimd_h2v1_merged_upsample_sse2):
 
 EXTN(jsimd_h2v2_merged_upsample_sse2):
         push    rbp
         mov     rax,rsp
         mov     rbp,rsp
         collect_args
         push    rbx
 
-        mov     rax, r10
+        mov     eax, r10d
 
         mov     rdi, r11
-        mov     rcx, r12
+        mov     ecx, r12d
         mov     rsi, JSAMPARRAY [rdi+0*SIZEOF_JSAMPARRAY]
         mov     rbx, JSAMPARRAY [rdi+1*SIZEOF_JSAMPARRAY]
         mov     rdx, JSAMPARRAY [rdi+2*SIZEOF_JSAMPARRAY]
         mov     rdi, r13
         lea     rsi, [rsi+rcx*SIZEOF_JSAMPROW]
 
         push    rdx                     ; inptr2
         push    rbx                     ; inptr1
--- a/media/libjpeg/simd/jdsample-sse2-64.asm
+++ b/media/libjpeg/simd/jdsample-sse2-64.asm
@@ -62,17 +62,17 @@ PW_EIGHT        times 8 dw  8
         global  EXTN(jsimd_h2v1_fancy_upsample_sse2)
 
 EXTN(jsimd_h2v1_fancy_upsample_sse2):
         push    rbp
         mov     rax,rsp
         mov     rbp,rsp
         collect_args
 
-        mov     rax, r11  ; colctr
+        mov     eax, r11d  ; colctr
         test    rax,rax
         jz      near .return
 
         mov     rcx, r10        ; rowctr
         test    rcx,rcx
         jz      near .return
 
         mov     rsi, r12        ; input_data
@@ -209,17 +209,17 @@ EXTN(jsimd_h2v2_fancy_upsample_sse2):
         sub     rsp, byte 4
         and     rsp, byte (-SIZEOF_XMMWORD)     ; align to 128 bits
         mov     [rsp],rax
         mov     rbp,rsp                         ; rbp = aligned rbp
         lea     rsp, [wk(0)]
         collect_args
         push    rbx
 
-        mov     rax, r11  ; colctr
+        mov     eax, r11d  ; colctr
         test    rax,rax
         jz      near .return
 
         mov     rcx, r10        ; rowctr
         test    rcx,rcx
         jz      near .return
 
         mov     rsi, r12        ; input_data
@@ -501,17 +501,17 @@ EXTN(jsimd_h2v2_fancy_upsample_sse2):
         global  EXTN(jsimd_h2v1_upsample_sse2)
 
 EXTN(jsimd_h2v1_upsample_sse2):
         push    rbp
         mov     rax,rsp
         mov     rbp,rsp
         collect_args
 
-        mov     rdx, r11
+        mov     edx, r11d
         add     rdx, byte (2*SIZEOF_XMMWORD)-1
         and     rdx, byte -(2*SIZEOF_XMMWORD)
         jz      near .return
 
         mov     rcx, r10        ; rowctr
         test    rcx,rcx
         jz      short .return
 
@@ -591,17 +591,17 @@ EXTN(jsimd_h2v1_upsample_sse2):
 
 EXTN(jsimd_h2v2_upsample_sse2):
         push    rbp
         mov     rax,rsp
         mov     rbp,rsp
         collect_args
         push    rbx
 
-        mov     rdx, r11
+        mov     edx, r11d
         add     rdx, byte (2*SIZEOF_XMMWORD)-1
         and     rdx, byte -(2*SIZEOF_XMMWORD)
         jz      near .return
 
         mov     rcx, r10        ; rowctr
         test    rcx,rcx
         jz      near .return
 
--- a/media/libjpeg/simd/jidctflt-sse2-64.asm
+++ b/media/libjpeg/simd/jidctflt-sse2-64.asm
@@ -321,17 +321,17 @@ EXTN(jsimd_idct_float_sse2):
         prefetchnta [rsi + (DCTSIZE2-8)*SIZEOF_JCOEF + 2*32]
         prefetchnta [rsi + (DCTSIZE2-8)*SIZEOF_JCOEF + 3*32]
 
         ; ---- Pass 2: process rows from work array, store into output array.
 
         mov     rax, [original_rbp]
         lea     rsi, [workspace]                        ; FAST_FLOAT * wsptr
         mov     rdi, r12        ; (JSAMPROW *)
-        mov     rax, r13
+        mov     eax, r13d
         mov     rcx, DCTSIZE/4                          ; ctr
 .rowloop:
 
         ; -- Even part
 
         movaps  xmm0, XMMWORD [XMMBLOCK(0,0,rsi,SIZEOF_FAST_FLOAT)]
         movaps  xmm1, XMMWORD [XMMBLOCK(2,0,rsi,SIZEOF_FAST_FLOAT)]
         movaps  xmm2, XMMWORD [XMMBLOCK(4,0,rsi,SIZEOF_FAST_FLOAT)]
--- a/media/libjpeg/simd/jidctfst-sse2-64.asm
+++ b/media/libjpeg/simd/jidctfst-sse2-64.asm
@@ -8,17 +8,17 @@
 ; x86 SIMD extension for IJG JPEG library
 ; Copyright (C) 1999-2006, MIYASAKA Masaru.
 ; For conditions of distribution and use, see copyright notice in jsimdext.inc
 ;
 ; This file should be assembled with NASM (Netwide Assembler),
 ; can *not* be assembled with Microsoft's MASM or any compatible
 ; assembler (including Borland's Turbo Assembler).
 ; NASM is available from http://nasm.sourceforge.net/ or
-; http://sourceforge.net/projecpt/showfiles.php?group_id=6208
+; http://sourceforge.net/project/showfiles.php?group_id=6208
 ;
 ; This file contains a fast, not so accurate integer implementation of
 ; the inverse DCT (Discrete Cosine Transform). The following code is
 ; based directly on the IJG's original jidctfst.c; see the jidctfst.c
 ; for more details.
 ;
 ; [TAB8]
 
@@ -318,17 +318,17 @@ EXTN(jsimd_idct_ifast_sse2):
         prefetchnta [rsi + DCTSIZE2*SIZEOF_JCOEF + 1*32]
         prefetchnta [rsi + DCTSIZE2*SIZEOF_JCOEF + 2*32]
         prefetchnta [rsi + DCTSIZE2*SIZEOF_JCOEF + 3*32]
 
         ; ---- Pass 2: process rows from work array, store into output array.
 
         mov     rax, [original_rbp]
         mov     rdi, r12        ; (JSAMPROW *)
-        mov     rax, r13
+        mov     eax, r13d
 
         ; -- Even part
 
         ; xmm6=col0, xmm5=col2, xmm1=col4, xmm3=col6
 
         movdqa  xmm2,xmm6
         movdqa  xmm0,xmm5
         psubw   xmm6,xmm1               ; xmm6=tmp11
--- a/media/libjpeg/simd/jidctint-sse2-64.asm
+++ b/media/libjpeg/simd/jidctint-sse2-64.asm
@@ -510,17 +510,17 @@ EXTN(jsimd_idct_islow_sse2):
         prefetchnta [rsi + DCTSIZE2*SIZEOF_JCOEF + 1*32]
         prefetchnta [rsi + DCTSIZE2*SIZEOF_JCOEF + 2*32]
         prefetchnta [rsi + DCTSIZE2*SIZEOF_JCOEF + 3*32]
 
         ; ---- Pass 2: process rows from work array, store into output array.
 
         mov     rax, [original_rbp]
         mov     rdi, r12        ; (JSAMPROW *)
-        mov     rax, r13
+        mov     eax, r13d
 
         ; -- Even part
 
         ; xmm7=col0, xmm1=col2, xmm0=col4, xmm2=col6
 
         ; (Original)
         ; z1 = (z2 + z3) * 0.541196100;
         ; tmp2 = z1 + z3 * -1.847759065;
--- a/media/libjpeg/simd/jidctred-sse2-64.asm
+++ b/media/libjpeg/simd/jidctred-sse2-64.asm
@@ -307,17 +307,17 @@ EXTN(jsimd_idct_4x4_sse2):
         prefetchnta [rsi + DCTSIZE2*SIZEOF_JCOEF + 1*32]
         prefetchnta [rsi + DCTSIZE2*SIZEOF_JCOEF + 2*32]
         prefetchnta [rsi + DCTSIZE2*SIZEOF_JCOEF + 3*32]
 
         ; ---- Pass 2: process rows, store into output array.
 
         mov     rax, [original_rbp]
         mov     rdi, r12        ; (JSAMPROW *)
-        mov     rax, r13
+        mov     eax, r13d
 
         ; -- Even part
 
         pxor      xmm4,xmm4
         punpcklwd xmm4,xmm1             ; xmm4=tmp0
         psrad     xmm4,(16-CONST_BITS-1) ; psrad xmm4,16 & pslld xmm4,CONST_BITS+1
 
         ; -- Odd part
@@ -516,17 +516,17 @@ EXTN(jsimd_idct_2x2_sse2):
         prefetchnta [rsi + DCTSIZE2*SIZEOF_JCOEF + 0*32]
         prefetchnta [rsi + DCTSIZE2*SIZEOF_JCOEF + 1*32]
         prefetchnta [rsi + DCTSIZE2*SIZEOF_JCOEF + 2*32]
         prefetchnta [rsi + DCTSIZE2*SIZEOF_JCOEF + 3*32]
 
         ; ---- Pass 2: process rows, store into output array.
 
         mov     rdi, r12        ; (JSAMPROW *)
-        mov     rax, r13
+        mov     eax, r13d
 
         ; | input:| result:|
         ; | A0 B0 |        |
         ; | A1 B1 | C0 C1  |
         ; | A3 B3 | D0 D1  |
         ; | A5 B5 |        |
         ; | A7 B7 |        |
 
--- a/media/libjpeg/simd/jquantf-sse2-64.asm
+++ b/media/libjpeg/simd/jquantf-sse2-64.asm
@@ -45,17 +45,17 @@ EXTN(jsimd_convsamp_float_sse2):
         collect_args
         push    rbx
 
         pcmpeqw  xmm7,xmm7
         psllw    xmm7,7
         packsswb xmm7,xmm7              ; xmm7 = PB_CENTERJSAMPLE (0x808080..)
 
         mov rsi, r10
-        mov     rax, r11
+        mov     eax, r11d
         mov rdi, r12
         mov     rcx, DCTSIZE/2
 .convloop:
         mov     rbx, JSAMPROW [rsi+0*SIZEOF_JSAMPROW]   ; (JSAMPLE *)
         mov rdx, JSAMPROW [rsi+1*SIZEOF_JSAMPROW]       ; (JSAMPLE *)
 
         movq    xmm0, XMM_MMWORD [rbx+rax*SIZEOF_JSAMPLE]
         movq    xmm1, XMM_MMWORD [rdx+rax*SIZEOF_JSAMPLE]
--- a/media/libjpeg/simd/jquanti-sse2-64.asm
+++ b/media/libjpeg/simd/jquanti-sse2-64.asm
@@ -45,17 +45,17 @@ EXTN(jsimd_convsamp_sse2):
         collect_args
         push    rbx
 
         pxor    xmm6,xmm6               ; xmm6=(all 0's)
         pcmpeqw xmm7,xmm7
         psllw   xmm7,7                  ; xmm7={0xFF80 0xFF80 0xFF80 0xFF80 ..}
 
         mov rsi, r10
-        mov rax, r11
+        mov eax, r11d
         mov rdi, r12
         mov     rcx, DCTSIZE/4
 .convloop:
         mov     rbx, JSAMPROW [rsi+0*SIZEOF_JSAMPROW]   ; (JSAMPLE *)
         mov rdx, JSAMPROW [rsi+1*SIZEOF_JSAMPROW]       ; (JSAMPLE *)
 
         movq    xmm0, XMM_MMWORD [rbx+rax*SIZEOF_JSAMPLE]       ; xmm0=(01234567)
         movq    xmm1, XMM_MMWORD [rdx+rax*SIZEOF_JSAMPLE]       ; xmm1=(89ABCDEF)
--- a/media/libjpeg/simd/jsimd_mips.c
+++ b/media/libjpeg/simd/jsimd_mips.c
@@ -557,30 +557,34 @@ jsimd_h2v1_fancy_upsample (j_decompress_
     jsimd_h2v1_fancy_upsample_mips_dspr2(cinfo->max_v_samp_factor,
                                          compptr->downsampled_width,
                                          input_data, output_data_ptr);
 }
 
 GLOBAL(int)
 jsimd_can_h2v2_merged_upsample (void)
 {
+  init_simd();
+
   if (BITS_IN_JSAMPLE != 8)
     return 0;
   if (sizeof(JDIMENSION) != 4)
     return 0;
 
   if (simd_support & JSIMD_MIPS_DSPR2)
     return 1;
 
   return 0;
 }
 
 GLOBAL(int)
 jsimd_can_h2v1_merged_upsample (void)
 {
+  init_simd();
+
   if (BITS_IN_JSAMPLE != 8)
     return 0;
   if (sizeof(JDIMENSION) != 4)
     return 0;
 
   if (simd_support & JSIMD_MIPS_DSPR2)
     return 1;
 
--- a/media/libjpeg/simd/jsimd_mips_dspr2.S
+++ b/media/libjpeg/simd/jsimd_mips_dspr2.S
@@ -911,17 +911,18 @@ 1:
     sll            t0, t7, 2       // t0 = thiscolsum * 4
     subu           t1, t0, t7      // t1 = thiscolsum * 3
     shra_r.w       t0, t0, 4
     addiu          t1, 7
     addu           t1, t1, t6
     srl            t1, t1, 4
     sb             t0, 0(s3)
     sb             t1, 1(s3)
-    addiu          s3, 2
+    beq            t8, s0, 22f     // skip to final iteration if width == 3
+     addiu          s3, 2
 2:
     lh             t0, 0(s0)       // t0 = A3|A2
     lh             t2, 0(s1)       // t2 = B3|B2
     addiu          s0, 2
     addiu          s1, 2
     preceu.ph.qbr  t0, t0          // t0 = 0|A3|0|A2
     preceu.ph.qbr  t2, t2          // t2 = 0|B3|0|B2
     shll.ph        t1, t0, 1
@@ -944,16 +945,17 @@ 2:
     addiu          t2, 7
     srl            t2, t2, 4       // t2 = (next1*3 + next2 + 7) >> 4
     sb             t1, 0(s3)
     sb             t0, 1(s3)
     sb             t4, 2(s3)
     sb             t2, 3(s3)
     bne            t8, s0, 2b
      addiu         s3, 4
+22:
     beqz           s5, 4f
      addu          t8, s0, s5
 3:
     lbu            t0, 0(s0)
     lbu            t2, 0(s1)
     addiu          s0, 1
     addiu          s1, 1
     sll            t3, t6, 1
@@ -1804,22 +1806,21 @@ 2:
     lbu     t0, 0(t6)
     sb      t0, 0(t5)
     sb      t0, 1(t5)
     addiu   t4, -2
     addiu   t6, 1
     bgtz    t4, 2b
      addiu  t5, 2
 3:
-    ulw     t6, 0(t7)       // t6 = outptr
-    ulw     t5, 4(t7)       // t5 = outptr[1]
+    lw      t6, 0(t7)       // t6 = outptr[0]
+    lw      t5, 4(t7)       // t5 = outptr[1]
     addu    t4, t6, a1      // t4 = new end address
-    subu    t8, t4, t9
-    beqz    t8, 5f
-     nop
+    beq     a1, t9, 5f
+     subu   t8, t4, t9
 4:
     ulw     t0, 0(t6)
     ulw     t1, 4(t6)
     ulw     t2, 8(t6)
     usw     t0, 0(t5)
     ulw     t0, 12(t6)
     usw     t1, 4(t5)
     usw     t2, 8(t5)
new file mode 100755
--- /dev/null
+++ b/media/update-libjpeg.sh
@@ -0,0 +1,29 @@
+#!/bin/sh
+
+set -v -e -x
+
+if [ $# -lt 1 ]; then
+  echo "Usage: update-libjpeg.sh /path/to/libjpeg-turbo/ [tag]"
+  exit 1
+fi
+
+srcdir=`realpath $(dirname $0)`
+topsrcdir=${srcdir}/..
+rm -rf $srcdir/libjpeg
+
+repo=$1
+tag=${2-HEAD}
+
+(cd $repo; git archive --prefix=media/libjpeg/ $tag) | (cd $srcdir/..; tar xf -)
+
+cd $srcdir/libjpeg
+cp win/jsimdcfg.inc simd/
+
+revert_files="jconfig.h jconfigint.h moz.build Makefile.in MOZCHANGES mozilla.diff simd/jsimdcfg.inc"
+if test -d ${topsrcdir}/.hg; then
+    hg revert --no-backup $revert_files
+elif test -d ${topsrcdir}/.git; then
+    git checkout HEAD -- $revert_files
+fi
+
+patch -p0 -i mozilla.diff