author Henri Sivonen <hsivonen@hsivonen.fi>
Fri, 06 Jul 2018 10:44:43 +0300
changeset 489140 4ef0f163fdeb9afeddd87b37bfd987298c038542
parent 477446 a1ce34b0b183f55cc356b4330df8daa1531cc177
child 508163 6f3709b3878117466168c40affa7bca0b60cf75b
permissions -rw-r--r--
Bug 1402247 - Use encoding_rs for XPCOM string encoding conversions. r=Nika,erahm,froydnj.

Correctness improvements:
 * UTF errors are handled safely per spec instead of dangerously truncating strings.
 * There are fewer converter implementations.

Performance improvements:
 * The old code did exact buffer length math, which meant doing UTF math twice on each input string (once for length calculation and another time for conversion). Exact length math is more complicated when handling errors properly, which the old code didn't do. The new code does UTF math on the string content only once (when converting) but risks allocating more than once. There are heuristics in place to lower the probability of reallocation in cases where the double math avoidance isn't enough of a saving to absorb an allocation and memcpy.
 * Previously, in UTF-16 <-> UTF-8 conversions, an ASCII prefix was optimized but a single non-ASCII code point pessimized the rest of the string. The new code tries to get back on the fast ASCII path.
 * UTF-16 to Latin1 conversion guarantees less about handling of out-of-range input to eliminate an operation from the inner loop on x86/x86_64.
 * When assigning to a pre-existing string, the new code tries to reuse the old buffer instead of first releasing the old buffer and then allocating a new one.
 * When reallocating from the new code, the memcpy covers only the data that is part of the logical length of the old string instead of memcpying the whole capacity. (For old callers old excess memcpy behavior is preserved due to bogus callers. See bug 1472113.)
 * UTF-8 strings in XPConnect that are in the Latin1 range are passed to SpiderMonkey as Latin1.

New features:
 * Conversion between UTF-8 and Latin1 is added in order to enable faster future interop between Rust code (or otherwise UTF-8-using code) and text node and SpiderMonkey code that uses Latin1.

MozReview-Commit-ID: JaJuExfILM9

/* -*- Mode: C++; tab-width: 8; indent-tabs-mode: nil; c-basic-offset: 2 -*- */
/* vim: set ts=8 sts=2 et sw=2 tw=80: */
/* This Source Code Form is subject to the terms of the Mozilla Public
 * License, v. 2.0. If a copy of the MPL was not distributed with this
 * file, You can obtain one at http://mozilla.org/MPL/2.0/. */

#include <functional>

#include "nsCOMPtr.h"
#include "nsISupportsImpl.h"
#include "nsIInputStreamLength.h"
#include "nsThreadUtils.h"

class nsIInputStream;

namespace mozilla {

// This class helps to retrieve the stream's length.

class InputStreamLengthHelper final : public Runnable
                                    , public nsIInputStreamLengthCallback
{
public:
  NS_DECL_ISUPPORTS_INHERITED
  NS_DECL_NSIINPUTSTREAMLENGTHCALLBACK

  // This is one of the two entry points of this class. It returns false if
  // the length cannot be obtained synchronously.
  static bool
  GetSyncLength(nsIInputStream* aStream,
                int64_t* aLength);

  // This is one of the two entry points of this class. The callback is
  // executed asynchronously once the length is known.
  static void
  GetAsyncLength(nsIInputStream* aStream,
                 const std::function<void(int64_t aLength)>& aCallback);

private:
  InputStreamLengthHelper(nsIInputStream* aStream,
                          const std::function<void(int64_t aLength)>& aCallback);

  ~InputStreamLengthHelper();

  NS_IMETHOD
  Run() override;

  void
  ExecCallback(int64_t aLength);

  nsCOMPtr<nsIInputStream> mStream;
  std::function<void(int64_t aLength)> mCallback;
};
} // mozilla namespace
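
The two entry points described in the comments above suggest a common caller pattern: try the synchronous query first, and fall back to the asynchronous one only when it returns false. The following is a minimal, self-contained sketch of that pattern, not the Gecko implementation. `FakeStream`, `TaskQueue`, and `WithLength` are hypothetical names invented for illustration, the free functions `GetSyncLength` and `GetAsyncLength` only mirror the shape of the static methods declared above, and the sketch assumes C++17 for `std::optional`.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>
#include <queue>
#include <utility>

// Hypothetical stand-in for nsIInputStream: the length may or may not be
// available without blocking.
struct FakeStream {
  std::optional<int64_t> knownLength;  // set when the length is known up front
  int64_t eventualLength;              // what a deferred query would report
};

// Hypothetical stand-in for an event loop: tasks run later, when drained.
class TaskQueue {
 public:
  void Dispatch(std::function<void()> aTask) { mTasks.push(std::move(aTask)); }

  void Drain() {
    while (!mTasks.empty()) {
      std::function<void()> task = std::move(mTasks.front());
      mTasks.pop();
      task();
    }
  }

 private:
  std::queue<std::function<void()>> mTasks;
};

// Mirrors the shape of GetSyncLength: returns false if the length cannot be
// obtained synchronously, otherwise writes it to *aLength.
bool GetSyncLength(const FakeStream& aStream, int64_t* aLength) {
  assert(aLength);
  if (!aStream.knownLength) {
    return false;
  }
  *aLength = *aStream.knownLength;
  return true;
}

// Mirrors the shape of GetAsyncLength: the callback never runs inline, only
// when the queue is drained.
void GetAsyncLength(const FakeStream& aStream, TaskQueue& aQueue,
                    const std::function<void(int64_t aLength)>& aCallback) {
  int64_t length = aStream.eventualLength;
  aQueue.Dispatch([aCallback, length] { aCallback(length); });
}

// Caller pattern: take the cheap synchronous path when possible, otherwise
// defer to the asynchronous path.
void WithLength(const FakeStream& aStream, TaskQueue& aQueue,
                const std::function<void(int64_t aLength)>& aCallback) {
  int64_t length = 0;
  if (GetSyncLength(aStream, &length)) {
    aCallback(length);
    return;
  }
  GetAsyncLength(aStream, aQueue, aCallback);
}
```

In this sketch, a stream with `knownLength` set invokes the callback immediately, while a stream without it invokes the callback only after `TaskQueue::Drain()` runs. The real class would dispatch to a thread's event target rather than a local queue.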