Bug 1058335 (part 2) - Remove unneeded comments and always-ignored warnings about chaining. r=roc.
authorNicholas Nethercote <nnethercote@mozilla.com>
Mon, 25 Aug 2014 17:43:57 -0700
changeset 201816 52b9465cc90b3141a46cd05daf52c7eded1c328d
parent 201815 9ebed11d8a2bf847913e14be0e9f33a6c3421040
child 201817 857d543db683db4217da6d5eab55cad73a50fc2b
push id48265
push usernnethercote@mozilla.com
push dateWed, 27 Aug 2014 05:20:30 +0000
treeherdermozilla-inbound@52b9465cc90b [default view] [failures only]
perfherder[talos] [build metrics] [platform microbench] (compared to previous push)
reviewersroc
bugs1058335
milestone34.0a1
first release with
nightly linux32
nightly linux64
nightly mac
nightly win32
nightly win64
last release without
nightly linux32
nightly linux64
nightly mac
nightly win32
nightly win64
Bug 1058335 (part 2) - Remove unneeded comments and always-ignored warnings about chaining. r=roc.
xpcom/glue/pldhash.cpp
xpcom/glue/pldhash.h
--- a/xpcom/glue/pldhash.cpp
+++ b/xpcom/glue/pldhash.cpp
@@ -238,26 +238,16 @@ MinCapacity(uint32_t aLength)
 {
   return (aLength * 4 + (3 - 1)) / 3;   // == ceil(aLength * 4 / 3)
 }
 
 MOZ_ALWAYS_INLINE bool
 PLDHashTable::Init(const PLDHashTableOps* aOps, void* aData,
                    uint32_t aEntrySize, const fallible_t&, uint32_t aLength)
 {
-#ifdef DEBUG
-  if (aEntrySize > 16 * sizeof(void*)) {
-    printf_stderr(
-      "pldhash: for the table at address %p, the given aEntrySize"
-      " of %lu definitely favors chaining over double hashing.\n",
-      (void*)this,
-      (unsigned long) aEntrySize);
-  }
-#endif
-
   if (aLength > PL_DHASH_MAX_INITIAL_LENGTH) {
     return false;
   }
 
   ops = aOps;
   data = aData;
 
   // Compute the smallest capacity allowing |aLength| elements to be inserted
--- a/xpcom/glue/pldhash.h
+++ b/xpcom/glue/pldhash.h
@@ -162,82 +162,24 @@ typedef PLDHashOperator (*PLDHashEnumera
 typedef size_t (*PLDHashSizeOfEntryExcludingThisFun)(
   PLDHashEntryHdr* aHdr, mozilla::MallocSizeOf aMallocSizeOf, void* aArg);
 
 /*
  * A PLDHashTable is currently 8 words (without the PL_DHASHMETER overhead)
  * on most architectures, and may be allocated on the stack or within another
  * structure or class (see below for the Init and Finish functions to use).
  *
- * To decide whether to use double hashing vs. chaining, we need to develop a
- * trade-off relation, as follows:
- *
- * Let alpha be the load factor, esize the entry size in words, count the
- * entry count, and pow2 the power-of-two table size in entries.
- *
- *   (PLDHashTable overhead)    > (PLHashTable overhead)
- *   (unused table entry space) > (malloc and .next overhead per entry) +
- *                                (buckets overhead)
- *   (1 - alpha) * esize * pow2 > 2 * count + pow2
- *
- * Notice that alpha is by definition (count / pow2):
- *
- *   (1 - alpha) * esize * pow2 > 2 * alpha * pow2 + pow2
- *   (1 - alpha) * esize        > 2 * alpha + 1
- *
- *   esize > (1 + 2 * alpha) / (1 - alpha)
- *
- * This assumes both tables must keep keyHash, key, and value for each entry,
- * where key and value point to separately allocated strings or structures.
- * If key and value can be combined into one pointer, then the trade-off is:
- *
- *   esize > (1 + 3 * alpha) / (1 - alpha)
- *
- * If the entry value can be a subtype of PLDHashEntryHdr, rather than a type
- * that must be allocated separately and referenced by an entry.value pointer
- * member, and provided key's allocation can be fused with its entry's, then
- * k (the words wasted per entry with chaining) is 4.
- *
- * To see these curves, feed gnuplot input like so:
- *
- *   gnuplot> f(x,k) = (1 + k * x) / (1 - x)
- *   gnuplot> plot [0:.75] f(x,2), f(x,3), f(x,4)
- *
- * For k of 2 and a well-loaded table (alpha > .5), esize must be more than 4
- * words for chaining to be more space-efficient than double hashing.
- *
- * Solving for alpha helps us decide when to shrink an underloaded table:
- *
- *   esize                     > (1 + k * alpha) / (1 - alpha)
- *   esize - alpha * esize     > 1 + k * alpha
- *   esize - 1                 > (k + esize) * alpha
- *   (esize - 1) / (k + esize) > alpha
- *
- *   alpha < (esize - 1) / (esize + k)
- *
- * Therefore double hashing should keep alpha >= (esize - 1) / (esize + k),
- * assuming esize is not too large (in which case, chaining should probably be
- * used for any alpha).  For esize=2 and k=3, we want alpha >= .2; for esize=3
- * and k=2, we want alpha >= .4.  For k=4, esize could be 6, and alpha >= .5
- * would still obtain.
- *
- * The current implementation uses a lower bound of 0.25 for alpha when
- * deciding whether to shrink the table (while still respecting
- * PL_DHASH_MIN_CAPACITY).
- *
- * Note a qualitative difference between chaining and double hashing: under
- * chaining, entry addresses are stable across table shrinks and grows.  With
- * double hashing, you can't safely hold an entry pointer and use it after an
- * ADD or REMOVE operation, unless you sample aTable->mGeneration before adding
- * or removing, and compare the sample after, dereferencing the entry pointer
- * only if aTable->mGeneration has not changed.
- *
- * The moral of this story: there is no one-size-fits-all hash table scheme,
- * but for small table entry size, and assuming entry address stability is not
- * required, double hashing wins.
+ * There used to be a long, math-heavy comment here about the merits of
+ * double hashing vs. chaining; it was removed in bug 1058335. In short, double
+ * hashing is more space-efficient unless the element size gets large (in which
+ * case you should keep using double hashing but switch to using pointer
+ * elements). Also, with double hashing, you can't safely hold an entry pointer
+ * and use it after an ADD or REMOVE operation, unless you sample
+ * aTable->mGeneration before adding or removing, and compare the sample after,
+ * dereferencing the entry pointer only if aTable->mGeneration has not changed.
  */
 struct PLDHashTable
 {
   /*
    * Virtual operations; see below. This field is public because it's commonly
    * zeroed to indicate that a table is no longer live.
    */
   const PLDHashTableOps* ops;