bug 1417680 - explore the feasibility of not shutting down NSS by no-op-ing the guts of the shutdown infrastructure r=jcj r=franziskus
☠☠ backed out by 82cf8804d907 ☠ ☠
authorDavid Keeler <dkeeler@mozilla.com>
Fri, 10 Nov 2017 15:03:23 -0800
changeset 449701 4928928a5e46e20fc92f8bb65598659016e8f052
parent 449700 a7c469000c39b05fa4740dd36546018cbbe2f66c
child 449702 ce9b8bbfd224edaf9f3ced038feff465bc1d7e65
push id1648
push usermtabara@mozilla.com
push dateThu, 01 Mar 2018 12:45:47 +0000
treeherdermozilla-release@cbb9688c2eeb [default view] [failures only]
perfherder[talos] [build metrics] [platform microbench] (compared to previous push)
reviewersjcj, franziskus
bugs1417680
milestone59.0a1
first release with
nightly linux32
nightly linux64
nightly mac
nightly win32
nightly win64
last release without
nightly linux32
nightly linux64
nightly mac
nightly win32
nightly win64
bug 1417680 - explore the feasibility of not shutting down NSS by no-op-ing the guts of the shutdown infrastructure r=jcj r=franziskus Adapted from https://wiki.mozilla.org/SecurityEngineering/NSS_Startup_and_Shutdown_in_Gecko : Properly implementing the coordinated shutdown of NSS has, to date, proved intractable. For architectural reasons and due to the significant complexity involved, the NSS resource tracking and shutdown infrastructure has been an ongoing source of crashes and hangs in Firefox. To that end, we have been exploring the possibility of not shutting down NSS at all. For this to work, we have had to address a number of potential concerns. Certificate and key database corruption: In theory, if Firefox were to exit without coordinating with NSS, data stored in the certificate and key databases (backed by BerkeleyDB) could be lost. To mitigate this, we have migrated to using the sqlite-backed implementation. The databases are now journaled, and short of a bug in sqlite, we do not anticipate data loss due to database corruption. PKCS#11 devices: In theory, if Firefox were to exit without coordinating with NSS and thus any attached PKCS#11 devices, data could be lost on these devices. However, it is our understanding that these devices must be robust against unexpected physical removal. Uncoordinated shutdown should present no worse a risk to user data. FIPS 140-2 mode: While Mozilla does not ship a version of Firefox that supports FIPS mode out of the box, Red Hat does. It is our understanding that clearing key material is a requirement of FIPS and that not shutting down NSS may pose a problem for this requirement. Red Hat's FIPS 140-2 Security Policy[0] specifies that the application (i.e. Firefox) using the module (i.e. NSS) is responsible for zeroization of key material. More specifically, it says "All plaintext secret and private keys must be zeroized when the Module is shut down (with a FC_Finalize call), reinitialized (with a FC_InitToken call), or when the session is closed (with a FC_CloseSession or FC_CloseAllSessions call)." Thus, if Firefox never shuts down NSS, this requirement is trivially met. Leak detection: By not shutting down NSS, technically we leak some allocated memory until shutdown. This could cause problems if our test infrastructure detected and reported these leaks. However, it appears not to (which itself is somewhat concerning). In any case, we will have to deal with this if and when we can detect these leaks. Given that these concerns all have at least a preliminary answer, we will move forward with attempting to not shut down NSS in Firefox. This may expose unexpected issues that may lead to a reassessment of the situation, so this will be on a trial basis only in Nightly. [0] https://csrc.nist.gov/CSRC/media/projects/cryptographic-module-validation-program/documents/security-policies/140sp3070.pdf MozReview-Commit-ID: LjgEl1UZqkC
security/manager/ssl/nsNSSComponent.cpp
security/manager/ssl/nsNSSShutDown.cpp
security/manager/ssl/nsNSSShutDown.h
--- a/security/manager/ssl/nsNSSComponent.cpp
+++ b/security/manager/ssl/nsNSSComponent.cpp
@@ -2138,16 +2138,21 @@ nsNSSComponent::InitializeNSS()
 }
 
 void
 nsNSSComponent::ShutdownNSS()
 {
   MOZ_LOG(gPIPNSSLog, LogLevel::Debug, ("nsNSSComponent::ShutdownNSS\n"));
   MOZ_RELEASE_ASSERT(NS_IsMainThread());
 
+  // If we don't do this we might try to unload the loadable roots while the
+  // loadable roots loading thread is setting up EV information, which can cause
+  // it to fail to find the roots it is expecting.
+  Unused << BlockUntilLoadableRootsLoaded();
+
   // This is idempotent and can happen as a result of observing
   // profile-before-change and being called from nsNSSComponent's destructor.
   // We need to do this before other cleanup because we must avoid acquiring
   // mMutex and then preventing threads holding nsNSSShutDownPreventionLocks
   // from continuing (which is what evaporateAllNSSResourcesAndShutDown does).
   MOZ_LOG(gPIPNSSLog, LogLevel::Debug, ("evaporating psm resources"));
   if (NS_FAILED(nsNSSShutDownList::evaporateAllNSSResourcesAndShutDown())) {
     MOZ_LOG(gPIPNSSLog, LogLevel::Error, ("failed to evaporate resources"));
@@ -2182,22 +2187,17 @@ nsNSSComponent::ShutdownNSS()
   // TLSServerSocket may be run with the session cache enabled. This ensures
   // those resources are cleaned up.
   Unused << SSL_ShutdownServerSessionIDCache();
 
   // Release the default CertVerifier. This will cause any held NSS resources
   // to be released (it's not an nsNSSShutDownObject, so we have to do this
   // manually).
   mDefaultCertVerifier = nullptr;
-
-  if (NSS_Shutdown() != SECSuccess) {
-    MOZ_LOG(gPIPNSSLog, LogLevel::Error, ("NSS SHUTDOWN FAILURE"));
-  } else {
-    MOZ_LOG(gPIPNSSLog, LogLevel::Debug, ("NSS shutdown =====>> OK <<====="));
-  }
+  // We don't actually shut down NSS.
 }
 
 nsresult
 nsNSSComponent::Init()
 {
   MOZ_RELEASE_ASSERT(NS_IsMainThread());
   if (!NS_IsMainThread()) {
     return NS_ERROR_NOT_SAME_THREAD;
--- a/security/manager/ssl/nsNSSShutDown.cpp
+++ b/security/manager/ssl/nsNSSShutDown.cpp
@@ -37,51 +37,28 @@ static const PLDHashTableOps gSetOps = {
   ObjectSetInitEntry
 };
 
 StaticMutex sListLock;
 Atomic<bool> sInShutdown(false);
 nsNSSShutDownList *singleton = nullptr;
 
 nsNSSShutDownList::nsNSSShutDownList()
-  : mObjects(&gSetOps, sizeof(ObjectHashEntry))
-  , mPK11LogoutCancelObjects(&gSetOps, sizeof(ObjectHashEntry))
+  : mPK11LogoutCancelObjects(&gSetOps, sizeof(ObjectHashEntry))
 {
 }
 
 nsNSSShutDownList::~nsNSSShutDownList()
 {
   MOZ_ASSERT(this == singleton);
   MOZ_ASSERT(sInShutdown,
              "evaporateAllNSSResourcesAndShutDown() should have been called");
   singleton = nullptr;
 }
 
-void nsNSSShutDownList::remember(nsNSSShutDownObject *o)
-{
-  StaticMutexAutoLock lock(sListLock);
-  if (!nsNSSShutDownList::construct(lock)) {
-    return;
-  }
-
-  MOZ_ASSERT(o);
-  singleton->mObjects.Add(o, fallible);
-}
-
-void nsNSSShutDownList::forget(nsNSSShutDownObject *o)
-{
-  StaticMutexAutoLock lock(sListLock);
-  if (!singleton) {
-    return;
-  }
-
-  MOZ_ASSERT(o);
-  singleton->mObjects.Remove(o);
-}
-
 void nsNSSShutDownList::remember(nsOnPK11LogoutCancelObject *o)
 {
   StaticMutexAutoLock lock(sListLock);
   if (!nsNSSShutDownList::construct(lock)) {
     return;
   }
 
   MOZ_ASSERT(o);
@@ -137,156 +114,34 @@ nsresult nsNSSShutDownList::evaporateAll
 
   // This makes this function idempotent and protects against reentering it
   // (which really shouldn't happen anyway, but just in case).
   if (sInShutdown) {
     return NS_OK;
   }
 
   StaticMutexAutoLock lock(sListLock);
-  // Other threads can acquire an nsNSSShutDownPreventionLock and cause this
-  // thread to block when it calls restructActivityToCurrentThread, below. If
-  // those other threads then attempt to create an nsNSSShutDownObject, they
-  // will call nsNSSShutDownList::remember, which attempts to acquire sListLock.
-  // Consequently, holding sListLock while we're in
-  // restrictActivityToCurrentThread would result in deadlock. sListLock
-  // protects the singleton, so if we enforce that the singleton only be created
-  // and destroyed on the main thread, and if we similarly enforce that this
-  // function is only called on the main thread, what we can do is check that
-  // the singleton hasn't already gone away and then we don't actually have to
-  // hold sListLock while calling restrictActivityToCurrentThread.
   if (!singleton) {
     return NS_OK;
   }
-
-  // Setting sInShutdown here means that threads calling
-  // nsNSSShutDownList::remember will return early (because
-  // nsNSSShutDownList::construct will return false) and not attempt to remember
-  // new objects. If they properly check isAlreadyShutDown(), they will also not
-  // attempt to call NSS functions or use NSS resources.
   sInShutdown = true;
-
-  {
-    StaticMutexAutoUnlock unlock(sListLock);
-    PRStatus rv = singleton->mActivityState.restrictActivityToCurrentThread();
-    if (rv != PR_SUCCESS) {
-      MOZ_LOG(gPIPNSSLog, LogLevel::Debug,
-              ("failed to restrict activity to current thread"));
-      return NS_ERROR_FAILURE;
-    }
-  }
-
-  MOZ_LOG(gPIPNSSLog, LogLevel::Debug, ("now evaporating NSS resources"));
-  for (auto iter = singleton->mObjects.Iter(); !iter.Done(); iter.Next()) {
-    auto entry = static_cast<ObjectHashEntry*>(iter.Get());
-    entry->obj->shutdown(nsNSSShutDownObject::ShutdownCalledFrom::List);
-    iter.Remove();
-  }
-
-  singleton->mActivityState.releaseCurrentThreadActivityRestriction();
   delete singleton;
-
   return NS_OK;
 }
 
-void nsNSSShutDownList::enterActivityState(/*out*/ bool& enteredActivityState)
-{
-  StaticMutexAutoLock lock(sListLock);
-  if (nsNSSShutDownList::construct(lock)) {
-    singleton->mActivityState.enter();
-    enteredActivityState = true;
-  }
-}
-
-void nsNSSShutDownList::leaveActivityState()
-{
-  StaticMutexAutoLock lock(sListLock);
-  if (singleton) {
-    singleton->mActivityState.leave();
-  }
-}
-
 bool nsNSSShutDownList::construct(const StaticMutexAutoLock& /*proofOfLock*/)
 {
   if (sInShutdown) {
     return false;
   }
 
   if (!singleton && XRE_IsParentProcess()) {
     singleton = new nsNSSShutDownList();
   }
 
   return !!singleton;
 }
 
-nsNSSActivityState::nsNSSActivityState()
-:mNSSActivityStateLock("nsNSSActivityState.mNSSActivityStateLock"),
- mNSSActivityChanged(mNSSActivityStateLock,
-                     "nsNSSActivityState.mNSSActivityStateLock"),
- mNSSActivityCounter(0),
- mNSSRestrictedThread(nullptr)
-{
-}
-
-nsNSSActivityState::~nsNSSActivityState()
-{
-}
-
-void nsNSSActivityState::enter()
-{
-  MutexAutoLock lock(mNSSActivityStateLock);
-
-  while (mNSSRestrictedThread && mNSSRestrictedThread != PR_GetCurrentThread()) {
-    mNSSActivityChanged.Wait();
-  }
-
-  ++mNSSActivityCounter;
-}
-
-void nsNSSActivityState::leave()
-{
-  MutexAutoLock lock(mNSSActivityStateLock);
-
-  --mNSSActivityCounter;
-
-  mNSSActivityChanged.NotifyAll();
-}
-
-PRStatus nsNSSActivityState::restrictActivityToCurrentThread()
-{
-  MutexAutoLock lock(mNSSActivityStateLock);
-
-  while (mNSSActivityCounter > 0) {
-    mNSSActivityChanged.Wait(PR_TicksPerSecond());
-  }
-
-  mNSSRestrictedThread = PR_GetCurrentThread();
-
-  return PR_SUCCESS;
-}
-
-void nsNSSActivityState::releaseCurrentThreadActivityRestriction()
-{
-  MutexAutoLock lock(mNSSActivityStateLock);
-
-  mNSSRestrictedThread = nullptr;
-
-  mNSSActivityChanged.NotifyAll();
-}
-
-nsNSSShutDownPreventionLock::nsNSSShutDownPreventionLock()
-  : mEnteredActivityState(false)
-{
-  nsNSSShutDownList::enterActivityState(mEnteredActivityState);
-}
-
-nsNSSShutDownPreventionLock::~nsNSSShutDownPreventionLock()
-{
-  if (mEnteredActivityState) {
-    nsNSSShutDownList::leaveActivityState();
-  }
-}
-
 bool
 nsNSSShutDownObject::isAlreadyShutDown() const
 {
-  return mAlreadyShutDown || sInShutdown;
+  return false;
 }
--- a/security/manager/ssl/nsNSSShutDown.h
+++ b/security/manager/ssl/nsNSSShutDown.h
@@ -11,103 +11,64 @@
 #include "mozilla/Mutex.h"
 #include "mozilla/StaticMutex.h"
 #include "nscore.h"
 #include "nspr.h"
 
 class nsNSSShutDownObject;
 class nsOnPK11LogoutCancelObject;
 
-// Singleton, owned by nsNSSShutDownList
-class nsNSSActivityState
-{
-public:
-  nsNSSActivityState();
-  ~nsNSSActivityState();
-
-  // Call enter/leave when PSM enters a scope during which
-  // shutting down NSS is prohibited.
-  void enter();
-  void leave();
-  // Wait for all activity to stop, and block any other thread on entering
-  // relevant PSM code.
-  PRStatus restrictActivityToCurrentThread();
-
-  // Go back to normal state.
-  void releaseCurrentThreadActivityRestriction();
+// Yes, this races. We don't care because we're only temporarily using this to
+// silence compiler warnings. Without it every instance of a
+// nsNSSShutDownPreventionLock would be unused, causing the compiler to
+// complain. This will be removed when we've demonstrated that not shutting down
+// NSS is feasible (and beneficial) and we can remove the machinery entirely in
+// bug 1421084.
+static int sSilenceCompilerWarnings;
 
-private:
-  // The lock protecting all our member variables.
-  mozilla::Mutex mNSSActivityStateLock;
-
-  // The activity variable, bound to our lock,
-  // used either to signal the activity counter reaches zero,
-  // or a thread restriction has been released.
-  mozilla::CondVar mNSSActivityChanged;
-
-  // The number of active scopes holding resources.
-  int mNSSActivityCounter;
-
-  // nullptr means "no restriction"
-  // if not null, activity is only allowed on that thread
-  PRThread* mNSSRestrictedThread;
-};
-
-// Helper class that automatically enters/leaves the global activity state
 class nsNSSShutDownPreventionLock
 {
 public:
-  nsNSSShutDownPreventionLock();
-  ~nsNSSShutDownPreventionLock();
-private:
-  // Keeps track of whether or not we actually managed to enter the NSS activity
-  // state. This is important because if we're attempting to shut down and we
-  // can't enter the NSS activity state, we need to not attempt to leave it when
-  // our destructor runs.
-  bool mEnteredActivityState;
+  nsNSSShutDownPreventionLock()
+  {
+    sSilenceCompilerWarnings++;
+  }
+
+  ~nsNSSShutDownPreventionLock()
+  {
+    sSilenceCompilerWarnings--;
+  }
 };
 
 // Singleton, used by nsNSSComponent to track the list of PSM objects,
 // which hold NSS resources and support the "early cleanup mechanism".
 class nsNSSShutDownList
 {
 public:
-  // track instances that support early cleanup
-  static void remember(nsNSSShutDownObject *o);
-  static void forget(nsNSSShutDownObject *o);
-
   // track instances that would like notification when
   // a PK11 logout operation is performed.
   static void remember(nsOnPK11LogoutCancelObject *o);
   static void forget(nsOnPK11LogoutCancelObject *o);
 
   // Release all tracked NSS resources and prevent nsNSSShutDownObjects from
   // using NSS functions.
   static nsresult evaporateAllNSSResourcesAndShutDown();
 
   // PSM has been asked to log out of a token.
   // Notify all registered instances that want to react to that event.
   static nsresult doPK11Logout();
 
-  // Signal entering/leaving a scope where shutting down NSS is prohibited.
-  // enteredActivityState will be set to true if we actually managed to enter
-  // the NSS activity state.
-  static void enterActivityState(/*out*/ bool& enteredActivityState);
-  static void leaveActivityState();
-
 private:
   static bool construct(const mozilla::StaticMutexAutoLock& /*proofOfLock*/);
 
   nsNSSShutDownList();
   ~nsNSSShutDownList();
 
 protected:
-  PLDHashTable mObjects;
   PLDHashTable mPK11LogoutCancelObjects;
-  nsNSSActivityState mActivityState;
 };
 
 /*
   A class deriving from nsNSSShutDownObject will have its instances
   automatically tracked in a list. However, it must follow some rules
   to assure correct behaviour.
 
   The tricky part is that it is not possible to call virtual
@@ -202,50 +163,33 @@ class nsNSSShutDownObject
 public:
   enum class ShutdownCalledFrom {
     List,
     Object,
   };
 
   nsNSSShutDownObject()
   {
-    mAlreadyShutDown = false;
-    nsNSSShutDownList::remember(this);
   }
 
   virtual ~nsNSSShutDownObject()
   {
     // The derived class must call
     //   shutdown(ShutdownCalledFrom::Object);
     // in its destructor
   }
 
-  void shutdown(ShutdownCalledFrom calledFrom)
+  void shutdown(ShutdownCalledFrom)
   {
-    if (!mAlreadyShutDown) {
-      switch (calledFrom) {
-        case ShutdownCalledFrom::Object:
-          nsNSSShutDownList::forget(this);
-          break;
-        case ShutdownCalledFrom::List:
-          virtualDestroyNSSReference();
-          break;
-        default:
-          MOZ_CRASH("shutdown() called from an unknown source");
-      }
-      mAlreadyShutDown = true;
-    }
   }
 
   bool isAlreadyShutDown() const;
 
 protected:
   virtual void virtualDestroyNSSReference() = 0;
-private:
-  volatile bool mAlreadyShutDown;
 };
 
 class nsOnPK11LogoutCancelObject
 {
 public:
   nsOnPK11LogoutCancelObject()
     : mIsLoggedOut(false)
   {