sqlitestore: create new connections on new PIDs (draft)
author Gregory Szorc <gregory.szorc@gmail.com>
Tue, 11 Dec 2018 15:46:46 -0800
changeset 53701 c5e2395f0a2a766126f8726ccd928a2a73ec82e2
parent 53700 76d8b20139a3b8b5835c7262216b97275845b582
push id 1082
push user gszorc@mozilla.com
push date Wed, 12 Dec 2018 00:02:30 +0000
sqlitestore: create new connections on new PIDs

If the Mercurial process fork()s, the Python thread ID remains unchanged. The previous code for returning a SQLite connection would recycle the existing connection among all children.

Mercurial can fork() when performing working directory updates. For reasons I don't fully understand, the recycling of even a read-only SQLite connection was resulting in Python raising a "DatabaseError: database disk image is malformed" exception. This message comes from the bowels of SQLite. I suspect there is some internal client state in the SQLite database somewhere and having multiple clients race to update it results in badness. Who knows.

This commit teaches the "get a SQLite connection" logic to also verify the PID matches before returning an existing connection.

Differential Revision: https://phab.mercurial-scm.org/D5411
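
The caching rule described above can be illustrated outside of Mercurial with a minimal standalone sketch. It is not the extension's code: the class name `ConnectionCache` and its `get()` method are hypothetical, and it uses the standard library's `os.getpid()` where the patch uses `procutil.getpid()`. Like the patched `_dbconn` property, it keeps a single cached slot and only reuses the connection when both the PID and the Python thread ID match.

```python
import os
import sqlite3
import threading


class ConnectionCache:
    """Hypothetical helper: one cached SQLite connection, keyed by (PID, thread ID)."""

    def __init__(self, path):
        self._path = path
        self._cached = None  # either None or a (key, connection) tuple

    def get(self):
        # The key changes after a fork() (new PID) or on another thread
        # (new thread ID), so a stale connection is never handed out.
        key = (os.getpid(), threading.current_thread().ident)

        if self._cached is not None and self._cached[0] == key:
            return self._cached[1]

        # Key mismatch or nothing cached yet: open a fresh connection and
        # overwrite the single cached slot.
        conn = sqlite3.connect(self._path)
        self._cached = (key, conn)
        return conn
```

Because there is only one slot, a process that alternates between threads reopens the database on every switch. That matches the shape of the change in this patch, which extends the key rather than introducing a per-thread or per-process table of connections.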
hgext/sqlitestore.py
--- a/hgext/sqlitestore.py
+++ b/hgext/sqlitestore.py
@@ -71,16 +71,17 @@ from mercurial import (
     pycompat,
     registrar,
     repository,
     util,
     verify,
 )
 from mercurial.utils import (
     interfaceutil,
+    procutil,
     storageutil,
 )
 
 try:
     from mercurial import zstd
     zstd.__version__
 except ImportError:
     zstd = None
@@ -1000,27 +1001,28 @@ class sqliterepository(localrepo.localre
             self._dbconn.commit()
 
         tr.addfinalize('sqlitestore', committransaction)
 
         return tr
 
     @property
     def _dbconn(self):
-        # SQLite connections can only be used on the thread that created
-        # them. In most cases, this "just works." However, hgweb uses
-        # multiple threads.
-        tid = threading.current_thread().ident
+        # SQLite connections can only be used on the OS and Python thread that
+        # created them. In most cases, this "just works." However, hgweb uses
+        # multiple Python threads. And Mercurial may fork. So we need to check
+        # global state before returning an existing connection.
+        key = (procutil.getpid(), threading.current_thread().ident)
 
         if self._db:
-            if self._db[0] == tid:
+            if self._db[0] == key:
                 return self._db[1]
 
         db = makedb(self.svfs.join('db.sqlite'))
-        self._db = (tid, db)
+        self._db = (key, db)
 
         return db
 
 def makedb(path):
     """Construct a database handle for a database at path."""
 
     db = sqlite3.connect(encoding.strfromlocal(path))
     db.text_factory = bytes