servo: Merge #12500 - Expand the documentation for DOMString (from Ms2ger:DOMString); r=nox
author Ms2ger <Ms2ger@gmail.com>
Wed, 20 Jul 2016 01:41:55 -0500
changeset 339326 90c92334a916642e221b3ff3c4477f60bf226a90
parent 339325 02cdb0067389c06511d1cfd947aeab278089c28e
child 339327 c240fafa26f6fdedc44dd6c3a7f3d172536e6b87
push id 31307
push user gszorc@mozilla.com
push date Sat, 04 Feb 2017 00:59:06 +0000
reviewers nox
servo: Merge #12500 - Expand the documentation for DOMString (from Ms2ger:DOMString); r=nox

There was some confusion on IRC about its purpose; hopefully this will clarify the situation.

Source-Repo: https://github.com/servo/servo
Source-Revision: 86ed7bfc099a73e2f560e4cf45ddb0ccbc05f0e3
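
The documentation added in this change contrasts two ways of handling unpaired surrogates when UTF-16 code units are converted to a Rust `String`: failing outright, or substituting U+FFFD. A minimal sketch of that distinction using only the standard library follows. The `SurrogatePolicy` enum and the `utf16_to_string` helper are hypothetical names chosen for illustration; this is not Servo's conversion code, and returning `None` merely stands in for the panic described in the doc comment.

```rust
/// Hypothetical policy for unpaired surrogates (illustration only, not a Servo type).
#[derive(Clone, Copy)]
enum SurrogatePolicy {
    /// Reject input containing an unpaired surrogate
    /// (stands in for the default panic described in the doc comment).
    Strict,
    /// Replace each unpaired surrogate with U+FFFD
    /// (stands in for the `-Z replace-surrogates` behavior).
    Replace,
}

/// Hypothetical helper: converts UTF-16 code units to a Rust `String`.
/// Returns `None` only under `Strict` when the input is not valid UTF-16.
fn utf16_to_string(units: &[u16], policy: SurrogatePolicy) -> Option<String> {
    match policy {
        // `String::from_utf16` returns an error on any unpaired surrogate.
        SurrogatePolicy::Strict => String::from_utf16(units).ok(),
        // `String::from_utf16_lossy` substitutes U+FFFD for invalid data.
        SurrogatePolicy::Replace => Some(String::from_utf16_lossy(units)),
    }
}
```

For example, the code units `[0x0061, 0xD800, 0x0062]` contain a lone high surrogate: the strict policy rejects them, while the replacing policy produces a three-character string with U+FFFD in the middle. Well-formed input converts identically under both policies.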
servo/components/script/dom/bindings/str.rs
--- a/servo/components/script/dom/bindings/str.rs
+++ b/servo/components/script/dom/bindings/str.rs
@@ -108,16 +108,51 @@ pub fn is_token(s: &[u8]) -> bool {
             32 => false, // separators
             x if x > 127 => false, // non-CHARs
             _ => true,
         }
     })
 }
 
 /// A DOMString.
+///
+/// This type corresponds to the [`DOMString`][idl] type in WebIDL.
+///
+/// [idl]: https://heycam.github.io/webidl/#idl-DOMString
+///
+/// Conceptually, a DOMString has the same value space as a JavaScript String,
+/// i.e., an array of 16-bit *code units* representing UTF-16, potentially with
+/// unpaired surrogates present (also sometimes called WTF-16).
+///
+/// Currently, this type stores a Rust `String`, in order to avoid issues when
+/// integrating with the rest of the Rust ecosystem and even the rest of the
+/// browser itself.
+///
+/// However, Rust `String`s are guaranteed to be valid UTF-8, and as such have
+/// a *smaller value space* than WTF-16 (i.e., some JavaScript String values
+/// cannot be represented as a Rust `String`). This raises the question of
+/// what to do with values being passed from JavaScript to Rust that contain
+/// unpaired surrogates.
+///
+/// The hypothesis is that it does not matter much how exactly those values are
+/// transformed, because passing unpaired surrogates into the DOM is very rare.
+/// In order to test this hypothesis, Servo will panic when encountering any
+/// unpaired surrogates on conversion to `DOMString` by default. (The command
+/// line option `-Z replace-surrogates` instead causes Servo to replace each
+/// unpaired surrogate with a U+FFFD replacement character.)
+///
+/// Currently, the lack of crash reports about this issue provides some
+/// evidence to support the hypothesis. This evidence will hopefully be used to
+/// convince other browser vendors that it would be safe to replace unpaired
+/// surrogates at the boundary between JavaScript and native code. (This would
+/// unify the `DOMString` and `USVString` types, both in the WebIDL standard
+/// and in Servo.)
+///
+/// This type is currently `!Send`, in order to help with an independent
+/// experiment to store `JSString`s rather than Rust `String`s.
 #[derive(Clone, Debug, Eq, Hash, HeapSizeOf, Ord, PartialEq, PartialOrd)]
 pub struct DOMString(String);
 
 impl !Send for DOMString {}
 
 impl DOMString {
     /// Creates a new `DOMString`.
     pub fn new() -> DOMString {