HTML parsing rules at times differentiate character tokens that are all null bytes, all whitespace, or other content. This patch introduces a new function which may be used to classify text node sub-regions and lead to more efficient application of these parsing rules.
Further, when classified in this way, application code may skip some rules and decoding entirely, improving performance. For example, this can be used to ease the implementation of skipping inter-element whitespace, which is usually not rendered.
Developed in https://github.com/WordPress/wordpress-develop/pull/7236
Discussed in https://core.trac.wordpress.org/ticket/61974
Props dmsnell, jonsurrell.
Fixes#61974.
Built from https://develop.svn.wordpress.org/trunk@58970
git-svn-id: http://core.svn.wordpress.org/trunk@58366 1a063a9b-81f0-0310-95a4-ce76da25c4cd