wordpress/wp-admin/includes/image.php
dmsnell ede3a79f96 Add wp_is_valid_utf8() for normalizing UTF-8 checks.
Core has several existing mechanisms for determining whether a given string contains valid UTF-8 bytes. They are spread out, and their behavior depends on which extensions are installed on the running system and on the value of `blog_charset`. The `seems_utf8()` function is one of these mechanisms.

Unfortunately, `seems_utf8()` does not properly validate UTF-8, it is slow, and its purpose is obscured by its name and historical legacy.

This patch deprecates `seems_utf8()` and introduces `wp_is_valid_utf8()`, a new, spec-compliant, efficient, and focused UTF-8 validator. The new validator defers to `mb_check_encoding()` where present and otherwise validates with a pure-PHP implementation. This makes the spec-compliant validator available on all systems regardless of their runtime environment.
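
The "defer to the extension, otherwise fall back to pure PHP" approach described above can be sketched roughly as follows. This is only an illustration, not the code from the patch: the function names `example_is_valid_utf8()` and `example_is_valid_utf8_fallback()` are hypothetical, and the fallback is a simple byte-range check based on the well-formed UTF-8 byte sequences in the Unicode Standard, which may differ from what Core actually ships.

```php
<?php
/**
 * Hypothetical pure-PHP fallback (not the Core implementation).
 * Rejects overlong encodings, surrogate halves, code points above
 * U+10FFFF, stray continuation bytes, and truncated sequences.
 */
function example_is_valid_utf8_fallback( string $bytes ): bool {
	$length = strlen( $bytes );
	$i      = 0;

	while ( $i < $length ) {
		$lead = ord( $bytes[ $i ] );

		// ASCII bytes are always valid on their own.
		if ( $lead <= 0x7F ) {
			$i++;
			continue;
		}

		// Pick the continuation count and the allowed range of the second byte.
		$low  = 0x80;
		$high = 0xBF;
		if ( $lead >= 0xC2 && $lead <= 0xDF ) {
			$need = 1;
		} elseif ( 0xE0 === $lead ) {
			$need = 2;
			$low  = 0xA0; // Excludes overlong three-byte forms.
		} elseif ( 0xED === $lead ) {
			$need = 2;
			$high = 0x9F; // Excludes surrogates U+D800..U+DFFF.
		} elseif ( $lead >= 0xE1 && $lead <= 0xEF ) {
			$need = 2;
		} elseif ( 0xF0 === $lead ) {
			$need = 3;
			$low  = 0x90; // Excludes overlong four-byte forms.
		} elseif ( 0xF4 === $lead ) {
			$need = 3;
			$high = 0x8F; // Excludes code points above U+10FFFF.
		} elseif ( $lead >= 0xF1 && $lead <= 0xF3 ) {
			$need = 3;
		} else {
			// 0x80..0xC1 and 0xF5..0xFF can never start a sequence.
			return false;
		}

		// The sequence must not run past the end of the string.
		if ( $i + $need >= $length ) {
			return false;
		}

		for ( $k = 1; $k <= $need; $k++ ) {
			$byte = ord( $bytes[ $i + $k ] );
			if ( $byte < $low || $byte > $high ) {
				return false;
			}
			// Only the second byte has a restricted range.
			$low  = 0x80;
			$high = 0xBF;
		}

		$i += $need + 1;
	}

	return true;
}

/**
 * Hypothetical wrapper: prefer mbstring when it is loaded,
 * otherwise use the pure-PHP check above.
 */
function example_is_valid_utf8( string $bytes ): bool {
	if ( function_exists( 'mb_check_encoding' ) ) {
		return mb_check_encoding( $bytes, 'UTF-8' );
	}

	return example_is_valid_utf8_fallback( $bytes );
}
```

With either path, a string such as `"caf\xC3\xA9"` should be accepted, while an overlong form such as `"\xC0\xAF"` or a surrogate half such as `"\xED\xA0\x80"` should be rejected.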

Developed in https://github.com/WordPress/wordpress-develop/pull/9317
Discussed in https://core.trac.wordpress.org/ticket/38044

Props dmsnell, jonsurrell, jorbin.
Fixes #38044.

Built from https://develop.svn.wordpress.org/trunk@60630


git-svn-id: http://core.svn.wordpress.org/trunk@59966 1a063a9b-81f0-0310-95a4-ce76da25c4cd
2025-08-12 18:15:36 +00:00

42 KiB