Encoding errors produce mojibake: text like "café" rendering as "cafÃ©", or "—" rendering as "â€”". The root cause is a mismatch somewhere in the chain: for example, the file is saved in one encoding while the server declares another, the browser assumes a third, or the database stores a fourth. The fix is alignment: UTF-8 everywhere, BOM nowhere, utf8mb4 in MySQL. This guide walks through auditing each layer and fixing the breaks. For related fixes, see the HTML Checker Fixes index.
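To see the mechanism concretely, the sketch below reproduces the "café" case by taking the UTF-8 bytes of the word and decoding them as Windows-1252, which is roughly what a browser does when the charset is mislabeled. It assumes a shell with iconv installed; the function name mojibake_demo is made up for this illustration.

```shell
# Reproduce mojibake: the UTF-8 encoding of "é" is the two bytes 0xC3 0xA9
# (octal \303 \251). Decoding those bytes as Windows-1252 yields two
# separate characters instead of one.
mojibake_demo() {
  printf 'caf\303\251' | iconv -f WINDOWS-1252 -t UTF-8
}

mojibake_demo   # prints: cafÃ©
echo
```

The data never changed; only the declared encoding did. That is why the fix is to align declarations rather than to edit the text.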
The <head> should contain, near the top:
<meta charset="UTF-8">
If you see charset="ISO-8859-1", charset="windows-1252", or no charset at all, that's a problem.
curl -sI https://yourdomain.com/ | grep -i content-type
Expected: content-type: text/html; charset=UTF-8. If the charset is missing or differs from the one in the meta tag, that mismatch is a likely source of the mojibake.
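The comparison between the header and the meta tag can be scripted. This is a minimal sketch, not a full HTML parser: extract_charset and charsets_match are hypothetical helper names, and the sed pattern only handles the common charset=value and charset="value" forms.

```shell
# Print the charset named in a Content-Type header line or a <meta> tag,
# lowercased. Prints nothing when no charset parameter is present.
extract_charset() {
  printf '%s\n' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -n 's/.*charset="\{0,1\}\([a-z0-9_-]*\).*/\1/p'
}

# Succeed only when the header declares a charset and the meta tag agrees.
charsets_match() {
  header_cs=$(extract_charset "$1")
  meta_cs=$(extract_charset "$2")
  [ -n "$header_cs" ] && [ "$header_cs" = "$meta_cs" ]
}

# Example use against a live page (URL is a placeholder):
#   header=$(curl -sI https://yourdomain.com/ | grep -i '^content-type')
#   meta=$(curl -s https://yourdomain.com/ | grep -io '<meta[^>]*charset[^>]*>' | head -n 1)
#   charsets_match "$header" "$meta" || echo "charset mismatch"
```

Lowercasing both sides matters because UTF-8 and utf-8 are the same label; a plain string comparison would report a false mismatch.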
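file -i /path/to/template.php
Expected: charset=utf-8. charset=iso-8859-1 means the file was saved in the wrong encoding. A UTF-8 file can still begin with a Byte Order Mark (BOM) that needs removing; file -i typically does not flag it, but plain file (without -i) usually reports "with BOM" for such files.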
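When the output of file is ambiguous, the first three bytes of the file can be inspected directly: a UTF-8 BOM is always the byte sequence EF BB BF. The helper below is a small sketch; has_bom is a made-up name, and it assumes head -c and od are available (they are on common Linux and macOS systems).

```shell
# Succeed if the file begins with the UTF-8 BOM bytes EF BB BF.
has_bom() {
  [ "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}

# Example use (path is a placeholder):
#   has_bom /path/to/template.php && echo "BOM found"
```

The same check is useful after a cleanup pass, to confirm that a file no longer starts with a BOM.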
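SHOW VARIABLES LIKE 'character_set%';
You want the client, connection, database, results and server entries set to utf8mb4 (character_set_filesystem and character_set_system are expected to differ). MySQL's utf8 is an alias for utf8mb3, which stores at most 3 bytes per character, so it cannot hold emoji or other characters outside the Basic Multilingual Plane.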
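The 3-byte limit is easy to check by counting the bytes in a character's UTF-8 encoding. The sketch below uses a hypothetical helper, utf8_byte_len, and builds the test characters from octal escapes so the example does not depend on the encoding of the script file itself.

```shell
# Print the number of bytes in the given string.
utf8_byte_len() {
  printf '%s' "$1" | wc -c | tr -d ' '
}

e_acute=$(printf '\303\251')            # "é", U+00E9
euro=$(printf '\342\202\254')           # euro sign, U+20AC
grinning=$(printf '\360\237\230\200')   # grinning face emoji, U+1F600

utf8_byte_len "$e_acute"    # prints: 2
utf8_byte_len "$euro"       # prints: 3
utf8_byte_len "$grinning"   # prints: 4
```

Accented letters and most currency symbols fit in 3 bytes or fewer, so a utf8 (utf8mb3) column can look healthy until the first 4-byte character, such as an emoji, is inserted.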
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Page Title</title>
...
Browsers must see the charset declaration within the first 1024 bytes of the document. Putting it first in <head> guarantees that.
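The 1024-byte rule can be checked mechanically on a saved page or template. This is a rough sketch: charset_declared_early is a made-up name, and the grep pattern is a simple heuristic that looks for a meta tag mentioning charset rather than parsing the HTML.

```shell
# Succeed if a <meta ... charset ...> tag appears within the first
# 1024 bytes of the file.
charset_declared_early() {
  head -c 1024 "$1" | grep -qi '<meta[^>]*charset'
}

# Example use (path is a placeholder):
#   charset_declared_early /path/to/page.html || echo "charset declared too late or missing"
```

A long comment block, inline script, or large set of tags placed before the meta element is the usual reason this check fails.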
http {
charset utf-8;
charset_types text/html text/css text/plain application/javascript application/json application/xml;
}
# In httpd.conf or .htaccess
AddDefaultCharset UTF-8
AddCharset UTF-8 .html .css .js .json .xml
// At the top of any PHP file or in a global bootstrap
header('Content-Type: text/html; charset=UTF-8');
find . -type f \( -name "*.php" -o -name "*.html" \) -exec grep -l $'^\xef\xbb\xbf' {} +
Lists every file starting with a UTF-8 BOM.
find . -type f \( -name "*.php" -o -name "*.html" \) -exec sed -i '1s/^\xef\xbb\xbf//' {} +
Removes the BOM from the first line of each matching file. Test on a backup first.
mysqldump -u user -p database > backup-before-utf8mb4.sql
ALTER DATABASE mydatabase CHARACTER SET = utf8mb4 COLLATE = utf8mb4_unicode_ci;
ALTER TABLE wp_posts CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
-- repeat for every table
In WordPress, set DB_CHARSET in wp-config.php to 'utf8mb4' after migration.
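Repeating the ALTER TABLE statement by hand is error-prone on a database with many tables, so one option is to generate the statements from a list of table names. The sketch below only prints SQL and does not touch the database; gen_alter_statements is a hypothetical helper, and the file name convert-to-utf8mb4.sql in the usage comment is a placeholder.

```shell
# Read table names on stdin, one per line, and print one
# ALTER TABLE ... CONVERT statement per table. Blank lines are skipped.
gen_alter_statements() {
  while IFS= read -r table; do
    [ -n "$table" ] || continue
    printf 'ALTER TABLE `%s` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;\n' "$table"
  done
}

# Example use (review the generated file before running it):
#   mysql -N -u user -p -e 'SHOW TABLES' database | gen_alter_statements > convert-to-utf8mb4.sql
```

Writing the statements to a file first leaves a chance to review them, and to confirm the mysqldump backup exists, before anything is executed.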
// PHP / MySQLi
$mysqli->set_charset('utf8mb4');
// PHP / PDO
new PDO('mysql:host=localhost;dbname=db;charset=utf8mb4', ...);
// WordPress: handled by wp-config DB_CHARSET
Verify UTF-8 is consistent across meta, headers, files and database.