DOMDocument::saveHTML() incorrectly escapes IPv6 URLs in element attributes #21390

@andronocean

Description

URLs with an IPv6 address as the host use square brackets [] around the address, per RFC 3986. The saveHTML() method on DOMDocument incorrectly URL-encodes these square brackets in attributes that expect a URL value (like href, src, and action). Other attributes I tested don't seem to be affected.

The following example exercises various combinations of attributes and IPv6 URLs:

<?php
$html = <<<EOD
<html>
<head>
<link rel='stylesheet' href='http://[::1]:5173/app.css'/>
<script src='https://[::1]:5173/app.js'></script>
</head>
<body>
<a href='http://[::1]' data-custom='http://[::1]'>anchor to http://[::1]</a>
<form action='http://[::1]'></form>
<blockquote cite='http://[::1]'></blockquote>
</body>
</html>
EOD;

$document = new DOMDocument();
$document->loadHTML($html);

print $document->saveHTML();

Resulted in this output:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
<link rel="stylesheet" href="http://%5B::1%5D:5173/app.css">
<script src="https://%5B::1%5D:5173/app.js"></script>
</head>
<body>
<a href="http://%5B::1%5D" data-custom="http://[::1]">anchor</a>
<form action="http://%5B::1%5D"></form>
<blockquote cite="http://[::1]"></blockquote>
</body>
</html>

But I expected this output instead:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
<link rel="stylesheet" href="http://[::1]:5173/app.css">
<script src="https://[::1]:5173/app.js"></script>
</head>
<body>
<a href="http://[::1]" data-custom="http://[::1]">anchor</a>
<form action="http://[::1]"></form>
<blockquote cite="http://[::1]"></blockquote>
</body>
</html>

(The cite attribute on <blockquote> appears to be unaffected, even though the HTML spec defines its value as a URL.)

The internal representation of such an attribute within the class is unaffected; the escaping happens only on output with saveHTML().
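A minimal sketch of how the stored value can be compared with the serialized one, assuming ext/dom is loaded. The variable names are illustrative only, and the exact serialized string may depend on the bundled libxml2 version, so it is dumped rather than asserted:

```php
<?php
// Parse a single anchor whose href uses an IPv6 literal host.
$document = new DOMDocument();
$document->loadHTML("<a href='http://[::1]'>anchor</a>");

$anchor = $document->getElementsByTagName('a')->item(0);

// The value held in the DOM keeps the literal square brackets.
$stored = $anchor->getAttribute('href');
var_dump($stored);

// Serializing only this node shows what saveHTML() emits for the same attribute.
$serialized = $document->saveHTML($anchor);
var_dump($serialized);
```

If the two dumps differ only in `[`/`]` versus `%5B`/`%5D`, the escaping is happening at serialization time rather than at parse time.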

I also checked Dom\HTMLDocument::saveHTML(), and that method returns all attributes correctly without escaping. I know that is the preferred version today, but a great many older codebases still rely on DOMDocument.
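For a quick side-by-side check without opening the live example, something like the following could be used. It is a sketch that assumes PHP 8.4 or later (for Dom\HTMLDocument); the `$legacyOut` and `$modernOut` names are made up for illustration, and LIBXML_NOERROR is passed only to keep parser notices out of the output:

```php
<?php
// Requires PHP 8.4+ for the Dom\HTMLDocument class.
$html = "<!DOCTYPE html><html><body><a href='http://[::1]'>anchor</a></body></html>";

// Legacy class: serialize only the anchor element.
$legacy = new DOMDocument();
$legacy->loadHTML($html);
$legacyOut = $legacy->saveHTML($legacy->getElementsByTagName('a')->item(0));

// Newer class: same input, same element, serialized by the HTML5 serializer.
$modern = Dom\HTMLDocument::createFromString($html, LIBXML_NOERROR);
$modernOut = $modern->saveHtml($modern->querySelector('a'));

// Print both so the href values can be compared directly.
echo "DOMDocument:      ", $legacyOut, PHP_EOL;
echo "Dom\\HTMLDocument: ", $modernOut, PHP_EOL;
```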

Live example comparing both classes: https://3v4l.org/9gXDT#v8.4.18

PHP Version

PHP 8.4.17 (cli) (built: Jan 13 2026 17:17:10) (NTS)
Copyright (c) The PHP Group
Built by Shivam Mathur
Zend Engine v4.4.17, Copyright (c) Zend Technologies
    with Xdebug v3.5.0, Copyright (c) 2002-2025, by Derick Rethans
    with Zend OPcache v8.4.17, Copyright (c), by Zend Technologies

Operating System

macOS 15.7.4
