Courtesy of TJ at devnet
function formatXmlString($xml) {

  // add marker linefeeds to aid the pretty-tokeniser (adds a linefeed between all tag-end boundaries)
  $xml = preg_replace('/(>)(<)(\/*)/', "$1\n$2$3", $xml);

  // now indent the tags
  $token   = strtok($xml, "\n");
  $result  = ''; // holds formatted version as it is built
  $pad     = 0; // initial indent
  $matches = array(); // returns from preg_matches()

  // scan each line and adjust indent based on opening/closing tags
  while ($token !== false) :

    // test for the various tag states

    // 1. open and closing tags on same line - no change
    if (preg_match('/.+<\/\w[^>]*>$/', $token, $matches)) :
      $indent=0;
    // 2. closing tag - outdent now
    elseif (preg_match('/^<\/\w/', $token, $matches)) :
      $pad--;
    // 3. opening tag - don't pad this one, only subsequent tags
    elseif (preg_match('/^<\w[^>]*[^\/]>.*$/', $token, $matches)) :
      $indent=1;
    // 4. no indentation needed
    else :
      $indent = 0;
    endif;

    // pad the line with the required number of leading spaces
    $line    = str_pad($token, strlen($token)+$pad, ' ', STR_PAD_LEFT);
    $result .= $line . "\n"; // add to the cumulative result, with linefeed
    $token   = strtok("\n"); // get the next token
    $pad    += $indent; // update the pad size for subsequent lines
  endwhile;

  return $result;
}
Hi,
I just wanted to say that this is a fantastic bit of code. Great stuff.
Also, would there be a way to exclude certain tags from creating a line break and being formatted?
Thanks for the feedback, Matt. Unfortunately there’s currently no way to selectively exclude tags from being formatted and indented. Out of interest, what kind of tags did you have in mind? I could possibly put exceptions in if there was a need for them.
Thanks! Exactly what I was looking for.
Simple and powerful. Tom
Very nice code here!
Nice stuff! Would be great to have tabs instead of spaces. Something like formatXmlString($xml, $indent = "\t").
Cheers
Enrico
Doesn’t work for empty tags (an opening tag immediately followed by its closing tag). You need to remove the newline between them afterwards:
// add marker linefeeds to aid the pretty-tokeniser (adds a linefeed between all tag-end boundaries)
$xml = preg_replace('/(>)(<)(\/*)/', "$1\n$2$3", $xml);
// remove lines between an empty tag
$xml = preg_replace(”|/ ] )(\s([^>] ))?>\n\\1|”, “”, $xml);
Seems my comment got cut off:
// remove lines between an empty tag
$xml = preg_replace(”|/ ] )(\s([^>] ))?>\n\\1|”, “”, $xml);
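The pattern in the comment above was mangled when it was posted, so its exact form can't be recovered. As a rough sketch of the idea it describes (rejoining an empty element after the marker linefeeds have been added), something like the following might work. The function name joinEmptyTags and the regular expression are assumptions for illustration, not the commenter's original code.

```php
<?php
// Hypothetical helper: adds the marker linefeeds, then removes the
// newline that ends up between an opening tag and its own closing tag.
function joinEmptyTags($xml) {
    // same marker-linefeed step as the original function
    $xml = preg_replace('/(>)(<)(\/*)/', "$1\n$2$3", $xml);

    // $1 = tag name, $2 = any attributes; \1 requires the closing tag
    // on the next line to carry the same name (assumed pattern)
    $xml = preg_replace('|<([^/>\s]+)([^>]*)>\n</\1>|', '<$1$2></$1>', $xml);

    return $xml;
}
```

With the empty element back on one line, it falls under case 1 of the formatter (open and closing tags on the same line), so it should no longer disturb the indent.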
Custom padding
1. Modify the function declaration this way: function formatXmlString($xml, $padstr = " ")
2. Change the $line declaration this way:
$line = str_repeat($padstr, $pad).$token;
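Putting those two changes together, a minimal sketch of the whole function might look like this. The $padstr parameter comes from the suggestion above. Two things are additions that the suggestion does not mention: $indent is reset to 0 in the closing-tag branch, and max() guards str_repeat() against a negative count.

```php
<?php
// Sketch of the formatter with a configurable pad string ($padstr).
function formatXmlString($xml, $padstr = ' ') {
    // add marker linefeeds between all tag-end boundaries
    $xml = preg_replace('/(>)(<)(\/*)/', "$1\n$2$3", $xml);

    $token  = strtok($xml, "\n");
    $result = '';
    $pad    = 0; // current indent level
    $indent = 0; // change to apply after the current line

    while ($token !== false) {
        if (preg_match('/.+<\/\w[^>]*>$/', $token)) {
            $indent = 0;               // open and close on the same line
        } elseif (preg_match('/^<\/\w/', $token)) {
            $pad--;                    // closing tag: outdent now
            $indent = 0;               // assumption: reset, not in the suggestion above
        } elseif (preg_match('/^<\w[^>]*[^\/]>.*$/', $token)) {
            $indent = 1;               // opening tag: indent the following lines
        } else {
            $indent = 0;
        }

        // max() keeps str_repeat() from receiving a negative count
        $result .= str_repeat($padstr, max($pad, 0)) . $token . "\n";
        $token   = strtok("\n");
        $pad    += $indent;
    }

    return $result;
}
```

Calling formatXmlString($xml, "\t") would then indent with one tab per level, while the default stays at one space per level as in the original.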
You can also use the Linux xmllint command, like so:
public static function formatXmlString($xml) {
    $xml = escapeshellarg($xml);
    $result = array();
    exec('echo '.$xml.' | xmllint --format -', $result);
    $result = join("\n", $result);
    return $result;
}
very nice, thanks!
Good one, though I had a few small problems:
– I added \s* to get rid of existing whitespace:
// add marker linefeeds to aid the pretty-tokeniser (adds a linefeed between all tag-end boundaries)
$xml = preg_replace('/(>)\s*(<)(\/*)/', "$1\n$2$3", $xml);
– I also added $indent=0; because the indentation sometimes failed for me:
// 2. closing tag – outdent now
elseif (preg_match('/^<\/\w/', $token, $matches)) :
$pad--; $indent=0;
Thanks for this, it saved me a lot of time.
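For anyone wanting to try both of those tweaks at once, here is a small sketch with them applied to the original function. The name formatXmlStringPatched is made up for this example. As far as I can tell, the indent fails when an opening tag is directly followed by its closing tag: the closing-tag branch never touches $indent, so the value 1 left over from the opening tag is added again.

```php
<?php
// Hypothetical patched variant of the original formatter.
function formatXmlStringPatched($xml) {
    // \s* swallows whitespace that already sits between tags
    $xml = preg_replace('/(>)\s*(<)(\/*)/', "$1\n$2$3", $xml);

    $token  = strtok($xml, "\n");
    $result = '';
    $pad    = 0;
    $indent = 0;

    while ($token !== false) {
        if (preg_match('/.+<\/\w[^>]*>$/', $token)) {
            $indent = 0;   // open and close on the same line
        } elseif (preg_match('/^<\/\w/', $token)) {
            $pad--;        // closing tag: outdent now
            $indent = 0;   // clear any indent left over from an opening tag
        } elseif (preg_match('/^<\w[^>]*[^\/]>.*$/', $token)) {
            $indent = 1;   // opening tag: indent the following lines
        } else {
            $indent = 0;
        }

        // one space per level, as in the original; never a negative count
        $result .= str_repeat(' ', max($pad, 0)) . $token . "\n";
        $token   = strtok("\n");
        $pad    += $indent;
    }

    return $result;
}
```

With input that is already partly indented and contains an empty element, the closing tags should line up with their opening tags again.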
Thanks! Just added as a handy php macro for use in EditPlus text editor.
Really great function you wrote here.