Flattening LaTeX source files for use with conversion software

I’ve been working with LaTeX quite a bit recently and came across a very silly shortcoming in some of the programs that convert LaTeX to other formats. It’s moronic, but certain of them (pandoc, I’m looking at you) don’t follow includes, i.e. \input commands, at least at the time of writing.

I have a PHP script which compiles all my LaTeX source files into various destination formats, including PDF, HTML, MediaWiki markup and text. To compensate for this ridiculous incapability I added the function below, which creates a “flattened” version of any LaTeX file by inserting the contents of each included file at the point where it is included. It’s recursive, so if those included files also include files, they too are processed.

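It’s simple but effective. Pass the path to the file to flatten, and a string is returned containing the entire flattened file. Only \input commands at the start of a line are picked up. You may need to pay attention to relative paths in your \input commands if your includes are in sub-directories: they are opened relative to the script’s current working directory, not relative to the file that contains them.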

function FlattenLatexSourceFile($path) {
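	// read in file (file() returns false if it cannot be read)
	$src = file($path);
	if ($src === false) return "% (could not read {$path})\n";
	$flattened = "";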

	// process each line
	foreach ($src as $line) {
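		if (preg_match("/^\\\input\{/",$line)) {

			// include (input) line, extract filename and recurse
			$input = trim($line);
			$input = preg_replace("/^\\\input\{/","",$input);
			// drop the closing brace and anything after it (e.g. a trailing comment)
			$input = preg_replace("/\}.*$/","",$input);
			if (substr($input,-4,4) != ".tex") $input = "{$input}.tex";
			// keep the original command as a comment; trim() avoids a stray blank line
			$flattened .= "% " . trim($line) . "\n";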
			$flattened .= FlattenLatexSourceFile($input);
			$flattened .= "\n% (end of included file {$input})\n";
		} else {
			// normal line
			$flattened .= $line;
		}
	}

	// return flattened file
	return $flattened;
}
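
Since included files are opened relative to the current working directory, one way to deal with includes in sub-directories is to switch to the main file’s directory before flattening. The wrapper below is only a sketch of that idea, not part of my original script: the name FlattenFromMainDirectory and its $flatten parameter are made up for illustration, and it assumes your \input paths are written relative to the main file, as they would be when running LaTeX from that directory.

```php
// Hypothetical helper: run a flattening function from the main file's own
// directory so that relative \input paths resolve as they do under LaTeX.
// $flatten defaults to the FlattenLatexSourceFile function shown above.
function FlattenFromMainDirectory($mainFile, $flatten = 'FlattenLatexSourceFile') {
	$realPath = realpath($mainFile);
	if ($realPath === false) {
		throw new InvalidArgumentException("Cannot find file: {$mainFile}");
	}

	$previousDir = getcwd();

	// switch to the main file's directory for the duration of the call
	chdir(dirname($realPath));
	try {
		return call_user_func($flatten, basename($realPath));
	} finally {
		// always restore the caller's working directory
		chdir($previousDir);
	}
}

// Example use (the paths are placeholders):
//   $flat = FlattenFromMainDirectory('/path/to/project/main.tex');
//   file_put_contents('/path/to/output/main-flat.tex', $flat);
```

The flattened file can then be handed to the converter in place of the original. Restoring the working directory in a finally block means the rest of the script is unaffected even if flattening fails part-way through.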