[personal profile] sythyry
I want to transform some HTML files with XSL.
Techy details:
  • It's a lot of files, like a couple hundred.  Doing it from the command line, or in some other batch-friendly way, would be preferred.
  • It's on a Windows machine.
  • They're HTML, not XHTML.  (Which annoys me, 'cause I've got all the gadgetry to do it on XHTML)
  • I do, indeed, want to transform files -- I want to put the output files into somewhere convenient, so I can toss 'em on a web site or something.
  • I have surely forgotten details.
Thanks for any help!

Date: 2005-10-06 07:54 pm (UTC)
From: [identity profile] mattlazycat.livejournal.com
How much PHP do you know? There's a PEAR module called HTML_SAX that'll present an XML parser interface for badly formed documents (like HTML). If you can get XML_XSLT_Wrapper to use that instead of its default XML parser, it'll do transformations on multiple files for you. If not, you could use Tidy to convert your HTML to XML as an intermediary step. What sort of system are you running? I might be able to help.

I hope I'm not talking Nice Language at you! :)

Date: 2005-10-06 07:59 pm (UTC)
From: [identity profile] sythyry.livejournal.com
Nah, I've got a PhD in computer languages, I can follow. I've met PHP (but hadn't thought of using it on this) and JTidy. This approach sounds like it'll take a bit of programming to get the tools working -- I was hoping to hear about a command-line XSL processor with a -html flag on it, or something like that, but your approach sounds good if I actually have to do some work on the tools.

Date: 2005-10-06 08:00 pm (UTC)
From: [identity profile] mattlazycat.livejournal.com
If you're using Windows, then Microsoft's Command Line Transformation Utility (msxsl.exe), used together with Tidy (link in the previous comment), should do pretty much what you want. Easier than coding PHP anyway :)

Date: 2005-10-06 08:01 pm (UTC)
From: [identity profile] sythyry.livejournal.com
Sounds good! Thanks!
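
For anyone landing here later, the Tidy-then-msxsl approach from this thread could be batched roughly like this. It's only a sketch, and it assumes `tidy` and `msxsl` are both on the PATH; the function names and the folder layout are made up for illustration. Tidy turns each HTML file into XHTML (with numeric entities, so the XML parser doesn't choke on things like `&nbsp;`), and msxsl then applies the stylesheet.

```python
import subprocess
from pathlib import Path


def tidy_command(html_path, xhtml_path):
    """Build the Tidy call that converts one HTML file to XHTML."""
    # -asxhtml: emit well-formed XHTML; -numeric: numeric entities, not named ones
    return ["tidy", "-quiet", "-asxhtml", "-numeric",
            "-o", str(xhtml_path), str(html_path)]


def msxsl_command(xhtml_path, stylesheet, out_path):
    """Build the msxsl call that applies the stylesheet to one XHTML file."""
    return ["msxsl", str(xhtml_path), str(stylesheet), "-o", str(out_path)]


def transform_all(src_dir, stylesheet, tmp_dir, out_dir):
    """Run Tidy and then msxsl over every *.html file in src_dir.

    Returns the list of source files that Tidy could not repair.
    """
    tmp, out = Path(tmp_dir), Path(out_dir)
    tmp.mkdir(parents=True, exist_ok=True)
    out.mkdir(parents=True, exist_ok=True)
    failed = []
    for html_path in sorted(Path(src_dir).glob("*.html")):
        xhtml_path = tmp / (html_path.stem + ".xhtml")
        out_path = out / html_path.name
        # Tidy exits with 1 for warnings (fine) and 2 for errors (no usable output).
        tidy = subprocess.run(tidy_command(html_path, xhtml_path))
        if tidy.returncode >= 2:
            failed.append(html_path)
            continue
        subprocess.run(msxsl_command(xhtml_path, stylesheet, out_path), check=True)
    return failed
```

Calling something like `transform_all("pages", "site.xsl", "xhtml_tmp", "transformed")` (all hypothetical names) would leave the results in `transformed`, ready to toss on a web site. One thing to watch: Tidy's XHTML output puts the elements in the XHTML namespace, so the stylesheet's match patterns need a prefix bound to `http://www.w3.org/1999/xhtml`.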
