[NTLUG:Discuss] convert HTML to SGML

Mark Bickel exumb at exu.ericsson.se
Wed Aug 4 13:08:01 CDT 1999


> From discuss-admin at ntlug.org Wed Aug  4 10:41 CDT 1999
> From: Kendall Clark <kclark at ntlug.org>

> >>>>> "Mark" == Mark Bickel <exumb at exu.ericsson.se> writes:
> 
>     Mark> Hi all, I'm looking for programs that will convert HTML docs
>     Mark> to SGML. I've seen some that work SGML to HTML.  Any
>     Mark> sources?
> 
> Into what kind of SGML? As asked, this question is only marginally
> coherent. HTML *is* SGML already. HTML is one among many applications
> of SGML, so what you're asking is how to convert an application of
> SGML into what? Another kind of SGML?
> 
> I'm familiar with this area enough to give you some pointers, but only 
> with some more detail. :>

Yes, of course HTML is an application of SGML.
I have a bunch of M$ Word docs that need converting into SGML. The
Word docs all have a corporate standard header and footer. There
exist corporate standard SGML DTDs that incorporate equivalent
headers, footers, page layout, etc. So I can export Word -> HTML.
Matching/replacing tags can then be accomplished using roll-your-own
scripts to hammer the HTML into SGML whose output looks close to
(or better than) the original M$ Word docs. I would prefer a more
"out of the box" solution that would streamline the conversion
process, as there are thousands of pages that need this conversion.
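
For illustration, here is a minimal sketch of the roll-your-own
tag-matching approach, written in Python. The element names in TAG_MAP
(title, para, list, item, emph) are hypothetical stand-ins, not the
actual corporate DTD, and real Word-exported HTML would need far more
cleanup than this. Tags with no mapping are dropped while their text
is kept.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape

# Hypothetical mapping from HTML tags to element names in a target DTD.
# The real corporate DTD would define its own element names.
TAG_MAP = {
    "h1": "title",
    "p": "para",
    "ul": "list",
    "li": "item",
    "b": "emph",
}


class HtmlToSgml(HTMLParser):
    """Rename mapped tags, drop unmapped tags, and keep all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <P> and <p> both match.
        target = TAG_MAP.get(tag)
        if target:
            self.out.append(f"<{target}>")

    def handle_endtag(self, tag):
        target = TAG_MAP.get(tag)
        if target:
            self.out.append(f"</{target}>")

    def handle_data(self, data):
        # Re-escape &, < and > so the text stays valid markup.
        self.out.append(escape(data))


def convert(html):
    parser = HtmlToSgml()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    sample = '<H1>Spec</H1><P>See <B>A &amp; B</B> <FONT SIZE="2">now</FONT></P>'
    print(convert(sample))
```

A script like this only renames tags. It does not validate the result
against a DTD, so the output would still need to go through an SGML
parser before it could be trusted.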

The goal is worthy: elimination of Word as a document format for
an entire library of documents that must be maintained and updated
on a regular basis, and having one standard - SGML.

Best,
Mark.Bickel at ericsson.com






