The long-awaited XML 1.0 Specification was released in February. XML was created and developed by the W3C XML Working Group, which includes key industry players such as Adobe, ArborText, DataChannel, Inso, Hewlett-Packard, Isogen, Microsoft, NCSA, Netscape, SoftQuad, Sun, Texcel, Vignette, and Fuji Xerox, together with experts in structured documents and electronic publishing.
XML 1.0 is a subset of SGML (Standard Generalised Markup Language, ISO 8879) for use on the Web. It retains SGML's basic features but in a form that is much easier to implement and understand. XML can be processed by existing commercial SGML tools and a rapidly growing number of free ones of its own.
XML is primarily aimed at the large-scale Web content providers for industry-specific mark-up, vendor-neutral data exchange, media-independent publishing, etc. It can also be used in metadata applications. XML is fully internationalised for both European and Asian languages, with all conforming processors required to support the Unicode character set. The language is designed for the quickest possible client-side processing consistent with its primary purpose as an electronic publishing and data interchange format.
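As a small illustration of the Unicode requirement, the sketch below feeds a UTF-8 encoded document containing Japanese text, and a second one using a numeric character reference, to a parser. It assumes Python's standard xml.etree.ElementTree module, which is not mentioned in this article and is used here only for illustration; the element names greeting and word are made up for the example.

```python
import xml.etree.ElementTree as ET

# A UTF-8 encoded document; the encoding is named in the XML declaration.
document = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<greeting lang="ja">こんにちは</greeting>'
).encode("utf-8")

root = ET.fromstring(document)
print(root.get("lang"), root.text)

# Any Unicode character can also be written as a numeric character reference.
word = ET.fromstring("<word>caf&#233;</word>")
print(word.text)
```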
The XML 1.0 specification is available at: http://www.w3.org/TR/REC-xml
XML is carefully designed to avoid the requirement for delivery of multiple document components when one will do. All external addressing in the XML domain is via standard Web addresses.
XML, while much simpler than SGML and optimised for network applications, is fully compatible with it, and so can draw on the substantial base of SGML tools and experience.
Programs to process XML are easy to write. Within a few days of the first public draft, freeware implementations arrived on the Internet. The number of implementations is now well into double figures, and is rapidly growing.
XML avoids the pitfalls of insufficient attention to internationalisation, and of being so general as to impair interoperability. This is made possible by leveraging the use of the Unicode (ISO 10646) standard for internationalised character sets.
While optimised for network delivery, the design of XML includes many features designed to support authoring, indexing, and other types of application. XML's general applicability is demonstrated by the first wave of applications which concentrate on structured machine-to-machine data interchange, and generalised metadata; none of these applications were particular design targets for the working group.
Unlike any other Internet data format, the specification of XML includes a precise and rigorous set of rules for error and exception handling. This ensures that XML data will normally be well-formed, and, when errors occur, common fallback procedures can be established.
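The strictness of these rules can be seen with any conforming parser. The sketch below is a minimal example, assuming Python's standard xml.etree.ElementTree module (chosen for illustration, not named in this article); the helper is_well_formed and the memo documents are hypothetical. A document with a mismatched end tag is rejected outright rather than silently repaired.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text parses as well-formed XML, otherwise False."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # The parser reports the error instead of guessing at a repair.
        return False

good = "<memo><title>Hello</title></memo>"
bad = "<memo><title>Hello</memo>"  # end tag does not match <title>

print(is_well_formed(good))
print(is_well_formed(bad))
```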
The main characteristic of XML is user-defined tags. A simple example is:
<?xml version="1.0"?>
<exam>
  <question>Who is the last King of England</question>
  <answer>George VI</answer>
  <question>How many queens were named Elizabeth</question>
  <answer>Two</answer>
</exam>
Elements are the most common form of mark-up. Each is delimited by a start tag and an end tag written in angle brackets, and identifies the nature of the content it encloses. Three element types (question, answer and exam) are used in the above example.
This could be used for transmitting the answers to an exam paper from one site to another. If the two parties involved had agreed the format, there is no reason why a formal Document Type Definition (DTD) needs to be specified.
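A receiving site could read such a message with only a few lines of code. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module, which is not part of this article; the helper question_answer_pairs is hypothetical. The XML declaration is written in lower case, as the XML 1.0 Recommendation requires.

```python
import xml.etree.ElementTree as ET

EXAM_XML = """<?xml version="1.0"?>
<exam>
  <question>Who is the last King of England</question>
  <answer>George VI</answer>
  <question>How many queens were named Elizabeth</question>
  <answer>Two</answer>
</exam>"""

def question_answer_pairs(xml_text):
    """Pair each question with the answer in the same position."""
    root = ET.fromstring(xml_text)
    questions = [q.text.strip() for q in root.findall("question")]
    answers = [a.text.strip() for a in root.findall("answer")]
    return list(zip(questions, answers))

for question, answer in question_answer_pairs(EXAM_XML):
    print(question, "->", answer)
```

No DTD is consulted here: the code simply relies on the two parties having agreed the element names in advance.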
However, to formalise the format of the exchange, the DTD for the above would be something like:
<!ELEMENT exam (question, answer)+ >
<!ELEMENT question (#PCDATA)>
<!ELEMENT answer (#PCDATA)>
This states that the answers to an exam paper consist of one or more questions, each followed by its answer, and that each question and each answer is of type PCDATA (Parsed Character Data).
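To show what the content model (question, answer)+ demands, the sketch below checks a document against it by hand. This is an illustration only, not a validating parser: Python's standard xml.etree.ElementTree module (an assumption, not named in this article) does not validate against DTDs, so the rule is mirrored with a regular expression over the sequence of child element names. The helper conforms_to_exam_model is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Regular expression mirroring the content model (question, answer)+
EXAM_MODEL = re.compile(r"(question answer )+")

def conforms_to_exam_model(xml_text):
    """Hand-written check of the exam content model (not DTD validation)."""
    root = ET.fromstring(xml_text)
    if root.tag != "exam":
        return False
    # The children must be question/answer pairs, in that order.
    sequence = "".join(child.tag + " " for child in root)
    if not EXAM_MODEL.fullmatch(sequence):
        return False
    # exam has element content only: no stray text between its children.
    stray = [root.text] + [child.tail for child in root]
    if any(text and text.strip() for text in stray):
        return False
    # question and answer hold character data only: no child elements.
    return all(len(child) == 0 for child in root)

valid = "<exam><question>Q1</question><answer>A1</answer></exam>"
missing_answer = "<exam><question>Q1</question></exam>"

print(conforms_to_exam_model(valid))
print(conforms_to_exam_model(missing_answer))
```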
Libwww, W3C's general-purpose Web API, provides a sample implementation of HTTP and other Internet protocols and serves as a testbed for protocol experiments within W3C. It is freely available. See: www.w3.org/Library/Distribution.html.
The recent Release 5.1j is a "second generation" HTTP/1.1 implementation that uses persistent connections, pipelining, smart output buffering, and persistent caching. This was the version of the library used to test the performance of HTTP 1.1, CSS1 and PNG. See: http://www.w3.org/Protocols/HTTP/Performance/Pipeline.html for the impressive results.
Libwww is a general code base that can be used as a basis for building a large variety of World Wide Web applications. Its main purpose is to provide services to transmit data objects rendered in many different media types either to or from a remote server using the most common Internet access methods or the local file system. It provides plain C reference implementations of those specifications and is especially designed to be used on a large set of different platforms. Version 3.1 supports more than 20 Unix flavours, VMS and Windows NT, and work is under way to extend the set of platforms.
As part of the Esprit W3C-LA Leveraging Action, a one-day technical workshop is planned at RAL on Monday 27 April 1998. This workshop has been designed to highlight the new tools and techniques that will make up the "Web of the Future".
Topics to be covered include HTTP 1.1, XML, CSS, RDF, PICS, P3P, CGM and SMIL.
Further details can be obtained from the W3C Office at RAL (w3c-ral@inf.rl.ac.uk).
Events coming up in Europe in the next few weeks:
Seven new W3C members joined in February. The number of members has now reached 242, with a regional breakdown of:
Region | Full | Affiliate |
---|---|---|
Americas | 31 | 111 |
Europe | 32 | 35 |
Asia-Oceania | 15 | 18 |
The new members are:
The annual World-Wide Web Conference in Brisbane, Australia is now only a few weeks away (14-18 April). W3C will be running a track throughout the Conference describing many of the W3C activities. W3C will also make a major contribution to Developer's Day on the last day of the Conference. This is the best time of the year to visit Brisbane!