W3C UK News (1998-2006)

Issue 52: April 2002

Editorial: Web Internationalisation

Recent surveys of web pages and web usage by Global Reach and FUNDREDES show that English-language content now accounts for only 40% of the Web. The remaining 60% is presented in other languages. Similarly, most web users are now non-native English speakers whose browsers default to the character set of another language. Extrapolating these figures suggests that the rise of non-English languages on the Web will continue, particularly for Far Eastern languages.

A consequence of this is that search engines and other web agents are now becoming smarter about the language in which pages are written and how to present them to users.

The document character set for XML and HTML 4.0 is Unicode (aka ISO 10646). This means that HTML browsers and XML processors should behave as if they used Unicode internally. But it doesn't mean that documents have to be transmitted in Unicode. As long as client and server agree on the encoding, they can use any encoding that can be converted to Unicode.
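The idea that any agreed encoding is acceptable, provided it can be converted to Unicode, can be illustrated with a minimal Python sketch. The sample text, the list of encodings and the helper names `transmit` and `receive` are assumptions chosen for illustration; they do not come from any W3C specification.

```python
# Sketch: one piece of text, several wire encodings, one Unicode result.
TEXT = "Web 国際化"  # sample text chosen for illustration

# Python codec names for encodings a server might plausibly use.
ENCODINGS = ["utf-8", "utf-16", "iso-2022-jp", "shift_jis"]


def transmit(text, encoding):
    """Simulate a server sending `text` as bytes in the agreed encoding."""
    return text.encode(encoding)


def receive(payload, encoding):
    """Simulate a client converting the received bytes back to Unicode."""
    return payload.decode(encoding)


payloads = {enc: transmit(TEXT, enc) for enc in ENCODINGS}
decoded = {enc: receive(data, enc) for enc, data in payloads.items()}

for enc in ENCODINGS:
    # The bytes on the wire differ, but the decoded text is the same.
    print(f"{enc:12} {len(payloads[enc]):3d} bytes -> {decoded[enc]!r}")
```

The byte sequences differ from one encoding to the next, yet each decodes to the same sequence of Unicode characters, which is all the document character set requires.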

It is very important that the character encoding of any XML or (X)HTML document is clearly labeled. This can be done in the following ways:

With this information, clients can easily map these encodings to Unicode. In practice, a few encodings will be preferred, most likely ISO-8859-1 (Latin-1), US-ASCII, UTF-8 and UTF-16, along with the other encodings in the ISO-8859 series, iso-2022-jp, euc-kr, and so on.
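As an illustration of how a client might use such a label, the sketch below shows three widely used labelling mechanisms (the `charset` parameter of the HTTP `Content-Type` header, the `encoding` pseudo-attribute of the XML declaration, and an HTML `meta` element) and then reads the label from an HTTP header to decode the body. The helper names `declared_charset` and `decode_body`, the sample values and the ISO-8859-1 fallback are assumptions made for this example, not a prescribed algorithm.

```python
import codecs
from email.message import Message

# Three common ways of labelling an encoding (values are illustrative only).
HTTP_HEADER = "text/html; charset=ISO-8859-1"  # HTTP Content-Type header value
XML_DECLARATION = '<?xml version="1.0" encoding="ISO-8859-1"?>'
HTML_META = '<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">'


def declared_charset(content_type):
    """Return the lower-cased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def decode_body(body, content_type, fallback="iso-8859-1"):
    """Decode `body` to Unicode using the declared charset.

    The fallback is an assumption for this sketch; a real client
    may apply different defaults when no label is present.
    """
    label = declared_charset(content_type) or fallback
    codecs.lookup(label)  # raises LookupError if the label is unknown
    return body.decode(label)


print(declared_charset(HTTP_HEADER))
print(decode_body("café".encode("iso-8859-1"), HTTP_HEADER))
```

When the label is missing, the client has to guess, which is the situation the following paragraphs warn against.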

If you are producing web pages in English, you must still declare the character set. If you do not, readers whose web browsers default to non-English character encodings will see your web pages as a jumble of incomprehensible strokes. Such readers are becoming the majority of web users, and without a character set declaration you are not presenting your material to them. With a declaration, their browsers will make the mapping and present the English text as you intended.
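The effect of a missing declaration can be sketched in Python by decoding the same bytes under different assumed defaults. The sample sentence, the choice of Shift_JIS and EUC-KR as browser defaults, and the helper name `view` are assumptions for illustration only; the example uses English text containing the non-ASCII characters "£" and "é", since those are the bytes most visibly affected.

```python
# English text with two non-ASCII characters, sent as ISO-8859-1 bytes.
TEXT = "Price: £5 at the café"
sent = TEXT.encode("iso-8859-1")


def view(payload, assumed_encoding):
    """Decode the bytes as a browser with the given default might,
    substituting a replacement character for undecodable bytes."""
    return payload.decode(assumed_encoding, errors="replace")


# With the declaration honoured, the reader sees the intended text.
as_intended = view(sent, "iso-8859-1")

# Without it, the browser falls back to its own default encoding.
as_misread = {enc: view(sent, enc) for enc in ("shift_jis", "euc-kr")}

print("declared  :", as_intended)
for enc, shown in as_misread.items():
    print(f"{enc:10}:", shown)
```

Only the first line of output reproduces the original sentence; the others show substituted or unrelated characters where the author wrote "£" and "é".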

Several UK W3C members produce web sites using non-English material (e.g. Arabic, Chinese, Japanese). A brief survey shows that most of these either do not use character set declarations or use proprietary charsets such as those provided by Microsoft. If you do not use a character set declaration, the chances are that your intended audience will not be able to read the web page. If you use a proprietary character set declaration, your web pages will not be readable by anyone who does not have that character set, which means anyone without the proprietary operating system or browser that provides it. Their tools will have no way to map from a proprietary character set that they do not know about to Unicode. By using these proprietary character sets you are vastly limiting your audience and market.
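The point about unknown character sets can be sketched as follows. Here Python's built-in codec registry stands in for the set of encodings a client happens to know, and `x-vendor-private-charset` is a made-up label used purely for illustration; the function name `can_map_to_unicode` is likewise hypothetical.

```python
import codecs


def can_map_to_unicode(charset_label):
    """Return True if this client has a converter for the labelled charset."""
    try:
        codecs.lookup(charset_label)
    except LookupError:
        # No converter is registered, so the bytes cannot be mapped to Unicode.
        return False
    return True


LABELS = [
    "utf-8",
    "iso-8859-1",
    "us-ascii",
    "iso-2022-jp",
    "euc-kr",
    "x-vendor-private-charset",  # hypothetical proprietary label
]

for label in LABELS:
    status = "ok" if can_map_to_unicode(label) else "cannot be mapped"
    print(f"{label:26} {status}")
```

A client in this position can only guess at the encoding or refuse to display the page, whereas the widely registered encodings all resolve to a known converter.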

If you are a tool producer you should ensure that your tools are capable of handling character set declarations correctly.

The UK office of W3C is currently producing a primer on Web Site Internationalisation, which will be announced in this newsletter when it is available. In the meantime, considerable guidance on internationalisation is available from W3C on the W3C internationalisation web pages.


W3C Interop Tour 2002 - Dublin Event.

The World Wide Web Consortium (W3C) will be holding a series of one day events around Europe this spring to promote W3C technology Recommendations and show how they facilitate interoperability on the World Wide Web.

The W3C Interop Tour of Europe 2002 will be holding the following events:

The Event on the 30th May in Dublin is arranged by the UK Office of W3C and it is hoped that as many readers from the UK and Ireland as possible can attend.

EuroWeb 2002 Conference.

The EuroWeb 2002 Conference will be held at St Anne's College, Oxford, UK on the 17th and 18th December 2002. EuroWeb 2002 will be a major international forum at which research on the World Wide Web, GRIDs and Web Services is presented. It follows on from the success of EuroWeb 2001, which was held in Pisa in December 2001 on the topic of the web in public administration.

Jigsaw 2.2.1 Released.

8 April 2002: Jigsaw version 2.2.1 is available for download. The new version includes a security fix for URI parsing, a new JigShell utility, XHTML/HTML validation on PUT, JigEdit support for WebDAV, Apache mod_asis, and PushCache contributed by Paul Henshaw. The release notes list all new features and bug fixes. Jigsaw is W3C's leading-edge Web server platform implemented in Java. Learn more about the Jigsaw Activity.

Platform for Privacy Preferences (P3P) Becomes a W3C Recommendation

16 April 2002: The World Wide Web Consortium today released "The Platform for Privacy Preferences 1.0 (P3P 1.0)" as a W3C Recommendation. The specification has been reviewed by the W3C Membership, who favor its adoption by industry. P3P allows people to define and publish their Web site privacy policies, and helps automate how those policies are read. P3P also gives users control over the use of their personal information on Web sites they visit, thus promoting trust and confidence in the Web. Read the press release and testimonials.

RDF Primer Working Draft Published

The RDF Core Working Group has released the first public Working Draft of the RDF Primer. The Resource Description Framework (RDF) is a general-purpose language for representing information in the Web. This primer provides the fundamentals required to use RDF in applications. Read about the Semantic Web Activity.

IsaViz - A Visual Authoring Tool for RDF Announced

W3C's Semantic Web Advanced Development initiative announces the release of IsaViz, a visual environment for browsing and authoring RDF models represented as graphs. IsaViz has a 2.5D user interface allowing smooth zooming and navigation. IsaViz supports RDF/XML and N-Triple import and export, and SVG and PNG export. Developed by Emmanuel Pietriga of W3C and Xerox Research Centre Europe, IsaViz is based on the Xerox Visual Transformation Machine, Hewlett-Packard's Jena, Graphviz from AT&T Research, and Apache's Xerces. Learn more about IsaViz.


W3C Team Presentations in April

Browse past W3C Team talks and presentations and upcoming W3C appearances and events.


New W3C Members

Please welcome:


Current Software Releases

© Chilton Computing and UKRI Science and Technology Facilities Council webmaster@chilton-computing.org.uk
Our thanks to UKRI Science and Technology Facilities Council for hosting this site