W3C UK News (1998-2006)

Issue 85: January 2005

What language is this web page in?

In September 2004 GlobalReach estimated that only 35% of web users are native English speakers; 35.7% use other European languages, while 32.3% use Asian languages. Those non-English users will not have browsers that default to an English character set or to the English language. They may use operating systems that do not support the proprietary character set used to develop a web page. The editorial in the April 2002 Newsletter addressed the issue of stating the character set used on a web page. This article addresses the issue of stating the actual language used.

Once the character set is stated, the web page also needs to state the natural language in which it is written. If the natural language is not stated, translation tools may not be able to translate the text automatically; search engines may not be able to filter the page correctly; CSS2 may not be able to render the page as intended; browsers may not be able to select the appropriate font; and the page may not be usable by accessibility aids such as text-to-speech readers. As a consequence, the owners, authors and hosts could all be liable for prosecution in the UK under the Disability Discrimination Act, and in other countries under their own legislation. The Disability Rights Commission is willing to support test cases brought by individual disabled people, so this legislation should not be dismissed lightly.

Character encoding does not enable unambiguous identification of a natural language. The language attribute unambiguously specifies the natural language of web page content. It should always be used to indicate the primary language of the web page (in the main page container element). If the language changes within the main page container element, this should also be reflected in a sub-container element, e.g. span, div, td or p.
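
The inheritance rule above can be sketched in a few lines of Python: the language in effect for any piece of text is the one declared on the nearest enclosing element that carries a lang attribute. This is only an illustration built on the standard library's html.parser module; the names LanguageTracker and language_runs are hypothetical, and the list of void elements is deliberately short.

```python
from html.parser import HTMLParser

# Elements that never have an end tag in HTML, so they are not tracked
# as open elements. (A deliberately short list for this sketch.)
VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}


class LanguageTracker(HTMLParser):
    """Collect (language, text) pairs, inheriting lang from enclosing elements."""

    def __init__(self):
        super().__init__()
        self.stack = []  # one (tag, language in effect) entry per open element
        self.runs = []   # collected (language, text) pairs

    def current_language(self):
        return self.stack[-1][1] if self.stack else None

    def handle_starttag(self, tag, attrs):
        if tag in VOID_ELEMENTS:
            return
        declared = dict(attrs).get("lang")
        # An element without its own lang inherits the language of its parent.
        self.stack.append((tag, declared or self.current_language()))

    def handle_endtag(self, tag):
        # Pop back to the matching open element, tolerating omitted end tags.
        for index in range(len(self.stack) - 1, -1, -1):
            if self.stack[index][0] == tag:
                del self.stack[index:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.runs.append((self.current_language(), text))


def language_runs(markup):
    """Return the (language, text) pairs found in an HTML string."""
    tracker = LanguageTracker()
    tracker.feed(markup)
    tracker.close()
    return tracker.runs
```

For a page whose html element declares lang="en-GB" and which wraps one French word in a span with lang="fr", language_runs reports the surrounding text as en-GB and only the span's text as fr.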

In XML the special attribute named xml:lang may be inserted in documents to specify the language used in the contents and attribute values of any element in an XML document as shown below. The values of the attribute are language identifiers as defined by IETF RFC 3066.

<p xml:lang="en-GB">What colour is it?</p>
<p xml:lang="en-US">What color is it?</p>
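
The two identifiers above share the primary subtag en and differ only in the country subtag. As a rough sketch, the Python function below checks that a value has the general shape RFC 3066 requires (a primary subtag of one to eight letters, followed by optional subtags of one to eight letters or digits, separated by hyphens) and splits it into its parts. The name split_language_tag is hypothetical, and the check is purely syntactic: it does not confirm that the subtags are registered codes.

```python
import re

# Shape only: 1-8 letters, then optional hyphen-separated subtags of
# 1-8 letters or digits. Registered codes are NOT checked here.
_TAG_SHAPE = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")


def split_language_tag(tag):
    """Return (primary subtag, list of further subtags), all in lower case."""
    if not _TAG_SHAPE.fullmatch(tag):
        raise ValueError(f"not shaped like an RFC 3066 language tag: {tag!r}")
    # Language tags are compared case-insensitively, so normalise the case.
    primary, *subtags = tag.lower().split("-")
    return primary, subtags
```

With this sketch, "en-GB" and "en-US" both yield the primary subtag "en", while a value written with an underscore, such as "en_GB", is rejected.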

For HTML 4, language codes are specified by adding the lang attribute to the html tag as shown below for a document in Canadian French.

<html lang="fr-CA">

When serving XHTML as text/html, you should use both the lang attribute and the xml:lang attribute in the html element. The xml:lang attribute is the standard way to identify language information in XML. The following shows how you would mark up the previous example for XHTML 1.0 served as text/html.

<html lang="fr-CA" xml:lang="fr-CA" xmlns="http://www.w3.org/1999/xhtml">

The xml:lang attribute is not actually useful for handling the file as HTML, but takes over from the lang attribute any time you treat the document as XML for, say, scripting or validation.

If you are serving XHTML 1.0 pages as XML (i.e. using a MIME type such as application/xhtml+xml) or serving pages as XHTML 1.1, you do not need the lang attribute, since lang is part of the HTML language rather than of XML. The xml:lang attribute alone will suffice.

<html xml:lang="fr-CA" xmlns="http://www.w3.org/1999/xhtml">
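
The three cases described above can be summarised in a small Python helper. This is a sketch for illustration only: the function name html_start_tag and the three serving labels are hypothetical and are not W3C terminology.

```python
from html import escape

XHTML_NAMESPACE = "http://www.w3.org/1999/xhtml"


def html_start_tag(language, serving):
    """Build the opening html tag for one of the three cases in the text.

    serving is a hypothetical label:
      "html4"         - HTML 4
      "xhtml-as-html" - XHTML 1.0 served as text/html
      "xhtml-as-xml"  - XHTML 1.0 served as XML, or XHTML 1.1
    """
    if serving == "html4":
        attributes = [("lang", language)]
    elif serving == "xhtml-as-html":
        attributes = [("lang", language), ("xml:lang", language),
                      ("xmlns", XHTML_NAMESPACE)]
    elif serving == "xhtml-as-xml":
        attributes = [("xml:lang", language), ("xmlns", XHTML_NAMESPACE)]
    else:
        raise ValueError(f"unknown serving label: {serving!r}")
    # Escape the values so that the result is always well-formed markup.
    rendered = " ".join(
        f'{name}="{escape(value, quote=True)}"' for name, value in attributes
    )
    return f"<html {rendered}>"
```

Calling html_start_tag("fr-CA", "xhtml-as-html") reproduces the tag with both attributes shown earlier, and the other two labels reproduce the HTML 4 and XML-only forms.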

Few authors write web pages by hand; most of us rely on editors and other development tools. You should still check that the code these tools produce states the character set and the language of the page. If it does not, you need to decide whether to risk the consequences described above, change your tools, or edit the pages manually.
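
One way to carry out this check on the output of a tool is a short script. The Python sketch below uses the standard library's html.parser module and reports which of the two declarations is missing from a page. The names DeclarationFinder and missing_declarations are hypothetical. The sketch assumes the character set is declared in a meta element with http-equiv="Content-Type"; it cannot see a character set sent only in the HTTP headers or in an XML declaration.

```python
from html.parser import HTMLParser


class DeclarationFinder(HTMLParser):
    """Record the language and character set declared in a page's markup."""

    def __init__(self):
        super().__init__()
        self.language = None
        self.charset = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "html" and self.language is None:
            # Accept either attribute, as either may be correct for the page.
            self.language = attributes.get("lang") or attributes.get("xml:lang")
        elif tag == "meta" and self.charset is None:
            http_equiv = (attributes.get("http-equiv") or "").lower()
            content = (attributes.get("content") or "").lower()
            if http_equiv == "content-type" and "charset=" in content:
                # Charset names are case-insensitive, so lower case is kept.
                value = content.split("charset=", 1)[1]
                self.charset = value.split(";")[0].strip()


def missing_declarations(markup):
    """Return a list naming each declaration the page fails to make."""
    finder = DeclarationFinder()
    finder.feed(markup)
    finder.close()
    missing = []
    if not finder.charset:
        missing.append("character set")
    if not finder.language:
        missing.append("language")
    return missing
```

An empty list means both declarations were found in the markup; a page with a bare html element and no Content-Type meta element is reported as missing both.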

Further guidance on setting the language of web pages is available from W3C in Tutorial: Using language information in XHTML, HTML and CSS.

W3C Celebrates Its Tenth Anniversary

2004-11-30: This year, the World Wide Web Consortium celebrates its tenth anniversary - ten years of its mission to lead the Web to its full potential. On 1 December, W3C Members, Team, invited speakers, and international media gathered in Boston, USA to reflect on the progress of the Web, W3C's central role in its growth, and the risks and opportunities facing the Web during W3C's second decade. "This special anniversary brings the opportunity to acknowledge the impact of the Web and the W3C's stewardship role," said Tim Berners-Lee, W3C Director. "I hope it will also inspire ever more collaboration, creativity, and understanding across the globe." Sign the greeting card, read the press release and read more about the W3C Tenth Anniversary Celebration.

Massachusetts Governor Declares December 2004 "W3C Month"

2004-12-09: In a proclamation issued 1 December, Massachusetts Governor Mitt Romney has declared December 2004 to be World Wide Web Consortium Month. Read by COO Steve Bratt at the W3C Tenth Anniversary Celebration, the proclamation cites W3C for "its good work and concern for the diverse users of the Web" and says W3C "earned their respect, trust and support." See the official document and read the full text.

W3C/Keio Presents at SFC Open Research Forum (ORF 2004) in Tokyo

2004-11-18: SFC Open Research Forum (ORF) (in Japanese) is an annual open house event of the Keio Research Institute of Shonan Fujisawa Campus (SFC), Keio University, Japan. At ORF 2004, W3C/Keio organized a talk session, "W3C Forum in ORF," on 24 November. Tatsuya Hagino chaired, and Masayasu Ishikawa, Martin Dürst, Yoshio Fukushige and Kazuhiro Kitagawa gave talks on Web technologies such as Compound Document Formats, Internationalization, the Semantic Web and Social Information Filtering. The event is open to interested companies and the general public.

Press Highlights

Browse W3C in the Press. A selection of articles since the last Newsletter:

W3C Team Talks

Browse upcoming W3C appearances and events.


New W3C Members

Please welcome:


Current Software Releases

© Chilton Computing and UKRI Science and Technology Facilities Council webmaster@chilton-computing.org.uk
Our thanks to UKRI Science and Technology Facilities Council for hosting this site