Choosing File Formats

If you're moving from Microsoft Office to a different word processor, spreadsheet or presentation program, such as OpenOffice, you can't expect the Microsoft Office that many other people use to understand documents saved in OpenOffice's native file formats. Although millions of people may already be using OpenOffice, we are only at the beginning of the change that it, and other Free Software, is bringing about, so when you send files to other people you'll need to choose file formats they can read. This document should give you the understanding you need to decide which file formats to choose and when.

The notation underlying a file describes its structure and presentation (such as justification, typeface and highlighting in a word-processed document) to the program you open it in. That notation can take various forms, which achieve seemingly similar results but differ importantly in technical, social and political terms.
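To make the idea of 'notation' concrete, here is a minimal sketch in Python that separates the notation from the words in a small piece of HTML, one of the open formats discussed later. The sample sentence and the names NotationSplitter and split_notation are made up for illustration; only Python's standard html.parser module is assumed.

```python
from html.parser import HTMLParser


class NotationSplitter(HTMLParser):
    """Separate a document's notation (its tags) from the words it wraps."""

    def __init__(self):
        super().__init__()
        self.tags = []  # structure/presentation notation, in order of appearance
        self.text = []  # the words themselves

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        self.text.append(data)


def split_notation(html_source):
    """Return (list_of_tags, plain_text) for a fragment of HTML."""
    parser = NotationSplitter()
    parser.feed(html_source)
    parser.close()
    return parser.tags, "".join(parser.text)


# A made-up sentence with notation for a paragraph and for emphasis.
sample_html = "<p>Save in an <em>open</em> format.</p>"
tags, plain = split_notation(sample_html)
print(tags)   # the notation: ['p', 'em']
print(plain)  # the words alone: Save in an open format.
```

Saved as plain text, only the second line would survive; saved as HTML, the notation travels with the words, and because it is documented any program can interpret it.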

Whenever you save a file, you implicitly decide which file format you're saving in, so you should be mindful of the file formats you're using. In the majority of applications, when you first save, you choose a file name and a location to save to, and usually (in a box below the one for entering the name) there is an option for choosing the file format; there will usually be a default offered.

Some file formats have been developed by organisations which, for economic reasons, will not explain the notation of their file format to others, have obscured it from those who go looking, and intentionally alter it over time in a non-backward-compatible manner, so that only that organisation's software applications can read and write them and so that users are induced to upgrade and pay again for new versions.
This non-co-operative strategy significantly spoils the effectiveness of our work when using such software, and puts at risk the ability of that work to be read by people in the future.
It also locks users of that software in to using only that software for that particular function, once they've begun saving their files in its format. This is known as 'vendor lock-in'. Some file formats have been reverse-engineered by other developers and incorporated into other software applications, but this method of blindfolded development often results in an inferior file conversion.
Other file formats are defined as open standards by a standards-setting body: their underlying notation is documented and freely available, without charge, to anyone who wants to incorporate them into their own software, and they are unencumbered by copyright, patent and trade mark restrictions.

An example of the trouble we can get into with proprietary file formats is a recent law passed in the USA, the DMCA or Digital Millennium Copyright Act (which the US government is persuading other countries to implement, perhaps even using the threat of trade sanctions). Under it, converting your own documents from Microsoft's file format to someone else's could conceivably be illegal and punishable by a jail sentence, because you would be 'cracking' the file format's notation. That notation is copyrighted by Microsoft, not you, even though it is your work that is bound up in it.

When working with computers we should have the choice as to which applications we use, not have that choice taken from us by being locked in to a single file format. Microsoft, not through an ethic of quality but through illegal monopolistic practices, have successfully locked people in to their file formats and their office suite. Word's DOC format is one example, but they are trying this with all of their formats, including trying to turn the World Wide Web, whose success is almost entirely attributable to its openness, into a closed format that needs Microsoft's tools to use. OpenOffice.org are trying to break that monopoly by reverse-engineering Microsoft's formats and developing filters with which we can convert to and from them. They have also developed a new free file format (based on XML, a standard format defined by the World Wide Web Consortium) which will hopefully replace Microsoft's throughout the world. Other office suites (e.g. AbiWord) are incorporating the OpenOffice file format into their own.

The OpenOffice developers are trying to make a better world for us to work in, and the only way the change can happen is if we use the software they're creating freely for us and help displace non-free formats with free formats.

Because of hurdles Microsoft have put in the way of people reverse-engineering their file formats, there are problems with converting to and from them using OpenOffice. Doubtless these problems will be worked out in time as more people use OpenOffice and report problems with it. Using OpenOffice's own file formats should alleviate these problems.

The best advice we can give is that you use OpenOffice's file formats internally and convert (by using the 'File' -> 'Save As' menu option) to other formats when sending to people outside your organisation, remembering to choose the file format best suited to how the file will be used (more on that in a later revision of this document).

Formats that word processed work can be saved in:

OpenOffice Writer (.sxw) - an open file format, currently only readable in OpenOffice but soon in others, e.g. AbiWord

HTML (.html and .htm) - an open file format, a standard of the World Wide Web Consortium, readable by any web browser

plain text (.txt) - has no notation for presentation, readable by any word processor or plain text editor

Adobe's Portable Document Format (.pdf) - requires PDF reader or writer software, not universally accepted. The next version of OpenOffice will be able to save in this format

Microsoft's Rich Text Format (.rtf) - a Microsoft format that is more freely available than most. Not perfect, but understood by more word processors than Microsoft Word's native format

Microsoft's Word (.doc) - reading it requires expensive software that is only available for Microsoft Windows and Apple operating systems
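If you handle a lot of files, the list above can be kept as a small lookup table. The following is only a sketch in Python: the names FORMAT_NOTES and describe_format are hypothetical, and the notes simply restate the list rather than adding any new judgement about the formats.

```python
import os

# Hypothetical table: file extension -> note taken from the list above.
FORMAT_NOTES = {
    ".sxw": "OpenOffice Writer: open format, currently readable only in OpenOffice",
    ".html": "HTML: open W3C standard, readable by any web browser",
    ".htm": "HTML: open W3C standard, readable by any web browser",
    ".txt": "plain text: no presentation notation, readable by any editor",
    ".pdf": "Adobe PDF: requires PDF reader or writer software",
    ".rtf": "Microsoft RTF: more freely available than most Microsoft formats",
    ".doc": "Microsoft Word: requires Microsoft Word to read reliably",
}


def describe_format(filename):
    """Return the note for a file's format, judged by its extension only."""
    extension = os.path.splitext(filename)[1].lower()
    return FORMAT_NOTES.get(extension, "unknown format: not covered by the list")


for name in ("minutes.sxw", "Report.DOC", "index.html"):
    print(name, "-", describe_format(name))
```

Note that this looks only at the file name; it does not inspect the file's contents, so a wrongly named file will be wrongly described.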

References

OpenOffice.org: http://openoffice.org/