File Formats

Best practice is to save your data in standard formats that most software can interpret. The USU Libraries has created a table of some of the most commonly used formats, with a PDF version available for download.

The information in this table was compiled from several resources:

For general tips on selecting a file format, see Document and store data using stable file formats by DataONE. Data management best practices strongly suggest using open (non-proprietary) file formats when possible so that data are easier to reuse. However, any proprietary format that is a de facto standard for a profession or that is supported by multiple tools (e.g. Excel, Shapefile) can be used; see the LoC Recommended Formats Statement, section IV. Remember that some features of proprietary file formats may not be available when the files are opened in software or applications other than the original (e.g. cell shading or formulas in .xlsx files will be usable in Excel, but not in many statistical analysis packages).
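As an illustration of the advice above, the following is a minimal sketch of saving tabular data to an open format (comma-separated text, UTF-8 encoded) using only the Python standard library. The function names, the sample file name, and the example columns are hypothetical and not part of the USU guidance. Only plain values are written, so anything that depends on a proprietary format, such as cell shading or formulas, is not carried over.

```python
import csv
import io


def rows_to_csv(rows, header):
    """Return CSV text for a header and data rows (hypothetical helper).

    Only plain values are serialized; spreadsheet-specific features such as
    formulas or cell formatting have no equivalent in CSV.
    """
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def save_csv(path, rows, header):
    """Write the CSV text to disk as UTF-8 (hypothetical helper)."""
    # newline="" keeps the line endings produced by the csv module unchanged.
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(rows_to_csv(rows, header))


if __name__ == "__main__":
    # Made-up example data for illustration only.
    save_csv("observations.csv", [["A", 12.5], ["B", 9.0]], ["site", "temp_c"])
```

A file written this way can be opened by spreadsheet programs, statistical packages, and text editors alike, which is the main reason delimited text is commonly recommended for data sets with minimal metadata.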

Data Type Preferred Formats Acceptable Formats
Text Documents Open Document Format (.odt or .ods) Portable Document Format (.pdf)
XML with Standard DTD or Link to Schema (.xml)
HTML (.htm or .html)
Plain Text - encoding: UTF-8 or ASCII (.txt)
PDF - other subtypes (.pdf)
Office Open XML (.docx or .xlsx)
Rich Text Format (.rtf)
EPUB (.epub)
Data Sets with Extensive Metadata SPSS Portable Format (.por)
Delimited Text and Command ('setup') File (SPSS, Stata, SAS, etc.)
Structured Text or Mark-up File of Metadata Information (e.g. DDI XML File)
Proprietary Formats of Statistical Packages:
SPSS (.sav)
Stata (.dta)
MS Access (.mdb/.accdb)
Data Sets with Minimal Metadata Comma Separated Values (.csv)
Office Open XML (.xlsx)
Tab Delimited (.txt)
OpenDocument Spreadsheet (.ods)
dBASE (.dbf)
Vector Image Scalable Vector Graphics (.svg) Standard CAD Drawing (.dwg)
AutoCAD Drawing Interchange Format (.dxf)
Computer Graphics Metafile (.cgm)
Adobe Illustrator (.ai)
Raster Image TIFF - Uncompressed (.tif or .tiff) JPEG (.jpeg or .jpg)
JPEG2000 - Uncompressed or Lossless Compressed (.jp2 or .j2k)
PDF/A or PDF/X - Graphic Exchange Format (.pdf)
PNG (.png)
TIFF - Lossless Compressed (.tif)
GIF (.gif)
RAW Image Format (.raw)
Photoshop Files (.psd)
BMP (.bmp)
Audio Files Broadcast Wave (.wav)
Audio Interchange File Format (.aif, .aiff)
Free Lossless Audio Codec (FLAC) (.flac)
MP3 (.mp3)
Advanced Audio Coding (.aac, .m4p, .m4a)
MIDI (.mid, .midi)
Ogg Vorbis (.ogg)
Video Files AVI - Uncompressed, Motion JPEG (.avi) AVI - MPEG-4 Codec (.avi)
AVI - Lossless Compressed (.avi)
Quicktime Movie - Lossless Compressed (.mov)
Windows Media Video (.wmv)
MPEG-2 (.mpg, .mpeg)
MPEG-4 - Preferred H.264 Codec
Vector Geospatial Data ESRI Shapefile (.shp, .shx, .dbf, .sbn, .sbx, .prj, .xml) PostGIS Database
Raster Geospatial Data Geo-referenced TIFF (.tif, .tfw) Digital Raster Graphic as TIFF (.tif, .tfw, .fdg)
ESRI ArcInfo Grid (.adf, .dat, .nit)
Presentation Files PDF (.pdf)
OpenDocument Presentation (.odp)
Office Open XML (.pptx)