
For More Information, Contact:
Andrew Sallans, Head of Strategic Data Initiatives
Sherry Lake, Senior Scientific Data Consultant

File Formats and Data Types

What types of data are we talking about?

Data can mean many different things, but for management purposes it typically falls into four main categories. The category your data belongs to affects the choices you make throughout the rest of your data management plan.

Observational

  • Captured in real time
  • Usually irreplaceable
  • Examples: sensor readings, telemetry, survey results, images

Experimental

  • Data from lab equipment
  • Often reproducible, but can be expensive
  • Examples: gene sequences, chromatograms, magnetic field readings

Simulation

  • Data generated from test models
  • Models and metadata, where the input is more important than the output data
  • Examples: climate models, economic models

Derived or compiled

  • Reproducible (but very expensive)
  • Examples: text and data mining, compiled databases, 3D models

These data can come in many forms: text, numerical, multimedia, models, software, discipline-specific (e.g., FITS in astronomy, CIF in chemistry), or instrument-specific.

What are the issues around file formats?

One favorite saying is that the best part about standards is that there are plenty to choose from. This holds true for file formats, so it is important to think carefully about which file format will best support long-term preservation of, and continued access to, your data.

When choosing a format, consider whether it is:

  • Accessible in the future
  • Non-proprietary
  • Open, documented standard
  • Common, used by the research community
  • Standard representation (ASCII, Unicode)
  • Preferably not tied to specific software
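
As a rough illustration of the "standard representation" criterion, the following Python sketch checks whether a block of bytes decodes as ASCII or as UTF-8 (one common Unicode encoding). The function name `detect_text_encoding` is hypothetical, and the check is deliberately narrow: it does not recognize other Unicode encodings such as UTF-16, and a successful decode does not prove the file is meaningful text.

```python
def detect_text_encoding(data: bytes) -> str:
    """Return 'ascii', 'utf-8', or 'unknown' for a block of bytes.

    Hypothetical helper for illustration only. ASCII is tried first
    because every ASCII file is also valid UTF-8.
    """
    for encoding in ("ascii", "utf-8"):
        try:
            data.decode(encoding)
            return encoding
        except UnicodeDecodeError:
            continue
    return "unknown"
```

A file could be checked by reading it in binary mode, for example `detect_text_encoding(open(path, "rb").read())`, where `path` stands for whichever data file is under review.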

Best Formats:

  • Unencrypted
  • Uncompressed
  • PDF, not Word
  • ASCII, not Excel
  • MPEG-4, not QuickTime
  • TIFF or JPEG2000, not GIF or JPEG
  • XML or RDF, not RDBMS
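
The pairs in the list above can be turned into a simple lookup when auditing a folder of files. The sketch below is only one possible approach: the function name `suggest_format` and the specific file extensions are assumptions added for illustration, since the list names formats rather than extensions.

```python
from pathlib import Path
from typing import Optional

# Discouraged extension -> preferred format, following the list above.
# The extension spellings are assumptions; the list names formats only.
PREFERRED = {
    ".doc": "PDF",
    ".docx": "PDF",
    ".xls": "ASCII text",
    ".xlsx": "ASCII text",
    ".mov": "MPEG-4",
    ".gif": "TIFF or JPEG2000",
    ".jpg": "TIFF or JPEG2000",
    ".jpeg": "TIFF or JPEG2000",
}


def suggest_format(filename: str) -> Optional[str]:
    """Return the preferred format for a file, or None if no rule applies."""
    suffix = Path(filename).suffix.lower()  # case-insensitive match
    return PREFERRED.get(suffix)
```

A result of `None` means only that the table has no rule for that extension, not that the format is suitable for preservation.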



University of Virginia Library
PO Box 400113, Charlottesville, VA 22904-4113
ph: (434) 924-3021, fax: (434) 924-1431, library@virginia.edu

© by the Rector and Visitors of the University of Virginia