Universal File Interface (UFI): querying large files without database loading

Why use the Universal File Interface?

One of the main objections we hear from technologists who avoid using databases to manage their data is that their data files are very large and too cumbersome, difficult, slow, or costly to load into a database.

Database management systems typically have built-in mechanisms for loading unstructured text files, but the files used by technologists are different: they are often structured and binary. Some examples are HDF5/BioHDF, NetCDF, GRIB, NITFS, and FITS. Because these files are structured and binary, special software interfaces are needed to work with them, and loading them into a database requires writing custom code against these often complex interfaces.

The size of files worked on by technologists is growing exponentially - it is not uncommon, for example, for Next Generation DNA sequencing instruments and LOFAR astronomy instruments to generate terabytes of data in a single day. Loading such amounts of data into a traditional database is not an option!

Technologists rarely want to analyze a single file in isolation; rather, they may wish to compare different files or analyze file data in the context of other information - information that may be stored in a database, for instance. Carrying out such analyses often requires the writing of complex code to retrieve and merge information.

At BCS we have developed a solution to the challenges outlined above.  We call this solution the Universal File Interface (UFI). UFI allows the contents of files to be queried just as if the contents were stored in database tables. But database load times and large space requirements are avoided, and the file data can still be queried alongside, and joined with, actual database table data.

How does UFI work?

UFI is based on the IBM Informix Virtual Table Interface (VTI), a technology that makes external data sets appear as tables to SQL queries and statements. The UFI server communicates with file-structure-specific adapter programs. The following diagram illustrates how UFI, VTI, and the adapter programs work together.

[Diagram: UFI, VTI, and the adapter programs]

But what if I don't have an Informix DBMS?

In 2010 IBM made available the Innovator-C edition, a free version of its easy-to-administer Informix product. This edition limits the number of cores and the amount of memory available to the Informix engine, but since much of the work done by UFI happens outside the database server (i.e., in the adapter programs), this is generally not a limitation.

But is it fast?

File formats such as HDF5 and NetCDF are often used to store large, direct-access multidimensional arrays of data. UFI can take advantage of such structures and translate SQL WHERE clauses into very efficient direct accesses, reading only the parts of the array that a query asks for. In addition, UFI includes facilities for building indexes on file data.
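To make the idea of turning a WHERE clause into a direct access more concrete, here is a minimal Python sketch. It is not UFI's implementation: the grid, the axis names (lats, lons), and the functions range_to_slice and select are all hypothetical. It assumes the coordinate axes are sorted, so a range predicate on a coordinate can be mapped to an index range by binary search instead of scanning every cell.

```python
from bisect import bisect_left, bisect_right

# Hypothetical sorted coordinate axes of a 2-D gridded variable.
lats = [-90.0 + 10.0 * i for i in range(19)]  # -90, -80, ..., 90
lons = [30.0 * j for j in range(12)]          # 0, 30, ..., 330

# Stand-in for the on-disk array: grid[i][j] belongs to (lats[i], lons[j]).
grid = [[i * 100 + j for j in range(len(lons))] for i in range(len(lats))]


def range_to_slice(coords, low, high):
    """Map the predicate low <= coord <= high onto an index slice of a sorted axis."""
    return slice(bisect_left(coords, low), bisect_right(coords, high))


def select(lat_min, lat_max, lon_min, lon_max):
    """Rough equivalent of:
        SELECT lat, lon, value FROM grid
        WHERE lat BETWEEN lat_min AND lat_max
          AND lon BETWEEN lon_min AND lon_max
    Only the cells inside the computed index ranges are touched.
    """
    lat_sl = range_to_slice(lats, lat_min, lat_max)
    lon_sl = range_to_slice(lons, lon_min, lon_max)
    return [
        (lats[i], lons[j], grid[i][j])
        for i in range(lat_sl.start, lat_sl.stop)
        for j in range(lon_sl.start, lon_sl.stop)
    ]


rows = select(0, 20, 30, 60)
```

With a real HDF5 or NetCDF file, the two index ranges would presumably be handed to the file library as a hyperslab or subset request, so the cost depends on the size of the selected region rather than on the size of the file.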

What if my other database data is in a non-Informix DBMS?

UFI virtual tables behave like normal Informix tables, meaning that products such as Oracle Bridge can be used to query these "tables" alongside Oracle tables.

What file types does UFI work with?

We have written adapters for HDF5, NetCDF, CSV, GDAL, and DBF. In addition, we've developed an easy-to-use UFI Adapter SDK with which adapters for other file formats can be written.
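To give a feel for what a file-format adapter does, the following Python sketch models the general idea. It does not use the UFI Adapter SDK, whose API is not shown here; the names FileAdapter, CsvAdapter, and join_with_table, as well as the sample data, are hypothetical. The adapter exposes a file as column names plus a stream of rows, and a small hash join then combines those rows with rows that stand in for an ordinary database table.

```python
import csv
import io
from abc import ABC, abstractmethod


class FileAdapter(ABC):
    """Hypothetical adapter contract: present a file as columns and rows."""

    @abstractmethod
    def columns(self):
        """Return the list of column names."""

    @abstractmethod
    def scan(self):
        """Yield one dict per row, keyed by column name."""


class CsvAdapter(FileAdapter):
    """Adapter for CSV text whose first line is a header."""

    def __init__(self, text):
        self._text = text

    def columns(self):
        return next(csv.reader(io.StringIO(self._text)))

    def scan(self):
        yield from csv.DictReader(io.StringIO(self._text))


def join_with_table(adapter, table_rows, key):
    """Hash join: match rows streamed from the file against table rows on `key`."""
    lookup = {row[key]: row for row in table_rows}
    for file_row in adapter.scan():
        match = lookup.get(file_row[key])
        if match is not None:
            yield {**match, **file_row}


# Illustrative data only: a small "file" and a small "database table".
SAMPLE_CSV = "station_id,temp_c\nS1,12.5\nS2,9.0\nS3,15.2\n"
STATIONS = [
    {"station_id": "S1", "name": "Harbor"},
    {"station_id": "S3", "name": "Ridge"},
]

adapter = CsvAdapter(SAMPLE_CSV)
joined = list(join_with_table(adapter, STATIONS, "station_id"))
```

In UFI itself the join is expressed in SQL and carried out by the database server; the sketch only illustrates the division of labor, with format-specific reading kept inside the adapter.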

How can I give it a try?

There are two options for using UFI: a version that can be run using the (free) VMware player and the (free) Informix 11.5 Developer Edition SLES 11 virtual appliance demo (see "VMware Option" below), or a version from BCS that can be installed into an existing Informix database instance (see "Native Informix Option" below).

VMware Option

The document "The Informix / Universal File Interface (UFI) VMware Appliance" explains how to use the Universal File Interface (UFI) inside the free Informix 11.5 Developer Edition SLES 11 virtual appliance demo.

Once you've installed UFI and want to continue with further development, the document "The BCS Universal File Interface (UFI)" will tell you everything you need to know!

To see how UFI works without actually installing it, you may wish to instead just download the document "Running the Universal File Interface (UFI) NetCDF Demo on VMware", which provides screenshots of a demo that uses UFI.

UFI allows you to join file-based data with database table data in a single SQL SELECT statement. Download the document "Running the Universal File Interface (UFI) Spatial DataBlade Demo on VMware" to see illustrations of another demo, this one using UFI along with the Spatial DataBlade to view weather forecast information. You may also view the animation here.

Native Informix Option

The document "Installing the BCS Universal File Interface (UFI) Informix DataBlade" explains how to use the Universal File Interface (UFI) inside an existing Informix DBMS instance (including instances running the free Innovator-C Edition of Informix).

Once you've installed UFI and want to continue with further development, the document "The BCS Universal File Interface (UFI)" will tell you everything you need to know!

 

To find out how to use UFI with your application, contact BCS.