|
Comparing Databases without Breaking the Bank or
Your Brain
by
Andrew Z. Adkins, III
Legal Technology Institute
at the University of Florida College of Law
Note: This article first appeared in Law Technology
News, February 2003
Have
you ever been involved in a dispute where one company files a lawsuit
claiming copyright infringement on a database product? How do you
look at one database and compare it to another? It's not as difficult
as you think. With a little time and some common software tools you
probably already have in your office, you might even be able to do
the work yourself.
A
BASIC INTRODUCTION TO DATABASE TERMS
Most people have heard of databases and have a basic understanding
of what a database does - it stores data in an organized manner. There
are some basic database terms you'll need to know.
-
Data
Field Data fields store individual pieces of information.
Data fields are named and have attributes associated with them.
For example, the length of the data field describes how much data
can be stored in that field. Similarly, if the data field "expects"
to store dates, it would be formatted as a date field format such
as MO/DAY/YR. There are also other types of attributes associated
with a data field (other than the length and format described above),
but let's not get too complicated - let's stick with these for now.
-
Data
Table a database is designed using data tables. Database
designers have their own "style," meaning if a new database
application is similar to a previously designed database, they will
most likely "borrow" from their previous design and modify
it, instead of reinventing the wheel, so to speak. In most cases,
databases are designed with multiple tables and the tables are also
named. Data tables are often designed around an organization of
data fields. For example, if the database application is designed
for a retail store, there may be a data table with information about
customers. There may be another table with information about suppliers.
There may be a third table with information about employees. Each
table will have a number of data fields that describe and store
the information in that data table.
-
Relationships
you've heard the term "relational database." Relationships
are used to relate one table in a database to another table in the
database. Relationships can be complex and because we're trying
to keep this simple, let's just mention that table relationships
are part of the database structure.
-
Queries
Queries provide you with the means of searching for information
within a database. Queries are typically programmed or created by
the database designer as part of the database system, though many
databases allow you to create "ad hoc" queries.
-
Forms
Many databases use a "form" that organizes the
data fields on the computer screen (similar to a form or checklist
you may use in your office). These forms are one way to enter data
into the database.
-
Reports
Reports are used to look at the data in an organized, printed
format. Reports are typically created (initially) by the database
designer and become part of the database.
-
Data
- data is the actual information stored in the database and is entered
into the data fields. This information usually belongs to the customer
purchasing the database and not the database developer. But that
(and other legal issues) should be covered in the Agreement.
-
Data
Records remember that the database contains tables, which
in turn contains data fields. When you enter information into the
database, a data record is created. For example, if you enter the
complete customer information in the database, that collective information
about the customer (individual pieces of information in the data
fields) becomes a data record.
COMPARING
DATABASES WHAT DO YOU COMPARE?
You could run yourself ragged trying to figure out how the various
bits and pieces of a database could be compared, but when it boils
down to it, there are four issues to review: the raw data itself,
the structure of the database, how information gets into and out of
the database, and the operational system itself.
-
"The"
Data Data, often called raw data, gets into the database
either by the end user entering data directly into the system through
a form or it may be imported. If one of your legal issues involves
the actual data, then by all means, this will be something you'll
need to compare between databases. It is not uncommon for commercial
databases to be "seeded" with false data, allowing a company
one method to see if it's database has been copied.
-
Database
Structure the database structure involves the actual
design of the database. Here is where you'll find the fields, tables,
relationships, records, forms, macros, queries, modules, and reports.
If you are comparing databases, you'll need to compare each of these
elements individually. This may sound like a daunting task, and
it could be, but take them one at a time, and you'll see the methodology
will flow easier with each comparison. You may also find that you
can start with a "top down" compare methodology and stop
when you have enough information. In some cases, you may only need
to compare at the table level (remember, there are a lot more data
fields than there are tables). You may need to compare the data
fields in one table in one database with the data fields in one
table in the other database. In the same sense, you may need to
compare relationships, records, forms, macros, queries, modules
and reports. The key is to first understand what you need to compare
within the database structure. You may not need to compare everything
within the databases. For example the dispute may only focus on
the data tables and data fields, and not the other database elements.
-
Access
to the Data
how does the data get into a database? If it's a desktop
database, most likely the data is entered through a "form"
that is created for the end user. In other cases, it may be a mass
import of data. If the database is used on the Internet, then the
database designer may have created program files that interface
with an Internet Web browser. There are three common programming
systems that interface databases to the Internet: Cold Fusion, Active
Server Pages, and PHP Hypertext Pre-Processor. Without going into
a lot of technical detail and trying to keep things simple, it's
suffice to know that each of these three systems is used to program
the system to send and receive information between the Internet
Web browser and the database.
-
Operational
System - the "Look and Feel" finally, a rose
is a rose is a rose, right? Well, not necessarily, especially when
the rose falls under legal definitions. This can get somewhat complicated,
but for purposes of this article, to compare two different operational
database systems, you should count on having two computers side-by-side.
Compare how the systems are accessed, the "flow" from
one screen to another, and the arrangement of data fields on the
forms (or how the Internet interface looks). You can pretty easily
spot any particular similarities it will be up to you how
to argue those issues for or defend against.
A
POOR MAN'S METHOD OF COMPARING DATABASES
Now that you know a little about databases and what to compare, let's
discuss several tools that you may already have on your desktop computer
that you can use in your comparison. You can use your word processor
to edit and manipulate text, such as names of data tables, data fields,
attributes, and relationships. You can use your spreadsheet to count,
evaluate, and analyze text, especially if you're doing a side-by-side
comparison. If you've got a copy of Microsoft Access, you can"look"
at the data and the structure of the data base.
Let's
assume you have all three of these tools and I'll walk you through
a sample step-by-step comparison methodology. Let's also assume you
have the databases you want to compare in Microsoft Access format
(there are many methods of comparing databases, but we're trying to
keep this simple and use common desktop tools).
The
first thing you want to do is to open the database using Microsoft
Access. You'll be able to see the structure of the database, including
the number of tables, the data fields within those tables, the raw
data, and various other functions and programs within the database.
You can also get a quick understanding of the structure of the database
using Microsoft Access' "Documenter" tool. It will analyze
the database structure and present a document describing the database
structure. This is where your comparison begins.
The
output of the Documenter tool is a text document. You can save it
to Microsoft Word in "rtf" (Rich Text Format) or save it
to Microsoft Excel format. Because the document may be quite large,
you might save it to Microsoft Word and then use Word's editing features
to edit down to what you are looking to compare. Remember, depending
on the particulars of the law suit, you may only need to look at certain
elements of the database structure.
For
simplicity, let's narrow down the database comparison to tables and
fields. You may want to keep a copy of the full output as backup for
your comparison. After editing out the unwanted information, such
as page numbers, file names and paths, you can then use a standard
"cut and paste" to select the entire edited text (tables
and data fields) and paste them into a spreadsheet. Now you've got
a complete listing of tables and all fields within those tables. If
you only need to compare the database tables and table names, then
delete the data field information. You might want to keep a count
of the number of data fields within those tables if it's important
to show differences between two databases. For example, one company
may follow one design style using many tables and fewer fields per
table while another company may follow a different design style using
fewer tables but more fields per table. This may be an important note
in database comparisons.
Do
the same function for the other database you want to compare. Make
sure you're comparing apples with apples, so use the same methodology
as for the first database. Now you can use Excel to review the table
names side-by-side. Are they the same? Are there similarities? What
about the number of data fields per table; any similarities? You can
decide from this simple review. If there are too many similarities
(such as the same name of a table or the same naming convention style),
then you may need to go one step further and compare the data fields
within the data tables. It's an extra step and will require a little
more work, but it's the same methodology.
Now
you've got information about two database structures. Is this enough
for your case or do you need other comparisons, such as queries or
programming files? Either way, you've got a good start in understanding
how to compare databases.
Andrew Z. Adkins
III is a nationally recognized expert in law office technology.
He is the director of the Legal Technology Institute at the University
of Florida Levin College of Law. He is a founding member of The Legal
Consulting Group, an association of independent computer consulting
firms working with the legal profession. He is also the author
of "Computerized Case Management Systems: Choosing and Implementing
the Right Software for You," published by the ABA Law Practice
Management Section.
|