Database Development Assignment: Create and Populate a Database using SQL
Creating a small author-article-publication database
In this assignment, we’ll follow the database design process with a step-by-step example. Your client, Wellness Week is a small, specialized publisher that publishes articles on health and wellness issues. The owner would like to have a small database that would track authors and articles.
Phase 1: Analysis
During the analysis phase the database designer will ask questions of the various stakeholders to try to get as complete a view as possible of the business requirements. It is very important at this stage not to limit interviews and interactions with a select few users or a subset of users. The director’s needs of the database, for example, might be quite different from someone who interacts regularly with authors.
The designer might ask questions like
· How do you define an author? Can an individual be listed as an author before even submitting an article?
· What types of information (data) would you like to know/capture about authors?
· How do you define an article? Are there different types of articles? What other sorts of information/data might be interesting to know about articles?
· Where are the articles published? Online, print, or both? Can a single article appear in more than one type of publication?
Below are some of the findings from the analysis phase. If you have additional questions or need clarification, do not hesitate to ask your instructor.
· Authors can sometimes simply express interest in submitting an article. So it would be good to have them in the database, even if they haven’t submitted anything.
· Wellness Week would like to have basic contact information for all authors: name, address, phone number, email.
· Wellness Week currently publishes a monthly newsletter only. They do not currently publish online. Each monthly newsletter has a specific theme (title). It is possible that an article might be reprinted in more than one edition of a newsletter.
· For each article, they would like to capture the author name, date of publication and length (in words).
· Just as an author can exist in the database without an article, in the planning stages of a publication, it would be helpful to have a publication without an article.
· The articles can have only one author.
Phase 2: Design
Based on this information, we can start to think about the conceptual design and consider the following entities:
Would publisher be an entity? The answer is no, but if you are unsure of why not, ask your instructor!
Now we can begin to consider the relationship between the various entities.
· An author can write many articles, or one, or none.
· An article can have only one author.
· Does an article need to have an author? We did not ask that question in our analysis phase, but it would be a good idea to ask. We’ll consider that there are no ‘anonymous’ articles, so each article must have one author (and only one).
We can further take a first shot at the following attributes:
· Author: first name, last name, address, city, state, zip, country, phone number, email address
· Article: title, author, length
· Publication: title, date of publication
The Entity Relationship Diagram might look like this:
There is ONE many-to-many relationship in the figure above. As explained in the Relationships Among Entities document, many-to-many relationships must be converted into one-to-many relationships so they can be implemented in a DBMS. An intersection entity between Article and Publication should be created. We’ll call it article_pub.
The next step in the design phase is to develop the logical and physical design. Here we’ll add attributes and create relationships between the entities and specify primary keys.
|Database Name: wellness_week|
|authorId||primary key, integer|
|articleId||primary key, integer|
|authorId||foreign key, integer|
|articleId||joint primary key, foreign key, integer|
|publicationId||joint primary key, foreign key, integer|
Note the Khan Academy SQL emulator we are using, the New SQL script page, may not accept joint primary keys and may give an error when defining a foreign key. For the purposes of this exercise, for the article_pub table, you can simply create the table WITHOUT defining any primary key.
When using SQLiteonline, you can make two joint primary keys that are defined as foreign keys. Imagine that we want to know the medicines a patient is taking- we can combine the ids from both tables assuming we have a table about patients and a table about medicine. Below is how you would format and get it started.
CREATE TABLE patient_medicine (patientID INTEGER, medicineID INTEGER, PRIMARY KEY (patientID, medicineID), FOREIGN KEY (patientID) REFERENCES patient(patientID) …(what do you think the second part would look like)?
|publicationId||Primary key, integer|
Phase 3: (Initial) Implementation
Part 1 of the assignment (40%):
During the tutorials you’ve taken in this course, you have seen databases created at various stages. You have not, however, seen a database created from beginning to end. Looking back at the tutorials and SQLite documentation, your task is to write the SQLite code to create the database described here. It should attend to all business specifications described in this document.
The code must be in SQLite, the version of SQL you have been using in the Khan Academy tutorials. If you simply do a search for ‘SQL’ ‘create database’, you will just as likely find some other flavor of SQL such as MySQL or PostgreSQL.
Use the sqliteonline.com site to make sure your code works. Your faculty member will be testing your code to grade your submission.
· Use AUTOINCREMENT for Primary Keys in author, article and publication tables
· Respect the precise naming conventions for the overall database, tables and attributes as given in Phase 2: Design
· Note that in both Part 1 and Part 2 of this assignment you must submit the actual SQL that you create so it can be tested by the instructor via text cut and paste. Screenshots of your database in the sqliteonline.com page may be submitted, but are insufficient on their own.
This is text that can be cut and pasted (comments not required):
/* begin SQLite code */
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, weight REAL);
INSERT INTO customers VALUES (73, “Brian”, 33, 50);
/* end SQLite code */
This is a screenshot that cannot be cut and pasted:
Phases 4-6: Implementation, Testing and Maintenance
One could argue that Testing should logically occur before Implementation, since for many ‘Implementation’ implies ‘roll out’ of the database. Whichever order you put them in, testing is essential BEFORE the database bears the full brunt of hundreds or thousands of concurrent connections and the demands of users.
Before the system has been populated with data, it is still theoretically possible to change the underlying schema of the database. Not always easy, but possible. Once the database has been populated with data, this becomes much more difficult, even impossible in certain situations.
Maintenance, including working on keeping up the efficiency and speed of the database in the wake of updates and deletes, is an on-going task. Backing up a database is obviously important and accepting that occasionally full or partial restores of a data base will be necessary.
Part 2 of the assignment (40%):
When you have successfully constructed the necessary code to create the database in SQL, you should populate the database with sample data of your choice, again using SQLite code. Note that the data need not be ‘real’, but should be realistic. For example, 177 Arbor Hill Drive, Silver Spring, MD 20903 is not a real address, but is realistic. 137 Abcdefg Avenue, Qwerty, KY 12345 is not realistic.
Your data should have a minimum of 5 authors, 5 articles and 2 publications. Otherwise follow any rules listed earlier in the Design phase.
As with Part 1, you can use the sqliteonline.com site to test your work:
Note that in both Part 1 and Part 2 of this assignment you must submit the actual SQL that you create so it can be tested by the instructor via text cut and paste. Screenshots of your database in the sqliteonline.com page may be submitted, but are insufficient on their own.
Test: For each table, make sure that your insert statements worked:
select count (*) from article (should be at least 5)
select count (*) from authors (should be at least 5)
select count (*) from publication (should be at least 2)
Part 3 of the assignment (Maintenance) (20%):
1. If Wellness Week decided that it would like to allow multiple authors for an article, would this require changing the (underlying) database schema? Why or why not?
2. If Wellness Week decided that it would like to establish an online presence, with some reprints and some new content (articles), would this require adding a new table? Why or why not? Note this does NOT mean we would use the database to store the articles or connect to the Internet. This merely means we would want to keep track of whether an article appeared online or in print or both and when.