Introduction to Databases

Setup

Overview

Time: min
Objectives
  • Install Microsoft Office to use Microsoft Access

Software setup

To practice the database design concepts in this workshop, Microsoft Access, will be used to create a simple database.

Install Microsoft Office

To use Microsoft Access for this workshop, please install Microsoft Office available through UIC Technology Solutions.

Follow the instructions from Technology Solutions to install: Microsoft Office Installation Instructions

Install the videoconferencing client

If you haven't used Zoom before, go to the official website to download and install the Zoom client for your computer.

Set up your workspace

Like other Carpentries workshops, you will be learning by "coding along" with the Instructors. To do this, you will need to have both the window for the tool you will be learning about (a terminal, RStudio, your web browser, etc..) and the window for the Zoom video conference client open. In order to see both at once, we recommend using one of the following set up options:

This blog post includes detailed information on how to set up your screen to follow along during the workshop.

Data files:

Please download the following file(s) to participate in the workshop:

Link to Database: Sample_Database

About the Data Used in this Workshop:

The dataset used is a sample dataset for executing queries on MS Access. It is a basic data on tables like Customer, Purchase and Items.

Key Points

  • Microsoft Access is available in the Microsoft Office Suite

  • Microsoft Office is available through UIC Technology Solutions


Intro to Database

Overview

Time: 12 min
Objectives
  • To understand the concepts of databases

Introduction to Databases

Data is available in large amounts in the current time and databases help store these data. A database is a collection of data which it holds in a structed format. Data from a database can be accessed in multiple ways.

image_1

Database Management Systems (DBMS)

DBMS is a software which is used to store, edit, and extract the data in a database. The features of a DBMS help to manage the data in a robust and efficient manner. A DBMS can accept queries from various interfaces and executes them against the data stored in a database and fetches & produces the results.

image_2

STRUCTURED vs UNSTRUCTURED DATA:

Structured data are highly organized. They can be mapped into specific fields such as names, country, zip codes etc. They are generally stored and organized in the format of tables and spreadsheets. Structured data are easy to work with and make up 20% of the data currently. Generally structured data are stored in RDBMS (Relational Database Management Systems) and are managed using SQL (Structured Query Language).

image_3

Unstructured data on the other hand do not have any pre-defined structure. This data cannot be processed (or) analyzed using any conventional methods/tools. With the Social media boom unstructured data is now being produced at a rapid pace making up 80% of the current data. Examples of this type of data include images, audio, video etc. Unstructured data are generally stored and handled by NoSQL databases.

image_4

image_5

Key Points

  • Introduction to Databases, DBMS.

  • Comparison between Structured and Unstructured Data


Introduction to ER Model

Overview

Time: 20 min
Objectives
  • To get familiarized with the concepts of Relational DBMS and ER Model

  • To get deeper understanding of various relationships in ER Model

Relational Database Management Systems (DBMS)

A Relational Database Management Systems (RDBMS) is a program used to manage a relational database. In a relational database data is mostly organized into tables and the database helps us identify and access data in relation to another data within the database. A single database may comprise of multiple tables and they might be related to each other.

rdbms_1

ENTITY–RELATIONSHIP(ER) MODEL

The ER model is used to describe the data in terms of objects (entities) along with the relationships between them. It helps us to understand how the underlying data in an organization are linked to each other.

ENTITY:

Each attribute has a particular domain i.e. for Name attribute the domain could be string of length 30, while for UIN would be 9-digit integers. Each entity set also has a primary key, which is an attribute whose value is used to uniquely identify each record (or) entity in the set, e.g. UIN .A list of attributes whose values are unique for each entity is known as candidate key, e.g. UIN, NetID and email.

rdbms_2

RELATIONSHIP:

A relationship helps define the relation between different entities i.e. how an entity is related to another. Consider an organization which has two entity sets, Employee and Department. The relationship between them would be “works in” as in employee “works in” a department. Information about the relationship is referred to as descriptive attributes.

Relationship set is a collection of relationships of the same type. For e.g. Employee E1 works in Department D1 is a relationship, while Employees E2, E3, E5, E6 works in Department D2 is a relationship set (since it is a collection of relationships).

A relationship set is represented in a diamond

rdbms_3

Types (Cardinality)

Cardinality refers to how many entities in tow entity sets can be related to each other.

rdbms_4

rdbms_5

rdbms_6

WEAK ENTITY

A weak entity is an entity set which does not have a primary key of its own. They are identified uniquely based on the primary key of another entity. For example, in an organization consider employees purchase policies for their dependents (family members), then the entity set Dependents is a weak entity.

Weak entities are represented using thick rectangular boxes.

rdbms_7

Note that Name is not a unique identifier on its own for Dependents entity set. There can be multiple people with the name John under dependents. Once an employee entity is chosen then the Name can be used to uniquely identify the dependents of that particular employee. Hence Name is known as partial key of weak entity set.

Restrictions:

“IS A” HIERARCHY

The IS A hierarchy is used to classify an entity set into subclasses. For e.g. an entity set Employee may be classified into subclasses of Hourly_Employee , Contract_Employee (or) Senior_Employee, Junior_Employee etc. All the attributes of the parent entity set are inherited i.e. present in its subclasses.

rdbms_9

Constraints:

ER MODEL RULES

Total Participation – All entities of an entity set participate in the relationship. This is represented by a thick line between entity set and relationship in the diagram.

Partial Participation – Not all entities of the entity set will participate in the relationship. This is represented by a thin line between entity set and relationship in the diagram.

At most one – Each entity in this entity set is related to at most one entity in the other entity set it is related. This is indicated by an arrow between entity set and relationship in the diagram.

rdbms_12

In the above diagram, there is a total participation by Employee entity with Department entity through works relationship. This indicates that all employees belong to some department in the organization. Similarly, there is a total participation by Department entity with Employee entity through works relationship indicating in each department there is at least one employee working. There is a partial participation by Employee entity with Department entity through the manages relationship. This shows not all employees manage a department but only some. The Department entity has a thick line indicating complete participation with an arrow at the end indicating at most one through the manages relationship. This shows that all departments are manages by some employee (total participation – thick line) but each department is managed by at most one (arrow) employee.

ER diagram of customer purchases a item Database

rdbms_13

Key Points

  • Introduction to ER Model

  • Explanation of Cardinality, various type of Entity Relationships and the rules of ER Model


Introduction to MS Access

Overview

Time: 45 min
Objectives
  • Introduction to MS Access application

  • Manipulating data using MS Access

Microsoft Access is a database management tool which is simple to use. It provides an easy, user friendly user interface to create, edit and manage databases.

In a relational database, data is generally stored in the form of tables. Each table here represents an entity set while the columns of the table represent the entity attributes. Each row in the table represents a single entity inside the entity set.

Note: Every UIC student should have free access to Microsoft office product. Once you open MS Access in your device if prompted to sign in, use your UIC email address and password to sign in.

CREATING DATABASE AND TABLES:

access_1

access_2

access_3

access_4

access_5

DATA TYPES – FIELD PROPERTIES & FORMATTING

access_6

access_7

access_8

access_9

access_10

access_11

Set the primary keys as below:

Customer - ID

Item - ID

Purchase - Customer ID, Item ID, Date

access_12

RELATIONSHIPS

access_13

access_14

access_15

Referential Integrity – Helps prevent actions which might affect the relationship. For e.g. Changing only the data type of ID column in Customer table to Text but not the Customer ID column in Purchase table.

access_16

QUERIES

Queries help fetch data from tables based on the search criteria provided

Query 1 – Items purchased by customers with First Name “Kevin”

access_17

access_18

access_19

access_20

access_21

Query 2 – Items purchased by customers with First Name “Kevin” OR “Selena”

access_22

access_23

Query 3 – Items purchased by customers with First Name “Kevin” AND Quantity greater than 3

access_24

access_25

Key Points

  • design and create databases using MS Access

  • designing queries to grab required data from databases


Intro Database Exercises

Overview

Time: 20 min
Objectives
  • Practice skills related to introductory database content

ER Diagram exercise

Draw a ER diagram representing the database of students attending courses.

Solution

Multiple solutions possible.

ER diagram of students attending courses Solution_1

MS Access Database queries exercise 1

Using MS Access, retrieve the Items purchased by customers with First Name “Selena” AND Quantity less than 5

Solution

Query Solution_1

Result Solution_1

MS Access Database queries exercise 2

Using MS Access, retrieve the Items purchased by customers with First Name “Kevin” or “Selena” on date 8th November, 2020

Solution

Query Solution_1

Result Solution_1

Key Points

  • Practicing the important topics of the workshop