Database Overview - KNUwidit.knu.ac.kr/~kiyang/teaching/DB/s17/lectures/2.DB-DB... ·...

Database Overview: What is a Database?

Transcript of Database Overview - KNUwidit.knu.ac.kr/~kiyang/teaching/DB/s17/lectures/2.DB-DB... ·...

Page 1: Database Overview - KNUwidit.knu.ac.kr/~kiyang/teaching/DB/s17/lectures/2.DB-DB... · 2017-03-08 · Computerized File System To accommodate the data growth and information need Manual

Database Overview

What is a Database?


Database Intro: Why & How

Data vs. Information

Data is a collection of facts.

Information is data processed for knowledge.

Changing data into information

Organize data so that it can be viewed in a useful form.

1. What data to collect, how & why?

2. How will information be extracted?

3. What form will the derived information take?

Requirements

• Identify the context of data → Metadata

• Organize → Structured data

• Summarize → Information


Data → Information: 1. Identify Context

Data

Obama, Barack H. 19610804

Bush, George H W. 19240612

Bush, George W. 19460706

Clinton, William J. 19460819

Carter, James E. 19241001

Context

Living presidents, United States, 2016/1/1

• Name (last name, first name middle initial), birthdate (YYYYMMDD)

Class Roster, Database Design Course, LIS Department, KNU, Spring 2016

• Name (last name, first name middle initial), student ID
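The point of the two contexts above can be sketched in a few lines of Python: the same raw line yields different records depending on which context is assumed. The function names here are hypothetical and only illustrate the idea.

```python
from datetime import datetime

RAW = "Obama, Barack H. 19610804"

def split_record(raw):
    """Split 'Last, First M. NNNNNNNN' into the name part and the trailing 8-digit field."""
    name, number = raw.rsplit(" ", 1)
    return name, number

def as_president(raw):
    # Context 1: the trailing field is a birthdate in YYYYMMDD form.
    name, number = split_record(raw)
    return {"name": name, "birthdate": datetime.strptime(number, "%Y%m%d").date()}

def as_student(raw):
    # Context 2: the same digits are an opaque student ID, not a date.
    name, number = split_record(raw)
    return {"name": name, "student_id": number}

print(as_president(RAW))
print(as_student(RAW))
```

Without the context (the metadata), there is no way to tell which of the two interpretations is the intended one.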


Data → Information: 2. Organize Data

Identify metadata

Identify additional data items (extra elements that help identify the data)

Course Title Database System

Course ID gDB-s16

Credit Hours 3.0

Class Time Monday 7-9:30 p.m.

Semester Spring 2017

Instructor Yang, Kiduk

Department Library & Information Science

College School of Social Science

University Kyungpook National University

Lname    Fname    Init  Stud_ID   Major  Level  GPA
Obama    Barack   H     19610804  LIS    MS1    3.8
Bush     George   HW    19240612  TCOM   MS1    2.1
Bush     George   W     19460706  ACCT   MS2    3.0
Clinton  William  J     19460819  CS     PHD1   3.9
Carter   James    E     19241001  LIS    MS2    3.7
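A minimal sketch of what "organized" data might look like in code, assuming the name columns and the Major/Level/GPA columns on the slide describe the same five rows in the same order. The `Student` class and `COURSE` dictionary are hypothetical names for illustration.

```python
from dataclasses import dataclass

# Metadata: describes the collection as a whole (values taken from the slide).
COURSE = {"title": "Database System", "course_id": "gDB-s16", "credit_hours": 3.0}

@dataclass
class Student:
    lname: str
    fname: str
    init: str
    stud_id: str
    major: str
    level: str
    gpa: float

# Assumption: row i of the name table pairs with row i of the major table.
ROSTER = [
    Student("Obama", "Barack", "H", "19610804", "LIS", "MS1", 3.8),
    Student("Bush", "George", "HW", "19240612", "TCOM", "MS1", 2.1),
    Student("Bush", "George", "W", "19460706", "ACCT", "MS2", 3.0),
    Student("Clinton", "William", "J", "19460819", "CS", "PHD1", 3.9),
    Student("Carter", "James", "E", "19241001", "LIS", "MS2", 3.7),
]

# Once every field has a name and a type, a question becomes a simple filter.
lis_students = [s.lname for s in ROSTER if s.major == "LIS"]
print(lis_students)
```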


Data → Information: 3. Summarize

Patterns, Trends & Visualization

Enrollment Pie Chart: LIS 45%, CS 15%, ACCT 15%, TCOM 15%, DS 10%

ACCT = Accounting

CS = Computer Science

DS = Data Science

LIS = Library & Information Science

TCOM = Telecommunication

Enrollment over Time (line chart): x-axis Semester (0s, 0f, 1s, 1f, 2s, 2f, 3s, 3f, 4s, 4f, 5s, 5f, 6s); y-axis Enrollment (5 to 30)
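The summarize step can be sketched as a count-and-normalize pass over one column. The list below holds the majors of the five students in the sample roster, so the shares it prints are not the figures behind the enrollment pie chart; `enrollment_share` is a hypothetical helper.

```python
from collections import Counter

# Majors of the five students in the sample roster (an assumption about row order).
majors = ["LIS", "TCOM", "ACCT", "CS", "LIS"]

def enrollment_share(values):
    """Return each category's share of the total, in percent."""
    counts = Counter(values)
    total = sum(counts.values())
    return {key: 100 * n / total for key, n in counts.items()}

shares = enrollment_share(majors)
# Print the largest share first, as a pie-chart legend would.
for major, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{major}: {pct:.0f}%")
```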

Database Intro: What

Function

Store, retrieve, and view data efficiently & effectively.

Characteristics

A collection of organized data related to a particular subject/purpose

• Structured data, Security, Control

Database Management System (DBMS)

• (Data) Storage, Processing, Retrieval

User Interface

• Data Entry, Search, View/Report


Database: Definitions

Database

Collection of related data and its metadata, organized in a structured format for optimized information management

Database Management System (DBMS)

Software that enables easy creation, modification, & access of databases for efficient and effective database management

Database System

Integrated system of hardware, software, data, procedures, & people that define and regulate the collection, storage, management, & use of data within a database environment


Database Management System


Database Systems: Design, Implementation, & Management: Rob & Coronel

Software that enables easy creation, modification, & access of databases for efficient and effective database management

→ Manages interaction between end users and the database

Database System Environment

Database System

Database Systems: Design, Implementation, & Management: Rob & Coronel

Hardware

Software

- OS

- DBMS

- Applications

People

Procedures

Data


Database Overview

Evolution of Database System


Evolution of Database

1960s 1970s 1980s 1990s 2000+

File-based

Hierarchical

Network

Relational Object-oriented

Web-based

Entity-Relationship

NoSQL / NewSQL

Database: Historical Roots

Manual File System

To keep track of data

• Used tagged file folders in a filing cabinet

• Organized according to expected use (e.g. one file per customer)

• Easy to create, but hard to locate data or to aggregate/summarize data

Computerized File System

To accommodate the data growth and information need

• Manual file system structures were duplicated in the computer

• Data Processing (DP) specialists wrote customized programs to write, delete, and update data (i.e. management) and to extract and present data in various formats (i.e. report)

File System: Example

Database Systems: Design, Implementation, & Management: Rob & Coronel


File System: Weakness

Weakness

“Islands of data” in scattered file systems.

Problems

Duplication

• Same data may be stored in multiple files

Inconsistency

• Same data may be stored with different values/formats

Rigidity

• Requires customized programming to implement any changes

• Cannot do ad-hoc queries

Implications

Waste of space

Data inaccuracies

High overhead of data manipulation and maintenance

File System: Problem Case

File           Field (size)       Value
CUSTOMER file  A_Name (15 char)   Carol Johnson
AGENT file     A_Name (20 char)   Carol T. Johnson
SALES file     AGENT (20 char)    Carol J. Smith

• Inconsistent field name, field size

• inconsistent data values

• data duplication
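The problem case can be checked mechanically. The sketch below assumes the three values are all meant to name the same agent; the `FILES` layout and the `distinct` helper are hypothetical, introduced only to illustrate the checks.

```python
# Three departmental files, each with its own layout (values from the slide).
FILES = {
    "CUSTOMER": {"field": "A_Name", "size": 15, "value": "Carol Johnson"},
    "AGENT":    {"field": "A_Name", "size": 20, "value": "Carol T. Johnson"},
    "SALES":    {"field": "AGENT",  "size": 20, "value": "Carol J. Smith"},
}

def distinct(key):
    """Collect the distinct settings of one property across all files."""
    return {spec[key] for spec in FILES.values()}

problems = []
if len(distinct("field")) > 1:
    problems.append("inconsistent field name")
if len(distinct("size")) > 1:
    problems.append("inconsistent field size")
if len(distinct("value")) > 1:
    problems.append("inconsistent data values")

print(problems)
```

In a file system nothing runs such a check automatically; each program reads its own file and trusts what it finds.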


Database System vs. File System

Database Systems: Design, Implementation, & Management: Rob & Coronel


Hierarchical Data Model

Hierarchical Model

To manage large amounts of data for complex manufacturing projects

• Information Management System (IMS), developed by Rockwell & IBM

→ Files connected in Parent-Child (1:M) relationships

• 1 Parent - Multiple Children

Database Systems: Design, Implementation, & Management: Rob & Coronel
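A parent-child (1:M) structure can be sketched as a nested tree in Python. The record names are hypothetical and the code does not model any particular hierarchical DBMS; it only shows that each child has one parent and that a lookup walks down from the root.

```python
# Each record has exactly one parent; children hang off it (1:M).
# The record names below are hypothetical, not taken from the slides.
tree = {
    "name": "Final assembly",
    "children": [
        {"name": "Component A", "children": [
            {"name": "Part A1", "children": []},
            {"name": "Part A2", "children": []},
        ]},
        {"name": "Component B", "children": [
            {"name": "Part B1", "children": []},
        ]},
    ],
}

def find_path(node, target, path=()):
    """Depth-first search from the root; returns the tuple of names leading to target."""
    path = path + (node["name"],)
    if node["name"] == target:
        return path
    for child in node["children"]:
        found = find_path(child, target, path)
        if found:
            return found
    return None

print(find_path(tree, "Part B1"))
```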


Hierarchical Data Model

Strengths

Conceptual Simplicity

• Groups of data could be related to each other

• Related data could be viewed together

Centralization of data

• Reduced redundancy and promoted consistency

Weaknesses

Limited representation of data relationships

• Did not allow Many-to-Many (M:N) relations

Structural Dependence

• Data access requires knowing the physical storage path

Complex Implementation

• Required in-depth knowledge of physical data storage

Lack of Standards

• Limited portability

Network Data Model

Network Model

→ Extension of Hierarchical Model

• Composed of Owner-Member (Parent-Child) sets

→ To represent Many-to-Many (M:N) relationships

• Multiple Parents - Multiple Children

Database Systems: Design, Implementation, & Management: Rob & Coronel
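As a simplified sketch, the "multiple parents, multiple children" idea can be expressed with owner-to-member sets in Python. The owner and member names are hypothetical, and the code does not follow the record and set definitions of any real network DBMS.

```python
from collections import defaultdict

# Owner -> members for one kind of set. Names are hypothetical.
# A member may appear under several owners, which a strict tree cannot express.
teaches = defaultdict(set)

def connect(owner, member):
    """Add member to the set owned by owner."""
    teaches[owner].add(member)

connect("Instructor Kim", "Course DB")
connect("Instructor Lee", "Course DB")   # a second parent for the same child
connect("Instructor Kim", "Course IR")   # a second child for the same parent

def owners_of(member):
    """Walk the sets in the reverse direction: which owners hold this member?"""
    return sorted(o for o, members in teaches.items() if member in members)

print(owners_of("Course DB"))
```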


Relational Data Model

Problems with legacy database systems

Required excessive effort to maintain

• Data manipulation (programs) too dependent on physical file structure

Hard to manipulate by end-users

• No capacity for ad-hoc query (must rely on DB programmers)

Relational Model

E. F. Codd’s proposal

• Separated the notion of physical representation (machine-view) from logical representation (human-view)

→ Eliminated pointers and used tables to represent data

• Considered ingenious but computationally impractical in 1970

Dominant database model of today

• Separation of design from implementation → Flexible

• Ad-hoc queries → Structured Query Language (SQL)

Relational Database: Example

Tables (i.e. Relations)

Provide a logical “human-level” view of the data and associations among groups of data

→ Organize data into rows (records/tuples) and columns (attributes)

→ Are related via shared attribute(s)

Customer_ID Customer_Account Agent_ID

1224 4556 23

1225 4558 25

Agent_ID Last_Name First_Name Phone

23 Sturm David 334-5678

25 Long Kyle 556-3421

Customer_ID Last_Name First_Name Phone Account_Balance

1224 Vira Dyne 678-9987 1223.95

1225 Davies Tricia 556-3342 234.25
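The three tables above can be loaded into an in-memory SQLite database to show how shared attributes relate them. The slide does not name the tables, so `customer_agent`, `agent`, and `customer` are assumed names; the rows are the ones shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Table names are assumptions; column names and rows follow the slide.
conn.executescript("""
CREATE TABLE agent (agent_id INTEGER PRIMARY KEY, last_name TEXT,
                    first_name TEXT, phone TEXT);
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, last_name TEXT,
                       first_name TEXT, phone TEXT, account_balance REAL);
CREATE TABLE customer_agent (customer_id INTEGER, customer_account INTEGER,
                             agent_id INTEGER);
INSERT INTO agent VALUES (23, 'Sturm', 'David', '334-5678'),
                         (25, 'Long', 'Kyle', '556-3421');
INSERT INTO customer VALUES (1224, 'Vira', 'Dyne', '678-9987', 1223.95),
                            (1225, 'Davies', 'Tricia', '556-3342', 234.25);
INSERT INTO customer_agent VALUES (1224, 4556, 23), (1225, 4558, 25);
""")

# Ad-hoc query: which agent serves each customer? The tables are related
# only through the shared Customer_ID and Agent_ID values, not pointers.
rows = conn.execute("""
    SELECT c.last_name, a.last_name
    FROM customer AS c
    JOIN customer_agent AS ca ON ca.customer_id = c.customer_id
    JOIN agent AS a ON a.agent_id = ca.agent_id
    ORDER BY c.customer_id
""").fetchall()
print(rows)
```

The query names only logical tables and columns; nothing in it depends on how or where the rows are physically stored.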


Entity Relationship Model

Peter Chen’s Landmark Paper (1976)

“The Entity-Relationship Model: Toward a Unified View of Data”

Graphical representation of entities and their relationships

Based on Entity, Attributes & Relationships

Entity → e.g. EMPLOYEE

• Thing about which data are to be collected and stored

Attributes → e.g. SSN, last name, first name

• Characteristics of the entity

Relationships → i.e. 1:M, M:N, 1:1

• Associations between entities

Complements the relational data model concepts

• Helps to visualize structure and content of data groups

• Entity Relationship Diagram (ERD)

→ Tool for conceptual data modeling

→ Formalizes a way to describe relationships between groups of data

E-R Diagram: Chen Model

Entity

represented by a rectangle with its name in capital letters.

Relationship

represented by an active or passive verb inside the diamond that connects the related entities.

Connectivity (i.e., type of relationship)

written next to each entity box.

Database Systems: Design, Implementation, & Management: Rob & Coronel


E-R Diagram: Crow’s Foot Model

Entity

represented by a rectangle with its name in capital letters.

Relationship

represented by an active or passive verb that connects the related entities.

Connectivity (i.e., type of relationship)

indicated by symbols next to entities.

• 2 vertical lines for 1

• “crow’s foot” for M

Database Systems: Design, Implementation, & Management: Rob & Coronel


E-R Model: Pros & Cons

Advantages

Exceptional conceptual simplicity

• Easily viewed and understood representation of database

• Facilitates database design and management

Integration with the relational database model

• Enables better database design via conceptual modeling

Disadvantages

Incomplete model on its own

• Limited representational power

→ cannot model data constraints not tied to entity relationships (e.g. attribute constraints)

→ cannot represent relationships between attributes within entities

• No data manipulation language (e.g. SQL)

Loss of information content

• Hard to include attributes in ERD

Object-Oriented Database

Semantic Data Model (SDM)

► Modeled both data and their relationships in a single structure (object)

► Developed by Hammer & McLeod in 1981

Object-oriented concepts became popular in the 1990s

► Modularity facilitated program reuse and construction of complex structures

► Ability to handle complex data types (e.g. multimedia data)

Object-Oriented Database Model (OODBM)

► Maintains the advantages of the ER model but adds more features

► Object = entity + relationships (between & within entities)

• consists of attributes & methods

→ methods are all relevant operations that can be performed on an object

► Class: template for objects

• e.g. EMPLOYEE class = (employ1 object, employ2 object, …)

• organized in a class hierarchy → e.g. PERSON > EMPLOYEE, CUSTOMER

► Incorporates the notion of inheritance

• attributes and methods of a class are inherited by its descendant classes
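The PERSON > EMPLOYEE, CUSTOMER hierarchy can be sketched with ordinary Python classes. This illustrates the object-oriented concepts only, not an object database product; the attribute choices and the sample names and placeholder SSN are assumptions.

```python
class Person:
    """Base class: attributes and methods shared by all descendant classes."""

    def __init__(self, last_name, first_name):
        self.last_name = last_name
        self.first_name = first_name

    def full_name(self):
        # A method: an operation that can be performed on the object.
        return f"{self.first_name} {self.last_name}"

class Employee(Person):
    def __init__(self, last_name, first_name, ssn):
        super().__init__(last_name, first_name)
        self.ssn = ssn

class Customer(Person):
    def __init__(self, last_name, first_name, agent):
        super().__init__(last_name, first_name)
        self.agent = agent  # a relationship stored inside the object itself

agent = Employee("Sturm", "David", ssn="000-00-0000")  # placeholder SSN
customer = Customer("Vira", "Dyne", agent=agent)

# full_name() is defined once on Person and inherited by both subclasses.
print(customer.full_name(), "->", customer.agent.full_name())
```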


OO Database Model vs. E-R Model

Database Systems: Design, Implementation, & Management: Rob & Coronel

OODBM:

- can accommodate relationships within an object

- allows objects to be used as building blocks for autonomous structures

Object-Oriented Database: Pros & Cons

Advantages

Semantic representation of data

• Fuller and more meaningful description of data via object

Modularity, reusability, inheritance

Ability to handle

• Complex data

• Sophisticated information requirements

Disadvantages

Lack of standards

• No standard data access method

Complex navigational data access

• Class hierarchy traversal

Steep learning curve

• Difficult to design and implement properly

High system overhead

• Slow transactions

Web Database

Not a database model, but a system

• For storing information that can be accessed via the Web

• That supports complex data types & relationships

In a Client-Server architecture

• Server hosts the database & DBMS (e.g., MySQL)

• Client accesses the server for database use

Client: initiates a connection

Server: waits for & responds to incoming connections

Figure (request flow): Web Client (e.g. Chrome) → HTTP request → Web Server (e.g. Apache) → Data request → DB Server (e.g. MySQL) → Database; the retrieved data returns to the Web Server, which sends a webpage back to the Web Client.

NoSQL/NewSQL Database

NoSQL (Not Only SQL)

Non-relational: e.g., objects instead of tables

For big (unstructured, distributed) data & real-time Web applications

More scalable & better performance

Flexible & agile development
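As a rough sketch of "objects instead of tables", the Python below keeps records as free-form documents with no schema declared up front. It is not the API of any NoSQL product; the `find` helper and the sample documents (including the nested advisor field) are hypothetical.

```python
import json

# A "collection" of documents; each one may carry different fields.
students = [
    {"name": "Obama", "major": "LIS", "gpa": 3.8},
    # Hypothetical nested field, added without changing any schema.
    {"name": "Clinton", "major": "CS", "advisor": {"dept": "CS"}},
]

def find(collection, **criteria):
    """Return the documents whose top-level fields match every criterion."""
    return [doc for doc in collection
            if all(doc.get(key) == value for key, value in criteria.items())]

print(json.dumps(find(students, major="CS"), indent=2))
```

The flexibility comes at a cost: nothing stops two documents from spelling the same field differently, which is the inconsistency problem relational schemas were designed to prevent.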

NewSQL

NoSQL + Relational

→ Consistent

→ Scalable

→ Flexible
