World Wide Web Basics Original version by Carolyn Watters (Dalhousie U. Computer Science)
Date posted: 19-Dec-2015
The Web…
• …is a distributed document delivery system that uses Internet protocols
• …links documents stored in computers communicating by the Internet
• Main authority is the W3 Consortium (www.w3.org)

Basic Definitions
• Web server – machine that services Internet requests
• Web client – machine that initiates Internet requests
• Browser – software to interact with Internet data at the web client
• TCP/IP – Internet data protocol suite
• FTP – Internet file transfer protocol
• HTTP – hypertext transfer protocol
• HTML – hypertext markup language

Servers and Clients
• Servers – computer systems at the end of a network that store files and provide other services
• Clients – computer systems that are end points for users of the data

Internet Model Layers
• Application layer – communication services (FTP, telnet, e-mail)
• Transport layer – transmission of messages end-to-end
• Network services layer – transmission of messages across a sequence of links
• Data Link layer – transmission of a packet across one link
• Physical layer – where the signals move

Internet Layer Model
• Application layer – http, ftp, smtp, telnet, rlogin
• Transport layer – TCP, UDP
• Network Services – IP
• Data Link layer – LAN link
• Physical layer – physical connection

TCP/IP
• Suite of protocols adopted as the standard for the Internet
• Facilitates communication between interconnected networks, whether heterogeneous or similar
• TCP itself is a reliable, connection-oriented, byte-stream protocol

Transport layer: TCP & UDP
TCP – transmission control protocol
– full duplex byte stream
– virtual path (connected)
– error free
– uses acknowledgements
– 16-bit port addresses
UDP – user datagram protocol
– connectionless
– no acknowledgements
– no flow control
– no resending of erroneous packets
– some error detection
– 16-bit port addresses
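The contrast between the two transports can be sketched with Python's standard `socket` module. This is only an illustration on the loopback interface: the function names `udp_roundtrip` and `tcp_roundtrip` are made up for this example, and port 0 asks the operating system to pick any free 16-bit port.

```python
import socket


def udp_roundtrip(payload: bytes) -> bytes:
    """Send one datagram over loopback. No connection is set up first."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))   # port 0: let the OS choose a free port
        receiver.settimeout(2.0)          # UDP gives no delivery guarantee
        sender.sendto(payload, receiver.getsockname())
        data, _sender_addr = receiver.recvfrom(65535)
        return data
    finally:
        sender.close()
        receiver.close()


def tcp_roundtrip(payload: bytes) -> bytes:
    """Send bytes over a TCP connection. The connection is set up before any data moves."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname(), timeout=2.0)
    server_side, _client_addr = listener.accept()
    try:
        client.sendall(payload)
        client.shutdown(socket.SHUT_WR)   # signal end of the byte stream
        chunks = []
        while True:
            chunk = server_side.recv(4096)
            if not chunk:                 # empty read: the peer finished sending
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        client.close()
        server_side.close()
        listener.close()


print(udp_roundtrip(b"ping"))
print(tcp_roundtrip(b"hello over a stream"))
```

The UDP version has no notion of "end of stream": each `recvfrom` returns one whole datagram or nothing. The TCP version reads until the stream is closed.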

Network Layer: IP
• Delivers packets of up to 64 KB, one at a time
• Each packet has a header
– sending host and intended host network addresses
– 32-bit addresses
• IP layer (like UDP)
– unreliable
– connectionless
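To make the "32-bit address" point concrete, here is a small sketch using Python's standard `ipaddress` module. The helper names and the sample address 192.0.2.1 (a documentation address) are chosen for illustration only.

```python
import ipaddress


def ipv4_to_int(dotted: str) -> int:
    """Return the 32-bit integer behind a dotted-quad IPv4 address."""
    return int(ipaddress.IPv4Address(dotted))


def int_to_ipv4(value: int) -> str:
    """Return the dotted-quad form of a 32-bit integer."""
    return str(ipaddress.IPv4Address(value))


address = "192.0.2.1"
as_int = ipv4_to_int(address)
print(address, "->", as_int, "->", format(as_int, "032b"))
print(int_to_ipv4(as_int))
```

Each of the four dotted numbers is one byte of the address, so the largest possible value is 2**32 - 1, which is 255.255.255.255.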

TCP/IP apps
TCP/IP software usually includes:
– remote terminal client using TELNET protocol for remote login
– electronic mail client using SMTP protocol to transfer e-mail to remote system
– file transfer client using FTP protocol to transfer files between 2 machines

HTTP
HyperText Transfer Protocol
• Native protocol for WWW
• Sits on top of the Internet’s TCP/IP protocol
• HTTP is a 4-step process per transaction
• Uses a predefined set of document formats from MIME

MIME
Multipurpose Internet Mail Extensions
– defines file formats (images, video, text, etc.)
– e.g. Content-type: text/html
– Data type/subtype
» text/html
» text/plain
» image/gif
» video/mpeg
» application/msword
» etc!
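As a rough illustration of the type/subtype scheme, Python's standard `mimetypes` module maps file extensions to MIME types. The helper name `content_type_header` and the file names are invented for this sketch; falling back to `application/octet-stream` for unknown extensions is a common convention, not something stated on the slide.

```python
import mimetypes


def content_type_header(filename: str) -> str:
    """Build a Content-type header line from a file name's extension."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    if mime_type is None:
        # Assumption: treat unknown extensions as opaque binary data.
        mime_type = "application/octet-stream"
    return f"Content-type: {mime_type}"


for name in ("index.html", "notes.txt", "logo.gif"):
    print(name, "->", content_type_header(name))
```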

HTTP Connection
• 1. Client
– Makes a TCP/IP connection
– Makes an HTTP request for a web page
• 2. Server accepts request
– Sends page as HTTP
• 3. Client downloads page
• 4. Server breaks the connection

HTTP is Stateless!
• Each operation or transaction makes a new connection
• each operation is unaware of any other connection
• each click is a new connection
• So how do they do those shopping carts?

What does it look like?
• Header + object file
• Header
– plain text
– info about the object (MIME, etc.)
– methods allowed
– etc.
– browser sends a header to the server each time you ask for information
– server sends a header and possibly content

HTTP Transaction Example
GET /catalog/ip/ip.htm HTTP/1.0
Accept: text/plain
Accept: text/html
Referer: http://www.june.com/catalog.html
User-Agent: Mozilla/2.0
CRLF
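The request above can be assembled as bytes in a few lines of Python. This is a minimal sketch: `build_request` is a hypothetical helper, and it assumes every line ends in CRLF with one extra empty line marking the end of the header.

```python
CRLF = "\r\n"


def build_request(method, path, headers, version="HTTP/1.0") -> bytes:
    """Assemble a full HTTP request: request line, header lines, then a blank line."""
    lines = [f"{method} {path} {version}"]
    lines += [f"{name}: {value}" for name, value in headers]
    # Every line ends with CRLF; the final empty line terminates the header.
    return (CRLF.join(lines) + CRLF + CRLF).encode("ascii")


request = build_request(
    "GET",
    "/catalog/ip/ip.htm",
    [
        ("Accept", "text/plain"),
        ("Accept", "text/html"),
        ("Referer", "http://www.june.com/catalog.html"),
        ("User-Agent", "Mozilla/2.0"),
    ],
)
print(request.decode("ascii"))
```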

HTTP REQUEST PROTOCOL
Request = Simple | Full
Simple = GET <URI> CRLF
Full = Method URI ProtVersion CRLF [<HTRQ Header>*] [CRLF <data>]
Method = GET | POST | HEAD | ….
<HTRQ Header> = <Fieldname>:<Value> CRLF
<data> = MIME conforming message
w.w3.org/Protocols/HTTP/
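One way to read the grammar above is as a two-way split on the first line of a request. The sketch below is an illustration rather than a complete parser: `parse_request_line` is a hypothetical name, and it only distinguishes the Simple form (two tokens, GET only) from the Full form (three tokens ending in a protocol version).

```python
def parse_request_line(line: str) -> dict:
    """Classify a request line as Simple (HTTP 0.9 style) or Full."""
    parts = line.rstrip("\r\n").split()
    if len(parts) == 2 and parts[0] == "GET":
        # Simple = GET <URI> CRLF
        return {"kind": "simple", "method": "GET", "uri": parts[1], "version": None}
    if len(parts) == 3 and parts[2].startswith("HTTP/"):
        # Full = Method URI ProtVersion CRLF ...
        return {"kind": "full", "method": parts[0], "uri": parts[1], "version": parts[2]}
    raise ValueError(f"not a valid request line: {line!r}")


print(parse_request_line("GET /catalog/ip.html\r\n"))
print(parse_request_line("POST /catalog/ip.html HTTP/1.0\r\n"))
```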

HTTP Header fields
• General-header fields
– used for both requests and responses
• Request-header fields
– used for requests
– extra client information for use by server
– optional

General-header fields
• Date: Mon, 11 Jan 1999 08:14:32 GMT
• MIME-version: 1.0
• Pragma: no-cache
– directives
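The Date value follows the fixed "weekday, day month year time GMT" layout. As a small sketch, Python's standard `email.utils.formatdate` produces that layout; the wrapper name `http_date` is made up for this example.

```python
import calendar
from email.utils import formatdate


def http_date(year, month, day, hour, minute, second) -> str:
    """Format a UTC date and time in the layout used by the HTTP Date header."""
    timestamp = calendar.timegm((year, month, day, hour, minute, second, 0, 0, 0))
    return formatdate(timestamp, usegmt=True)


print("Date:", http_date(1999, 1, 11, 8, 14, 32))
```

The same format is used for the value of If-Modified-Since in a conditional GET.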

Request-header fields
• acceptable MIME types for response
– Accept: text/html
– Accept: */*
• reply to a 401 response from the server
– Authorization: Basic abcdef (base64-encoded username and password)
• From: client-email-addr
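In the Basic scheme the credentials are the string "username:password" encoded with base64, which is an encoding and not encryption. The sketch below shows both directions; the function names are hypothetical, and the sample credentials are the well-known example pair from the HTTP authentication RFCs.

```python
import base64


def basic_authorization(username: str, password: str) -> str:
    """Build an Authorization header line for the Basic scheme."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Authorization: Basic {token}"


def decode_basic(header_line: str) -> tuple:
    """Recover (username, password) from a Basic Authorization header line."""
    token = header_line.split("Basic ", 1)[1]
    username, _colon, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password


header = basic_authorization("Aladdin", "open sesame")
print(header)
print(decode_basic(header))
```

Because decoding is trivial, Basic credentials are readable by anyone who can see the request.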

More Request-header fields
• If-Modified-Since: date
– conditional get
• source of current requested URL
– Referer: URL
• robot/browser identification
– User-Agent: Mozilla/2.0

Examining HTTP Header Values
• In perl
– $ENV{"HTTP_FROM"}
• In Netscape
– www.cs.dal.ca/~jamie/cgi-bin/4173/about/env.cgi
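Under CGI, the server passes request headers to the script as environment variables named `HTTP_` plus the header name in upper case with hyphens turned into underscores, so the From header arrives as `HTTP_FROM`. A Python equivalent of the Perl lookup might look like the sketch below; the helper names are hypothetical, and the optional `environ` argument exists only so the lookup can be tried without a real web server.

```python
import os


def cgi_variable_name(header_name: str) -> str:
    """Map an HTTP header name to its CGI environment variable name."""
    return "HTTP_" + header_name.upper().replace("-", "_")


def header_value(header_name: str, environ=None):
    """Look up a request header in a CGI environment (defaults to os.environ)."""
    environ = os.environ if environ is None else environ
    return environ.get(cgi_variable_name(header_name))


# A fake environment standing in for what a server would set up for a CGI script.
fake_environ = {"HTTP_FROM": "someone@example.com", "HTTP_USER_AGENT": "Mozilla/2.0"}
print(cgi_variable_name("User-Agent"))
print(header_value("From", fake_environ))
```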

HTTP Methods
• Client requests either
– simple request
– full request
Request-line = method Request-URI HTTP-version CRLF
GET /catalog/ip.html HTTP/1.0

Simple requests
• Only for HTTP 0.9
• only uses the GET method
• causes the server to locate and transfer the object specified
• client responsible for handling the object
GET <uri> CRLF

Full Request
• Uses HTTP version and more methods
• method tells server what to do to the resource requested
• Methods
– GET
– POST
– HEAD

GET Method
• Request server to retrieve object specified
• conditional GET
– request message includes If-Modified-Since in header

HEAD Method
• Like GET but does not return the object
• returns a header about the resource requested (meta information)
• good way to test link validity
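A link checker built on HEAD might look like the following sketch. To keep it self-contained it starts a throwaway server on the loopback interface using Python's standard `http.server`; the handler, its single page, and the function names are all invented for the example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Tiny in-memory site used only for this demonstration."""

    PAGES = {"/index.html": b"<html><body>Hello</body></html>"}

    def _send_headers(self):
        body = self.PAGES.get(self.path)
        status = 200 if body is not None else 404
        self.send_response(status)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body or b"")))
        self.end_headers()
        return body

    def do_HEAD(self):
        self._send_headers()          # header only, never the object

    def do_GET(self):
        body = self._send_headers()
        if body:
            self.wfile.write(body)    # header plus the object

    def log_message(self, *args):
        pass                          # keep the demo output quiet


def link_status(host: str, port: int, path: str):
    """Issue a HEAD request and return (status code, number of body bytes received)."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, len(response.read())
    finally:
        conn.close()


def check_links(paths):
    """Start the demo server, HEAD each path, and shut the server down again."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return {path: link_status("127.0.0.1", port, path) for path in paths}
    finally:
        server.shutdown()
        server.server_close()


print(check_links(["/index.html", "/missing.html"]))
```

The status code alone tells the checker whether the link is valid, and no page content crosses the network.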

POST Method
• Include an object in the request
• server should use that object in processing the request
• must include a Content-Length in header
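A POST request can be sketched as a header followed by the object, with Content-Length giving the object's size in bytes. The helper name `build_post`, the path `/cgi-bin/order`, and the form-encoded body are assumptions made for illustration.

```python
def build_post(path: str, body: bytes,
               content_type: str = "application/x-www-form-urlencoded") -> bytes:
    """Assemble an HTTP/1.0 POST request carrying `body` as the enclosed object."""
    head = (
        f"POST {path} HTTP/1.0\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"   # size of the object in bytes
        "\r\n"                               # blank line separates header and object
    ).encode("ascii")
    return head + body


form_data = b"item=42&qty=1"
message = build_post("/cgi-bin/order", form_data)
print(message.decode("ascii"))
```

The server uses Content-Length to know where the object ends, since the request header itself carries no other end marker for the data.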

HTTP Response Message
• HTTP protocol version
• 3 digit status code
• reason phrase
• CRLF
• optional header fields
• CRLF
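Read in order, the parts listed above can be pulled apart with ordinary string operations. This is a sketch, not a robust parser: `parse_response` is a hypothetical name, the sample response is invented, and header names are lower-cased on the assumption that they are compared case-insensitively.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP response into version, status code, reason, headers, and body."""
    head, _separator, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    status_parts = lines[0].split(" ", 2)
    version, code = status_parts[0], int(status_parts[1])
    reason = status_parts[2] if len(status_parts) > 2 else ""
    headers = {}
    for line in lines[1:]:
        name, _colon, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, code, reason, headers, body


sample = (
    b"HTTP/1.0 200 OK\r\n"
    b"Server: CERN/3.0\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"<html></html>"
)
print(parse_response(sample))
```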

HTTP Response Header Fields
• Additional information about the server
• such as:
– LOCATION: exact URI address
– SERVER: server software (CERN/3.0)
– WWW-AUTHENTICATE:
» status 401 responses (unauthorized request)
» server challenges client
» client may use to send authorization info to server

Understanding STATUS Codes
• 1xx – for information only
• 2xx – action successful
• 3xx – further action needed (redirect)
• 4xx – client request error
• 5xx – server error
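Since the class is carried by the first digit, a client can decide how to react with integer division. A minimal sketch, with labels paraphrasing the list above and a hypothetical function name:

```python
STATUS_CLASSES = {
    1: "information only",
    2: "action successful",
    3: "further action needed (redirect)",
    4: "client request error",
    5: "server error",
}


def status_class(code: int) -> str:
    """Return the meaning of a 3-digit status code's first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]


for code in (200, 304, 401, 404, 500):
    print(code, "->", status_class(code))
```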

HTTP Transaction
1. Client and server establish a connection
2. Client makes a request
3. Server makes a response
4. Server terminates connection
• Step 1 establish connection
– TCP/IP connection set up
– uses a port number as application reference
– usually port 80
– ports below 1024 are privileged (1024 and above are open)
• Step 2 client request
– HTTP message sent with a request line
– request-line = method URL HTTP-version
• Step 3 server response
– server sends HTTP message and optionally requested data
– resp-message = HTTP-version status-code reason-phrase [optional stuff]
• Step 4 connection terminated
– usually the server
– sometimes the client “stops” it
– anything else, whoever notices terminates
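The four steps can be traced with a raw TCP socket. The sketch below is an illustration under a few assumptions: it runs a throwaway HTTP/1.0 server on the loopback interface (on an OS-chosen port instead of port 80, which would need privileges), and the handler, page content, and function names are invented for the example.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Serves one fixed page. Speaks HTTP/1.0, so it closes after each response."""

    def do_GET(self):
        body = b"<html><body>Hello, Web</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def fetch(host: str, port: int, path: str) -> bytes:
    # Step 1: establish the TCP/IP connection.
    with socket.create_connection((host, port), timeout=5) as sock:
        # Step 2: send the request line, then a blank line to end the header.
        sock.sendall(f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii"))
        # Step 3: read the response message (status line, header, data).
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                # Step 4: an empty read means the server closed the connection.
                break
            chunks.append(chunk)
    return b"".join(chunks)


def run_demo() -> bytes:
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch("127.0.0.1", server.server_address[1], "/")
    finally:
        server.shutdown()
        server.server_close()


print(run_demo().decode("iso-8859-1"))
```

The client never sends a "goodbye": it simply keeps reading until the server ends the connection, which matches the usual case in Step 4.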