Uniform Resource Identifier (URI)

Written by: Editorial Team

What is a Uniform Resource Identifier (URI)?

A Uniform Resource Identifier (URI) is a compact string of characters that identifies a resource, whether on the internet or on a private network. It is a fundamental component of web technologies: by identifying resources such as web pages, files, services, or data, it lets clients and servers refer to the same thing. URIs form the backbone of the web by linking everything together and providing a means for accessing or referring to resources across the internet.

Structure of a URI

A URI is made up of several parts that collectively define how to locate or access a resource. The structure can vary depending on the type of URI, but many follow a similar pattern. Here’s a breakdown of the typical components:

1. Scheme

The scheme is the first part of a URI and indicates the protocol or method to be used when accessing the resource. Common schemes include http, https, ftp, and mailto.

Example:

http://example.com
ftp://files.example.com
mailto:info@example.com

The scheme defines how the resource should be accessed, with http and https being the most common for web pages. In these cases, http or https specifies that the resource can be accessed over the web using the Hypertext Transfer Protocol (or its secure version, HTTPS).
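As a minimal sketch, the scheme of each example above can be read with urlsplit from Python's standard urllib.parse module. The variable names (EXAMPLES, schemes) are illustrative only.

```python
from urllib.parse import urlsplit

# The three example URIs from this section.
EXAMPLES = [
    "http://example.com",
    "ftp://files.example.com",
    "mailto:info@example.com",
]

# Map each URI to the scheme that urlsplit extracts (everything before the first ":").
schemes = {uri: urlsplit(uri).scheme for uri in EXAMPLES}

for uri, scheme in schemes.items():
    print(f"{uri} uses the {scheme} scheme")
```

Note that the mailto example has no // after the scheme, so it has a scheme and a path but no authority component.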

2. Authority

The authority follows the // after the scheme and typically contains the domain name of the server that hosts the resource. It may also contain optional user information and a port number.

Example:

http://user:password@www.example.com:8080

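  • User Info: user:password@ is optional and is sometimes used with FTP or when basic authentication is required. RFC 3986 deprecates placing a password in the URI, because URIs are often logged or displayed in clear text.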
  • Host: www.example.com is the domain name or IP address of the server.
  • Port: :8080 indicates the port number to connect to on the server. If the port is not specified, the default for the scheme is assumed: 80 for HTTP and 443 for HTTPS.
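The breakdown above can be reproduced with Python's urllib.parse, shown here as a rough sketch. The DEFAULT_PORTS table and the effective_port helper are assumptions made for illustration; they are not part of the standard library.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://user:password@www.example.com:8080")

print(parts.username)  # user info, before the colon
print(parts.password)  # user info, after the colon
print(parts.hostname)  # host
print(parts.port)      # port as an integer, or None when absent

# Hypothetical lookup table covering only the two defaults named in the text.
DEFAULT_PORTS = {"http": 80, "https": 443}

def effective_port(uri):
    """Return the explicit port, or the scheme's default when none is given."""
    split = urlsplit(uri)
    if split.port is not None:
        return split.port
    return DEFAULT_PORTS.get(split.scheme)
```

For example, effective_port("https://example.com") falls back to 443 because the URI carries no explicit port.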

3. Path

The path identifies the location of the resource on the host. It typically refers to a directory or file.

Example:

http://example.com/folder/page.html

Here, /folder/page.html defines the location of the resource (in this case, a webpage) on the server.

4. Query

The query component passes parameters to the resource, often for dynamic web pages. It comes after a ? symbol and, by convention, consists of key-value pairs separated by &.

Example:

http://example.com/search?q=uri&lang=en

In this case, q=uri is a query parameter that might be a search term, and lang=en specifies the language.
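A short sketch of reading and building such a query string with Python's urllib.parse follows. It assumes the conventional key=value&key=value layout; a query string is not required to follow it.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Extract the raw query string (the part after "?").
query = urlsplit("http://example.com/search?q=uri&lang=en").query

# parse_qs maps each key to a list, because a key may appear more than once.
params = parse_qs(query)

# urlencode goes the other way and percent-encodes values where needed.
rebuilt = urlencode({"q": "uri", "lang": "en"})

print(query)
print(params)
print(rebuilt)
```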

5. Fragment

The fragment identifies a specific part or section of a resource. It comes after a # symbol and is often used in HTML documents to jump to a particular section.

Example:

http://example.com/page.html#section2

The #section2 indicates that the browser should scroll to a specific section labeled section2 within the HTML document.
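To see the five components side by side, the sketch below splits a single URI that combines the examples from this section. The combined URI is made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# Illustrative URI assembled from the examples above.
uri = "http://www.example.com:8080/folder/page.html?q=uri&lang=en#section2"

parts = urlsplit(uri)

print("scheme:   ", parts.scheme)
print("authority:", parts.netloc)
print("path:     ", parts.path)
print("query:    ", parts.query)
print("fragment: ", parts.fragment)

# Reassembling the components yields the original string.
roundtrip = urlunsplit(parts)
```

Browsers handle the fragment on the client side and do not send it to the server as part of an HTTP request.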

Types of URIs

There are two main types of URIs that serve distinct purposes: URL and URN.

1. Uniform Resource Locator (URL)

A URL is the most commonly used type of URI and specifically identifies the location of a resource on a network. A URL provides information about where the resource is located and how to retrieve it.

Example:

https://www.example.com/about

In this case, the URL tells the browser to use the HTTPS protocol and access the about page on the www.example.com domain.

2. Uniform Resource Name (URN)

A URN is another type of URI that identifies a resource by name rather than location. It serves as a persistent identifier that is independent of the resource's current location or means of access.

Example:

urn:isbn:0451450523

This URN identifies a book by its International Standard Book Number (ISBN). Unlike URLs, URNs do not specify how to retrieve the resource; they simply serve as a unique name for it.
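As an illustration, a URN can be split into its namespace identifier (here isbn) and its namespace-specific string (here the ISBN itself). The split_urn helper below is a hypothetical function written for this sketch; it only splits the string and does not validate the ISBN.

```python
from urllib.parse import urlsplit

def split_urn(urn):
    """Split 'urn:<namespace>:<name>' into (namespace, name)."""
    parts = urlsplit(urn)
    if parts.scheme != "urn":
        raise ValueError(f"not a URN: {urn!r}")
    # Everything after 'urn:' ends up in the path; split at the first colon.
    namespace, sep, name = parts.path.partition(":")
    if not sep or not namespace or not name:
        raise ValueError(f"malformed URN: {urn!r}")
    # The namespace identifier is case-insensitive, so normalize it to lowercase.
    return namespace.lower(), name

namespace, name = split_urn("urn:isbn:0451450523")
print(namespace, name)
```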

URI vs. URL vs. URN

While the terms URI, URL, and URN are often used interchangeably, there are subtle differences between them:

  • URI: A general term that refers to both URLs and URNs. Any identifier that follows the URI format is a URI.
  • URL: A URI that provides the location of a resource, typically used for websites and online resources.
  • URN: A URI that identifies a resource by name, independent of its location.

In simpler terms, every URL and every URN is a URI, but a given URI is not necessarily a URL, and not necessarily a URN.

Importance of URIs

URIs are foundational to the functioning of the internet. They allow for the unique identification and access of resources across the web and within other systems. Key reasons why URIs are important include:

1. Resource Identification

URIs provide a standardized way to identify and access resources, such as web pages, files, services, APIs, and more. This identification is essential for linking documents, facilitating online transactions, and enabling communication between services.

2. Web Navigation

Without URIs, navigating the web would be impossible. They allow users to move from one web page to another by referencing resources in a way that browsers understand. Web search engines, hyperlinks, and forms all rely on URIs.

3. Interoperability

URIs ensure that various systems, applications, and platforms can interact with one another. Whether it's a web browser accessing a webpage, an API call to a server, or an application requesting data, the URI provides the common language that ties everything together.

4. Hypertext and RESTful APIs

Hyperlinks, which are key elements of web pages, use URIs to point users to different resources. In RESTful APIs (which follow the principles of Representational State Transfer), URIs are used to uniquely identify resources, and HTTP methods (like GET, POST, PUT, DELETE) define how to interact with them.
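The sketch below illustrates the REST idea that one URI names the resource while the HTTP method selects the operation. The endpoint https://api.example.com/users/42 and the JSON body are hypothetical, and the requests are only constructed, never sent.

```python
from urllib.request import Request

# Hypothetical resource URI, used for illustration only.
RESOURCE = "https://api.example.com/users/42"

def build_requests(uri):
    """Build (but do not send) three requests that target the same resource."""
    return [
        Request(uri, method="GET"),  # read the resource
        Request(
            uri,
            data=b'{"name": "Ada"}',  # made-up replacement representation
            headers={"Content-Type": "application/json"},
            method="PUT",  # replace the resource
        ),
        Request(uri, method="DELETE"),  # remove the resource
    ]

for req in build_requests(RESOURCE):
    print(req.get_method(), req.full_url)
```

The URI stays the same in all three requests; only the method and the optional body change.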

Common URI Schemes

Several schemes are widely recognized as part of URIs. Each scheme tells the client how to interact with the resource. Some common URI schemes include:

  • http/https: For accessing websites or web services.
  • ftp: For transferring files between clients and servers.
  • mailto: For initiating email messages.
  • tel: For telephone numbers.
  • data: For embedding small amounts of data directly in the URI.

Example URIs with common schemes:

http://example.com
mailto:someone@example.com
ftp://ftp.example.com
tel:+123456789
data:image/png;base64,iVBOR...
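The data scheme differs from the others in that the URI carries the content itself. The decode_data_uri helper below is a minimal sketch written for illustration: it handles only the optional ;base64 flag and ignores other parameters. The sample URIs are made up and are not the truncated PNG shown above.

```python
import base64
from urllib.parse import unquote_to_bytes

def decode_data_uri(uri):
    """Return (media_type, payload_bytes) for a data: URI."""
    if uri[:5].lower() != "data:":
        raise ValueError("not a data URI")
    # The header (media type and flags) and the payload are separated by the first comma.
    header, _, payload = uri[5:].partition(",")
    is_base64 = header.endswith(";base64")
    media_type = header[: -len(";base64")] if is_base64 else header
    if is_base64:
        data = base64.b64decode(payload)
    else:
        data = unquote_to_bytes(payload)  # payload is percent-encoded text
    # An omitted media type defaults to text/plain;charset=US-ASCII.
    return media_type or "text/plain;charset=US-ASCII", data

print(decode_data_uri("data:text/plain;base64,SGVsbG8="))
print(decode_data_uri("data:,Hello%20World"))
```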

Syntax Rules for URIs

URIs follow a specific set of rules for their syntax. The structure must adhere to standards set forth by the Internet Engineering Task Force (IETF) in RFC 3986. Here are some important syntax rules:

  1. Characters: URIs consist of a limited set of ASCII characters: letters, digits, and a few special characters. Reserved characters (like ?, #, /, and :) have specific meanings and must be used according to the standard.
  2. Percent-Encoding: Characters outside the allowed set must be percent-encoded (e.g., a space is encoded as %20). A reserved character that is meant as literal data rather than as a delimiter must be percent-encoded as well.
  3. Case Sensitivity: URI schemes and hostnames are case-insensitive (e.g., HTTP://EXAMPLE.COM is treated the same as http://example.com), but paths and queries may be case-sensitive depending on the server.
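The percent-encoding and case rules above can be sketched with Python's urllib.parse. The normalize_case helper is a hypothetical function written for illustration; it lowercases only the scheme and host and leaves the path, query, and fragment untouched.

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit

# Percent-encoding: a space becomes %20, and decoding reverses it.
encoded = quote("my file.txt")
decoded = unquote(encoded)

# With safe="" even reserved characters such as "/" and "?" are encoded,
# which is what is needed when they are data rather than delimiters.
encoded_reserved = quote("a/b?c", safe="")

def normalize_case(uri):
    """Lowercase the scheme and host, which are case-insensitive."""
    parts = urlsplit(uri)
    # Keep any user info as-is; only the host (and port) part is lowercased.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + at + hostport.lower()
    return urlunsplit(
        (parts.scheme.lower(), netloc, parts.path, parts.query, parts.fragment)
    )

print(encoded, decoded, encoded_reserved)
print(normalize_case("HTTP://EXAMPLE.COM/Path"))
```

The path keeps its original capitalization because the server may treat /Path and /path as different resources.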

The Bottom Line

A Uniform Resource Identifier (URI) is an essential part of the web's infrastructure, used to uniquely identify and locate resources online. URIs consist of various components, such as schemes, authorities, paths, and optional queries or fragments. While URIs encompass both URLs (for locating resources) and URNs (for naming resources), the differences between these concepts matter when discussing the specifics of resource identification.

URIs are critical to the functioning of websites, APIs, and various internet technologies, providing the necessary language that computers use to find and access resources across the web. Understanding how URIs work enables better navigation, web development, and system interoperability.