URL stands for Uniform Resource Locator. This article explains what URLs are and how they are structured.
Along with hypertext and HTTP, the URL is one of the key concepts of the web. It is the mechanism browsers use to retrieve any published resource. Put simply, a URL is nothing more than the address of a given unique resource on the web, which is why you can paste a URL into your browser and be taken straight to the resource it points to.
In theory, each valid URL points to a unique resource on the web. Such a resource can be an HTML page, a CSS document, an image, and so on. In practice, there are some exceptions, the most common being a URL that points to a resource that no longer exists or has been moved to another location. Because the resource represented by the URL and the URL itself are both handled by the web server, it is up to the owner of that web server to carefully manage the resource and its associated URLs.
The Anatomy Of A URL
Here are some examples of URLs, each of which represents a given resource on the web:
Any of these URLs can be copied or typed into your browser's address bar, and the browser will load the resource associated with it.
A URL is composed of different parts, some mandatory and others optional. The most important ones are highlighted in the URLs below.
You might think of a URL as a regular postal address: the scheme represents the postal service you want to use, the domain name is the city or town, and the port is like the zip code. The path represents the building where your mail should be delivered, the parameters represent extra information such as the apartment number in the building, and finally, the anchor represents the actual person to whom the mail is addressed.
There are some extra parts and extra rules associated with URLs, but they are not relevant for regular users or most web developers. You don't need to know them in order to build and use fully functional URLs.
The first part of the URL is the scheme, which indicates the protocol that the browser must use to request the resource. For websites, the protocol is usually HTTPS or HTTP (its unsecured version). Addressing web pages requires one of these two, but browsers also know how to handle other schemes, such as mailto: (to open a mail client), so don't be surprised if you see other schemes in a URL.
Next comes the authority, which is separated from the scheme by the character pattern ://
If present, the authority includes both the domain and the port, separated by a colon:
- The domain indicates which web server is being requested. Usually this is a domain name, but an IP address may also be used, although this is rare because IP addresses are much less convenient.
- The port indicates the technical "gate" used to access the resources on the web server. It is usually omitted if the web server uses the standard ports of the HTTP protocol (80 for HTTP and 443 for HTTPS) to grant access to its resources. Otherwise, it is mandatory.
- The separator between the scheme and the authority is ://. The colon separates the scheme from the next part of the URL, while // indicates that the next part of the URL is the authority.
- One example of a URL that doesn't use an authority is a mailto: URL, which opens the mail client. It contains a scheme but no authority component. Therefore, the colon is not followed by two slashes and only acts as a delimiter between the scheme and the mail address.
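As a rough illustration of the scheme and authority rules above, here is a minimal Python sketch built on the standard library's urllib.parse module. The example URLs, the DEFAULT_PORTS table, and the effective_port helper are assumptions introduced for this illustration; they are not part of any browser API.

```python
from urllib.parse import urlsplit

# Standard ports mentioned above; this lookup table is our own helper.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url: str):
    """Return the explicit port if the URL has one, else the scheme's default."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)


# A hypothetical web URL with an explicit, non-standard port.
web = urlsplit("https://www.example.com:8443/docs/index.html")
print(web.scheme, web.hostname, web.port)

# A mailto: URL has a scheme but no authority, so netloc is empty.
mail = urlsplit("mailto:someone@example.com")
print(mail.scheme, repr(mail.netloc), mail.path)
```

With this sketch, effective_port falls back to 80 or 443 when the port is omitted, which mirrors why the port is usually left out of everyday URLs.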
Path To Resource:
The path identifies the resource on the web server. In the early days of the web, a path like this represented a physical file location on the web server. Nowadays, it is mostly an abstraction handled by web servers without any physical reality.
?key1=value1&key2=value2 are extra parameters provided to the web server. These parameters are a list of key/value pairs separated by the & symbol. The web server can use them to do extra work before returning the resource. Each web server has its own rules regarding parameters, and the only reliable way to know whether a specific web server handles a given parameter is to ask the web server's owner.
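To make the key/value structure concrete, the following Python sketch reads and builds a parameter list with urllib.parse. The host name and the q and page parameters are hypothetical; what a real server does with any parameter is entirely up to that server.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical URL carrying the parameter list discussed above.
url = "https://www.example.com/search?key1=value1&key2=value2"

# The query is everything between "?" and "#" (or the end of the URL).
query = urlsplit(url).query

# parse_qs maps each key to a list, because a key may be repeated.
params = parse_qs(query)
print(params)

# urlencode goes the other way and escapes characters such as spaces.
rebuilt = urlencode({"q": "what is a URL", "page": 2})
print(rebuilt)
```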
The part after the # is an anchor to another part of the resource itself. An anchor represents a sort of bookmark inside the resource, giving the browser directions to show the content located at that bookmarked spot. In an HTML document, for example, the browser will scroll to the point where the anchor is defined.
In a video or audio document, the browser will try to go to the time the anchor represents. It is worth noting that the part after the #, also known as the fragment identifier, is never sent to the server with the request.
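The point that the fragment stays in the browser can be sketched in Python as follows. The request_target helper and the example URL are assumptions made for illustration; real browsers build their requests internally, but the split is the same in spirit.

```python
from urllib.parse import urldefrag, urlsplit


def request_target(url: str):
    """Split a URL into what is sent to the server and what stays local."""
    # urldefrag separates the fragment from the rest of the URL.
    without_fragment, fragment = urldefrag(url)
    parts = urlsplit(without_fragment)
    # The server receives the path (plus the query, if any), never the fragment.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target, fragment


target, fragment = request_target(
    "https://www.example.com/guide.html?lang=en#installation"
)
print(target)    # the part sent to the server
print(fragment)  # the part kept by the browser to scroll to the anchor
```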
Now let’s move on to the process of using URLs.
How To Actually Use URLs
Any URL can be typed or pasted into the browser's address bar to reach the resource behind it. But that only scratches the surface. The HTML language makes extensive use of URLs:
- To create links to other documents with the <a> element.
- To link a document with its related resources through various elements such as <link> or <script>.
- To display media such as images, videos, sounds, and music.
- To display other HTML documents with the <iframe> element.
Absolute URLs In Comparison With Relative URLs
The URLs we saw above are known as absolute URLs, but there is also something called a relative URL. The URL standard defines both, though it uses the terms absolute URL string and relative URL string to distinguish them from URL objects, which are in-memory representations of URLs. Let's look at what absolute and relative mean in the context of URLs.
In your browser's address bar, a URL doesn't have any context, so you must provide a full (or absolute) URL to reach the resource associated with it. You don't need to include the protocol, because the browser fills one in by default, or the port, which is only required when the targeted web server uses an unusual port. All the other parts of the URL are necessary and must appear in the right order.
When a URL is used within a document, such as an HTML page, things are a little different. Because the browser already has the document's own URL, it can use this information to fill in the missing parts of any URL found inside that document. We can differentiate between an absolute URL and a relative URL by looking at the path part of the URL. If the path starts with the "/" character, the browser will fetch that resource from the top root of the server, without reference to the context given by the current document.
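The way a browser fills in the missing parts can be approximated with urljoin from Python's standard library. The base URL and the relative references below are hypothetical examples, and urljoin follows the generic URL resolution rules rather than every detail of browser behavior.

```python
from urllib.parse import urljoin

# Hypothetical URL of the document that contains the links.
base = "https://www.example.com/docs/guide/intro.html"

# A path starting with "/" is resolved from the root of the server.
print(urljoin(base, "/images/logo.png"))

# A bare file name is resolved against the current document's directory.
print(urljoin(base, "setup.html"))

# ".." climbs one directory up from the current document.
print(urljoin(base, "../faq.html"))

# An absolute URL ignores the base entirely.
print(urljoin(base, "https://other.example.org/page.html"))
```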
Despite their very technical nature, URLs represent a human-readable entry point for a website. They can be memorized, and anyone can enter them in a browser's address bar. People are at the core of the web, so it is considered best practice to build what are called semantic URLs. Semantic URLs use words with inherent meaning that anyone can understand, regardless of technical expertise.
Linguistic semantics are, of course, irrelevant to computers. You have probably seen URLs that look like a mash-up of random characters. But there are many advantages to creating human-readable URLs, such as:
- They are much easier to manipulate.
- They clarify things for users: where they are, what they are doing, and what they are reading or interacting with on the web.
- Some search engines can use those semantics to improve the classification of the associated pages.
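One common way to obtain readable path segments is to derive them from a page title. The slugify helper below is a hypothetical sketch of that idea, not a standard function; it keeps only ASCII letters and digits and joins the words with hyphens, and the example.com address is invented for illustration.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Turn a page title into a lowercase, hyphen-separated path segment."""
    # Decompose accented characters so the accents can be dropped.
    normalized = unicodedata.normalize("NFKD", title)
    ascii_only = normalized.encode("ascii", "ignore").decode("ascii")
    # Keep runs of letters and digits; everything else acts as a separator.
    words = re.findall(r"[a-z0-9]+", ascii_only.lower())
    return "-".join(words)


print("https://www.example.com/articles/" + slugify("What Is a URL?"))
```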
Data URLs, which are URLs prefixed with the data: scheme, allow content creators to embed small files inline in documents. They were formerly known as data URIs until that name was retired by the WHATWG. Modern browsers treat data URLs as unique opaque origins, rather than inheriting the origin of the settings object responsible for the navigation.
Data URLs are composed of four parts: a prefix (data:), a MIME type indicating the type of data, an optional base64 token if the data is non-textual, and the data itself:
data:[<mediatype>][;base64],<data>
The media type is a MIME type string, such as "image/jpeg" for a JPEG image file. If omitted, it defaults to text/plain;charset=US-ASCII
If the data contains characters defined in RFC 3986 as reserved characters, or contains space characters, newline characters, or other non-printing characters, those characters must be percent-encoded (also known as URL-encoded).
If the data is textual, you can simply embed the text. Otherwise, you can specify the base64 token and embed base64-encoded binary data.
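The two ways of embedding data can be sketched in Python with the standard library. The helper names text_data_url and base64_data_url are assumptions for this illustration, and the sample content is arbitrary.

```python
import base64
from urllib.parse import quote


def text_data_url(text: str, media_type: str = "text/plain") -> str:
    """Embed textual data directly, percent-encoding unsafe characters."""
    return f"data:{media_type},{quote(text)}"


def base64_data_url(data: bytes, media_type: str) -> str:
    """Embed binary data using the optional base64 token."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{media_type};base64,{encoded}"


print(text_data_url("Hello, World!"))
print(base64_data_url(b"Hello, World!", "text/plain"))
```

In practice the base64 form is mostly used for binary content such as small images; plain text is used here only to keep the output easy to check by hand.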
URLs can be long, but their structure is simple. Once you understand what a URL is and which components are used to build one, you will be able to read the URLs you encounter every day and tell the different types apart. In short, learn the basics first, and the different types will soon follow.
Now it's your turn to put this information to productive use. We hope you found what you were looking for, and we wish you the best of luck in applying it or researching the topic further.