From RFC 1808, Section 2.1, every URL should follow a specific format:

<scheme>://<netloc>/<path>;<params>?<query>#<fragment>

Let's break this format down syntactically:

  • scheme: The protocol name (usually http or https).
  • netloc: The network location: the domain itself (and subdomain if present), an optional port number, and optional credentials in the form username:password. Together these may take the form username:password@example.com:80.
  • path: The hierarchical location of the resource on the host.
  • params: Parameters attached to the path, separated from it by a semicolon. (optional)
  • query: Additional information passed to the resource, typically key=value pairs after the question mark. (optional)
  • fragment: Identifies a part within the resource, such as a section of a page, after the hash sign. (optional)
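
To make the netloc description above concrete, here is a minimal sketch using the standard library's urllib.parse.urlparse. The URL is a made-up example; the username, password, hostname and port attributes are convenience accessors that urlparse results expose for the pieces inside netloc.

```python
from urllib.parse import urlparse

# Hypothetical URL, used only for illustration.
parts = urlparse("https://username:password@example.com:80/list")

print(parts.netloc)    # the whole network location as a single string
print(parts.username)  # credentials, if any were given
print(parts.password)
print(parts.hostname)  # host name without credentials or port
print(parts.port)      # an int, or None when the URL has no port
```

Note that netloc itself stays one undivided string; the finer-grained attributes are derived from it on access.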

Let's take a very simple example to make the above clear:

https://cat.example/list;meow?breed=siberian#pawsize

In the above example:

  • https is the scheme (first element of a URL)
  • cat.example is the netloc (sits between the scheme and path)
  • /list is the path (between the netloc and params)
  • meow is the params component (sits between path and query)
  • breed=siberian is the query (between the params and fragment)
  • pawsize is the fragment (last element of a URL)

This can be replicated programmatically using Python's urllib.parse.urlparse:

>>> import urllib.parse
>>> url = 'https://cat.example/list;meow?breed=siberian#pawsize'
>>> urllib.parse.urlparse(url)
ParseResult(scheme='https', netloc='cat.example', path='/list', params='meow', query='breed=siberian', fragment='pawsize')

Now, coming to your code: the if statement checks whether next_page is missing or whether next_page has a netloc. In that login() function, the test .netloc != '' is true when url_parse(next_page) finds a network location, which means next_page is an absolute URL rather than a relative one. A relative URL has a path but no hostname (and thus no netloc).
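
The check described above can be sketched as follows. This is only an illustration: it uses the standard library's urlparse instead of the url_parse named in the question, and is_safe_next_page is a hypothetical helper name, not part of any library.

```python
from urllib.parse import urlparse


def is_safe_next_page(next_page):
    """Return True only for a non-empty URL that has no network location.

    Hypothetical helper: a URL with an empty netloc is relative, so a
    redirect to it stays on the current site.
    """
    return bool(next_page) and urlparse(next_page).netloc == ""


for candidate in ["/profile?tab=1", "https://evil.example/login",
                  "//evil.example/login", "", None]:
    print(repr(candidate), is_safe_next_page(candidate))
```

A scheme-less URL that starts with // still carries a netloc, so it is rejected as well. A netloc test alone may not catch every unusual input, so treat this as a sketch of the idea rather than a complete redirect safeguard.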

Answer from 0xInfection on Stack Overflow
🌐
Python
docs.python.org › 3 › library › urllib.parse.html
urllib.parse — Parse URLs into components — Python 3.14.6 ...
Parse a URL into five components, returning a 5-item named tuple SplitResult or SplitResultBytes. This corresponds to the general structure of a URL: scheme://netloc/path?query#fragment.
2 of 2
6

import urllib.parse

url = "https://example.com/something?a=1&b=1"
# urlsplit returns a named tuple; netloc holds the host part of the URL
o = urllib.parse.urlsplit(url)
print(o.netloc)

example.com
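
As a side-by-side sketch of the two functions used in these answers: urlsplit returns five components and leaves any ;params text inside the path, while urlparse returns six and splits params out. The URL is the cat.example one from the first answer.

```python
from urllib.parse import urlparse, urlsplit

url = "https://cat.example/list;meow?breed=siberian#pawsize"

split = urlsplit(url)   # 5 fields: scheme, netloc, path, query, fragment
parsed = urlparse(url)  # 6 fields: adds params between path and query

print(split.path)                      # params stay attached to the path
print(parsed.path, parsed.params)      # params reported separately
print(split.netloc == parsed.netloc)   # netloc is the same either way
```

If only the netloc is needed, either function works; the difference matters only when the path may contain a semicolon.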

🌐
Python Module of the Week
pymotw.com › 2 › urlparse
urlparse – Split URL into component pieces. - Python Module of the Week
from urlparse import urlparse parsed = urlparse('http://netloc/path;parameters?query=argument#fragment') print parsed
🌐
Saeed Esmaili
saeedesmaili.com › posts › til: simplifying url parsing with python's urlparse library
TIL: Simplifying URL Parsing with Python's urlparse Library | Saeed Esmaili
June 25, 2023 - from urllib.parse import urlparse url_to_parse = "https://saeedesmaili.com/exploring-openai-whisper-with-non-english-voices/?param=query" parsed_url = urlparse(url_to_parse) print(parsed_url) ## ParseResult(scheme='https', netloc='saeedesmaili.com', path='/exploring-openai-whisper-with-non-english-voices/', params='', query='param=query', fragment='')
🌐
Read the Docs
stackless.readthedocs.io › en › 2.7-slp › library › urlparse.html
20.16. urlparse — Parse URLs into components — Stackless-Python 2.7.15 documentation
urlparse.urlparse(urlstring[, scheme[, allow_fragments]])¶ · Parse a URL into six components, returning a 6-tuple. This corresponds to the general structure of a URL: scheme://netloc/path;parameters?query#fragment. Each tuple item is a string, possibly empty.
🌐
Jython
jython.org › jython-old-sites › docs › library › urlparse.html
20.16. urlparse — Parse URLs into components — Jython v2.5.2 documentation
>>> from urlparse import urlparse >>> o = urlparse('http://www.cwi.nl:80/~guido/Python.html') >>> o # doctest: +NORMALIZE_WHITESPACE ParseResult(scheme='http', netloc='www.cwi.nl:80', path='/~guido/Python.html', params='', query='', fragment='') >>> o.scheme 'http' >>> o.port 80 >>> o.geturl() 'http://www.cwi.nl:80/~guido/Python.html'
🌐
Reddit
reddit.com › r/learnpython › unparsing a url, after setting an attribute, breaks the url structure.
r/learnpython on Reddit: Unparsing a URL, after setting an attribute, breaks the URL structure.
December 27, 2023 -
from urllib.parse import *
url = "www.google.com"

#Add scheme to the URL
url = urlparse(url)._replace(scheme = 'http').geturl()
print(url)

#OR

anotherURL = urlunparse(urlparse(url)._replace(scheme = 'http'))
print(anotherURL)
>>> 'http:///www.google.com'

In an attempt to add a scheme to the URL, I obtained a ParseResult object for the URL and tried to set its 'scheme' field to 'http'. However, when I unparse the URL using the geturl() method or the urlunparse() function, expecting http://www.google.com as output, the result turns out to be http:///www.google.com. What could be the explanation for this?
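
The behaviour in this question comes from how urlparse reads a string with no scheme and no leading //: it treats the whole string as a path, so netloc stays empty and the host name ends up after the empty network location when the URL is rebuilt. The sketch below shows that, plus one possible workaround; add_scheme is a hypothetical helper written for this illustration, not a library function.

```python
from urllib.parse import urlparse

# Without a scheme or a leading "//", the host name is parsed as a path.
parts = urlparse("www.google.com")
print(repr(parts.netloc), repr(parts.path))


def add_scheme(url, scheme="http"):
    """Hypothetical helper: add a scheme to a bare 'host/path' string."""
    parts = urlparse(url)
    if parts.scheme:
        return url  # already has a scheme, leave it untouched
    if not parts.netloc:
        # Prefix "//" so the host is recognised as the netloc.
        parts = urlparse("//" + url)
    return parts._replace(scheme=scheme).geturl()


print(add_scheme("www.google.com"))
print(add_scheme("https://example.com/a"))
```

This sketch assumes the input is a bare host optionally followed by a path; inputs such as host:port without a scheme are not handled.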

🌐
OMZ Software
omz-software.com › editorial › docs › library › urlparse.html
20.16. urlparse — Parse URLs into components — Editorial Documentation
urlparse.urlparse(urlstring[, scheme[, allow_fragments]])¶ · Parse a URL into six components, returning a 6-tuple. This corresponds to the general structure of a URL: scheme://netloc/path;parameters?query#fragment. Each tuple item is a string, possibly empty.
🌐
Sethmlarson
sethmlarson.dev › why-urls-are-hard-path-params-urlparse
Why URLs are Hard: Path Params & urlparse — Seth Larson
April 10, 2020 - "/over/there?name=ferret#nose" ) >>> parts = urlparse(url) >>> parts ParseResult( scheme='foo', netloc='user:pass@example.com:8042', path='/over/there', params='', query='name=ferret', fragment='nose' ) >>> parts.hostname 'example.com' >>> parts.port 8042 >>> parts.username 'user' >>> parts.password 'pass' Okay so looks like we have this as a mapping from ParseResult to RFC 3986: parts.scheme -> scheme ·
🌐
GitHub
github.com › enthought › Python-2.7.3 › blob › master › Lib › urlparse.py
Python-2.7.3/Lib/urlparse.py at master · enthought/Python-2.7.3
urlparse(base, '', allow_fragments) scheme, netloc, path, params, query, fragment = \ urlparse(url, bscheme, allow_fragments) if scheme != bscheme or scheme not in uses_relative: return url · if scheme in uses_netloc: if netloc: return urlunparse((scheme, netloc, path, params, query, fragment)) netloc = bnetloc ·
Author   enthought
🌐
Harvard
zitniklab.hms.harvard.edu › bioagent › _modules › urllib › parse.html
urllib.parse - ToolUniverse Documentation
[docs] def urljoin(base, url, allow_fragments=True): """Join a base URL and a possibly relative URL to form an absolute interpretation of the latter.""" if not base: return url if not url: return base base, url, _coerce_result = _coerce_args(base, url) bscheme, bnetloc, bpath, bparams, bquery, bfragment = \ urlparse(base, '', allow_fragments) scheme, netloc, path, params, query, fragment = \ urlparse(url, bscheme, allow_fragments) if scheme != bscheme or scheme not in uses_relative: return _coerce_result(url) if scheme in uses_netloc: if netloc: return _coerce_result(urlunparse((scheme, netloc
🌐
docs.python.org
docs.python.org › 3 › library › urlparse.html
urllib.parse — Parse URLs into components
Source code: Lib/urllib/parse.py This module defines a standard interface to break Uniform Resource Locator (URL) strings up in components (addressing scheme, network location, path etc.), to combi...
🌐
Readthedocs
pydata-sphinx-theme.readthedocs.io › en › stable › _modules › urllib › parse.html
Source code for urllib.parse - The PyData Sphinx Theme
[docs] def urlparse(url, scheme='', allow_fragments=True): """Parse a URL into 6 components: <scheme>://<netloc>/<path>;<params>?<query>#<fragment> The result is a named 6-tuple with fields corresponding to the above. It is either a ParseResult or ParseResultBytes object, depending on the type of the url parameter.
🌐
Python Module of the Week
pymotw.com › 3 › urllib.parse › index.html
urllib.parse — Split URLs into Components
December 9, 2018 - $ python3 urllib_parse_urlparse.py ParseResult(scheme='http', netloc='netloc', path='/path', params='param', query='query=arg', fragment='frag')
🌐
MIKI BLOG
blog.mikihands.com › home › programming › a comprehensive guide to python's `urlparse()` - the essential tool for url analysis
A Comprehensive Guide to Python's `urlparse()` - The Essential Tool for URL Analysis
November 18, 2025 - Python's urlparse() function is a powerful tool that allows you to systematically decompose complex URL strings and extract only the necessary parts. In particular, the .netloc attribute provides vital host and port information, making it extremely ...
🌐
Oregoom
oregoom.com › home › url processing in python
▷ URL Processing in Python - Oregoom.com
October 29, 2024 - from urllib.parse import urlparse url = "https://www.example.com:8080/path/page.html?search=python#anchor" # Parse the URL result = urlparse(url) # Print the components of the URL print(f"Scheme: {result.scheme}") print(f"Network (netloc): {result.netloc}") print(f"Path: {result.path}") print(f"Query: {result.query}") print(f"Fragment: {result.fragment}") Code language: Python (python)
🌐
Readthedocs
ironpython-test.readthedocs.io › en › latest › library › urlparse.html
20.16. urlparse — Parse URLs into components — IronPython 2.7.2b1 documentation
This corresponds to the general structure of a URL: scheme://netloc/path;parameters?query#fragment. Each tuple item is a string, possibly empty. The components are not broken up in smaller parts (for example, the network location is a single string), and % escapes are not expanded. The delimiters as shown above are not part of the result, except for a leading slash in the path component, which is retained if present. For example: >>> from urlparse import urlparse >>> o = urlparse('http://www.cwi.nl:80/~guido/Python.html') >>> o ParseResult(scheme='http', netloc='www.cwi.nl:80', path='/~guido/Python.html', params='', query='', fragment='') >>> o.scheme 'http' >>> o.port 80 >>> o.geturl() 'http://www.cwi.nl:80/~guido/Python.html'
🌐
PyTutorial
pytutorial.com › python-parse-url-extract-and-analyze-web-addresses
PyTutorial | Python Parse URL: Extract and Analyze Web Addresses
January 28, 2026 - The netloc includes the hostname and port. The path is the resource location. The query contains the parameters. The fragment is the page section. Understanding these parts is crucial for web work.
🌐
Raelldottin
raelldottin.com › 2023 › 09 › ensuring-clean-base-url-with-pythons.html
Ensuring a Clean Base URL with Python's urlparse
September 29, 2023 - from urllib.parse import urlparse, urlunparse · # Original URL containing various components · base_url = "https://api.example.com/resource?param=value#section" # Parse the base_url to extract scheme and netloc (domain) parsed_url = urlparse(base_url) scheme = parsed_url.scheme ·