Soup.URI¶
Fields¶
| Name | Type | Access | Description |
|---|---|---|---|
| fragment | str | r/w | a fragment identifier within path, or None |
| host | str | r/w | the hostname or IP address, or None |
| password | str | r/w | a password, or None |
| path | str | r/w | the path on host |
| port | int | r/w | the port number on host |
| query | str | r/w | a query for path, or None |
| scheme | str | r/w | the URI scheme (e.g., “http”) |
| user | str | r/w | a username, or None |
Methods¶
| | |
|---|---|
| class | decode (part) |
| class | encode (part, escape_extra) |
| class | new (uri_string) |
| class | new_with_base (base, uri_string) |
| class | normalize (part, unescape_extra) |
| | copy () |
| | copy_host () |
| | equal (uri2) |
| | free () |
| | get_fragment () |
| | get_host () |
| | get_password () |
| | get_path () |
| | get_port () |
| | get_query () |
| | get_scheme () |
| | get_user () |
| | host_equal (v2) |
| | host_hash () |
| | set_fragment (fragment) |
| | set_host (host) |
| | set_password (password) |
| | set_path (path) |
| | set_port (port) |
| | set_query (query) |
| | set_query_from_form (form) |
| | set_scheme (scheme) |
| | set_user (user) |
| | to_string (just_path_and_query) |
| | uses_default_port () |
Details¶
-
class
Soup.URI¶
A Soup.URI represents a (parsed) URI. Soup.URI supports RFC 3986 (URI Generic Syntax), and can parse any valid URI. However, libsoup only uses “http” and “https” URIs internally; you can use SOUP_URI_VALID_FOR_HTTP() to test if a Soup.URI is a valid HTTP URI.
scheme will always be set in any URI. It is an interned string and is always all lowercase. (If you parse a URI with a non-lowercase scheme, it will be converted to lowercase.) The macros SOUP_URI_SCHEME_HTTP and SOUP_URI_SCHEME_HTTPS provide the interned values for “http” and “https” and can be compared against URI scheme values.
user and password are parsed as defined in the older URI specs (i.e., separated by a colon; RFC 3986 only talks about a single “userinfo” field). Note that password is not included in the output of Soup.URI.to_string(). libsoup does not normally use these fields; authentication is handled via Soup.Session signals.
host contains the hostname, and port the port specified in the URI. If the URI doesn’t contain a hostname, host will be None, and if it doesn’t specify a port, port may be 0. However, for “http” and “https” URIs, host is guaranteed to be non-None (trying to parse an http URI with no host will return None), and port will always be non-0 (because libsoup knows the default value to use when it is not specified in the URI).
path is always non-None. For http/https URIs, path will never be an empty string either; if the input URI has no path, the parsed Soup.URI will have a path of “/”.
query and fragment are optional for all URI types. Soup.form_decode() may be useful for parsing query.
Note that path, query, and fragment may contain %-encoded characters. Soup.URI.new() calls Soup.URI.normalize() on them, but not Soup.URI.decode(). This is necessary to ensure that Soup.URI.to_string() will generate a URI that has exactly the same meaning as the original. (In theory, Soup.URI should leave user, password, and host partially-encoded as well, but this would be more annoying than useful.)
-
classmethod
decode(part)¶
Parameters: part (str) – a URI part
Returns: the decoded URI part.
Return type: str
Fully %-decodes part.
In the past, this would return None if part contained invalid percent-encoding, but now it just ignores the problem (as Soup.URI.new() already did).
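The decoding behaviour described above can be illustrated without libsoup. The following is a minimal sketch that uses Python's standard urllib.parse.unquote() as a stand-in for Soup.URI.decode(); the two functions are not guaranteed to agree on every input, and the sample strings are made up for illustration.

```python
from urllib.parse import unquote

# Stand-in for Soup.URI.decode(): every valid %XX escape is decoded,
# including escapes for reserved characters such as "/".
decoded = unquote("foo%2Fb%61r")

# Invalid percent-encoding is not treated as an error; unquote()
# leaves the malformed sequence in place.
tolerant = unquote("100%zz")

print(decoded)
print(tolerant)
```

Because full decoding also turns %2F into “/”, a decoded part can no longer be told apart from one that contained a literal “/”.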
-
classmethod
encode(part, escape_extra)¶
Parameters: part (str) – a URI part; escape_extra (str or None) – additional characters to escape
Returns: the encoded URI part
Return type: str
This %-encodes the given URI part and returns the escaped version in allocated memory, which the caller must free when it is done.
-
classmethod
new(uri_string)¶
Parameters: uri_string (str or None) – a URI
Returns: a Soup.URI, or None if the given string was found to be invalid.
Return type: Soup.URI or None
Parses an absolute URI.
You can also pass None for uri_string if you want to get back an “empty” Soup.URI that you can fill in by hand. (You will need to call at least Soup.URI.set_scheme() and Soup.URI.set_path(), since those fields are required.)
-
classmethod
new_with_base(base, uri_string)¶
Parameters: base (Soup.URI) – a base URI; uri_string (str) – the URI
Returns: a parsed Soup.URI.
Return type: Soup.URI
Parses uri_string relative to base.
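To show what parsing a string “relative to base” means, here is a small sketch using urllib.parse.urljoin() from the Python standard library. It is only an illustration of RFC 3986 reference resolution, not a call into libsoup, and the example URLs are hypothetical.

```python
from urllib.parse import urljoin

# Hypothetical base URI, used only for illustration.
base = "http://example.com/a/b/c"

# A relative path reference is resolved against the base path.
relative = urljoin(base, "../d?x=1")

# An absolute URI ignores the base entirely.
absolute = urljoin(base, "https://example.org/")

print(relative)
print(absolute)
```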
-
classmethod
normalize(part, unescape_extra)¶
Parameters: part (str) – a URI part; unescape_extra (str or None) – additional characters to unescape
Returns: the normalized URI part
Return type: str
%-decodes any “unreserved” characters (or characters in unescape_extra) in part, and %-encodes any non-ASCII characters, spaces, and non-printing characters in part.
“Unreserved” characters are those that are not allowed to be used for punctuation according to the URI spec. For example, letters are unreserved, so Soup.URI.normalize() will turn http://example.com/foo/b%61r into http://example.com/foo/bar, which is guaranteed to mean the same thing. However, “/” is “reserved”, so http://example.com/foo%2Fbar would not be changed, because it might mean something different to the server.
In the past, this would return None if part contained invalid percent-encoding, but now it just ignores the problem (as Soup.URI.new() already did).
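The two normalization steps can be sketched in pure Python. This is an approximation written for illustration, not libsoup's implementation: the helper name normalize_part is hypothetical, and the unreserved set is taken from RFC 3986 (letters, digits, “-”, “.”, “_”, “~”).

```python
import re
import string

# RFC 3986 "unreserved" characters.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")


def normalize_part(part, unescape_extra=None):
    """Hypothetical stand-in for Soup.URI.normalize()."""
    allowed = UNRESERVED | set(unescape_extra or "")

    def decode_if_allowed(match):
        char = chr(int(match.group(1), 16))
        # Keep the escape as written unless it encodes an allowed character.
        return char if char in allowed else match.group(0)

    # Step 1: %-decode escapes of unreserved (or explicitly listed) characters.
    # Malformed escapes such as "%zz" do not match and are left alone.
    decoded = re.sub(r"%([0-9A-Fa-f]{2})", decode_if_allowed, part)

    # Step 2: %-encode spaces, non-printing characters, and non-ASCII characters.
    out = []
    for char in decoded:
        if 0x20 < ord(char) < 0x7F:
            out.append(char)
        else:
            out.extend("%{:02X}".format(byte) for byte in char.encode("utf-8"))
    return "".join(out)


print(normalize_part("/foo/b%61r"))
print(normalize_part("/foo%2Fbar"))
print(normalize_part("/a b/é"))
```

Passing “/” as unescape_extra in this sketch would also decode %2F, which mirrors the role the parameter plays in the description above.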
-
copy()¶
Returns: a copy of self, which must be freed with Soup.URI.free()
Return type: Soup.URI
Copies self.
-
copy_host()¶
Returns: the new Soup.URI
Return type: Soup.URI
Makes a copy of self, considering only the protocol, host, and port.
New in version 2.28.
-
equal(uri2)¶
Parameters: uri2 (Soup.URI) – another Soup.URI
Returns: True or False
Return type: bool
Tests whether or not self and uri2 are equal in all parts.
-
free()¶ Frees self.
-
host_equal(v2)¶
Parameters: v2 (Soup.URI) – a Soup.URI with a non-None host member
Returns: whether or not the URIs are equal in scheme, host, and port.
Return type: bool
Compares self and v2, considering only the scheme, host, and port.
New in version 2.28.
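A comparison that considers only scheme, host, and port can be sketched with urllib.parse.urlsplit(). The helper names host_key and host_equal below are hypothetical, the default-port table covers only “http” and “https”, and treating hostnames case-insensitively is an assumption of this sketch (urlsplit() lowercases them), not a statement about libsoup.

```python
from urllib.parse import urlsplit

# Well-known default ports for the two schemes libsoup uses internally.
DEFAULT_PORTS = {"http": 80, "https": 443}


def host_key(uri_string):
    """Reduce a URI to the (scheme, host, port) triple used for comparison."""
    parts = urlsplit(uri_string)
    # Fall back to the scheme's default port, or 0 if none is known.
    port = parts.port or DEFAULT_PORTS.get(parts.scheme, 0)
    return (parts.scheme, parts.hostname, port)


def host_equal(first, second):
    """Compare two URIs, ignoring path, query, fragment, and userinfo."""
    return host_key(first) == host_key(second)


print(host_equal("http://example.com/a", "HTTP://EXAMPLE.com:80/b?x=1"))
print(host_equal("http://example.com/", "https://example.com/"))
```

The same triple can be passed to Python's hash() to get a value that is equal for URIs that compare equal here, which is the property a host-only hash needs.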
-
host_hash()¶
Returns: a hash
Return type: int
Hashes self, considering only the scheme, host, and port.
New in version 2.28.
-
set_fragment(fragment)¶
Parameters: fragment (str or None) – the fragment
Sets self’s fragment to fragment.
-
set_host(host)¶
Parameters: host (str or None) – the hostname or IP address, or None
Sets self’s host to host.
If host is an IPv6 IP address, it should not include the brackets required by the URI syntax; they will be added automatically when converting self to a string.
http and https URIs should not have a None host.
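The bracket rule for IPv6 hosts can be illustrated with a short sketch. The helper format_authority is hypothetical and only mimics the idea that brackets belong to the string form of a URI rather than to the stored host value; it is not libsoup code.

```python
from urllib.parse import urlsplit


def format_authority(host, port=0):
    """Build the host[:port] part of a URI string from a bare host value."""
    # A colon in the host means an IPv6 literal, which needs brackets in a URI.
    shown = "[{}]".format(host) if ":" in host else host
    return "{}:{}".format(shown, port) if port else shown


print(format_authority("::1", 8080))
print(format_authority("example.com"))

# Parsing goes the other way: the brackets are stripped from the host value.
print(urlsplit("http://[::1]:8080/").hostname)
```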
-
set_password(password)¶
Parameters: password (str or None) – the password, or None
Sets self’s password to password.
-
set_port(port)¶
Parameters: port (int) – the port, or 0
Sets self’s port to port. If port is 0, self will not have an explicitly-specified port.
-
set_query_from_form(form)¶
Parameters: form ({str: str}) – a GLib.HashTable containing HTML form information
Sets self’s query to the result of encoding form according to the HTML form rules. See Soup.form_encode_hash() for more information.
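As a rough picture of what “encoding according to the HTML form rules” produces, the sketch below uses urllib.parse.urlencode() from the Python standard library. It stands in for the libsoup call, and the form contents are invented for the example; the ordering of fields in libsoup's output is not something this sketch makes any claim about.

```python
from urllib.parse import urlencode

# Hypothetical form data; keys and values are plain strings.
form = {"q": "soup uri", "page": "2"}

# application/x-www-form-urlencoded: pairs joined by "&", spaces become "+".
query = urlencode(form)

print(query)
```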
-
set_scheme(scheme)¶
Parameters: scheme (str) – the URI scheme
Sets self’s scheme to scheme. This will also set self’s port to the default port for scheme, if known.
-
to_string(just_path_and_query)¶
Parameters: just_path_and_query (bool) – if True, output just the path and query portions
Returns: a string representing self, which the caller must free.
Return type: str
Returns a string representing self.
If just_path_and_query is True, this concatenates the path and query together. That is, it constructs the string that would be needed in the Request-Line of an HTTP request for self.
Note that the output will never contain a password, even if self does.
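The path-and-query form can be sketched with urllib.parse.urlsplit(). The helper request_target is hypothetical and only approximates what passing True for just_path_and_query is described as producing; the empty-path fallback to “/” follows the http/https rule stated in the class description.

```python
from urllib.parse import urlsplit


def request_target(uri_string):
    """Return the path plus optional query, as used in an HTTP Request-Line."""
    parts = urlsplit(uri_string)
    # http/https URIs never have an empty path; "/" is used instead.
    path = parts.path or "/"
    return path + "?" + parts.query if parts.query else path


# Userinfo, host, and fragment are all dropped from the result.
print(request_target("http://user:secret@example.com/search?q=soup#top"))
print(request_target("http://example.com"))
```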
-
classmethod