Metadata-Version: 2.1
Name: idna
Version: 3.10
Summary: Internationalized Domain Names in Applications (IDNA)
Author-email: Kim Davies <kim+pypi@gumleaf.org>
Requires-Python: >=3.6
Description-Content-Type: text/x-rst
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: System Administrators
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Classifier: Topic :: Internet :: Name Service (DNS)
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Utilities
Requires-Dist: ruff >= 0.6.2 ; extra == "all"
Requires-Dist: mypy >= 1.11.2 ; extra == "all"
Requires-Dist: pytest >= 8.3.2 ; extra == "all"
Requires-Dist: flake8 >= 7.1.1 ; extra == "all"
Project-URL: Changelog, https://github.com/kjd/idna/blob/master/HISTORY.rst
Project-URL: Issue tracker, https://github.com/kjd/idna/issues
Project-URL: Source, https://github.com/kjd/idna
Provides-Extra: all

Internationalized Domain Names in Applications (IDNA)
=====================================================

Support for the Internationalized Domain Names in
Applications (IDNA) protocol as specified in `RFC 5891
<https://tools.ietf.org/html/rfc5891>`_. This is the latest version of
the protocol and is sometimes referred to as “IDNA 2008”.

This library also provides support for Unicode Technical
Standard 46, `Unicode IDNA Compatibility Processing
<https://unicode.org/reports/tr46/>`_.

This acts as a suitable replacement for the “encodings.idna”
module that comes with the Python standard library, which
only supports the older, superseded IDNA specification (`RFC 3490
<https://tools.ietf.org/html/rfc3490>`_).

Basic usage is straightforward:

.. code-block:: pycon

    >>> import idna
    >>> idna.encode('ドメイン.テスト')
    b'xn--eckwd4c7c.xn--zckzah'
    >>> print(idna.decode('xn--eckwd4c7c.xn--zckzah'))
    ドメイン.テスト

Installation
------------

This package is available for installation from PyPI:

.. code-block:: bash

    $ python3 -m pip install idna

Usage
-----

For typical usage, the ``encode`` and ``decode`` functions will take a
domain name argument and perform a conversion to A-labels or U-labels
respectively.

.. code-block:: pycon

    >>> import idna
    >>> idna.encode('ドメイン.テスト')
    b'xn--eckwd4c7c.xn--zckzah'
    >>> print(idna.decode('xn--eckwd4c7c.xn--zckzah'))
    ドメイン.テスト
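
Because ``encode`` returns ``bytes``, code that hands host names to APIs
expecting text may prefer an ASCII ``str``. The following is a minimal sketch
of one way to do that; the helper name ``to_ascii_hostname`` is hypothetical
and is not part of the library.

```python
import idna


def to_ascii_hostname(name: str) -> str:
    """Return the A-label form of ``name`` as text rather than bytes."""
    # idna.encode returns ASCII-only bytes, so decoding as ASCII is lossless.
    return idna.encode(name).decode("ascii")


print(to_ascii_hostname("ドメイン.テスト"))
# Names that are already ASCII pass through unchanged.
print(to_ascii_hostname("example.com"))
```
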

You may also use the codec encoding and decoding methods provided by the
``idna.codec`` module:

.. code-block:: pycon

    >>> import idna.codec
    >>> print('домен.испытание'.encode('idna2008'))
    b'xn--d1acufc.xn--80akhbyknj4f'
    >>> print(b'xn--d1acufc.xn--80akhbyknj4f'.decode('idna2008'))
    домен.испытание

Conversions can be applied on a per-label basis using the ``ulabel`` or
``alabel`` functions if necessary:

.. code-block:: pycon

    >>> idna.alabel('测试')
    b'xn--0zwm56d'
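
As a rough illustration of per-label conversion, the sketch below round-trips
a single label with ``alabel`` and ``ulabel`` and then converts a domain one
label at a time. The helper ``to_alabels`` is hypothetical; unlike ``encode``,
it does no whole-domain checks (such as total length or trailing dots).

```python
from typing import List

import idna

# Round-trip a single label between its U-label and A-label forms.
a_label = idna.alabel("测试")
u_label = idna.ulabel(a_label)
print(a_label, u_label)


def to_alabels(domain: str) -> List[bytes]:
    """Convert each dot-separated label of ``domain`` to an A-label."""
    return [idna.alabel(label) for label in domain.split(".")]


print(to_alabels("ドメイン.テスト"))
```
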

Compatibility Mapping (UTS #46)
+++++++++++++++++++++++++++++++

As described in `RFC 5895 <https://tools.ietf.org/html/rfc5895>`_, the
IDNA specification does not normalize the different ways in which a user
may enter a domain name. This functionality, known as
a “mapping”, is considered by the specification to be a local
user-interface issue distinct from IDNA conversion functionality.

This library provides one such mapping that was developed by the
Unicode Consortium. Known as `Unicode IDNA Compatibility Processing
<https://unicode.org/reports/tr46/>`_, it provides for both a regular
mapping for typical applications, as well as a transitional mapping to
help migrate from older IDNA 2003 applications. Strings are
preprocessed according to Section 4.4 “Preprocessing for IDNA2008”
prior to the IDNA operations.

For example, “Königsgäßchen” is not a permissible label as *LATIN
CAPITAL LETTER K* is not allowed (nor are capital letters in general).
UTS 46 will convert this into lower case prior to applying the IDNA
conversion.

.. code-block:: pycon

    >>> import idna
    >>> idna.encode('Königsgäßchen')
    ...
    idna.core.InvalidCodepoint: Codepoint U+004B at position 1 of 'Königsgäßchen' not allowed
    >>> idna.encode('Königsgäßchen', uts46=True)
    b'xn--knigsgchen-b4a3dun'
    >>> print(idna.decode('xn--knigsgchen-b4a3dun'))
    königsgäßchen

Transitional processing provides conversions to help transition from
the older 2003 standard to the current standard. For example, in the
original IDNA specification, the *LATIN SMALL LETTER SHARP S* (ß) was
converted into two *LATIN SMALL LETTER S* (ss), whereas in the current
IDNA specification this conversion is not performed.

.. code-block:: pycon

    >>> idna.encode('Königsgäßchen', uts46=True, transitional=True)
    b'xn--knigsgsschen-lcb0w'

Implementers should use transitional processing with caution, only in
rare cases where conversion from legacy labels to current labels must be
performed (i.e. IDNA implementations that pre-date 2008). For typical
applications that just need to convert labels, transitional processing
is unlikely to be beneficial and could produce unexpected incompatible
results.
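
To make the incompatibility concrete, the sketch below encodes the same input
with and without transitional processing and decodes both results. It uses
only the ``uts46`` and ``transitional`` arguments shown above; the variable
names are illustrative.

```python
import idna

name = "Königsgäßchen"

# For this label, the regular UTS 46 mapping lower-cases the input and
# keeps the sharp s.
regular = idna.encode(name, uts46=True)
# Transitional processing additionally rewrites the sharp s as "ss".
transitional = idna.encode(name, uts46=True, transitional=True)

print(regular, "->", idna.decode(regular))
print(transitional, "->", idna.decode(transitional))
# The two A-labels differ, so they refer to different names in the DNS.
print("same name:", regular == transitional)
```

Since the two results are distinct names, an application that mixes the two
modes may look up or register a different domain than the user intended.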

``encodings.idna`` Compatibility
++++++++++++++++++++++++++++++++

Function calls from the Python built-in ``encodings.idna`` module are
mapped to their IDNA 2008 equivalents using the ``idna.compat`` module.
Simply substitute the ``import`` clause in your code to refer to the new
module name.
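
A minimal sketch of that substitution, assuming existing code calls the
``ToASCII`` and ``ToUnicode`` functions of ``encodings.idna``; only the import
line changes:

```python
# Before: from encodings.idna import ToASCII, ToUnicode
from idna.compat import ToASCII, ToUnicode

# The call sites stay the same, but conversion now follows IDNA 2008.
encoded = ToASCII("ドメイン")
decoded = ToUnicode(encoded)
print(encoded, decoded)
```

Inputs that IDNA 2003 accepted but IDNA 2008 rejects will now raise an
exception, so call sites may need error handling after the switch.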

Exceptions
----------

Any error encountered while converting according to the specification
should raise an exception derived from the ``idna.IDNAError`` base
class.

More specific exceptions may be raised: ``idna.IDNABidiError``
when the error reflects an illegal combination of left-to-right and
right-to-left characters in a label; ``idna.InvalidCodepoint`` when
a specific codepoint is an illegal character in an IDN label (i.e.
INVALID); and ``idna.InvalidCodepointContext`` when the codepoint is
illegal based on its positional context (i.e. it is CONTEXTO or CONTEXTJ
but the contextual requirements are not satisfied).
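
One way to handle these exceptions is to catch the specific classes first and
fall back to the ``idna.IDNAError`` base class last. The sketch below is
illustrative only; the function name ``classify`` and its return strings are
hypothetical.

```python
import idna


def classify(domain: str) -> str:
    """Report which kind of IDNA error, if any, encoding ``domain`` raises."""
    try:
        idna.encode(domain)
    except idna.IDNABidiError:
        return "bidi"
    except idna.InvalidCodepointContext:
        return "context"
    except idna.InvalidCodepoint:
        return "codepoint"
    except idna.IDNAError:
        # Any other conversion failure derived from the base class.
        return "other"
    return "ok"


print(classify("ドメイン.テスト"))
# The capital K is rejected when no UTS 46 mapping is requested.
print(classify("Königsgäßchen"))
```
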

Building and Diagnostics
------------------------

The IDNA and UTS 46 functionality relies upon pre-calculated lookup
tables for performance. These tables are derived by evaluating Unicode
data against the eligibility criteria in the respective standards, and
are generated with the command-line script ``tools/idna-data``.

This tool will fetch relevant codepoint data from the Unicode repository
and perform the required calculations to identify eligibility. There are
three main modes:

* ``idna-data make-libdata``. Generates ``idnadata.py`` and
  ``uts46data.py``, the pre-calculated lookup tables used for IDNA and
  UTS 46 conversions. Implementers who wish to track this library against
  a different Unicode version may use this tool to manually generate a
  different version of the ``idnadata.py`` and ``uts46data.py`` files.

* ``idna-data make-table``. Generates a table of the IDNA disposition
  (e.g. PVALID, CONTEXTJ, CONTEXTO) in the format found in Appendix
  B.1 of RFC 5892 and the pre-computed tables published by `IANA
  <https://www.iana.org/>`_.

* ``idna-data U+0061``. Prints debugging output on the various
  properties associated with an individual Unicode codepoint (in this
  case, U+0061) that are used to assess the IDNA and UTS 46 status of a
  codepoint. This is helpful in debugging or analysis.

The tool accepts a number of arguments, described using ``idna-data
-h``. Most notably, the ``--version`` argument allows the specification
of the version of Unicode to be used in computing the table data. For
example, ``idna-data --version 9.0.0 make-libdata`` will generate
library data against Unicode 9.0.0.
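
When tracking a particular Unicode version, it can help to check which
version the installed tables were generated from. The sketch below assumes
that the generated ``idnadata`` and ``uts46data`` modules expose a
``__version__`` string; that attribute is an assumption about the generated
files, so the code falls back to ``"unknown"`` if it is absent.

```python
import idna
from idna import idnadata, uts46data

versions = {
    # Version of the idna package itself.
    "library": idna.__version__,
    # Assumed attribute: Unicode version the lookup tables were built from.
    "idnadata": getattr(idnadata, "__version__", "unknown"),
    "uts46data": getattr(uts46data, "__version__", "unknown"),
}

for component, version in versions.items():
    print(component, version)
```
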

Additional Notes
----------------

* **Packages**. The latest tagged release version is published in the
  `Python Package Index <https://pypi.org/project/idna/>`_.

* **Version support**. This library supports Python 3.6 and higher.
  As this library serves as a low-level toolkit for a variety of
  applications, many of which strive for broad compatibility with older
  Python versions, there is no rush to remove older interpreter support.
  Removing support for older versions should be well justified in that the
  maintenance burden has become too high.

* **Python 2**. Python 2 is supported by version 2.x of this library.
  Use "idna<3" in your requirements file if you need this library for
  a Python 2 application. Be advised that these versions are no longer
  actively developed.

* **Testing**. The library has a test suite based on each rule of the
  IDNA specification, as well as tests that are provided as part of the
  Unicode Technical Standard 46, `Unicode IDNA Compatibility Processing
  <https://unicode.org/reports/tr46/>`_.

* **Emoji**. Support for emoji domains in this library is occasionally
  requested. Encoding of symbols like emoji is expressly prohibited by
  the technical standard IDNA 2008 and emoji domains are broadly phased
  out across the domain industry due to associated security risks. For
  now, applications that need to support these non-compliant labels
  may wish to consider trying the encode/decode operation in this library
  first, and then falling back to using ``encodings.idna``. See `the GitHub
  project <https://github.com/kjd/idna/issues/18>`_ for more discussion.
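
The fallback described in the emoji note could be sketched as follows. The
helper name ``encode_with_fallback`` is hypothetical, and the standard
library's ``idna`` codec (IDNA 2003) is used as the fallback, so labels
accepted this way are not IDNA 2008 compliant and carry the security risks
mentioned above.

```python
import idna


def encode_with_fallback(domain: str) -> bytes:
    """Try IDNA 2008 first, then fall back to the stdlib IDNA 2003 codec."""
    try:
        return idna.encode(domain)
    except idna.IDNAError:
        # The built-in "idna" codec is implemented by encodings.idna.
        return domain.encode("idna")


# A compliant name is handled by this library.
print(encode_with_fallback("ドメイン.テスト"))
# A symbol label is rejected by IDNA 2008 and handled by the fallback.
print(encode_with_fallback("☃.com"))
```

Applications may prefer to log or flag names that needed the fallback rather
than accept them silently.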