116 changes: 84 additions & 32 deletions Doc/library/sqlite3.rst
@@ -1157,6 +1157,13 @@ Connection objects
f.write('%s\n' % line)
con.close()

.. note::

If your database contains ``TEXT`` values with invalid Unicode
sequences, or encodings incompatible with UTF-8,
you must use a custom :attr:`text_factory`.
See :ref:`sqlite3-howto-text-factory` for more details.


.. method:: backup(target, *, pages=-1, progress=None, name="main", sleep=0.250)

@@ -1444,39 +1451,8 @@ Connection objects
and returns a text representation of it.
The callable is invoked for SQLite values with the ``TEXT`` data type.
By default, this attribute is set to :class:`str`.
If you want to return ``bytes`` instead, set *text_factory* to ``bytes``.

Example:

.. testcode::

con = sqlite3.connect(":memory:")
cur = con.cursor()

AUSTRIA = "Österreich"

# by default, rows are returned as str
cur.execute("SELECT ?", (AUSTRIA,))
row = cur.fetchone()
assert row[0] == AUSTRIA

# but we can make sqlite3 always return bytestrings ...
con.text_factory = bytes
cur.execute("SELECT ?", (AUSTRIA,))
row = cur.fetchone()
assert type(row[0]) is bytes
# the bytestrings will be encoded in UTF-8, unless you stored garbage in the
# database ...
assert row[0] == AUSTRIA.encode("utf-8")

# we can also implement a custom text_factory ...
# here we implement one that appends "foo" to all strings
con.text_factory = lambda x: x.decode("utf-8") + "foo"
cur.execute("SELECT ?", ("bar",))
row = cur.fetchone()
assert row[0] == "barfoo"

con.close()
See :ref:`sqlite3-howto-text-factory` for more details.

.. attribute:: total_changes

@@ -1562,6 +1538,8 @@ Cursor objects

Use :meth:`executescript` to execute multiple SQL statements.

:meth:`!execute` only accepts UTF-8 encoded strings.
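
As a rough illustration of this restriction, the sketch below assumes it concerns :class:`str` objects that cannot be encoded as UTF-8, such as strings containing lone surrogates. The exception type raised may differ between Python versions, so the sketch catches :exc:`Exception` broadly; the variable names are illustrative only.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# A lone surrogate (here U+DC9F) cannot be encoded as UTF-8.
bad = "a\udc9f"

try:
    con.execute("SELECT ?", (bad,))
    rejected = False
except Exception as exc:
    # The exact exception type is an assumption left open on purpose.
    rejected = True
    print(type(exc).__name__, exc)

# A well-formed string is accepted and round-trips unchanged.
ok = con.execute("SELECT ?", ("Österreich",)).fetchone()[0]
con.close()
```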

.. method:: executemany(sql, parameters, /)

For every item in *parameters*,
@@ -1610,6 +1588,8 @@ Cursor objects
Starting with Python 3.14, :exc:`ProgrammingError` will
be raised instead.

:meth:`!executemany` only accepts UTF-8 encoded strings.

.. method:: executescript(sql_script, /)

Execute the SQL statements in *sql_script*.
@@ -1635,6 +1615,7 @@ Cursor objects
COMMIT;
""")

:meth:`!executescript` only accepts UTF-8 encoded strings.

.. method:: fetchone()

@@ -2614,6 +2595,77 @@ With some adjustments, the above recipe can be adapted to use a
instead of a :class:`~collections.namedtuple`.


.. _sqlite3-howto-text-factory:

How to create and use text factories
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

By default, :mod:`!sqlite3` converts SQLite values with the ``TEXT`` data type
to :class:`str`, decoding them as UTF-8.
This works well for correctly encoded UTF-8 text, but it fails for invalid
Unicode sequences and for text stored in other encodings.
To work around this, you can use a custom :attr:`~Connection.text_factory`.
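
For text stored in a known legacy encoding, a text factory can simply decode with that encoding. The following is a minimal sketch rather than part of the original recipe: it assumes the data was written as Latin-1 (ISO 8859-1), and ``decode_latin1`` is a hypothetical helper name.

```python
import sqlite3

def decode_latin1(data):
    # The text factory receives the raw bytes of each TEXT value.
    return str(data, encoding="latin1")

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE test (data TEXT)")

# Simulate a legacy application: store Latin-1 bytes as TEXT via CAST.
con.execute(
    "INSERT INTO test VALUES(CAST(? AS TEXT))",
    ("Österreich".encode("latin1"),),
)

con.text_factory = decode_latin1
value = con.execute("SELECT data FROM test").fetchone()[0]
con.close()
```

Decoding with an explicit encoding is only appropriate when the encoding of the stored data is actually known.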

Because of SQLite's `flexible typing`_, it is not uncommon to encounter table
columns declared with the ``TEXT`` data type that contain arbitrary data.
To demonstrate, let's create a test database with an invalid Unicode sequence.
We will use a `CAST expression`_ to coerce an invalid UTF-8 byte sequence,
written as the BLOB literal ``X'619F'``,
into the ``TEXT`` data type:

.. testcode::

   con = sqlite3.connect(":memory:")
   con.executescript("""
       CREATE TABLE test (data TEXT);
       INSERT INTO test VALUES(CAST(X'619F' AS TEXT));
   """)

To work with such databases, we can use the following technique,
borrowed from the :ref:`unicode-howto`:

.. testcode::

   con.text_factory = lambda data: str(data, errors="surrogateescape")
   dump = con.iterdump()
   for line in dump:
       print(line)

The loop above will print the offending line using Unicode surrogate escapes:

.. testoutput::

BEGIN TRANSACTION;
CREATE TABLE test (data TEXT);
INSERT INTO "test" VALUES('a\udc9f');
COMMIT;

Note that strings containing surrogate escapes must be treated with care.
You cannot pass them back to SQLite, for example using :meth:`~Cursor.execute`,
since the :mod:`!sqlite3` module APIs only accept UTF-8 encoded strings.
In order to write strings containing surrogate escapes to a file,
you will have to use ``errors="surrogateescape"`` as an argument to :func:`open`:

.. testcode::

   # Create a fresh iterator; one that has already been looped over is exhausted.
   with open("dump.sql", "w", errors="surrogateescape") as f:
       for line in con.iterdump():
           f.write(line + "\n")
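
The round trip can be checked with a short, self-contained sketch. It assumes the same test row as above and writes to a temporary directory instead of ``dump.sql``; the file name ``value.txt`` is arbitrary.

```python
import os
import sqlite3
import tempfile

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE test (data TEXT);
    INSERT INTO test VALUES(CAST(X'619F' AS TEXT));
""")
con.text_factory = lambda data: str(data, errors="surrogateescape")
value = con.execute("SELECT data FROM test").fetchone()[0]
con.close()

# Encoding with the same error handler restores the original bytes.
raw = value.encode("utf-8", errors="surrogateescape")

# Writing and reading with the same error handler preserves the string.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "value.txt")
    with open(path, "w", encoding="utf-8", errors="surrogateescape") as f:
        f.write(value)
    with open(path, encoding="utf-8", errors="surrogateescape") as f:
        restored = f.read()
```

Passing an explicit ``encoding="utf-8"`` is a choice made for this sketch so that the result does not depend on the locale's default encoding.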

.. note::

Unlike :attr:`~Cursor.row_factory`, which exists as an attribute both on
:class:`Cursor` and :class:`Connection` objects,
:attr:`~Connection.text_factory` only exists as an attribute on
:class:`!Connection` objects.
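
A small sketch of what this means in practice: setting the attribute on the connection affects every cursor created from it, while the cursors themselves expose no such attribute. The variable names are illustrative.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur1 = con.cursor()
cur2 = con.cursor()

# The setting lives on the connection and applies to all of its cursors.
con.text_factory = bytes
r1 = cur1.execute("SELECT 'abc'").fetchone()[0]
r2 = cur2.execute("SELECT 'abc'").fetchone()[0]

# Cursor objects have row_factory but no text_factory attribute.
has_text_factory = hasattr(cur1, "text_factory")
has_row_factory = hasattr(cur1, "row_factory")
con.close()
```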

.. seealso::

:ref:`unicode-howto`

.. _CAST expression: https://www.sqlite.org/lang_expr.html#castexpr


.. _sqlite3-explanation:

Explanation