Allow uncompressed import_from_file even if connection was opened with compression #293

@exaSR

Summary

Allow import_from_file, import_from_pandas and import_from_iterable to perform an uncompressed transfer even when the main connection was established with compression=True.

Details

Setup

In our code, the connection is opened in one central location rather than per use case:

import ssl

import pyexasol

conn = pyexasol.connect(
    dsn='CENSORED', user='sys', password='CENSORED',
    autocommit=True, websocket_sslopt={"cert_reqs": ssl.CERT_NONE},
    compression=True
)

Workaround: we now pass the compression flag through a few layers of abstraction so that the import runs uncompressed.
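
For illustration, the workaround could look roughly like the sketch below: a central helper builds the keyword arguments for `pyexasol.connect`, and code paths that need an uncompressed import request `compression=False`. The helper name `build_connect_kwargs` is hypothetical and not part of pyexasol; only the `dsn`, `user`, `password`, `autocommit` and `compression` arguments are taken from the setup above.

```python
def build_connect_kwargs(dsn, user, password, compression=True):
    """Hypothetical central helper: returns kwargs for pyexasol.connect()."""
    return {
        "dsn": dsn,
        "user": user,
        "password": password,
        "autocommit": True,
        "compression": compression,
    }


# Default connections stay compressed ...
default_kwargs = build_connect_kwargs("CENSORED", "sys", "CENSORED")
# ... while bulk-import code paths ask for an uncompressed connection.
import_kwargs = build_connect_kwargs(
    "CENSORED", "sys", "CENSORED", compression=False
)

# conn = pyexasol.connect(**import_kwargs)  # needs a reachable Exasol instance
```

This keeps the decision in one place, but every caller still has to thread the flag through, which is what the requested per-call override would avoid.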

Attempt 1

conn.import_from_file("nation.csv", table="NN", import_params={"compression": False})

results in

  File "/home/sr/.local/lib/python3.14/site-packages/pyexasol/http_transport.py", line 401, in run_sql
    import_query = ImportQuery.load_from_dict(
                   ~~~~~~~~~~~~~~~~~~~~~~~~~~^
        connection=self.connection, compression=self.compression, params=self.params
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ).build_query(table=table, exa_address_list=self.exa_address_list)
    ^
  File "/home/sr/.local/lib/python3.14/site-packages/pyexasol/http_transport.py", line 220, in load_from_dict
    return ImportQuery(connection=connection, compression=compression, **params)
TypeError: pyexasol.http_transport.ImportQuery() got multiple values for keyword argument 'compression'
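
The TypeError can be reproduced without a database. The sketch below uses a hypothetical stand-in class (`ImportQueryStandIn`, not pyexasol's real `ImportQuery`) with the same call shape as the `load_from_dict` line in the traceback: `compression` is passed explicitly and again via `**params`. The second function shows one possible way to let a per-call value win; it is an assumption about how a fix might look, not the library's implementation.

```python
class ImportQueryStandIn:
    """Hypothetical stand-in; NOT pyexasol's real ImportQuery."""

    def __init__(self, connection, compression, **params):
        self.connection = connection
        self.compression = compression
        self.params = params


def load_from_dict(connection, compression, params):
    # Same call shape as in the traceback: explicit keyword plus **params.
    return ImportQueryStandIn(
        connection=connection, compression=compression, **params
    )


def load_from_dict_with_override(connection, compression, params):
    # Possible fix (assumption): merge first, so a per-call value in
    # params overrides the connection-level default.
    merged = {"compression": compression, **params}
    return ImportQueryStandIn(connection=connection, **merged)


try:
    load_from_dict(None, True, {"compression": False})
    error_message = None
except TypeError as exc:
    error_message = str(exc)

query = load_from_dict_with_override(None, True, {"compression": False})
print(error_message)
print(query.compression)
```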

Attempt 2

conn.import_from_file("nation.csv", table="NN", import_params={"format": 'CSV'})

results in

  File "/home/sr/.local/lib/python3.14/site-packages/pyexasol/http_transport.py", line 79, in _get_file_list
    file_ext = self._file_ext
               ^^^^^^^^^^^^^^
  File "/home/sr/.local/lib/python3.14/site-packages/pyexasol/http_transport.py", line 167, in _file_ext
    raise ValueError(f"Unsupported compression format: {self.format}")
ValueError: Unsupported compression format: CSV
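
The error message suggests that the `format` key in import_params is read as a compression format rather than as the data format, so a value such as 'CSV' is rejected. A minimal sketch of that kind of check follows; the function name and the set of accepted values are assumptions for illustration only, and only the error message text is taken from the traceback.

```python
# Assumed mapping of compression formats to file extensions (illustrative).
ASSUMED_COMPRESSION_EXTENSIONS = {"gz": "gz", "bz2": "bz2"}


def compression_ext_stand_in(fmt):
    """Hypothetical stand-in for the check that raises in the traceback."""
    if fmt not in ASSUMED_COMPRESSION_EXTENSIONS:
        raise ValueError(f"Unsupported compression format: {fmt}")
    return ASSUMED_COMPRESSION_EXTENSIONS[fmt]


try:
    compression_ext_stand_in("CSV")
    format_error = None
except ValueError as exc:
    format_error = str(exc)

print(format_error)
```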

Background & Context

Compression in Python is painfully slow because it runs single-threaded, so I tried to switch to uncompressed import for TPC-H benchmarking, but both attempts above failed.
