Use Parquet to bypass Arrow #601

Merged
whitphx merged 14 commits into main from feature/bypass-arrow-serialization
Aug 23, 2023

Conversation

@whitphx (Owner) commented on Aug 18, 2023

Resolves #509

@lukasmasuch (Contributor) left a comment

🚢

- `time.sleep()` is a no-op. Use `asyncio.sleep()` instead. This is a restriction of the Pyodide runtime. See https://github.com/pyodide/pyodide/issues/2354. The following section about top-level await may also help you understand how to use async functions on stlite.
- `st.experimental_data_editor` does not work because it relies on PyArrow, which doesn't work on Pyodide. Track this issue on https://github.com/whitphx/stlite/issues/509.
- For URL access, `urllib` and `requests` don't work on Pyodide/stlite, so we have to use alternative methods provided by Pyodide, such as [`pyodide.http.pyfetch()`](https://pyodide.org/en/stable/usage/api/python-api/http.html#pyodide.http.pyfetch) or [`pyodide.http.open_url()`](https://pyodide.org/en/stable/usage/api/python-api/http.html#pyodide.http.open_url). See https://pyodide.org/en/stable/usage/faq.html#how-can-i-load-external-files-in-pyodide for the details. For `pyodide.http.pyfetch()`, see also the following section about top-level await.
- The C extension packages that are not built for Pyodide cannot be installed. See https://pyodide.org/en/stable/usage/faq.html#micropip-can-t-find-a-pure-python-wheel for the details.
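
As an illustration of the first limitation above, here is a minimal sketch of replacing a blocking `time.sleep()` call with `asyncio.sleep()`. The helper name `wait_and_report` is hypothetical and not part of stlite or Streamlit. The sketch drives the coroutine with `asyncio.run()` so that it also runs in a regular Python interpreter; on stlite, top-level await should let you write `await wait_and_report(0.1)` directly instead.

```python
import asyncio
import time


async def wait_and_report(seconds: float) -> float:
    """Pause without blocking the event loop and return the elapsed time.

    Hypothetical helper for illustration only.
    """
    start = time.monotonic()
    # asyncio.sleep() yields control to the event loop while waiting,
    # which is why it is the suggested replacement for time.sleep().
    await asyncio.sleep(seconds)
    return time.monotonic() - start


# In a regular interpreter, run the coroutine explicitly.
# On stlite, `elapsed = await wait_and_report(0.1)` at the top level
# is assumed to work thanks to top-level await.
elapsed = asyncio.run(wait_and_report(0.1))
print(f"Waited for about {elapsed:.2f} seconds")
```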

@lukasmasuch (Contributor) commented on Aug 23, 2023

nit: maybe add something like:

  1. `st.bokeh_chart` does not work since Pyodide uses Bokeh version 3.x while Streamlit only supports 2.x. Bokeh 3.x support for Streamlit is tracked in streamlit/streamlit#5858 ("Support bokeh 3.0.3").
  2. There are some small differences in how (less common) data types of DataFrame columns are handled in `st.dataframe`, `st.data_editor`, `st.table`, and Altair-based charts. The reason is that stlite uses the Parquet format instead of the Arrow IPC format to serialize dataframes.

@whitphx (Owner, Author) commented on Aug 23, 2023

Thanks! I added the second one in this PR, and will do the first one in another PR.

Owner Author

-> #605

Development

Successfully merging this pull request may close these issues.

Support the Arrow-based data transport.

2 participants