pyodide_http patch_requests breaks COI expectations #40

@WebReflection

This was originally opened by mistake at pyodide/pyodide#4191

🐛 Bug

While testing and demoing one of our apps in PSDC, we noticed that Chrome/Chromium managed to load a third-party spreadsheet, while both Firefox and Safari failed completely with errors about headers and permissions.

We run the code from a worker, which requires SharedArrayBuffer. We managed to enable it, but all requests were then blocked by the browsers.

To Reproduce

import requests
from typing import Optional, Union

from xlrd import Book
from xlrd.sheet import Sheet

# Sync calls
from pyodide_http import patch_requests

def extract(content: bytes, sheet_name: Optional[str] = None) -> Optional[Union[Book, Sheet]]:
    """ do stuff """

def sync_load(data_url: str, sheet_name: Optional[str] = None) -> Optional[Union[Book, Sheet]]:
    """Fetch the workbook synchronously through the patched requests."""
    patch_requests()  # patch requests

    r = requests.get(data_url)
    if r.status_code != 200:  # Not OK
        return None
    return extract(r.content, sheet_name=sheet_name)

In Safari, the errors are about rejected headers:

[Error] Refused to set unsafe header "Accept-Encoding"
[Error] Refused to set unsafe header "Connection"
[Error] Preflight response is not successful. Status code: 403
[Error] Failed to load resource: Preflight response is not successful. Status code: 403 (sample_workbook.xls, line 0)
[Error] XMLHttpRequest cannot load https://raw.githubusercontent.com/XXX/sample_workbook.xls due to access control checks.
[Error] Failed to load resource: Preflight response is not successful. Status code: 403 (sample_workbook.xls, line 0)

which surfaces in Pyodide as "A network error occurred."

Expected behavior

If we change the code to use XHR directly, everything works without issues and no network warning is ever shown:

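import js  # Pyodide's bridge to the browser's JavaScript globals

def sync_load(data_url: str, sheet_name: Optional[str] = None) -> Optional[Union[Book, Sheet]]:
    """Fetch the workbook with a synchronous XHR and no custom headers."""
    xhr = js.XMLHttpRequest.new()
    xhr.open("GET", data_url, False)  # False = synchronous request
    xhr.responseType = "arraybuffer"
    xhr.send(None)
    content = bytes(xhr.response.to_py())
    return extract(content, sheet_name=sheet_name)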

I suspect the error is somewhere in here: https://github.com/koenvo/pyodide-http/blob/main/pyodide_http/_core.py#L75

That code does a lot of header manipulation, but in some cases browsers really don't like user-land code touching security-related, server-defined headers. Overriding the MIME type, for example, can be considered insecure, and so can setting anything that is not already part of the predefined headers.
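To illustrate the kind of filtering I have in mind, here is a minimal sketch that drops the request headers browsers refuse to set (the "Refused to set unsafe header" lines in the Safari log). The names `FORBIDDEN_HEADERS`, `FORBIDDEN_PREFIXES` and `drop_forbidden_headers` are made up for this example and are not part of pyodide_http; the list follows the forbidden request-header names in the Fetch specification as I understand it.

```python
# Hypothetical helper, not part of pyodide_http.
# Forbidden request-header names per the Fetch specification (lowercased).
FORBIDDEN_HEADERS = frozenset({
    "accept-charset", "accept-encoding", "access-control-request-headers",
    "access-control-request-method", "connection", "content-length",
    "cookie", "cookie2", "date", "dnt", "expect", "host", "keep-alive",
    "origin", "referer", "set-cookie", "te", "trailer",
    "transfer-encoding", "upgrade", "via",
})
# Any header starting with one of these prefixes is forbidden as well.
FORBIDDEN_PREFIXES = ("proxy-", "sec-")


def drop_forbidden_headers(headers):
    """Return a copy of `headers` without names a browser would refuse to set."""
    kept = {}
    for name, value in headers.items():
        lowered = name.lower()
        if lowered in FORBIDDEN_HEADERS or lowered.startswith(FORBIDDEN_PREFIXES):
            continue  # the browser would reject this header anyway
        kept[name] = value
    return kept
```

Note that this only silences the "unsafe header" errors. Any remaining header that is not CORS-safelisted can still trigger a preflight request, which may be what the 403 above is about.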

I therefore suggest allowing something like patch_requests(ignore_headers=True) so that nothing is changed. I am also not sure why a non-worker environment should change anything about MIME type expectations, although I think that in our case that value is True.
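As a rough sketch of what such an option could mean at the XHR level: `ignore_headers` does not exist in pyodide_http today, and `open_sync_get` is a name invented for this example. The function takes any XMLHttpRequest-like object, so in Pyodide one would pass `js.XMLHttpRequest.new()`.

```python
# Hypothetical sketch; `ignore_headers` is the option proposed in this
# issue, not an existing pyodide_http parameter.
def open_sync_get(xhr, url, headers=None, ignore_headers=False):
    """Send a synchronous GET on an XMLHttpRequest-like object.

    When `ignore_headers` is True, no request header is set at all, so the
    browser sends only its own defaults, as in the plain XHR version above.
    """
    xhr.open("GET", url, False)  # False = synchronous request
    if not ignore_headers:
        for name, value in (headers or {}).items():
            xhr.setRequestHeader(name, value)
    xhr.send(None)
    return xhr
```

With `ignore_headers=True` the request should look the same as the hand-written XHR call that works in all three browsers for us.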

Environment

  • Browser version: breaks in Safari latest and Firefox latest
