Bug: PDF and Page Archives served as plain text (Missing Content-Type header) in v0.30.0 #2501

@MCQSJ

Describe the Bug

When trying to preview or download PDF assets or full-page archives, the browser renders them as raw plain text instead of displaying the PDF or HTML content correctly.

When clicking the download button, the browser navigates to a URL that displays the entire file as text. If I use "Save link as" in Edge, the default file extension is incorrectly suggested as .txt. I have to manually rename the extension to .pdf or .html to view the content properly.

Steps to Reproduce

  1. Log into Karakeep (Hoarder) version 0.30.0.
  2. Navigate to a bookmark that contains a PDF attachment or has a "Full Page Archive" generated.
  3. Click the "Preview" icon or the "Download" button for the specific PDF/Archive asset.
  4. Observe that the browser opens a new tab/window showing raw text content instead of the rendered file.
  5. Check the Network tab in DevTools; notice the absence of a Content-Type header in the response from /api/assets/....

Expected Behaviour

  1. When clicking "Preview" for a PDF, the browser's built-in PDF viewer should open and render the file correctly.
  2. When clicking "Preview" for a Page Archive, the archived HTML should be rendered as a webpage.
  3. When clicking "Download", the browser should trigger a file download with the correct file extension (.pdf or .html) instead of opening it as a text page.
  4. The server should include the correct Content-Type header (e.g., application/pdf or text/html) in the API response.

Screenshots or Additional Context

This issue affects both the built-in previewer and the direct download links via the API. The server should explicitly set the Content-Type based on the stored asset type.
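One way to read the suggestion above, as a rough sketch rather than Karakeep's actual code: the handler looks up the MIME type stored with the asset and sets Content-Type from it, falling back to a generic binary type when nothing is stored. The function name `build_asset_headers`, its parameters, and the fallback value are all assumptions made for illustration.

```python
from typing import Optional


def build_asset_headers(
    stored_content_type: Optional[str],
    file_name: str,
    download: bool = False,
) -> dict[str, str]:
    """Build response headers for an asset (hypothetical helper, not Karakeep's API)."""
    # Assumption: the MIME type was recorded when the asset was saved.
    # Fall back to a generic binary type so the header is never omitted.
    content_type = stored_content_type or "application/octet-stream"
    # "attachment" asks the browser to save the file; "inline" lets it render.
    disposition = "attachment" if download else "inline"
    # Replace characters that would break the quoted filename parameter.
    safe_name = file_name.replace("\\", "_").replace('"', "_")
    return {
        "Content-Type": content_type,
        "Content-Disposition": f'{disposition}; filename="{safe_name}"',
        # Kept from the response shown in this report.
        "X-Content-Type-Options": "nosniff",
    }
```

For example, `build_asset_headers("application/pdf", "report.pdf", download=True)` would produce the `application/pdf` type and an `attachment` disposition, which is what the "Download" case in the expected behaviour asks for.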

Device Details

OS: Windows 11
Browser: Microsoft Edge 145.0.3800.65

Exact Karakeep Version

Karakeep v0.30.0

Environment Details

Docker, Windows, OpenResty

Debug Logs

Technical Logs (Edge Console)

The request to the asset API returns 200 OK, but the response has no Content-Type header.

Request URL: https://book.home.com/api/assets/e4e4acd4-0a20-4904-9ed7-756443ae8a7f
Key Response Headers:

HTTP/1.1 200 OK
access-control-allow-origin: *
cache-control: private, max-age=31536000, immutable
content-length: 1883575
content-security-policy: sandbox; default-src 'none'; base-uri 'none'; form-action 'none'; img-src https: data: blob:; style-src 'unsafe-inline' https:; connect-src 'none'; media-src https: data: blob:; object-src 'none'; frame-src 'none'
x-content-type-options: nosniff
# Missing Content-Type (e.g., application/pdf or text/html)
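The same check can be scripted against a captured header block. This is a small sketch, not part of Karakeep: `parse_headers` is a hypothetical helper, and `RAW` is an abridged copy of the headers above (the long content-security-policy line is left out for brevity).

```python
def parse_headers(raw: str) -> dict[str, str]:
    """Parse 'name: value' lines into a dict keyed by lowercase header name."""
    headers: dict[str, str] = {}
    for line in raw.splitlines():
        line = line.strip()
        # Skip blank lines, comment lines, and the HTTP status line.
        if not line or line.startswith("#") or line.startswith("HTTP/"):
            continue
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers


# Abridged copy of the response headers from this report.
RAW = """\
HTTP/1.1 200 OK
access-control-allow-origin: *
cache-control: private, max-age=31536000, immutable
content-length: 1883575
x-content-type-options: nosniff
"""

headers = parse_headers(RAW)
has_content_type = "content-type" in headers
print("Content-Type present:", has_content_type)
```

Header names are compared in lowercase because HTTP header names are case-insensitive, so a server sending `Content-Type` or `content-type` would both count as present.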

Have you checked the troubleshooting guide?

- [x] I have checked the troubleshooting guide and I haven't found a solution to my problem
