
Fix relative links to README.md files #1921

Open
chrstnst wants to merge 1 commit into rust-lang:master from chrstnst:fix-links-to-readme-files

Conversation

@chrstnst

@chrstnst chrstnst commented Nov 8, 2022

Fixes #1920

I suggest adding to the 'index' preprocessor a call to swap the filenames in links in the chapters' contents.

@chrstnst
Author

chrstnst commented Nov 8, 2022

Fixes #984

@sanmai-NL

@ehuss Is this change easy to review and accept?

@ehuss
Contributor

ehuss commented Jan 16, 2023

I would prefer to not use regular expressions to translate the links. Can the translation be done in adjust_links or somewhere like that?
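For illustration only, here is a minimal sketch of what a regex-free translation could look like, using plain string operations. The function name `rewrite_md_link` is hypothetical and is not part of mdBook; the scheme check is deliberately simplified (it only looks for `://`), and the real `adjust_links` handles more cases than this.

```rust
/// Hypothetical helper, not part of mdBook: rewrites a relative link to a
/// Markdown file into its HTML path, mapping `README.md` to `index.html`.
fn rewrite_md_link(dest: &str) -> String {
    // Leave pure anchors and absolute URLs untouched (simplified scheme check).
    if dest.starts_with('#') || dest.contains("://") {
        return dest.to_string();
    }
    // Split off an optional `#anchor` suffix.
    let (path, anchor) = match dest.find('#') {
        Some(pos) => (&dest[..pos], &dest[pos..]),
        None => (dest, ""),
    };
    // Only paths ending in `.md` are rewritten.
    let stem = match path.strip_suffix(".md") {
        Some(stem) => stem,
        None => return dest.to_string(),
    };
    // Separate the directory prefix (with its trailing `/`) from the file stem.
    let (dir, file) = match stem.rfind('/') {
        Some(pos) => (&stem[..=pos], &stem[pos + 1..]),
        None => ("", stem),
    };
    // Only a file named exactly `README` becomes `index`.
    let file = if file == "README" { "index" } else { file };
    format!("{dir}{file}.html{anchor}")
}
```

Under these assumptions, `guide/README.md#setup` would become `guide/index.html#setup`, while a file such as `guide/myREADME.md` would keep its name and only change extension.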

@ehuss ehuss added the S-waiting-on-author Status: The marked PR is awaiting some action (such as code changes) from the PR author. label Jan 16, 2023
@CaydenPierce

Would love to see this merged.

@sanmai-NL

@CaydenPierce would you be able to make the requested improvement to the original PR, using a more robust/performant matching algorithm than regular expression matching?

@ghost

ghost commented Jun 30, 2023

Hi, I've been reading the discussions on this topic and I'm also interested in seeing mdBook incorporate a solution for this issue.

> Can the translation be done in adjust_links or somewhere like that?

I had a look and came up with something that extends the existing regular expression in adjust_links. However, this would apply to all links, even if the 'index' preprocessor has been disabled. I'm sharing it in case it helps move this forward.

diff --git a/src/utils/mod.rs b/src/utils/mod.rs
index 9156916..d0cda17 100644
--- a/src/utils/mod.rs
+++ b/src/utils/mod.rs
@@ -95,7 +95,7 @@ pub fn unique_id_from_content(content: &str, id_counter: &mut HashMap<String, us
 fn adjust_links<'a>(event: Event<'a>, path: Option<&Path>) -> Event<'a> {
     static SCHEME_LINK: Lazy<Regex> = Lazy::new(|| Regex::new(r"^[a-z][a-z0-9+.-]*:").unwrap());
     static MD_LINK: Lazy<Regex> =
-        Lazy::new(|| Regex::new(r"(?P<link>.*)\.md(?P<anchor>#.*)?").unwrap());
+        Lazy::new(|| Regex::new(r"(?P<link>.*?)(?P<readme>README)?\.md(?P<anchor>#.*)?").unwrap());
 
     fn fix<'a>(dest: CowStr<'a>, path: Option<&Path>) -> CowStr<'a> {
         if dest.starts_with('#') {
@@ -126,7 +126,16 @@ fn adjust_links<'a>(event: Event<'a>, path: Option<&Path>) -> Event<'a> {
             }
 
             if let Some(caps) = MD_LINK.captures(&dest) {
-                fixed_link.push_str(&caps["link"]);
+                let link = &caps["link"];
+                fixed_link.push_str(link);
+                // "Links to README.md will be converted to index.html"
+                if let Some(readme) = caps.name("readme") {
+                    if link.is_empty() || link.ends_with('/') {
+                        fixed_link.push_str("index");
+                    } else {
+                        fixed_link.push_str(readme.as_str());
+                    }
+                }
                 fixed_link.push_str(".html");
                 if let Some(anchor) = caps.name("anchor") {
                     fixed_link.push_str(anchor.as_str());

Several questions I can think of:

  1. How to deal with the fact that the 'index' preprocessor can be run or not? Maybe the logic should live in src/preprocess/index.rs and make calls to adjust_links.
  2. "I would prefer to not use regular expressions to translate the links.": Is it ok to extend the regex in adjust_links?

@bitdivine

Maybe README.md could be mapped to index.html AND README.html. This way the links don't need to be updated. Just an alternative as changing the links seems to be complicated. I don't believe that having README.md should be a problem for users. After all, users can have either paths ending in / or in /index.html, so why not /README.html as well? It might not look as cool as having URLs end in / but it is simple and robust.
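As a rough sketch of this alternative, the function below lists the HTML files that would be written for one chapter source. The name `output_paths` is hypothetical and does not correspond to anything in mdBook's renderer; it only illustrates emitting both `index.html` and `README.html` for a `README.md` chapter.

```rust
use std::ffi::OsStr;
use std::path::{Path, PathBuf};

/// Hypothetical helper: lists the HTML files to emit for one chapter source.
fn output_paths(source: &Path) -> Vec<PathBuf> {
    let html = source.with_extension("html");
    if source.file_name() == Some(OsStr::new("README.md")) {
        // Emit the index page plus a copy under the original name,
        // so existing links to README.html keep resolving.
        vec![source.with_file_name("index.html"), html]
    } else {
        vec![html]
    }
}
```

With this approach the links in chapter contents would not need rewriting at all, at the cost of writing each README chapter twice.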

@sanmai-NL

> Maybe README.md could be mapped to index.html AND README.html. This way the links don't need to be updated. Just an alternative as changing the links seems to be complicated. I don't believe that having README.md should be a problem for users. After all, users can have either paths ending in / or in /index.html, so why not /README.html as well? It might not look as cool as having URLs end in / but it is simple and robust.

That would work great, but the canonical address should be recorded in the HTML to prevent duplicate content for indexers like online search engines.
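One way to record the canonical address is a `<link rel="canonical">` element in each page's head. The sketch below assumes a hypothetical helper `canonical_tag` that takes the site's base URL and the page path relative to it; neither the name nor the parameters come from mdBook, and no HTML escaping is performed.

```rust
/// Hypothetical helper: builds a canonical `<link>` tag for a rendered page.
/// A duplicate `README.html` points back at the sibling `index.html`.
fn canonical_tag(site_url: &str, page_path: &str) -> String {
    let canonical = match page_path.strip_suffix("README.html") {
        // Only an exact `README.html` file name is redirected to the index page.
        Some(dir) if dir.is_empty() || dir.ends_with('/') => format!("{dir}index.html"),
        _ => page_path.to_string(),
    };
    format!(
        "<link rel=\"canonical\" href=\"{}/{}\">",
        site_url.trim_end_matches('/'),
        canonical
    )
}
```

Both `guide/index.html` and `guide/README.html` would then declare the same canonical URL, so indexers can treat them as one page.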

@jlinford

jlinford commented Oct 2, 2023

Looking forward to seeing this fixed. Thanks!

@gluax

gluax commented Aug 25, 2024

Hey all, what's the status on getting this fix in?

tgreenx added a commit to tgreenx/zonemaster that referenced this pull request Mar 11, 2025
This commit explicitly sets many configuration options for the mdbook generation.
In particular, for generic options (see https://rust-lang.github.io/mdBook/format/configuration/general.html#build-options):
- 'create-missing' is set to false. This will make the build process exit with an error if any files do not exist in 'SUMMARY.md'.
- 'use-default-preprocessors' is set to false. This disables the default preprocessors ('links' & 'index').
The 'index' preprocessor converts all chapter files named 'README.md' into 'index.md', but there is a catch: relative links are not dynamically updated (see rust-lang/mdBook#1921), leading to errors at build generation. We can live just fine without that plugin.
The 'links' preprocessor expands the {{ #playground }}, {{ #include }}, and {{ #rustdoc_include }} handlebars helpers in a chapter to include the contents of a file. We can probably live without it too (although it could be re-enabled).

And for specific "linkcheck" options (see https://github.com/Michael-F-Bryan/mdbook-linkcheck/blob/v0.7.7/README.md?plain=1#L64-L125):
- 'optional' is set to true. This will output a warning if the plugin is not installed, rather than failing to build entirely.
- 'warning-policy' is set to 'warn'. This changes the level of messages to warnings, rather than errors that interrupt the build process.
- 'traverse-parent-directories' is set to false. This disallows linking to files outside of the book's root directory.
- 'exclude' is set to some links. This excludes the linkchecking from running on those. They were chosen after trial and error.
tgreenx added a commit to tgreenx/zonemaster that referenced this pull request Mar 11, 2025
This commit explicitly sets many configuration options for the mdbook generation.
In particular, for generic options (see https://rust-lang.github.io/mdBook/format/configuration/general.html#build-options):
- 'create-missing' is set to false. This will make the build process exit with an error if any files do not exist in 'SUMMARY.md'.
- 'use-default-preprocessors' is set to false. This disables the default preprocessors ('links' & 'index').
The 'index' preprocessor converts all chapter files named 'README.md' into 'index.md', but there is a catch: relative links are not dynamically updated (see rust-lang/mdBook#1921), leading to errors at build generation. We can live just fine without that plugin.
The 'links' preprocessor expands the {{ #playground }}, {{ #include }}, and {{ #rustdoc_include }} handlebars helpers in a chapter to include the contents of a file. We can probably live without it too (although it could be re-enabled).

And for specific "linkcheck" options (see https://github.com/Michael-F-Bryan/mdbook-linkcheck/blob/v0.7.7/README.md?plain=1#L64-L125):
- 'optional' is set to true. This will output a warning if the plugin is not installed, rather than failing to build entirely.
- 'warning-policy' is set to 'warn'. This changes the level of messages to warnings, rather than errors that interrupt the build process. (No effective change: this was already the default value)
- 'traverse-parent-directories' is set to false. This disallows linking to files outside of the book's root directory. (No effective change: this was already the default value)
- 'exclude' is set to some links. This excludes the linkchecking from running on those. They were chosen after trial and error.
@rustbot
Collaborator

rustbot commented Aug 18, 2025

☔ The latest upstream changes (possibly 21f2435) made this pull request unmergeable. Please resolve the merge conflicts.


Labels

S-waiting-on-author Status: The marked PR is awaiting some action (such as code changes) from the PR author.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug: relative links to 'README.md' files don't resolve

8 participants