feat: replace toml with cli config#732

Merged
Mirko-von-Leipzig merged 17 commits intonextfrom
mirko/cli
Mar 19, 2025
Conversation

Collaborator

@Mirko-von-Leipzig Mirko-von-Leipzig commented Mar 12, 2025

This PR replaces the node's toml-file configuration with pure clap CLI args. We retain the ability to configure via a file by using environment variables instead.
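For context, the flag-over-environment precedence described here can be sketched with std only. The helper and the variable name `MIDEN_NODE_RPC_URL` are hypothetical, chosen for illustration; the real implementation simply uses clap's `env = ...` argument attribute:

```rust
use std::env;

// Hypothetical helper mirroring clap's `env = ...` fallback: an explicit CLI
// value takes precedence, otherwise the named environment variable is read.
fn resolve_arg(cli_value: Option<&str>, env_key: &str) -> Option<String> {
    cli_value
        .map(str::to_owned)
        .or_else(|| env::var(env_key).ok())
}

fn main() {
    // A flag value always wins over the environment.
    let url = resolve_arg(Some("http://10.0.0.1:8080"), "MIDEN_NODE_RPC_URL");
    assert_eq!(url.as_deref(), Some("http://10.0.0.1:8080"));
    // With neither a flag nor the variable set, there is no value.
    let url = resolve_arg(None, "SURELY_UNSET_VAR_93145");
    assert_eq!(url, None);
}
```

This makes a "config file" just a file of environment variable assignments sourced before starting the node.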

Left undone

  • Update github scripts (waiting on an initial review first)

Documentation in readme. I'll skip the readme and add it to the operator docs in a separate PR.

The faucet still uses a toml file. This file is re-used when creating a new account and when running the faucet. We may want to revisit the expected flow here, but I didn't want to fiddle with it.

New flow

I've swapped around the subcommand structure so that it's now miden-node <component> <action>. This was mostly to "organise" the store commands, but I can imagine having additional subcommands for rpc and block-producer in the future.

I wanted to change up the store so that miden-node store bootstrap would also create the database and seed it with the genesis data. In this case the genesis.dat would become ephemeral (though it can be brought back if we decide we need it). miden-node store start would then never create the db, but instead expect it to exist. I also want to move database migration into a separate subcommand (and not do it automatically on start). I decided not to do this in this PR as it was too much churn.

Simplifications

I've replaced the various store filepaths and directory paths with a single data_directory which will contain the database, blockstore and genesis.dat. I don't see a good reason to modify these at present.

I've also simplified node start by making the internal gRPC servers (block-producer & store) use localhost:0 which assigns a random open port. This means these no longer need to be configured at all. We can somewhat trivially make this overrideable but I don't see the point. If you want that, spawn the components separately imo :)
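The localhost:0 trick is plain std behaviour: binding to port 0 asks the OS for any free port, and the actual port is then read back from the listener. A minimal sketch:

```rust
use std::net::TcpListener;

fn main() -> std::io::Result<()> {
    // Binding to port 0 lets the OS assign a random open port; the concrete
    // port is then available via the listener's local address. This is
    // (roughly) what lets the internal store and block-producer servers run
    // without any port configuration.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    assert_ne!(addr.port(), 0); // the OS picked a concrete port
    println!("internal gRPC server would listen on {addr}");
    Ok(())
}
```

The node can then hand that concrete address to the sibling component's client instead of requiring operator configuration.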

I'll also note that I was tempted to make the components more robust to startup failures, but it seems like tonic in general does not handle connection failures in many cases - meaning we cannot trivially loop/retry until all required components are ready. This may take a bit more work to achieve.

Usage

miden-node node start [OPTIONS] --rpc.url <URL> --store.data-directory <DIR>
miden-node rpc start [OPTIONS] --rpc.url <URL> --store.url <URL> --block-producer.url <URL>
miden-node block-producer start [OPTIONS] --store.url <STORE_URL> <URL>

miden-node store dump-genesis > genesis.toml
miden-node store bootstrap [OPTIONS] --data-directory <DIR> --accounts-directory <DIR>
miden-node store start [OPTIONS] --url <URL> --data-directory <DIR>

@Mirko-von-Leipzig Mirko-von-Leipzig marked this pull request as ready for review March 12, 2025 12:41
Comment on lines +21 to +46
Start {
/// Url at which to serve the RPC component's gRPC API.
#[arg(long = "rpc.url", env = ENV_RPC_URL, value_name = "URL")]
rpc_url: Url,

/// Directory in which the Store component should store the database and raw block data.
#[arg(long = "store.data-directory", env = ENV_STORE_DIRECTORY, value_name = "DIR")]
store_data_directory: PathBuf,

/// The remote batch prover's gRPC url. If unset, will default to running a prover
/// in-process which is expensive.
#[arg(long = "batch_prover.url", env = ENV_BATCH_PROVER_URL, value_name = "URL")]
batch_prover_url: Option<Url>,

/// The remote block prover's gRPC url. If unset, will default to running a prover
/// in-process which is expensive.
#[arg(long = "block_prover.url", env = ENV_BLOCK_PROVER_URL, value_name = "URL")]
block_prover_url: Option<Url>,

/// Enables the exporting of traces for OpenTelemetry.
///
/// This can be further configured using environment variables as defined in the official
/// OpenTelemetry documentation. See our operator manual for further details.
#[arg(long = "open-telemetry", default_value_t = false, env = ENV_ENABLE_OTEL, value_name = "bool")]
open_telemetry: bool,
},
Collaborator Author

I wonder if we shouldn't use the manual (non-derive) way of building these args. That would let us re-use the same ones and keep consistency.

I'm unsure about the naming scheme as well, and whether it should always remain consistent. I suspect I've done a little of column A and a bit of column B.

e.g. should store.url be used to configure RPC, block-producer AND the store start itself?

Contributor

I'm not sure I fully understand the issue. Is it that when we do miden-node store start we use --url to specify store URL, but if we use miden-node block-producer start we use --store.url? This seems pretty natural to me.

Collaborator Author

I agree it's pretty natural as is. I was considering whether it would help cement the relation between the components if we kept it consistent throughout.

i.e. it might make it more obvious that it's the same URL if it's store.url everywhere.

Comment on lines +60 to +61
let block_producer_url = format!("http://{block_producer_address}");
let channel = tonic::transport::Endpoint::try_from(block_producer_url)?.connect().await?;
Collaborator Author

This is a bit awkward and makes me think I was too hasty in removing our custom Endpoint type in #654.

As a summary:

  • Tonic wants a Uri but accepts strings in the format of http(s)://{SocketAddr}.
  • TcpListener::bind wants a SocketAddr.
  • Uri cannot be trivially converted to a SocketAddr
  • Url can be converted to a SocketAddr so we use it.
  • Clients may want to perform a DNS lookup to the server address of http://rpc.testnet.miden.io

I think it's okay to leave as is, but we probably want to revisit this (again :'( ).
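The mismatch above can be sketched with std only. `socket_addr_from_url` is a hypothetical helper for illustration (the real code uses the url crate rather than string trimming), but it shows both halves of the awkwardness: `ToSocketAddrs` gives us bind/DNS support, while the scheme is silently discarded:

```rust
use std::net::{SocketAddr, ToSocketAddrs};

// Hypothetical sketch of the conversion described above: tonic wants a URI
// string, while `TcpListener::bind` wants a `SocketAddr`. Stripping the
// scheme and resolving via `ToSocketAddrs` (which also performs DNS lookups
// for names like rpc.testnet.miden.io) bridges the two, at the cost of
// silently discarding http vs https.
fn socket_addr_from_url(url: &str) -> std::io::Result<SocketAddr> {
    let without_scheme = url
        .trim_start_matches("http://")
        .trim_start_matches("https://");
    without_scheme
        .to_socket_addrs()?
        .next()
        .ok_or_else(|| std::io::Error::new(std::io::ErrorKind::NotFound, "no address"))
}

fn main() -> std::io::Result<()> {
    let addr = socket_addr_from_url("http://127.0.0.1:48422")?;
    assert_eq!(addr, "127.0.0.1:48422".parse().unwrap());
    println!("{addr}");
    Ok(())
}
```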

Contributor

If we had the Endpoint struct - how would this be different?

Collaborator Author

I don't actually know. It just seems messy and incorrect to me at the moment.

As an example, if you provide an https URL for a server at the moment then we don't actually support that. We strip the protocol from the URL when we convert to a socket address and don't add TLS support when we create the tonic server.

This is sort of fine because we don't (ourselves) want to use TLS internally. But it sort of points out the leaky abstraction here.

If we implement something ourselves then we'll probably have the opposite issue of rejecting weird but valid URLs.

Collaborator Author

Actually maybe the real answer is making startup robust. The only reason we have TcpListener involved is to separate server initialization from the clients spinning up in serve.

If we have the components capable of starting up robustly with internal retries etc, then we can simply use Uri which is what tonic can use.
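A sketch of such an internal retry, with the tonic `connect()` call stood in for by an arbitrary fallible closure. `connect_with_retries` is a hypothetical helper, not node code:

```rust
use std::thread::sleep;
use std::time::Duration;

// Hypothetical retry wrapper illustrating "robust startup": keep attempting
// the operation until the dependency is up or the attempt budget is spent.
// In the real node the operation would be a tonic connect() call.
fn connect_with_retries<T, E>(
    mut attempt: impl FnMut() -> Result<T, E>,
    max_attempts: u32,
    backoff: Duration,
) -> Result<T, E> {
    let mut last_err = None;
    for _ in 0..max_attempts {
        match attempt() {
            Ok(value) => return Ok(value),
            Err(e) => {
                last_err = Some(e);
                sleep(backoff); // note: also sleeps after the final failure
            }
        }
    }
    Err(last_err.expect("max_attempts must be > 0"))
}

fn main() {
    // Simulated dependency that only comes up on the third attempt.
    let mut calls = 0;
    let result = connect_with_retries(
        || {
            calls += 1;
            if calls < 3 { Err("not ready") } else { Ok("channel") }
        },
        5,
        Duration::from_millis(1),
    );
    assert_eq!(result, Ok("channel"));
}
```

With startup robust in this sense, the components could accept a plain Uri and let the retry loop absorb ordering issues between them.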

Contributor

@bobbinth bobbinth left a comment


Looks good! Thank you! Not a full review yet, but I left some comments inline.

I've swapped around the subcommand structure so that it's now miden-node <component> <action>. This was mostly to "organise" the store commands, but I can imagine having additional subcommands for rpc and block-producer in the future.

I think this is largely fine, but I left a couple of comments inline suggesting that some "node-wide" commands could be brought up to the root level rather than sit under the node sub-command. For example:

miden-node start
miden-node bootstrap

Feel more natural to me than

miden-node node start
miden-node store bootstrap

I wanted to change up the store so that miden-node store bootstrap would also create the database and seed it with the genesis data. In this case the genesis.dat would become ephemeral (though it can be brought back if we decide we need it). miden-node store start would then never create the db, but instead expect it to exist. I also want to move database migration into a separate subcommand (and not do it automatically on start). I decided not to do this in this PR as it was too much churn.

I like this and agree with doing it in a separate PR. One thing we should make sure is that the bootstrap command is deterministic (i.e., given the same genesis inputs file it initializes the DB in the same way).

I've replaced the various store filepaths and directory paths with a single data_directory which will contain the database, blockstore and genesis.dat. I don't see a good reason to modify these at present.

I think this is fine for now - the only question I have is if we'd still have the flexibility to somehow put the database file and the blockstore on different disks, if we wanted to.

Usage

Summarizing my various comments, I'd probably structure the commands something like this:

miden-node start [OPTIONS] --url <URL> --data-directory <DIR>
miden-node rpc start [OPTIONS] --url <URL> --store-url <URL> --block-producer-url <URL>
miden-node block-producer start [OPTIONS] --url <URL> --store-url <STORE_URL>

miden-node dump-genesis > genesis.toml
miden-node bootstrap [OPTIONS] --data-directory <DIR> --accounts-directory <DIR>
miden-node store start [OPTIONS] --url <URL> --data-directory <DIR>


Comment on lines +27 to +53
/// Dumps the default genesis configuration to stdout.
///
/// Use this as a starting point to modify the genesis data for `bootstrap`.
DumpGenesis,

/// Bootstraps the blockchain database with the genesis block.
///
/// This populates the genesis block's data with the accounts and data listed in the
/// configuration file.
///
/// Each generated genesis account's data is also written to disk. This includes the private
/// key which can be used to create transactions for these accounts.
///
/// See also: `dump-genesis`
Bootstrap {
/// Genesis configuration file.
///
/// If not provided the default configuration is used.
#[arg(long, value_name = "FILE")]
config: Option<PathBuf>,
/// Directory in which to store the database and raw block data.
#[arg(long, env = ENV_STORE_DIRECTORY, value_name = "DIR")]
data_directory: PathBuf,
    /// Directory to write the account data to.
#[arg(long, value_name = "DIR")]
accounts_directory: PathBuf,
},
Contributor

It is not clear to me that these commands should be store-specific. Currently they affect only the store component, but in the future the bootstrapping process may affect multiple components. Maybe we could move it to the root so that users could run something like:

miden-node bootstrap [OPTIONS] --data-directory <DIR> --accounts-directory <DIR>

Collaborator Author

@Mirko-von-Leipzig Mirko-von-Leipzig Mar 17, 2025


Could you give a potential example?

The reason I bundled these is because I thought (in general) that these would all be run on the instance running the store. That we're spinning up the store component and these are the set of commands required to do so.

If there is a block-producer bootstrap process then it would likely not have anything to do with this one?

Contributor

hmmm - not quite sure what an example could be, and actually, thinking about this more, it does feel like the store component should have its own bootstrap command. But I do think it may be good to have one "global" bootstrap command which would currently just execute store's initialization process. So, both of these will currently do the same:

miden-node bootstrap [OPTIONS] --data-directory <DIR> --accounts-directory <DIR>
miden-node store bootstrap [OPTIONS] --data-directory <DIR> --accounts-directory <DIR>

But in the future the second one may be a subset of the first one.

Collaborator Author

I would prefer if we have a single canonical way of doing one thing, at least until there is a stronger indication that we need a combination of things.

I could see that we may want a quick-start option to spin up a node, but it's possible we would want that to form part of the "bundled" options then?

Contributor

One could argue that these commands are different: one is for initializing the "monolithic" node and the other one is for initializing the store component. That is, the user who's setting up the store as a separate component would use the second command, but the user who is setting up the node to run as a combined binary would use the first command - and so, there is no overlap.

They do happen to do the same thing right now - but it may actually be cleaner to keep them separate.

Collaborator Author

Would you be happy with

miden-node store bootstrap
           store start
           
miden-node bundled bootstrap
           bundled start

where bundled is the current node. I don't think bundled is a great name, but node is also strange. I'm wary of dumping the full node commands in the root "folder", so to speak.

Contributor

miden-node store bootstrap
           store start

miden-node bundled bootstrap
           bundled start

Yep, let's go with this. I do think not having the "bundled" word would make it a bit easier for the user - but the difference is probably marginal.

Collaborator Author

Node alternate word dump

composite
combined
bundle
pack
in-process
full-node
full
all

Collaborator Author

Something that is a bit confusing now is that dump-genesis is only in miden-node store.

@Mirko-von-Leipzig
Collaborator Author

I've addressed all comments; the only open question is what to do regarding dump-genesis.

Collaborator

@igamigo igamigo left a comment


Not the deepest review, but looks good to me! It seems like various subcommands only include a start variant, so I wonder whether it would be worth leaving them as top-level commands (maybe that would also solve the dump-genesis ambiguity?). I also saw your comment about leaving room for future commands, which makes sense as well.

Comment on lines +13 to +40
/// Starts the block-producer component.
Start {
/// Url at which to serve the gRPC API.
#[arg(env = ENV_BLOCK_PRODUCER_URL)]
url: Url,

/// The store's gRPC url.
#[arg(long = "store.url", env = ENV_STORE_URL)]
store_url: Url,

/// The remote batch prover's gRPC url. If unset, will default to running a prover
/// in-process which is expensive.
#[arg(long = "batch-prover.url", env = ENV_BATCH_PROVER_URL)]
batch_prover_url: Option<Url>,

/// The remote block prover's gRPC url. If unset, will default to running a prover
/// in-process which is expensive.
#[arg(long = "block-prover.url", env = ENV_BLOCK_PROVER_URL)]
block_prover_url: Option<Url>,

/// Enables the exporting of traces for OpenTelemetry.
///
/// This can be further configured using environment variables as defined in the official
/// OpenTelemetry documentation. See our operator manual for further details.
#[arg(long = "enable-otel", default_value_t = false, env = ENV_ENABLE_OTEL)]
open_telemetry: bool,
},
}
Collaborator

Considering your comment, is there anything to document (here and elsewhere) about the https support?

Collaborator Author

I think I can solve the issue by making component startup more robust. Right now I wouldn't even know what exactly would work, beyond (sometimes) discarding the protocol.

Once startup is robust it should be able to handle https as well, even if we ourselves never use it for performance/simplicity reasons.

@Mirko-von-Leipzig
Collaborator Author

Not the deepest review, but looks good to me! It seems like various subcommands only include a start variant, so I wonder whether it would be worth leaving them as top-level commands (maybe that would also solve the dump-genesis ambiguity?). I also saw your comment about leaving room for future commands, which makes sense as well.

Yeah, it's unclear to me as well, but I think I prefer the consistency of it all: that it's always miden-node <component> <action>.

Contributor

@bobbinth bobbinth left a comment


Looks good! Thank you! What are some more immediate tasks beyond this PR?

  • Update documentation (readme and mdBook).
  • Update deployment scripts.
  • Make the genesis dat file "ephemeral" (i.e., get rid of the dump-genesis command).
  • Anything else?

@Mirko-von-Leipzig
Collaborator Author

Looks good! Thank you! What are some more immediate tasks beyond this PR?

  • Update documentation (readme and mdBook).
  • Update deployment scripts.
  • Make the genesis dat file "ephemeral" (i.e., get rid of the dump-genesis command).
  • Anything else?

Allow each server to start even if a downstream server is offline, which would improve this endpoint situation. Though I don't think the solution given in the issue (using connect_lazy) is entirely correct.

We can also now do #735 to configure the block-producer further.

@Mirko-von-Leipzig Mirko-von-Leipzig force-pushed the mirko/cli branch 2 times, most recently from 1d5e3bb to 4d131ce Compare March 19, 2025 09:22
@Mirko-von-Leipzig Mirko-von-Leipzig merged commit 37f56c6 into next Mar 19, 2025
1 check passed
@Mirko-von-Leipzig Mirko-von-Leipzig deleted the mirko/cli branch March 19, 2025 10:54