feat: replace toml with cli config #732
Conversation
Replace block-producer start command
```rust
Start {
    /// Url at which to serve the RPC component's gRPC API.
    #[arg(long = "rpc.url", env = ENV_RPC_URL, value_name = "URL")]
    rpc_url: Url,

    /// Directory in which the Store component should store the database and raw block data.
    #[arg(long = "store.data-directory", env = ENV_STORE_DIRECTORY, value_name = "DIR")]
    store_data_directory: PathBuf,

    /// The remote batch prover's gRPC url. If unset, will default to running a prover
    /// in-process which is expensive.
    #[arg(long = "batch_prover.url", env = ENV_BATCH_PROVER_URL, value_name = "URL")]
    batch_prover_url: Option<Url>,

    /// The remote block prover's gRPC url. If unset, will default to running a prover
    /// in-process which is expensive.
    #[arg(long = "block_prover.url", env = ENV_BLOCK_PROVER_URL, value_name = "URL")]
    block_prover_url: Option<Url>,

    /// Enables the exporting of traces for OpenTelemetry.
    ///
    /// This can be further configured using environment variables as defined in the official
    /// OpenTelemetry documentation. See our operator manual for further details.
    #[arg(long = "open-telemetry", default_value_t = false, env = ENV_ENABLE_OTEL, value_name = "bool")]
    open_telemetry: bool,
},
```
I wonder if we shouldn't use the manual (non-derive) way of building these args. That would let us re-use the same ones and keep consistency.
I'm unsure about the naming scheme as well, and whether it should always remain consistent. I suspect I've done a little of column A and a bit of column B.
e.g. should store.url be used to configure RPC, block-producer AND the store start itself?
I'm not sure I fully understand the issue. Is it that when we do miden-node store start we use --url to specify store URL, but if we use miden-node block-producer start we use --store.url? This seems pretty natural to me.
I agree it's pretty natural as is. I was considering if it would help cement the relation between the components if we kept it consistent throughout.
i.e. it might make it more obvious that it's the same URL if it's `store.url` everywhere.
```rust
let block_producer_url = format!("http://{block_producer_address}");
let channel = tonic::transport::Endpoint::try_from(block_producer_url)?.connect().await?;
```
This is a bit awkward and makes me think I was too hasty in removing our custom Endpoint type in #654.
As a summary:
- Tonic wants a `Uri` but accepts strings in the format of `http(s)://{SocketAddr}`.
- `TcpListener::bind` wants a `SocketAddr`.
- `Uri` cannot be trivially converted to a `SocketAddr`.
- `Url` can be converted to a `SocketAddr` so we use it.
- Clients may want to perform a DNS lookup to the server address of `http://rpc.testnet.miden.io`.
I think its okay to leave as is, but we probably want to revisit this (again :'( ).
If we had the Endpoint struct - how would this be different?
I don't actually know. It just seems messy and incorrect to me at the moment.
As an example, if you provide an https URL for a server at the moment then we don't actually support that. We strip the protocol from the URL when we convert to a socket address and don't add TLS support when we create the tonic server.
This is sort of fine because we don't (ourselves) want to use TLS internally. But it sort of points out the leaky abstraction here.
If we implement something ourselves then we'll probably have the opposite issue of rejecting weird but valid URLs.
Actually maybe the real answer is making startup robust. The only reason we have TcpListener involved is to separate server initialization from the clients spinning up in serve.
If we have the components capable of starting up robustly with internal retries etc, then we can simply use Uri which is what tonic can use.
bobbinth
left a comment
Looks good! Thank you! Not a full review yet, but I left some comments inline.
I've swapped around the subcommand structure so that it's now `miden-node <component> <action>`. This was mostly to "organise" the store commands but I can imagine having additional subcommands for rpc and block-producer in the future.
I think this is largely fine, but I left a couple of comments inline suggesting that some "node-wide" commands could be brought up to the root level rather than sit under the node sub-command. For example:
miden-node start
miden-node bootstrap
Feel more natural to me than
miden-node node start
miden-node store bootstrap
I wanted to change up the store so that `miden-node store bootstrap` would also create the database and seed it with the genesis data. In this case the `genesis.dat` would become ephemeral (though it can be brought back if we decide we need it). `miden-node store start` would now never create the db, but instead expect it to exist. I also want to move database migration into a separate subcommand (and not automatically do it on start). I decided not to do this in this PR as it was too much churn.
I like this and agree with doing it in a separate PR. One thing we should make sure is that the bootstrap command is deterministic (i.e., given the same genesis inputs file it initializes the DB in the same way).
I've replaced the various store filepaths and directory paths with a single `data_directory` which will contain the database, blockstore and `genesis.dat`. I don't see a good reason to modify these at present.
I think this is fine for now - the only question I have is if we'd still have the flexibility to somehow put the database file and the blockstore on different disks, if we wanted to.
Usage
Summarizing my various comments, I'd probably have the commands look something like this:
miden-node start [OPTIONS] --url <URL> --data-directory <DIR>
miden-node rpc start [OPTIONS] --url <URL> --store-url <URL> --block-producer-url <URL>
miden-node block-producer start [OPTIONS] --url <URL> --store-url <STORE_URL>
miden-node dump-genesis > genesis.toml
miden-node bootstrap [OPTIONS] --data-directory <DIR> --accounts-directory <DIR>
miden-node store start [OPTIONS] --url <URL> --data-directory <DIR>
```rust
/// Dumps the default genesis configuration to stdout.
///
/// Use this as a starting point to modify the genesis data for `bootstrap`.
DumpGenesis,

/// Bootstraps the blockchain database with the genesis block.
///
/// This populates the genesis block's data with the accounts and data listed in the
/// configuration file.
///
/// Each generated genesis account's data is also written to disk. This includes the private
/// key which can be used to create transactions for these accounts.
///
/// See also: `dump-genesis`
Bootstrap {
    /// Genesis configuration file.
    ///
    /// If not provided the default configuration is used.
    #[arg(long, value_name = "FILE")]
    config: Option<PathBuf>,
    /// Directory in which to store the database and raw block data.
    #[arg(long, env = ENV_STORE_DIRECTORY, value_name = "DIR")]
    data_directory: PathBuf,
    /// Directory to write the account data to.
    #[arg(long, value_name = "DIR")]
    accounts_directory: PathBuf,
},
```
It is not clear to me that these commands should be store-specific. Currently they affect only the store component, but in the future the bootstrapping process may affect multiple components. Maybe we could move it to the root so that users could run something like:
miden-node bootstrap [OPTIONS] --data-directory <DIR> --accounts-directory <DIR>
Could you give a potential example?
The reason I bundled these is because I thought (in general) that these would all be run on the instance running the store. That we're spinning up the store component and these are the set of commands required to do so.
If there is a block-producer bootstrap process then it would likely not have anything to do with this one?
hmmm - not quite sure what an example could be, and actually, thinking about this more, it does feel like the store component should have its own bootstrap command. But I do think it may be good to have one "global" bootstrap command which would currently just execute store's initialization process. So, both of these will currently do the same:
miden-node bootstrap [OPTIONS] --data-directory <DIR> --accounts-directory <DIR>
miden-node store bootstrap [OPTIONS] --data-directory <DIR> --accounts-directory <DIR>
But in the future the second one may be a subset of the first one.
I would prefer if we have a single canonical way of doing one thing, at least until there is a stronger indication that we need a combination of things.
I could see that we may want a quick-start option to spin up a node, but it's possible we would want that to form part of the "bundled" options then?
One could argue that these commands are different: one is for initializing the "monolithic" node and the other one is for initializing the store component. That is, the user who's setting up the store as a separate component would use the second command, but the user who is setting up the node to run as a combined binary would use the first command - and so, there is no overlap.
They do happen to do the same thing right now - but it may actually be cleaner to keep them separate.
Would you be happy with
miden-node store bootstrap
miden-node store start
miden-node bundled bootstrap
miden-node bundled start
where bundled is the current node -- I don't think this is a great name, but node is also.. strange. I'm wary of dumping the full node commands in the root "folder" so to speak.
miden-node store bootstrap
miden-node store start
miden-node bundled bootstrap
miden-node bundled start
Yep, let's go with this. I do think not having the "bundled" word would make it a bit easier for the user - but the difference is probably marginal.
Node alternate word dump
composite
combined
bundle
pack
in-process
full-node
full
all
Something that is a bit confusing now is that dump-genesis is only in miden-node store.
I've addressed all comments, the only open question is what to do regarding
igamigo
left a comment
Not the deepest review but looks good to me! It seems like various subcommands only include a "start" variant, so I wonder if it would be worth leaving them as top-level commands (maybe that would solve the dump-genesis ambiguity?). I also saw your comment about providing space for future commands, which would make sense as well.
```rust
/// Starts the block-producer component.
Start {
    /// Url at which to serve the gRPC API.
    #[arg(env = ENV_BLOCK_PRODUCER_URL)]
    url: Url,

    /// The store's gRPC url.
    #[arg(long = "store.url", env = ENV_STORE_URL)]
    store_url: Url,

    /// The remote batch prover's gRPC url. If unset, will default to running a prover
    /// in-process which is expensive.
    #[arg(long = "batch-prover.url", env = ENV_BATCH_PROVER_URL)]
    batch_prover_url: Option<Url>,

    /// The remote block prover's gRPC url. If unset, will default to running a prover
    /// in-process which is expensive.
    #[arg(long = "block-prover.url", env = ENV_BLOCK_PROVER_URL)]
    block_prover_url: Option<Url>,

    /// Enables the exporting of traces for OpenTelemetry.
    ///
    /// This can be further configured using environment variables as defined in the official
    /// OpenTelemetry documentation. See our operator manual for further details.
    #[arg(long = "enable-otel", default_value_t = false, env = ENV_ENABLE_OTEL)]
    open_telemetry: bool,
},
}
```
Considering your comment, is there anything to document (here and elsewhere) about the https support?
I think I can solve the issue by making component startup more robust. Right now I wouldn't even know what exactly would work beyond (sometimes) discarding the protocol.
Once startup is robust it should be able to handle https as well, even if we ourselves never use it for performance/simplicity reasons.
Yeah it's unclear to me as well; but I think I prefer the consistency of it all. That it's always
Allow each server to start even if a downstream server is offline, which would improve this endpoint situation. Though the solution given in the issue with
We can also now do #735 to configure the block-producer further.
This PR replaces the node's configuration via `toml` files with pure `clap` cli args. We retain the ability to configure via a file using environment variables instead.
Left undone
Documentation in readme. I'll skip the readme and add it to the operator docs in a separate PR.
The faucet still uses a `toml` file. This file is re-used when creating a new account and when running the faucet. We may want to relook at the expected flow here but I didn't want to fiddle with it.
New flow
I've swapped around the subcommand structure so that it's now `miden-node <component> <action>`. This was mostly to "organise" the store commands but I can imagine having additional subcommands for rpc and block-producer in the future.
I wanted to change up the store so that `miden-node store bootstrap` would also create the database and seed it with the genesis data. In this case the `genesis.dat` would become ephemeral (though it can be brought back if we decide we need it). `miden-node store start` would now never create the db, but instead expect it to exist. I also want to move database migration into a separate subcommand (and not automatically do it on start). I decided not to do this in this PR as it was too much churn.
Simplifications
I've replaced the various store filepaths and directory paths with a single `data_directory` which will contain the database, blockstore and `genesis.dat`. I don't see a good reason to modify these at present.
I've also simplified `node start` by making the internal gRPC servers (block-producer & store) use `localhost:0` which assigns a random open port. This means these no longer need to be configured at all. We can somewhat trivially make this overrideable but I don't see the point. If you want that, spawn the components separately imo :)
I'll also note that I was tempted to make the components more robust to startup failures, but it seems like `tonic` in general does not handle connection failures in many cases - meaning we cannot trivially loop/retry until all required components are ready. This may take a bit more work to achieve.
Usage