A Python package for generating 3D structures of organometallic complexes.
Before using MetalloGen, the following must be prepared.
MetalloGen delegates energy and geometry calculations to an external quantum chemistry (QC) backend.
Currently supported binaries are:
- Gaussian (`g09` or `g16`)
- ORCA (`orca`)
- xTB (`xtb`, for the default `xtb_gaussian` workflow)
Make sure these executables are on your PATH. For example:
>> which g09
/appl/g09.shchoi/G09Files/g09/g09
>> which g16
/appl/g16.shchoi/G16Files/g16/g16
>> which orca
/appl/orca_6_0_1_linux_x86-64_shared_openmpi416/orca
>> which xtb
/home/rxn_grp/programs/xtb

By default, MetalloGen uses the xtb_gaussian method (--calculator xtb_gaussian), which couples xTB with Gaussian via the xtb-gaussian wrapper.
- You can obtain `xtb-gaussian` from the Aspuru-Guzik Group GitHub repository.
- After installation, set the environment variable `xtbbin` to the `xtb-gaussian` executable:

>> export xtbbin="/home/rxn_grp/programs/xtb-gaussian"

- Verify:

>> echo $xtbbin
/home/rxn_grp/programs/xtb-gaussian

Note: When `--calculator xtb_gaussian` (the default) is used, `xtb`, a Gaussian binary (`g09` or `g16`), and the `xtbbin` environment variable must all be correctly configured.
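The requirements above can be checked before launching a run. The following is a minimal sketch, not part of MetalloGen itself: the `REQUIREMENTS` table and the `missing_requirements` helper are hypothetical names introduced here, and the table only encodes what this README states (`xtb`, `g09` or `g16`, and `xtbbin` for `xtb_gaussian`; `orca` alone for the ORCA backend).

```python
import os
import shutil

# Requirements per calculator, as stated in this README.
# Each tuple in "executables" lists alternatives; any one of them suffices.
REQUIREMENTS = {
    "xtb_gaussian": {"executables": [("xtb",), ("g09", "g16")], "env": ["xtbbin"]},
    "orca": {"executables": [("orca",)], "env": []},
}


def missing_requirements(calculator, which=shutil.which, environ=None):
    """Return a list of human-readable problems; an empty list means ready."""
    environ = os.environ if environ is None else environ
    spec = REQUIREMENTS[calculator]
    problems = []
    for alternatives in spec["executables"]:
        # An executable group is satisfied if any alternative is on PATH.
        if not any(which(name) for name in alternatives):
            problems.append("not on PATH: " + " or ".join(alternatives))
    for variable in spec["env"]:
        # An unset or empty variable counts as missing.
        if not environ.get(variable):
            problems.append("environment variable not set: " + variable)
    return problems


if __name__ == "__main__":
    for calculator in REQUIREMENTS:
        issues = missing_requirements(calculator)
        print(calculator, "OK" if not issues else issues)
```

The `which` and `environ` parameters exist only so the check can be exercised without touching the real `PATH`. The sketch confirms that the names resolve, not that the binaries are licensed or functional.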
MetalloGen can also use ORCA directly as the QC backend:
- Set `--calculator orca` (or `-c orca`).
- Only `orca` is required on your `PATH`; `xtbbin` is not needed for this mode.
Example:
>> which orca
/home/rxn_grp/programs/orca/orca

# Clone the repository
>> git clone https://github.com/kyunghoonlee777/MetalloGen.git
>> cd MetalloGen
# Create environment
>> conda create -n metallogen python=3.9 -y
>> conda activate metallogen
# Install MetalloGen (editable mode)
>> pip install -e .

MetalloGen can be executed with two types of inputs:
- m-SMILES representation (modified SMILES for mononuclear coordination complexes)
- MOL/SDF files containing predefined molecular structures
Internally, MetalloGen uses a calculator backend selected via --calculator / -c:
- `xtb_gaussian` (default): wrapper using the `xtb-gaussian` script (xTB + Gaussian).
- `orca`: direct ORCA calculations.
MetalloGen uses a modified SMILES representation called m-SMILES as input. From an m-SMILES string, MetalloGen generates the corresponding 3D conformers.
The m-SMILES representation encodes:
- the metal center (e.g., `[Zr+4]`)
- the ligands as SMILES strings separated by vertical bars (`|`)
- the coordination geometry (e.g., `5_trigonal_bipyramidal`)
Donor atoms directly coordinated to the metal are specified with square brackets, and coordination sites are indicated with atom mapping numbers (for example, [Cl-:2] means a chloro ligand bound at coordination site 2).
This makes it straightforward to encode polydentate and polyhapto ligands while preserving coordination geometry and stereochemistry.
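To make the layout concrete, here is a small sketch that splits an m-SMILES string into its parts and lists the coordination sites each ligand occupies. It is not MetalloGen's own parser: `split_msmiles` is a hypothetical helper, and it assumes the field order seen in this README's example (metal first, geometry label last, ligands in between).

```python
import re

# Matches an atom-map number inside a bracket atom, e.g. the 2 in [Cl-:2].
SITE_PATTERN = re.compile(r":(\d+)\]")


def split_msmiles(msmiles):
    """Split an m-SMILES string into metal, ligands, sites, and geometry.

    Assumed layout (taken from the README example, not from the source code):
    metal | ligand | ... | geometry
    """
    fields = msmiles.split("|")
    if len(fields) < 3:
        raise ValueError("expected a metal, at least one ligand, and a geometry")
    metal, ligands, geometry = fields[0], fields[1:-1], fields[-1]
    # For each ligand, collect the distinct coordination sites it occupies.
    sites = [
        sorted({int(number) for number in SITE_PATTERN.findall(ligand)})
        for ligand in ligands
    ]
    return {"metal": metal, "ligands": ligands, "sites": sites, "geometry": geometry}
```

A ligand whose atoms share one map number, such as a cyclopentadienyl ring with every carbon mapped to site 4, yields a single site. A ligand with several distinct map numbers occupies several sites.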
Example (m-SMILES input with default calculator):
metallogen \
-s "[Zr+4]|[Cl-:2]|[Cl-:3]|[N:1]1=C(C[C-:4]2[CH:4]=[CH:4][CH:4]=[CH:4]2)C=CC=C1(C[C-:5]3[CH:5]=[CH:5][CH:5]=[CH:5]3)|5_trigonal_bipyramidal" \
-wd <WORKING DIRECTORY> \
-sd <SAVE DIRECTORY> \
-r 1 \
-nc 1 \
-c xtb_gaussian

The generated 3D conformer corresponding to the m-SMILES input is shown below:
In some cases—such as benchmarking with CSD (Cambridge Structural Database)—obtaining an m-SMILES representation can be challenging or impractical.
For these situations, MetalloGen can directly take MOL or SDF files as input via the -id flag. This allows seamless use of existing 3D structures extracted from databases.
As an example, consider a complex extracted from the CSD with refcode CIXDAS.
The corresponding 3D structure (in SDF format) can be provided directly to MetalloGen:
Example (SDF input using ORCA):
metallogen \
-id <INPUT DIRECTORY> \
-wd <WORKING DIRECTORY> \
-sd <SAVE DIRECTORY> \
-r 1 \
-nc 1 \
-c orca

MetalloGen successfully generates well-formed conformers from such SDF inputs as well:
When running MetalloGen, two types of output are generated:
- For each input structure, the number of conformers specified by `--num_conformer` / `-nc` is generated.
- These conformers differ by the initial embedding conditions used in the generation procedure.
- Each conformer is saved in the directory specified by `--save_directory` / `-sd` as `result_{i}.xyz` (where `i = 0, 1, 2, ...`).
- The XYZ files contain full 3D coordinates of the metal complex and can be directly opened in standard molecular viewers.
MetalloGen calls the selected QC backend (xtb_gaussian or orca) during generation and relaxation:
- Working directory (`--working_directory` / `-wd`):
  - Intermediate input and output files for the calculator are written here as scratch.
  - This typically includes calculator-specific files such as Gaussian or ORCA input/output and any temporary files created during optimization steps.
  - You can inspect these files for debugging or detailed analysis; they can also be cleaned up after the run.
- Save directory (`--save_directory` / `-sd`):
  - If `--final_relax` / `-r` is set to `1` (default), both the final relaxation input files and corresponding log/output files are saved alongside the final conformer XYZ files.
  - This means that, for each conformer, the final 3D structure (`result_{i}.xyz`) and its QC calculation logs live together in one place, making it easy to track which calculation produced which structure.
In summary:
- Geometry & coordinates: `result_{i}.xyz` in `--save_directory`
- QC scratch/intermediate files: in `--working_directory`
- Final QC input/output for relaxed structures: in `--save_directory` (when `-r 1`)
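For post-processing, the conformer files can be collected and read with a few lines of Python. This is a sketch under the naming scheme described above (`result_{i}.xyz`) and the standard XYZ layout (atom count, comment line, then one atom per line). The helper names `list_conformers` and `read_xyz` are hypothetical and not part of MetalloGen.

```python
import re
from pathlib import Path

# File names of the form result_{i}.xyz, capturing the index i.
RESULT_NAME = re.compile(r"result_(\d+)\.xyz$")


def list_conformers(save_directory):
    """Return result_{i}.xyz paths in the save directory, sorted by index i."""
    indexed = []
    for path in Path(save_directory).iterdir():
        match = RESULT_NAME.match(path.name)
        if match:
            indexed.append((int(match.group(1)), path))
    # Sort numerically so that result_10.xyz comes after result_2.xyz.
    return [path for _, path in sorted(indexed, key=lambda item: item[0])]


def read_xyz(path):
    """Parse a standard XYZ file into (symbols, coordinates)."""
    lines = Path(path).read_text().splitlines()
    n_atoms = int(lines[0].split()[0])
    symbols, coordinates = [], []
    # Line 0 is the atom count and line 1 is a comment; atoms start at line 2.
    for line in lines[2:2 + n_atoms]:
        symbol, x, y, z = line.split()[:4]
        symbols.append(symbol)
        coordinates.append((float(x), float(y), float(z)))
    return symbols, coordinates
```

Other files in the save directory, such as the final QC inputs and logs written when `-r 1` is set, are ignored by the file-name pattern.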
The following options are available:
| Argument | Short | Type | Default | Description |
|---|---|---|---|---|
| `--smiles` | `-s` | `str` | `None` | Input m-SMILES string |
| `--input_directory` | `-id` | `str` | `None` | Input SDF/MOL file directory (all files in the directory are processed) |
| `--working_directory` | `-wd` | `str` | `None` | Scratch directory for running quantum chemical calculations |
| `--save_directory` | `-sd` | `str` | `None` | Directory to save final conformers and, optionally, final QC inputs/logs |
| `--final_relax` | `-r` | `int` | `1` | Whether to perform final relaxation after generation (0 = no, 1 = yes) |
| `--num_conformer` | `-nc` | `int` | `1` | Number of conformers to generate for each input |
| `--calculator` | `-c` | `str` | `xtb_gaussian` | Calculator backend to use: `xtb_gaussian` (default, uses xtb-gaussian + Gaussian) or `orca` (direct ORCA) |
If `--calculator` is omitted, MetalloGen uses `xtb_gaussian` by default.
To use ORCA, specify `-c orca` and ensure `orca` is available on `PATH`.
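When driving MetalloGen from a script, the options above can be assembled into an argument list. The sketch below uses only the long flags and defaults documented in the table. `build_command` is a hypothetical helper, and the check that exactly one of `--smiles` and `--input_directory` is given is an assumption made here, not documented behavior.

```python
def build_command(working_directory, save_directory, smiles=None,
                  input_directory=None, final_relax=1, num_conformer=1,
                  calculator="xtb_gaussian"):
    """Assemble a metallogen argument list from the documented options."""
    # Assumption: each run takes exactly one input source.
    if (smiles is None) == (input_directory is None):
        raise ValueError("give exactly one of smiles or input_directory")
    if calculator not in ("xtb_gaussian", "orca"):
        raise ValueError("unknown calculator: " + calculator)
    command = ["metallogen"]
    if smiles is not None:
        command += ["--smiles", smiles]
    else:
        command += ["--input_directory", str(input_directory)]
    command += [
        "--working_directory", str(working_directory),
        "--save_directory", str(save_directory),
        "--final_relax", str(final_relax),
        "--num_conformer", str(num_conformer),
        "--calculator", calculator,
    ]
    return command
```

The returned list can be passed to `subprocess.run(command, check=True)`. Passing a list rather than a shell string avoids quoting problems with the `|` and bracket characters in m-SMILES.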
Please cite as follows:

Kyunghoon Lee, Shinyoung Park, Minseong Park, and Woo Youn Kim. "MetalloGen: Automated 3D Conformer Generation for Diverse Coordination Complexes." Journal of Chemical Information and Modeling 65 (2025): 11878–11891.
This project is licensed under the BSD 3-Clause License.
For questions, issues, or collaboration, please contact:
- Kyunghoon Lee - kyunghoonlee@kaist.ac.kr
- Minseong Park - pms131131@kaist.ac.kr



