CReM is an open-source framework for fragment-based generation of chemical structures. The idea is similar to matched molecular pairs: if two fragments occur in identical contexts, they can replace each other to produce new structures that are chemically valid and likely to be synthetically feasible.
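The replacement idea can be illustrated with a toy sketch. It does not use CReM's API or RDKit; plain strings stand in for chemical contexts and fragments, `[*]` marks an attachment point, and the names `observed`, `build_index`, and `replacements` are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: (context, fragment) pairs as they might be collected
# by fragmenting known molecules. Strings stand in for real chemical
# environments; "[*]" marks the attachment point.
observed = [
    ("c1ccccc1[*]", "[*]OC"),
    ("c1ccccc1[*]", "[*]N"),
    ("c1ccccc1[*]", "[*]Cl"),
    ("CC(=O)[*]", "[*]N"),
    ("CC(=O)[*]", "[*]O"),
]


def build_index(pairs):
    """Group fragments by the context in which they were observed."""
    index = defaultdict(set)
    for context, fragment in pairs:
        index[context].add(fragment)
    return index


def replacements(index, context, fragment):
    """Return fragments seen in the same context, excluding the current one."""
    return sorted(index.get(context, set()) - {fragment})


index = build_index(observed)
print(replacements(index, "c1ccccc1[*]", "[*]OC"))  # ['[*]Cl', '[*]N']
```

A fragment whose context was never observed yields no replacements, which is what restricts the output to substitutions already seen in known molecules.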
Features:
Implementation features:
| DB link | Source | Molecule filters | Fragment filters | Size |
|---|---|---|---|---|
| chembl22_sa2.db.gz | ChEMBL22 | | maximum number of heavy atoms 20 | 350.2 MB |
| chembl22_sa2_hac12.db.gz | ChEMBL22 | | maximum number of heavy atoms 12 | 158.9 MB |
| chembl22_sa25_hac12.db.gz | ChEMBL22 | | maximum number of heavy atoms 12 | 886.8 MB |
| chembl33_sa2_f5.db.gz | ChEMBL33 | | maximum number of heavy atoms 15 | 281.3 MB |
| chembl33_sa25_f5.db.gz | ChEMBL33 | | maximum number of heavy atoms 15 | 1.9 GB |
| chembl33_f5.db.gz | ChEMBL33 | | maximum number of heavy atoms 15 | 6.6 GB |
| enamine2025_sa2_f5.db.gz | Enamine stock 2025 | | maximum number of heavy atoms 15 | 776.9 MB |
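The databases are distributed as `.db.gz` archives, so they need to be decompressed before use. The sketch below uses only the Python standard library and assumes that the decompressed file is an SQLite database; it lists the tables without assuming any particular schema. The helper names `unpack_db` and `list_tables` are hypothetical.

```python
import gzip
import shutil
import sqlite3
from pathlib import Path


def unpack_db(gz_path):
    """Decompress `name.db.gz` next to itself and return the path to `name.db`."""
    gz_path = Path(gz_path)
    db_path = gz_path.with_suffix("")  # strips the trailing ".gz"
    with gzip.open(gz_path, "rb") as src, open(db_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return db_path


def list_tables(db_path):
    """Return the names of all tables in an SQLite file (assumed format)."""
    con = sqlite3.connect(str(db_path))
    try:
        rows = con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        con.close()
    return [name for (name,) in rows]


if __name__ == "__main__":
    archive = Path("chembl22_sa2.db.gz")  # any file name from the table above
    if archive.exists():
        print(list_tables(unpack_db(archive)))
```

Streaming with `shutil.copyfileobj` avoids loading a multi-gigabyte archive into memory at once.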

| task | SMILES LSTM* | SMILES GA* | Graph GA* | Graph MCTS* | CReM |
|---|---|---|---|---|---|
| Celecoxib rediscovery | 1.000 | 0.732 | 1.000 | 0.355 | 1.000 |
| Troglitazone rediscovery | 1.000 | 0.515 | 1.000 | 0.311 | 1.000 |
| Thiothixene rediscovery | 1.000 | 0.598 | 1.000 | 0.311 | 1.000 |
| Aripiprazole similarity | 1.000 | 0.834 | 1.000 | 0.380 | 1.000 |
| Albuterol similarity | 1.000 | 0.907 | 1.000 | 0.749 | 1.000 |
| Mestranol similarity | 1.000 | 0.790 | 1.000 | 0.402 | 1.000 |
| C11H24 | 0.993 | 0.829 | 0.971 | 0.410 | 0.966 |
| C9H10N2O2PF2Cl | 0.879 | 0.889 | 0.982 | 0.631 | 0.940 |
| Median molecules 1 | 0.438 | 0.334 | 0.406 | 0.225 | 0.371 |
| Median molecules 2 | 0.422 | 0.380 | 0.432 | 0.170 | 0.434 |
| Osimertinib MPO | 0.907 | 0.886 | 0.953 | 0.784 | 0.995 |
| Fexofenadine MPO | 0.959 | 0.931 | 0.998 | 0.695 | 1.000 |
| Ranolazine MPO | 0.855 | 0.881 | 0.920 | 0.616 | 0.969 |
| Perindopril MPO | 0.808 | 0.661 | 0.792 | 0.385 | 0.815 |
| Amlodipine MPO | 0.894 | 0.722 | 0.894 | 0.533 | 0.902 |
| Sitagliptin MPO | 0.545 | 0.689 | 0.891 | 0.458 | 0.763 |
| Zaleplon MPO | 0.669 | 0.413 | 0.754 | 0.488 | 0.770 |
| Valsartan SMARTS | 0.978 | 0.552 | 0.990 | 0.040 | 0.994 |
| Deco Hop | 0.996 | 0.970 | 1.000 | 0.590 | 1.000 |
| Scaffold Hop | 0.998 | 0.885 | 1.000 | 0.478 | 1.000 |
| total score | 17.341 | 14.398 | 17.983 | 9.011 | 17.919 |
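
The total score row is consistent with a plain sum of the 20 per-task scores. A quick check for the CReM column, with the values copied from the table above (the variable names are only illustrative):

```python
# Per-task scores for the CReM column, copied from the table above.
crem_scores = {
    "Celecoxib rediscovery": 1.000,
    "Troglitazone rediscovery": 1.000,
    "Thiothixene rediscovery": 1.000,
    "Aripiprazole similarity": 1.000,
    "Albuterol similarity": 1.000,
    "Mestranol similarity": 1.000,
    "C11H24": 0.966,
    "C9H10N2O2PF2Cl": 0.940,
    "Median molecules 1": 0.371,
    "Median molecules 2": 0.434,
    "Osimertinib MPO": 0.995,
    "Fexofenadine MPO": 1.000,
    "Ranolazine MPO": 0.969,
    "Perindopril MPO": 0.815,
    "Amlodipine MPO": 0.902,
    "Sitagliptin MPO": 0.763,
    "Zaleplon MPO": 0.770,
    "Valsartan SMARTS": 0.994,
    "Deco Hop": 1.000,
    "Scaffold Hop": 1.000,
}

# Round to three decimals to match the precision used in the table.
total = round(sum(crem_scores.values()), 3)
print(total)  # 17.919
```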

Polishchuk, P. CReM: chemically reasonable mutations framework for structure generation. Journal of Cheminformatics 2020, 12 (1), 28. https://doi.org/10.1186/s13321-020-00431-w