Rgroup Query Tutorial

Introduction

JChemPaint allows you to draw, save and retrieve "Rgroup queries". Rgroup queries are defined by the Symyx RGfile format; more details can be found in the "CTfile Formats" manual, downloadable from www.symyx.com.

In essence, an Rgroup query allows you to define multiple molecules in one data structure by combining different substituent attachments with one root structure (or scaffold). A JChemPaint screenshot may help to clarify this:



This (simple) example contains a root structure with one Rgroup labeled "R1" in pink. For R1, three substituents are defined and drawn under the root structure. Three possible molecules can thus be derived by substituting R1 with one of these.
The substituents are to be attached to the root by replacing R1 with the substituent atom marked with an asterisk. It will connect to the root using the bond that is also marked with an asterisk.
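
To make the idea of "multiple molecules in one data structure" concrete, here is a minimal sketch in Python. It is not how JChemPaint or CDK represent Rgroup queries internally: structures are plain placeholder strings, the names root, substituents and enumerate_configurations are made up for illustration, and attachment points and advanced logic are ignored.

```python
from itertools import product

# Hypothetical, simplified model: structures are placeholder strings,
# not real molecules. "[R1]" stands for the R1 pseudo atom in the root.
root = "scaffold-[R1]"
substituents = {1: ["subA", "subB", "subC"]}  # R-group number -> alternatives


def enumerate_configurations(root, substituents):
    """Return one string per combination of substituent choices."""
    numbers = sorted(substituents)
    results = []
    # Every configuration picks exactly one alternative per R-group number.
    for choice in product(*(substituents[n] for n in numbers)):
        molecule = root
        for n, sub in zip(numbers, choice):
            molecule = molecule.replace(f"[R{n}]", sub)
        results.append(molecule)
    return results


configs = enumerate_configurations(root, substituents)
print(len(configs))
```

With one Rgroup and three substituents this yields three configurations, matching the example. With several Rgroups, the number of configurations in this simplified model is the product of the number of alternatives per Rgroup.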

Using the "R-groups" menu, you can let JChemPaint generate the possible configurations for any Rgroup query and save these to an SDF file. For the example above, this would result in the following configurations (BioClipse screenshot):






Drawing R-groups

To draw R-groups, several JChemPaint menus are relevant. This section explains how to create Rgroups using these menus.

Define the root

The suggested way to construct Rgroup queries is to first draw all the structures you need, and then assign Rgroup aspects to these. Going back to the earlier example, you would first draw the four structures that make up the query. The R1 atom is a so-called "pseudo atom". To draw it, first draw a normal atom (say, Carbon), then right click it and pick Pseudo Atoms->R1 from the atom menu.
Next, use a selection tool to select the entire structure that is intended to become the root structure. With the selection made, right click the canvas and pick R-groups->Define as Root Structure. See the picture below for an illustration.




Define the substituents

Once a root structure has been chosen, the remaining structures in the drawing are flagged as "Not in R-Group". To continue, select each structure that is to become a substituent, and choose R-groups->Define as Substituent. JChemPaint will then prompt you to "Enter an R-group number". In our example, we'll enter 1. In general, you can pick any number that corresponds to an R1...R32 atom in your root structure. In this way, you link the substituent to the specific Rgroup of your choice.
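
The rule for valid R-group numbers can be sketched as a small check. This is an illustration only: the function names are hypothetical, the root is reduced to a list of atom labels, and whether JChemPaint performs exactly this validation is an assumption.

```python
import re


def rgroup_numbers_in_root(atom_labels):
    """Collect the numbers of all R1...R32 pseudo atoms among the atom labels."""
    numbers = set()
    for label in atom_labels:
        match = re.fullmatch(r"R(\d+)", label)
        if match and 1 <= int(match.group(1)) <= 32:
            numbers.add(int(match.group(1)))
    return numbers


def can_assign(number, atom_labels):
    """A substituent can only be linked to an R-atom that exists in the root."""
    return number in rgroup_numbers_in_root(atom_labels)


# Simplified root from the example: a few carbons and one R1 pseudo atom.
example_root = ["C", "C", "C", "R1"]
print(can_assign(1, example_root), can_assign(2, example_root))
```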

JChemPaint will randomly pick a "connection point" atom on a newly declared substituent and flag this atom with an asterisk. If your Rgroup atom has multiple bonds connecting it to the root, it will also pick a second attachment point and flag this with a double quote. The asterisk and quote on the substituent correspond to the similarly flagged bonds on the root. The RGfile format allows at most two attachment points/bonds to be identified per Rgroup.
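
The default choice of attachment points can be sketched as follows. This is a rough illustration, not JChemPaint's actual code: the function name and arguments are hypothetical, atoms are identified by their index in a list, and the sketch assumes that the two attachment points fall on different atoms.

```python
import random

# Flags used in the drawing for the first and second attachment point.
MARKERS = {1: "*", 2: '"'}


def pick_attachment_points(substituent_atoms, bonds_on_r_atom, rng=random):
    """Randomly map attachment order (1 or 2) to an atom index."""
    if not 1 <= bonds_on_r_atom <= 2:
        raise ValueError("an Rgroup supports one or two attachment points")
    if bonds_on_r_atom > len(substituent_atoms):
        raise ValueError("not enough atoms for distinct attachment points")
    # Assumption: each attachment point is placed on a different atom.
    chosen = rng.sample(range(len(substituent_atoms)), bonds_on_r_atom)
    return {order: index for order, index in enumerate(chosen, start=1)}


points = pick_attachment_points(["C", "O", "N"], 2, random.Random(0))
for order, index in points.items():
    print(f"atom {index} is flagged with {MARKERS[order]}")
```

Because the pick is random, you will often want to correct it by hand afterwards.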

You are allowed to change the attachment atom(s) in the substituents and the attachment bond(s) in the root. For example, to make another atom in a substituent the attachment point, select it and right click it. From the atom popup menu, select R-Group attachment->Set as first attachment point (or Set as second attachment point). Similarly, you can pick a bond in the root structure that connects to an R-atom and right click it to define its attachment using the bond popup menu.

Advanced Rgroup logic

With the Rgroup query in place, the RGfile format allows you to define some more advanced logic on top of it. With JChemPaint this can be done using menu option R-groups->Advanced R-group logic. For each group you can indicate:

Saving Rgroups

When you save the drawing, choose the "MDL MOL file" format as the file type. There is no exclusive file extension reserved for RGfiles.
JChemPaint will ask whether you want to retain the RGroup information. Choose Yes; if you do not, the drawing will be saved as a regular MOL file instead of an RGroup query (extended) MOL file.
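
Since the file extension does not tell you whether a saved file is an RGfile, you may want to check its content. The sketch below assumes, based on the layout described in the "CTfile Formats" manual, that an RGfile starts with a "$MDL" header line, whereas a regular MOL file starts with a free-text title line. The function name is hypothetical, and you should verify the assumption against files written by your JChemPaint version.

```python
def looks_like_rgfile(text):
    """Return True if the first line starts with the '$MDL' marker (assumed RGfile header)."""
    lines = text.splitlines()
    return bool(lines) and lines[0].startswith("$MDL")


# Shortened, made-up file beginnings for illustration only.
rgfile_start = "$MDL  REV  1\n$MOL\n$HDR\n"
molfile_start = "my molecule\n  CDK\n\n"
print(looks_like_rgfile(rgfile_start), looks_like_rgfile(molfile_start))
```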

Generating configurations

To confirm the correctness of your RGroup query specification, you can let JChemPaint/CDK generate all its possible configurations into an SDfile. This can be done with menu option R-groups->Generate possible configurations. You can use a tool like BioClipse to browse the content of the SDfile.
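
If you only want a quick sanity check, you can also count the records in the generated SDfile. The sketch below relies on the SDfile convention that each record ends with a line containing "$$$$". The function name and the sample text are made up for illustration.

```python
def count_sdf_records(text):
    """Count SDfile records by counting the '$$$$' record delimiter lines."""
    return sum(1 for line in text.splitlines() if line.strip() == "$$$$")


# Made-up stand-in for an SDfile with three (empty) records.
sample = "record one\n$$$$\nrecord two\n$$$$\nrecord three\n$$$$\n"
print(count_sdf_records(sample))
```

For the example query in this tutorial you would expect three records, one per substituent. To check a real file, read it into a string first and pass that string to the function.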