Monday, 4 March 2013

To Remove Or Not To Remove - That Is The Question

During the course of standard compound curation, I come across problematic inorganic compounds. Cisplatin and Transplatin are one example. These compounds differ only in the orientation of their complex bonds, but complex bonds cannot be drawn in a standard molfile without causing InChI issues. At the moment, they are kept separate by showing standard bonds between the Pt, Cl and NH3 in Cisplatin, while the bonds have been removed altogether for Transplatin. This is neither an ideal situation nor an accurate structural representation.
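To make the problem concrete, here is a toy sketch in Python. It is not how the molfile format, InChI or ChEMBL actually works; the tuples and both helper functions are hypothetical names invented purely for illustration. It assumes only the well-known geometry: in Cisplatin the two Cl ligands sit next to each other around the square-planar Pt, and in Transplatin they sit opposite each other.

```python
from collections import Counter

# Ligands listed in order around the square plane of Pt (positions 0..3).
# Positions 0/2 and 1/3 are opposite (trans) to each other.
# N stands for the nitrogen of each NH3 ligand.
CISPLATIN = ("Cl", "Cl", "N", "N")
TRANSPLATIN = ("Cl", "N", "Cl", "N")


def connectivity_key(ligands):
    """Hypothetical key holding only what a plain bond list records:
    which atoms are attached to Pt, with no geometry."""
    return tuple(sorted(Counter(ligands).items()))


def geometry_key(ligands):
    """Hypothetical geometry-aware key: the two pairs of ligands
    that sit opposite each other across the Pt centre."""
    pairs = [tuple(sorted((ligands[i], ligands[i + 2]))) for i in range(2)]
    return tuple(sorted(pairs))


# Connectivity alone cannot tell the two isomers apart.
print(connectivity_key(CISPLATIN) == connectivity_key(TRANSPLATIN))

# Recording which ligands are trans to each other separates them.
print(geometry_key(CISPLATIN) == geometry_key(TRANSPLATIN))
```

With connectivity alone, the two compounds collapse into the same record, which is why something extra (here, the opposite-ligand pairs) would be needed to keep them apart.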

Another example is the compound, below left, and how it should look as a complex, right, from the paper:

At the moment, there are approximately 1,800 cases like this, which account for only 0.15% of the entire ChEMBL compound set.

What we are proposing to do is to remove the structures for these complex compounds and to keep only their names and all of the associated biological data. This would then treat them in a similar way to the antibodies and large peptides that we store in ChEMBL.

So, we have set up a private online Doodle poll for you, our users, to vote on whether we should remove the structures and keep the biological data, or leave them as they are.

Noel O'Boyle said...

Hi Louisa, you didn't say exactly what the problem is. Is it not possible to use the InChI software to generate different InChIs for cis- and trans-platin? The "don't disconnect metals" option (/RecMet) is worth trying.

BTW, the link to the compound above is a different structure than shown in the image.

Louisa said...

Hi Noel

No, you can't generate different InChIs for trans and cisplatin.

The structures shown were taken from the original paper and ChEMBL. They were used to illustrate how covalent bonds are being used to show complex compounds where coordinate (dative) bonds should be used. Unfortunately, coordinate bonds do not give an acceptable InChI.
In the case shown in the post, covalent bonds have been used in place of coordinate bonds, and so the NH3 group and NH2R group have each lost a hydrogen so as not to carry a charge.
As there are only a few cases of this, we would like to remove the structure but keep the data. That way, we are not storing compounds with 'incorrect' bond types.