Nematodes or roundworms are the most abundant metazoans on earth and are found in all conceivable habitats (Blumenthal and Davis, 2004; Bardgett and van der Putten, 2014). They are commonly classified as marine or terrestrial, free-living or animal/human/plant-parasitic (Lambshead, 2004). In order to perceive the environment and seek their hosts in soil or other habitats, nematodes depend on a very well-developed chemosensory mechanism (Curtis, 2008; Reynolds et al., 2011). The nematode
Olfaction, as a part of chemoreception, uses water-soluble volatile odorants or gaseous chemical cues which bind to various sensory receptors, including G protein-coupled receptors (GPCRs). GPCRs, also known as seven-transmembrane (7-TM) domain receptors, are the largest family and most diverse group of membrane receptors in eukaryotes, including EPNs (Elrod and Chou, 2002; Audebrand et al., 2020). Most GPCRs are 200–1000 amino acids long (Bresso et al., 2019). They are found within the lipid-protein bilayer of the cell membrane and are responsible for regulating communication between the cell and its surroundings (Fig. 1) (Schiöth and Fredriksson, 2005; Sonnabend et al., 2017; Insel et al., 2019). In addition, it has been suggested that they are present in cell organelles and inside the nucleus. There are hundreds of different GPCRs, which can bind to a diverse set of ligands, including peptide hormones, neurotransmitters, neuropeptides, biogenic amines, amino acids, ions, lipid-derived mediators, peptides, proteins, and odor molecules (Gether, 2000; Schiöth and Fredriksson, 2005; Hilger et al., 2018; Sriram and Insel, 2018), and they are responsible for vision, olfaction, taste, and more (Rosenbaum et al., 2009). Nematode chemosensory GPCRs (NemChRs) are unique to nematodes and are involved in the detection and reception of odor molecules, as their ligand binding sites are located on cell surfaces and are accessible to sensory molecules (Robertson, 2006; Krishnan et al., 2014). This sensory perception is processed downstream through different signalling pathways, leading to modulation of nematode behavior (Bargmann et al., 1993; Bargmann, 2006). Silencing of these downstream effectors inhibits nematode response, activation, and movement towards host-emitted cues (Gang et al., 2020; Wheeler et al., 2020). The GPCRs are involved in controlling the movement of EPNs towards (or within) their host (Bresso et al., 2019). Recently, Bernot et al. (2020) and Wheeler et al. (2020) reported the involvement of NemChRs in host-seeking behaviour.
Genome sequencing projects have made it possible for researchers to identify genes of interest in their favorite organisms.
In order to predict putative GPCRs in
The initial step was to filter the large number of protein sequences based on their length. The proteomic dataset (Bai et al., 2013; Mclean et al., 2018) of
Seven transmembrane sequences, detected by any of the three transmembrane detectors, were then subjected to four different GPCR prediction tools to identify putative GPCRs from those sequences. These were GPCRHMM (
The outcomes of all the bioinformatic tools were then compiled and visualized using the Venny 2.1 software (
Apart from predicting GPCRs, GPCRPred was also used to determine the families and subfamilies of GPCRs based on the dipeptide composition of proteins. It categorizes GPCRs into class A (rhodopsin-like), B (secretin and adhesion), C (metabotropic glutamate), D (fungal pheromone receptors), E (cAMP receptors), or F (frizzled), as per the International Union of Basic and Clinical Pharmacology (IUPHAR) nomenclature (
All GPCR candidates predicted by the pipeline were analyzed by another alignment-free method, GPCR-CA (
In order to validate the pipeline, the putative GPCRs were then checked for the presence of an extracellular N terminus and an intracellular C terminus using HMMTOP2 and TMHMM2.
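The topology check described above can be illustrated with a minimal sketch. It assumes a TMHMM-style topology string in which "o" marks an extracellular (outside) segment, "i" an intracellular (inside) segment, and "start-end" pairs mark transmembrane helices; the function names and the example strings are hypothetical and are not taken from the study, and HMMTOP2 output would need its own parser.

```python
import re


def parse_topology(topology):
    """Parse a TMHMM-style topology string such as 'o10-32i45-67o'.

    Returns (n_terminus_side, c_terminus_side, helices), where each side is
    'o' (outside) or 'i' (inside) and helices is a list of (start, end).
    The string format is an assumption about the detector output.
    """
    helices = [(int(s), int(e)) for s, e in re.findall(r"(\d+)-(\d+)", topology)]
    sides = re.findall(r"[io]", topology)
    if not sides:
        raise ValueError(f"no side labels found in topology: {topology!r}")
    return sides[0], sides[-1], helices


def looks_like_gpcr_topology(topology):
    """True if the N terminus is outside, the C terminus inside, with 7 helices."""
    n_side, c_side, helices = parse_topology(topology)
    return n_side == "o" and c_side == "i" and len(helices) == 7


# Hypothetical example strings (not real predictions from the dataset).
seven_tm = "o30-52i64-86o101-123i144-166o190-212i233-255o270-292i"
three_tm = "i7-29o44-66i87-109o"
print(looks_like_gpcr_topology(seven_tm))  # True
print(looks_like_gpcr_topology(three_tm))  # False
```

Because a helix flips the side of the membrane each time it crosses, a sequence with seven helices and an extracellular N terminus necessarily ends with an intracellular C terminus; checking both ends explicitly simply guards against malformed predictor output.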
We screened all these GPCRs through Pfam (
Putative
PRED-COUPLE2 (
Out of the total 21,699 predicted proteins of
The characterization and classification of GPCR sequences identified through a bioinformatics pipeline using various annotation tools and methods, and their probable coupling specificity with different G-proteins
Hba_04160 | Class A Rhodopsin like | Peptide | G protein-coupled receptor, rhodopsin-like (IPR000276) | 7tm_1 | Gi/o, Gs | |
Hba_07805 | Class A Rhodopsin like | Peptide | G protein-coupled receptor, rhodopsin-like (IPR000276) | 7TM_GPCR_Srw | Serpentine type 7TM GPCR chemoreceptor Srw | |
7TM GPCR, serpentine receptor class w (Srw) (IPR019427) | ||||||
Hba_10668 | Class A Rhodopsin like | Amine | G protein-coupled receptor, rhodopsin-like (IPR000276) | 7tm_1 | 7 transmembrane receptor (rhodopsin family) | Gi/o, Gs |
Hba_12209 | Class A Rhodopsin like | Peptide | G protein-coupled receptor, rhodopsin-like (IPR000276) | 7tm_1 | 7 transmembrane receptor (rhodopsin family) | |
Hba_14891 | Class A Rhodopsin like | Peptide | G protein-coupled receptor, rhodopsin-like (IPR000276) | 7tm_1 | 7 transmembrane receptor (rhodopsin family) | |
Hba_18427 | Class A Rhodopsin like | Peptide | 7TM GPCR, serpentine receptor class w (Srw) (IPR019427) | 7TM_GPCR_Srw | Serpentine type 7TM GPCR chemoreceptor Srw | |
Hba_18743 | Class A Rhodopsin like | Peptide | G protein-coupled receptor, rhodopsin-like (IPR000276) | 7TM_GPCR_Srsx | Serpentine type 7TM GPCR chemoreceptor Srsx | |
Serpentine type 7TM GPCR chemoreceptor Srsx (IPR019424) | ||||||
Hba_18878 | Class A Rhodopsin like | Peptide | G protein-coupled receptor, rhodopsin-like (IPR000276) | 7tm_1 | 7 transmembrane receptor (rhodopsin family) | |
G-protein coupled receptor Aex-2 (IPR039952) | ||||||
Hba_18906 | Class A Rhodopsin like | Lysosphingolipid | None | GpcrRhopsn4 | Rhodopsin-like GPCR transmembrane domain | |
Hba_19080 | Frizzled | Peptide | Frizzled/secreted frizzled-related protein (IPR015526) | Frizzled | Frizzled/Smoothened family membrane region | |
Hba_19161 | Class A Rhodopsin like | Peptide | G protein-coupled receptor, rhodopsin-like (IPR000276) | 7tm_1 | 7 transmembrane receptor (rhodopsin family) | |
Hba_20566 | Secretin-like | Rhodopsin | GPCR, family 2, secretin-like (IPR000832) | 7tm_2 | 7 transmembrane receptor (Secretin family) | |
Hba_03545 | Class A Rhodopsin like | Peptide | Putative G protein-coupled receptor, Chromadorea (IPR040435) | |||
Hba_20096 | Class A Rhodopsin like | Peptide | G protein-coupled receptor, rhodopsin-like (IPR000276) | 7tm_1 | ||
Hba_18203 | Class A Rhodopsin like | Peptide | G protein-coupled receptor, rhodopsin-like (IPR000276) | 7tm_1 | 7 transmembrane receptor (rhodopsin family) | |
Hba_12258 | Class A Rhodopsin like | Peptide | G protein-coupled receptor, rhodopsin-like (IPR000276) | 7tm_1 | 7 transmembrane receptor (rhodopsin family) | |
G-protein coupled receptor Aex-2 (IPR039952) | ||||||
Hba_17528 | Class A Rhodopsin like / Class D Fungal pheromone like | Peptide | None | 7TM_GPCR_Srx | Serpentine type 7TM GPCR chemoreceptor Srx | |
Hba_14446 | Class A Rhodopsin like | Amine | G protein-coupled receptor, rhodopsin-like (IPR000276) | 7tm_1 | 7 transmembrane receptor (rhodopsin family) | |
Hba_09978 | Class A Rhodopsin like | Peptide | G protein-coupled receptor, rhodopsin-like (IPR000276) | 7tm_1 | 7 transmembrane receptor (rhodopsin family) | |
Hba_08130 | Class A Rhodopsin like | Peptide | G protein-coupled receptor, rhodopsin-like (IPR000276) | 7tm_1 | 7 transmembrane receptor (rhodopsin family) | |
Hba_13948 | Class A Rhodopsin like | Amine | G protein-coupled receptor, rhodopsin-like (IPR000276) | 7TM_GPCR_Srx |
The 69 common heptahelical transmembrane sequences were passed through four GPCR prediction tools to identify the GPCRs among them. GPCRHMM, GPCRPipe, GPCR-Pen, and GPCRPred identified 22, 21, 15, and 62 sequences as GPCRs, respectively (Fig. 3B). Interestingly, GPCRHMM, GPCRPipe, and GPCR-Pen did not identify any unique GPCR sequence, whereas GPCRPred identified 36 unique GPCRs not predicted by any other tool. In summary, 21 sequences were considered putative GPCRs, as they were identified by at least three of the four GPCR prediction tools and further confirmed by detailed analysis of the sequences (Fig. 3B, Tables 1–3, supplementary Tables 2–4).
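The "at least three of the four tools" rule can be sketched as a simple vote count over the sets of sequence identifiers returned by each tool. This is only an illustration under stated assumptions: the function names and the identifiers seq_A to seq_D are hypothetical placeholders, not results from the study.

```python
from collections import Counter


def consensus_calls(predictions, min_tools=3):
    """Return IDs predicted as GPCRs by at least `min_tools` of the tools.

    `predictions` maps a tool name to the set of sequence IDs it flagged.
    """
    votes = Counter(seq_id for hits in predictions.values() for seq_id in hits)
    return {seq_id for seq_id, n in votes.items() if n >= min_tools}


def unique_calls(predictions, tool):
    """Return IDs flagged by `tool` and by no other tool."""
    others = set().union(
        *(hits for name, hits in predictions.items() if name != tool)
    )
    return predictions[tool] - others


# Hypothetical toy input; the IDs are placeholders.
toy = {
    "GPCRHMM": {"seq_A", "seq_B", "seq_C"},
    "GPCRPipe": {"seq_A", "seq_B"},
    "GPCR-Pen": {"seq_A"},
    "GPCRPred": {"seq_A", "seq_B", "seq_C", "seq_D"},
}
print(sorted(consensus_calls(toy)))           # ['seq_A', 'seq_B']
print(sorted(unique_calls(toy, "GPCRPred")))  # ['seq_D']
```

A vote threshold of this kind trades sensitivity for stringency: a permissive tool contributes many tool-specific calls, but those are discarded unless enough of the other predictors agree.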
The Gene Ontology annotations of identified GPCR sequences of
Hba_04160 | G protein-coupled receptor signaling pathway (GO:0007186) | G protein-coupled receptor activity (GO:0004930) | Integral component of membrane (GO:0016021) |
Hba_07805 | G protein-coupled receptor signaling pathway (GO:0007186) | G protein-coupled receptor activity (GO:0004930) | Integral component of membrane (GO:0016021) |
G protein-coupled peptide receptor activity (GO:0008528) | |||
Hba_10668 | G protein-coupled receptor signaling pathway (GO:0007186) | G protein-coupled receptor activity (GO:0004930) | Integral component of membrane (GO:0016021) |
Hba_12209 | G protein-coupled receptor signaling pathway (GO:0007186) | G protein-coupled receptor activity (GO:0004930) | Integral component of membrane (GO:0016021) |
Hba_14891 | G protein-coupled receptor signaling pathway (GO:0007186) | G protein-coupled receptor activity (GO:0004930) | Integral component of membrane (GO:0016021) |
Hba_18427 | G protein-coupled receptor signaling pathway (GO:0007186) | G protein-coupled receptor activity (GO:0004930) | Integral component of membrane (GO:0016021) |
G protein-coupled peptide receptor activity (GO:0008528) | |||
Hba_18743 | G protein-coupled receptor signaling pathway (GO:0007186) | G protein-coupled receptor activity (GO:0004930) | Integral component of membrane (GO:0016021) |
Hba_18878 | neuropeptide signaling pathway (GO:0007218) | Neuropeptide receptor activity (GO:0008188) | Integral component of membrane (GO:0016021) |
G protein-coupled receptor signaling pathway (GO:0007186) | G protein-coupled receptor activity (GO:0004930) | ||
Hba_18906 | G protein-coupled receptor signaling pathway (GO:0007186) | None | None |
response to pheromone (GO:0019236) | |||
Hba_19080 | Cell surface receptor signaling pathway (GO:0007166) | Protein binding (GO:0005515) | Membrane (GO:0016020) |
transmembrane signaling receptor activity (GO:0004888) | integral component of membrane (GO:0016021) | ||
Hba_19161 | G protein-coupled receptor signaling pathway (GO:0007186) | G protein-coupled receptor activity (GO:0004930) | Integral component of membrane (GO:0016021) |
Hba_20566 | G protein-coupled receptor signaling pathway (GO:0007186) | G protein-coupled receptor activity (GO:0004930) | Integral component of membrane (GO:0016021) |
cell surface receptor signaling pathway (GO:0007166) | transmembrane signaling receptor activity (GO:0004888) | ||
Hba_03545 | None | None | Integral component of membrane (GO:0016021) |
Hba_20096 | G protein-coupled receptor signaling pathway (GO:0007186) | G protein-coupled receptor activity (GO:0004930) | Integral component of membrane (GO:0016021) |
Hba_18203 | G protein-coupled receptor signaling pathway (GO:0007186) | G protein-coupled receptor activity (GO:0004930) | Integral component of membrane (GO:0016021) |
Hba_12258 | Neuropeptide signaling pathway (GO:0007218) | Neuropeptide receptor activity (GO:0008188) | Integral component of membrane (GO:0016021) |
G protein-coupled receptor signaling pathway (GO:0007186) | G protein-coupled receptor activity (GO:0004930) | ||
Hba_17528 | None | None | Integral component of membrane (GO:0016021) |
Hba_09978 | G protein-coupled receptor signaling pathway (GO:0007186) | G protein-coupled receptor activity (GO:0004930) | Integral component of membrane (GO:0016021) |
Hba_14446 | G protein-coupled receptor signaling pathway (GO:0007186) | G protein-coupled receptor activity (GO:0004930) | Integral component of membrane (GO:0016021) |
Hba_13948 | G protein-coupled receptor signaling pathway (GO:0007186) | G protein-coupled receptor activity (GO:0004930) | Integral component of membrane (GO:0016021) |
Hba_08130 | G protein-coupled receptor signaling pathway (GO:0007186) | G protein-coupled receptor activity (GO:0004930) | Integral component of membrane (GO:0016021) |
Similarity of motifs identified by MEME analysis in GPCRs with the known protein domains as analyzed by HHPred
Serpentine Receptor, class V [ |
98.28 | 1.1e-6 | 1.9 | 219 | |
Human Gonadotropin-releasing hormone Receptor (GnRHR) related [ |
98 | 9.6e-7 | −1.4 | 257 | |
BILF1; Membrane protein, viral GPCR, class A-like GPCR, Epstein-Barr virus; HET: Y01{ |
97.98 | 5.4e-8 | −6.3 | 231 | |
Type-1 angiotensin II receptor, Soluble cytochrome b562 BRIL fusion protein; GPCR, MEMBRANE PROTEIN; HET: CLR, OLC, NAG | 98.17 | 7.9e-7 | 0.1 | 425 | |
Proteinase-activated receptor 2, soluble cytochrome b562, membrane protein, GPCR, 7TM | 97.92 | 4.4e-6 | 0.1 | 437 | |
Neurotensin receptor type 1, lysozyme chimera; G-protein coupled receptor, neurotensin receptor, G-protein, signaling protein | 97.87 | 0.7e-6 | 0.4 | 510 | |
Cytochrome c1, heme protein, mitochondrial; cytochrome bc1, Membrane protein, heme protein, rieske iron sulfur protein | 68.8 | 10 | 2.1 | 241 | |
Lysophosphatidic acid receptor 6a, Endolysin, Lysophosphatidic acid receptor 6a; alpha helical, membrane protein; HET: OLC | 66.66 | 10 | 2.1 | 241 | |
Neuropeptide Receptor family [ |
60.14 | 1.6 | −1.5 | 477 | |
Zinc finger protein sdc-1 [ |
27.67 | 58 | 0.9 | 1201 | |
Uncharacterized protein CELE_ZC204.13 [ |
20.25 | 85 | 0.5 | 156 | |
Uncharacterized protein CELE_F56H11.2 [ |
27.34 | 48 | 0.3 | 129 | |
U4/U6 small nuclear ribonucleoprotein PRP3; spliceosome, assembly, pre-B complex, U1 snRNP, splicing; HET: GTP; 3.4A | 27.2 | 48 | 0.3 | 469 | |
Uncharacterized protein CELE_C37C3.12 [ |
23.12 | 63 | 0.3 | 172 | |
Mitochondrial ATP synthase subunit ASA6; mitochondrial ATP synthase dimer flexible coupling cryoEM, proton transport | 46.72 | 12 | 0 | 151 | |
Ubiquitin carboxyl-terminal hydrolase MINDY-1; hydrolase, cysteine protease, isopeptidase and ubiquitin binding; 2.16A | 24.2 | 63 | 0.4 | 289 | |
Uncharacterized protein YdhK; PF07563 family, DUF1541 | 21.94 | 56 | −0.1 | 166 | |
Uncharacterized protein CELE_Y46G5A.23 [ |
34.58 | 38 | 0.8 | 108 | |
Uncharacterized protein CELE_F26F2.3 [ |
27.97 | 55 | 0.7 | 283 | |
Pandonodin; Lasso peptide, RiPPs, unknown function; NMR { |
24.59 | 130 | 1.5 | 33 | |
Uncharacterized protein CELE_K04C2.5 [ |
77.76 | 2.1 | 0.6 | 108 | |
Uncharacterized protein CELE_Y71F9B.1 [ |
52 | 5.6 | −0.7 | 139 | |
Early E3 18.5 kDa glycoprotein; Ad2 E3-19K-HLA-A2 complex, unique tertiary structure, Adenovirus E3-19K, Immune evasion | 42.68 | 24 | 0.6 | 100 | |
Signal recognition particle subunit SRP72 [ |
43.18 | 17 | 0.1 | 635 | |
Aquaporin or aquaglyceroporin related [ |
27.95 | 38 | 0 | 63 | |
RNA polymerase I-specific transcription initiation factor RRN6; RNA Polymerase I, Pre-initiation complex | 26.05 | 41 | −0.1 | 894 | |
Intracellular growth locus, subunit C; |
40.56 | 26 | 0.6 | 211 | |
Maternally affected uncoordination [ |
38.96 | 37 | 1 | 258 | |
Uncharacterized protein CELE_F59H6.15 [ |
36.51 | 7.9 | −1.7 | 104 |
Using the same pipeline, 27 GPCR sequences were identified from the proteomic dataset of
Similarity of GPCRs identified from two different versions of annotations of
g973.t1 | No match |
g2254.t1 | Hba_12209, Hba_14891, Hba_14446 |
g4175.t1 | Hba_08130 |
g4474.t1 | No match |
g5555.t1 | Hba_14891 |
g5582.t1 | No match |
g5664.t1 | Hba_10668 |
g6690.t1 | Hba_09978 |
g7127.t1 | Hba_10668 |
g8593.t1 | Hba_10668, Hba_12209 |
g10582.t1 | Hba_10668, Hba_12209, Hba_14446 |
g11631.t1 | Hba_10668 |
g7125.t1 | No match |
g1348.t1 | No match |
g3739.t1 | Hba_19080 |
g6192.t1 | No match |
g13587.t1 | Hba_19080 |
g13965.t1 | No match |
g14268.t1 | No match |
g337.t1 | Hba_18743 |
g8530.t1 | No match |
g8941.t1 | Hba_17528 |
g8998.t1 | No match |
g9082.t1 | No match |
g10420.t1 | No match |
g116.t1 | No match |
g8239.t1 | No match |
A total of 1,252 GPCRs (supplementary Table 5) were identified out of the 28,447 predicted proteins of
All the sequences identified as GPCRs above were validated with the GPCR-CA tool, which uses Cellular Automaton (CA) images to reveal the features hidden in complex protein sequences. It designated all these GPCRs as Class A rhodopsin-like GPCRs, except Hba_17528, which was identified as a Class D fungal pheromone GPCR (Table 1). Further, the predicted GPCRs were classified based on function and protein family, and four chemosensory GPCRs (Hba_07805, Hba_18427, Hba_18743, and Hba_17528), ten 7-transmembrane receptors under the rhodopsin family, one rhodopsin-like GPCR with transmembrane domain, one frizzled type, and one secretin type GPCR were identified (Table 1, supplementary Table 1). In addition, reciprocal BLAST was used to find orthologous sequences of the predicted GPCRs in the proteomic datasets of closely related model organisms. This approach also validated the pipeline and confirmed that all the identified protein sequences were GPCRs. Sequence similarity identified one frizzled type (Hba_19080), one secretin type (Hba_20566), and the other 19 as rhodopsin-type GPCRs (Table 1). The GPCRPred tool classified all the 21 shortlisted sequences as Class A rhodopsin-like GPCRs and further classified them into three different subfamilies: peptide (15 sequences), biogenic amine (3 sequences), and lysosphingolipid (1 sequence) (Table 1). A search against the InterPro database confirmed the families of all the 21 proteins as G protein-coupled receptors except Hba_17528. All the GPCRs were suggested to be involved in the G protein-coupled receptor signalling pathway except Hba_19080, which was annotated to have a role in cell surface receptor signalling. Additionally, Hba_18878 was suggested to be involved in the neuropeptide signalling pathway, and Hba_18906 in pheromone responsiveness. Most of them are rhodopsin-like GPCRs, except Hba_19080 (frizzled/secreted frizzled-related protein) and Hba_20566 (secretin-like). 
All these GPCRs were found to be integral components of the cell membrane by gene ontology (GO) analysis, except Hba_18906 (Table 2).
Additionally, NCBI conserved domain analysis revealed that four of the identified proteins were members of the class A seven-transmembrane GPCRs and belonged to the FMRFamide (Phe-Met-Arg-Phe)-like receptors and related proteins. Eight of the proteins belonged to the rhodopsin receptor-like class A family of the seven-transmembrane GPCR superfamily, which constitutes about 90% of all GPCRs. This family includes light-sensitive rhodopsin and receptors for biogenic amines, lipids, nucleotides, odorants, peptide hormones, and various other ligands. Six of the proteins were broadly classified under the 7tm_GPCRs superfamily. Among these, Hba_18743 was a serpentine type 7TM GPCR chemoreceptor under the Srsx family, the only family among the various superfamilies of chemoreceptors. Another serpentine type of chemoreceptor GPCR (Hba_17528) was found under the Srx family, which is a part of the Srg superfamily of chemoreceptors. Interestingly, Hba_20566 was a pigment-dispersing factor receptor (PDFR), a member of the B1 subfamily of class B seven-transmembrane GPCRs, also referred to as the secretin-like receptor family. Hba_18203 was found to be an FMRFamide receptor and a member of the class A family of seven-transmembrane G protein-coupled receptors. Hba_09978 was a cholecystokinin receptor and came under the class A family of seven-transmembrane GPCRs. This group represents four GPCRs that are members of the RFamide receptor family, including cholecystokinin receptors (CCK-AR and CCK-BR), orexin receptors (OXR), neuropeptide FF receptors (NPFFR), and pyroglutamylated RFamide peptide receptors (QRFPR). Hba_08130 was an amine receptor of the class A family of GPCRs, which includes adrenoceptors, 5-HT (serotonin) receptors, muscarinic cholinergic receptors, dopamine receptors, histamine receptors, and trace amine receptors (supplementary Table 2).
Lastly, 18 different motifs were identified in the 21 GPCR sequences, including 7-transmembrane receptor (rhodopsin family), serpentine type 7TM GPCR chemoreceptor Srw, serpentine type 7TM GPCR chemoreceptor Srsx, frizzled/smoothened family membrane region, 7-transmembrane receptor (secretin family), serpentine type 7TM GPCR chemoreceptor Srx, serpentine type 7TM GPCR chemoreceptor Srt, etc. (Table 3, Fig. 4).
PRED-COUPLE2 analysis showed possible interaction of the 21 GPCRs with the different families of G-proteins (Gi/o, Gq/11, Gs, and G12/13) (Table 1). Among these, Gi/o is the most abundant kind of G-protein and can bind to all the fetched GPCRs. A few GPCRs have coupling specificity only for Gi/o. Additionally, most of the GPCRs, with the exception of Hba_14891, Hba_19161, Hba_20566, and Hba_17528, are promiscuous and can couple to members of more than one G-protein subfamily. Hba_18878, Hba_13948, and Hba_19080 are the only three GPCRs that can interact with all four types of G-proteins. Only Hba_20096 showed no coupling specificity for any of the G-proteins. Although Hba_18878 and Hba_19080 have very low sequence similarity, they couple to members of the same subfamilies of G-proteins. Conversely, although Hba_10668 and Hba_14446 belong to the same amine subfamily of rhodopsin-like GPCRs, they couple to members of distinct G-protein subfamilies. Several sequences in the peptide subfamily have considerably high sequence similarity but differ in coupling specificity for the G-protein families other than Gi/o (Table 1).
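The distinction drawn above between selective and promiscuous coupling can be sketched by parsing the coupling column of Table 1 into sets of G-protein families. The category labels and function names are hypothetical conveniences for illustration; only the "Gi/o, Gs" entries for Hba_04160 and Hba_10668 are taken from Table 1.

```python
G_FAMILIES = ("Gi/o", "Gq/11", "Gs", "G12/13")


def parse_coupling(cell):
    """Split a table cell such as 'Gi/o, Gs' into a set of G-protein families."""
    names = {name.strip() for name in cell.split(",") if name.strip()}
    unknown = names - set(G_FAMILIES)
    if unknown:
        raise ValueError(f"unrecognised G-protein families: {sorted(unknown)}")
    return frozenset(names)


def coupling_category(families):
    """Label a receptor by how many G-protein families it is predicted to couple to."""
    n = len(families)
    if n == 0:
        return "no predicted coupling"
    if n == 1:
        return "selective"
    if n == len(G_FAMILIES):
        return "promiscuous (all four families)"
    return "promiscuous"


# Coupling entries as listed in Table 1 for two of the receptors.
table_cells = {"Hba_04160": "Gi/o, Gs", "Hba_10668": "Gi/o, Gs"}
for seq_id, cell in table_cells.items():
    print(seq_id, coupling_category(parse_coupling(cell)))
```

Representing each receptor's coupling as a set also makes comparisons such as "same subfamily, different coupling" a matter of simple set equality.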
The entomopathogenic nematodes such as
Langeland et al. (2021) created an integrative database of nematode chemoreceptors called NemChR-DB to facilitate the analysis of NemChRs (
Further, these filtered sequences were passed through different transmembrane detectors; α-helical transmembrane proteins are the most important class of membrane proteins and constitute almost 20–30% of all the proteins encoded in a genome (Wallin and von Heijne, 1998; Krogh et al., 2001). From the retained sequences, TMHMM2 detected 97 sequences containing seven transmembrane helices while Phobius, HMMTOP2, and TOPCONS identified 139, 153, and 123 sequences with the same property, respectively. There are more than ten
Among these 7-TM sequences, 22 were identified as GPCRs by GPCRHMM, a hidden Markov model (HMM)-based GPCR recognition tool that identifies TM topology-related features. It captures the variation in amino acid composition and topological segment lengths between GPCR families. It has the lowest GPCR identification error rate among HMM-based GPCR predictors, including Pfam, and has shown higher selectivity and sensitivity than profile HMMs and generic transmembrane detectors on sets of known GPCRs and non-GPCRs (Wistrand et al., 2006). As GPCRHMM and GPCRPipe use a similar type of algorithm to detect GPCRs, they identified nearly the same number of GPCRs. The “AND” method of GPCRPipe has an accuracy of 97%, with sensitivity and specificity of around 91% and 100%, respectively. These values are higher than those of any other GPCR detector (Theodoropoulou et al., 2013). GPCR-Pen, on the other hand, combines sequence similarity (BLAST), common sequence motif profiles (Pfam), transmembrane structure (GPCRTm), and dipeptide composition (GPCRPred) (Begum et al., 2020). However, it predicted only 15 GPCR sequences because we restricted our search to the first two algorithms. When a completely different GPCR prediction server (GPCRPred) was employed for the same purpose, it recognized a much higher number of sequences (62) as GPCRs than the other three tools. GPCRPred is a support vector machine-based method that uses dipeptide composition (Bhasin and Raghava, 2004). Its different search algorithm and training database yielded almost 3–4 times as many sequences as the other three programs. The GPCR recognition accuracy of GPCRPred is up to 99.5% using 5-fold cross-validation. 
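The dipeptide composition that GPCRPred relies on can be illustrated with a short sketch that turns a protein sequence into a 400-dimensional feature vector (one fraction per ordered pair of the 20 standard amino acids). The normalisation used here (dividing by the number of valid dipeptides counted) and the function name are assumptions for illustration and may differ from the server's exact implementation; the classifier itself is not shown.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 20 x 20 = 400 ordered amino acid pairs, in a fixed order.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return the fraction of each of the 400 dipeptides in `sequence`.

    Overlapping pairs are counted; pairs containing non-standard residues
    (e.g. 'X') are skipped.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


# Toy sequence: the overlapping pairs are AC, CA, AC.
vector = dipeptide_composition("ACAC")
print(len(vector))                      # 400
print(vector[DIPEPTIDES.index("AC")])   # 2/3 of the counted pairs
```

Because the vector has a fixed length regardless of protein size, sequences of very different lengths can be compared without alignment, which is why such a method can disagree substantially with profile-based tools.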
All the resultant sequences were reconfirmed by screening through GPCR-CA, which depends on a completely different algorithm to predict and classify GPCRs that was not used at any earlier stage of the pipeline; this agreement indicates that the pipeline is highly stringent. GPCR-CA utilizes CA images to reveal the features hidden in long and complex protein sequences. The gray-level co-occurrence matrix factors extracted from these images are used to represent the protein samples through their pseudo amino acid composition. It designated all these GPCRs as Class A rhodopsin-like GPCRs. Likewise, GPCRPred also classified all the fetched GPCRs as Class A rhodopsin types. GPCRPred can classify GPCRs into five major classes or families with an overall Matthews correlation coefficient (MCC) and accuracy of 0.81 and 97.5%, respectively (Bhasin and Raghava, 2004). It has been suggested that, despite having low sequence similarity and diversified signal molecules, GPCRs involved in chemoreception might have originated from the rhodopsin family of GPCRs (Nordström et al., 2011). The rhodopsin family is the most abundant and diverse among all the GPCR families, and its members also have a unique signal transduction mechanism (Rosenbaum et al., 2009). The presence of an extracellular N-terminal domain, an intracellular C-terminal domain, and seven serial transmembrane hydrophobic helices joined by intracellular and extracellular loops are typical properties of GPCRs (Brody and Cravchik, 2000; Kroeze et al., 2003; Rosenbaum et al., 2009; Hanlon and Andrew, 2015). Except for Hba_18906, all the retrieved sequences exhibited these properties, which supports our pipeline and the selection.
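The performance figures quoted for the prediction tools (accuracy, sensitivity, specificity, and MCC) all derive from a confusion matrix. The sketch below shows the standard binary definitions; the counts in the example are hypothetical and are not the benchmark data of any of the cited tools, and the multi-class figures reported for GPCRPred would be computed per class or averaged rather than from a single binary table.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity, accuracy, and MCC from binary counts."""
    total = tp + fp + tn + fn
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    accuracy = (tp + tn) / total if total else float("nan")
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is reported as 0 when a marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts: 90 of 100 GPCRs found, no false positives in 100 non-GPCRs.
metrics = classification_metrics(tp=90, fp=0, tn=100, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, MCC stays informative when GPCRs are a small minority of the proteome, because it uses all four cells of the confusion matrix.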
The same methodology of mining GPCRs was applied to the
The involvement of GPCRs in a wide array of physiological and pathological processes (Dryer and Berghard, 1999; Mombaerts, 1999; Schiöth and Fredriksson, 2005; Nei et al., 2008), and the presence of their ligand binding sites on cell surfaces, have made them the most suitable and accessible drug targets, as exemplified by angiotensin receptor blockers (ARBs) for hypertension (Ghosh et al., 2015; Odoemelam et al., 2020; Alhosaini et al., 2021). There are several GPCRs with unknown ligand binding properties, known as orphan GPCRs. GPCR-ligand interaction and its downstream effect depend on the interaction of the GPCR under study with members of a specific G-protein subfamily. Therefore, predicting the coupling specificity of orphan GPCRs to G-protein subfamilies is essential to find potential drug targets through heterologous expression studies (Wess, 1998). However, GPCRs with low sequence similarity may couple to members of the same subfamily of G-proteins, while members of the same GPCR subfamily often couple to members of distinct G-protein subfamilies (Wong, 2003). As promiscuous GPCRs are found to be coupled with more than one G-protein subfamily, it is evident that coupling is a multidimensional function rather than a one-to-one function (Hermans, 2003; Sgourakis et al., 2005).
The NemChRs identified in this study must be functionally validated for their roles in chemoreception in EPN