
Phylogenetic revision of the claudin gene family

2013, Marine Genomics

Claudins are four-transmembrane proteins acting to collectively regulate paracellular movement of water and ions across cellular tight junctions in vertebrate tissues. Despite the prominence of zebrafish (Danio rerio) as a developmental model and the existence of an annotated genome, the diversity and evolutionary history of these claudins, with respect to other vertebrate groups, is poorly described. In this study, we identify 54 zebrafish claudins, including 24 that were previously unreported, and infer homology of the encoded polypeptide sequences with other vertebrate claudin groups using Bayesian phylogenetic analysis. In this analysis, 197 vertebrate claudin and claudin-like proteins were classified into discrete 'superclades' of related proteins. Based on these groupings, an interim reclassification is proposed, which will resolve ambiguity in the present nomenclature of several vertebrate models. Fifty-two of the 54 identified claudins were detected in cDNA preparations from whole, adult zebrafish, and 43 exhibited distinct tissue expression profiles. Despite prolific expansion of the claudin gene family in teleost genomes, these claudins can still be broadly separated into two functional groups: (1) "classic" claudins that characteristically contain an equal number of opposing, charged residues in the first extracellular loop (ECL1) and (2) "non-classic" claudins that typically have an ECL1 containing a variable number of charged residues. Functional analysis of these groups indicates that 'classic' claudins may act to reduce overall paracellular permeability to water and dissolved ions, whereas 'non-classic' claudins may constitute pores that facilitate selective ion permeability.

Marine Genomics 11 (2013) 17–26

David A. Baltzegar, Benjamin J. Reading, Emily S. Brune, Russell J. Borski ⁎

Department of Biology, North Carolina State University, Raleigh 27695-7617, NC, USA

Article history: Received 1 February 2013; received in revised form 8 May 2013; accepted 8 May 2013

Keywords: Claudin; Phylogeny; Vertebrates; Zebrafish; Danio rerio

© 2013 Elsevier B.V. All rights reserved.

1. Introduction

Claudins are transmembrane proteins governing the formation of cellular tight junctions. In the interstitial matrix, claudins interact with other claudins or tight junction proteins (e.g., occludin) to regulate paracellular permeability. The specific permeability properties are conferred both by the type and abundance of constituent claudins that span the paracellular space (Krause et al., 2008; Van Itallie and Anderson, 2006). As compartmentalization into microenvironments of distinct ionic composition is critical to biological systems, it is not surprising that claudins are now identified as vital to normal vertebrate development and homeostasis: in the embryo (Hardison et al., 2005; Münzel et al., 2011; Siddiqui et al., 2010), within lumens and the blood–brain barrier (Abdelilah-Seyfried, 2010; Cheung et al., 2011; Jeong et al., 2008; Zhang et al., 2010), components of ion and osmoregulation (Bagherie-Lachidan et al., 2008; Le Moellic et al., 2005; Nilsson et al., 2007; Ohta et al., 2006; Tipsmark et al., 2008a), and influencing carcinoma and disease (D'Souza et al., 2005; Hewitt et al., 2006; Satake et al., 2008; Winkler et al., 2009).
⁎ Corresponding author at: Department of Biology, North Carolina State University, Campus Box 7617, Raleigh 27695-7617, NC, USA. Tel.: +1 919 515 8105. E-mail address: [email protected] (R.J. Borski). http://dx.doi.org/10.1016/j.margen.2013.05.001

Zebrafish (Danio rerio) is a prominent vertebrate model, particularly in the fields of developmental biology and physiology; however, its full complement of claudins is unknown. Ongoing genome assembly and annotation revisions have created spurious or redundant records, thus making transcriptomic profile analysis difficult despite availability of high-density array platforms. Additionally, the current nomenclature for zebrafish claudins is ambiguous, comprising both alphabetical and numerical designations, which only partially reflects homology to claudins from other taxa. This ambiguity is compounded by erroneous classifications in other groups. For example, zebrafish claudin j (cldnj) is essential to the formation of the otolith (hearing) during development (Hardison et al., 2005). This gene is clearly an ortholog of pufferfish (Takifugu rubripes) claudin 6 (cldn6), yet neither may be related to human claudin 6 (CLDN6; GenBank UGID:1293154). Conventionality of gene nomenclature is essential, albeit dynamic and rooted in the persistent accumulation of data (Povey et al., 2001). The same safeguards that prevent variability and confusion of gene nomenclature, however, also promote obsolescence. As examples, Kollmar et al. (2001) identified 11 zebrafish claudins with no ortholog in mammals, described as claudins a–k. In 2004, genome analysis of the pufferfish yielded a staggering 56 claudins, presumably the result of gene and genome-wide duplication. Those with no clear homology to mammalian claudins were classified within new numeric groups (Loh et al., 2004).
The largest expansion occurred at the claudin 4 loci: human CLDN4 putatively shares a common ancestor with 13 discrete fish claudins (cldn27a–d, cldn28a–c, cldn29a–b, cldn30a–d) (Loh et al., 2004). Currently, the same theoretical ortholog from a “non-model” fish could be classified as cldn4, cldnd, or cldn29a. From an evolutionary perspective, the expansion of the claudin gene family provides a unique opportunity to model the possible fates of duplicated genes. Yet, this task remains difficult without comprehensive reclassification of vertebrate claudins.

In this study, we identified 54 zebrafish claudins, of which 24 were previously undescribed (novel) in the annotated assembly (Zv9). Shared evolutionary history was inferred using human, mouse (Mus musculus), frog (Xenopus tropicalis), and pufferfish claudins using Bayesian phylogenetic inference and genome synteny. The mRNA expression of identified claudins was verified in preparations from whole zebrafish. Tissue specific expression of claudin genes in zebrafish is also reported. Since standardization of gene nomenclature is requisite to a unified knowledge base derived from both model and non-model organisms, we propose an interim reclassification scheme for all major claudin groups, reflective of common evolutionary descent. This reclassification provides a much needed reference for the study of claudin function across taxa.

2. Materials and methods

2.1. Identification of putative claudins

Candidate zebrafish claudins were identified by review of accessioned records available through NCBI Gene (www.ncbi.nlm.nih.gov/gene; search criteria: Danio rerio claudin, in March of 2010).
These records contained both validated and provisional claudins, hypothetical loci containing claudin-like domains (pfam00822: PMP22_Claudin; PMP-22/EMP/MP20/Claudin family), and other closely related genes (claudin domain containing 1 [cldnd1], peripheral myelin protein 22 [pmp22], lens intrinsic membrane protein [lim], calcium channel voltage-dependent gamma subunits 1–8 [cacng]). Our preliminary search was refined by BLASTp search (of the translated peptide sequences) of the T. rubripes and M. musculus non-redundant protein databases (NCBI). Sequences with a Claudin BLASTp match or containing the pfam00822 PMP/Claudin domain were selected for further study. Final candidate genes were selected by peptide sequence alignment using ClustalX (Thompson et al., 1997) and by preliminary phylogeny analysis with human, mouse, X. tropicalis, and T. rubripes claudin, cldnd1, pmp22, and cacng2 sequences (parameters: generations = 2 million, sample frequency = 2000, among-site variation = equal (fixed), amino acid rate matrix = Poisson, burnin = 950). The complete list of sequences selected for analysis (total = 197) is provided as Supplemental information (Table S2).

2.2. Phylogeny and genomic synteny comparisons

Bayesian phylogenetic analysis was performed using four models for amino acid substitution: Poisson (Bishop and Friday, 1987), Blosum62 (Henikoff and Henikoff, 1992), WAG (Whelan and Goldman, 2001), and the Equalin model, an F81 model variant (Felsenstein, 1981). All other analysis parameters were held constant (generations = 50 million, sample frequency = 10,000, among-site rate variation = equal [fixed], burnin = 1250). Analysis was performed using MrBayes (v3.1.2) on TeraGrid computing accessible through the Cyberinfrastructure for Phylogenetic Research (CIPRES) portal, available online at http://www.phylo.org/portal2/home.action (Huelsenbeck and Ronquist, 2001; Miller et al., 2010).
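The chain length, sampling frequency, and burn-in values quoted for the Bayesian runs are linked by simple arithmetic, which the sketch below works through. It is only an illustration: the function name `mcmc_summary` is hypothetical, the burn-in is assumed to be counted in sampled trees rather than in generations, and any initial state the sampler may also record is ignored.

```python
def mcmc_summary(generations: int, sample_freq: int, burnin_samples: int):
    """Return (total samples, samples kept, burn-in fraction) for one run."""
    total = generations // sample_freq   # number of trees sampled from the chain
    kept = total - burnin_samples        # trees remaining after burn-in is discarded
    return total, kept, burnin_samples / total


# Settings quoted in the text: main analysis and preliminary analysis.
main_run = mcmc_summary(50_000_000, 10_000, 1250)
preliminary_run = mcmc_summary(2_000_000, 2_000, 950)

print(main_run)         # (5000, 3750, 0.25)
print(preliminary_run)  # (1000, 50, 0.95)
```

Under these assumptions, the main analysis discards the first 25% of sampled trees, whereas the preliminary run retains only the final 50 samples.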
Consensus trees from the Bayesian analysis were tested as “user trees” for significant differences in log likelihood using the Shimodaira–Hasegawa (S–H) test for alternate evolutionary hypotheses in TREE-PUZZLE (Schmidt et al., 2002; Shimodaira and Hasegawa, 1999). The S–H test for log likelihood testing was performed with the following models: Blosum62, WAG, Dayhoff (Dayhoff et al., 1978), and JTT (Jones et al., 1992).

Genomic synteny comparisons were performed using the following assemblies available through Ensembl (collaboration of EMBL-EBI and the Wellcome Trust Sanger Institute, available at http://www.ensembl.org): Fugu [T. rubripes], International Fugu Genome Consortium version 4 [June, 2005]; human [Homo sapiens], Genome Reference Consortium assembly version 37 [February, 2009]; mouse [M. musculus], Mouse Genome Sequencing Consortium version 37 [April, 2007]; Western clawed frog [X. tropicalis], Joint Genome Institute version 4.2 [November, 2009]; zebrafish [D. rerio], Sanger Institute assembly version 9 [Zv9; April 2010]. Orthology of non-claudin genes was determined using the Ensembl annotated “orthologs” database and by local alignment search tools (BLAST/BLAT) available through NCBI and Ensembl.

2.3. RNA extraction and tissue expression

Zebrafish males (Tuebingen longfin strain) were used to examine mRNA expression of identified claudins. The following tissues were collected and pooled from 5 fish: eye, whole brain, gill, heart, kidney (whole), spleen, skin (whole left side, including the lateral line system), and testes. Additional males were used for preparation of whole fish total RNA. Collected tissue was preserved in RNAlater (Ambion) at 4 °C overnight before bead homogenization with RNAzol RT (Molecular Research Center) buffer. Total RNA was extracted by manufacturer's protocol (MRC). Following extraction, DNA contamination was removed by DNAse-I treatment using a Turbo DNA-free Kit (Ambion).
Before cDNA synthesis, total RNA from all tissues was quantified by A260 absorbance using a NanoDrop (ND1000) spectrophotometer. RNA quality was assessed by 18S and 28S ribosomal band integrity after gel electrophoresis. One microgram of total RNA for each tissue (or from whole fish) was reverse-transcribed with random primers using a High Capacity cDNA Reverse Transcription Kit (Applied Biosystems). The cDNA reactions were diluted 1:6 prior to PCR amplification. Primer pairs for 54 putative D. rerio claudins were designed from accessioned NCBI Gene sequences using Vector NTI software (Lu and Moriyama, 2004). Additionally, the housekeeping gene β-actin 1 (bactin1) was amplified as a positive control. A complete list of primers and annealing temperatures is provided in Table S4 (Supplemental information): amplicon size range = 400–500 bp, annealing temperature range = 54–63 °C. All PCR reactions were performed using Taq DNA polymerase and 10× buffer (Fisher Scientific). Reactions (25 μL) contained the following: 1× Buffer A, 0.2 mM dNTP mix, forward and reverse primers (0.4 μM each), DNA polymerase (0.3 U/μL), 1.3 μL of diluted cDNA (~11 ng), and nuclease-free water (Sigma-Aldrich; to volume). The PCR cycling parameters were as follows: (1 cycle) 95 °C for 2 min; (40 cycles) 95 °C for 30 s, 54–63 °C for 30 s, 72 °C for 45 s; (1 cycle) 72 °C for 5 min, 4 °C holding. To validate successful PCR of the targeted claudin, amplification was first performed using whole-fish cDNA and these were submitted for sequencing. All PCR reactions were cleaned by Qiaquick PCR purification columns (Qiagen) and concentrated as a 2× elution. Samples were submitted to the University of Chicago CRC-DNA sequencing facility with the forward primer (Applied Biosystems 3730XL 96-capillary sequencer). Sequence chromatograms were identified by BLASTn search to accessioned D.
rerio claudins (non-redundant database; Organism = Danio rerio [txid7955]; August 20th, 2010; e-values: 0.0–7e−35; % identity = 76–100). Two poor-quality sequences, LOC557209 and LOC794676, were identified by alignment using Blast2align (NCBI; 77–82% identity). For the tissue expression profile, amplification was performed simultaneously with cDNA derived from discrete tissue preparations. All templates were normalized by starting total RNA concentration (1 μg) and verified by bactin1 amplification. The reaction and cycling parameters were as stated previously. After amplification, the PCR products were gel electrophoresed with a 1 kb DNA marker (Promega) (1% agarose in 1× TAE buffer, 4 μg/mL ethidium bromide). The agarose gels were imaged using a digital Gene Genius system (Syngene), with post-hoc image analysis restricted to whole-figure cropping and reverse-contrast enhancement, using CorelDraw (v12). No other modifications were performed.

Fig. 1. Bayesian phylogeny of vertebrate claudin protein sequences. (A) Unrooted radial consensus tree from analysis using the WAG substitution model (best log-likelihood score). Taxa sequences were “pruned” from the tree, with the major branches circled for better depiction in cladogram format (B–H). The largest branch of claudins (shaded) is depicted in Fig. 2. The relative position of outgroups CldnD1, CldnD2, Pmp22, and TMEM204 are shown but not depicted (tree file available in Supplemental information). Branch length scale represents 0.1 amino acid substitutions per site. (B) Vertebrate claudin groups 10, 15, and 15-like (15-l). (C) Claudin groups 1, 7, and 19. (D) Claudin groups 2, 14, and 20. (E) Claudin groups 11 and 16. (F) Claudin groups 22, 25, 25-like (25-l), 33, and 34. (G) Claudin groups 12, 12-like (12-l), and 23. (H) Claudin 18.
Taxon labels denote the abbreviated scientific name (Dr, Danio rerio; Hs, Homo sapiens; Mm, Mus musculus; Tr, Takifugu rubripes; Xt, Xenopus tropicalis) followed by the current (accepted) claudin group designation. Proposed changes to nomenclature are shown in parentheses. Large gray numerals and dots on the tree indicate the monophyletic origin of a major claudin group. Small font numerals at each node represent clade-credibility values (percent of total Bayesian trees supporting each node in the consensus tree). Zebrafish claudin sequences are in bold font. Mouse Cldn24 and LOC100045502 are identical sequences (double asterisks; see F).

2.4. Functional analysis of major claudin groups

Zebrafish claudins were assigned designations as ‘classic’ (superclades in Figs. 1C, D, 2B, C, D, and E) or ‘non-classic’ (superclades in Fig. 1B, E, F, G, and H) by sequence similarity to human orthologs (Lal-Nag and Morin, 2009). Claudin domain demarcations were predicted using SMART (Schultz et al., 1998) and protein sequence alignments were performed using MacVector ClustalW (Thompson et al., 1994). Consensus peptide sequences of claudin extracellular loops were generated using the HCV Sequence Database of the Los Alamos National Laboratory (available online at: http://hcv.lanl.gov). Chou–Fasman and Robson–Garnier plots of claudin extracellular loops were generated using MacVector. Normalized values from each plot were used to create a consensus plot of secondary structure prediction.

3. Results

3.1. Phylogenetic analysis

A revised list of 197 claudins and claudin-related proteins from human (28), mouse (29), Fugu (60), X. tropicalis (23), and zebrafish (57) was obtained from available NCBI and Ensembl gene records (Tables S1, S2). The proteins cldnd1, cldnd2, and pmp22 have similar structural motifs (pfam00822, pfam13903 claudin-like domains) to known claudins and were used here as outgroups.
Fig. 2. Phylogeny of vertebrate claudins (continued). (A) Unrooted radial consensus tree of the largest claudin superclade (in Fig. 1). Taxa sequences are pruned and depicted as subgroups in cladogram format (B–E). The position of claudin groups on the radial tree is indicated with bold font. Branch length scale represents 0.1 amino acid substitutions per site. (B) Claudin groups 5, 5-like, 6, and 31. (C) Claudin groups 3, 8, and 8-like. (D) Claudin groups 4, 13, 4-like, 28, 29, 30, 32, and 36. (E) Claudin groups 35 and 13. Taxon labels and symbols are as described in Fig. 1.

Consensus trees derived from phylogenetic analysis under different model assumptions were tested for significant differences in log-likelihood using the S–H test available in TREE-PUZZLE (Schmidt et al., 2002; Shimodaira and Hasegawa, 1999). The WAG model yielded the best-scoring consensus tree (log-likelihood = −64230.25), and was significantly better than the Poisson or Equalin trees, regardless of the substitution model used in the S–H test (Blosum62, WAG, Dayhoff, or JTT; p < 0.05). Although it possessed a higher log likelihood score, the WAG model tree was not significantly better than the Blosum62 consensus tree (p = 0.371 to 0.528). The WAG model yielded the better-scoring consensus tree of these two and hence is reported here (Figs. 1 and 2). Tree topology for the 2nd best-scoring tree (Blosum62) is identical except for the following cases: mouse and human claudin 7 are grouped with claudin 1 (compare with Fig. 1C; WAG model); zebrafish LOC793915 is not grouped with claudin 16 but is sister to claudins 11 + 16 (Fig. 1E, both monophyletic); zebrafish zgc:63990 is not united with claudin 12 and is unresolved (Fig. 1G); all vertebrate claudin 3 form a unified clade, sister to claudin 8 and claudin 8-like (Fig. 2C); zebrafish cldna and cldnb form a clade with pufferfish cldn30c (Fig.
2D); and mouse Cldn4 formed a clade with mouse Cldn13 (Fig. 2D). Raw consensus tree files for all models analyzed are available as Supplemental information (Fig. S1).

3.2. Identification of zebrafish claudins

Of the 54 zebrafish claudins identified in our analysis, 47 were identified as putative orthologs of claudins previously described in pufferfish (Loh et al., 2004). To avoid ambiguity, all claudins are listed (Table S1) both by accepted (current) gene names as well as those proposed using the interim reclassification scheme (see Section 3.3). In cases where ortholog assignment was ambiguous or conflicted with accepted nomenclature, genomic synteny was used as additional evidence for homology (Figs. 3 and S2–S5). Final assignment was determined by sum of evidence: cldn7a and cldn7b: current paralog designation (NCBI) was not supported by our phylogeny or genomic synteny (Figs. 1C, 3A); cldn15a and cldn15b: genomic synteny supported current paralog assignment (Fig. 3B); cldn11a and cldn11b: phylogeny supported by synteny (Fig. S2C); cldn18: undetermined due to chromosomal rearrangement at zebrafish loci (Fig. S2D); cldn8a–d: phylogeny supported (Fig. S3B); cldn10a–e: genomic synteny supports orthology of cldn10a and cldn10e, others equivocal due to chromosomal rearrangement (Fig. S3B); cldn33a–b and cldn34: orthology supported by synteny (Fig. S4); and claudins 3a, 3c–d, 5c, 8-like, 28a–c, 28d, 29a–b, 30a, 30c–d, 36b: orthology to Fugu claudins supported by genomic synteny (Fig. S5). Gene orthologs to the following Fugu claudins were not identified in our zebrafish analyses: cldn3b, cldn13, cldn14a (renamed cldn14), cldn27a (renamed cldn36a), cldn27d (renamed cldn28e), cldn28b, and cldn30b.
Seven zebrafish claudins have no direct ortholog in the pufferfish, presumably due to gene loss in the Fugu genome or lineage-specific gene duplication in zebrafish: LOC567620 and si:ch211-95j8.2 form a monophyletic clade with pufferfish cldn23a, and are described here as cldn23a1 and cldn23a2, respectively (Fig. 1G); cldnk and LOC100334365 are paralogs of pufferfish cldn31 and designated as cldn31a and cldn31b, respectively (Fig. 2B); zebrafish LOC793915 (designated as cldn16) forms a monophyletic group with tetrapod claudin 16, but currently has no pufferfish ortholog (Fig. 1E); zebrafish zgc:63990 aligns as basal to vertebrate claudin 12 in a weakly supported clade (52% ccs, Fig. 1G) and is designated as cldn12-like; no vertebrate claudin was identified as an ortholog of zebrafish cldng; however, it groups most closely to vertebrate claudin 5 (designated as cldn5-like; Fig. 1B). A complete list of all sequences used in our analysis is available as Supplemental information (Table S2).

Fig. 3. Loci containing claudin 7 and claudin 15 in Fugu and zebrafish genomic assemblies. (A) Claudin 7a and claudin 7b. (B) Claudin 15a and claudin 15b. The orthology of zebrafish claudins, when unresolved by phylogeny (Figs. 1, 2), was identified by genomic synteny. The existing nomenclature incorrectly assigns orthology to zebrafish cldn7a (now reclassified as cldn7b) and cldn7b (now reclassified as cldn7a), but is correctly assigned for zebrafish cldn15a and cldn15b. Gene and inter-gene distances depicted are for illustrative purposes only and are not to relative scale. Symbols: empty box, claudin gene (group ortholog); hatched box, genes with no observable ortholog in compared regions; black box, orthologous genes across loci; shaded box, genes giving alternate evidence for paralog identity; rc, reverse complemented sequence.
Gene abbreviations: chrnb1, cholinergic receptor, nicotinic, beta 1; fgf11, fibroblast growth factor 11; fis1, fission 1; gpr65, G protein coupled receptor 65; grk1b, G protein-coupled receptor kinase 1 b; kcnip1, Kv channel interacting protein 1; lsmd1, LSM domain containing 1; npm1, nucleophosmin 1; prki, protein–kinase, interferon-inducible; sat2, spermine N1-acetyltransferase family member 2; ugt5g1, UDP glucuronosyltransferase 5 family, polypeptide G1. Genome assembly regions (Ensembl): Fugu (v4), (cldn7a) scaffold_6: 271,571-323,547 (nt); (cldn7b) scaffold_276: 193,528-320,757; (cldn15a) scaffold_63: 1,148,610-1,161,616; (cldn15b) scaffold_6: 1,983,619-1,995,310; zebrafish (Zv9), (cldn7a) Chromosome 7: 23,566,760-23,876,679; (cldn7b) Chromosome 10: 22,203,960-22,449,654 (reverse complement); (cldn15a) Chromosome 7: 21,949,415-22,003,543; (cldn15b) Chromosome 5: 63,250,685-63,622,810.

3.3. Proposed interim reclassification of major claudin groups

Bayesian phylogenetic analysis identified eight claudin “superclades”, as grouped by shared conservation of amino acid sequence: claudins 10, 15, and 15-like (Fig. 1B); claudins 1, 7, and 19 (Fig. 1C); claudins 2, 14, 20, and 34 (Fig. 1D); claudins 11 and 16 (Fig. 1E); claudins 22, 25 (excluding teleost cldn25), and 33 (Fig. 1F); claudins 23 and 12 (Fig. 1G); claudin 18 (Fig. 1H); and claudins 3, 4, 5, 6, 8, 8-like, 9, 13, 27, 28, 29, 30, 31, 32, and 35 (Fig. 2A–D). The claudin groups below were identified as monophyletic, where all designated members share a common evolutionary history, in our phylogeny: claudins 1, 2, 5, 10, 11, 12, 15, 16, 19, 20, 23, 29, 30, and 31 (Figs. 1, 2). Monophyly could not be easily assigned to the remaining claudin groups, either because of poor tree resolution or, more often, because these groups are paraphyletic as currently described. Each of these groups is discussed within a proposed interim reclassification scheme where necessary.
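The eight superclades listed above, together with the ‘classic’/‘non-classic’ panel assignment given in Section 2.4, can be held in a small lookup table. The sketch below is an illustration only: the names `SUPERCLADES`, `find_superclade`, and `functional_class` are hypothetical, the dictionary simply transcribes the list in the text (the exclusion of teleost cldn25 is not encoded), and the largest superclade is keyed under a single "Fig. 2" entry.

```python
# Superclades as listed in Section 3.3, keyed by the figure panel that depicts them.
SUPERCLADES = {
    "Fig. 1B": ["10", "15", "15-like"],
    "Fig. 1C": ["1", "7", "19"],
    "Fig. 1D": ["2", "14", "20", "34"],
    "Fig. 1E": ["11", "16"],
    "Fig. 1F": ["22", "25", "33"],
    "Fig. 1G": ["23", "12"],
    "Fig. 1H": ["18"],
    "Fig. 2": ["3", "4", "5", "6", "8", "8-like", "9", "13",
               "27", "28", "29", "30", "31", "32", "35"],
}

# Section 2.4 treats the superclades in Figs. 1C, 1D and 2B-E as 'classic';
# the remaining panels (Fig. 1B, E, F, G, H) are 'non-classic'.
CLASSIC_PANELS = {"Fig. 1C", "Fig. 1D", "Fig. 2"}


def find_superclade(group: str) -> str:
    """Return the figure panel whose superclade contains the claudin group."""
    for panel, groups in SUPERCLADES.items():
        if group in groups:
            return panel
    raise KeyError(f"claudin group {group!r} is not in the superclade list")


def functional_class(group: str) -> str:
    """Label a claudin group 'classic' or 'non-classic' from its superclade."""
    return "classic" if find_superclade(group) in CLASSIC_PANELS else "non-classic"


print(find_superclade("16"), functional_class("16"))  # Fig. 1E non-classic
print(find_superclade("4"), functional_class("4"))    # Fig. 2 classic
```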
A revised list of proposed changes to major claudin groups is provided as Supplemental information (Table S3).

Claudin 3: monophyly of all claudin 3 genes is supported by genomic synteny (Fig. S5), but not in our consensus tree (Fig. 2C). Tetrapod Cldn3 groups with Cldn8 and Cldn8-like sequences with a poor clade credibility score (ccs, 59%). Therefore, no reclassification is proposed.

Claudin 4: claudin 4 is not present in teleost fishes, but is likely ancestral to claudins 27, 28, 29, and 30 (Fig. 2D, ccs 84%). Genomic synteny supports the monophyly of these fish claudins with tetrapod claudin 4 (Fig. S5). In our phylogeny, tetrapod claudin 4 is paraphyletic without inclusion of mouse Cldn13. No reclassification is currently proposed (see Claudin 13).

Claudin 6 and 9: mammalian claudin 6 is monophyletic and a paralog of claudin 9 (100% ccs, Fig. 2B). Pufferfish cldn6 and zebrafish cldnj are orthologs of each other, but likely of vertebrate claudin 8 rather than of tetrapod claudin 6 or 9 (see Fig. 2C). Previous analysis (Loh et al., 2004) supported unification of fish and mammal claudins 6 and 9; however, this monophyletic clade was poorly supported (<50% bootstrap). Further, genomic synteny supports that claudin 6 and claudin 9 may not exist in fishes (Fig. S2B). As Cldn6 and Cldn9 are clearly paralogs in mammals (Fig. 2B; Loh et al., 2004; Tipsmark et al., 2008b; Krause et al., 2008), reclassification to claudin 6α (Cldn6) and claudin 6β (Cldn9) is proposed. Additionally, we propose T. rubripes cldn6 and D. rerio cldnj be reclassified as claudin 8-like.

Claudin 7: an unresolved node supports monophyly of tetrapod and teleost claudin 7 only with the inclusion of claudin 1 (72% ccs, Fig. 1C). Genomic synteny shows no clear conservation of the claudin 7 loci between tetrapods and teleosts. Therefore, no reclassification is proposed.

Claudin 8 and 17: vertebrate claudin 8 is monophyletic with inclusion of claudin 17 (86% ccs, Fig. 2C).
Consistent with previous work, our phylogeny identifies mammal Cldn8 and Cldn17 as paralogs (Krause et al., 2008; Loh et al., 2004; Tipsmark et al., 2008b). As Cldn8 was described before Cldn17 (Morita et al., 1999), reclassification of mammal claudin 8 to claudin 8α, and claudin 17 to claudin 8β, is proposed. Also consistent with Loh et al. (2004), our analysis supports independent duplication of teleost cldn8a–d from the common ancestor (Fig. 2C; 92% ccs). As zebrafish cldn17 is an ortholog of pufferfish cldn8c (100% ccs), reclassification to cldn8c is proposed.

Claudin 13: the phylogeny infers that teleost cldn13 is ancestral to the expansion of the largest claudin superclade (Fig. 2A, E), and not directly related to mammal (mouse) Cldn13. In contrast, genomic synteny supports orthology for these claudins (Fig. S5). Further information is needed to resolve this group; no reclassification proposed.

Claudin 14: with the exclusion of teleost cldn14b, all vertebrate claudin 14s form a monophyletic group (100% ccs, Fig. 1D). Reclassification of pufferfish cldn14b and zebrafish LOC568833 to claudin 34 is proposed.

Claudin 18: a unified clade of all vertebrate claudin 18 sequences also includes the outgroup sequences (CldnD1, CldnD2, Pmp22, and TMEM204) used for polarization of the unrooted tree (86% ccs, Fig. 1H). Genomic synteny suggests orthology of mammal and Fugu claudin 18; however, chromosomal rearrangement has yielded equivocal evidence for this gene in the zebrafish (Fig. S2D). No revision is currently proposed.

Claudin 21: currently there are no genes with the designation claudin 21. Human CLDN21 was described by Katoh and Katoh (2003) at locus 4q35.1, but this gene was reclassified as CLDN24 in later assemblies (GRCh37; Genome Reference Consortium Build 37, April 2009). In the mouse, Cldn21 is annotated in GRC assembly 37 (April 2007, current Ensembl version) at chromosome 9, but was reclassified as Cldn25 in GRC assembly 37.2 (March 2011, current NCBI version).
The Cldn25 identified in the earlier assembly was then reclassified as claudin domain containing 1 (CldnD1, chromosome 16, v37.2). No teleost claudin 21 has been described. The closing of this group to avoid further ambiguity is proposed.

Claudins 22 and 24: mammal claudin 22 is monophyletic only with the inclusion of claudin 24, a paralog gene in our analysis (83% ccs, Figs. 1F and S2A). No ortholog has been found in teleost fishes. Redesignation of mammal Cldn22 and Cldn24 to Cldn22α and Cldn22β (respectively) is proposed.

Claudin 25 and 26: mammal claudin 25 is not monophyletic with teleost claudin 25. Human and mouse claudin 25 are monophyletic and ancestral to claudins 22 and 24 (100% ccs, Fig. 1F), while pufferfish cldn25 and cldn26 are monophyletic with D. rerio claudin 15-like (100% ccs, Fig. 1B). Reclassification of T. rubripes cldn25 and cldn26 to cldn15la and cldn15lb, respectively, is proposed. The group designation claudin 26 should be closed to avoid future ambiguity.

Claudins 27 and 28: claudin 27 is paraphyletic as currently described. Claudins 27a and 27c form a monophyletic clade ancestral to claudin 28 (100% ccs), while cldn27b and cldn27d group within claudin 28 (97% ccs, Fig. 2D). Reclassification of cldn27b and cldn27d to cldn28d and cldn28e, respectively, is proposed. To avoid future ambiguity, cldn27a and cldn27c should be assigned to a new group, claudin 36 (cldn36a and cldn36b, respectively).

Claudin 32: claudin 32a is not a paralog of claudin 32b (Fig. 2D, E). Reclassification of T. rubripes cldn32b and D. rerio LOC570842 to a new group, claudin 35, is proposed (Fig. 2E). This would accompany the reclassification of orthologs T. rubripes cldn32a and D. rerio cldni to claudin 32, proper.

Claudin 33: claudins 33a and 33b are monophyletic; however, claudin 33c forms an ancestral clade to claudins 22, 24 and 25 (Fig. 1F, 100% ccs, and Fig. S4). Reclassification of claudin 33c to claudin 25-like is proposed.

3.4. mRNA expression of zebrafish claudins

As duplicate genes could be nonfunctional (pseudogenes), another objective was to determine the proportion of the identified claudins expressed as mRNA. Forty-three claudins were detected in a panel of 8 discrete tissue cDNA preparations. Zebrafish cldn7a was expressed ubiquitously, while cldn5b, cldn5a, and cldn11a were detected in all but one tissue type (Fig. 4). Other claudins were expressed only in a single tissue: cldn8l (brain), cldn10c and cldn10e (gill), cldn10d (spleen), cldn18 (kidney), and cldn33a (testes). With the exception of cldn10c and cldn10e, all claudin group paralogs exhibited unique expression patterns (Fig. 4). Nine zebrafish claudins were not detected in our tissue profile but were detected in cDNA preparations obtained from whole (homogenized) fish: cldn5c, cldn8a, cldn15b, cldn15la, cldn5lb, cldn16, cldn29a, cldn31b, and cldn34 (data not shown). All amplicons were sequenced and validated against the targeted claudin sequence by BLASTn search or by pair-wise BLAST sequence alignment (Blast2align) (Table S5, Supplemental information). Two zebrafish claudins, cldn10a and cldn23a2, were not detected in our expression analysis (tissue profile or whole fish cDNA). A search of the NCBI zebrafish EST database identified an available embryonic cDNA clone (Agencourt) for cldn23a2 (GenBank: CN0248601.1), suggesting that the expression of this claudin may be ontogenetically regulated (MegaBLAST search, 98% sequence identity, e value = 0.0). A similar search for zebrafish cldn10a yielded no significant results.

Fig. 4. Tissue expression profile of zebrafish claudins. Endpoint (plus/minus) PCR assay of identified zebrafish claudins using prepared cDNA derived from the following tissues: brain (whole), eye, gill, heart, kidney (whole), spleen, skin, and testes.
Controls: zebrafish beta actin (bactin1) was used as a positive control for all tissues, sterile water as a negative control. One microgram of total RNA was used as template for all cDNA preparations. All claudin gene symbols reflect proposed claudin nomenclature in the interim reclassification scheme (listed in Table S1, S.I.). Claudins containing non-“classical” ECL structures are underlined. The photographed gel images were reverse contrasted for better band visualization and printing efficiency. 3.5. Functional analysis of major claudin groups The nucleotide sequences reported here (Table S1) encode claudin proteins with an average length of 240 amino acids and a predicted molecular weight of 26 kDa. The zebrafish claudins exhibited typical tetraspan transmembrane structure with two conserved extracellular loops, one intracellular loop, a short internal amino terminal sequence and a carboxy terminal cytoplasmic domain of variable length (Fig. 5). The first claudin extracellular loop (ECL1) contains the signature residues tryptophan, glycine, and leucine [W-GLW] (Van Itallie and Anderson, 2006) followed by two cysteine residues [C–C] involved in the barrier function (Wen et al., 2004). The [W-GLW] motif is conserved in all zebrafish claudins except ‘non-classic’ claudins from the superclade shown in Fig. 1B, which have a [W-NLW] motif (Fig. 5). The [C–C] motif is conserved in the ECL1 of all zebrafish claudins (Fig. 5). In addition to these two characteristic motifs, the ECL1 contains several acidic and basic residues that determine charge- and size-selectivity of channels or pores formed between claudins, allowing for paracellular permeability of select ions and small molecules (Colegio et al., 2002, 2003; Krause et al., 2008; Angelow and Yu, 2009; Lal-Nag and Morin, 2009). The ECL1 sequences of ‘classic’ zebrafish claudins are conserved, whereas those of ‘non-classic’ claudins are less so (Fig. 
5) as similarly observed in mouse (Krause et al., 2008) and human (Lal-Nag and Morin, 2009). The ECL1 of ‘classic’ claudins typically contains a combination of three positive residues (lysines [K] and/or arginines [R]) and three negative residues (glutamic acids [E] and/or aspartic acids [D]) imparting a net neutral charge to the loop (Fig. 5). Ordering of these residues is as follows: [K/R–E/D–K–D–D–R]. Although the locations of these residues are conserved in claudins depicted in Fig. 2E, D, C, and B, they appear in slightly different positions than in claudins from superclades shown in Fig. 1C and D (see Fig. 5). A notable exception among the ‘classic’ claudins is the group of four claudin 8 isoforms (Fig. 2C), which contain an additional 2 [D/E] and 1 or 3 [R] residues. Multiple copies of claudin 8 in fishes and their grouping in relation to mammalian claudin 8 orthologs (Fig. 2C) suggest gene duplication followed by possible neofunctionalization. In contrast, the ECL1 of ‘non-classic’ claudins has a greater number of charged residues and the positions of such residues appear less conserved (Fig. S6).
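The ECL1 charge balance described above can be illustrated with a simple residue tally. The following Python sketch counts basic (K, R) and acidic (D, E) residues in a loop sequence and reports the net charge. The two peptides are invented toy strings that merely follow the [K/R–E/D–K–D–D–R] ordering; they are not zebrafish sequences, and ignoring histidine is a simplifying assumption of this sketch.

```python
# Residue classes as used in the text: K/R are positive, D/E are negative.
# Histidine is deliberately ignored (simplifying assumption).
POSITIVE = frozenset("KR")
NEGATIVE = frozenset("DE")


def charge_profile(loop_seq: str) -> dict:
    """Count charged residues in a one-letter amino acid string."""
    seq = loop_seq.upper()
    positive = sum(residue in POSITIVE for residue in seq)
    negative = sum(residue in NEGATIVE for residue in seq)
    return {"positive": positive, "negative": negative, "net": positive - negative}


def is_charge_balanced(loop_seq: str) -> bool:
    """True when the loop carries equal, nonzero numbers of +/- residues."""
    profile = charge_profile(loop_seq)
    return profile["positive"] > 0 and profile["net"] == 0


# Hypothetical toy peptides, NOT real claudin sequences.
balanced_toy = "GAKSETKLDADQR"    # K, E, K, D, D, R in that order
unbalanced_toy = "GAKSETKLKADQR"  # one D swapped for K

if __name__ == "__main__":
    for name, seq in [("balanced", balanced_toy), ("unbalanced", unbalanced_toy)]:
        print(name, charge_profile(seq), is_charge_balanced(seq))
```

A tally of three positive and three negative residues matches the net neutral pattern reported for ‘classic’ claudins. Counting alone would not classify a claudin, however, since residue position and conservation across the alignment also matter.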
Claudins of opposing cell membranes dimerize (i.e., trans-interact) through hydrophobic association of aromatic residues in the second extracellular loop (ECL2) (Angelow et al., 2008) and all of the zebrafish claudins contain conserved tryptophan [W], tyrosine [Y], and phenylalanine [F] residues required for such interactions (Fig. 5). The ECL2 of mammalian claudins typically consists of a helix-turn-helix motif (Krause et al., 2009); however, alignment of zebrafish ECL2 regions resolves three distinct groupings that correlate with major clades identified in the phylogeny. The consensus sequences of these groups are shown in Fig. 5. ‘Classic’ claudins belonging to superclades depicted in Fig. 2B, C, D, and E contain an ECL2 with a predicted sheet-turn-helix motif, whereas the ECL2 of the remaining two ‘classic’ claudin superclades (Fig. 1C and D) is predicted to have a sheet-turn motif. Some of the ‘non-classic’ claudins resolve into a third group (Fig. 1B) with an ECL2 predicted to have a helix-turn motif. The remaining ‘non-classic’ claudins (Fig. 1E, F, G, and H) do not resolve well and the ECL2 sequences have poor consensus. Interestingly, ‘non-classic’ zebrafish claudins typically lack a second proline [P] in the ECL2 that is present in the ‘classic’ teleost claudins (Fig. 5) and this may influence secondary structure. Further studies are required to validate these structural predictions and to better understand the functional properties of the ECL2 as it relates to claudin members of different superclades.

Fig. 5. Domain model of a representative zebrafish claudin. Consensus peptide sequence alignments corresponding to the first extracellular loop (prefix ECL1) and second extracellular loop (prefix ECL2) of zebrafish claudins are shown above the corresponding structures depicted in the cartoon. The consensus sequences are defined as follows: suffix _C1 are ‘classic’ claudins from superclades in Fig. 2B, C, D, and E; suffix _C2 are ‘classic’ claudin superclades shown in Fig. 1C and D; suffix _NC1 are ‘non-classic’ claudins from superclade 1B. Conserved uncharged residues (including the [W-GLW] motif of ECL1) are indicated by the gray shading and conserved charged residues are indicated by black shapes and white text. Two cysteine [C] residues conserved in all claudins are indicated in white boxes within the ECL1 alignment and white circles in the cartoon. An ‘x’ in the sequence alignment designates a non-conserved residue position. The second proline [P] residue present in the ECL2 of ‘classic’ teleost claudins, but absent in ‘non-classic’ claudins, is indicated by the arrow. The cylinders numbered 1 through 4 depict transmembrane domains. A putative PDZ domain interacting motif at the carboxy-terminus of the intracellular tail is also depicted in the cartoon. This motif is largely conserved in ‘classic’ claudins but not in ‘non-classic’ claudins. Average lengths for each domain are given in amino acids (aa).

4. Discussion

Claudins are the dominant constituents of cellular tight junctions and both their abundance and composition confer selective properties of paracellular ion flux across vertebrate tissues (Krause et al., 2009; Van Itallie and Anderson, 2006). The zebrafish, despite its status as one of the five major vertebrate models, is poorly annotated with respect to the diversity of the claudin members present in this species. In this study, our objectives were to review accessioned sequence records currently available through NCBI Gene and Ensembl, and identify both known and unknown zebrafish claudins based on inferred evolutionary history with other claudins from well-studied vertebrate genomes: human, mouse, Fugu, and X. tropicalis.
To promote better comparative inference across taxa, the zebrafish claudins were then reclassified under a proposed interim scheme, designed to create a unified framework for all vertebrate groups, reflective of shared evolutionary descent. Additionally, we report zebrafish claudin mRNA expression by tissue profiles and functionally examine select features in the polypeptide sequences of claudin groups. This work was undertaken to aid future investigations of claudin function in the zebrafish developmental model, and to facilitate further hypothesis testing in other vertebrate groups. From a comprehensive review of 88 claudin and claudin-related sequences, 54 claudins were identified in the zebrafish (Table S1). These sequences were used with other vertebrate claudins (197 sequences, Table S2) in an unrooted Bayesian phylogenetic analysis, which was then tested using four prominent model assumptions in the literature (substitution probability matrices: Poisson, Equalin, Blosum62, and WAG). The consensus trees were then tested together for significant differences in log-likelihood score by the S–H test for alternate evolutionary hypotheses (Shimodaira and Hasegawa, 1999). Common ancestry for all vertebrate claudins was determined by the inclusion of outgroup sequences. To date, this work comprises the largest dataset used to infer the evolutionary history of vertebrate claudins (Kollmar et al., 2001; Krause et al., 2009; Loh et al., 2004; Tipsmark et al., 2008b). At its core, evolutionary inference is merely a hypothesis, and we sought to support our findings when needed by additional evidence gathered from genomic synteny (Figs. 3, S1– 5). Fifty-one zebrafish claudins were orthologs of claudins previously described by Loh et al. (2004) in the pufferfish, T. rubripes. Orthologs of three zebrafish claudins, LOC793915 (cldn16), zgc:63990 (cldn12l), and cldng (cldn5l) have yet to be identified in pufferfish. 
Additionally, subsequent to our analysis, two putative claudins were discovered in the zebrafish genomic assembly (Zv9, Ensembl): ENSDART00000126328 and ENSDART00000076524. Identified tentatively as zebrafish cldn13 and cldn34b respectively, their appearance illustrates that gene discovery is ongoing, as accessioned sequence records and assemblies are continuously updated and revised. Gene names are discovery based, and reflect observed phenotypes, function, or esoteric whims of the discoverer (e.g. cheap date, kryptonite), yet these names are essentially meaningless if orthologs cannot easily be identified in other groups. Our analysis provided the opportunity to infer evolutionary relationships of claudins across vertebrates, and assess how well accepted gene names identify homologs in other groups. Zebrafish claudins containing alphabetical names (cldna-k) have clear orthologs in pufferfish, which have numerical designations (Figs. 1, 2). In mammals, lineage-specific gene duplicates (paralogs) are not identified as such, and these names have been erroneously assigned to genes in fishes (claudin 6 and claudin 9, claudin 8 and 17, claudin 22 and 24; Figs. 1–2, S2–3). To address this problem, we have proposed a reclassification for major claudin groups based upon our analysis of five commonly studied vertebrate models: human, mouse, zebrafish, Fugu, and X. tropicalis. Nomenclature commissions strictly regulate the use of gene names; however, our goal is to provide a comprehensive tool for investigators in the interim. The closing of claudin groups 9, 17, 21, 24, 26 and 27 is recommended to avoid future ambiguity. New, provisional groups are also proposed where homology could not be assigned or is contradictory to previous designations: 4-like, 5-like, 8-like, 25-like, 34, 35, and 36 (Table S3). These changes are designed to minimally affect current usage, yet confer needed evidence for common descent.
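In practice, the proposed changes amount to a lookup from a current symbol to an interim one. As a rough sketch only, the Python snippet below encodes a few of the reassignments proposed in the Results together with the groups recommended for closure. The taxon labels, function names, and table layout are illustrative assumptions and not part of the proposal itself. The table is deliberately incomplete, so Table S3 should be consulted for the full scheme.

```python
# A few reassignments stated in the text, keyed by (taxon label, current symbol).
# The taxon labels and this table layout are illustrative assumptions.
INTERIM_SYMBOLS = {
    ("mammal", "Cldn22"): "Cldn22α",
    ("mammal", "Cldn24"): "Cldn22β",
    ("T. rubripes", "cldn25"): "cldn15la",
    ("T. rubripes", "cldn26"): "cldn15lb",
}

# Group numbers recommended for closure to avoid future ambiguity.
CLOSED_GROUPS = frozenset({9, 17, 21, 24, 26, 27})


def interim_symbol(taxon: str, symbol: str) -> str:
    """Return the proposed interim symbol, or the input symbol if unchanged."""
    return INTERIM_SYMBOLS.get((taxon, symbol), symbol)


def is_closed_group(group_number: int) -> bool:
    """True if the numbered claudin group is recommended for closure."""
    return group_number in CLOSED_GROUPS


if __name__ == "__main__":
    print(interim_symbol("T. rubripes", "cldn25"))  # renamed in pufferfish
    print(interim_symbol("mammal", "Cldn25"))       # not in the table, returned as-is
    print(is_closed_group(26))
```

Keying the table on both taxon and symbol reflects the point made above: the same number can label non-orthologous genes in different lineages, as with mammal and pufferfish claudin 25, so a symbol alone is not a safe key.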
The mRNA expression of 43 zebrafish claudins was observed in a panel of eight discrete tissue types (Fig. 4), with an additional 9 claudins detected from whole fish cDNA. Only one claudin, zebrafish cldn10a, had no evidence for expression, either in our analysis or by searches of accessioned Expressed Sequence Tags (EST). The expression profiles we report are consistent with previous studies examining zebrafish claudins (Clelland and Kelly, 2010; Kumai et al., 2011), with small differences in findings likely attributable to differences in amplification efficiency or gel visualization. Quantitative methods (e.g. qPCR), using a wider panel of tissues, should be performed in future studies to establish the full range of claudin expression. Although our analysis was primarily for the verification of predicted genes and therefore limited in scope, paralog claudins nearly always possessed distinct expression profiles, which may suggest some degree of sub-functionalization. Some paralogs, such as cldn10c, cldn10d, and cldn10e, were restricted to expression by single tissue types (Fig. 4). These findings are significant to the study of gene and genome duplication events, where gene duplicates have the theoretical potential to neofunctionalize, due to relaxation of selection pressures (Conant and Wolfe, 2008; Ohno, 1970). Expansion of the claudin gene family in teleost fishes, paired with unique gain of function/loss of function techniques available in the zebrafish model, may provide an interesting case study for assessing the link between duplication and the development of evolutionary novelty. Studies have previously noted that the presence of many related claudins clustered within confined genomic regions (e.g. the claudin 3, 4 loci) likely point to tandem gene duplication (TGD) as a key component of claudin diversity in fishes (Lal-Nag and Morin, 2009; Loh et al., 2004). 
Our evidence agrees with these findings; however, we also note that the putative whole-genome duplication (WGD) in teleosts may also be a dominant component of this expansion, particularly if TGD events occurred prior to WGD. The existence of many orthologs between zebrafish and Fugu suggests claudin TGD may have occurred in the ancestor group (Neopterygii), at or near the timeframes proposed for the teleost WGD event (Hoegg et al., 2004; Taylor et al., 2003). Further examination of claudin loci from basal fish taxa will be required to assess alternate mechanisms of genomic addition and their relative importance to evolutionary diversification of expansive protein families, such as the claudins.

It is postulated that a net excess of positively or negatively charged residues in the ECL1 of claudins promotes formation of paracellular anion or cation pores, respectively, between trans-interacting claudins (Krause et al., 2008). Various studies in mouse and human collectively suggest that ‘classic’ claudins are involved in processes that reduce paracellular permeability (Krause et al., 2008). The overall conservation of six charged residues in the ECL1 of ‘classic’ zebrafish claudins (Fig. 5) may impart a similar functional property as an equal proportion of unequally charged residues leads to a tight interaction (Krause et al., 2009). Additionally, Colegio et al. (2002) have reported that offsetting the net charge by substituting a negative for a positive residue at one of these conserved sites in ECL1 of human CLDN4 increases paracellular Na+ permeability. Therefore, ‘classic’ zebrafish claudins consisting of those belonging to superclades depicted in Figs. 1C, D, and 2B, C, D, and E may function in tight junctions characteristic of low paracellular permeability.
Since ‘non-classic’ claudins typically appear to increase paracellular permeability to various ions and claudin composition within tight junctions is thought to influence paracellular permeability properties (review: Krause et al., 2008), structural variation in this particular class of claudins is not surprising. Comparatively less is known of the functional properties of basic and acidic residues in the ECL1 of ‘non-classic’ claudins, thus they pose interesting targets for future investigations that employ site directed mutagenesis (see: Wen et al., 2004; Piontek et al., 2008; Angelow and Yu, 2009). For example, Colegio et al. (2002) reported that substituting all of the acidic with basic residues in the ECL1 of human CLDN15 reverses paracellular permeability from cations to anions. These findings collectively suggest that ‘non-classic’ claudins may participate in formation of leaky junctions and variation in their composition at tight junction sites influences paracellular permeability characteristics (e.g., anion, cation, etc.). Finally, future investigations of trans-interactions will be required to elucidate the significance of structural differences in the ECL2 of ‘non-classic’ zebrafish claudins, which may relate to cell– matrix interactions that regulate cell differentiation or proliferation in addition to that of tight junction formation (see: Angelow et al., 2008; Heiskala et al., 2001). 5. Concluding remarks Here, we describe the molecular structures, evolutionary history, and mRNA expression of 54 zebrafish claudins in the most recent annotated zebrafish genome (Zv9). In order to resolve ambiguity and promote future gene discovery with correct classification, we propose a unified interim nomenclature based upon shared evolutionary history of vertebrate claudins. Additionally, we classify zebrafish claudins into broad functional groups based upon the structures and features of the ECL1 and ECL2. 
This work was undertaken to provide investigators with better insight into the evolutionary relationships and functionality of vertebrate claudins, and to facilitate future hypothesis testing from a comparative approach.

Conflict of interest

The authors declare no conflict of interest in the work or funding of this study.

Acknowledgments

The authors would like to thank Dr. John Godwin (NC State University, Raleigh) for providing the zebrafish (Tuebingen) used in this study. This study was funded by the National Science Foundation and the United States Department of Agriculture.

Appendix A. Supplementary data

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.margen.2013.05.001.

References

Abdelilah-Seyfried, S., 2010. Claudin-5a in developing zebrafish brain barriers: another brick in the wall. Bioessays 32, 768–776. Angelow, S., Yu, A.S.L., 2009. Structure–function studies of claudin extracellular domains by cysteine-scanning mutagenesis. J. Biol. Chem. 284, 29205–29217. Angelow, S., et al., 2008. Biology of claudins. Am. J. Physiol. Renal Physiol. 295, F867–F876. Bagherie-Lachidan, et al., 2008. Claudin-3 tight junction proteins in Tetraodon nigroviridis: cloning, tissue-specific expression, and a role in hydromineral balance. Am. J. Physiol. Regul. Integr. Comp. Physiol. 294, R1638–R1647. Bishop, M.J., Friday, A.E., 1987. Tetrapod relationships: the molecular evidence. In: Patterson, C. (Ed.), Molecules and Morphology in Evolution. Cambridge University Press, Cambridge (UK), pp. 123–139. Cheung, I.D., et al., 2011. Regulation of intrahepatic biliary duct morphogenesis by Claudin 15-like b. Dev. Biol. 361, 68–78. Clelland, E.S., Kelly, S.P., 2010. Tight junction proteins in zebrafish ovarian follicles: stage specific mRNA abundance and response to 17β-estradiol, human chorionic gonadotropin, and maturation inducing hormone. Gen. Comp. Endocrinol. 168, 388–400.
Colegio, O.R., et al., 2002. Claudins create charge-selective channels in the paracellular pathway between epithelial cells. Am. J. Physiol. Cell Physiol. 283, C142–C147. Colegio, O.R., et al., 2003. Claudin extracellular domains determine paracellular charge selectivity and resistance but not tight junction fibril architecture. Am. J. Physiol. Cell Physiol. 284, C1346–C1354. Conant, G.C., Wolfe, K.H., 2008. Turning a hobby into a job: how duplicated genes find new functions. Nat. Rev. Genet. 9, 938–950. Dayhoff, M.O., et al., 1978. A model of evolutionary change in proteins. Atlas of Protein Sequence and Structure, vol. 5 (S3). National Biomedical Research Foundation, Washington, D.C., pp. 345–352. D'Souza, T., et al., 2005. Phosphorylation of claudin-3 at threonine 192 by cAMP-dependent protein kinase regulates tight junction barrier function in ovarian cancer cells. J. Biol. Chem. 280, 26233–26240. Felsenstein, J., 1981. Evolutionary trees from DNA sequences: a maximum likelihood approach. J. Mol. Evol. 17, 368–376. Hardison, A.L., et al., 2005. The zebrafish gene claudinj is essential for normal ear function and important for the formation of the otoliths. Mech. Dev. 122, 949–958. Heiskala, M., et al., 2001. The roles of claudin superfamily proteins in paracellular transport. Traffic 2, 92–98. Henikoff, S., Henikoff, J.G., 1992. Amino acid substitution matrices from protein blocks. Proc. Natl. Acad. Sci. U. S. A. 89, 10915–10919. Hewitt, K.J., et al., 2006. The claudin gene family: expression in normal and neoplastic tissues. BMC Cancer 6, 186. Hoegg, S., et al., 2004. Phylogenetic timing of the fish-specific genome duplication correlates with the diversification of teleost fish. J. Mol. Evol. 59, 190–203. Huelsenbeck, J.P., Ronquist, F., 2001. MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics 17, 754–755. Jeong, J.Y., et al., 2008. Functional and developmental analysis of the blood–brain barrier in zebrafish. Brain Res. Bull. 75, 619–628.
Jones, D.T., et al., 1992. The rapid generation of mutation data matrices from protein sequences. Comput. Appl. Biosci. 8, 275–282. Katoh, M., Katoh, M., 2003. CLDN23 gene, frequently down-regulated in intestinal-type gastric cancer, is a novel member of CLAUDIN gene family. Int. J. Mol. Med. 11, 683–689. Kollmar, R., et al., 2001. Expression and phylogeny of claudins in vertebrate primordia. Proc. Natl. Acad. Sci. U. S. A. 98, 10196–10201. Krause, G., et al., 2008. Structure and function of claudins. Biochim. Biophys. Acta 1778, 631–645. Krause, G., et al., 2009. Structure and function of extracellular claudin domains. Ann. N. Y. Acad. Sci. 1165, 34–43. Kumai, Y., et al., 2011. Strategies for maintaining Na+ balance in zebrafish (Danio rerio) during prolonged exposure to acidic water. Comp. Biochem. Physiol. A 160, 52–62. Lal-Nag, M., Morin, P.J., 2009. The claudins. Genome Biol. 10, 235. http://dx.doi.org/10.1186/gb-2009-10-8-235. Le Moellic, C., et al., 2005. Aldosterone and tight junctions: modulation of claudin-4 phosphorylation in renal collecting duct cells. Am. J. Physiol. Cell Physiol. 289, C1513–C1521. Loh, Y.H., et al., 2004. Extensive expansion of the claudin gene family in the teleost fish, Fugu rubripes. Genome Res. 14, 1248–1257. Lu, G., Moriyama, E.N., 2004. Vector NTI, a balanced all-in-one sequence analysis suite. Brief. Bioinform. 5, 378–388. Miller, M.A., et al., 2010. Creating the CIPRES Science Gateway for Inference of Large Phylogenetic Trees. Gateway Computing Environments Workshop (GCE), New Orleans, LA. Morita, K., Furuse, M., Fujimoto, K., Tsukita, S., 1999. Claudin multigene family encoding four-transmembrane domain protein components of tight junction strands. Proc. Natl. Acad. Sci. U. S. A. 96, 511–516. Münzel, E.J., et al., 2011. Claudin k is specifically expressed in cells that form myelin during development of the nervous system and regeneration of the optic nerve in adult zebrafish. Glia 60, 253–270. Nilsson, H., et al., 2007.
Effects of hyperosmotic stress on cultured airway epithelial cells. Cell Tissue Res. 330, 257–269. Ohno, S., 1970. Evolution by Gene Duplication. Springer-Verlag, Heidelberg. Ohta, H., et al., 2006. Restricted localization of claudin-16 at the tight junction in the thick ascending limb of Henle's loop together with claudins 3, 4, and 10 in bovine nephrons. J. Vet. Med. Sci. 68, 453–463. Piontek, J., et al., 2008. Formation of tight junction: determinants of homophilic interaction between classic claudins. FASEB J. 22, 146–158. Povey, S., et al., 2001. The HUGO Gene Nomenclature Committee (HGNC). Hum. Genet. 109, 678–680. Satake, S., et al., 2008. Cdx2 transcription factor regulates claudin-3 and claudin-4 expression during intestinal differentiation of gastric carcinoma. Pathol. Int. 58, 156–163. Schmidt, H.A., et al., 2002. TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics 18, 502–504. Schultz, J., et al., 1998. SMART, a simple modular architecture research tool: identification of signaling domains. Proc. Natl. Acad. Sci. U. S. A. 95, 5857–5864. Shimodaira, H., Hasegawa, M., 1999. Multiple comparisons of log-likelihoods with applications to phylogenetic inference. Mol. Biol. Evol. 16, 1114–1116. Siddiqui, M., et al., 2010. The tight junction component Claudin E is required for zebrafish epiboly. Dev. Dyn. 239, 715–722. Taylor, J.S., et al., 2003. Genome duplication, a trait shared by 22,000 species of ray-finned fish. Genome Res. 13, 382–390. Thompson, J.D., et al., 1994. CLUSTALW: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680. Thompson, J.D., et al., 1997. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25, 4876–4882. Tipsmark, C.K., et al., 2008a. 
Salinity regulates claudin mRNA and protein expression in the teleost gill. Am. J. Physiol. Regul. Integr. Comp. Physiol. 294, R1004–R1014. Tipsmark, C.K., et al., 2008b. Branchial expression patterns of claudin isoforms in Atlantic salmon during seawater acclimation and smoltification. Am. J. Physiol. Regul. Integr. Comp. Physiol. 294, R1563–R1574. Van Itallie, C.M., Anderson, J.M., 2006. Claudins and epithelial paracellular transport. Annu. Rev. Physiol. 68, 403–429. Wen, H., et al., 2004. Selective decrease in paracellular conductance of tight junctions: role of the first extracellular domain of claudin-5. Mol. Cell. Biol. 24, 8408–8417. Whelan, S., Goldman, N., 2001. A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Mol. Biol. Evol. 18, 691–699. Winkler, L., et al., 2009. Molecular determinants of the interaction between Clostridium perfringens enterotoxin fragments and Claudin-3. J. Biol. Chem. 284, 18863–18872. Zhang, J., et al., 2010. Establishment of a neuroepithelial barrier by Claudin5a is essential for zebrafish brain ventricular lumen expansion. Proc. Natl. Acad. Sci. U. S. A. 107, 1425–1430.