

Braz J Med Biol Res 37(4) 2004

Brazilian Journal of Medical and Biological Research (2004) 37: 459-478
ISSN 0100-879X

Genome features of Leptospira interrogans serovar Copenhageni

A.L.T.O. Nascimento 1, S. Verjovski-Almeida 2, M.A. Van Sluys 3, C.B. Monteiro-Vitorello 5, L.E.A. Camargo 5, L.A. Digiampietri 6, R.A. Hartskeerl 7, P.L. Ho 1, M.V. Marques 4, M.C. Oliveira 3, J.C. Setubal 6,10, D.A. Haake 8,9 and E.A.L. Martins 1

1 Centro de Biotecnologia, Instituto Butantan, São Paulo, SP, Brasil
2 Instituto de Química, 3 Instituto de Biociências and 4 Instituto de Ciências Biomédicas, Universidade de São Paulo, São Paulo, SP, Brasil
5 Escola Superior de Agricultura Luiz de Queiroz, Universidade de São Paulo, Piracicaba, SP, Brasil
6 Laboratório de Bioinformática, Instituto de Computação, Universidade Estadual de Campinas, Campinas, SP, Brasil
7 Koninklijk Instituut voor de Tropen/Royal Tropical Institute (KIT), KIT Biomedical Research, Amsterdam, The Netherlands
8 The David Geffen School of Medicine at University of California, Los Angeles, CA, USA
9 Division of Infectious Diseases, Veterans Affairs Greater Los Angeles Healthcare System, Los Angeles, CA, USA
10 Virginia Bioinformatics Institute and Department of Computer Science, Virginia Polytechnic Institute and State University Bioinformatics 1, Blacksburg, VA, USA

Abstract

We report novel features of the genome sequence of Leptospira interrogans serovar Copenhageni, a highly invasive spirochete. Leptospira species colonize a significant proportion of rodent populations worldwide and produce life-threatening infections in mammals. Genomic sequence analysis reveals a competent transport system with 13 families of genes encoding major transporters, including a three-member component efflux system compatible with the long-term survival of this organism. The leptospiral genome contains a broad array of genes encoding regulatory systems, signal transduction proteins and methyl-accepting chemotaxis proteins, reflecting the organism's ability to respond to diverse environmental stimuli. The identification of a complete set of genes encoding the enzymes of the cobalamin biosynthetic pathway, and of novel coding genes related to lipopolysaccharide biosynthesis, should shed new light on Leptospira physiology. Genes related to toxins, lipoproteins and several surface-exposed proteins may facilitate a better understanding of Leptospira pathogenesis and may serve as potential vaccine candidates.

Correspondence
A.L.T.O. Nascimento
Centro de Biotecnologia, Instituto Butantan
Av. Vital Brazil, 1500
05503-900 São Paulo, SP
Brasil
Fax: +55-11-3726-1505
E-mail: [email protected]

Research supported by FAPESP and CNPq.

Received February 20, 2004
Accepted March 8, 2004

Key words
• Leptospira interrogans
• Outer membrane proteins
• Lipopolysaccharides
• Transport
• Regulatory systems

Introduction

Spirochetes are motile, helically shaped bacteria which include the genera Leptospira, Leptonema, Borrelia and Treponema. Borrelia and Treponema are the causative agents of Lyme disease, relapsing fever and syphilis. Leptospira consists of a genetically diverse group of pathogenic and non-pathogenic or saprophytic species (1). Leptospirosis is a widespread zoonotic disease: transmission to humans occurs through contact with domestic or wild animal reservoirs or an environment contaminated by their urine. Infection produces a wide spectrum of clinical manifestations. The early phase of illness is characterized by fever, chills, headache, and severe myalgias. The disease progresses in 5 to 15% of the clinical infections to produce severe multisystem complications such as jaundice, renal failure and hemorrhagic manifestations (2). In developed countries, leptospirosis is associated with recreational activities (1), while in developing countries it produces large urban epidemics, with mortality mainly during the rainy season (3). Leptospirosis also represents a major economic problem, producing abortions, stillbirths, infertility, failure to thrive, reduced milk production, and death in animals such as cows, pigs, sheep, goats, horses, and dogs (1). Environmental control measures are difficult to implement because of the long-term survival of pathogenic leptospires in soil and water and the abundance of wild and domestic animal reservoirs (1).

Leptospira are classified according to serovar status; more than 200 pathogenic serovars have been identified. Structural heterogeneity in lipopolysaccharide (LPS) moieties appears to be the basis for the large degree of antigenic variation observed among serovars (1). The development of vaccines has been pursued as a strategy for the prevention of leptospirosis. At present, vaccines are based on inactivated whole cell or membrane preparations of pathogenic leptospires, which induce immune responses against leptospiral LPS (1). However, these vaccines do not induce long-term protection against infection and do not provide cross-protective immunity against heterologous leptospiral serovars. Protein antigens conserved among pathogenic serovars may contribute to overcoming the limitations of the currently available vaccines.

The genome sequence of Leptospira interrogans serovar Lai was recently published (4) and comparative genome analysis with L. interrogans serovar Copenhageni has been performed. We report here new features of the L. interrogans serovar Copenhageni genome that should contribute to understanding the molecular mechanisms of leptospiral physiology and pathogenesis and facilitate the identification of candidates for broad-range vaccines.

Material and Methods

The sequenced strain, Fiocruz L1-130, was isolated as described by Nascimento et al. (5). The sequencing strategy follows the basic outline of the Xylella genome project (6). Library construction, sequencing, assembly, and finishing were carried out by the Agronomical and Environmental Genomes consortium [http://aeg.lbi.ic.unicamp.br] and by Instituto Butantan. The genome was assembled using phrap from shotgun reads, cosmid reads and PCR-product sequences. Scaffolding was performed using in-house software. The finishing criteria required each consensus base to have a phred quality of at least 20 and to be covered by at least one read from each DNA strand (6). The first base of the sequence was chosen based on our hypothesis for the location of the origin of replication, which was in turn based on the presence of certain genes and on GC-skew variation. Genome annotation and comparative genomics were done as previously described (7). Detection of potentially surface-exposed integral membrane proteins was carried out as described by Nascimento et al. (5). Sequences from 16S rDNA were manually assembled using ESEE 3.2. Phylogenetic analyses were performed based on two matrices (34 taxa and 1255 positions; 24 taxa and 1375 positions) using the program PAUP 4.0b8 (8). Divergence time was estimated based on 1445 positions of the 16S rRNA sequences. A constant rate of 1 to 2% per 50 million years was assumed (9).
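The two finishing criteria stated above (consensus phred quality of at least 20, and coverage by at least one read on each strand) can be expressed as a simple per-base check. The following is a minimal sketch, not the consortium's actual finishing software; the `ConsensusBase` record and its field names are hypothetical. It relies only on the standard phred definition, Q = -10 log10(P), under which Q20 corresponds to an error probability of 1%.

```python
from dataclasses import dataclass


@dataclass
class ConsensusBase:
    """One consensus position; the field names are hypothetical."""
    phred: int          # consensus phred quality score
    forward_reads: int  # reads covering the base on the forward strand
    reverse_reads: int  # reads covering the base on the reverse strand


def phred_to_error_probability(q: int) -> float:
    """Standard phred definition: Q = -10 * log10(P_error)."""
    return 10 ** (-q / 10)


def is_finished(base: ConsensusBase, min_phred: int = 20) -> bool:
    """Apply both criteria from the text: quality and double-strand coverage."""
    return (
        base.phred >= min_phred
        and base.forward_reads >= 1
        and base.reverse_reads >= 1
    )


def unfinished_positions(consensus: list[ConsensusBase]) -> list[int]:
    """Return 0-based indices of positions that fail either criterion."""
    return [i for i, b in enumerate(consensus) if not is_finished(b)]
```

A position with high quality but reads from only one strand is still reported as unfinished, which mirrors the requirement that both conditions hold.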

The sequences have been deposited in GenBank under accession numbers AE016823 (chromosome I) and AE016824 (chromosome II).


Results and Discussion

Genome analysis

The Leptospira genome consists of two circular chromosomes with a total of 4,627,366 base pairs (bp): chromosome I with 4,277,185 bp and chromosome II with 350,181 bp (5). Circular representations of both chromosomes are depicted in Figure 1. The origin of replication of the large replicon was identified between the dnaA and dnaN genes, as in other bacterial genomes (10). GC nucleotide skew, (G - C)/(G + C), analysis (11) confirmed the origin of replication of the large replicon and indicated two putative sites for the small replicon.
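To illustrate the GC skew statistic, the sketch below computes (G - C)/(G + C) in non-overlapping windows and accumulates it along the sequence. Taking the minimum of the cumulative skew as the putative origin is a common convention and is an assumption here; it is not necessarily the exact procedure of reference 11, and the window size is a free parameter.

```python
def gc_skew(window: str) -> float:
    """(G - C) / (G + C) for one window; 0.0 if the window has no G or C."""
    w = window.upper()
    g, c = w.count("G"), w.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def cumulative_skew(seq: str, window: int) -> list[float]:
    """Running sum of per-window GC skew over non-overlapping windows."""
    totals, running = [], 0.0
    for start in range(0, len(seq) - window + 1, window):
        running += gc_skew(seq[start:start + window])
        totals.append(running)
    return totals


def putative_origin(seq: str, window: int) -> int:
    """0-based coordinate just after the window where cumulative skew is lowest."""
    totals = cumulative_skew(seq, window)
    if not totals:
        raise ValueError("sequence is shorter than one window")
    i_min = min(range(len(totals)), key=totals.__getitem__)
    return (i_min + 1) * window
```

On a toy sequence that is C-rich in its first half and G-rich in its second, the cumulative skew falls and then rises, and the reported coordinate is the switch point. A real circular chromosome would also need the sequence to be rotated or wrapped, which this sketch does not attempt.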

As previously described (12), rRNA genes in L. interrogans are not organized into operons, as in most other bacteria, but are scattered over chromosome I (Figure 1). L. interrogans serovar Copenhageni has one rrf gene, two rrl genes and two rrs genes coding for 5S, 23S and 16S rRNA, respectively. As in other parasitic strains, L. interrogans serovar Copenhageni has only one rrf (5S) gene, which is located close to the origin of replication, as described before for other strains of L. interrogans (12). A comparison of the complete rrs (16S) sequences of L. interrogans serovars Copenhageni, Lai and Canicola shows 99.9 to 100% identity among the sequences. The rrf (5S) sequence identity between Lai and Copenhageni is 100% and the rrl (23S) identity is 99%. Based on ribosomal genes, Copenhageni and Lai are closely related, as supported by the whole genome comparison (5).
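Identity figures such as those above are typically computed over the columns of a pairwise alignment. The sketch below is one plausible definition, assuming the two sequences are already aligned to equal length with `-` as the gap character and that gapped columns are excluded from the denominator; the text does not state which convention was used.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # assumption: gapped columns are not counted
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

Under this definition, 99.9% identity over a 16S gene of roughly 1,500 positions corresponds to one or two mismatched columns.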

The Spirochaetes are divided into three major phylogenetic groups, or families: Spirochaetaceae, which includes Borrelia and Treponema among others; Brachyspiraceae; and Leptospiraceae, which contains two genera, Leptospira and Leptonema (13). Phylogenetic analyses based on 16S rDNA sequences, using Leptonema as an outgroup, show that Leptospira are split into two well-supported monophyletic groups (Figure 2), one of them formed by the pathogenic strains (e.g., L. interrogans) and the other formed by the non-pathogenic strains (e.g., L. biflexa). At the base of the clade of the pathogenic strains, L. inadai and L. fainei form a well-supported assemblage. Similar results were obtained by Postic et al. (14) based on 16S rDNA analyses. In these analyses L. interrogans serovars formed a well-supported monophyletic cluster closely related to L. kirchneri (Figure 2B). Considering a constant divergence rate of 1 to 2% per 50 million years for the 16S rDNA (9), separation time between the two main assemblages (L. interrogans versus L. biflexa) was estimated at 590 to 295 million years.
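The divergence-time estimate follows from a constant-rate clock: time = (observed divergence / rate per 50 million years) × 50 million years. The sketch below reproduces the arithmetic. The 11.8% divergence value is an assumption back-calculated from the reported range of 590 to 295 million years; it is not quoted in the text.

```python
def divergence_time_my(divergence_pct: float, rate_pct_per_50my: float) -> float:
    """Time in million years for a given 16S divergence and constant clock rate."""
    if rate_pct_per_50my <= 0:
        raise ValueError("rate must be positive")
    return divergence_pct / rate_pct_per_50my * 50.0


# Assumption: 11.8% is inferred from the reported range, not stated in the paper.
INFERRED_DIVERGENCE_PCT = 11.8

older = divergence_time_my(INFERRED_DIVERGENCE_PCT, 1.0)    # slower clock, 1% per 50 My
younger = divergence_time_my(INFERRED_DIVERGENCE_PCT, 2.0)  # faster clock, 2% per 50 My
```

Because the rate enters as a divisor, doubling the assumed rate halves the estimated time, which is why the reported bounds differ by exactly a factor of two.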

L. interrogans belongs to a growing number of multichromosomal prokaryotes, including Vibrio cholerae (15) and Ralstonia solanacearum (16). The small replicon of L. interrogans was previously suggested to be a second chromosome based upon the localization of the metF gene, which encodes an essential biosynthetic pathway enzyme, methylene tetrahydrofolate reductase (17). The genome sequence reveals that genes encoding enzymes for metabolic pathways, such as glycolysis and the tricarboxylic acid cycle, as well as the enzymes for amino acid and co-factor biosynthesis pathways, are also distributed between the two chromosomes. Sequence analysis of chromosome II shows that an almost complete operon of genes coding for the protoheme biosynthesis pathway is present (hemAIBCENYH). Although no homologue of the gene coding for uroporphyrinogen III synthetase (hemD) was found, experimental evidence has shown that the hemC gene product can supply hemD activity (18). Therefore, L. interrogans has the ability to synthesize protoheme de novo.

In addition, 13 genes clustered in chromosome II coding for the cobalamin biosynthesis pathway were identified (cobC, cobD, cbiP, cobP, cobB, cobO, cobM, cobJ, cbiG, cobI, cobL, cobH, cobF) (Figure 3). Orthologues of the cobGKN genes, known to be involved in the cobalamin pathway (19), were not found. However, there are two predicted coding sequences inside this operon in chromosome II that could perform these steps. One has an oxidoreductase NAD-binding domain (LIC20133) and the other is a [2Fe-2S] ferredoxin involved in electron transfer (LIC20135). In addition, other genes present in the genome coding for reductases, such as LIC11145, LIC13354, LIC12391, and LIC10522, could also perform these activities. The cysG gene in chromosome I encodes a multifunctional protein with methylase, oxidase and ferrochelatase activities (20), which may also act as the cobalt-inserting enzyme in the B12 pathway. Other genes involved in this biosynthesis pathway were found in chromosome I (cysG/hemX/cobA, cobT/cobU, cobS) (Figure 3). In fact, experimental evidence has recently shown that leptospires can grow in medium in the absence of B12 (Hartskeerl RA, unpublished results). This is contrary to the previous statement that L. interrogans is unable to synthesize B12, which was the reason given for B12 being an essential component of the EMJH semi-synthetic medium (1,4). Thus, L. interrogans, unlike the spirochetes Borrelia burgdorferi and Treponema pallidum, has the complete repertoire of genes for de novo synthesis of protoheme and cobalamin. The functional link between the two replicons supports the view that the small replicon is in fact a second chromosome.

Figure 1. Circular maps of the two chromosomes in Leptospira interrogans. Circular representation of the L. interrogans serovar Copenhageni genome. The large and small chromosomes are presented. Circles 1 and 2 (from the outside in), all predicted protein-coding regions (forward and reverse strand, respectively) color-coded by category role; circle 3: G + C content. Numbers on the outer circle are base pairs. rRNA genes, rrl, rrs and rrf are 23S, 16S and 5S, respectively. Note: the two replicons are not drawn to the same scale. Chromosome II (CII) is 12 times smaller than chromosome I (CI).

Figure 2. Phylogenetic analysis of Leptospira interrogans. Consensus phylogenetic distance tree constructed using 16S rDNA sequences. Numbers on the branches are bootstrap values (2000 replicates) for distance (top), parsimony (below the branch) and maximum likelihood (10 replicates, below the branch, italics). GenBank accession numbers are in parentheses. Leptonema is used as outgroup. A, Based on a matrix with 34 taxa and 1255 positions; B, based on a matrix with 24 taxa and 1375 positions.

Transport

Among the 220 Leptospira transport proteins, we found 148 proteins comprising 108 different major primary and secondary transporters (Figure 4), as defined by Paulsen et al. (21). There are 34 major primary ATP-driven transporters, including 30 ABC-transporters (53 proteins), the largest protein family in L. interrogans, as is usually the case for bacteria (22). The ABC superfamily contains both uptake and efflux transport systems, and ATP hydrolysis energizes the transport. The porters of the ABC superfamily consist of two integral membrane proteins (that determine specificity of the transported solute) and two cytoplasmic ATP-hydrolyzing proteins present as homodimers or heterodimers. The uptake systems (but not the efflux systems) additionally possess extracytoplasmic solute-binding receptors (one or more per system), which in Gram-negative bacteria are found in the periplasm. We found 21 ABC efflux systems, including drug and heavy metal export and detoxification, lipoprotein-releasing and hemolysin export systems. The remaining 9 are ABC uptake systems, including iron, sulfate, nickel, phosphate, dipeptide, amino acid, and carbohydrate uptake. There is one F-type ATP-synthase system (8 proteins), as mentioned above, and 3 P-type cation-transport ATPases (5 proteins).
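The counts of primary transporters quoted above can be cross-checked with a short tally. This is only an arithmetic sketch using the figures transcribed from the paragraph; the dictionary layout and names are illustrative, and the summed protein count is derived here rather than quoted in the text.

```python
# (systems, proteins) per primary transporter family, as given in the text.
PRIMARY = {
    "ABC": (30, 53),
    "F-type ATP synthase": (1, 8),
    "P-type ATPase": (3, 5),
}

# Split of the ABC systems into efflux and uptake, as given in the text.
ABC_SPLIT = {"efflux": 21, "uptake": 9}


def totals(families: dict[str, tuple[int, int]]) -> tuple[int, int]:
    """Sum systems and proteins across all families."""
    systems = sum(s for s, _ in families.values())
    proteins = sum(p for _, p in families.values())
    return systems, proteins


primary_systems, primary_proteins = totals(PRIMARY)
abc_systems = sum(ABC_SPLIT.values())
```

The system total agrees with the 34 primary ATP-driven transporters reported, and the efflux and uptake counts add up to the 30 ABC systems.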

We found 59 secondary electrochemical potential-driven transporters (65 proteins), including the largest family of secondary transporters, the resistance-nodulation-cell division superfamily with 12 members, 5 of which belong to the heavy metal efflux family (with 7 proteins) and 7 to the multiple drug efflux family (with 8 proteins). These secondary efflux transporters are energized by proton-motive force and show the widest substrate specificity among all known multidrug pumps, ranging from most of the currently used antibiotics, disinfectants, dyes, and detergents to simple solvents. The second largest secondary transporter family is the major facilitator superfamily with 9 members, including drug:cation antiporters (8 proteins) and a glycerol-phosphate:inorganic phosphate antiporter (1 protein). Additional secondary transporters include three sodium-solute symporters (including one sodium-glucose cotransporter), three sodium-bile-acid symporters, and symporters for phosphate, di- and tripeptides, glutamate, amino acids, uracil, and sulfate, as well as an ammonium:potassium antiporter, a glycerol-3-P:Pi antiporter, two drug-sodium antiporters and two polysaccharide exporters. Secondary transporters also include 7 members of the TonB family of auxiliary proteins for energizing the outer membrane receptor-mediated active transport.

Leptospira has many iron-transporting proteins in addition to the cobalamin/iron periplasmic binding protein component of the ABC-transport system (LIC13403) mentioned above. Four members of the outer membrane receptor family have been identified, which are probably involved in the transport of iron dicitrate (LIC10714), hemin (LIC10964), ferric hydroxamate (LIC11345) and ferrienterobactin (LIC12998). In addition, an orthologue of the ferrous iron uptake (FeoB) protein of E. coli was found (LIC11402). FeoB has been characterized as an Fe2+ uptake system and possesses an ATP/GTP binding motif at its N-terminal hydrophilic domain, and is therefore probably energized by ATP or GTP hydrolysis (23).

Figure 3. The cobalamin biosynthetic pathway of Leptospira interrogans. Genetic organization of the cobalamin biosynthetic locus in the two chromosomes (CI and CII) of L. interrogans serovar Copenhageni.

Figure 4. Major transporters of Leptospira interrogans. Major primary transporters are driven by ATP hydrolysis and include ABC-transporters, P-type ion transporters and the ATP-synthase. All other transporters indicated are the major secondary transporters, which are driven by the transmembrane electrochemical potential. RND = resistance-nodulation-cell division superfamily.

Oxygen defense

The Leptospira genome contains several genes encoding enzymes with peroxidase activities, such as catalase, glutathione peroxidase and thiol peroxidase. However, superoxide dismutase orthologues and two important regulons, SoxRS and OxyR, normally responsible for the defense against oxidative stress, are absent (5). It is possible that metalloporphyrins (24) could provide defense against oxidative damage, since L. interrogans is competent for porphyrin biosynthesis and has metal uptake transport proteins. Genes coding for co-migratory bacterioferritin (LIC20093 and LIC10732), thiol peroxidase (LIC12765) and peroxiredoxin (LIC11219) with alkyl hydroperoxide reductase (AhpC) and thiol-specific antioxidant activities, respectively, were also identified. Two predicted coding sequences for bacterioferritin (LIC11310 and LIC13209), which may have functions analogous to animal ferritin, are also present and may provide both iron detoxification and storage, which would prevent free iron in Leptospira from driving oxidative reactions.

Regulatory systems and signal transduction

There are many genes encoding signal transduction proteins in Leptospira, indicating that this organism has developed a vast array of regulatory systems that enable it to respond to environmental signals. This variety of regulatory domains is found in non-obligate parasitic bacteria, indicating a much greater need to interpret the signals from the environment in order to respond properly, while obligate parasites have a much lower number and variety of domains in their signal transduction proteins (25). There are 80 genes encoding components of the phosphorylation-mediated signal transduction pathway: 29 histidine kinases (HK), 30 response regulators (RR) and 18 hybrid kinase/regulators (HK/RR). These members of the two-component systems present several domains organized into different arrangements (Figure 5). Nineteen of the histidine kinases are located in the inner membrane, nine are cytoplasmic and one is probably found in the periplasm, as predicted by the PSORT program [http://psort.nibb.ac.jp/].

On the other hand, only a third of the hybrid HK/RR are membrane-bound, suggesting that these hybrid proteins are most likely to be involved in phosphorylation cascades, although some of them have the sensor PAS domain. The PAS domain has been reported to sense different environmental stimuli, like oxygen, redox state, nitrogen availability or light, and it may or may not be associated with co-factors such as heme or FAD (26). A two-component protein containing a PAS sensor domain was found to be required for virulence of M. tuberculosis in mice, probably because it senses reactive oxygen intermediates generated by macrophage phagocytosis (27), and this is probably also the case for Leptospira.

Figure 5. Domain architecture of putative signal transduction proteins from Leptospira interrogans. The domain organization of histidine kinases, response regulators, hybrid histidine kinase/response regulators and other peptides probably associated with signal transduction is shown schematically. The number of peptides containing each domain organization is shown on the right, and the genes are found distributed over the large or small replicons, except for the seven peptides containing the PAS/GGDEF arrangement, which are encoded by genes located in tandem in the large replicon. AAA = ATPase; Cache = small ligand binding domain; EAL = phosphodiesterase; GAF = cGMP-binding domain; GGDEF = diguanylate cyclase; HK = histidine kinase domain; HTH = helix-turn-helix; PAS = sensor domain (may include a PAS/PAC arrangement); PP = phosphatase; RR = response regulator domain; SP = signal peptide; TM = transmembrane region.

The response regulators are the cytoplasmic effectors of the message, which become functional after being phosphorylated at an aspartate residue by the cognate histidine kinase. The RRs may possess a second effector domain (Figure 5), which performs the ultimate function, such as the DNA-binding helix-turn-helix domain that allows the RR to regulate transcription. Other noticeable domains found in L. interrogans RRs are the GGDEF and EAL motifs, which correspond to putative diguanylate cyclase and phosphodiesterase domains, respectively, and a phosphatase domain similar to the mammalian phosphatase 2C that may be involved in the phosphorelay.

Cyclic nucleotides are likely to have a major regulatory role in Leptospira. There are 19 homologues of adenylate/guanylate cyclases, two specific phosphodiesterases and 7 cyclic nucleotide-binding proteins that probably have a regulatory function. There are 12 genes encoding proteins containing the GGDEF motif, seven of which are organized in tandem in chromosome I (LIC11131 to LIC11125) and also have a PAS/PAC sensor domain in the amino-terminal region, suggesting that they are the product of gene duplication. The diguanylate cyclase activity of the GGDEF domain is required for a novel regulatory mechanism involving bis-(3',5')-cyclic diguanylic acid (c-di-GMP) as an allosteric activator (28). Two of the GGDEF-containing proteins also have the Cache domain, which is involved in chemotaxis signal transduction (29), an important feature for Leptospira. There are eight putative diguanylate phosphodiesterases containing an EAL domain, three of which are associated with an RR domain (Figure 5). The L. interrogans genome also encodes three serine/threonine kinases and two hybrid HK/RR with a GAF domain, which is a binding domain for cGMP (30).
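The tandem GGDEF genes are given as a locus-tag range (LIC11131 to LIC11125). A small helper can expand such a range into individual tags. This sketch assumes that the tags in the range are numbered consecutively with no gaps, which is consistent with the seven genes stated in the text but is otherwise not confirmed by it; the function name is illustrative.

```python
import re

_TAG = re.compile(r"^([A-Za-z]+)(\d+)$")


def expand_locus_range(first: str, last: str) -> list[str]:
    """List every locus tag from first to last, inclusive, in the given direction."""
    m1, m2 = _TAG.match(first), _TAG.match(last)
    if not m1 or not m2 or m1.group(1) != m2.group(1):
        raise ValueError("tags must share a prefix and end in digits")
    prefix, width = m1.group(1), len(m1.group(2))
    a, b = int(m1.group(2)), int(m2.group(2))
    step = 1 if b >= a else -1
    # Zero-pad to the original width so tags keep their fixed length.
    return [f"{prefix}{n:0{width}d}" for n in range(a, b + step, step)]


# Assumption: consecutive numbering across the tandem array.
ggdef_tandem = expand_locus_range("LIC11131", "LIC11125")
```

The list preserves the descending order used in the text, which may be convenient when the genes lie on the reverse strand.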

Other interesting findings in the genome include three related genes encoding putative DNA-binding proteins (LIC20104, LIC20105 and LIC20178), which contain 6 transmembrane domains in the amino-terminus and one helix-turn-helix domain of the AraC type at the carboxyl-terminus. These genes are present in chromosome II, and two of them are clustered. Orthologues with a C-terminal DNA-binding domain and a hydrophobic N-terminal region were described in other bacteria, including Borrelia and Treponema (31), but their function is unknown.

Motility and chemotaxis

The L. interrogans genome comprises a relatively large number of motility and chemotaxis genes. Enteric bacteria usually have about 50 genes coding for structural and functional proteins involved in motility (32). A similar number of genes have been identified for the spirochetes T. pallidum and B. burgdorferi (33,34). Apparently, the motility and chemotaxis apparatus of L. interrogans is more complex, since its genome contains at least 79 putative motility-associated genes. All genes are well conserved among L. interrogans, T. pallidum and B. burgdorferi, and 42 genes were found to be common to all three genera. However, the leptospiral genome contains multiple copies of a number of motility-associated genes, accounting partly for the higher number. For example, the genome of serovar Copenhageni contains 5 flaB genes, 4 motB genes and 2 motA genes, compared to 3 flaB genes and one copy of both motA and motB in the T. pallidum genome and one copy each of flaB, motA, and motB in the B. burgdorferi genome. In addition, the L. interrogans genome contains 11 putative genes encoding methyl-accepting chemotaxis proteins, which is roughly twice as many as in T. pallidum and B. burgdorferi. Forty-eight of the 79 motility-associated genes are positioned in 14 gene clusters varying in size from 2 to 8 genes. Thus, as in T. pallidum and B. burgdorferi, the majority of the structural and functional motility genes are positioned in potential operons. However, the operons probably underwent extensive rearrangements, as they are generally smaller, often corresponding to only parts of the major Treponema and Borrelia operons. For example, the flgB operon, which in B. burgdorferi consists of 26 genes (35), has been disrupted into 6 fragments dispersed at 5 positions in the leptospiral genome. Differences in number of genes and operon organization might be associated with the high flexibility of pathogenic leptospires, enabling them to survive and adapt to a variety of environments and hosts.

Hemolysins

The primary lesion caused by Leptospira is damage to the endothelium of small blood vessels, leading to hemorrhage and localized ischemia in multiple organs. As a consequence, renal tubular necrosis, hepatocellular damage, meningitis, and myositis may occur in the infected host (1). Hemolysins may play a fundamental role in this toxic process. Several genes coding for predicted hemolysins were identified in the L. interrogans genome. Some of them are structurally related to sphingomyelinases C (LIC10657, LIC12631, LIC12632, LIC11040, and LIC13198). Although generically called sphingomyelinases, it is possible that these genes code for hydrolases that act not only on sphingomyelins but also on other sphingolipids. Erythrocytes may represent a target for these enzymes since they are rich in glycosphingolipids like the antigenic determinants of the ABO system. Interestingly, LIC12631 and LIC12632 are organized as an operon, which may suggest a concerted expression and action.

Another class of genes coding for hemolysins, tlyABC, was also identified. Although they were assigned to the same TlyABC class, they do not present structural similarity. These putative hemolysins were first identified in the spirochete Brachyspira hyodysenteriae. Hemolytic and cytotoxic activities of the recombinant TlyA, TlyB and TlyC proteins, expressed in E. coli, were detected (36). TlyB belongs to the family of Clp ATP-dependent proteases (37) and there are 3 genes coding for structurally related proteins (LIC10339, LIC11814 and LIC12017). Thus, 5 genes of the tlyABC class (LIC10284, LIC10339, LIC11814, LIC12017, and LIC13143) were identified in the L. interrogans genome.

LIC12134 codes for an HlpA-related protein which was characterized as a hemolysin in Prevotella intermedia, a common oral bacterium associated with periodontitis (38). Another identified gene (LIC10325) is related to the hlyX gene which codes for a predicted hemolysin previously identified in L. borgpetersenii serovar Hardjo (Accession number AAF09252, unpublished results). LIC11352 should also be mentioned as the gene which codes for LipL32 or HAP-1, a ubiquitous lipoprotein of pathogenic Leptospira (see Lipoproteins section below) with hemolytic activity (39).

Surface-exposed proteins

Pathogenic leptospires require several types of surface-exposed proteins for the purpose of colonization and survival in the mammalian host. Surface-exposed proteins may be categorized as nonspecific porins, specific channels for nutrient acquisition, efflux channels, lipoproteins, adhesins, S-layer glycoproteins, peripheral membrane proteins, or surface-maintenance proteins. Aside from S-layer proteins and peripheral membrane proteins, these surface-exposed proteins would be expected to be integrated into the outer membrane via transmembrane regions or lipid modification. The leptospiral genome contains homologues of SecY and other secretory proteins involved in exporting proteins with signal peptides across the cytoplasmic membrane. The leptospiral genome also contains both the standard signal peptidase and the lipoprotein signal peptidase. The standard signal peptidase is involved in the hydrolysis of signal peptides of transmembrane outer membrane proteins and periplasmic proteins, releasing them from the cytoplasmic membrane. Lipoprotein signal peptidase hydrolyzes the signal peptides of lipoproteins prior to lipidation of the N-terminal cysteine.

Transmembrane outer membrane proteins

Analysis of the L. interrogans genome reveals 83 beta-sheet transmembrane outer membrane proteins (OMPs), all but one of which (OmpL1, LIC10973) (40) were previously unknown. An example of a newly identified leptospiral protein with a transmembrane beta-sheet structure is LIC10642, which is predicted to be a member of the OMP superfamily.

Among these newly identified OMPs is a family of predicted coding sequences (LIC20202, LIC10995, LIC11465, and LIC11103) belonging to the superfamily of alpha/beta hydrolases, which includes bacterial lipases. Another transmembrane OMP is LIC11623, a member of a family of highly conserved proteins of Gram-negative bacteria, including N. meningitidis Omp85, which is thought to be a chaperonin involved in the assembly of OMPs in the outer membrane (41). OmpL1 belongs to the class of nonspecific porins allowing passage of small molecules (<1000 Daltons) across the leptospiral outer membrane (40). Nonspecific porins allow both influx of nutrients and efflux of products of metabolism. Another example is LIC11458, which is predicted to be a member of the porin superfamily.

Specific channels

Survival in the mammalian host environment by bacterial pathogens requires the acquisition of certain trace nutrients. For example, iron is essential for leptospiral growth and is bound by a number of high-affinity binding proteins in mammals, which restrict its availability. Bacteria scavenge trace nutrients, including iron, heme, and vitamin B12, from their environment utilizing the cytoplasmic membrane protein TonB and a series of “TonB-dependent” OMPs (Figure 6). After binding, transport of the nutrient across the outer membrane into the periplasm by this type of channel is an energy-dependent step requiring interaction of the TonB-dependent OMP with TonB. The L. interrogans genome contains a TonB orthologue (LIC10889) and a large number of TonB-dependent OMPs: LIC20151, LIC20214, LIC10998, LIC10964, LIC10714, LIC12374, LIC12898, LIC12998, LIC11694, LIC11345, LIC10882, LIC10881, and LIC10896.

The leptospiral genome also contains OMPs involved in efflux pathways (Figure 6). The type I efflux system is represented by an orthologue of TolC (LIC13135), the outer membrane exporter of hemolysin and drugs, along with an orthologue of CusC (LIC11941), an outer membrane exporter of copper ion. A second type of efflux pathway found in the leptospiral genome belongs to the resistance-nodulation-cell division superfamily, a three-member complex that catalyzes substrate efflux via an H+ antiport mechanism. The three-member complex consists of a resistance-nodulation-cell division transporter, a membrane fusion protein, and an outer membrane factor involved in the export of proteins, carbohydrates, drugs or toxic heavy metals (Figure 6). The leptospiral orthologues are most closely related to those of the Alcaligenes eutrophus cobalt/cadmium/zinc export system consisting of the resistance-nodulation-cell division transporter CzcA (LIC12224, LIC15510 and LIC11938), CzcB (LIC12306 and LIC11940), and CzcC (LIC12307 and LIC11941). In addition, L. interrogans has two orthologues of the cation efflux system protein CzcD (LIC11714 and LIC13205), which are members of the cation diffusion facilitator family. CzcD of Bacillus subtilis has been shown to catalyze divalent cation (Zn2+ or Cd2+) efflux in exchange for the uptake of two monovalent cations (K+ and H+) in an electroneutral process energized by the transmembrane pH gradient (42).

Lipoproteins

Experimental evidence for fatty acid modification of leptospiral lipoproteins has been described for the outer membrane lipoproteins LipL32 (LIC11352) (43), LipL36 (LIC13060) (44), and LipL41 (LIC12966) (45). The cytoplasmic membrane also contains lipoproteins, as demonstrated for LipL31 (LIC11456) and LipL71 (LIC11003) (46). A total of 184 predicted coding sequences in the L. interrogans genome were found to have a lipoprotein signal peptidase cleavage site (5). All proposed lipoprotein-coding sequences conform to the rule of having an L, I, V, or F in the -3 and/or -4 position relative to cysteine and most of them have A, G, S, or N in the -1 position relative to cysteine (47).
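The residue rule stated above for the lipoprotein signal peptidase cleavage site can be written as a small check. This is only a sketch of that rule, not the prediction method used for the annotation; the function name and the two peptide strings are hypothetical, and the position of the lipidated cysteine is assumed to be known in advance.

```python
def lipobox_check(sequence: str, cys_index: int) -> dict:
    """Evaluate the residues upstream of the lipidated cysteine.

    sequence  : N-terminal amino acid sequence (one-letter code)
    cys_index : 0-based index of the cysteine taken as position +1
    """
    if cys_index < 4 or sequence[cys_index] != "C":
        raise ValueError("need a cysteine with at least four residues upstream")
    minus1 = sequence[cys_index - 1]
    minus3 = sequence[cys_index - 3]
    minus4 = sequence[cys_index - 4]
    return {
        # Rule met by all proposed lipoproteins: L, I, V or F at -3 and/or -4.
        "hydrophobic_at_-3_or_-4": minus3 in "LIVF" or minus4 in "LIVF",
        # Preference met by most of them: A, G, S or N at -1.
        "small_at_-1": minus1 in "AGSN",
    }


# Hypothetical peptides for illustration; neither is a real leptospiral sequence.
conforming = lipobox_check("MKKTLLALFLISGCSS", 13)
nonconforming = lipobox_check("MKKTLLALFDEKPCSS", 13)
```

The two conditions are reported separately because the text treats the first as a rule satisfied by all proposed lipoproteins and the second as a preference satisfied by most of them.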

Proteolytic functions can be assigned to some of the newly identified lipoproteins: four lipoproteins are thermolysin homologues (LIC10715, LIC13320, LIC13321, and LIC13322), and one is a predicted metalloprotease (LIC11834). Several lipoproteins may be involved in hemolysis: two lipoproteins are sphingomyelinase homologues (LIC10657 and LIC12632), and one is a phospholipase D homologue (LIC11754). Lipoprotein LIC10972 is predicted to be located in the outer membrane and is a MauG homologue belonging to a family of cytochrome c peroxidases that may be involved in defense against hydrogen peroxide. Other lipoproteins with putative enzymatic functions are homologues of glucose dehydrogenase (LIC12294) and cholesterol oxidase (LIC13202).

Figure 6. Model of leptospiral membrane architecture. Leptospires have two membranes, an outer membrane (OM) and a cytoplasmic or inner membrane (IM). As in Gram-positive bacteria, the peptidoglycan (PG) cell wall is closely associated with the IM. The leptospiral surface is dominated by lipopolysaccharide (LPS) carbohydrate side chains. Subsurface proteins include the cytoplasmic protein, GroEL, and the periplasmic endoflagella (EF). The IM contains lipoproteins such as LipL31 and transmembrane proteins such as signal peptidase (SP) and ImpL63. The OM contains lipoproteins including LipL41 and LipL36, and transmembrane proteins including the porin, OmpL1. Genomic sequence analysis reveals several novel types of outer membrane proteins (OMPs), including TonB-dependent OMPs involved in nutrient acquisition. BtuB is an example of a TonB-dependent OMP involved in uptake of vitamin B12. The type I efflux system is represented by TolC, which forms a complex with ATP-binding cassette (ABC) transporters to export hemolysins and other cytoplasmic components. The leptospiral genome also contains genes involved in a three-component efflux system consisting of an outer membrane factor (OMF), membrane fusion protein (MFP), and an inner membrane transporter, in this case CzcA, which is involved in heavy metal detoxification.

S-layer and peripheral membrane proteins

S-layers are thought to provide structural integrity to the bacterial surface (48). Although an S-layer has not been observed in pathogenic leptospires, the genome contains at least two proteins with S-layer homology (LIC10868 and LIC12952). Experimental evidence is needed to determine whether these proteins are actually S-layer components.

Like S-layer proteins, peripheral membrane proteins are not integral membrane proteins. LipL45 is processed to a peripheral membrane protein associated with the outer membrane, P31LipL45 (49). P31LipL45 expression is dramatically up-regulated in stationary phase cultures, and for this reason may have a membrane-stabilizing function. At this time it is unclear whether P31LipL45 is surface-exposed. Interestingly, the leptospiral genome contains a number of predicted coding sequences with homology to LipL45 (LIC20102, LIC20114, LIC13414, and LIC10123).

Surface-maintenance proteins

Bacteria are likely to have a variety of proteins involved in maintaining the structural integrity of the surface. One such protein, glycerophosphoryl diester phosphodiesterase, has been found in all spirochetal genomes studied to date. E. coli has two forms of this enzyme, a cytosolic form, ugpQ, and a periplasmic form, glpQ. The leptospiral genome contains two homologues of this enzyme, LIC13182 and LIC10293. The former should be the cytosolic form because it lacks a signal peptide while the latter should be the exported form because it has an N-terminal signal peptide. In other spirochetes, GlpQ is associated with the outer membrane (50).

Cytoplasmic membrane proteins

The leptospiral genome contains a number of proteins that belong to a large family of prokaryotic proteins with homology to the peptidoglycan-associated portion of E. coli OmpA (51). These proteins are predicted to be either cytoplasmic membrane (LIC20250, LIC10592, LIC13479, and LIC10050) or periplasmic proteins (LIC10537 and LIC10191), rather than OMPs, which is consistent with experimental evidence that the spirochetal cell wall is more closely associated with the cytoplasmic membrane than the outer membrane. An interesting protein family is the mechanosensitive ion channel (LIC20069 and LIC12671). Two members of this protein family of M. jannaschii have been functionally characterized and both form mechanosensitive ion channels (52). Therefore, this family is likely to also encode mechanosensitive channel proteins in L. interrogans, playing a physiological role in bacterial osmoregulation.

Lipopolysaccharides

Lipopolysaccharides distinguish the leptospiral surface from those of the other invasive spirochetes. Changes in genes involved in the LPS biosynthesis apparatus are thought to account for serovar diversity among leptospires (53). It has been shown that the leptospiral LPS resembles a typical LPS from Gram-negative bacteria, containing pentoses, hexoses and heptoses. Although the predominant sugar is rhamnose, a large variety of other sugar residues are found (1). Antibodies raised against LPS from different Leptospira strains during infections are related to polysaccharide structure in terms of its sugar composition, number, repetitiveness, and ramification (1). In Leptospira, as in many other bacteria, at least part of the genes coding for enzymes of the polysaccharide biosynthesis pathway are found clustered in a region of the chromosome named the O-antigen gene cluster (rfb locus) (53).

The complete genome analysis reveals a large region of about 119 kb (genome position 2.538.470-2.554.255), in which all the 108 predicted coding sequences are transcribed in the same orientation (Figure 7). Interestingly, this region is dense in genes related to LPS biosynthesis (Table 1) and includes the O-antigen gene cluster previously described in L. interrogans serovar Copenhageni (53,54). In the first 14 kb of this region the predicted genes seem not to be related to LPS biosynthesis, but to DNA replication, and some genes code for ribosomal proteins. In the other 105 kb of this region there are 94 predicted genes, 56 of which are clearly related to LPS biosynthesis. These predicted genes encode enzymes for nucleotide sugar biosynthesis, such as the enzymes for dTDP-rhamnose biosynthesis (54), for CMP-N-acetylneuraminic acid and for perosamine synthase. In addition, many genes coding for sugar transferases, sugar modifications and for the O-antigen processing, Wzx-flippase and Wzy-polymerase, involved in oligosaccharide exporting and assembly of the LPS, were identified. Some genes encoding proteins of the lipid A biosynthesis (lpxD) and transport (msbA) are also found in this region.
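The gene counts quoted for the 119-kb region can be cross-checked against the per-category counts listed in Table 1. The sketch below only re-adds numbers given in the text and in Table 1; the variable names are illustrative, and the count for the first 14 kb is derived by subtraction rather than stated in the text.

```python
# Per-category gene counts as listed in Table 1.
table1 = {
    "Nucleotide sugar biosynthesis": 22,
    "Sugar transferase": 17,
    "Sugar epimerase": 10,
    "Lipid A and polysaccharide transport": 4,
    "Hypothetical": 26,
    "Aminoacyl tRNA synthase": 3,
    "DNA metabolism": 4,
    "Ribosomal protein": 3,
    "Others": 19,
}

total_genes = sum(table1.values())      # should equal the 108 CDS in the region
region_kb = 14 + 105                    # the two segments described in the text
first_segment_genes = total_genes - 94  # derived: genes left for the first 14 kb

print(total_genes, region_kb, first_segment_genes)
```

The Table 1 categories add up to the 108 predicted coding sequences, and the two segments add up to the stated 119 kb.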

The comparison with the corresponding region in the genome of L. interrogans serovar Lai (4) revealed only three distinct predicted coding sequences: two extra copies of genes for aminotransferase (LIC12197 and LIC12198) and the absence of a gene encoding galactoside O-acetyltransferase (LA1622).

Seventy-seven other genes, probably related to LPS biosynthesis, were identified along the genome. All the genes related to lipid A biosynthesis described in E. coli were identified in the Leptospira genome (lpxA, lpxC, lpxD, lpxB, lpxK). However, the predominant fatty acids in lipid A of L. interrogans are dodecanoic and hexadecanoic acid instead of hydroxymyristoyl (14 carbons), which is the signature of Gram-negative bacteria (1). Comparative analysis between LpxA of E. coli and P. aeruginosa (55) showed that few structural differences in this enzyme determine changes in the fatty acid chain size incorporated during lipid A biosynthesis. The lower endotoxicity of leptospiral LPS as compared to LPS from Gram-negative bacteria has been reported (56). It was also reported that leptospiral LPS induces a TLR2-dependent cell activation, instead of TLR4, the receptor frequently involved in the LPS immune response (57). The lower endotoxicity and the difference in the mechanism of cell immune response activation are supposed to be related to differences in the lipid A structure.

Genes encoding enzymes of Kdo biosynthesis (kdsA, kdsB), as well as the Kdo-transferase (waaA or kdtA) which catalyzes the binding of Kdo to lipid IVA, were identified. Although typical Kdo was found in Leptospira LPS, it was reported to be substituted at different carbon positions (1).

Four paralogues encoding MsbA, the flippase that transfers lipid A in the inner membrane from the cytoplasmic side to the periplasmic side, were identified. One of the msbA genes is located in the 119-kb region. Another msbA gene is located upstream of lpxK, as reported in many other genomes (55). It will be interesting to compare the set of genes coding for enzymes related to LPS biosynthesis identified in the Leptospira genome to the orthologues in other microorganisms, in order to correlate LPS structural differences and endotoxic activity.

Table 1. Number of genes at the lipopolysaccharide 119-kb locus.

Predicted protein function               Number of genes
Nucleotide sugar biosynthesis            22
Sugar transferase                        17
Sugar epimerase                          10
Lipid A and polysaccharide transport     4
Hypothetical                             26 (17 conserved)
Aminoacyl tRNA synthase                  3
DNA metabolism                           4
Ribosomal protein                        3
Others                                   19

Figure 7. Lipopolysaccharide region of Leptospira interrogans. The 119-kb region of L. interrogans serovar Copenhageni genome with 108 predicted coding sequences transcribed in the same orientation. Orange arrows indicate predicted genes encoding polysaccharide biosynthesis-related proteins. Shaded regions indicate the homologous genes from L. borgpetersenii serovar Hardjobovis (53). The region containing the predicted coding sequences, indicated by the letters F, E, A, B, D, C, was first indicated as the rfb genes related to rhamnose biosynthesis in L. interrogans Copenhageni (54).

Comparative genomics and insertion sequences

A three-way comparison between the L. interrogans genome and the genomes of B. burgdorferi and T. pallidum yields the following results: 1167 (31%) of the genes in L. interrogans Copenhageni are found in B. burgdorferi and/or in T. pallidum, 666 (41%) of the genes in B. burgdorferi are found in the Copenhageni genome, and 589 (57%) of the genes in T. pallidum are found in the L. interrogans genome. A total of 362 predicted genes were found to be shared by all three spirochetes, 45 of which are hypothetical (detailed list on our project website: http://aeg.lbi.ic.unicamp.br/world/lic/). These results show that a thorough analysis of the genome can significantly contribute to the understanding of spirochete biology.
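The counts and percentages in a three-way comparison of this kind can be reproduced with simple set operations once each gene has been assigned to an orthologue group. The sketch below assumes such an assignment already exists; the group identifiers are toy values, the function name is hypothetical, and the similarity criteria actually used to call a gene "found in" another genome are not described in this section.

```python
def shared_fraction(query: set, *others: set) -> tuple:
    """Return (count, percent) of query groups present in at least one other genome."""
    pooled = set().union(*others)
    hits = query & pooled
    return len(hits), 100.0 * len(hits) / len(query)


# Toy orthologue-group identifiers, for illustration only.
lic = {"g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8", "g9", "g10"}
bb = {"g1", "g2", "g3", "g11"}
tp = {"g2", "g3", "g4", "g12", "g13"}

lic_in_either = shared_fraction(lic, bb, tp)  # L. interrogans vs. either spirochete
bb_in_lic = shared_fraction(bb, lic)          # B. burgdorferi vs. L. interrogans
tp_in_lic = shared_fraction(tp, lic)          # T. pallidum vs. L. interrogans
core = lic & bb & tp                          # groups shared by all three genomes
```

Each percentage uses the gene count of the query genome as its denominator, which is why the same shared genes give different percentages for the three organisms.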

Our analysis of the L. interrogans genome revealed the presence of the previously described insertion sequence (IS) elements IS1500, IS1501 (58), reminiscent insertions of IS1533 (59) and IS1502 (60) and an IS recently identified that we designated ISlin1 (5) (Table 2). IS1500, IS1501 and ISlin1 account for 24 insertions distributed throughout chromosome I. Together, IS elements and transposases comprise 2% of the genome. These elements are related to major IS families such as IS110 and IS3 that are defined by their conserved amino acid motif (DDE) in the transposase. So far all the insertions were found in intergenic regions, having no mutagenic effects on L. interrogans genes.

The L. interrogans genome provides new insights into biosynthesis pathways, transport families, environmental response, and pathogenesis. A broad array of regulatory system proteins enables this organism to respond to environmental signals. A new group of genes involved in LPS biosynthesis may contribute to the elucidation of serovar diversity among leptospires. Several categories of surface-exposed proteins required for colonization and survival in the mammalian host were identified. These proteins may have a role in mechanisms of leptospiral pathogenesis and protective immunity. Available vaccines are serovar-specific and have low efficacy (1). Surface-exposed proteins that are conserved among pathogenic serovars may be used for vaccine development for the prevention of leptospirosis.

Table 2. Insertion sequence (IS) elements present in Leptospira interrogans serovar Copenhageni.

Element  Family  Copies  Length (bp)  Transposase       Terminal inverted repeats
ISlin1   IS110   11      1,423        2 (126aa + 195)   5’ end AAACTCAACAtctCGCTCTTTAcGAATCGC
                                                        3’ end GCGATTCaTAAAGAGCGttgTGTTGAGTTT
IS1500   IS3     8       1,234        2 (282aa + 99aa)  5’ end TAACCTaGtgACGaAttAttGGACACaTTTT
                                                        3’ end AAAAgGTGTCCgtTttTaCGTagCcAGGTTA
IS1501   IS3     5       1,230        2 (132aa + 85aa)  5’ end TGAagTAGTcACATAAAAGTgGACAgCcaTTT
                                                        3’ end AAAaaGgTGTCtACTTTTATGTtACTAggTCA
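The relationship between the two terminal inverted repeats of an element can be checked by comparing the 5’ end with the reverse complement of the 3’ end. The sketch below does this for the ISlin1 sequences listed in Table 2; the function names are illustrative, and reading the lower-case letters in the table as the imperfect positions is an interpretation that this comparison supports for ISlin1 rather than something stated in the table legend.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence, ignoring letter case."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def mismatch_positions(five_prime: str, three_prime: str) -> list:
    """1-based positions where the 5' repeat differs from the
    reverse complement of the 3' repeat."""
    left = five_prime.upper()
    right = reverse_complement(three_prime)
    if len(left) != len(right):
        raise ValueError("repeats must have the same length")
    return [i + 1 for i, (a, b) in enumerate(zip(left, right)) if a != b]


# ISlin1 terminal inverted repeats as given in Table 2.
islin1_5 = "AAACTCAACAtctCGCTCTTTAcGAATCGC"
islin1_3 = "GCGATTCaTAAAGAGCGttgTGTTGAGTTT"

islin1_mismatches = mismatch_positions(islin1_5, islin1_3)
# Positions written in lower case in the 5' repeat, for comparison.
islin1_lowercase = [i + 1 for i, base in enumerate(islin1_5) if base.islower()]
```

For ISlin1 the two 30-bp repeats differ at positions 11, 12, 13 and 23, which are exactly the positions printed in lower case in the 5’ repeat.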


Acknowledgments

We are deeply indebted to Dr. I. Raw (Fundação Butantan, São Paulo, SP, Brazil) for use of laboratory facilities and valuable support. We thank Dr. Albert I. Ko (Fiocruz, Salvador, BA, Brazil) for the strain Fiocruz L1-130, the Agronomical and Environmental Genomes (AEG) Consortium of the network Organization for Nucleotide Sequencing and Analysis (ONSA) for the genome sequencing data, and Dr. L.C.C. Leite (Instituto Butantan) for a critical reading of this manuscript.

Linear representation of the Leptospira interrogans serovar Copenhageni chromosomes (CI and CII). Genes are colored according to their biological role. Arrows indicate the direction of transcription. The pie representation indicates the distribution of the number of genes according to their biological role. The numbers below protein-producing genes correspond to gene identification numbers (IDs).

Additional information is contained in the project website http://aeg.lbi.ic.unicamp.br/world/lic/


References

1. Faine S, Adler B, Bolin C & Perolat P (1999). Leptospira and Leptospirosis. MediSci, Melbourne, Australia.

2. Bharti AR, Nally JE, Ricaldi JN et al. (2003). Leptospirosis: a zoonotic disease of global importance. Lancet Infectious Diseases, 3: 757-771.

3. Romero EC, Bernardo CC & Yasuda PH (2003). Human leptospirosis: a twenty-nine-year serological study in São Paulo, Brazil. Revista do Instituto de Medicina Tropical de São Paulo, 45: 245-248.

4. Ren SX, Fu G, Jiang XG et al. (2003). Unique physiological and pathogenic features of Leptospira interrogans revealed by whole-genome sequencing. Nature, 422: 888-893.

5. Nascimento ALTO, Ko AI, Martins EAL et al. (2004). Comparative genomics of two Leptospira interrogans serovars reveals novel insights into physiology and pathogenesis. Journal of Bacteriology (in press).

6. Simpson AJ, Reinach FC, Arruda P et al. (2000). The genome sequence of the plant pathogen Xylella fastidiosa. Nature, 406: 151-157.

7. Van Sluys MA, de Oliveira MC, Monteiro-Vitorello CB et al. (2003). Comparative analyses of the complete genome sequences of Pierce’s disease and citrus variegated chlorosis strains of Xylella fastidiosa. Journal of Bacteriology, 185: 1018-1026.

8. Swofford DL (2000). PAUP*-Phylogenetic Analysis Using Parsimony (*and other methods). Version 4. Sinauer Associates, Sunderland, MA, USA.

9. Ochman H, Elwyn S & Moran NA (1999). Calibrating bacterial evolution. Proceedings of the National Academy of Sciences, USA, 96: 12638-12643.

10. Messer W (2002). The bacterial replication initiator DnaA. DnaA and oriC, the bacterial mode to initiate DNA replication. FEMS Microbiology Reviews, 26: 355-374.

11. Lobry JR (1996). Asymmetric substitution patterns in the two DNA strands of bacteria. Molecular Biology and Evolution, 13: 660-665.

12. Fukunaga M & Mifuchi I (1989). Unique organization of Leptospira interrogans rRNA genes. Journal of Bacteriology, 171: 5763-5767.

13. Paster BJ & Dewhirst FE (2000). Phylogenetic foundation of spirochetes. Journal of Molecular Microbiology and Biotechnology, 2: 341-344.

14. Postic D, Riquelme-Sertour N, Merien F, Perolat P & Baranton G (2000). Interest of partial 16S rDNA gene sequences to resolve heterogeneities between Leptospira collections: application to L. meyeri. Research in Microbiology, 151: 333-341.

15. Heidelberg JF, Eisen JA, Nelson WC et al. (2000). DNA sequence of both chromosomes of the cholera pathogen Vibrio cholerae. Nature, 406: 477-483.

16. Salanoubat M, Genin S, Artiguenave F et al. (2002). Genome sequence of the plant pathogen Ralstonia solanacearum. Nature, 415: 497-502.

17. Bourhy P & Saint Girons I (2000). Localization of the Leptospira interrogans metF gene on the CII secondary chromosome. FEMS Microbiology Letters, 191: 259-263.

18. Guegan R, Camadro JM, Saint Girons I & Picardeau M (2003). Leptospira spp. possess a complete haem biosynthetic pathway and are able to use exogenous haem sources. Molecular Microbiology, 49: 745-754.

19. Rodionov DA, Vitreschak AG, Mironov AA & Gelfand MS (2003). Comparative genomics of the vitamin B12 metabolism and regulation in prokaryotes. Journal of Biological Chemistry, 278: 41148-41159.

20. Spencer JB, Stolowich NJ, Roessner CA & Scott AI (1993). The Escherichia coli cysG gene encodes the multifunctional protein, siroheme synthase. FEBS Letters, 335: 57-60.

21. Paulsen IT, Nguyen L, Sliwinski MK, Rabus R & Saier Jr MH (2000). Microbial genome analyses: comparative transport capabilities in eighteen prokaryotes. Journal of Molecular Biology, 301: 75-100.

22. Meidanis J, Braga MD & Verjovski-Almeida S (2002). Whole-genome analysis of transporters in the plant pathogen Xylella fastidiosa. Microbiology and Molecular Biology Reviews, 66: 272-299.

23. Kammler M, Schon C & Hantke K (1993). Characterization of the ferrous iron uptake system of Escherichia coli. Journal of Bacteriology, 175: 6212-6219.

24. Vujaskovic Z, Batinic-Haberle I, Rabbani ZN, Feng QF, Kang SK, Spasojevic I, Samulski TV, Fridovich I, Dewhirst MW & Anscher MS (2002). A small molecular weight catalytic metalloporphyrin antioxidant with superoxide dismutase (SOD) mimetic properties protects lungs from radiation-induced injury. Free Radical Biology and Medicine, 33: 857-863.

25. Galperin MY, Nikolskaya AN & Koonin EV (2001). Novel domains of the prokaryotic two-component signal transduction systems. FEMS Microbiology Letters, 203: 11-21.

26. Taylor BL & Zhulin IB (1999). PAS domains: internal sensors of oxygen, redox potential, and light. Microbiology and Molecular Biology Reviews, 63: 479-506.

27. Rickman L, Saldanha JW, Hunt DM, Hoar DN, Colston MJ, Millar JB & Buxton RS (2004). A two-component signal transduction system with a PAS domain-containing sensor is required for virulence of Mycobacterium tuberculosis in mice. Biochemical and Biophysical Research Communications, 314: 259-267.

28. Ausmees N, Mayer R, Weinhouse H, Volman G, Amikam D, Benziman M & Lindberg M (2001). Genetic data indicate that proteins containing the GGDEF domain possess diguanylate cyclase activity. FEMS Microbiology Letters, 204: 163-167.

29. Anantharaman V & Aravind L (2000). Cache - a signaling domain common to animal Ca(2+)-channel subunits and a class of prokaryotic chemotaxis receptors. Trends in Biochemical Sciences, 25: 535-537.

30. Ho YS, Burden LM & Hurley JH (2000). Structure of the GAF domain, a ubiquitous signaling motif and a new class of cyclic GMP receptor. EMBO Journal, 19: 5288-5299.

31. Subramanian G, Koonin EV & Aravind L (2000). Comparative genome analysis of the pathogenic spirochetes Borrelia burgdorferi and Treponema pallidum. Infection and Immunity, 68: 1633-1648.

32. Bourret RB, Charon NW, Stock AM & West AH (2002). Bright lights, abundant operons-fluorescence and genomic technologies advance studies of bacterial locomotion and signal transduction: review of the BLAST meeting, Cuernavaca, Mexico, 14 to 19 January 2001. Journal of Bacteriology, 184: 1-17.

33. Fraser CM, Casjens S, Huang WM et al. (1997). Genomic sequence of a Lyme disease spirochaete, Borrelia burgdorferi. Nature, 390: 580-586.

34. Fraser CM, Norris SJ, Weinstock GM et al. (1998). Complete genome sequence of Treponema pallidum, the syphilis spirochete. Science, 281: 375-388.

35. Ge Y, Old IG, Saint Girons I & Charon NW (1997). Molecular characterization of a large Borrelia burgdorferi motility operon which is initiated by a consensus sigma70 promoter. Journal of Bacteriology, 179: 2289-2299.

36. ter Huurne AA, Muir S, van Houten M, van der Zeijst BA, Gaastra W & Kusters JG (1994). Characterization of three putative Serpulina hyodysenteriae hemolysins. Microbial Pathogenesis, 16: 269-282.

37. Squires C & Squires CL (1992). The Clp proteins: proteolysis regula-tors or molecular chaperones? Journal of Bacteriology, 174: 1081-1085.

38. Beem JE, Nesbitt WE & Leung KP (1999). Cloning of Prevotellaintermedia loci demonstrating multiple hemolytic domains. OralMicrobiology and Immunology, 14: 143-152.

39. Lee SH, Kim KA, Park YG, Seong IW, Kim MJ & Lee YJ (2000).Identification and partial characterization of a novel hemolysin fromLeptospira interrogans serovar lai. Gene, 254: 19-28.

40. Shang ES, Exner MM, Summers TA, Martinich C, Champion CI,Hancock RE & Haake DA (1995). The rare outer membrane protein,OmpL1, of pathogenic Leptospira species is a heat-modifiable porin.Infection and Immunity, 63: 3174-3181.

41. Voulhoux R, Bos MP, Geurtsen J, Mols M & Tommassen J (2003).Role of a highly conserved bacterial protein in outer membraneprotein assembly. Science, 299: 262-265.

42. Guffanti AA, Wei Y, Rood SV & Krulwich TA (2002). An antiportmechanism for a member of the cation diffusion facilitator family:divalent cations efflux in exchange for K+ and H+. Molecular Micro-biology, 45: 145-153.

43. Haake DA, Chao G, Zuerner RL, Barnett JK, Barnett D, Mazel M,Matsunaga J, Levett PN & Bolin CA (2000). The leptospiral majorouter membrane protein LipL32 is a lipoprotein expressed duringmammalian infection. Infection and Immunity, 68: 2276-2285.

44. Haake DA, Martinich C, Summers TA, Shang ES, Pruetz JD, McCoyAM, Mazel MK & Bolin CA (1998). Characterization of leptospiralouter membrane lipoprotein LipL36: downregulation associated withlate-log-phase growth and mammalian infection. Infection and Im-munity, 66: 1579-1587.

45. Shang ES, Summers TA & Haake DA (1996). Molecular cloning and sequence analysis of the gene encoding LipL41, a surface-exposed lipoprotein of pathogenic Leptospira species. Infection and Immunity, 64: 2322-2330.

46. Haake DA & Matsunaga J (2002). Characterization of the leptospiral outer membrane and description of three novel leptospiral membrane proteins. Infection and Immunity, 70: 4936-4945.

47. Haake DA (2000). Spirochaetal lipoproteins and pathogenesis. Microbiology, 146 (Part 7): 1491-1504.

48. Schuster B & Sleytr UB (2000). S-layer-supported lipid membranes. Journal of Biotechnology, 74: 233-254.

49. Matsunaga J, Young TA, Barnett JK, Barnett D, Bolin CA & Haake DA (2002). Novel 45-kilodalton leptospiral protein that is processed to a 31-kilodalton growth-phase-regulated peripheral membrane protein. Infection and Immunity, 70: 323-334.

50. Cameron CE, Castro C, Lukehart SA & Van Voorhis WC (1998). Function and protective capacity of Treponema pallidum subsp. pallidum glycerophosphodiester phosphodiesterase. Infection and Immunity, 66: 5763-5770.

51. De Mot R & Vanderleyden J (1994). The C-terminal sequence conservation between OmpA-related outer membrane proteins and MotB suggests a common function in both gram-positive and gram-negative bacteria, possibly in the interaction of these domains with peptidoglycan. Molecular Microbiology, 12: 333-334.

52. Kloda A & Martinac B (2001). Structural and functional differences between two homologous mechanosensitive channels of Methanococcus jannaschii. EMBO Journal, 20: 1888-1896.

53. de la Pena-Moctezuma A, Bulach DM, Kalambaheti T & Adler B (1999). Comparative analysis of the LPS biosynthetic loci of the genetic subtypes of serovar Hardjo: Leptospira interrogans subtype Hardjoprajitno and Leptospira borgpetersenii subtype Hardjobovis. FEMS Microbiology Letters, 177: 319-326.

54. Mitchison M, Bulach DM, Vinh T, Rajakumar K, Faine S & Adler B (1997). Identification and characterization of the dTDP-rhamnose biosynthesis and transfer genes of the lipopolysaccharide-related rfb locus in Leptospira interrogans serovar Copenhageni. Journal of Bacteriology, 179: 1262-1267.

55. Raetz CR & Whitfield C (2002). Lipopolysaccharide endotoxins. Annual Review of Biochemistry, 71: 635-700.

56. de Souza L & Koury MC (1992). Isolation and biological activities of endotoxin from Leptospira interrogans. Canadian Journal of Microbiology, 38: 284-289.

57. Werts C, Tapping RI, Mathison JC et al. (2001). Leptospiral lipopolysaccharide activates cells through a TLR2-dependent mechanism. Nature Immunology, 2: 346-352.

58. Boursaux-Eude C, Saint Girons I & Zuerner R (1995). IS1500, an IS3-like element from Leptospira interrogans. Microbiology, 141 (Part 9): 2165-2173.

59. Zuerner RL (1994). Nucleotide sequence analysis of IS1533 from Leptospira borgpetersenii: identification and expression of two IS-encoded proteins. Plasmid, 31: 1-11.

60. Zuerner RL & Huang WM (2002). Analysis of a Leptospira interrogans locus containing DNA replication genes and a new IS, IS1502. FEMS Microbiology Letters, 215: 175-182.