Perez-Rueda et al. - 2018 - Abundance, diversity and domain architecture variability in prokaryotic DNA-binding transcription factors

RESEARCH ARTICLE

Abundance, diversity and domain architecture variability in prokaryotic DNA-binding transcription factors

Ernesto Perez-Rueda (1,2)*, Rafael Hernandez-Guerrero (1), Mario Alberto Martinez-Nuñez (3), Dagoberto Armenta-Medina (4), Israel Sanchez (1), J. Antonio Ibarra (5)*

1 Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, Unidad Académica Yucatán, Mérida, Yucatán, México
2 Departamento de Ingenieria Celular y Biocatálisis, Instituto de Biotecnología, UNAM, Cuernavaca, Morelos, México
3 Laboratorio de Ecogenómica, Unidad Académica de Ciencias y Tecnología de Yucatán, Facultad de Ciencias, UNAM, Mérida, Yucatán, México
4 Infotec, Aguascalientes, Aguascalientes, México
5 Laboratorio de Genética Microbiana, Departamento de Microbiología, Escuela Nacional de Ciencias Biológicas, Instituto Politécnico Nacional, Ciudad de México, México

* ernesto.perez@iimas.unam.mx (EPR); jaig19@gmail.com, jibarrag@ipn.mx (JAI)

Abstract

Gene regulation at the transcriptional level is a central process in all organisms, and DNA-binding transcription factors, known as TFs, play a fundamental role. This class of proteins usually binds at specific DNA sequences, activating or repressing gene expression. In general, TFs are composed of two domains: the DNA-binding domain (DBD) and an extra domain, which in this work we have named “companion domain” (CD). The latter could be involved in one or more functions such as ligand binding, protein-protein interactions or even enzymatic activity. In contrast to DBDs, which have been widely characterized both experimentally and bioinformatically, information on the abundance, distribution, variability and possible role of the CDs is scarce. Here, we investigated these issues associated with the domain architectures of TFs in prokaryotic genomes. To this end, 19 families of TFs in 761 non-redundant bacterial and archaeal genomes were evaluated. In this regard we found four main groups based on the abundance and distribution in the analyzed genomes: i) LysR and TetR/AcrR; ii) AraC/XylS, SinR, and others; iii) Lrp, Fis, ArsR, and others; and iv) a group that included only two families, ArgR and BirA. Based on a classification of the organisms according to their lifestyles, a greater abundance of regulatory families was identified in free-living organisms, in contrast with pathogenic, extremophilic or intracellular organisms. Finally, the protein architecture diversity associated with the 19 families, considering a weight score for domain promiscuity, showed which regulatory families were characterized by either a large diversity of CDs, here named “promiscuous” families given the elevated number of variable domains found in those TFs, or a low diversity of CDs. Altogether this information helped us to understand the diversity and distribution of the 19 prokaryotic TF families. Moreover, initial steps were taken to comprehend the variability of the extra domain in those TFs, which eventually might assist in evolutionary and functional studies.

OPEN ACCESS

Citation: Perez-Rueda E, Hernandez-Guerrero R, Martinez-Nuñez MA, Armenta-Medina D, Sanchez I, Ibarra JA (2018) Abundance, diversity and domain architecture variability in prokaryotic DNA-binding transcription factors. PLoS ONE 13(4): e0195332. https://doi.org/10.1371/journal.pone.0195332

Editor: Axel Cloeckaert, Institut National de la Recherche Agronomique, FRANCE

Received: November 22, 2017
Accepted: March 20, 2018
Published: April 3, 2018

Copyright: © 2018 Perez-Rueda et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability Statement: All relevant data are within the paper and its Supporting Information files.

Funding: JAI was supported by grant 20180647 from Secretaria de Investigacion y Posgrado from Instituto Politecnico Nacional; EP-R and MAM-N were funded by DGAPA-Universidad Nacional Autónoma de México (IN-201117 and IA-205417, respectively). There was no additional external funding received for this study. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

PLOS ONE | https://doi.org/10.1371/journal.pone.0195332 | April 3, 2018

Diversity and variability of TFs in prokaryotic genomes

Introduction

Regulation of gene expression at the transcriptional level is a fundamental process for maintaining cellular homeostasis. Gene regulation is critical for optimizing proteins and structural RNAs and the subsequent levels of metabolites and other cellular properties [1]. This regulation occurs by sensing changes in the surrounding environment and in the internal cellular state. Transcriptional regulation can happen by multiple means, such as small RNAs and riboswitches, among others, but by far the most common regulatory mechanism involves the action of regulatory proteins, also known as DNA-binding transcription factors (TFs) [2]. In general, TFs are two-domain proteins, with a DNA-binding domain (DBD) in either the amino- or carboxy-terminus, which is involved in specific contacts with the regulatory region of the corresponding cognate genes that result in activation or repression of gene expression. The second domain, which we have named in this article “Companion Domain” (CD), might have one or many tasks, including ligand binding, protein-protein interactions with other proteins (including the transcriptional machinery), and even enzyme activity, or actively modulating the DNA-binding ability of the DBD. In many reports this domain is also called “effector binding domain” (EBD) [3], but it is important to mention that a capability to bind effector molecules or proteins has been described for only a small fraction of these domains, and therefore we chose to use our definition. That said, for most of the TFs this accompanying domain or CD has not been extensively studied. Therefore, despite the fact that this CD may have an assigned function, it is escorting the DBD. Once a TF receives a signal, it modifies its ability to bind specifically to the DNA-binding site and then acts by either activating or repressing gene expression [4].

Therefore, the aim of this project was to evaluate, in a systematic approach, the abundance, diversity and domain architecture variability in 19 families of DNA-binding TFs derived from 761 non-redundant prokaryotic genomes. Our analysis of the abundance and distribution of all the identified TFs grouped them into four main classes of families depending on their abundance. In addition, we observed that some TF families have been more evolutionarily successful than others depending on the lifestyle of the organisms. Finally, those TF families with a higher variability in their CDs were identified. This might have some evolutionary implication about which CDs have a higher probability of being found in other TFs in the same family. We believe this is an initial step to study and understand the CDs in TFs.

Materials and methods

Bacterial and archaeal genomes analyzed

Prokaryotic genomes were downloaded from the NCBI web server. To achieve a comparative analysis, only open reading frames (ORFs) that encode predicted proteins were considered in all organisms. In addition, redundant genomes were excluded to avoid any bias associated with overrepresentation of organisms, as reported by Martinez-Nuñez et al. [5]. In brief, the selection of organisms was carried out through concatenation of 21 conserved proteins across bacteria and archaea genomes. In a subsequent step, a single dataset for phylogenetic analysis was constructed, and genomes located closely together on a phylogenetic tree were excluded, keeping one representative; this left a set of 761 representative genomes composed of 672 bacteria and 89 archaea.

Identification of regulatory proteins and companion domains

TFs were identified based on DBD database assignments [6] and two model organisms with experimental evidence collected in the RegulonDB v9.0 [7] and DBTBS v5.0 [8] databases. In
addition, family-specific hidden Markov model (HMM) profiles were constructed from known TFs and from information in the RegulonDB and DBTBS databases to search the bacterial and archaeal genome sequences. The repertoire of TFs identified in the model organisms was cross-checked to calculate the intersection between well-known TFs identified by HMM profiles, i.e. the efficiency of the profiles to identify possible TFs. The HMM profiles were used to identify potential DBDs by setting the E-value at $10^{-4}$ and a coverage of 60%. Those proteins identified by HMM profiles were scrutinized to assess their domain organization by using the Superfamily database assignments [9]. In brief, we ran the whole library of 1659 Superfamily HMM models in the HMMer program against the collection of TFs, with an E-value of $10^{-3}$ considered significant.

Correlations between families and number of different genomes

The abundance of a family in each genome was measured as the number of proteins with at least one predicted hit for the respective family. Diverse families can be present in different proportions in each organism, such as those that occur in only one or two genomes and in only a few proteins, or those families widely distributed in all the genomes. For each family, changes in abundance across different bacteria and archaea can be described in an abundance pattern or profile, as previously reported by Vogel and Chothia [10]. In brief, the abundance counts for one family across different genomes were normalized according to the following formula:

$$A_n = \frac{A_i - A_{avg}}{A_{sdv}}$$

where $A_i$ and $A_n$ are the absolute and normalized abundance counts, respectively, in a particular genome, and $A_{avg}$ and $A_{sdv}$ are the average abundance and standard deviation across all genomes for that family, respectively. This means that the abundance of a family in one genome can be described relative to its abundance in other genomes.

Enrichment analysis

To evaluate the association between a DBD and CDs in the 19 TF families, an enrichment analysis considering a one-tailed Fisher's exact test was conducted, with a statistical significance threshold of p-value < e−10. Multiple testing corrections were performed using the Benjamini-Hochberg step-up false-discovery rate-controlling procedure to calculate adjusted p-values. RStudio was used to perform all the analyses [11].

Evaluation of domain architecture diversity

In order to determine the domain architecture diversity associated with TFs, the weighted domain architecture score (WS) was calculated as described in reference [12]. In brief, the method considers the proteins containing a domain and the total number of proteins under study (total proteins per genome) via the inverse abundance frequency (IAF) statistic, as follows:

$$\mathrm{IAF}(d) = \log_2 \frac{P_t}{P_d}$$

where $P_t$ is the number of total proteins (per genome) and $P_d$ is the number of proteins containing domain $d$.

To measure the association of a protein domain, we defined the inverse variability (IV), obtained from the inverse of the number of distinct partner domain families at the N- and C-terminal sides adjacent to a domain, i.e. the diversity of architectures associated with a specific domain. The definition of the IV of a domain, $d$, is:

$$\mathrm{IV}(d) = \frac{1}{f_d}$$

where $f_d$ is the number of different domain families adjacent to domain $d$. Finally, the WS of a domain is the product of the IAF and the IV of a domain:

$$\mathrm{WS}(d) = \mathrm{IAF}(d) \times \mathrm{IV}(d)$$

Genome size windows

To evaluate the association between the WS and genome sizes, all the genome sizes (measured in ORFs) were binned into 11 non-overlapping intervals with a width of 836 ORFs. The number of intervals was calculated by using Sturges' formula, which groups many different values into equal classes: $k = 1 + \log_2(N)$, where $k$ is the number of equal classes and $N$ is the number of data points, rounded to the nearest integer value [11]. Then, the width of the classes was determined with the formula $c = R/k$, where $R$ is the difference between the highest value (largest genome) and the lowest value (smallest genome).

Results and discussion

Abundance of regulatory protein families in bacterial and archaeal genomes

In order to determine the abundance of DNA-binding TFs in bacterial and archaeal genomes, 761 non-redundant genomes were evaluated, and TFs belonging to 19 families were identified as described in the Methods section [5]. The results summarized in Table 1 and S1 Table show that the most abundant families identified were TetR/AcrR and LysR, which are widely distributed among all the bacterial and archaeal genomes and together represent 28.4% of the complete collection of proteins analyzed in this work. Members of the TetR/AcrR family (15.9 members per genome) are mainly involved in regulation of multidrug resistance, biosynthesis of antibiotics, and pathogenicity (see [13]), i.e. fundamental processes of bacterial resistance. LysR TFs (15.26 members per genome) also regulate a wide variety of transcription units and functions. For instance, in the bacterium Escherichia coli K-12 this family is involved in the regulation of amino acid biosynthesis and catabolism, oxidative stress response and detoxification of the cell [7].

A second group of six families, which included the AraC/XylS, GntR, MarR, GerE, PhoB, and SinR families, with an average of 8.01 members per genome, represents 44% of the total proteins identified in this work. These families regulate a large diversity of functions, such as carbon source assimilation, multiple antibiotic responses, and phosphate regulation, among others [14,15]. A third group of families with an average of 4.12 TFs per genome was identified and included Lrp, Fis, GalR/LacI, and ArsR (representing 18% of the total proteins evaluated). Finally, in a fourth group, seven families with a small number of members (an average of 1.19 members per genome) were identified and grouped together, representing a total of 9.5% of the collection, including ArgR, Fur, LexA, IclR, Crp, BirA, and TrmB. Indeed, protein members of these families are usually present in only one copy per organism or are even absent. These results suggest that, based on their abundance, some of the families described above have been evolutionarily more successful in bacteria and archaea, whereas small families have been less successful in terms of the number of members per genome. However, they
could be associated with global regulation, as has been described for Crp in Escherichia coli K-12, for which two family members have been identified and almost 25% of their genes are under the regulation of this global regulator in some organisms [16].

Table 1. Families of transcription factors identified in the bacteria and archaea genomes.

Family      Total TFs (%) (a)   Different CDs (b)   CD length, aa (±SD)   Enriched domains (e−10)   Pearson's R (c)
PhoB        4948 (5.93)         72                  136.08 (42.06)        8                         0.73
GerE        6074 (7.28)         96                  134.32 (43.21)        24                        0.71
GntR        6639 (7.95)         76                  188.69 (91.35)        9                         0.71
MarR        6305 (7.55)         126                 149.56 (72.11)        30                        0.69
SinR        5939 (7.11)         133                 126.99 (62.05)        35                        0.65
ArsR        3314 (3.97)         86                  138.97 (56.56)        21                        0.65
TetR/AcrR   12097 (14.49)       70                  111.91 (25.01)        4                         0.65
LysR        11610 (13.91)       24                  201.09 (15.79)        1                         0.62
AraC/XylS   6860 (8.22)         106                 136.78 (52.77)        46                        0.61
Crp         2264 (2.71)         53                  133.14 (31.75)        4                         0.61
Lrp         4119 (4.93)         126                 109.55 (51.13)        30                        0.60
LexA        611 (0.73)          36                  119.99 (31.21)        6                         0.53
IclR        1323 (1.58)         80                  164.31 (44.13)        11                        0.51
GalR/LacI   3504 (4.20)         25                  258.52 (32.84)        1                         0.51
Fur         1428 (1.71)         14                  84.50 (71.21)         1                         0.50
Fis         4143 (4.96)         65                  192.78 (53.01)        20                        0.47
BirA        1076 (1.29)         58                  121.56 (62.46)        9                         0.47
ArgR        485 (0.58)          22                  84.42 (38.25)         2                         0.17
TrmB        746 (0.89)          84                  139.84 (69.73)        19                        0.12
Total       83485               457 (d)                                   210

a. Numbers in brackets indicate the percentage of proteins corresponding to the total dataset analyzed.
b. This number represents the total number of different CDs in each family, not the total number of CDs.
c. Correlation value between genome size and proportion of TFs.
d. Total number of different CDs identified. Please note that it does not represent the sum of the numbers in this column.
https://doi.org/10.1371/journal.pone.0195332.t001

Probable expansions as a consequence of duplication events

In order to evaluate how genome size has influenced the repertoire of TFs, we relied on the number of ORFs per organism, under the hypothesis that organisms with a large number of ORFs could be associated with more duplication events, whereas organisms with small genomes would be associated with a smaller number of duplication events [5,17]. To this end, we normalized the abundance of each family of TFs per genome, as described in the Methods section, and an abundance profile was displayed. Subsequently, for each family, we calculated the Pearson correlation between the abundance profile and the estimated number of ORFs (Table 1). From this, 3 families (GntR, GerE, and PhoB) showed a strong correlation between their abundance and the number of ORFs, with an R-value of at least 0.70. This result suggests an expansion of these families in almost all bacteria and archaea, with intermediate abundance in organisms with genome sizes between 1000 and 2000 ORFs, and low frequency or even absence in organisms with small genome sizes (less than 1000 ORFs). Interestingly, these families were included in the second group of abundant families described in the previous section. Therefore, their duplication events could be associated preferentially with an increase in genome sizes. Twelve families, including LysR, TetR/AcrR, and GalR/LacI, have a correlation coefficient (R) between 0.47 and 0.69. These families are substantially increased in some genomes, but this is not directly related to the increase in genome size. Finally, two families (TrmB and ArgR) have correlation coefficients of less than 0.2, suggesting that these families are in low copy number or even absent in bacteria and archaea genomes. Probably, these families have been replaced with alternative regulatory processes in diverse organisms; it also suggests that their absence does not compromise the response of bacteria and archaea to diverse stimuli.

In summary, we consider that the GntR, GerE, and PhoB families follow a similar trend of duplication and loss events as a function of genome dynamics, i.e. when the genome is duplicated, members of these families are also duplicated, but when gene loss occurs, these families are affected, increasing or contracting the family, respectively. In contrast, highly abundant families, such as LysR and TetR/AcrR, do not follow this trend, i.e. these families are abundant in some organisms (e.g. 250 members of the TetR/AcrR family in the bacterium Amycolatopsis mediterranei U32) or scarce in organisms with the same genome size (e.g. 46 members of the TetR/AcrR family in the bacterium Sorangium cellulosum 56); both genomes are of similar sizes (around 9200 proteins). These data reinforce the scenario proposed by Itzkovitz et al. [13] for the evolution of DBDs, where some organisms preferentially use some DBDs and, when these DBDs reach the corresponding upper bound, new DBDs are needed. At such points, organisms shift their TF usage to novel TF families or families with fewer members but with more degrees of freedom and higher maximal numbers.

Lifestyle influences the content of TF families

To determine the influence of lifestyle on the content of TF families, organisms were grouped into four classes according to their lifestyle, in accordance with previous reports [18,19] and expanded with information provided in the corresponding literature deposited in the NCBI database [20] as well as in the BacMap Genome Atlas [21]. Taking this classification into consideration, free-living organisms included 368 genomes, considering nonpathogenic bacteria, even when these organisms have the ability to act as symbionts with plants or animals; pathogens (187 genomes) include organisms reported to produce illness in plants or animals, despite the fact that some have stages in their life cycle where they survive as free-living organisms; extremophiles (158 organisms) are those living in extreme environmental conditions; and intracellular pathogens (48 organisms) include obligate endosymbionts and intracellular pathogens.

This analysis is relevant under the hypothesis that organisms with similar lifestyles would have a similar repertoire of TFs, because TF family abundance is influenced by the environment an organism inhabits. Therefore, we plotted the proportion of TFs per genome according to lifestyle (Fig 1). Based on this analysis, we found that organisms in the free-living category, with their larger genomes, had a higher proportion of TFs (Kruskal-Wallis test, P-value < 2.2e−16), while intracellular organisms exhibited the lowest TF content, which correlates with their small genomes. On the basis of these data, we suggest that the fluctuating environmental conditions encountered by free-living bacteria favor increases in TF content, but this is also associated with the genome size, as previously reported [19]. These results reinforce the idea that free-living organisms have a greater percentage of TF-encoding genes, while intracellular organisms have a lower proportion of genes devoted to gene regulation.

Fig 1. Proportion of TFs in organisms depending on their lifestyles. Organisms were classified into one of four categories: intracellular, pathogens, extremophiles, and free-living, in agreement with [18]. TF proportions were calculated as the ratio between the total number of TFs and the genome size (in ORFs). The line shown inside the box is the median value. The whisker caps represent the minimum and maximum values. Points outside the bars represent the outlier genomes.
https://doi.org/10.1371/journal.pone.0195332.g001

To expand these observations, we decided to explore the proportion of families according to lifestyle, under the hypothesis that large families will contribute significantly to the repertoire of TFs in all bacteria and archaea. To do this, we calculated the rate of occurrence of each family relative to the number of ORFs per organism and per lifestyle, and a hierarchical clustering approach with a Manhattan distance and support tree with average linkage algorithm, with uncentered correlation as a similarity measure, was implemented in the Mev4 program [22]. In this regard, Fig 2 shows that the LysR and TetR/AcrR families contribute significantly to the total repertoire of TFs in organisms of all of the lifestyles, i.e. they are ubiquitously distributed in all the organisms. In contrast, the SinR, AraC/XylS, GntR, GerE, PhoB, and MarR families contribute to the repertoire of TFs in extremophilic, pathogenic, and free-living organisms, whereas the Lrp, Fis, GalR/LacI and ArsR families contribute to the TFs of extremophilic and free-living organisms. Finally, the TrmB, Crp, Fur, IclR, BirA, ArgR, and LexA families do not have an evident contribution to the repertoire of TFs in the organisms as a function of lifestyle, except for TrmB, which contributes to the repertoire of TFs of the extremophile organisms. Taken together, these observations suggest how each family is distributed in each group of organisms and also that perhaps the environmental pressure (or the absence of it) might affect the abundance or scarcity of each TF family.

Protein architecture diversity in the transcriptional factors

In order to gain insights into the protein architecture of the 19 TF families, the complete set of regulatory proteins was analyzed in terms of the composition of structural domains, with special emphasis on the extra domains, or CDs. From this analysis, we found that 32% of the TFs are monodomain proteins (or monolithic), almost 59% of the TFs are organized in two domains, and the remaining 9% exhibit three or more structural domains. In a subsequent step, all these proteins were evaluated according to their WS, which considers the diversity of association between the DBD and the companion domains, even when only the DBD is present. This information was useful to determine how diverse the CDs are within these families. As mentioned in the Introduction section, a CD was defined as a structural domain in the same polypeptide but not forming part of the DBD. Consequently, a DBD might not have a CD at all or it might have one or several CDs. Therefore, from this structural dissection, we found a total of 457 different domains associated with all DBDs in the TF collection (Table 1 and S1 Dataset). These domains were classified into 239 different superfamilies and they can be functionally associated with 35 functional categories according to the Superfamily Database, with metabolism, regulation and signal transduction being the most significant functions (S1 Fig). From this, the most abundant CDs were associated with the most abundant families, such as the LysR,
which contains 24 different CDs, mainly associated with the periplasmic-binding protein-like II (PBP II), followed by TetR/AcrR, which contains 70 different CDs (Table 1), whose most abundant domain is the one associated with tetracyclin binding at the C-terminal domain; and AraC/XylS with 106 different CDs, where the most abundant domain is associated with the arabinose binding and dimerisation domain. Thus, CDs are distributed in different proportions among the TF families and their association with the DBD could contribute to responding to different stimuli.

Fig 2. Distribution of TF families per lifestyle group. Each column denotes a lifestyle as described in Fig 1, whereas rows denote the 19 TF families analyzed in this work. The heatmap bar at the top of the figure indicates the relative abundance of each family per lifestyle. Four groups of TF families were identified based on a hierarchical clustering approach using a Manhattan distance and a supporting tree with an average linkage algorithm, also with uncentered correlation as a similarity measure. Numbers on top of the TF families denote the proportion of these TFs and the numbers in the upper left section show the weight scores (WS).
https://doi.org/10.1371/journal.pone.0195332.g002

Therefore, protein architectures associated with all these TF families were evaluated according to the formula described in reference [12]. In a subsequent step, the weight score (WS), as a measure of domain promiscuity in non-redundant proteomes of bacteria and archaea, was determined and plotted as a function of the genome size. The results were interpreted as follows: values closer to 0 represent promiscuity, and higher values suggest no diversity at all in the protein architecture and, in consequence, proteins must be considered as not promiscuous or monolithic (see S2 Table). In a subsequent step, to identify similar groups of families based on the WS per genome, we calculated a coefficient of variation, and three groups were identified: highly promiscuous families (CV between 0.99 and 1.36), slightly promiscuous families (CV between 1.74 and 2.47), and monolithic families (CV between 3.6 and 4.3) (Fig 3).

Fig 3. Coefficient of Variation per family. In order to define how complex the TFs in the multiple families are, TF families were grouped into three classes (indicated by circles) depending on their coefficient of variation (CV) as follows: 0.9 to 1.36, highly promiscuous; 1.74 to 2.5, intermediate promiscuity; 3.5 to 4.5, not promiscuous or monolithic. On the X-axis the CV per family is indicated. On the Y-axis the total of Companion Domains is indicated. CV was determined as the ratio of the standard deviation to the mean for all the WS in the corresponding TF family (for more details please refer to the Methods section).
https://doi.org/10.1371/journal.pone.0195332.g003

Fig 4. Architecture of families as a function of the genome size. Analysis and classification of the multiple TFs classified them into 3 groups, as shown in S1 Table and Fig 3. Examples of these three groups are shown as follows: a) highly promiscuous, the AraC/XylS family; b) intermediate promiscuity, the BirA family; and c) not promiscuous or monolithic, the Fur family. On the X-axis of each graph the genome sizes are displayed in eleven windows with a length of 836 ORFs each (see Methods). On the Y-axis, the weight score (WS) is represented (see the Methods section for details). The line shown in the box is the median value. The whisker caps represent the minimum and maximum values.
https://doi.org/10.1371/journal.pone.0195332.g004

The first group included the highly promiscuous families of LysR, TetR/AcrR, AraC/XylS, GerE, GalR/LacI, MarR, PhoB, Fis, GntR, SinR and Lrp, with WS values closer to 0 than the other families (Figs 3 and 4a and S2 Fig). The promiscuity index value observed correlates with the number of CDs shown in Table 1 and increases in relation to genome size. In addition, when an enrichment statistical analysis (relationships of less than e−10 were considered significant) was performed to determine whether there is a significant relation between the DBD and the CD, we found a high number of enriched and unique CDs per family, suggesting an increase in their ability to sense a wide diversity of compounds. The LysR family includes the most common TFs found in nature [23,24], and their CDs are also very diverse. In particular, the AraC/XylS family of transcriptional regulators includes diverse proteins that control the expression of genes involved in diverse biological processes, such as metabolism of carbon sources, pathogenesis, and stress responses, among others [15,25]. These proteins usually contain a DBD and a CD that are involved in effector/multimerization function. The DBD is a conserved region of around 60 amino acids that contains two helix-turn-helix (HTH) DNA-binding motifs separated by one α-helix [15,25]. In contrast, the CDs exhibit a large diversity of lengths (an average of 136.7 ± 52.7 amino acids), and for several members of this family the CD has been shown to sense environmental signals by interacting with them. For instance, XylS and BenR are able to detect the presence of aromatic compounds such as toluene [15,25,26]; AraC, RhaS and MelR are able to detect sugars; RegA from Citrobacter rodentium detects bicarbonate; TxtR from Streptomyces scabies, cellobiose; UreR from Proteus mirabilis, urea; ToxT (Vibrio cholerae), bile salts; InvF (Salmonella enterica) and ExsA (Pseudomonas aeruginosa) bind to another protein (SicA and
ExsD, respectively) to exert their role [15,25,26]. All these examples clearly support our point that this family is one of the most “promiscuous”, having CDs with a wide possibility of interactions. In our analysis, the 46 identified CDs were not shared with other families and therefore were unique to this family.

Another interesting case is the GntR family, included in this group of promiscuous families, which correlates with a previous analysis describing the division of this family into four subgroups, according to the type of C-terminal CD (FadR, HutC, MocR, and YtrA), and two minor subfamilies (AraR and PlmA). The C-terminal effector-binding and oligomerization (E-O) domain imposes steric constraints on the DBD, influences the HTH motif and plays an important role in regulation [27]. For example, the E-O domain is able to restrict DBD flexibility in the GntR family, reducing its ability to adapt to varying distances between the parts of a palindromic motif, which reflects on the binding motif structure and therefore on its interaction, or not, with the regulatory region [28].

The second group included those families defined as slightly promiscuous, because their weight scores are slightly related to the genome sizes, such as BirA, Crp, TrmB, IclR, and ArsR (Fig 4 and S2 Fig). These families contain between 53 and 86 different CDs, and their association seems to be similar in bacterial and archaeal genomes. The enrichment analysis identified between 4 and 21 specific domains (Table 1). This second group includes some moonlighting proteins, such as BirA, where the N-terminal domain is required for both transcriptional regulation of biotin synthesis and biotin protein ligase activity. The role of the wing associated with the DBD in the BirA enzymatic reaction is to orient the active site and protect biotinoyl-5'-AMP from attack by solvents [29]. Global regulators (Crp) were also identified with low diversity of CDs, where the cAMP domain is the representative. These results showed that the combination of significant structural domains identified in all the families provides diversity that affords abilities to sense a wide diversity of environmental signals.

The third group included the monolithic families, such as ArgR, Fur, and LexA (Fig 4 and S2 Fig). These families contained high weight scores independent of the genome size, suggesting low diversity in their protein architectures among the organisms analyzed. This finding correlates with the fact that these families contain a low proportion of diverse CDs, including specific ones. One example is the LexA family of transcriptional factors, usually composed of a DBD and a CD, which is mostly involved in proteolytic cleavage [30]; this same domain organization was a constant protein architecture in all the organisms analyzed for this type of protein. The Fur family, which is primarily associated with the “Fur C-terminal domain,” comprises members with a common structure that are mainly involved in iron and zinc metabolism (a few are involved in the peroxide stress response) [31]. Therefore, members of this family also show some promiscuity. Finally, the DBD that defines the ArgR family is the “arginine repressor C-terminal domain,” and it is the same in almost all the ArgR-related proteins in bacterial and archaeal genomes [32]. Despite the fact that identity is 27% among the members of this family, it has been shown that they are involved in metabolism and have structural similarity (reviewed in reference [32]).

Fig 5. Proportion of CDs shared between different families. In order to identify companion domains (CDs) that are common to more than one family of TFs, each of the CDs associated with TF families was compared against the other families. Only domains that were identified as enriched were plotted as a heatmap (upper section), in which 0 represents absence and 1.0 represents 100% of CDs in common with two or more families.
https://doi.org/10.1371/journal.pone.0195332.g005

Finally, we asked how the CDs are distributed among the TF families, to identify CDs associated with more than one family and those CDs specifically associated with some families. Fig 5 shows which of the tested families had higher promiscuity, but this time the level of promiscuity was based on sharing the CDs with other families. From this analysis, it is clear that the ArsR, IclR, Lrp, MarR, and TrmB families share many CDs with other families, while other families, such as Crp, GalR, Fur, BirA, ArgR, AraC/XylS, Fis, and PhoB, share fewer CDs. The fact that a CD is shared by many
TF families might be explained by domain shuffling among proteins, in which gene fusion plays a central role [33]. If this is true, then those families with a high sharing rate are the most promiscuous among this type of proteins, while those not sharing beyond the family can be considered “conservative”. This would mean, in other words, that those in the former group might be subjected to more domain shuffling than those in the latter. Further analysis to test this would be interesting to perform.

In summary, characterizing and analyzing CDs not only gives a hint about their distribution, variability and function, but also about how they evolved and are shared among TFs. These domains (also referred to as EDBs) might be of significance in the synthetic biology field, because the more we know about their function and origins, the more we can use them as “biological detectors” [3].

Conclusions

In this work, we have evaluated 19 families of DNA-binding TFs for their architectures in terms of structural domains, how abundant they are in the bacteria and archaea genomes, and how they are distributed according to organism lifestyle. Moreover, an insight into how these CDs are shared within and between the families is also discussed. For the examined families, an abundance degree was defined, suggesting which of them have been evolutionarily successful. As for lifestyle, free-living organisms have the most prosperous families, which also happen to be the ones with the highest promiscuity. This combination could be explained because precisely these types of organisms are in contact with the environment and other organisms, which can transfer and acquire genetic material. Our data reinforce the notion that increased gene complexity also requires the development of mechanisms for gene regulation at the transcription level [34], in particular, the combination of DBDs and CDs, suggesting that the interplay of these structural domains could increase the ability of the organisms to recognize and respond to diverse environmental stimuli. Also, a bias towards specific associations between DBDs and CDs was identified that depended on the CDs present and the frequency at which the TFs were grouped into
the three classes, based on the promiscuity of the CDs. In summary, in this study we found that the protein architectures, duplication events, and their interplay in association with genome sizes could help organisms contend with gene complexity at the transcriptional level, increasing the ability of bacteria and archaea to recognize and respond to diverse stimuli and environmental challenges.

Supporting information

S1 Fig. Functional categories of superfamily associated to CDs. 239 superfamilies were classified into one of the 6 major categories (General, Information, Intra-cellular processes, Metabolism, Regulation, and other). On the X-axis is the family name. On the Y-axis is the proportion of functional category. (TIFF)

S2 Fig. Boxplot of weight scores for the TF families. Analysis is shown for members of the highly promiscuous group: A) TetR/AcrR, LysR, GerE, GntR, Fis, GalR/LacI, MarR, PhoB, SinR, and Lrp; B) intermediately promiscuous: ArsR, IclR, Crp, and TrmB; and C) monolithic or non-promiscuous: ArgR, Fur and LexA. On the X-axis, the genome sizes are displayed in eleven windows with a length of 836 ORFs. On the Y-axis, the WS is represented. The mean of each window is displayed with a line. TF families were grouped into three classes depending on their CV, as follows: 0.9–1.36, highly promiscuous; 1.74–2.5, intermediate promiscuity; 3.5–4.5, not promiscuous. (PDF)

S1 Table. Total of TFs per genome. Columns are as follows: First column indicates the lifestyle as described in the Methods section; second column shows the organism name of the genome analyzed; columns 3 to 21 indicate the TF families and the number of members per family identified; finally, column 22 shows the total TFs for each organism. (XLSX)
S2 Table. Weight scores associated to the families per genome. Weight scores (WS) were calculated as described in the Methods section. Columns are as follows: column 1, name of the organism; column 2, number of identified TFs for each organism; columns 3 to 21, weight scores for each TF family. (XLSX)

S1 Dataset. Transcription factors identified in bacteria and archaea genomes. The information of genomes and transcription factors is organized in a tabular format. The file S1_Dataset.zip contains 761 files, one per genome. The information is organized as follows: Genome ID, identifier from NCBI database; Supfam ID; physical position of the Domain; E-value; Supfam model assignment; Supfam description; E-value associated to the family; Family ID; Family ID; representative PDB associated to each assignment. (ZIP)

Acknowledgments

The authors would like to thank our respective Lab groups for their assistance and support. Joaquin Morales and Sandra Sauza are very much appreciated for their computational support.

Author Contributions

Conceptualization: Ernesto Perez-Rueda, Israel Sanchez, J. Antonio Ibarra.
Data curation: Ernesto Perez-Rueda, Rafael Hernandez-Guerrero, Mario Alberto Martinez-Nuñez, Israel Sanchez, J. Antonio Ibarra.
Formal analysis: Ernesto Perez-Rueda, Rafael Hernandez-Guerrero, Mario Alberto Martinez-Nuñez, Dagoberto Armenta-Medina, Israel Sanchez, J. Antonio Ibarra.
Funding acquisition: Ernesto Perez-Rueda, J. Antonio Ibarra.
Investigation: Ernesto Perez-Rueda, Mario Alberto Martinez-Nuñez, J. Antonio Ibarra.
Methodology: Ernesto Perez-Rueda, Rafael Hernandez-Guerrero, Mario Alberto Martinez-Nuñez, Dagoberto Armenta-Medina, Israel Sanchez.
Project administration: J. Antonio Ibarra.
Software: Rafael Hernandez-Guerrero, Mario Alberto Martinez-Nuñez, Dagoberto Armenta-Medina, Israel Sanchez.
Supervision: Ernesto Perez-Rueda, Rafael Hernandez-Guerrero, Mario Alberto Martinez-Nuñez, J. Antonio Ibarra.
Validation: Ernesto Perez-Rueda, Rafael Hernandez-Guerrero, Mario Alberto Martinez-Nuñez.
Writing – original draft: Ernesto Perez-Rueda, J. Antonio Ibarra.
Writing – review & editing: Ernesto Perez-Rueda, J. Antonio Ibarra.

References

1. Engstrom MD, Pfleger BF. Transcription control engineering and applications in synthetic biology. Synth Syst Biotechnol. 2017;2(3):176–91. https://doi.org/10.1016/j.synbio.2017.09.003 PMID: 29318198
2. Perez-Rueda E, Martinez-Nunez MA. The repertoire of DNA-binding transcription factors in prokaryotes: functional and evolutionary lessons. Sci Prog. 2012;95(Pt 3):315–29. PMID: 23094327
3. Fernandez-Lopez R, Ruiz R, de la Cruz F, Moncalian G. Transcription factor-based biosensors enlightened by the analyte. Front Microbiol. 2015;6:648. https://doi.org/10.3389/fmicb.2015.00648 PMID: 26191047
4. Balderas-Martinez YI, Savageau M, Salgado H, Perez-Rueda E, Morett E, Collado-Vides J. Transcription factors in Escherichia coli prefer the holo conformation. PLoS One. 2013;8(6):e65723. https://doi.org/10.1371/journal.pone.0065723 PMID: 23776535
5. Martinez-Nunez MA, Poot-Hernandez AC, Rodriguez-Vazquez K, Perez-Rueda E. Increments and duplication events of enzymes and transcription factors influence metabolic and regulatory diversity in prokaryotes. PLoS One. 2013;8(7):e69707. https://doi.org/10.1371/journal.pone.0069707 PMID: 23922780
6. Kummerfeld SK, Teichmann SA. DBD: a transcription factor prediction database. Nucleic Acids Res. 2006;34(Database issue):D74–81. https://doi.org/10.1093/nar/gkj131 PMID: 16381970
7. Gama-Castro S, Salgado H, Santos-Zavaleta A, Ledezma-Tejeida D, Muniz-Rascado L, Garcia-Sotelo JS, et al. RegulonDB version 9.0: high-level integration of gene regulation, coexpression, motif clustering and beyond. Nucleic Acids Res. 2016;44(D1):D133–43. https://doi.org/10.1093/nar/gkv1156 PMID: 26527724
8. Sierro N, Makita Y, de Hoon M, Nakai K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucleic Acids Res. 2008;36(Database issue):D93–6. https://doi.org/10.1093/nar/gkm910 PMID: 17962296
9. Wilson D, Pethica R, Zhou Y, Talbot C, Vogel C, Madera M, et al. SUPERFAMILY—sophisticated comparative genomics, data mining, visualization and phylogeny. Nucleic Acids Res. 2009;37(Database issue):D380–6. https://doi.org/10.1093/nar/gkn762 PMID: 19036790
10. Vogel C, Chothia C. Protein family expansions and biological complexity. PLoS Comput Biol. 2006;2(5):e48. https://doi.org/10.1371/journal.pcbi.0020048 PMID: 16733546
11. R-programming. Development Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. 2011.
12. Lee B, Lee D. Protein comparison at the domain architecture level. BMC Bioinformatics. 2009;10 Suppl 15:S5.
13. Ramos JL, Martinez-Bueno M, Molina-Henares AJ, Teran W, Watanabe K, Zhang X, et al. The TetR family of transcriptional repressors. Microbiol Mol Biol Rev. 2005;69(2):326–56. https://doi.org/10.1128/MMBR.69.2.326-356.2005 PMID: 15944459
14. Gallegos MT, Michan C, Ramos JL. The XylS/AraC family of regulators. Nucleic Acids Res. 1993;21(4):807–10. PMID: 8451183
15. Ibarra JA, Perez-Rueda E, Segovia L, Puente JL. The DNA-binding domain as a functional indicator: the case of the AraC/XylS family of transcription factors. Genetica. 2008;133(1):65–76. https://doi.org/10.1007/s10709-007-9185-y PMID: 17712603
16. Perez-Rueda E, Janga SC, Martinez-Antonio A. Scaling relationship in the gene content of transcriptional machinery in bacteria. Mol Biosyst. 2009;5(12):1494–501. https://doi.org/10.1039/b907384a PMID: 19763344
17. Ranea JA, Grant A, Thornton JM, Orengo CA. Microeconomic principles explain an optimal genome size in bacteria. Trends in genetics: TIG. 2005;21(1):21–5. https://doi.org/10.1016/j.tig.2004.11.014 PMID: 15680509
18. Cases I, de Lorenzo V, Ouzounis CA. Transcription regulation and environmental adaptation in bacteria. Trends Microbiol. 2003;11(6):248–53. PMID: 12823939
19. Martinez-Nunez MA, Rodriguez-Vazquez K, Perez-Rueda E. The lifestyle of prokaryotic organisms influences the repertoire of promiscuous enzymes. Proteins. 2015;83(9):1625–31. https://doi.org/10.1002/prot.24847 PMID: 26109005
20. Sayers EW, Barrett T, Benson DA, Bolton E, Bryant SH, Canese K, et al. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2012;40(Database issue):D13–25. https://doi.org/10.1093/nar/gkr1184 PMID: 22140104
21. Stothard P, Van Domselaar G, Shrivastava S, Guo A, O’Neill B, Cruz J, et al. BacMap: an interactive picture atlas of annotated bacterial genomes. Nucleic Acids Res. 2005;33(Database issue):D317–20. https://doi.org/10.1093/nar/gki075 PMID: 15608206
22. Saeed AI, Sharov V, White J, Li J, Liang W, Bhagabati N, et al. TM4: a free, open-source system for microarray data management and analysis. Biotechniques. 2003;34(2):374–8. PMID: 12613259
23. Maddocks SE, Oyston PC. Structure and function of the LysR-type transcriptional regulator (LTTR) family proteins. Microbiology. 2008;154(Pt 12):3609–23. https://doi.org/10.1099/mic.0.2008/022772-0 PMID: 19047729
24. Pareja E, Pareja-Tobes P, Manrique M, Pareja-Tobes E, Bonal J, Tobes R. ExtraTrain: a database of Extragenic regions and Transcriptional information in prokaryotic organisms. BMC Microbiol. 2006;6:29. https://doi.org/10.1186/1471-2180-6-29 PMID: 16539733
25. Gallegos MT, Schleif R, Bairoch A, Hofmann K, Ramos JL. Arac/XylS family of transcriptional regulators. Microbiol Mol Biol Rev. 1997;61(4):393–410. PMID: 9409145
26. Yang J, Tauschek M, Robins-Browne RM. Control of bacterial virulence by AraC-like regulators that respond to chemical signals. Trends Microbiol. 2011;19(3):128–35. PMID: 21215638
27. Rigali S, Schlicht M, Hoskisson P, Nothaft H, Merzbacher M, Joris B, et al. Extending the classification of bacterial transcription factors beyond the helix-turn-helix motif as an alternative approach to discover new cis/trans relationships. Nucleic Acids Res. 2004;32(11):3418–26. https://doi.org/10.1093/nar/gkh673 PMID: 15247334
28. Suvorova IA, Korostelev YD, Gelfand MS. GntR Family of Bacterial Transcription Factors and Their DNA Binding Motifs: Structure, Positioning and Co-Evolution. PLoS One. 2015;10(7):e0132618. https://doi.org/10.1371/journal.pone.0132618 PMID: 26151451
29. Chakravartty V, Cronan JE. The wing of a winged helix-turn-helix transcription factor organizes the active site of BirA, a bifunctional repressor/ligase. The Journal of biological chemistry. 2013;288(50):36029–39. https://doi.org/10.1074/jbc.M113.525618 PMID: 24189073
30. Butala M, Zgur-Bertok D, Busby SJ. The bacterial LexA transcriptional repressor. Cell Mol Life Sci. 2009;66(1):82–93. https://doi.org/10.1007/s00018-008-8378-6 PMID: 18726173
31. Fillat MF. The FUR (ferric uptake regulator) superfamily: diversity and versatility of key transcriptional regulators. Arch Biochem Biophys. 2014;546:41–52. https://doi.org/10.1016/j.abb.2014.01.029 PMID: 24513162
32. Charlier D. Arginine regulation in Thermotoga neapolitana and Thermotoga maritima. Biochem Soc Trans. 2004;32(Pt 2):310–3. PMID: 15046597
33. Pasek S, Risler JL, Brezellec P. Gene fusion/fission is a major contributor to evolution of multi-domain bacterial proteins. Bioinformatics. 2006;22(12):1418–23. https://doi.org/10.1093/bioinformatics/btl135 PMID: 16601004
34. Ranea JA, Buchan DW, Thornton JM, Orengo CA. Evolution of protein superfamilies and bacterial genome size. J Mol Biol. 2004;336(4):871–87. https://doi.org/10.1016/j.jmb.2003.12.044 PMID: 15095866
