Summary 1
PF00399 - Yeast PIR protein repeat
PF00399 Protein family information
The family contains several domain architectures. In some of them, regions that are not annotated but are well predicted were found; new annotations have been created for these at the N-terminus and C-terminus.
https://www.ebi.ac.uk/interpro/protein/UniProt/A0A0L8VNF4/alphafold/ (C-term)
https://www.ebi.ac.uk/interpro/protein/UniProt/A0A0B4HCL9/alphafold/ (N-term)
We found a group of proteins in which the repeat region covers most of the protein length.
A0A072PHP4
A0A072PHP4 Interpro sequence information
Sequence:
>tr|A0A072PHP4|A0A072PHP4_9EURO Uncharacterized protein OS=Exophiala aquamarina CBS 119918 OX=1182545 GN=A1O9_04500 PE=4 SV=1
MASLRYVPQCKGLPACQLGSNIAPTGAPVSQISDGQPQAPTGAPVSQISDGQPQAPTGAPVSQISDGQPQAPTGAPV
SQISDGQPQAPTGAPVTQISDGQPQAPTGVPVTQISDGQPQAPTGAPVSQISDGQVQAATGTSAAPAAYTGAAHRNG
ISGGLAIAGAIAGAVILI
MRF results:
Region 1: 21 - 132, 16 aa length, 7 units
NIAPTGAPVSQISDGQ
PQAPTGAPVSQISDGQ
PQAPTGAPVSQISDGQ
PQAPTGAPVSQISDGQ
PQAPTGAPVTQISDGQ
PQAPTGVPVTQISDGQ
PQAPTGAPVSQISDGQ
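The MRF coordinates can be checked against the sequence directly. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive (which is consistent with the units listed above); `split_units` is an illustrative helper, not part of MRF.

```python
# Sequence of A0A072PHP4, copied from the FASTA record above.
SEQ = (
    "MASLRYVPQCKGLPACQLGSNIAPTGAPVSQISDGQPQAPTGAPVSQISDGQPQAPTGAPVSQISDGQPQAPTGAPV"
    "SQISDGQPQAPTGAPVTQISDGQPQAPTGVPVTQISDGQPQAPTGAPVSQISDGQVQAATGTSAAPAAYTGAAHRNG"
    "ISGGLAIAGAIAGAVILI"
)


def split_units(seq, start, end, unit_len):
    """Cut the 1-based, inclusive region [start, end] into fixed-length units."""
    region = seq[start - 1:end]
    if len(region) % unit_len != 0:
        raise ValueError("region length is not a multiple of the unit length")
    return [region[i:i + unit_len] for i in range(0, len(region), unit_len)]


# Region 1: 21 - 132, 16 aa length, 7 units
units = split_units(SEQ, 21, 132, 16)
for unit in units:
    print(unit)
```

This only works for ungapped regions such as this one; alignments that contain gap characters need the gaps removed before comparison.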
TAPAS results:
PF00402 - Calponin family repeat
PF00402 Protein family information
The family contains several domain architectures; we evaluated the ones in which the repeat region covers the largest part of the protein sequence. We found that there is an amyloid region.
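One way to quantify how much of a protein the repeat region occupies is the fraction of residues covered by MRF regions. The sketch below is an assumption about how such a ranking could be done, not the procedure actually used. It takes the MRF regions reported for the three proteins in this section, uses sequence lengths counted from their FASTA records, and merges overlapping regions so that residues are not counted twice. The helper names are hypothetical.

```python
def merge_intervals(regions):
    """Merge overlapping or adjacent 1-based, inclusive (start, end) intervals."""
    merged = []
    for start, end in sorted(regions):
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def repeat_coverage(regions, protein_len):
    """Fraction of the protein covered by at least one repeat region."""
    covered = sum(end - start + 1 for start, end in merge_intervals(regions))
    return covered / protein_len


# (MRF regions, sequence length) per accession; lengths are counted from the
# FASTA records in this section and should be re-checked against UniProt.
proteins = {
    "A0A183PNT2": ([(37, 185)], 213),
    "A0A077YXR0": ([(45, 348), (15, 29)], 369),
    "P37397": ([(5, 324), (303, 322)], 330),
}

coverage = {acc: repeat_coverage(regs, n) for acc, (regs, n) in proteins.items()}
for acc, frac in sorted(coverage.items(), key=lambda item: item[1], reverse=True):
    print(f"{acc}\t{frac:.2f}")
```

Merging matters for P37397, where Region 2 (303 - 322) lies entirely inside Region 1 (5 - 324).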
A0A183PNT2
A0A183PNT2 Interpro sequence information
A0A183PNT2 AlphafoldDB sequence information
Sequence:
>tr|A0A183PNT2|A0A183PNT2_9TREM Calponin OS=Schistosoma mattheei OX=31246 GN=SMTD_LOCUS16018 PE=3 SV=1
CQRNGFQGPVCGPKPTYSQPRQWTEEKLRAGEGIIGLQAGTNKLASQKGMSFGAQRHIAD
IRCAAVLSLQMGTNKFASQKGMSFSNQRHIADIKCDDLSQEGKSVINLQMGTNQFASQKG
MRIGSSRHIADIRCDDISKEGQNVISLQYGTNKLASQKGMRMGGQRHIADIRCDNLSADG
ASVIGLQMGLPQNQVASQAGMSFGAHRHINDSH
MRF results:
Region 1: 37-185, 39 aa length, 4 units
LQAGTNKLASQKGMSFGAQRHIADIRC-------AAVLS
LQMGTNKFASQKGMSFSNQRHIADIKCDDLSQEGKSVIN
LQMGTNQFASQKGMRIGSSRHIADIRCDDISKEGQNVIS
LQYGTNKLASQKGMRMGGQRHIADIRCDNLSADGASVIG
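Because this alignment contains gaps, a useful sanity check is that the rows, once the gap characters are removed, spell out exactly the residues of the reported region. A minimal sketch, assuming 1-based inclusive coordinates and `-` as the only gap character (the function name is illustrative):

```python
# Sequence of A0A183PNT2, copied from the FASTA record above.
SEQ = (
    "CQRNGFQGPVCGPKPTYSQPRQWTEEKLRAGEGIIGLQAGTNKLASQKGMSFGAQRHIAD"
    "IRCAAVLSLQMGTNKFASQKGMSFSNQRHIADIKCDDLSQEGKSVINLQMGTNQFASQKG"
    "MRIGSSRHIADIRCDDISKEGQNVISLQYGTNKLASQKGMRMGGQRHIADIRCDNLSADG"
    "ASVIGLQMGLPQNQVASQAGMSFGAHRHINDSH"
)

# MRF Region 1 alignment rows (37-185, 39 aa length, 4 units).
ROWS = [
    "LQAGTNKLASQKGMSFGAQRHIADIRC-------AAVLS",
    "LQMGTNKFASQKGMSFSNQRHIADIKCDDLSQEGKSVIN",
    "LQMGTNQFASQKGMRIGSSRHIADIRCDDISKEGQNVIS",
    "LQYGTNKLASQKGMRMGGQRHIADIRCDNLSADGASVIG",
]


def alignment_matches(seq, start, end, rows):
    """True if the rows, with gaps removed, equal seq[start..end] (1-based, inclusive)."""
    ungapped = "".join(row.replace("-", "") for row in rows)
    return ungapped == seq[start - 1:end]


ok = alignment_matches(SEQ, 37, 185, ROWS)
print(ok)
```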
TAPAS results:
A0A077YXR0
A0A077YXR0 Interpro sequence information
A0A077YXR0 AlphafoldDB sequence information
Sequence:
>tr|A0A077YXR0|A0A077YXR0_TRITR Calponin domain containing protein OS=Trichuris trichiura OX=36087 GN=TTRE_0000115401 PE=3 SV=1
MSDAENEQTDQDGQDQEDMEELDQEAIEEAHKRREHARAEREEVANLAAFGKPSALPKEK
LMRSEGIIPIQSGTNKYASQKGMTGFGRPRDVIDKVKCENLKPIEDESKIQSLRDVLPLQ
SGTNKFASQKGMTGFGCPRDVINKTKGTGGTGEIEEEKAKATDGVIPLQAGTNKLASQAG
MTGFGMPRSVLHRFNPDQDRQSQGFVHLQAGTNKLATQQGMTSFGSPRTNVTKYKDSQRG
EMANDESVIPRQTTGYKEGANQAGMTGFGMPRNTTIMQLSRQEQKSQGLIPYQMGINWGD
SQAGKTGFGMPRQVFTNYTDDIRGELPEELARMPDVPFWSGMEKLASQAGMTAMGMPRDV
KGTYLRRLW
MRF results:
Region 1: 45 - 348, 67 aa length, 7 units
ANLAAFGKP-SALPK------------E---------K---LMRSEGI-----IPIQ-SGTNKYASQ
KGMTGFGRPRDVIDKVKCENLKPIEDES---------K---IQSLRDV-----LPLQ-SGTNKFASQ
KGMTGFGCPRDVINKTK---------GTGGTGEIEEEK---AKATDGV-----IPLQ-AGTNKLASQ
AGMTGFGMPRSVLHRFN---------PD---------Q---DRQSQGF-----VHLQ-AGTNKLATQ
QGMTSFGSPRTNVTKYK---------DS---------QRGEMANDESV-----IPRQTTGYKEGANQ
AGMTGFGMPRNTTIMQL---------SR---------Q---EQKSQGL-----IPYQ-MGINWGDSQ
AGKTGFGMPRQVFTNYT---------DDI--------R---GELPEELARMPDVPFW-SGMEKLASQ
Region 2: 15 - 29, 8 aa length, 2 units
DQEDMEEL
DQEAIEE-
TAPAS results:
P37397
P37397 Interpro sequence information
P37397 AlphafoldDB sequence information
Sequence:
>sp|P37397|CNN3_RAT Calponin-3 OS=Rattus norvegicus OX=10116 GN=Cnn3 PE=1 SV=1
MTHFNKGPSYGLSAEVKNKIASKYDQQAEEDLRNWIEEVTGMGIGTNFQLGLKDGIILCE
LINKLQPGSVKKVNESSLNWPQLENIGNFIKAIQAYGMKPHDIFEANDLFENGNMTQVQT
TLVALAGLAKTKGFHTTIDIGVKYAEKQTRRFDEGKLKAGQSVIGLQMGTNKCASQAGMT
AYGTRRHLYDPKMQTDKPFDQTTISLQMGTNKGASQAGMSAPGTRRDIYDQKLTLQPVDN
STISLQMGTNKVASQKGMSVYGLGRQVYDPKYCAAPTEPVIHNGSQGTGTNGSEISDSDY
QAEYPDEYHGEYPDEYPREYQYGDDQGIDY
MRF results:
Region 1: 5 - 324, 100 aa length, 4 units
NKGPSY-GLSAE-VK-NKIASKYD-----QQA-EEDLRNWIEEVT-GMGIGTN------FQLGLKDGIILCEL--IN--KLQPGSVKKVNESSLNWPQLE
NIGNFIKAIQAYGMKPHDI---FEANDLFENG-NM---TQVQTTLVAL-AGLAKTKGFHTTIDIGVKYAEKQTRRFDEGKLKA------GQSVIGLQMGT
NKCASQAGMTAYGTR-RHL---YD-----PKMQTD---KPFDQTTISLQMGTNKGA---SQAGMSAPGTRRDI--YDQ-KLTLQPV---DNSTISLQMGT
NKVASQKGMSVYGLG-RQV---YD-----PKY-CA---APTEPVIHNGSQGTGTNG---SE--ISDSDYQAEY--PDE-YHGEYP----DEYPREYQYGD
Region 2: 303 - 322, 4 aa length, 5 units
EYPD
EYHG
EYPD
EYPR
EYQY
TAPAS results:
PF00414 - Neuraxin and MAP1B repeat
PF00414 Protein family information
The family contains several domain architectures. In some of them, regions that are not annotated but are well predicted were found; new annotations have been created for these at the N-terminus and C-terminus.
The proteins usually have more than 2000 aa; we tried to predict the structure in the cluster.
We found a group of proteins that are smaller than the rest, and a structure prediction was made for them.
A0A1V4L0S5
A0A1V4L0S5 Interpro sequence information
A0A1V4L0S5 AlphafoldDB sequence information
Sequence:
>tr|A0A1V4L0S5|A0A1V4L0S5_PATFA Microtubule-associated protein 1B OS=Patagioenas fasciata monilis OX=372326 GN=MAP1B PE=4 SV=1
MSISEGTVSDKSATPVDEVVAEDTYSHIEGVASVSTASVATSSFPEPTTDDVSPSLHAEV
GSPHSTEVDDSLSVSVVQTPTTFQETEMSPSKEECPRPMSISPPDFSPKTAKSRTLVHDH
RSPEQSTMSVEFGQESPEQSLAMDFSRQSPEYPTLGTSMQHISENGPTEVDYSPSDIQEP
TYARKISPVEQSSYSQEKDISEIISVSQIEASSSTSSAHTPSQVTSPLPEETFSGVVPPT
DMSLHSFTSEKVQSLGEKLSPKSDLSPLTPRESSPLYSPSFPDSPPEITGAVSASHTPSL
SLQMSSVTAFGYQESLTKHSPEPLLSPEKEDSEKSSRSPEDLSYSYEATEKTTRSPEDIS
YSYEADGKPTRSLQTTVYSYETTGKTTRSPEVADYSYEKIAKDMRTSETTDYSYEMPGKT
TRSPEVMDYSYEMTGKTTRSPEAKDYSYETTGKTIKSSEATDYAYEITGKSTKSPEATDY
SYERIGKATRSPDTMDYSYETTGKSTKSPEAISPCYETTGRTTMSPEAVAYSYETTEKVS
SSPEVTDYSFETTGRATRSPKATSYSYEATAHFTPGKSLAESRQDVDLCLVSSCEYKHPK
TELSPSFINPNPLEWFASEEQPQDQEKPLTQSGGAQPPSGGKQQGRQCDETPPTSVSESA
PSQTDSDVPPETEECPSITADANIDSEDESETIPTDKTITYKHIDPPPVPMQDRSPSPRH
PDVSMVDPEALPVDQNLGKSLKKDLKEKTKTKKQGTKTKSSSPVKKSDGKSKQGASPKPA
TKESLDKISKTVSSKKKESVEKATKNISTPEVKSRVEEKDKDTKNAANTTTSKSAKTATP
GPGNTKVAKSTAVPPGPPVYLDLVYIPNHSNSKNVDVEFFKRVRSSYYVVSGNDAAAEEP
SRAVLDSLLEGKAQWESNLQVTLIPTHDSEVMREWYQETHEKQQDLNIMVLASSSTVVMQ
DESFPACKIEL
MRF results:
Region 1: 337 - 574, 17 aa length, 14 units
RSPEDLSYSYEATEKTT
RSPEDISYSYEADGKPT
RSLQTTVYSYETTGKTT
RSPEVADYSYEKIAKDM
RTSETTDYSYEMPGKTT
RSPEVMDYSYEMTGKTT
RSPEAKDYSYETTGKTI
KSSEATDYAYEITGKST
KSPEATDYSYERIGKAT
RSPDTMDYSYETTGKST
KSPEAISPCYETTGRTT
MSPEAVAYSYETTEKVS
SSPEVTDYSFETTGRAT
RSPKATSYSYEATAHFT
Region 2: 705 - 807, 56 aa length, 2 units
DPPPVPMQDRSPSPRHPDVSMVDPEAL-PVDQNLGKSLKKDLKEKTKTKKQGTKTK
SSSPVKKSD-GKSKQGASPKPATKESLDKISKTVSSKKKESVEKATKN-------I
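Region headers in notes like these vary in spacing, so a tolerant parser plus a quick consistency check can be useful. The sketch below is illustrative only (the regular expression and function names are assumptions, not part of MRF). It parses a header and checks that the residue span does not exceed unit length times number of units, which must hold because gaps in the alignment can only shorten the span.

```python
import re

REGION_RE = re.compile(
    r"Region\s+(\d+)\s*:\s*(\d+)\s*-?\s*(\d+)\s*,?\s*"
    r"(\d+)\s*aa length\s*,\s*(\d+)\s*units"
)


def parse_region(line):
    """Parse a region header into a dict of integers."""
    match = REGION_RE.search(line)
    if match is None:
        raise ValueError(f"not a region header: {line!r}")
    keys = ("region", "start", "end", "unit_len", "units")
    return dict(zip(keys, map(int, match.groups())))


def span_is_plausible(rec):
    """The residue span cannot exceed unit_len * units (gaps only shrink it)."""
    span = rec["end"] - rec["start"] + 1
    return span <= rec["unit_len"] * rec["units"]


# The two A0A1V4L0S5 regions, written with deliberately loose spacing.
headers = [
    "Region 1:337 - 574 ,17 aa length, 14 units",
    "Region 2:705 -807, 56 aa length,2 units",
]
records = [parse_region(h) for h in headers]
for rec in records:
    print(rec, span_is_plausible(rec))
```

A header that fails the check points to a typo in the coordinates, the unit length or the unit count.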
TAPAS results:
A0A250Y8D3
A0A250Y8D3 Interpro sequence information
A0A250Y8D3 AlphafoldDB sequence information
Sequence:
>A0A250Y8D3 1-2341
MITDAARHKLLVLTGQCFENTGELILQSGSFSFQNFIEIFTDQEIGELLSTTHPANKASLTLFCPEEGDWKNSNLDRHNL
QDFINIKLNSASILPEMEGLSEFTEYLSESVEVPSPFDILEPPTSGGFLKLSKPCCYIFPGGRGDSALFAVNGFNMLING
GSERKSCFWKLIRHLDRVDSILLTHIGDDNLPGINSVLQRKIAELEEEQSQGSTTNSDWMKNLISPDLGVVFLNVPENLK
NPEPNIKMKRGIEEACFTLQYLTKLSMKPEPLFRSVGNTIDPVILFQKMGVGKLEMYVLNPVKNSKEMQYFMQQWTGTNK
DKAELILPNGQEVDIPIPYLTSVSSLIVWHPANPAEKIIRVLFPGNSTQYNILEGLEKLKHLDFLKQPLATQKDLTGQVP
TPTVKQVKLKQRADSRESLKPAAKPLPSKSVRKDSKEEAPDVSKANLVEKPPKVESKEKVIVKKDKPVKTETKPPVTEKE
VPSKEEQPPAKVEVPEKPATDVKPKITKEKVVKKETKAKVEEKKEEKEKPKKEVAKKEEKTPVKKEEKPKKEEVKKEVKK
EIKKEEKKEFKKEVKKETPMKEAKKEIKKEEKKEVKKEEKEPKKEVKKLSKDTKKTSTPLSDTKKPAALKPKVPKKEEPV
KKESVTAGKPKEKGKIKVVKKESKPTEAAAAAAIGTVAATAAVAGIVAAGPAKELEAERSLMSSPEDLTKDFEELKAEEI
DVAKDIKPQLELVDDEEKLKETESVEAYVIQKETEVIKGPAESPDEGITTTEGEGECEQTPEELEPVEKQAVDDIEKFED
EGAGFEESSETGDYEEKAETEEAEEPEEDGEENVCESTSKLSPTEDEESGKAEADVHIKEKRESVASADDRAEEDMEEGV
EKGEAEQSEEEGEEDKAEDAREEEYEPEKAEAEDYVRAVVDKAAEAGGTEDQYGFPTMPPKQPGAQSPGREPASSIHDET
LPGGSESEATASDEENREDQPEEFTATSGYTQSTIEISSEPTPMDEMSTPRDVMSDETNNEETESPSQEFVNITKYESSL
YSQEFSKPVVASFNGLSDGSKTDATEGKDYSATASTISPPSSMEEDKFSKSALRDAYCSEEKAEKASAMLDIKGTVSPVS
DERLSPAKSPSLSPSPPSPIEKTPLGERSVNFSLTPNEIKVSTEAEAVSVSPEVTQEVVEEHCASPEEKTLEVVSPSQSV
TGSAGHTPYYQSPTEEKSSHLPTEVTGKPQAVPVSFEFGDAKDESERASISPMDEPVPDSESPIEKVLSPLRSPPLFGSE
SAYESFLSADGTAPERCTESPFEGKDGKPSSPDQISPISEMTSTGLYQDEREGKSTDFIPIKEDFGPEKKSDDMEAMGAQ
PALALDERKLGGDVSPTQIDVSQFGSFKEDTKMSISEGTVSDKSATPVDEGIAEDTHSHMEGVASVSTASVATSSFPEPT
TDDVSPSLHAEVGSPHSTEVDDSLSVSVVQTPTTFQETEMSPSKEECPRPMSISPPDFSPKTAKSRTPVQDHRSEQSSMS
IEFGQESPEHSLAMDFSRQSPDHSTVGAGVLHITENGPTEVDYSPSDMQDSSLSHKIPPTEEPSYTQDNDLSEFISVSQV
EASPSTSSAHTPSQIASPLQEDTLSDVAPPRDMSLYASLASEKVQSLEGEKLSPKSDISPLTPRESSPLYSPEFSDSTSA
VKESAAACHTSSSPPGDATSAEPYGFRASMLFDTMQHHLALNRDMTASGLEDSGGKTPGDFSYAYQKSEKTTRSPDEEDY
DYESYEKSTRTPDMGSYYYEKTEQTIKSPCDSGYLYETVEKTTKTPEDGGYACEITEKTTRTPEEGGYSYEVTEKTTRTP
EVGGYSYEKTERSRKLLDDISNGYDDSEDAAHTFGDSSYSYETTEKLSSFPESESYSYETSTKTTRSPESAAYCYETTEK
ITKTPQASTYSYETSDRCYTTEKKSPSEARQDVDLCLVSSCEYKHPKTELSPSFINPNPLEWFASEDPIEESEKPLTQSG
GAPPPPGEKQQGRQCDETPPTSVSESAPSQTDSDVPPETEECPSITADANIDSEDESETIPTDKTVTYKHMDPPPAPLQD
RSPSPRHPDVSMVDPEALAIEQNLGKALKKDLKEKTKTKKPGTKTKSSSPVKKADGKPKPLAASPKPGALKESSDKVSRV
ASPKKKDSVEKATKTTTTPEVKATRGEEKDKETKNAANASTSKSVKTAAAGPGTTKTAKSSAVPPGLPVYLDLCYIPNHS
NSKNVDVEFFKRVRSSYYVVSGNDPAAEEPSRAVLDALLEGKAQWGSNMQVTLIPTHDSEVMREWYQETHEKQQDLNIMV
LASSSTVVMQDESFPACKIEL
MRF results:
TAPAS results:
PF00624 - Flocculin repeat
PF00624 Protein family information
In some cases the predictor identifies flat beta-solenoids with low model confidence (A7TTI5), but in other cases the prediction of the unit is confident to very high (A0A1Q3ALI5).
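For reference, AlphaFold DB reports per-residue confidence as pLDDT on a 0-100 scale, with band boundaries at 90, 70 and 50. A minimal sketch of that mapping follows; the example scores are invented for illustration and are not taken from A7TTI5 or A0A1Q3ALI5.

```python
from collections import Counter


def plddt_band(plddt):
    """Map a per-residue pLDDT score (0-100) to its confidence band."""
    if not 0 <= plddt <= 100:
        raise ValueError("pLDDT must be between 0 and 100")
    if plddt > 90:
        return "very high"
    if plddt > 70:
        return "confident"
    if plddt > 50:
        return "low"
    return "very low"


def band_fractions(scores):
    """Fraction of residues falling in each confidence band."""
    counts = Counter(plddt_band(s) for s in scores)
    return {band: n / len(scores) for band, n in counts.items()}


# Hypothetical per-residue scores for one repeat unit.
example_scores = [95.2, 91.0, 88.4, 72.3, 64.9, 48.1]
print(band_fractions(example_scores))
```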
A7TTI5
A7TTI5 Interpro sequence information
A7TTI5 AlphafoldDB sequence information
Sequence:
>tr|A7TTI5|A7TTI5_VANPO Uncharacterized protein (Fragment) OS=Vanderwaltozyma polyspora (strain ATCC 22028 / DSM 70294 / BCRC 21397 / CBS 2163 / NBRC 10782 / NRRL Y-8283 / UCD 57-17) OX=436907 GN=Kpol_249p1 PE=4 SV=1
MKHFTRLLTFLNFVLFACSLSNHENNQALSLSELIDHEAILEGNTALVGDNPKSKLHSEK
KLLSIPLNINQNESIYTSVPSTKNQTYFISDHLATNVKNVDKKDITIKSNDISIITIRTQ
NLNILAETTSTELTWVTGHNGIESKLFIYYIEYPVDHFSFTFIRPMTVNNLEKRLVENED
ISSSSIVKPIVTESTKTIVNTITKSDNALVVETTYIVYSRSPYTSTNSKKTYWTGSYTTT
TKTEITTYIGTNGGVTTETIYFIATPTTAFETTSYTYWTGSTANTLSTVTTTFTGTDGIE
TTETIYIVETPTTAFETTSYTYWTGSTANTLSTVTTTFTGTDGIETTETIYIVETPTTAF
ETTSYTYWTGSTANTLSTVTTTFTGTDGIETTETIYIVETPTTAFETTSFTYWTGSTANT
LSTVTTTFTGTDGIETTETIYIVETPTTAFETTSYTYWTGSTANTLSTVTTTFTGTDGIE
TTETIYIVETPTTAFETTSYTYWTGSTANTLSTVTTTFTGTDGIETTETIYIVETPTTAF
ETTSYTYWTGSTANTLSTVTTTFTGTDGIETTETIYIVETPTTAFETTSYTYWTGSTANT
LSTVTTTFTGTDGIETTETIYIVETPTTAFETTSYTYWTGSTANTLSTVTTTFTGTDGIE
TTETIYIVETPTTAFETTSYTYWTGSTANTLSTVTTTFTGTDGIETTETIYIVETPTTAF
ETTSYTYWTGSTANTLSTVTTTFTGTDGIETTETIYIVETPTTAFETTSFTYWTGSTANT
LSTVTTTFTGTDGIETTETIYIVETPTTAFETTSYTYWTGSTANTLSTVTTTFTGTDGIE
TTETIYIVETPTTAFETTSFTYWTGSTANTLSTVTTTFTGTDGIETTETIYIVETPTTAF
ETTSYTYWTGSTANTLSTVTTTFTGTDGIETTETIYIVETPTTAFETTSYTYWTGSTANT
LSTVTTTFTGTDGIETTETIYIVETPTTAFETTSYTYWTGSTANTLSTVTTTFTGTDGIE
TTETIYIVETPTTAFETTSYTYWTGSTANTLSTVTTTFTGTDGIETTETIYIVETPTTAF
ETTSFTYWTGSTANTLSTVTTTFTGTDGIETTETIYIVETPTTAFETTSYTYWTGSTANT
LSTVTTTFTGTDGIETTETIYIVETPTTAFETTSFTYWTGSTANTLSTVTTTFTGTDGIE
TTETIYIV
MRF results:
Region 1: 207-1197, 47 aa length, 22 units
NALVVETTYIVYSRSPYTSTNSKK-TYWTGSYTTTTKTEITTYIGTN
GGVTTETIYFI--ATPTTAFETTSYTYWTGSTANTLSTVTTTFTGTD
GIETTETIYIV--ETPTTAFETTSYTYWTGSTANTLSTVTTTFTGTD
GIETTETIYIV--ETPTTAFETTSYTYWTGSTANTLSTVTTTFTGTD
GIETTETIYIV--ETPTTAFETTSFTYWTGSTANTLSTVTTTFTGTD
GIETTETIYIV--ETPTTAFETTSYTYWTGSTANTLSTVTTTFTGTD
GIETTETIYIV--ETPTTAFETTSYTYWTGSTANTLSTVTTTFTGTD
GIETTETIYIV--ETPTTAFETTSYTYWTGSTANTLSTVTTTFTGTD
GIETTETIYIV--ETPTTAFETTSYTYWTGSTANTLSTVTTTFTGTD
GIETTETIYIV--ETPTTAFETTSYTYWTGSTANTLSTVTTTFTGTD
GIETTETIYIV--ETPTTAFETTSYTYWTGSTANTLSTVTTTFTGTD
GIETTETIYIV--ETPTTAFETTSYTYWTGSTANTLSTVTTTFTGTD
GIETTETIYIV--ETPTTAFETTSFTYWTGSTANTLSTVTTTFTGTD
GIETTETIYIV--ETPTTAFETTSYTYWTGSTANTLSTVTTTFTGTD
GIETTETIYIV--ETPTTAFETTSFTYWTGSTANTLSTVTTTFTGTD
GIETTETIYIV--ETPTTAFETTSYTYWTGSTANTLSTVTTTFTGTD
GIETTETIYIV--ETPTTAFETTSYTYWTGSTANTLSTVTTTFTGTD
GIETTETIYIV--ETPTTAFETTSYTYWTGSTANTLSTVTTTFTGTD
GIETTETIYIV--ETPTTAFETTSYTYWTGSTANTLSTVTTTFTGTD
GIETTETIYIV--ETPTTAFETTSFTYWTGSTANTLSTVTTTFTGTD
GIETTETIYIV--ETPTTAFETTSYTYWTGSTANTLSTVTTTFTGTD
GIETTETIYIV--ETPTTAFETTSFTYWTGSTANTLSTVTTTFTGTD
TAPASS results:
A0A1Q3ALI5
Repeat units annotated: 207-307, 314-353
A0A1Q3ALI5 Interpro sequence information
A0A1Q3ALI5 AlphafoldDB sequence information
Sequence:
>tr|A0A1Q3ALI5|A0A1Q3ALI5_ZYGRO PA14 domain-containing protein (Fragment) OS=Zygosaccharomyces rouxii OX=4956 GN=ZYGR_0BQ00100 PE=4 SV=1
MVSHKSIFQWLLWFSVLGITKALAATACLPANGAQSGFKANFFQYNYGDMTTLRQPSFIA
GGYAKRQLLGTQNNVNNILIAYGMECQLSNGEVVTPTEPWNFDYSQCKNKRYFSQRHNGT
IFGFELTATNFTVELTGYLLAPQTGTYTFTFDHVDDSAILNFGEGIAFDCCNQDAAANGN
TQFSINAIKPDYGPTAHMNYSVDLVGNYYYPMRIVYTNRHVFGWLFTTLTLPDGTNIDND
FTGYVYSFVSEPEQPNCTVTSPLPFVTSTSTTPWTGSFTSTYSTQTNVNTDSDGDNAGTV
IIDVETPTTPPVLTTEYTGYSGSETSTYSTESTWVTGTDGKTTPETIYHVETPTIPPV
MRF results:
Region 1: 326-334, 3 aa length, 3 units, regex_SX3 0.86
STY
STE
STW
TAPAS results:
PF00880 - Nebulin repeat
PF00880 Protein family information
In the literature we can observe a very high model confidence (https://www.mpg.de/18283745/nebulin-no-longer-nebulous)
A0A0S7IV57
A0A0S7IV57 Interpro sequence information
A0A0S7IV57 AlphafoldDB sequence information
Sequence:
>tr|A0A0S7IV57|A0A0S7IV57_9TELE NEBU (Fragment) OS=Poeciliopsis prolifica OX=188132 GN=NEBU PE=4 SV=1
SNDVVQARLAYDLQSDAVYKADLKWLQGLGWVPIGSLDVEKAKKAAEVLSDRKYRQHPST
VKFTSPIDAMNIVLAKSNAMTMNKRLYTEAWENEKTKLHIKPDTPEIVLSQQNAINMSKK
LYKQGFEETISKGYFLPPDAVSVKAAKTSRDIISDYKYKTG
MRF results:
Region 1: 3-141, 43 aa length, 4 units
DVVQARLAYDLQSDA--VYK---A---DLKWLQGLGWVPIGSL
DVEKAKKAAEVL--SDRKYR---Q---HPSTVKFTS--PIDAM
NIVLAKSNAMTMN--KRLYTEAWE---NEKTKLHIK--P-DTP
EIVLSQQNAINM--SKKLYK---QGFEETISKGYFL--PPDAV
TAPAS results:
PF00904 - Involucrin repeat
PF00904 Protein family information
P14591
P14591 Interpro sequence information
P14591 AlphafoldDB sequence information
Sequence:
>sp|P14591|INVO_PANPA Involucrin OS=Pan paniscus OX=9597 GN=IVL PE=2 SV=1
MSQQHTLPVTLSPALSQELLKTVPPPVNTQQEQMKQPTPLPPPCQKMPVELPVEVPSKQE
EKHMTAVKGLPEQECEQQQQEPQEQELQQQHWEQHEEYQKAENPEQQLKQEKAQRDPQLN
KQLEEEKKLLDQQLDQELVKRDEQLGMKKEQLLELPEQQEGHLKHLEQREGQLELPEQQE
GQLKHLEQQKGQLELPEQQEGQLELPEQQEGQLKHLEQQEGQLKHLEHQEGQLEVPEEQV
GQLKYLEQQEGQLKHLDQQEKQPELPEQQVGQLKHLEQQEGQPKHLEQQKGQLEHLEEQE
GQLKHLEQQEGQLEHLEHQEGQLGLPEQQVQQLKQLEKEEGQPKHLEEEEGQLKHLVQQE
GQLEHLVQQEGQLEHLVQQEGQLEQQEGQVEHLEQQVEHLEQLGQLKHLEEQEGQLKHLE
QQQGQLGVPEQVGQPKNLEQEEKQLELPEQQEGQLKHLEKQEAQLELPEQQVGQPKHLEQ
QEKQLEPPEQQDGQLKHLEQQEGQLKDLEQQKGQLEQPVFAPAPGQVQDIQSALPTKGEV
LLPLEHQQQKQEVQWPPKHK
MRF results:
TAPAS results:
B4DWR5
B4DWR5 Interpro sequence information
B4DWR5 AlphafoldDB sequence information
Sequence:
>B4DWR5 1-449
MKKEQLLELPEQQEGHLKHLEQQEGQLKHPEQQEGQLELPEQQEGQLELPEQQEGQLELPEQQEGQLELPEQQEGQLELP
EQQEGQLELPEQQEGQLELSEQQEGQLELSEQQEGQLELSEQQEGQLKHLEHQEGQLEVPEEQMGQLKYLEQQEGQLKHL
DQQEKQPELPEQQMGQLKHLEQQEGQPKHLEQQEGQLEQLEEQEGQLKHLEQQEGQLEHLEHQEGQLGLPEQQVLQLKQL
EKQQGQPKHLEEEEGQLKHLVQQEGQLKHLVQQEGQLEQQERQVEHLEQQVGQLKHLEEQEGQLKHLEQQQGQLEVPEQQ
VGQPKNLEQEEKQLELPEQQEGQLKHLEKQEAQLELPEQQVGQPKHLEQQEKHLEHPEQQDGQLKHLEQQEGQLKDLEQQ
KGQLEQPVFAPAPGQVQDIQPALPTKGEVLLPVEHQQQKQEVQWPPKHK
MRF results:
Region 1: 46 - 217, 20 aa length, 16 units
-------QLELPEQQEG---
-------QLELPEQQEG---
-------QLELPEQQEG---
-------QLELPEQQEG---
-------QLELPEQQEG---
-------QLELSEQQEG---
-------QLELSEQQEG---
-------QLELSEQQEG---
-------QLKHLEHQEG---
-------QLEVPEEQMG---
-------QLKYLEQQEG---
-------QLKHLDQQEKQPE
LPEQQMGQLKHLE-------
---QQEGQPKHLE-------
---QQEGQLEQLEEQE----
------GQLKHLEQQEGQL-
Region 2: 222 - 398, 20 aa length, 9 units
HQEGQLGLPEQQVLQLKQLE
KQQGQPKHLEEEEGQLKHLV
QQEGQLKHLVQQEGQ---LE
QQERQVEHLEQQVGQLKHLE
EQEGQLKHLEQQQGQLEVPE
QQVGQPKNLEQEEKQLELPE
QQEGQLKHLEKQEAQLELPE
QQVGQPKHLEQQEKHLEHPE
QQDGQLKHLEQQEGQLKDLE
TAPAS results:
AlphaFold trimer results:
PF09528 - Ehrlichia_rpt
PF09528 Protein family information
T1L1A4
T1L1A4 Interpro sequence information
T1L1A4 AlphafoldDB sequence information
Sequence:
>tr|T1L1A4|T1L1A4_TETUR Uncharacterized protein OS=Tetranychus urticae OX=32264 GN=107369337 PE=4 SV=1
MRFTIVLALCFIGAASASSLNKRSFLDDIQNNTQNAFHAFEQFGQTFNEKVQEALKNLLS
AFGNKNSSAEASVVVEKRATNPLQLINDLDDPAQFAQTFLKVLLDLATGQGRRKRDIAED
LKKFSEEAKHNAEEALKKLFSFLEQFKSQSSESTEASVVVEKRATNPLQLINDLDDPAQF
AQTLLKVLADIATGQGRRKRDIAEDLKKFSDEAKHNAEEALKKLFSFLEQFKPQSSESTE
APVVVEKRATNPLVLFNDLSQQDLGKFAQDFLKVLADIATAQG
MRF results:
Region 1: 35 - 283, 99 aa length, 3 units
NAFHAFEQFGQTFNEKVQEALKNLLSAFGNKNS-SAEASVVVEKRATNPLQLINDL--DDPAQFAQTFLKVLLDLATGQGRRKRDIAEDLKKFSEEAKH
NA------------EEALKKLFSFLEQFKSQSSESTEASVVVEKRATNPLQLINDL--DDPAQFAQTLLKVLADIATGQGRRKRDIAEDLKKFSDEAKH
NA------------EEALKKLFSFLEQFKPQSSESTEAPVVVEKRATNPLVLFNDLSQQDLGKFAQDFLKVLADIATAQG-------------------
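For structure prediction on only part of a protein, the region has to be written out as its own FASTA record. A minimal sketch follows, assuming 1-based inclusive coordinates; the header format and the function name `region_fasta` are hypothetical, and the toy sequence is only there to keep the example self-contained.

```python
import textwrap


def region_fasta(accession, seq, start, end, width=60):
    """Return a FASTA record for the 1-based, inclusive region [start, end]."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("region lies outside the sequence")
    header = f">{accession}_{start}-{end}"
    body = textwrap.wrap(seq[start - 1:end], width)
    return "\n".join([header, *body])


# Toy input: residues 3-12 of a 20-residue sequence, wrapped at 5 per line.
record = region_fasta("TOY", "ACDEFGHIKLMNPQRSTVWY", 3, 12, width=5)
print(record)
```

For T1L1A4, a call such as `region_fasta("T1L1A4", seq, 35, 283)` would extract MRF Region 1; whether that is the region used for the prediction here is not stated in these notes.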
TAPAS results:
AlphaFold results for the cut region:
CrossBeta results:
AlphaFold trimer results:
Q6W7F7
A dimer model was tried, but no representative model conservation was obtained.
Q6W7F7 Interpro sequence information
Q6W7F7 AlphafoldDB sequence information
Sequence:
>tr|Q6W7F7|Q6W7F7_EHRCH 120 kDa immunodominant surface protein (Fragment) OS=Ehrlichia chaffeensis OX=945 PE=4 SV=1
VSQPSLEPFVAESEVSKVEQEETNPEVLIKDLQDVASHESGVSDQPAQVVTERENEIESH
QGETEKESGITESHQKEDEIVSQSSSEPFVAESEVSKVEQEKTNPEVLIKDLQDVASHES
GVSDQPAQVVTERENEIESHQGETEKESGITESHQKEDEIVSQSSSEPFVAESEVSKVEQ
EETNPEVLIKDLQDVASHESGVSDQPAQVVTERENEIESHQGETEKESGITESHQKEDEI
VSQSSSEPFVAESEVSKVEQEETNPEVLIKDLQDVASHESGVSDQPAQVVTERESEIESH
QGETEKESGITESNQKEDEIVSQPSSEPFVAESEVSKVEQEETNPEVLIKDLQDVASHES
GVSDQPAQVVTERESEIESHQGETEKESGITESHQKEDEIVSQPSSEPFVAESEVSKVEQ
EETNPEILVEDLPLGQV
MRF results:
Region 1: 2 - 401, 80 aa length, 5 units
SQPSLEPFVAESEVSKVEQEETNPEVLIKDLQDVASHESGVSDQPAQVVTERENEIESHQGETEKESGITESHQKEDEIV
SQSSSEPFVAESEVSKVEQEKTNPEVLIKDLQDVASHESGVSDQPAQVVTERENEIESHQGETEKESGITESHQKEDEIV
SQSSSEPFVAESEVSKVEQEETNPEVLIKDLQDVASHESGVSDQPAQVVTERENEIESHQGETEKESGITESHQKEDEIV
SQSSSEPFVAESEVSKVEQEETNPEVLIKDLQDVASHESGVSDQPAQVVTERESEIESHQGETEKESGITESNQKEDEIV
SQPSSEPFVAESEVSKVEQEETNPEVLIKDLQDVASHESGVSDQPAQVVTERESEIESHQGETEKESGITESHQKEDEIV
TAPAS results:
PF10529 - Hist_rich_Ca-bd
PF10529 Protein family information
We found a group of proteins in which the repeat region covers most of the protein length.
P23327
P23327 Interpro sequence information
Sequence:
>P23327 1-699
MGHHRPWLHASVLWAGVASLLLPPAMTQQLRGDGLGFRNRNNSTGVAGLSEEASAELRHHLHSPRDHPDENKDVSTENGH
HFWSHPDREKEDEDVSKEYGHLLPGHRSQDHKVGDEGVSGEEVFAEHGGQARGHRGHGSEDTEDSAEHRHHLPSHRSHSH
QDEDEDEVVSSEHHHHILRHGHRGHDGEDDEGEEEEEEEEEEEEASTEYGHQAHRHRGHGSEEDEDVSDGHHHHGPSHRH
QGHEEDDDDDDDDDDDDDDDDVSIEYRHQAHRHQGHGIEEDEDVSDGHHHRDPSHRHRSHEEDDNDDDDVSTEYGHQAHR
HQDHRKEEVEAVSGEHHHHVPDHRHQGHRDEEEDEDVSTERWHQGPQHVHHGLVDEEEEEEEITVQFGHYVASHQPRGHK
SDEEDFQDEYKTEVPHHHHHRVPREEDEEVSAELGHQAPSHRQSHQDEETGHGQRGSIKEMSHHPPGHTVVKDRSHLRKD
DSEEEKEKEEDPGSHEEDDESSEQGEKGTHHGSRDQEDEEDEEEGHGLSLNQEEEEEEDKEEEEEEEDEERREERAEVGA
PLSPDHSEEEEEEEEGLEEDEPRFTIIPNPLDRREEAGGASSEEESGEDTGPQDAQEYGNYQPGSLCGYCSFCNRCTECE
SCHCDEENMGEHCDQCQHCQFCYLCPLVCETVCAPGSYVDYFSSSLYQALADMLETPEP
MRF results:
TAPAS results:
B4DUM3
B4DUM3 Interpro sequence information
Sequence:
>B4DUM3 1-454
MGHHRPWLHASVLWAGVASLLLPPAMTQQLRGDGLGFRNRNKDVSTENGHHFWSHPDREKEDEDVAKEYGHLLPGHRSQD
HKVGDEGVSGEEVFAEHGGQARGHRGHGSEDTEDSAEHRHHLPSHRSHSHQDEDEDEVVSSEHHHHILRHGHRGHDGEDD
EGEEEEEEEEEEEEEASTEYGHQAHRHRGHGSEEDEDVSDGHHHHGPSHRHQGHEEDDDDDDDDDDDDDDDVSIEYRHQA
HRHQGHGIEEDEDVSDGHHHRDPSHRHRSHEEDDNDDDDVSTEYGHQAHRHQDHRKEEVEAVSGEHHHHVPDHRHQGHRD
EEEDEDVSTERWHQGPQHVHHGLVDEEEEEEEITVQFGHYVASHQPRGHKSDEEDFQDEYKTEVPHHHHHRVPREEDEEV
SAELGHQAPSHRQSHQDEETGHGQRGSIKEMSHHPPGHTVVKDRSHLRKDDSEE
MRF results:
TAPAS results:
AlphaFold trimer results:
PF12778 - PXPV repeat (3 copies)
PF12778 Protein family information
Q127N3
Q127N3 Interpro sequence information
Sequence:
>Q127N3 1-147
MKTAIKTNRSVATAGAAAALAVAALGFAGAAQARDDVYWSVGVGSPGVSVNVGNAYPVYTPAPVYVQPAPVYYQPAPV
YVRPAPVYYQPAPVFVQPRPYYGPPQVVYVQPGNRHGWHKKHGRDHDDDRGYRGGYGYRQGYAPVYYQR
MRF results:
Region 1: 61 - 97, 9 aa length, 5 units
PAPVYVQ--
PAPVYYQ--
PAPVYVR--
PAPVYYQ--
PAPVFVQPR
Region 2: 129 - 140, 6 aa length, 2 units
GYRGGY
GYRQGY
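The five Region 1 units all start with the PXPV motif that gives the family its name. As a cross-check, a plain regular-expression scan (a sketch, not how MRF finds repeats) locates the same start positions in the sequence:

```python
import re

# Sequence of Q127N3, copied from the FASTA record above.
SEQ = (
    "MKTAIKTNRSVATAGAAAALAVAALGFAGAAQARDDVYWSVGVGSPGVSVNVGNAYPVYTPAPVYVQPAPVYYQPAPV"
    "YVRPAPVYYQPAPVFVQPRPYYGPPQVVYVQPGNRHGWHKKHGRDHDDDRGYRGGYGYRQGYAPVYYQR"
)

# P, any residue, P, V
PXPV = re.compile(r"P.PV")

# Convert 0-based match offsets to 1-based residue positions.
starts = [m.start() + 1 for m in PXPV.finditer(SEQ)]
print(starts)
```

The hits are spaced 7 residues apart, and the first one coincides with the start of Region 1 at position 61.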
TAPAS results:
AlphaFold trimer results:
PF14585 - CagY_I
PF14585 Protein family information
A structure already exists; the protein needs to form an X-mer (oligomer) to be stable.
6odi
https://www.ebi.ac.uk/interpro/structure/PDB/6odi/#table
PF14912 - Testicular haploid expressed repeat
PF14912 Protein family information
A structure already exists; the protein needs to form an X-mer (oligomer) to be stable.
8snb
https://www.ebi.ac.uk/interpro/entry/pfam/PF14912/structure/PDB/#table
PF15287 - KRBA1
PF15287 Protein family information
A0A452ID02
A0A452ID02 Interpro sequence information
Sequence:
>A0A452ID02 1-1256
MEENYQLLISLGQPVPTLALLALAVESEAAGSQIRGVSGEPELASDSSSSEGEELAFPEDDPDVGGFWDSRQTEEDGCPT
GDEEGAHQGSLHLSALMKLVKEIPEFLLGNLKAPVEPAEAADSEAEMGSERAYADVKPEVTPETPPPLDLENCLVEASVN
RPNHPDTPSSCLSTSSTERAPLRRLYAEVAPENSPLQGLLNCLKEIPVHRPRHPNMQSPGAQGDVEHKGVVGEVKSLCAA
AGTAENSPLQGLLNCLKEIIVHKPHHPHPSPCKSAKGSTRGDSGKRRLESEDGSSSVEVKTEVTAGDPQDPGLESCPSAR
SVSKASPPAMPASRSPRNRAEGEAGGRSLFGEGAVKREGAAESSLAQGLLSCWRDVISPCHPARPACSSPTSSAHWRTEQ
RGLEPGPWRSPGEEAVLEDSPLRGLENCLKDIAVTSPCCSHLPASRSAQRALGERPGRGAGSLSAEEMMPRTSPLHDLAN
CLQETPVSTSWASRVLARNAGADVSSRRPATVTVRSCCGEDISTETSPLRGLENCLRDIPVNSPHMPASISLTSATQRDM
GQRRPGAGTRRSLREDINAENSPLQGLENCLKDIPVPCHNQSNTPSSRSSLNSSPAPQGRLETAGWPVKTEGSVSEVTPP
LQGLENCLKDIPMVRLRGSRETPSSFCTSKAPGEVEQRRPVPRPWRACAAELSPENSPLRGLEKCLQEISVPRPQTPSAP
ATAGVVGSRQGDTARRRPETGHWDWHGSDKRQAENLPGLEEVAAPSSCPAQTPPSSTRGDAERQEQDVHTRSSGSKDVTV
RSSPLSWLNCVKELTADRVTPSSPPACAAQGDVTLQGAESRGRTLSEEEVMPADAPRYGPATCPQGTHGTGPSRSRTPSD
TLPARDAPGHRCPKRPGTGVKRPHTEARTDAARPPSTSSCTSSEDGGQKADAEWEEFPKRHCSTAALSPWECFRWESRTP
LDLRVERSMIEAVLSEKLDRVSQDFMAMCRDVSSMQSRVAQLERDSRGWALELAALQKGNKRLSETVRRLESRCHMLENR
AHRNSLRLAGLPEGAEGGDPVAFLQRTLPTVLNLPADWPPLEIESVRRVHGGAHWDPATRPRALLFRLLRFSDKLAIMRA
VRKRTEPLTCGGAKVALFPDVCPKLCRRRGAQYAAVRRLWRAAELRLGTQPSGCCHDRARGHWEPLPSPLGRAPTADXCR
RTGEQSHQRAVTESGGLGAAGSAHPPLSLKVRLHSAPEITAPAGSGLELSRFPDCS
MRF results:
Region 1: 93 - 914, 80 aa length, 15 units
LSALMKL--VKEI--PEFLL----GNL---KAP---VEPAEAADSEAEMGSERA-YA----D-VKP---EV-TPETPP--
PLDLENC--LVEA--S---V----NRP---NHP---DTPSSCLST--SSTERAP-----LRR-LYA---EV-APENSP--
LQGLLNC--LKEI--P---V----HRP---RHP---NMQSP--GAQGDVEHK-G-VVGEVKS-LCA---AAGTAENSP--
LQGLLNC--LKEI--I---V----HKP---HHPHPSPCKSAKGSTRGDSGKRRL-ESEDGSS-SVEVKTEV-TA-GDPQD
-PGLESCPSARSV--SK--A----SPP---AMP---ASRSPRNRAEGEAGGRSLFGEGAVK----R---EG-AAESS-LA
-QGLLSC--WRDVISP---C----H-P---ARP---ACSSPTSSAHWRTEQRGL-EPGPWRS-PGE---EA-VLEDSP--
LRGLENC--LKDI--A---V----TSPCCSHLP---ASRS----AQRALGE-RP-GRGAGSL-SAE---EM-MPRTSP--
LHDLANC--LQET--P---V----STS---W-----ASRVLARNAGADVSSRRP-ATVTVRSCCGE---DI-STETSP--
LRGLENC--LRDI--P---V----NSP---HMP---ASISLTSATQRDMGQRRP-GAGTRRS-LRE---DI-NAENSP--
LQGLENC--LKDI--P---VPCH-NQS---NTP---SSRSSLNSSPAPQGRLET-AGWPVKT-EGS----V-SEVTPP--
LQGLENC--LKDI--P---MVRLRGSR---ETP---SSFC-TSKAPGEVEQRRP-VPRPWRA-CAA---EL-SPENSP--
LRGLEKC--LQEI--S---VPRP-QTP---SAP---ATAGVVGSRQGDTARRRP-ETGHWDW-HGS---DKRQAENLP--
--GLE------EV--A---A----PSS----CP---A-QTPPSSTRGDAERQEQ-DVHTRSS-GSK---DV-TVRSSPL-
--SWLNC--VKEL--T---A----DRV----TP---SS-PPACAAQGDVTLQGA-ESRGRTL-SEE---EV-MPADAPRY
--GPATC--PQGT--H---G----TGPSRSRTP---SDTLPARDAPGHRCPKRP-GTGVKRP-HTE---AR-TDAARP--
Region 2: 1021 - 1143, 46 aa length, 3 units
KRLSETVRRLESRCHMLENRAHRNSLRLAGLPE-GAE-GGDPVAFL
QRTLPTVLNLPADWPPLEIESVR---RVHGGAH------WDPATRP
RALLFRLLRFSDKLAIM--RAVRK--RTEPLTCGGAKVALFPDVCP
TAPAS results:
A5PL33
A5PL33 Interpro sequence information
Sequence:
>A5PL33 1-1030
MRENYETLVSVGTAELLPLSAFLSPSEPGRAVGGGSHADEGQEPAGCGDPQGGQPRHSLHLTALVQLVKEIPEFLFGEVK
GAMDSPESESRGASLDGERASPEAAAAREPCPLRGLLSCLPDGPTSQPHLATTPTDSSCSSGPTGDGVQGSPLPIKTADK
PWPTRKEGPGALGGEPSPPTHSPSRRKSHRGQERGTSEAGISPGNSPLQGLINCLKEILVPGPRHPETSPSFLPPLPSLG
TSRLTRADLGPGSPPWAVKTEAVSGDCPLQGLLHCLKELPEAQDRHPSPSGVGNRRLQENPGAWKRGSGGPGYLLTPPPH
PDLGAGGLLSVKMENSWVQSPPGPASCQPGRQPLSPSATGDTRGVPQPSWGPEAQAASASSSPLEALEACLKGIPPNGSS
PSQLPPTSCSQNPQPGDSRSQKPELQPHRSHSEEATREPVLPLGLQSCVRDGPSRPLAPRGTPTSFSSSSSTDWDLDFGS
PVGNQGQHPGKGSPPGSSPLQGLENCLKEIPVPVLRPAWPCSSAADRGPRRAEPRNWTADKEGLRAEACESARLGQGRGE
APTRSLHLVSPQVFTSSCVPACHQRGFKDPGATRPGVWRWLPEGSAPKPSPLHCLESALRGILPVRPLRFACVGGPSPSP
SPGSSSSFSGSEGEDPRPEPDLWKPLPQERDRLPSCKPPVPLSPCPGGTPAGSSGGSPGEDPRRTEPRYCSGLGAGTAQD
PCPVSQLEKRPRVSEASRGLELGHGRPRVAAKTHERLLPQGPPELPSESPPPELPPPEAAPPVLPASSLQPPCHCGKPLQ
QELHSLGAALAEKLDRLATALAGLAQEVATMRTQVNRLGRRPQGPGPMGQASWMWTLPRGPRWAHGPGHRHLPYWRQKGP
TRPKPKILRGQGESCRAGDLQGLSRGTARRARPLPPDAPPAEPPGLHCSSSQQLLSSTPSCHAAPPAHPLLAHTGGHQSP
LPPLVPAALPLQGASPPAASADADVPTSGVAPDGIPERPKEPSSLLGGVQRALQEELWGGEHRDPRWGAH
MRF results:
Region 1: 60 - 717, 15 aa length, 6 units
HLTALVQLVKEI-P--EFLFGEVKGA----MDSPES-ESRG--ASLD--G--E-RAS--PEAAAAREP-CP-L--RGLLSC----LPD----G----P--TSQPH-L-AT--T-PTDSSCSSG--PTGDGVQGSPLPIKTADKPWPTRKEG-PG--
-----------------ALGGEPSPP----THSPSR---RK--SHRG--Q--E-RGT--SEAGISPGN-SP-L--QGLINC----LKEILVPGPRH-P--ETSPSFL-PP--L-PSLGT-SRL--TRADLGPGSP--------PWAVKTEAVSGDC
PLQGLLHCLKEL-P--EAQD---RHP----SPSGVG-NRRL--QENP--GAWK-RGSGGPGYLLTPPP-HPDLGAGGLLSV----KME----N----SWVQSPPG-P-AS--CQPGRQPLSPS--ATGDT-RGVPQP---SWGPEAQAASA-SS-S
PLEALEACLKGI-P--PNGSSPSQLPPTSCSQNPQPGDSRSQKPELQ--P--H-RSH--SE-EATREPVLP-L---GLQSC----VRD----G----P---SRP--L-APRGT-PTSFSSSSS--TDWDLDFGSPVG-NQGQHPGKGSP---PGSS
PLQGLENCLKEI-P--V--------P----VLRPAW-PCSS--AADR--G--PRRAE--PRN-WTADK-EG-L--RA-EACESARLGQ----GRGEAP--TRSLH-LVSP--Q-VFTSSCVPACHQRGFKDPGATRPGVWRWLPEGSAPK--P--S
PLHCLESALRGILPVRPLRFACVGGP----SPSPSP-GSSS--SFSGSEG--E-DPR--PEPDLWK-P-LP-Q--E----------RD----RL---P--SCKP-----P--V-PL-SPCPGG--TPAGSSGGSPGEDPRRTEPRYCSGLG-AG-T
Region 2: 636 - 649, 2 aa length, 7 units
PS
PS
PS
PG
SS
SS
FS
TAPAS results:
PF15788 - DUF4705
PF15788 Protein family information
B4DF06
B4DF06 Interpro sequence information
Sequence:
>B4DF06
MLLPPGSLSRPRTFSSQPLQTKLMTHNGLFRPIPYVTAASADEATASQQPPQAQLHRYNGLFRPSSCLPAFSPGPELSQV
DLTRPRSCFFAASPGPAPASWWPLQAQPLPPVSLYSPNVCLTADSSRPASTSLWTPQAKLPTFQQLLHTQLLPPSGLFRP
SSCFTRAFPGPTFVSWQPSLARFLPVSQQPRQAQVLPHTGLSTSSLCLTVASPRPTPVPGRHLRAQNLLKSDSLVPTAAS
WWPMKAQNLLKLTCSGPAPASCQHLQAQPLPHGGFSRPTSSSWLGLQAQLLPHNSLFWPSSCPAHGGQCRPKTSSSQTLQ
AHLLLPGGINRPSFDLRTASAGPALASQGLFPGPALASWQLPQAKFLPACQQPQQAQLLPHSGPFRPNS
MRF results:
Region 1: 61 - 145, 32 aa length, 3 units
LFRPSSCLPAFSPGPE-----------LSQVD
LTRPRSCFFAASPGPAPASWWPLQAQPLPPVS
LYSPNVCLTADSSRPASTSLWTPQAKLPTFQQ
Region 2: 342 - 363, 11 aa length, 2 units
GPALASQGLFP
GPALASWQLPQ
TAPAS results:
Q6ZQT7
Q6ZQT7 Interpro sequence information
Sequence:
>Q6ZQT7 1-251
MQPGGTAGPEEAPMREAEAGPPQVGLSRPTCSLPASSPGPALPPGCVSRPDSGLPTTSLDSAPAQLPAALVDPQLPEAKL
PRPSSGLTVASPGSAPALRWHLQAPNGLRSVGSSRPSLGLPAASAGPKRPEVGLSRPSSGLPAAFAGPSRPQVGLELGLE
EQQVSLSGPSSILSAASPGAKLPRVSLSRPSSSCLPLASFSPAQPSSWLSAAFPGPAFDFWRPLQAQNLPSSGPLQARPR
PRPHSGLSTPS
MRF results:
Region 1: 111-152, 21 aa length, 2 units
VGSSRPSLGLPAASAGPKRPE
VGLSRPSSGLPAAFAGPSRPQ
TAPAS results:
PF18727 - ALMS_repeat
PF18727 Protein family information
The sequences are long (more than 3500 amino acids), so there is no AlphaFold model.
A0A8I3P0L2
A0A8I3P0L2: AlphaFold model does not exist
A0A8I3P0L2 Interpro sequence information
Sequence:
>A0A8I3P0L2 1-4373
TYISINKFLFLGDTSKGGIAEITQSSLKPGITTTRESDTGSLLSLFPEDFPQLALRSPQEITIGQHSDTLHQQELVGSHK
TEETPKVSTVPKLDDQNTGISTVPSSSYSQRGKPSILHQQSLPDSYLAEEALKVAAVPEPTDQKTSISTVLPGSYSLGEK
HCIFYPQTLPESHLTEEAVRVSAFSGLADQKTDIPTVLPSSYSLREKHNIFYQQALPDSHLTEEAVRVSAVPGPADQKTR
IHIVLPGSHSLGEKHKIFCQQALPNSHLTKETLKVSAVPGPVEQKSVIPIVLPGSYLLGEKRNIFHPPTLPESHLTEEAV
RVSAAVPGSVDQKTGIPTVLPGSFSLGEKASIFHQQALPESHLTKEALRVSAVPGPIDQKTGIPTVLPGSYSLGENCNIF
QPQTLPDGHLTGEAVRVSTVPGPVDQKTGIPTVLSGSYSLGEKRNIFHPQTLPGIHLTEEAQRVLAVPGPADEKTGIPTV
LPGSYSLGEKRNIFHPQTLPSIHLTEEAQRVLAVPGPADEKTGIPTGLAGSYSLGEKRNIFYPQTLPQSHLTEEALKVLA
GPGPVDQKTGIPTILPGSYSLGERRNIFHPENLRDSHLTEEALRVSGVPSPADQKTDIPAGLAGSYSLGEKRNIFYPQTL
SQSHLIEEAIRVSAFPGPADQKTGIRTGLAGSYSLGEKCNIVHSETLPDNHLTEGTQRVLAVPGPVDQKTGIPTGLAGSY
SLGEKRNIFHPENLPESPLTEEALKVLAGPGPADQKTGIPIGLPGSYSLGEKHHIFHSENLPDSHLTEEAVRVSAVPSPA
DQKTGIPTVLPGSYSLGEKCNIFHPENLPDSHLTEEALKVLAVPGLADQKTGIPTVLPGSYSLGEKHHIFHTKNLADNPL
TEEAIRVSAFPGPVDQKTDIPTGFPGSYSLEEKSNIVHPEILLDSLLTEEAVRVLAVPGPDDQKADVPTGLPGSYSLGEK
CNIVQPETLPDSHLTEEAVRVSAVPGPVDQKTGIPTGLPGSYSLGEKHSIFHPEILPDNHLTEEAVRVLTVPGPPDQKTD
RPTGHPGSYSPREKHNIFYPQTLPESPRTEEALRVSAVPGPVDQKTGRPTVLPGSYSPGEKHHILHPETLPDSHLTEESL
KISTVPVPTDQRTEKIIVPSASLSQREKHVIFSQQQLSDGDLTAQVLKASVAPGPADQNIGLPTLSSSSYSLGEKHCICY
QQALLDSHLIEQAQKVAAVPRPADQKTRIPLASSTSYLQGERPHIFCQQTLPESDLTEQALKYSAPGSAEQKTGIPTLTS
TSYSHREKSSISNQQELPDSPLAEQAPKVPAVPGPAEKKSGSLSEASNFSSRREKHSIFYQQEFLGSSLIEPAQKVSPVP
GPTDQKPEIPTVTSTYSHVEKPFIFYPQGLPDSPLPEEALKVTAVSEPTDQQTGTPVVPSSSYSPGEKPIIFYPQGLTDV
YLTKEALKVSAISGSADWKTGIPTVSSTSYSNREKPIIFYPQGLTDSQLPQEALNISAIPGPADQKTGLPSEPSSSYSLR
EKPIIFYPQDLTNSQVPQAALKVSAIPGPADQKTGLPLEDSSSYSPREKPIIFYPQGLTDSQLPQAALKLSAIPGPADQK
TRLPSEPSSSYSFREKPIIFYPQGLTDSQLPQEALNVSAIPEPADQKTELPSEPSTSYSPREKPSIFYPQDLTDSQLPQE
PLNISAIPGPADQKTGLPSESSSSYSPREKPIIFYSQGLIDGQVPQVALKVSATPGLADQKTGLPSEPSSSYSPREKPII
FYPQGLTDSQLPQEALKVSAIPGPGDQKTGLPSEPSSSYKPSIFYPQDLTDSQLPQEPLNISAIPGPADQKTGLPSESSS
SYSPREKPIIFYSQGLIDGQVPQVALKVSATPGLADQKTGLPSEPSSSYSPREKPIIFYPQGLTDSHLPQKALKVSAILG
PGDQKTGLPSEPSSSYSHREKSNIFYAQEFPGSHLTEEALKVSAFSGIGDQKTGIPTVLSSSYSLGGKPIIFYQQALSDR
HLTDEALNVSASSGPADQETGIPTVSSVSYSHRERPSILYQQPFSDNQLAIAALKVSAVSGSDDQKTRKPTITSASYSER
EKPIIYHQQLPDLTQESLNVFRIPGLGDQRTGITAVTSTTYSHREKPVISYQQELPAPNEGALKVLGAPGSADQQSGIRF
GPSTSYSHRKNPIFSYLESPDITEETLKISAVSGPGDQKTGIHIIPSSSYSYREKDSIFYQEELPDVTEAALKVFALPGP
ADQKTEIPIGPSSSYSHEEKLKISPVILPDDQETELLTAPLSFYSKREKPKISTVIGSDNQKTPLLTVLHNSYSQKVKPG
IFLQHQLSDKHQSENILKISAVSEPIDVNSGIPISLSSSYSHREKSNNFYPQELPDKHLGKGALKVSTIPLPADQKSLLP
TAPSSFSHREQPDIFCQQDFPDRHLTQDALMFSSGVGQADQITGLSTVTPGTYSYSEKQKLVSDHVQMLIDNLDSSNSSV
TSNSMPLNSQADGRVIISKPESSSFEDVRSEEIQDRSSGSKTLKEIRTLLMEAENIALKRCNFPAPLVPFRDVSDISFIQ
SKKVVCFKEPLTADEYNGDLPQRQPFIEESPSNKCIQKDISTQTNLKCQRGIENWEFISSTTVRSPLQEAESKARVTVDE
TCRQYRAAKSVMRSEPEGYSGTIGNKIVIPMMTIIKSDSSSDASSCSWDSNSLESVSDVLLNFFPYSSPKTSLTDSREEG
VSESDDGGGSSVDSLAAHVRNLLKCESSLNHAKQILRNAEEEECRVRARAWNLKFNLAHECGYSISELNEDDRRKVEEIK
AKLFSHERTTDLSKGLQSPRGIGCKPEAVCSHIIIESHEKGCFRTLTAEQPQLDSHPCVFRSADPSDMIRGQRSPSSWRT
RHIDLSKSLDQCNPHFKVWNSLQLRSHSPFQNFAADDFRISQGLRMPFHEKIDPWLSELVEPASVPLEEMDCHSSSQMLP
PEPMKKFTTSITFSSHRHSKCFSDSSVLKVGVTEGSQCTGASVGVFNSHFTEEQNPPRDLEQRTSSPSSFKIVSHSPDKA
VTILAESSRQSPKLSVEHSQQEEKFLERSDFKSSDSEPSTSTKCSNVKEVHFSDNHTFISMSRPSSTLGVKEKNVTITPD
LSSHIILEQRQLFEQSKAPHADHHVRKHHSPPPQHQDYVAPNLPCRIFLEKQELFEQSKAPHLDHQMRENHSPFLQGQDY
IASDLPSSIFLEQRQLFEQSKAPDVDHMGKYHSPLPQVQDYVVEKNNQHKFKSYISNMINVEAKFDNVISQSAPSQCTLV
TSTSASTPPSNRKALSCFRITLYPKTPSKLDSGTLDKRFHTLDPASKTRMNSEFNSDLQTISSRSLEPTSKLLASKPIAQ
NQESLGFVGPKSSPDFQVVQSPLPDSNDISQDLKSILFQNNQIVTSKQTQVNISDLEGYSSPEGTPVSADRSSEGIKAPF
SAFPGKLSSDAVTQITTESPGKTMFSSEIFINTKDRGLAISEPSTQKLGKGPVKFASSSSVQQITHPHGTDGSNDAIAPD
FPAEVLGTRDDDLTVPANIKHKEGIYSKRVVPKASLLVGRKTPQKDNADAQVQVSITDDENLSDKNQKKEIYTKKAVTKA
AQPEEESLQKASKGSSDAAAAEHSARLQDIKLESLPDTKAIKQKEEILNKRTFPKEAWKEDKESLQIDIAESRCHSEFEN
TTHSVFRSAKFYFHHPVHLPSDQDFCHESLGRSVFMRHSLKDFFQHHPDKQREHTSLPSPRQNVEKTKTDYTRIESLSIN
VNLENDVMHTAKSRARDNPKSDKQLNDQKRDHKVTPEPTAQHTVSLNELWNRYQERQRQQRPPQFGDRKELSLVDRLDRL
AKLLQNPITYSLRTSESTQDDSRGERDVKEWSGRQQQQKSKLQKKKRYKSLEKFHKNAGELKKSKMLSTHQAGKSNQIKI
EQIKFDKYILRKQPDFHYRNNTSSDSRPSEESELLTDTATNLLSTTTSPVESDILTQTDREVTLQERSSSISTIDTARLI
QAFGHERVCLSPRQIKLYSSITDHQRRYLERRSKKNKKALNMNHPQMTSEHTRRKHIQVADHVISSDSVSSSTSSFWSSS
STLCNMQNVQMLNKAVQAGNLEIVNGVKKHTRDVGMTFPTPSSSEARIEEDSDMTSWSEEKIEEKRLLTNYLGDKKLRKN
KHSCCEGVSWFVPVENVKSEPKKENLPKLHGPGICWFAPITNTKPWREPLREQNWQGQHVDGHRPLAGPDRERLRPFVRA
TLQESLHLHRPDFISRSGERIKRLKLIVQERKLQNMLESEREALFNVSREWQGYRDPTHLLPKKGFLDARKSRPIGKKEM
IQRSKRIYEQLPEVQRKREEEKRRLEYKSYRLRAQLFKKKVTNQLLGRKVPWN
MRF results:
TAPAS results:
The TAPASS output could not be shown because of the length of the protein, so some amino acids at the C-terminus were removed.
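The trimming step can be sketched as follows. The length limit is a placeholder: these notes do not say what maximum length TAPASS accepted or how many residues were removed.

```python
def trim_c_terminus(seq, max_len):
    """Keep the first max_len residues and report how many were dropped."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    kept = seq[:max_len]
    removed = len(seq) - len(kept)
    return kept, removed


# Toy input; for A0A8I3P0L2 (4373 aa) the real limit would replace MAX_LEN.
MAX_LEN = 15  # hypothetical limit, for illustration only
kept, removed = trim_c_terminus("ACDEFGHIKLMNPQRSTVWY", MAX_LEN)
print(kept, removed)
```

Any repeat coordinates reported on the trimmed sequence stay valid for the full-length protein, because residues are removed only from the C-terminal end.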
PF02095 - Extensin-like protein repeat
PF02095 Protein family information
P13993
P13993 Interpro sequence information
Sequence:
>P13993 1-230
MASLSSLVLLLAALILSPQVLANYENPPVYKPPTEKPPVYKPPVEKPPVYKPPVENPPIYKPPVEKPPVYKPPVEKPPVY
KPPVEKPPVYKPPVEKPPVYKPPVEKPPVYKPPVEKPPVYKPPVEKPPVYKPPVEKPPVYKPPVEKPPVYKPPVEKPPVY
KPPVEKPPVYKPPVEKPPVYKPPVEKPPVYKPPVEKPPIYKPPVEKPPVYKPPYGKPPYPKYPPTDDTHF
MRF results:
Region 1: 44 - 223, 10 aa length, 18 units
VEKPPVYKPP
VENPPIYKPP
VEKPPVYKPP
VEKPPVYKPP
VEKPPVYKPP
VEKPPVYKPP
VEKPPVYKPP
VEKPPVYKPP
VEKPPVYKPP
VEKPPVYKPP
VEKPPVYKPP
VEKPPVYKPP
VEKPPVYKPP
VEKPPVYKPP
VEKPPVYKPP
VEKPPIYKPP
VEKPPVYKPP
YGKPPYPKYP
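To summarise how conserved the units are, a column-wise consensus can be computed from the aligned units. This is a minimal sketch using simple majority voting, which is an assumption of this example and not necessarily how MRF scores units.

```python
from collections import Counter

# The 18 units of P13993 Region 1, in the order listed above.
UNITS = (
    ["VEKPPVYKPP", "VENPPIYKPP"]
    + ["VEKPPVYKPP"] * 13
    + ["VEKPPIYKPP", "VEKPPVYKPP", "YGKPPYPKYP"]
)


def consensus(units):
    """Column-wise majority residue of equal-length, already aligned units."""
    if len({len(u) for u in units}) != 1:
        raise ValueError("units must all have the same length")
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*units))


def identity_to(units, ref):
    """Fraction of positions at which each unit matches the reference."""
    return [sum(a == b for a, b in zip(u, ref)) / len(ref) for u in units]


cons = consensus(UNITS)
identities = identity_to(UNITS, cons)
print(cons)
print(identities)
```

The last unit is the most divergent one, which fits its position at the C-terminal edge of the repeat region.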
TAPAS results:
Q43414
Q43414 Interpro sequence information
Sequence:
>Q43414 1-227
PVYKPPVEKPPVYKPPIEKPPVYKPPVEKPPVYKPPVEKPPVYKPPIEKPPVYKPPVEKPPIYKPPVEKPPVYKPPVEKP
PVYKPPVEKPPVYKPPVEKPPVYKPPVEKPPVYKPPVEKPPVYKPPVEKPPVYKPPVEKPPIYKPPVEKPPVYKPPIEKP
PVYTPPVEKPPVYKPPIEEPPVYKPPVEKPPVYGPPYEKPPHYPGYPPYEKPPHHPGYPPADDDNRF
MRF results:
Region 1: 2 - 211, 12 aa length, 21 units
VYK--PPVEKPP
VYK--PPIEKPP
VYK--PPVEKPP
VYK--PPVEKPP
VYK--PPIEKPP
VYK--PPVEKPP
IYK--PPVEKPP
VYK--PPVEKPP
VYK--PPVEKPP
VYK--PPVEKPP
VYK--PPVEKPP
VYK--PPVEKPP
VYK--PPVEKPP
VYK--PPVEKPP
IYK--PPVEKPP
VYK--PPIEKPP
VYT--PPVEKPP
VYK--PPIEEPP
VYK--PPVEKPP
VYG--PPYEKPP
HYPGYPPYEK--
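When the units contain gap columns, the region header can be re-derived from the alignment itself. This is a minimal sketch, not the MRF implementation: `summarize_region` is a hypothetical helper. It assumes that "aa length" means the alignment width including gap columns, and that the ungapped units occur as one contiguous stretch of the sequence.

```python
def summarize_region(sequence, units, gap="-"):
    """Locate the aligned units in the sequence.

    Returns (start, end, width, n_units): start/end are 1-based inclusive,
    width is the alignment width including gap columns.
    """
    ungapped = "".join(unit.replace(gap, "") for unit in units)
    offset = sequence.find(ungapped)
    if offset < 0:
        raise ValueError("units do not occur contiguously in the sequence")
    widths = {len(unit) for unit in units}
    if len(widths) != 1:
        raise ValueError("aligned units must all have the same width")
    return offset + 1, offset + len(ungapped), widths.pop(), len(units)


# Q43414 sequence as listed above.
Q43414 = (
    "PVYKPPVEKPPVYKPPIEKPPVYKPPVEKPPVYKPPVEKPPVYKPPIEKPPVYKPPVEKPPIYKPPVEKPPVYKPPVEKP"
    "PVYKPPVEKPPVYKPPVEKPPVYKPPVEKPPVYKPPVEKPPVYKPPVEKPPVYKPPVEKPPIYKPPVEKPPVYKPPIEKP"
    "PVYTPPVEKPPVYKPPIEEPPVYKPPVEKPPVYGPPYEKPPHYPGYPPYEKPPHHPGYPPADDDNRF"
)

# Aligned units of Region 1, copied from the MRF block above.
UNITS = """
VYK--PPVEKPP VYK--PPIEKPP VYK--PPVEKPP VYK--PPVEKPP VYK--PPIEKPP
VYK--PPVEKPP IYK--PPVEKPP VYK--PPVEKPP VYK--PPVEKPP VYK--PPVEKPP
VYK--PPVEKPP VYK--PPVEKPP VYK--PPVEKPP VYK--PPVEKPP IYK--PPVEKPP
VYK--PPIEKPP VYT--PPVEKPP VYK--PPIEEPP VYK--PPVEKPP VYG--PPYEKPP
HYPGYPPYEK--
""".split()

start, end, width, n_units = summarize_region(Q43414, UNITS)
print(f"Region 1: {start} - {end}, {width} aa length, {n_units} units")
```

Searching for the ungapped concatenation rather than for individual units avoids ambiguity, since a single short unit can match at many positions in a highly repetitive sequence.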
TAPAS results:
PF02218 - Repeat in HS1/Cortactin
PF02218 Protein family information
Q9VDF4
Q9VDF4 Interpro sequence information
Sequence:
>Q9VDF4 1-559
MWKASAGHQIQATSAASAEDDDWETDPDFVNDVSEQEQRWGSKTIDGSGRTAGTIDMDKLREETEQADLDKKKQLLKDQN
AGYGYGGKFGVEKDRMDKSAVGHDYQGKVGKHASQKDYSDGFGGKFGVQEDRKDKSAVGWDHVEKVEKHASQKDYATGFG
GKFGVQSDRVDKSAVGWDHIEKVEKHESQKDYSKGFGGKFGVQEDRKDKSAVGWDHKEAPQKHASQVDHKVKPVIEGAKP
SNLRAKFENLAKNSEEESRKRAEEQKRLREAKDKRDREEAAKKTVAENTPRTSTEAPPPKGSRAAIQTGRTGGIGNAISA
FNQMQSPVSETPPARKEPIIIPKAQPVKIELEAKEEPTASTTSAAVAPTPTVVPAREPETAPVAKAAAPPPDVVPQIEVE
TVDTPPRSEPQSPVYVPTPQPEVHAQVQVQPEPQPQADPEPVVEEEPLYQNQAEIKAASPLPPTNGTVSEAVAPSGTATV
PEEAIYANSDNLADYLEDTGIHAIALYDYQAADDDEISFDPDDVITHIEKIDDGWWRGLCKNRYGLFPANYVQVVGQNS
MRF results:
Region 1: 70 - 247, 37 aa length, 5 units
DKKKQLL---KDQNAGYGYGGKFGVEKDRMDKSAVGH
DYQGKVGKHASQKDYSDGFGGKFGVQEDRKDKSAVGW
DHVEKVEKHASQKDYATGFGGKFGVQSDRVDKSAVGW
DHIEKVEKHESQKDYSKGFGGKFGVQEDRKDKSAVGW
DHKEAPQKHASQVDHKV----KPVIEGAKPSNLRAKF
TAPAS results:
PF03057 - Repeat in HS1/Cortactin
PF03057 Protein family information
A0A0B2V1U5
A0A0B2V1U5 Interpro sequence information
Sequence:
>A0A0B2V1U5 1-535
MFSLVIGSSFQQLYQAATPTGPVLGPSRNTHLPQSWVIKPKRSTPLDEKRTAPIACRGRQMTAFLEPVALLDGLSIWLLI
ALLLTSFVEALYSSCCCCRRKKKKKKKVKKKTNDNEKSGNKDGEQENDGQADAGAPPAAPPAAPKPPDKGGIAGTFDPNY
QTLAGMGQDIFGADKKAGGGGGGAVGGGGPPKPPAAGGMAGTYDPNYQTLAGMGQDIFGADKKCGGGGGAAPQVPQAPKP
GAGGMAGTYDPNYQTLAGLGQDVFGADKKVGGGGGGPPQAPKPGGGGMAGTYDPNYQTLAGLGQDVFGADKKAAGGGGGG
AGPIRAPENAGAKAGTYDPNYQTLAGIGGDVFGADKKKPAAFGGADGIKVPQNAGAKAGTYDPNYQTLAALDNNVFGEDK
KAKAGGGGGAANIKVPQNAGQKAGTYDPNYQTLAALDNNVFGEDKKAKGGGGGGAGGGIRAPENIGAKAGTYDPNYQTLA
AVGGDVFGADKKKPAGGGGFRTPENQAAKAGTYDPNYQTLAALGNDVFGADKKKF
MRF results:
TAPAS results:
PF03991 - Copper binding octapeptide repeat
PF03991 Protein family information
Q7KYY8
Q7KYY8 Interpro sequence information
Sequence:
>Q7KYY8 1-81
PQGGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQPHGGGWG
Q
MRF results:
TAPAS results:
PF04649 - Mycoplasma hyorhinis VlpA repeat
PF04649 Protein family information
Q9L8V9
Q9L8V9 Interpro sequence information
Sequence:
>Q9L8V9 1-384
MKKSIFSKKLLVSFGSLVALAAIPLIAISCGQTDNNSSQSQQPGSGTTNTSGGTNSSGSTNGTAGTNSSGSTNGSGNGSN
SETNTGNKTTSESNSGSSTGSQAGTTTNTGSGSNSESGMNSEKTENTQQSEAPGTNTGNKTTSESNSESSTGSQAGTTTN
TGSGSNSESGMNSEKTENTQQSEAPGTKTENTQQSEAPGTKTENTQQSEAPGTNTGNKTTSESNSGSSTGSQAGTTTNTG
SGSNSESGMNSEKTENTQQSEAPGTKTENTQQSEAPGTKTENTQQSEALGTNTGNKTTSESNSGSSTGSQAGTTTNTGSG
SNSESGMNSEKTENTQQSEAPGTNTGNKTTSESNSESGMNSEKTENTQQSEAPGTKTENTQHTS
MRF results:
TAPAS results:
PF04671 - Erythrocyte membrane-associated giant protein antigen 332
PF04671 Protein family information
W7FNF1
W7FNF1 Interpro sequence information
Sequence:
>W7FNF1 1-518
STTEEIVEKVGSVSEEIIVEEVSASEEIVEEGSVTEEVVEEEKLINEVGETESVTEEIVQKEVSDAEEVLGQEGSMNEEI
LEKESIVEEIVGPEGSVTEEIVDHGSFAEEVKEEELVTEEAVQYEGSVTEEIKEEESITENEAIEESAFAEIIEEKGPNT
DEIVKEEGLDTEEIVNEVSVTDEVIEEEKLVNEQIVGEERSVTEKPVEVERSATEDLVEEEASVTEKVSVHEGSTTEQIL
DESVAEEIVEEEVSVDDKIIEEEVSVDEVVEEEGSVIEEIVEEEESVPEEILEEELSGSEEVLEDEWVTDAFMGQEGSVI
EEIEEIVDGEGSITEEIVEDGSANEKIVEEEPSRVEEVLGKEGFVIEEIIEEGSVIEQVEDTKTVSEKSEESSAIEEVKE
VKEEESISEKIVEKEESVTEEIVRQEESTTEKIVKDVSPTEDFVEQTDSVTEKVIEQEGSNTEVAEDVEEKESASDEHEQ
EDVSVNAQVTYEKKSVTKEIVDEVSRTEEIVEENGSKS
MRF results:
TAPAS results:
PF03482 - sic protein repeat
PF03482 Protein family information
Q9JNA7
Q9JNA7 Interpro sequence information
Sequence:
>Q9JNA7 1-363
MNIRNKIENSKTLLFTSLVAVALLGATQPVSAETYTSRNFDWSGDDWPEDDWSGDGLSKYDRSGVGLSQYGWSKYGWSSD
KEEWPEDWPEDDWSSDKKDETEDKTRPPYGGALGTGYEKRDDWRGPGTVATDPYTPPYGGALGTGYEKRDDWGGPGTVAT
DPYTPPYGGALGTGYEKRDDWRGPGTVATDPYTPPYGGALGTGYEKRDDWGGPGTVATDPYTPPYGGALGTGYEKRDDWR
GPGHIPKPENEQSPNPSHIPEPPQIEWPQWNGFDGLSSGPSDWGQSEDTPRFPSEPRVTEKPQHTPQKNPQESDFDRGFS
AGLKAKNSGRGIDFEGFQYGGWSDEYKKGYMQAFGTPYTPSAT
MRF results:
TAPAS results: