Michael Tress
@michaeltress.bsky.social
Scientist
michaeltress.bsky.social
... selection in mammals. I guess any of the three is OK, but if I had to choose (and APPRIS has to choose), I would be inclined to choose ENST00000355480 over the 1319 aa version, given the detailed RNAseq evidence. The longest isoform is alternative for me. Another one to change for APPRIS.
michaeltress.bsky.social
... evidence for the other two, the longest transcript almost exclusively in spinal cord, ENST00000355480 in liver and artery where it is supposed to be most expressed. Not much proteomics support here either, though there are some non-tryptic peptides. All three isoforms are under purifying ...
michaeltress.bsky.social
Also, I had forgotten about this one. These two isoforms differ at the N-terminal and there's a third (from ENST00000355480) that has 1519 residues. RNAseq strongly supports the shorter N-terminal (1339aa), but then it would. I can't see proteomics evidence for this isoform. There is some RNAseq ...
michaeltress.bsky.social
Thanks for all the work you put into this, it has been very useful. We think it has confirmed that @appris.bsky.social principal selection is mostly working as it should, and it has helped find problematic genes. We have changed 5 principal isoforms and the labelling of alt. isoforms in 6 more.
michaeltress.bsky.social
MIA2 is a struggle because there isn't much data for any of the annotated N-terminals. The RNAseq support for the shorter transcripts is not totally believable, especially since they have practically no peptide support.
michaeltress.bsky.social
There are times when you can only use one isoform/transcript per gene, so some rules have to be applied.

But there are certainly plenty of genes that clearly have multiple important isoforms, which is why APPRIS has TRIFID scores. Even then you get isoforms like the one in RPGR that buck the trend.
michaeltress.bsky.social
APPRIS went with the 79aa isoform that has RNAseq as well as proteomics support.
michaeltress.bsky.social
We had already changed the APPRIS principal manually to the brain specific AS variant. We think the MANE Select variant should have a different N-terminal.
michaeltress.bsky.social
MANE and APPRIS both support 235 aa, so all the major annotations are in agreement.
michaeltress.bsky.social
Haha. It is a bit messy, but I can assure you I have seen MUCH worse. Superficially it appears to have been two genes bolted together, but it appears to exist across mammals at least. RNAseq supports the 792aa and 804aa isoforms, but it would, since they are shorter. I think probably all three are used.
michaeltress.bsky.social
Agree that this looks likely. But the first ATG is conserved all the way across mammals, so I can't see this being changed any time soon. Especially since GENCODE no longer have anyone to point this out.

The 2nd ATG is conserved too and its Kozak sequence is largely untouched, unlike the first ATG
michaeltress.bsky.social
The APPRIS principal has the extra exon. We will change it to agree with the MS transcript
michaeltress.bsky.social
APPRIS and MANE now support a 339aa principal isoform. It has an extra exon vs. the transcript that produces the 321aa isoform. RNAseq suggests that the 339aa and 321aa isoforms are equally valid: the exon is largely skipped in brain, and not elsewhere.
michaeltress.bsky.social
Both APPRIS and MANE have the 801aa isoform as principal, so all in agreement here now.
michaeltress.bsky.social
Yeah, no support at all for the MANE Select. RNAseq and conservation support the 73aa UniProt/APPRIS isoform too.
michaeltress.bsky.social
RNAseq is in agreement. APPRIS had already switched its principal to the 431aa isoform, MANE still chooses the alternative which has some (low) expression in brain.
michaeltress.bsky.social
APPRIS/MANE have both already switched to the correct 525aa isoform
michaeltress.bsky.social
The upstream ATG shows protein-coding conservation across primates, but is only expressed in testis (and has no peptides, as noted), so the 287aa isoform is clearly principal. APPRIS changed.
michaeltress.bsky.social
APPRIS had already switched to the 374aa isoform, but the RNAseq data does not agree with the proteomics data. It has inclusion and skip at about 50-50.
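For anyone wanting to put a number on "about 50-50", here is a minimal sketch of a percent-spliced-in style ratio. The function name and the read counts are made up for illustration and are not taken from the data discussed in this thread; it also assumes the inclusion count has already been summarised to one value per exon.

```python
def percent_spliced_in(inclusion_reads: int, skipping_reads: int) -> float:
    """Fraction of junction reads that support exon inclusion.

    Hypothetical helper: assumes one summarised inclusion count and one
    skipping count for a single cassette exon.
    """
    total = inclusion_reads + skipping_reads
    if total == 0:
        raise ValueError("no junction reads for this exon")
    return inclusion_reads / total


# Illustrative counts only, not real RNAseq data.
ratio = percent_spliced_in(48, 52)
print(f"inclusion fraction: {ratio:.2f}")
```

A value near 0.5 is what "inclusion and skip at about 50-50" would look like under these assumptions.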
michaeltress.bsky.social
I like this one! You are absolutely right that there is more evidence for the 805aa isoform. But there are two NAGNAG splice events in RASAL1 and the extra amino acid exon has way more support in splice events. Which makes the 806aa isoform the principal. APPRIS will change to reflect this.
Two NAGNAG events marked with arrows.
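For readers unfamiliar with the term, a NAGNAG site is a tandem acceptor where two AG dinucleotides sit three bases apart, so the choice of acceptor adds or removes a single amino acid. A rough sketch of scanning a sequence for the motif follows; the function name and the toy sequence are hypothetical and have nothing to do with the actual RASAL1 sequence.

```python
import re

# Lookahead so that overlapping NAGNAG motifs are all reported.
NAGNAG = re.compile(r"(?=([ACGT]AG[ACGT]AG))")


def find_nagnag(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, motif) for every NAGNAG hit in seq."""
    return [(m.start(), m.group(1)) for m in NAGNAG.finditer(seq.upper())]


# Toy sequence for illustration only.
print(find_nagnag("TTTCAGCAGGT"))
```

This only finds the sequence motif; whether both acceptors are actually used still has to come from splice junction evidence.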
michaeltress.bsky.social
However, this paper shows strong evidence for three N-terminal AS variants in KCNIP1 in different regions in the brain, the 227aa isoform (UniProt), 216aa (MANE) and 225aa (APPRIS principal), so I am inclined to leave things as they are.
michaeltress.bsky.social
I am not sure what to make of this one. There are a lot of conserved N-terminal exons. It is brain expressed. I wouldn't put much stock in the proteomics data (non-existent for any isoform) nor AlphaFold (no structure for any). The RNAseq evidence strongly supports the 227aa variant. 1/2
michaeltress.bsky.social
It does have RNAseq support though (inclusion is about 50/50).
michaeltress.bsky.social
LCOR is one of only about 20 annotated splice variants in which one set of Pfam domains is swapped for another set (cf. CUX1, DST, etc.). Having different main isoforms is OK.