BA.2.75 lineage with ORF1b:V1706I [~600 seq as of 2022-08-18] #965
Comments
… updated from BA.2.75 Resolves #965
Added new lineage BA.2.75.3 from #965 with 104 new designations and 3 updated from BA.2.75
We are at ~1000 sequences now. I've been looking at designatable lineages in there with the beneficial mutations S:346T, S:486S and S:490S. It looks like we have the following branches:
Of the above, the lineage with S:F486S on the polytomy is worthy of immediate designation; the rest still need to wait to see whether they grow to be significant. @silcn @AngieHinrichs any thoughts? |
…ed from BA.2.75 and BA.2.75.3
Added new lineage BM.1 from #965 with 45 new designations and 8 updated from BA.2.75 and BA.2.75.3
In the 2022-08-29 UShER tree, the BA.2.75.3 > T23019C (S:F486S) branch has only 3 sequences:
-- <<80 sequences... can you send some example IDs? I'll try to figure out why they aren't showing up there. |
@AngieHinrichs It should be all these in here: 615e53c. Usher really struggles with BA.2.75 because it has so many reverted sequences :/ I look at the splits in Nextclade, very low tech, just coloring etc. I'm curious what you find... It could be that the order is the other way round in Usher (first 346, then 486), but I'm pretty sure 486 happened first. The order is off because of reversions. |
Yep, reversions. Of the 51 samples you just added to lineages.csv for BM.1, 22 are found in the 2022-08-29 UShER tree.
|
Thanks for the investigation @AngieHinrichs! Seems like BA.2.75 is pretty much a nightmare scenario for Usher. Maybe there should be a separate Usher build with much stricter reversion requirements? Anything that has even a single reversion gets thrown out? It would have fewer sequences but be cleaner! |
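To make the "throw out anything with even a single reversion" idea concrete, here is a minimal sketch in Python. It is not how UShER or any existing build works; the function names, the sample layout (`mutations`, `missing`) and the two-mutation branch definition are assumptions made up for illustration. A sample counts as reverted when a branch mutation is absent at a position that was actually covered.

```python
import re
from typing import AbstractSet, Dict, Iterable, List

# Nucleotide mutations written as <ref><position><alt>, e.g. "G18583A".
MUTATION = re.compile(r"^([ACGT])(\d+)([ACGT])$")


def position(mutation: str) -> int:
    """Return the genome position encoded in a mutation string."""
    match = MUTATION.match(mutation)
    if match is None:
        raise ValueError(f"not a nucleotide mutation: {mutation!r}")
    return int(match.group(2))


def reverted(
    defining: Iterable[str],
    observed: AbstractSet[str],
    missing: AbstractSet[int] = frozenset(),
) -> List[str]:
    """Branch mutations absent from a sample at positions that were covered.

    Positions listed in `missing` (no coverage) are not counted as reversions.
    """
    hits = [m for m in defining if m not in observed and position(m) not in missing]
    return sorted(hits, key=position)


def strict_filter(samples: Dict[str, dict], defining: Iterable[str]) -> Dict[str, dict]:
    """Keep only samples with zero apparent reversions (hypothetical strict build)."""
    defining = list(defining)
    return {
        name: s
        for name, s in samples.items()
        if not reverted(defining, s["mutations"], s["missing"])
    }


# Illustrative branch definition only, not an official lineage definition.
BRANCH = ["G18583A", "T23019C"]
SAMPLES = {
    "clean": {"mutations": {"G18583A", "T23019C"}, "missing": set()},
    "reverted": {"mutations": {"T23019C"}, "missing": set()},
    "dropout": {"mutations": {"T23019C"}, "missing": {18583}},
}
KEPT = strict_filter(SAMPLES, BRANCH)
```

The trade-off is the one raised above: the filtered set is smaller but cleaner. Whether uncovered positions should count against a sample is a design choice; this sketch lets them pass.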
Hi @corneliusroemer. I'm very busy in real life at the moment so I can't spend much time looking at sequences. When the numbers were still small I was manually removing the reversions from the fastas in order to see the correct placements on the Usher tree, but now I would need to automate that and I don't have the time to write a script. Don't think we can know which of 346T and 486S came first in the lineage with both (plus 490S). Given this, I would say the designation should match the order of the Usher tree, but C3857A followed by A3857C looks clearly wrong to me, and I don't think fixing the reversions will fix that placement. @AngieHinrichs if you shout loudly enough at Usher can you get it to accept that it's wrong? :) |
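A placement like C3857A followed by A3857C can be flagged mechanically. The sketch below assumes a root-to-node path is available as a list of per-branch mutation lists; that layout and the function names are hypothetical and are not taken from UShER or matUtils.

```python
import re
from typing import Dict, List, Tuple

MUTATION = re.compile(r"^([ACGT])(\d+)([ACGT])$")


def parse(mutation: str) -> Tuple[str, int, str]:
    """Split e.g. "C3857A" into ("C", 3857, "A")."""
    match = MUTATION.match(mutation)
    if match is None:
        raise ValueError(f"not a nucleotide mutation: {mutation!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def back_mutations(path: List[List[str]]) -> List[Tuple[str, str]]:
    """Find mutations along a root-to-node path that are later undone.

    `path` holds one list of mutations per branch, ordered from the root.
    Returns (forward, reverse) pairs such as ("C3857A", "A3857C").
    """
    last_seen: Dict[int, str] = {}  # position -> most recent mutation there
    flagged: List[Tuple[str, str]] = []
    for branch in path:
        for mutation in branch:
            ref, pos, alt = parse(mutation)
            previous = last_seen.get(pos)
            if previous is not None:
                prev_ref, _, prev_alt = parse(previous)
                # The later mutation exactly undoes the earlier one.
                if prev_ref == alt and prev_alt == ref:
                    flagged.append((previous, mutation))
            last_seen[pos] = mutation
    return flagged


# Toy path for illustration; only the 3857 pair comes from the discussion.
PATH = [["G18583A"], ["C3857A"], ["A3857C", "T23019C"]]
SUSPICIOUS = back_mutations(PATH)
```

A flagged pair is only a hint that the placement deserves a manual look; it does not say which of the two branches is wrong.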
With each new major-wave variant, the problem of amplicon dropout / assembly pipelines causing false reversions has progressively worsened. There was a little bit of a problem with reversions causing a mini-Alpha, then quite a few mini-Deltas so we added branch-specific masking, and then with BA.1 every primer scheme had something or other knocked out and if it weren't for wanting to catch recombinants I'd be masking all the major defining mutations... anyway, yep.
I agree. It's one of those cases where there's just not enough info for a purely parsimony-based approach to sort out, but usher has a tie-breaking algorithm based on number of descendants of a node that often works to settle things out once there are more sequences, if I remove some sequences and add them back. That's about the only form of "shouting" I have at this point. Every once in a while I think maybe it's time for a manual node-moving utility function though (with checks to make sure that the move doesn't change alleles assigned to any descendants). [We added matUtils mask --move-nodes that fixes a particularly egregious situation of mutations and reversions that matOptimize produced for a limited time, but stopped short of adding a general purpose just-move-it function.] |
What about having two trees? One that's heavily masked for general usage and one that's not so heavily masked for recombinant detection?
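The descendant-count tie-break mentioned in this thread can be illustrated with a toy sketch. This is not UShER's implementation; the `Candidate` class, its fields and the numbers are invented for illustration. The point is only that equally parsimonious placements are separated by how many descendants each candidate node has, so the winner can change as more sequences arrive.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Candidate:
    """One possible placement for a sample (hypothetical structure)."""
    node: str
    parsimony_cost: int  # extra mutations this placement would require
    descendants: int     # samples already below the candidate node


def choose_placement(candidates: Sequence[Candidate]) -> Candidate:
    """Lowest parsimony cost wins; ties go to the node with more descendants."""
    if not candidates:
        raise ValueError("no candidate placements")
    # If cost and descendant count are both tied, min() keeps the first candidate.
    return min(candidates, key=lambda c: (c.parsimony_cost, -c.descendants))


# Early on, both orders of 346T and 486S cost the same and counts are small.
EARLY = [Candidate("346T-first", 1, 3), Candidate("486S-first", 1, 2)]
# Later, with more sequences, the same tie resolves the other way.
LATER = [Candidate("346T-first", 1, 3), Candidate("486S-first", 1, 40)]

EARLY_CHOICE = choose_placement(EARLY).node
LATER_CHOICE = choose_placement(LATER).node
```

This also suggests why removing sequences and adding them back can help: the re-added samples are placed against updated descendant counts.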
|
To make better sense of the multiple independent occurrences of spike mutations of interest within BA.2.75*, such as S:346T and others (see #961), it may make sense to designate the big branches that are coming off the BA.2.75 polytomy.
GISAID query:
NSP14_V182I,NSP3_S403L
We've already got one designated: BA.2.75.1 with S:574 in #907
Another big lineage with international spread is the branch with ORF1b:V1706I (G18583A) that accounts for ~20% of sequences within BA.2.75*
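For readers wondering why the same change appears as ORF1b:V1706I here and as NSP14_V182I in the GISAID queries, the arithmetic can be sketched as follows. The start coordinates are assumptions based on the commonly used Wuhan-Hu-1 reference annotation (ORF1b counted from nucleotide 13468, nsp14 from 18040); they are not stated in this issue.

```python
# Assumed 1-based start coordinates on the Wuhan-Hu-1 reference genome.
CDS_START = {
    "ORF1b": 13468,
    "nsp14": 18040,
}


def codon_number(nuc_pos: int, cds_start: int) -> int:
    """1-based codon index of a nucleotide position within a coding region."""
    if nuc_pos < cds_start:
        raise ValueError("position lies before the start of the coding region")
    return (nuc_pos - cds_start) // 3 + 1


ORF1B_CODON = codon_number(18583, CDS_START["ORF1b"])  # (18583 - 13468) // 3 + 1
NSP14_CODON = codon_number(18583, CDS_START["nsp14"])  # (18583 - 18040) // 3 + 1
```

With these coordinates, G18583A falls in codon 1706 of ORF1b and codon 182 of nsp14, which matches the two names used in this issue.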
In any case, even if this branch is not designated, at least this issue draws attention to the existence of this branch. There will definitely be multiple child lineages in due course - which will get called BA.2.75.X if this lineage here isn't designated or get their own alias if this one here is designated.
GISAID query that should catch most of these: NSP14_V182I,E_T11A,NSP3_S403L
Proposed lineages on this branch include (maybe I missed some):
covSpectrum query: https://cov-spectrum.org/explore/World/AllSamples/Past6M/variants?variantQuery=nextcladePangoLineage%3ABA.2.75*+%26+G18583A+&variantQuery1=nextcladePangoLineage%3ABa.2.75*+%26++ORF1b%3AV1706I&