Replication and sharding configuration at system level #2386

Open
samos123 opened this issue Nov 26, 2022 · 3 comments

@samos123
Contributor

Currently, replication and sharding are configured at the class level; however, it would be nice if this could be done at the system level. A potential use case: the DB administrator is not the person who creates the classes, and the person who creates the classes does not care or does not know about replication and sharding.

Quoting @etiennedi on slack:
Pro class level
I’m aware of many users that use classes for multi-tenancy, i.e. one or multiple classes per tenant. For this it makes a lot of sense to be able to configure this separately. A large tenant may need a completely different config than a tiny one, etc.

Pro system level
Besides the “different personas” argument you make, another one would be our situation on the WCS/managed offering. There the promise to the user is that they don’t have to worry about anything infra-related and in fact whether or not they choose HA even has an effect on the pricing. So, we need something that is not at the class level.

I have two potential ideas. I like the second one better than the first, but the first one is much cheaper to build. So it could be a matter of going with option 1 at first and option 2 later:
Option 1: System level default, optional class-level restriction
The idea is that an administrator could set defaults for this config, and the user could then override them. If we don’t want the “class creator” to override them, we could also protect those settings; the class creator then has to omit them or face an error message. This is cheap to build, non-breaking, and would solve your issue. It doesn’t solve the multi-tenancy issue.
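
The merge-and-protect behaviour described for Option 1 could look roughly like the sketch below. It is only an illustration under stated assumptions, not Weaviate's actual API: `SystemDefaults`, `resolve_class_config`, `ProtectedSettingError`, and the field names `replication_factor` and `shard_count` are all hypothetical.

```python
from dataclasses import dataclass


class ProtectedSettingError(ValueError):
    """Raised when a class definition sets a field the administrator has locked."""


@dataclass(frozen=True)
class SystemDefaults:
    # Hypothetical system-level settings chosen by the administrator.
    replication_factor: int = 1
    shard_count: int = 1
    # Names of the settings that class creators may not override.
    protected: frozenset = frozenset()


# The only settings this sketch knows about (assumed names).
KNOWN_SETTINGS = {"replication_factor", "shard_count"}


def resolve_class_config(defaults: SystemDefaults, class_overrides: dict) -> dict:
    """Return the effective config for one class.

    System defaults apply unless the class overrides them. Overriding a
    protected setting is rejected with an error instead of being ignored.
    """
    unknown = set(class_overrides) - KNOWN_SETTINGS
    if unknown:
        raise ValueError(f"unknown settings: {sorted(unknown)}")

    blocked = set(class_overrides) & defaults.protected
    if blocked:
        raise ProtectedSettingError(
            f"settings locked at the system level: {sorted(blocked)}"
        )

    effective = {
        "replication_factor": defaults.replication_factor,
        "shard_count": defaults.shard_count,
    }
    effective.update(class_overrides)
    return effective
```

With this shape, a class creator who omits both settings simply inherits the defaults, which is why the option would be non-breaking for existing class definitions.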
Option 2: Introduce proper namespaces
If we look at Cassandra, for example, replication is configured at the Keyspace level. A keyspace is a collection of various tables. I could imagine that we could introduce the same concept within Weaviate as well. Create one “keyspace” per tenant and have complete isolation between keyspaces. We could then configure infra-related things such as sharding and replication at the keyspace level. The big downside to this idea is that it’s a pretty hefty change to the current structure. It would either come with breaking changes or some mandatory migrations at the version upgrade. But we hear more and more about multi-tenancy so it could be a very clean long-term option.
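
The keyspace-style idea in Option 2 could be pictured with the toy model below. It assumes that infra settings live only on the namespace and that every class inside it inherits them; `Namespace`, `Registry`, and all method and field names are hypothetical and do not describe an existing Weaviate feature.

```python
from dataclasses import dataclass, field


@dataclass
class Namespace:
    # Hypothetical per-tenant container, analogous to a Cassandra keyspace.
    name: str
    replication_factor: int
    shard_count: int
    classes: set = field(default_factory=set)


class Registry:
    """Toy registry in which infra config is set per namespace, not per class."""

    def __init__(self):
        self._namespaces = {}

    def create_namespace(self, name, replication_factor, shard_count):
        if name in self._namespaces:
            raise ValueError(f"namespace {name!r} already exists")
        ns = Namespace(name, replication_factor, shard_count)
        self._namespaces[name] = ns
        return ns

    def create_class(self, namespace, class_name):
        # Class names only need to be unique within their own namespace.
        ns = self._namespaces[namespace]
        if class_name in ns.classes:
            raise ValueError(f"class {class_name!r} already exists in {namespace!r}")
        ns.classes.add(class_name)

    def infra_config(self, namespace, class_name):
        # A class carries no infra settings of its own; it inherits them.
        ns = self._namespaces[namespace]
        if class_name not in ns.classes:
            raise KeyError(class_name)
        return {
            "replication_factor": ns.replication_factor,
            "shard_count": ns.shard_count,
        }
```

In this model a large tenant and a tiny tenant can both define a class with the same name and still get different replication and sharding settings, which is the multi-tenancy case that Option 1 leaves open.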

@etiennedi etiennedi added the planned-1.18 ETA Week 09, 2023 label Nov 29, 2022
@etiennedi
Member

I put the 1.18 label on this for now. If we still have capacity on the v1.17 timeline this would be a nice-to-have there.

@samos123
Contributor Author

I don't have a need for this myself, so it would be good to get some other user feedback on whether this is needed, or whether they would like to see it done differently.

@byronvoorbach byronvoorbach changed the title from "Replication and sharing configuration at system level" to "Replication and sharding configuration at system level" Dec 7, 2022
@etiennedi etiennedi added backlog and removed planned-1.18 ETA Week 09, 2023 labels Dec 20, 2022
@etiennedi
Member

Putting back in the backlog for now. Please upvote if there is demand. This is cheap to add, so we can always pull it in, but for now, we have things that are much more in demand for v1.18.
