We have a bunch of workers in different network zones with different sources (syslog, s2s, TLS, etc.).
How would you prefer to proceed with group management?
We have two options here:
- All workers in a single group with all sources (if you deploy a new source, all workers need a restart? Not good at all)
- A group for every network zone, each with its own worker and dedicated sources (we think it is really uncomfortable to work with a lot of groups, because every group has its own configuration)
Thanks for any feedback and suggestions.
I’ve asked myself the same question a few times, and I think it boils down to “whatever works better for you”.
In the end, having separate groups is the clean, “proper” way to do this. Different boxes do different things. However, depending on how much they have in common, you might get a lot of overhead copying the same configs to the groups over and over, and keeping them in sync. Packs can help with some of this, but until Cribl comes up with a way to share config across all or multiple groups, it will be some work.
So, you’re pretty much stuck between pragmatic and proper, and as I don’t know your environment and how much duplicate config there would be, it’s hard to decide.
Overall, I would say the two main reasons for creating different Worker Groups are Geography and Workload. Geography means it may make sense to keep the sources and destinations close to the local worker nodes. Workload means you may want to have dedicated workers just for Replay from S3 or for processing all your syslog data. This blog might help with these decisions: https://cribl.io/blog/worker-groups-what-are-they-and-why-you-should-care/
Agree @xpac - we have converted a few worker groups at different sites into a single worker group and it bugs me - however it is the only way I can easily ensure configuration is the same between the workers at the different sites. Having multiple worker groups and trying to ensure that Cribl configuration (incl. lookups) is the same between them would be time-consuming and prone to human error.
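To make the "keeping them in sync" problem a bit more concrete, here is a minimal sketch of a drift check between groups. It assumes each group's configuration has already been exported into a plain Python dict; the group names, keys, and the `find_drift` helper are all hypothetical and are not part of any Cribl API.

```python
import hashlib
import json


def fingerprint(config):
    """Hash a config fragment in a key-order-independent way."""
    canonical = json.dumps(config, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def find_drift(groups, shared_keys):
    """Return {key: {group: fingerprint}} for shared keys that differ across groups."""
    drift = {}
    for key in shared_keys:
        prints = {name: fingerprint(cfg.get(key)) for name, cfg in groups.items()}
        if len(set(prints.values())) > 1:
            drift[key] = prints
    return drift


# Hypothetical exported configs for two sites (illustrative data only).
GROUPS = {
    "site_a": {
        "pipelines": {"main": ["eval", "drop"]},
        "lookups": {"hosts.csv": "v2"},
    },
    "site_b": {
        "pipelines": {"main": ["eval", "drop"]},
        "lookups": {"hosts.csv": "v1"},
    },
}

drifted = find_drift(GROUPS, ["pipelines", "lookups"])
print(sorted(drifted))
```

Something like this would only report that the parts meant to be identical have diverged. It does not fix the copying itself, so it is a safety net rather than a solution.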
Hi, thank you for sharing your thoughts.
We currently have 20 workers in different network zones, so there would be around 15 separate groups.
It’s really annoying to copy and paste all configs from one group to another, and the same goes for packs etc.
On the other hand, in a single group, if you change a source or deploy a new one, all other workers face a restart, which will cause data loss, especially if you have UDP sources and no load balancer in front of them.
We are currently thinking about a default group with all sources, destinations, pipelines etc. but without workers, and behind that different groups with the filtered configs.
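As a rough sketch of that idea (one default config, filtered per group), something like the following could generate the per-zone configs outside of the product. The dict layout, the `zones` tag, the source names, and the ports are assumptions made up for illustration; this is not Cribl's config format or API.

```python
from copy import deepcopy

# Hypothetical "default" config: every source is tagged with the zones
# that should run it. Structure and names are illustrative only.
BASE_CONFIG = {
    "sources": {
        "syslog_udp": {"port": 514, "zones": ["dmz", "internal"]},
        "s2s": {"port": 9997, "zones": ["internal"]},
        "tls_in": {"port": 6514, "zones": ["dmz"]},
    },
    "pipelines": {"main": {"functions": ["eval", "drop"]}},
}


def config_for_zone(base, zone):
    """Return a copy of `base` keeping only the sources tagged for `zone`."""
    cfg = deepcopy(base)
    cfg["sources"] = {
        # Drop the helper "zones" tag from the generated per-group config.
        name: {k: v for k, v in src.items() if k != "zones"}
        for name, src in base["sources"].items()
        if zone in src["zones"]
    }
    return cfg


dmz = config_for_zone(BASE_CONFIG, "dmz")
print(sorted(dmz["sources"]))  # ['syslog_udp', 'tls_in']
```

Pipelines and everything else stay shared, and only the sources are filtered, so adding a source for one zone would not change the generated config of the other groups.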