On Friday, I started to play with FMN.
Currently, there is a fedmsg consumer that listens to the messages coming from all over the Fedora infrastructure, then based on the preferences set in FMN's web UI it decides whether to send a notification and how.
There have been thoughts on reworking the process to allow splitting it over multiple nodes.
The idea is to do something like this:
                                    +-> worker -+          these senders
                                    |           |          just do simple I/O
                                    |           |
                                    +-> worker -+          +-> email sender
                                    |           |          |
                                    |           |          |
    fedmsg -> fmn consumer -> redis +-> worker -+-> redis -+-> IRC sender
                                    |           |          |
                                    |           |          |
                                    +-> worker -+          +-> GCM sender
                                    |           |
                                    |           |
                                    +-> worker -+
My question was how to divide the incoming messages among the different workers. So I adjusted the consumer a little to forward each message it receives to one of several redis channels.
The code looks something like:
    i = random.randint(0, self.workers-1)
    log.debug('Sending to worker %s' % i)
    print(self.redis[i])
    self.redis[i].publish('%s' % i, json.dumps(raw_msg))

We're randomly picking one of the N workers we know are available (for my tests: 4).
Sounds simple enough, right? But will it spread the load between the workers evenly?
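A quick way to get a feeling for the answer, before running the real thing, is to simulate only the picking step. The sketch below is not FMN code: `simulate_dispatch` and its parameters are made up for illustration, and it replaces redis with a simple counter. It only assumes the same `random.randint(0, N-1)` rule as the consumer.

```python
import random
from collections import Counter


def simulate_dispatch(n_messages, n_workers, seed=None):
    """Count how many messages each worker would get with random picking.

    Hypothetical helper: no redis involved, just the picking rule.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_messages):
        # Same rule as in the consumer: randint is inclusive on both ends,
        # so this yields a worker index between 0 and n_workers - 1.
        counts[rng.randint(0, n_workers - 1)] += 1
    return counts


if __name__ == "__main__":
    counts = simulate_dispatch(n_messages=500000, n_workers=4, seed=42)
    total = sum(counts.values())
    for worker in sorted(counts):
        share = 100.0 * counts[worker] / total
        print("worker %d: %d messages (%.2f%%)" % (worker, counts[worker], share))
```

A simulation like this only says something about the random generator, of course; it does not tell you how the real message stream or redis behave, which is why a real test run is still worth doing.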
So over the weekend I left my test program running.
This is the output collected:
- worker 0: 126468 messages received
- worker 1: 126908 messages received
- worker 2: 126993 messages received
- worker 3: 126372 messages received
This makes a total of 506741 messages received over the weekend, and the load is spread among the workers as follows:
- worker 0: 24.95713% of the messages
- worker 1: 25.04396% of the messages
- worker 2: 25.06073% of the messages
- worker 3: 24.93818% of the messages
Looks good enough :)
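For the record, the percentages above can be recomputed from the raw counts with a few lines of Python. This is a small sketch; the `shares` helper is a name made up for this post, and the counts are the ones collected above.

```python
# Messages received per worker, as collected over the weekend.
counts = {0: 126468, 1: 126908, 2: 126993, 3: 126372}


def shares(counts):
    """Return the percentage of the total handled by each worker."""
    total = sum(counts.values())
    return {worker: 100.0 * n / total for worker, n in counts.items()}


total = sum(counts.values())
even_share = 100.0 / len(counts)  # 25% each for a perfectly even split

print("total: %d messages" % total)
for worker, pct in sorted(shares(counts).items()):
    # Also show how far each worker is from the perfectly even split.
    print("worker %d: %.5f%% of the messages (%+.5f points from even)"
          % (worker, pct, pct - even_share))
```

Every worker ends up well within a tenth of a percentage point of the ideal 25%.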
Next step: splitting the code between fmn.consumer, fmn.worker and fmn.backend (the one doing the I/O) and figuring out how to deal with the cache.