Le blog de pingou

Tag - Fedora-Infra

Monday, May 9 2016

Playing with FMN

On Friday, I started to play with FMN.

Currently, there is a fedmsg consumer that listens to the messages coming from all over the Fedora infrastructure, then based on the preferences set in FMN's web UI it decides whether to send a notification and how.

There have been thoughts on reworking the process so that it can be split over multiple nodes.

The idea is to do something like this:

                                +-> worker -+          these senders
                                |           |          just do simple I/O
                                |           |
                                +-> worker -+          +-> email sender
                                |           |          |
                                |           |          |
fedmsg -> fmn consumer -> redis +-> worker -+-> redis -+-> IRC sender
                                |           |          |
                                |           |          |
                                +-> worker -+          +-> GCM sender
                                |           |
                                |           |
                                +-> worker -+

My question was how to divide the incoming messages among the different workers. So I adjusted the consumer a little to forward each message received to a different redis channel.

The code looks something like:

            i = random.randint(0, self.workers-1)
            log.debug('Sending to worker %s' % i)
            self.redis[i].publish('%s' % i, json.dumps(raw_msg))

We're randomly picking one of the workers from the N workers we know are available (for my tests: 4).
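
The selection logic can be tried in isolation, without redis. The sketch below is a stand-in and not the FMN code: `pick_worker` and `simulate` are hypothetical names, and a `Counter` takes the place of the redis channels.

```python
import random
from collections import Counter


def pick_worker(workers, rng=random):
    """Return the index of a worker chosen uniformly at random."""
    # Same selection as in the consumer: randint is inclusive on both ends.
    return rng.randint(0, workers - 1)


def simulate(messages, workers, seed=None):
    """Count how many of `messages` each of the `workers` would receive."""
    rng = random.Random(seed)
    # Start every worker at zero so idle workers still show up in the result.
    counts = Counter({i: 0 for i in range(workers)})
    for _ in range(messages):
        counts[pick_worker(workers, rng)] += 1
    return counts


if __name__ == "__main__":
    for worker, received in sorted(simulate(100000, 4, seed=1).items()):
        print("worker %s: %s messages" % (worker, received))
```

Passing a seed makes a run repeatable, which is handy when comparing this against another strategy such as round-robin.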

Sounds simple enough, right? But will it spread the load between the workers evenly?

So over the weekend I left my test program running.

This is the output collected:

  • worker 0: 126468 messages received
  • worker 1: 126908 messages received
  • worker 2: 126993 messages received
  • worker 3: 126372 messages received

This makes a total of 506741 messages received over the weekend, and the load is spread among the workers as follows:

  • worker 0: 24.95713% of the messages
  • worker 1: 25.04396% of the messages
  • worker 2: 25.06073% of the messages
  • worker 3: 24.93818% of the messages

Looks good enough :)
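
The percentages can be recomputed from the counts reported above. This is only a small sketch of the arithmetic; the function names `shares` and `max_deviation` are made up for the example.

```python
# Message counts per worker, copied from the output above.
received = {0: 126468, 1: 126908, 2: 126993, 3: 126372}


def shares(counts):
    """Return each worker's share of the total, in percent."""
    total = sum(counts.values())
    return {worker: 100.0 * n / total for worker, n in counts.items()}


def max_deviation(counts):
    """Largest gap, in percentage points, between a share and an even split."""
    even = 100.0 / len(counts)
    return max(abs(share - even) for share in shares(counts).values())


if __name__ == "__main__":
    print("total: %d" % sum(received.values()))
    for worker, share in sorted(shares(received).items()):
        print("worker %d: %.5f%%" % (worker, share))
    print("max deviation: %.5f points" % max_deviation(received))
```

With these counts, no worker ends up more than about 0.06 percentage points away from an even 25%.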

Next step, splitting the code between fmn.consumer, fmn.worker and fmn.backend (the one doing the IO) and figuring out how to deal with the cache.

Thursday, November 19 2015

Introducing mdapi

I have recently been working on a new small project, an API to query the information stored in the meta-data present in the RPM repositories (Fedora's and EPEL's).

These meta-data include the package name, summary, description, epoch, version and release, but also the changelog and the list of all the files in a package. They also include the dependency information: the regular Provides, Requires, Obsoletes and Conflicts, as well as the new ones for soft dependencies: Recommends, Suggests, Supplements and Enhances.

With this project, we are exposing all this information to everyone, in an easy way.

mdapi will check if the requested package is present in the updates-testing, updates or release repositories (in this order) and it will return the information found in the first repo where there is a match (and say so). So for example: https://apps.fedoraproject.org/mdapi/f23/pkg/guake?pretty=True*

shows the package information for guake in Fedora 23, where guake has been updated but the latest version is in updates not updates-testing. Therefore it says "repo": "updates".
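
The "first repo with a match wins" rule can be sketched in a few lines. This is not mdapi's code: `find_package`, `REPO_ORDER` and the in-memory dictionaries are assumptions made for the example, and the versions are made up.

```python
# Repositories in the order the post describes: first match wins.
REPO_ORDER = ("updates-testing", "updates", "release")


def find_package(name, repos):
    """Return (repo, info) for the first repo in REPO_ORDER containing `name`.

    `repos` maps a repo name to a dict of {package name: info}; this layout
    is hypothetical and only serves the example. Returns None if no repo
    knows the package.
    """
    for repo in REPO_ORDER:
        info = repos.get(repo, {}).get(name)
        if info is not None:
            return repo, info
    return None


if __name__ == "__main__":
    # Made-up versions, only there to exercise the lookup order.
    repos = {
        "updates-testing": {},
        "updates": {"guake": {"version": "2"}},
        "release": {"guake": {"version": "1"}},
    }
    print(find_package("guake", repos))
```

Here the package is absent from updates-testing, so the lookup stops at updates and never consults release, which mirrors the guake case described above.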

The application is written entirely in python3 using aiohttp which is itself based on asyncio, allowing it to handle some load very nicely.

Just to show you, here is the result of a little test performed with the apache benchmark tool:

    $ ab -c 100 -n 1000 https://apps.fedoraproject.org/mdapi/f23/pkg/guake
    This is ApacheBench, Version 2.3 <$Revision: 1663405 $>
    Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
    Licensed to The Apache Software Foundation, http://www.apache.org/
    Benchmarking apps.fedoraproject.org (be patient)
    Completed 100 requests
    Completed 200 requests
    Completed 300 requests
    Completed 400 requests
    Completed 500 requests
    Completed 600 requests
    Completed 700 requests
    Completed 800 requests
    Completed 900 requests
    Completed 1000 requests
    Finished 1000 requests
    Server Software:        Python/3.4
    Server Hostname:        apps.fedoraproject.org
    Server Port:            443
    SSL/TLS Protocol:       TLSv1.2,ECDHE-RSA-AES128-GCM-SHA256,4096,128
    Document Path:          /mdapi/f23/pkg/guake
    Document Length:        1843 bytes
    Concurrency Level:      100
    Time taken for tests:   41.825 seconds
    Complete requests:      1000
    Failed requests:        0
    Total transferred:      2133965 bytes
    HTML transferred:       1843000 bytes
    Requests per second:    23.91 [#/sec] (mean)
    Time per request:       4182.511 [ms] (mean)
    Time per request:       41.825 [ms] (mean, across all concurrent requests)
    Transfer rate:          49.83 [Kbytes/sec] received
    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:      513  610 207.1    547    1898
    Processing:   227 3356 623.2   3534    4025
    Waiting:      227 3355 623.2   3533    4024
    Total:        781 3966 553.2   4085    5377
    Percentage of the requests served within a certain time (ms)
      50%   4085
      66%   4110
      75%   4132
      80%   4159
      90%   4217
      95%   4402
      98%   4444
      99%   4615
     100%   5377 (longest request)

Note the:

    Time per request:       41.825 [ms] (mean, across all concurrent requests)

We are below 42 ms (so 0.042 second) per request, averaged across all concurrent requests, to retrieve the info of a package in the updates repo, and that is while executing 100 requests at the same time on a server that is in the US while I am in Europe.
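
The derived figures in the ab report follow from three totals. The arithmetic below is a sketch of how those lines relate to each other; the variable names are made up and the inputs are copied from the output above.

```python
# Figures copied from the ab output above.
total_time_s = 41.825   # "Time taken for tests"
requests = 1000         # "Complete requests"
concurrency = 100       # "Concurrency Level"

# Mean time per request across all concurrent requests: wall time / requests.
per_request_overall_ms = total_time_s / requests * 1000

# Mean time a single request took as seen by one of the concurrent clients.
per_request_ms = per_request_overall_ms * concurrency

# Throughput over the whole run.
requests_per_second = requests / total_time_s

if __name__ == "__main__":
    print("%.3f ms across all concurrent requests" % per_request_overall_ms)
    print("%.1f ms per individual request" % per_request_ms)
    print("%.2f requests per second" % requests_per_second)
```

So the 41.825 ms figure is a throughput-style average: each individual request took around 4.2 seconds, but with 100 of them in flight the server completed one roughly every 42 ms.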

* Note the ?pretty=True in the URL: this is handy to view the JSON returned, but I advise against using it in your applications as it will increase the amount of data returned and thus slow things down.

Note2: Your mileage may vary when testing mdapi yourself, but it should remain pretty fast!

Wednesday, February 25 2015

About SourceForge and anitya

There are a couple of reports (1 and 2) about anitya not doing its job properly for projects hosted on sourceforge.net.

So here is a summary of the situation:

A project X on sourceforge.net, for example with a homepage sourceforge.net/projects/X, releases multiple tarballs named X-1.2.tar.gz, libX-0.3.tar.gz and libY-2.0.tar.gz.

So how should we model this?

The original approach taken was: the project is named X, so in anitya we should name it X, and the SourceForge backend in anitya then lets us specify a SourceForge project so that we can search for X, libX or libY in the RSS feed of the X project on SourceForge. Problem: when adding libX or libY to anitya, the project name and homepage are always X and sourceforge.net/projects/X, while this pair is precisely what is used to make projects unique in anitya (in other words, adding libX and libY won't be allowed).

So this is the current situation and as you can see, it has problems (which explains the two issues reported).

What are the potential solutions?

1/ Extend the unique constraint

We could include the tarball name to search for in the unique constraint, which would then change from: name+homepage to name+homepage+tarball
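
The effect of widening the constraint can be illustrated with plain Python. This is not anitya's model or schema: `can_add`, the dictionaries and the key tuples are hypothetical and only mimic a uniqueness check.

```python
def can_add(existing, project, key_fields):
    """Return True if `project` does not clash with `existing` on `key_fields`."""
    key = tuple(project[field] for field in key_fields)
    return all(tuple(p[field] for field in key_fields) != key for p in existing)


CURRENT_KEY = ("name", "homepage")
EXTENDED_KEY = ("name", "homepage", "tarball")

homepage = "sourceforge.net/projects/X"
# Project X is already registered, tracking the X tarball.
existing = [{"name": "X", "homepage": homepage, "tarball": "X"}]
# Now someone wants to track libX, released by the same SourceForge project.
libx = {"name": "X", "homepage": homepage, "tarball": "libX"}

if __name__ == "__main__":
    print(can_add(existing, libx, CURRENT_KEY))   # clashes on name+homepage
    print(can_add(existing, libx, EXTENDED_KEY))  # distinct once tarball counts
```

With the wider key, two entries only clash when all three fields match, so X, libX and libY could coexist under the same name and homepage.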

2/ Invert the use of name and tarball

Instead of having the project name be X with a tarball name libX, we could make the project be libX and the tarball be X.

This sounds quite nice and easy, but looking at the projects currently in anitya's database, I found projects like:

        name         |                          homepage                           |                  tarball
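---------------------+-------------------------------------------------------------+--------------------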
 linuxwacom          | http://sf.net/projects/linuxwacom/                          | xf86-input-wacom
 brutalchess (alpha) | http://sourceforge.net/p/brutalchess                        | brutalchess-alpha
 chemical-mime       | http://sourceforge.net/projects/chemical-mime               | chemical-mime-data

So for these, the tarball name would become the project name and they would be pretty ugly.

I am not quite sure what the best approach for this is.

What do you think?

Monday, December 15 2014

Fedora 21 release day, 7 days later

On Tuesday of last week, we released the 21st version of Fedora. On the morning of the release we noticed that the load on some of the proxies was running very high, so we started checking our monitoring for the incoming traffic. A week later, this is an overview of the traffic on our proxies over the last ten days (so 3 days before the release and 7 days since).


The third one is quite impressive and looking at more of these graphs we can see a similar pattern where the traffic for F21 release really bumped on release day and the following two days and is now slowly recovering.

If you want to see more of these pretty pictures/graphs, check our collectd.

Friday, December 12 2014

Infra FAD 2014 - Part 2: Ansible

Part 1: MirrorManager

It has been two days since I came back and others have already reported about our progress (Ralph, kevin day 0 & 1, kevin day 2, kevin day 3, kevin day 4 and finally, kevin day 5) but I wanted to come back to it as well :)

So seven of us from the Fedora Infrastructure team met up in Raleigh in the Red Hat office there. We had Matt Domsch for the first couple of days to help us understand and grasp how MirrorManager works (see Part 1).

The second part of the FAD was dedicated to moving forward with the infrastructure task of migrating away from puppet in favor of Ansible. This led to the most productive week we have ever had on our Ansible git repo. I have been able to start porting things like varnish or haproxy while Ralph was doing the heavy lifting of porting the proxies themselves. Patrick worked on porting the nameservers and managed to actually re-install them using Ansible (and moving them to RHEL7 while at it). Smooge has been poking at the setup for fedorapeople.

With all that we also managed to get MirrorManager2 in staging, and Luke wrote some awesome unit-tests for mirrorlist which already allowed us to make some small optimizations.

All in all, I have to say that I have had a great time. I have the feeling that we achieved a lot of what we wanted to do and that we have been really efficient at it :-)

To remain critical about the organization: I think I agree with Ralph that for the next FAD we should be extra careful to really organise some sort of social event. We have had strange hours (having lunch at 3pm or even 5pm once) and the one afternoon where we said we would take off, we ended up working... Being involved in the organization while not on site makes it difficult to find something nice for the social event, but I think we/I should have tried harder to find something nice to do.

Anyway, like I said, I had a great time and I'm thankful to everyone who was able to make it to Raleigh, to the OSAS team at Red Hat that funded most of this FAD and to Ansible for inviting us for dinner on Friday evening :-)

Thanks a bunch folks!


Saturday, December 6 2014

Infra FAD 2014 - Part 1: MirrorManager

The last two days have been quite busy for the Fedora infrastructure team. Most of us are indeed meeting up in Raleigh, in the Red Hat tower downtown, and together with Matt Domsch, the original developer of MirrorManager, we have been working on MirrorManager2.

It was really great for us that Matt could join. MirrorManager is pretty straightforward in theory but also full of small details which can make it hard to understand fully. Having Matt with us allowed us to ask him as many questions as we wanted. We were also able to go with him through all the utility scripts and all the crons that make MirrorManager work.

The good surprise was that a significant part of the code was already converted for MirrorManager2, but we still found some crons and scripts that needed to be ported.

So after spending most of the first day on getting to understand and know more about the inner processes of MirrorManager, we were able to start working on porting the missing parts to MirrorManager2.

We also took the opportunity to discuss with Matt, Luke and David what things should look like for atomic, and Ralph was able to make the first changes to make this a reality :-)

So yesterday evening we had all the crons/scripts converted to MirrorManager2 (all but one; in fact that one isn't needed for MM2) \ó/

That was a good point to stop and go quickly to the Red Hat Christmas party before meeting Greg who invited us for a dinner sponsored by Ansible. We had a really nice meal and evening, thanks Greg, thanks Ansible!

Today started the second part of the FAD: Ansible, but more on that later ;-)

Saturday, August 9 2014

Flock 2014 - day 1 to 3

Today is the fourth day of flock. As usual the last three days have been really nice. I got to go to a number of interesting talks and could even present a couple of projects that I am or will be working on.

I attended Luke's talk on what pushing updates in Fedora will look like in the coming months. Bodhi 2 is the new version of the application we use to manage our updates in Fedora. Luke and others have been working hard on it and the work they did is really impressive! Bodhi 2 looks better from all angles: UI, Infra, Workflow. Apparently the timeline is to get it deployed before the end of the year but after the release of Fedora 21, so stay tuned, it's arriving ;-)

I was able to attend the presentation about python 3 in Fedora. I must say that this is looking promising and there are some new shiny things in python 3 that I am already looking forward to (most notably the possibility to have keyword-only arguments in functions, this is going to be sweet).
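
For readers who have not met keyword-only arguments yet, here is a minimal illustration. The function `publish` and its parameters are made up for the example; only the bare `*` syntax is the Python 3 feature in question.

```python
def publish(topic, *, pretty=False, retries=3):
    """Everything after the bare `*` can only be passed by keyword."""
    return (topic, pretty, retries)


if __name__ == "__main__":
    print(publish("fmn", pretty=True))  # accepted: `pretty` is named
    try:
        publish("fmn", True)            # rejected: `pretty` passed positionally
    except TypeError as err:
        print("TypeError:", err)
```

The call site then always spells out what each option means, which is what makes this attractive for functions with several boolean flags.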

On Thursday, I gave a presentation about the future Fedora Review Server (we couldn't find a better name for it and people seemed to like it :-)), more on that later.

The same day, Adimania presented his impressions and the state of things with regard to Ansible in the Fedora Infrastructure. I think it was a nice summary of why we are moving and what we like about Ansible.

Thursday afternoon, I went to the talk about NoSQL in Fedora Infrastructure. More than a state of things, it was a plea that we should consider and keep in mind the NoSQL technologies for the Infra and not fear using them where they make sense. Yograterol did a nice job presenting the different NoSQL technologies and clearly we should consider them where it makes sense. Thinking further about it with Ralph, we thought that using MongoDB for datagrepper might be interesting; we should benchmark this :)

Finally, yesterday I was able to present a little project I have been working on for a little while: progit. I will blog about this in the near future, so stay tuned ;-)

Then I attended the talk from Kevin about the present and future of the Fedora infrastructure. This was a good overview of the different irons we have in the fire at the moment and those that are near the fire but aren't too hot yet. One thing is sure, I am really looking forward to having our bugzilla hooked up on fedmsg!

The joint session on Fedora.next chaired by our dear FPL was also quite interesting and provided a very nice overview of what the different working groups are currently up to. It was nice to see things moving forward; even if some parts are still a little hazy, I guess it won't remain this way for long.

Yesterday afternoon, there was a session on EPEL.next. There are still a number of concerns and questions about how things could or should be in EPEL. Some things are good and some could be improved, and there are some generic ideas (such as having a new repo, EPIC, which would contain more rapidly evolving software or more recent versions of software compared to what is currently in EPEL), but there again the devil is in the details and there will need to be some more thought and work before we can see this live.

I guess this is it for the talks, I attended a few more but I can't possibly detail them all here :-)

Next time, more info on what we actually got done during these few days!

Thursday, January 30 2014

Fedocal upgrade

This morning I updated fedocal to its latest version, 0.4.2. This update brought quite a number of changes, among them:

  • New UI, closer to pkgdb2, nuancier and koji
  • Add location to meeting, so that #fedora-meeting should no longer be a calendar but a location
  • Improved list view
  • Improved window to add a meeting
  • Improved calendar view where the full day meetings are separated from the other meetings
  • Store the meeting in the specified timezone instead of UTC (this allows having a meeting at 2pm/14:00 Paris time all year long, regardless of DST)
  • Enable viewing the agenda in another timezone
  • Enable browsing the dates without using the small monthly calendar
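
The timezone item is easy to check with a few lines of Python. This is only an illustration and says nothing about how fedocal is implemented: it uses the `zoneinfo` module from the standard library of recent Python versions (3.9 and later), and `to_utc` is a hypothetical helper.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library since Python 3.9

PARIS = ZoneInfo("Europe/Paris")


def to_utc(year, month, day, hour, tz):
    """Interpret a wall-clock time in `tz` and convert it to UTC."""
    local = datetime(year, month, day, hour, tzinfo=tz)
    return local.astimezone(timezone.utc)


if __name__ == "__main__":
    # The same 14:00 Paris meeting, once in winter and once in summer.
    print(to_utc(2014, 1, 30, 14, PARIS))  # 13:00 UTC (CET, UTC+1)
    print(to_utc(2014, 7, 30, 14, PARIS))  # 12:00 UTC (CEST, UTC+2)
```

A meeting stored as a fixed UTC time would drift by an hour for Paris attendees when DST starts or ends; storing the timezone alongside the local time avoids that.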

I took the opportunity to re-generate the database to make sure all the fields were in sync with the DB model planned. The data was then copied over from the old DB to the new one, which gave some stats about fedocal:

10 calendars added
18 reminders added
236 meetings added

Check it out!

Monday, March 18 2013

Fedora-Infra: Did you know? -- pkgdb-cli

Did you know?

With pkgdb-cli you can give someone commit rights without her/him asking for it?

The command will look like:

pkgdb-cli update <package> <acl> <user> <branch> --approve

For example:

pkgdb-cli update packagedb-cli commit toshio devel --approve

Can be handy if you work in a team!

Monday, March 11 2013

Fedora-Infra: Did you know? -- copr-cli

Did you know?

You know copr, right? It is the tool that lets you easily create your own yum repo, building the RPMs in a new cloud instance every time. It allows you to give access to nightly builds or to a new version of a piece of software backported from rawhide.

Well, if you have been playing with the development instance, know that in the sources there is a CLI tool!

Some examples of what it can do if run directly from the sources:

$ python copr_cli/main.py -h
usage: copr-cli [-h] [--version] {create,list,build} ...

optional arguments:
  -h, --help           show this help message and exit
  --version            show program's version number and exit

    list               List all the copr of the provided
    create             Create a new copr
    build              Build packages to a specified copr

And a small demo:

$ python copr_cli/main.py list toshio
name                |description                                                                                           |repos |instructions
fas2                |RPMS for the Fedora Account System run in Fedora Infrastructure.                                      |      |
packagedb           |Package Database that manages ownership of packages for Fedora.  Includes both the client and server. |      |
flowhub             |flowhub adapts the gitflow workflow to github repositories                                            |      |
python-fedora-devel |                                                                                                      |      |

Some documentation on how to set up copr-cli is present in the README in the sources.

Monday, March 4 2013

Fedora-Infra: Did you know? -- apps.fedoraproject.org

Did you know?

You cannot remember what that application was that you saw on the Fedora infrastructure at the latest FUDCon?

You cannot find the URL of that specific application of the Fedora infrastructure?

You are looking for an overview of the applications run by the Fedora infrastructure?

apps.fedoraproject.org is there for you.


Developed by Ralph Bean, apps.fedoraproject.org gives you an overview of, an introduction to, and links to all the applications run by the Fedora infrastructure.

Monday, February 25 2013

Fedora-Infra: Did you know? -- HyperKitty sends email

Did you know?

Do you know about HyperKitty? It is the new interface for mailing-list archives, offering a forum-like experience for those who prefer forums over mailing lists.

The latest version (which you can see online on the development server) allows you to:

  • create a new thread in the list as you would create a new thread in a forum
  • reply to an email sent on the list as you would reply to a post in a forum
  • tag posts on the list
  • like/dislike posts sent on the list
  • get an overview of the list activity (see for the Fedora-devel mailing list)


Want more info?