Le blog de pingou

Saturday, December 6 2014

Infra FAD 2014 - Part 1: MirrorManager

The last two days have been quite busy for the Fedora infrastructure team. Most of us are meeting up in Raleigh, in the Red Hat tower downtown, and together with Matt Domsch, the original developer of MirrorManager, we have been working on MirrorManager2.

It was really great for us that Matt could join. MirrorManager is pretty straightforward in theory but also full of small details which can make it hard to understand fully. Having Matt with us allowed us to ask him as many questions as we wanted. We were also able to go with him through all the utility scripts and all the crons that make MirrorManager work.

The good surprise was that a significant part of the code was already converted for MirrorManager2, but we still found some crons and scripts that needed to be ported.

So after spending most of the first day on getting to understand and know more about the inner processes of MirrorManager, we were able to start working on porting the missing parts to MirrorManager2.

We also took the opportunity to discuss with Matt, Luke and David what things should look like for Atomic, and Ralph was able to make the first changes to make this a reality :-)

So by yesterday evening we had all the crons/scripts (all but one; in fact that one isn't needed for MM2) converted to MirrorManager2 \ó/

That was a good point to stop and go quickly to the Red Hat Christmas party before meeting Greg who invited us for a dinner sponsored by Ansible. We had a really nice meal and evening, thanks Greg, thanks Ansible!

Today started the second part of the FAD: Ansible, but more on that later ;-)

Thursday, November 27 2014

Python multiprocessing and queue

Every once in a while I want to run a program in parallel but gather its output in a single process so that I do not have concurrent accesses (think, for example, of several processes computing something and storing the output in a file or in a database). I could use locks for this but I figured I could also use a queue.

My problem is that I always forget how I do it and always need to search for it when I want to do it again :-) So for you as much as for me here is an example:

# -*- coding: utf-8 -*-

import itertools
from multiprocessing import Pool, Manager


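def do_something(arg):
    """ This function does something important in parallel but where we
    want to centralize the output, thus using the queue
    """
    data, myq = arg
    print(data)
    # Workers only produce; the consumer below acknowledges each item
    myq.put(data)


if __name__ == '__main__':
    data = range(100)
    m = Manager()
    q = m.Queue()
    p = Pool(5)
    # map() blocks until all the workers are done, so the queue is
    # completely filled once it returns
    p.map(do_something, itertools.product(data, [q]))
    p.close()
    p.join()

    with open('output', 'w') as stream:
        while not q.empty():
            print(q.qsize())
            item = q.get()
            print(item)
            stream.write('%s\n' % item)
            # Mark the item as processed so that q.join() can return
            q.task_done()
        q.join()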

There are probably other/better ways to do this but that's a start :-)
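
One possible alternative, as a minimal sketch: when the workers only need to hand back a value, the pool can return the results itself and the parent process is the only one writing them, so no queue is needed. The names compute, write_results and run are placeholders made up for this example.

```python
from multiprocessing import Pool


def compute(value):
    """Placeholder for the real work; this runs in a worker process."""
    return value * value


def write_results(results, stream):
    """Write each result on its own line and return how many were written."""
    count = 0
    for result in results:
        stream.write('%s\n' % result)
        count += 1
    return count


def run(values, stream, workers=5):
    """Compute in parallel, but write everything from the calling process."""
    with Pool(workers) as pool:
        # imap_unordered yields a result as soon as any worker is done, so
        # the order of the lines is not guaranteed.
        return write_results(pool.imap_unordered(compute, values), stream)
```

To use it, call run(range(100), stream) with an open file, from under an `if __name__ == '__main__':` guard so that worker processes can import the module safely.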

Wednesday, October 15 2014

Fedora-Infra: Did you know? The package information is now updated weekly in pkgdb2!

The package database pkgdb2 is the place where the permissions on the git repositories are managed.

In simple words, it is the place managing the "who is allowed to do what on which package".

For each package, when it is created, the summary, the description and the upstream URL from the spec file are added to the database, which allows us to display the information on the page concerning the package. However, until two weeks ago, this information was never updated. That means that if you had an old package whose description had changed over time, pkgdb would present the one from the time the package was created in the database.

Nowadays, we have a script running on a weekly basis and updating the database. Currently, this script relies on the information provided by yum's metadata on the rawhide repo. This means that packages that are only present in EPEL or that are retired on rawhide but present in F21, will not have their information updated. This is likely something we will fix in the future though.

In the meantime, you can now enjoy a pkgdb with summary and description information for almost all packages!

As an example, check out the fedocal page: you can now see a link to the upstream website, a short summary and a slightly longer description of the project.

Also, to give you a little hint on the amount of updates we did:

The first time we ran the script:

 16638 packages checked
 15723 packages updated

Last week's run:

 16690 packages checked
 50 packages updated

Saturday, August 9 2014

Flock 2014 - day 1 to 3

Today is the fourth day of Flock. As usual the last three days have been really nice. I got to go to a number of interesting talks and could even present a couple of projects that I am or will be working on.

I attended Luke's talk on what pushing updates in Fedora will look like in the coming months. Bodhi 2 is the new version of the application we use to manage our updates in Fedora. Luke and others have been working hard on it and the work they did is really impressive! Bodhi 2 looks better from all angles: UI, infra, workflow. Apparently the timeline is to get it deployed before the end of the year but after the release of Fedora 21, so stay tuned, it's arriving ;-)

I was also able to attend the presentation about Python 3 in Fedora. I must say that this is looking promising and there are some shiny new things in Python 3 that I am already looking forward to (most notably the possibility to have keyword-only arguments in functions, this is going to be sweet).

On Thursday, I gave a presentation about the future Fedora Review Server (we couldn't find a better name for it and people seemed to like it :-)), more on that later.

The same day, Adimania presented his impressions and the state of things with regard to Ansible in the Fedora Infrastructure. I think it was a nice summary of why we are moving and what we like about Ansible.

Thursday afternoon, I went to the talk about NoSQL in Fedora Infrastructure. More than a state of things, it was a plea that we should keep NoSQL technologies in mind for the Infra and not fear using them where they make sense. Yograterol did a nice job presenting the different NoSQL technologies. Thinking further about it with Ralph, we thought that using MongoDB for datagrepper might be interesting; we should benchmark this :)

Finally, yesterday I was able to present a little project I have been working on for a little while: progit. I will blog about this in the near future, so stay tuned ;-)

Then I attended the talk from Kevin about the present and future of the Fedora infrastructure. This was a good overview of the different irons we have in the fire at the moment and of those that are near the fire but not yet hot. One thing is sure, I am really looking forward to having our bugzilla hooked up to fedmsg!

The joint session on Fedora.next chaired by our dear FPL was also quite interesting and provided a very nice overview of what the different working groups are currently up to. It was nice to see things moving forward; even if some parts are still a little hazy, I guess it won't remain this way for long.

Yesterday afternoon there was a session on EPEL.next. There are still a number of concerns and questions about how things could or should be in EPEL. Some things are good and some could be improved. There are some generic ideas (such as having a new repo, EPIC, which would contain more rapidly evolving software or more recent versions of software compared to what is currently in EPEL), but there again the devil is in the details and more thought and work will be needed before we can see this live.

I guess this is it for the talks, I attended a few more but I can't possibly detail them all here :-)

Next time, more info on what we actually got done during these few days!

Friday, July 25 2014

The Joy of timezones

Today, I was looking at fedocal as I found out it could not import its own iCal files.

Well, to be exact, the import worked fine but then it was not able to display the meeting. The source of the issue is that the iCal output relies on timezone abbreviations such as EDT or CEST while fedocal actually expects timezone names such as US/Eastern or Europe/Paris.

So I went looking for a way to convert the acronyms to real timezones.

I finally ended up with the following script:

import pytz
from datetime import datetime

timezone_lookup = dict()
for tz in pytz.common_timezones:
    name = pytz.timezone(tz).localize(datetime.now()).tzname()
    if name in timezone_lookup:
        timezone_lookup[name].append(tz)
    else:
        timezone_lookup[name] = [tz]

for key in sorted(timezone_lookup):
    print key, timezone_lookup[key]

Which led me to discover things like:

  IST ['Asia/Colombo', 'Asia/Kolkata', 'Europe/Dublin']

The Indian Standard Time and the Irish Standard Time have the same acronym

but also:

  EST ['America/Atikokan', 'America/Cayman', 'America/Jamaica', 'America/Panama', 'Australia/Brisbane', 'Australia/Currie', 'Australia/Hobart', 'Australia/Lindeman', 'Australia/Melbourne', 'Australia/Sydney']

So how to handle this?

The only solution I could come up with is relying on both the acronym and the offset between that timezone and UTC.

Adjusted script:

import pytz
from datetime import datetime

timezone_lookup = dict()
for tz in pytz.common_timezones:
    name = pytz.timezone(tz).localize(datetime.now()).tzname()
    offset = pytz.timezone(tz).localize(datetime.now()).utcoffset()
    key = (name, offset)
    if key in timezone_lookup:
        timezone_lookup[key].append(tz)
    else:
        timezone_lookup[key] = [tz]

for key in sorted(timezone_lookup):
    print key, timezone_lookup[key]

And corresponding output:

...
('EST', datetime.timedelta(-1, 68400)) ['America/Atikokan', 'America/Cayman', 'America/Jamaica', 'America/Panama']
('EST', datetime.timedelta(0, 36000)) ['Australia/Brisbane', 'Australia/Currie', 'Australia/Hobart', 'Australia/Lindeman', 'Australia/Melbourne', 'Australia/Sydney']
...
('IST', datetime.timedelta(0, 3600)) ['Europe/Dublin']
('IST', datetime.timedelta(0, 19800)) ['Asia/Colombo', 'Asia/Kolkata']
...

So much fun...
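
To make the idea concrete, here is a minimal sketch of how such a lookup could be used to resolve an (acronym, offset) pair back to timezone names. It is not fedocal's actual code: build_lookup is a name made up for this example, and the three zones are fixed-offset stand-ins from the standard library so that the snippet runs without pytz. Passing real pytz zones in the same dictionary shape should work too, since pytz zones support astimezone().

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def build_lookup(zones, when):
    """Map (abbreviation, UTC offset) to the sorted zone names using it.

    `zones` maps a zone name to a tzinfo object, `when` is an aware datetime.
    """
    lookup = defaultdict(list)
    for name, tz in zones.items():
        local = when.astimezone(tz)
        lookup[(local.tzname(), local.utcoffset())].append(name)
    return {key: sorted(names) for key, names in lookup.items()}


# Fixed-offset stand-ins (assumption: real code would pass pytz zones).
zones = {
    'Europe/Dublin': timezone(timedelta(hours=1), 'IST'),
    'Asia/Kolkata': timezone(timedelta(hours=5, minutes=30), 'IST'),
    'Asia/Colombo': timezone(timedelta(hours=5, minutes=30), 'IST'),
}
when = datetime(2014, 7, 25, 12, 0, tzinfo=timezone.utc)
lookup = build_lookup(zones, when)

# The acronym alone is ambiguous, the (acronym, offset) pair is not.
print(lookup[('IST', timedelta(hours=1))])
print(lookup[('IST', timedelta(hours=5, minutes=30))])
```

Even with the offset, a key can still map to several zone names, so the caller has to pick one of them or ask the user.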

Wednesday, July 23 2014

New package, new branch, new workflow?

If you are a Fedora packager, you are probably aware of the new pkgdb.

One question which has been raised by this new version is: should we change the process to request new branches or integrate new packages in the distribution?

The discussion has occurred on the rel-eng mailing list but I'm gonna try to summarize here what the process is today and what it might become in the coming weeks.

Current new-package procedure:
  1. packager opens a review-request on bugzilla
  2. reviewer sets the fedora-review flag to ?
  3. reviewer does the review
  4. reviewer sets the fedora-review flag to +
  5. packager creates the scm-request and sets fedora-cvs flag to ?
  6. cvsadmin checks the review (check reviewer is a packager)
  7. cvsadmin processes the scm-request (create git repo, create package in pkgdb)
  8. cvsadmin sets fedora-cvs flag to +
New procedure:
  1. packager opens a review-request on bugzilla
  2. reviewer sets the fedora-review flag to ?
  3. reviewer does the review
  4. reviewer sets the fedora-review flag to +
  5. packager goes to pkgdb2 to request new package (specifying: package name, package summary, package branches, bugzilla ticket)
  6. requests added to the scm admin queue
  7. cvsadmin checks the review (check reviewer is a packager¹)
  8. cvsadmin approves the creation of the package in pkgdb
  9. package creation is broadcast on fedmsg
  10. fedora-cvs flag set to + on bugzilla
  11. git adjusted automatically

Keeping the fedora-cvs flag in bugzilla allows us to perform a regular (daily?) check that there are no reviews with the fedora-review flag set to + that have been approved in pkgdb and whose fedmsg message hasn't been processed.
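
Such a check could be sketched as follows. This is only an illustration under stated assumptions: the record layout (bug_id, fedora_review, pkgdb_approved, fedora_cvs) and the bug numbers are invented for the example and do not come from bugzilla or pkgdb.

```python
def unprocessed_requests(requests):
    """Return the bug ids approved in pkgdb whose fedora-cvs flag is not '+'."""
    return sorted(
        req['bug_id']
        for req in requests
        if req['fedora_review'] == '+'
        and req['pkgdb_approved']
        and req['fedora_cvs'] != '+'
    )


# Hypothetical records, as a daily job might assemble them.
requests = [
    {'bug_id': 1001, 'fedora_review': '+', 'pkgdb_approved': True,
     'fedora_cvs': '+'},   # fully processed
    {'bug_id': 1002, 'fedora_review': '+', 'pkgdb_approved': True,
     'fedora_cvs': '?'},   # approved, but the fedmsg message was missed
    {'bug_id': 1003, 'fedora_review': '?', 'pkgdb_approved': False,
     'fedora_cvs': ''},    # still under review
]

print(unprocessed_requests(requests))
```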

Looking at the numbers, it looks like there are more steps in the new procedure, but eventually most of them can be automated.

New branch process

For new branches, the process would be very similar:

  1. packager goes to pkgdb2 to request new branch
  2. requests added to the scm admin queue
  3. cvsadmin checks the request (requester is a packager...)
  4. cvsadmin approves the creation of the branch in pkgdb
  5. branch creation is broadcast on fedmsg
  6. git adjusted automatically

Tuesday, July 8 2014

1 year

Today is the first anniversary of the day we said good-bye to a good friend.

There have been a number of tributes in the couple of months following his passing, and there still are some once in a while. Personally, I hardly spend a week without remembering him or asking myself "What would Seth say?".

Good bye old friend, may your wisdom lead us.

Thursday, June 26 2014

Faitout, 1000 sessions

A while back, I introduced faitout on this blog.

Since then I have been using it to test most if not all of the projects I work on. I basically use the following set-up:

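# Default to an in-memory sqlite database, used when faitout is unreachable
DB_PATH = 'sqlite:///:memory:'
FAITOUT_URL = 'http://209.132.184.152/faitout/'
try:
    import requests
    req = requests.get('%s/new' % FAITOUT_URL)
    if req.status_code == 200:
        DB_PATH = req.text
        print('Using faitout at: %s' % DB_PATH)
except Exception:
    # No requests module or no network: keep the sqlite fallback
    pass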

This way, if I have network, the tests are run with faitout and thus against a real postgresql database while if I do not have network, they run against a sqlite in memory database.

This set-up allows me to work offline and still be easily able to run all the unit-tests as I change the code.
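
The same fallback can be wrapped in a function with a timeout, so that a slow network does not hang the test suite. This is a minimal sketch using only the standard library rather than requests; get_db_url, the opener parameter, the faitout.example URL and the fake connection string are all assumptions made for the example.

```python
import io
import urllib.request

DEFAULT_DB = 'sqlite:///:memory:'


def get_db_url(faitout_url, timeout=5, opener=urllib.request.urlopen):
    """Return a faitout connection URL, or the sqlite fallback when offline."""
    try:
        with opener(faitout_url.rstrip('/') + '/new', timeout=timeout) as resp:
            if resp.status == 200:
                return resp.read().decode('utf-8').strip()
    except (OSError, ValueError):
        # URLError and socket timeouts are OSError subclasses
        pass
    return DEFAULT_DB


class FakeResponse(io.BytesIO):
    """Stand-in for an HTTP response, used only for the demo below."""
    status = 200


def offline(url, timeout):
    """Simulate a machine without network."""
    raise OSError('network unreachable')


def online(url, timeout):
    """Simulate a faitout answer (the connection string is made up)."""
    return FakeResponse(b'postgresql://user:password@host/dbname\n')


print(get_db_url('http://faitout.example/', opener=offline))
print(get_db_url('http://faitout.example/', opener=online))
```

Injecting the opener keeps the function testable without any network access, which is the whole point of the fallback in the first place.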

The point of this post was actually more to announce the fact that despite its limited spread (only 25 different IP addresses have requested sessions), the tool is used and has already reached 1,000 sessions created (and dropped) in less than a year.



If you're not using it, I am inviting you to have a look at it, I find it marvelous in combination with Jenkins and it does help finding bugs in your code.

If you are using it, congrats and keep up the good work!!

Tuesday, June 17 2014

Fedocal 0.7

This morning I released fedocal version 0.7.1.

With this version come a number of new features which I thought would be nice to advertise a little :-)

The main calendar view & the menu

The main calendar view has had two additions:

  • a pop-up stipulating if there are meetings present that week that are not displayed in the current window (for example, if you're seeing the meetings from 8am to 6pm and there is a meeting at 7pm, or at 4am).
  • shortcuts to interact more easily with the calendar. These shortcuts cover three actions: add a meeting, switch to list/calendar view, iCal feed.

The menu now highlights the calendar you are looking at to make things easier on you.

popup-highlight-shortcuts.png

The list view

When viewing a calendar as a list, fedocal will now automatically scroll down to display today's meetings or the future meetings.

In addition, this page also has the three shortcut buttons mentioned above (add meeting, switch view mode, iCal feed).

autoscroll-shortcuts.png

The detail view

We have added three new features in the page showing the details of a meeting:

  • permalink: when the user clicks on the pop-up showing the details of a meeting, the url is updated to provide a permalink to that specific meeting. This allows one to copy/paste the url and send it to someone.
  • countdown: with help from mpduty we have added a countdown in the meeting detail view showing the remaining time before the meeting starts. This can nicely circumvent the timezone conversion if you are not logged in to fedocal and want to know when a meeting starts.
  • UTC titles: if you hover over the dates/times with your mouse, the date/time will be shown in UTC, which is always handy as in our community UTC remains quite often the most used way to communicate date/time.

detail_view2.png



I would like to take the opportunity here to thank kparal, ralph, willo, red and lbrabec for their bug reports and RFEs that led to all these changes, which I think make fedocal 0.7.1 its best release so far :-)

Friday, June 6 2014

Small update on dgroc

It has been a little while since I last spoke about dgroc, the daily git rebuild on copr program.

The problem is that it kinda already lost its name...

Thanks to the great work of Miro Hrončok dgroc now supports mercurial as well :)

Miro took that opportunity to add a generic structure making it easy for other source configuration management software to be added!

Miro also fixed dgroc to take into account the release number used and automatically bump it upon rebuild.

So if you wanted to use dgroc but could not because your project was using mercurial, well now you have no excuses anymore!

Friday, April 11 2014

Presentations at FOSDEM and DevConf 2014

This year I attended both FOSDEM and DevConf and at both conferences I was given the opportunity to give a presentation.

At FOSDEM, together with the Debian developer Nicolas Dandrimont, we gave a presentation about fedmsg for both the Fedora and the Debian infrastructure.

At DevConf, I gave a presentation about automation in Fedora land, presenting all the tools available to our developers to help them do their best work.



Both presentations have been made available now :)

Thursday, April 10 2014

Back on LGM 2014

Last week I had the opportunity to attend my first Libre Graphics Meeting, this year located in Leipzig (Germany). Not being much of a graphics person, I must say that I was sometimes a bit lost in some of the talks (be it during a presentation or at the coffee corner), but on the other hand I learned a lot! I discovered a whole new side of the Open-Source Software community working on low-level tools and algorithms for image and video manipulation. Meeting these people with such a deep understanding of computer science and photography, for example, was an extremely enriching experience.

I have learned about image manipulation and the cool effects provided in gimp by the g'mic project. I have met some of the devs of the darktable and inkscape projects, these guys are doing tremendous work, kudos!

On the Friday, we had a presentation about the gooseberry project by the Blender foundation. If you have not pledged to help them, go do it now! Their project is amazing and needs more help!

Another amazing talk was Sebastian Koenig presenting his work on reviving a medial manikin with Blender. Basically the story is that a museum in Leipzig (the GRASSI Museum für Angewandte Kunst Leipzig) had this old manikin (22.5cm tall) which used to be animated: you could move her arms, legs, fingers or toes, but it got old and stuck. So the museum, in collaboration with the university, did some sort of scan of the manikin and they asked Sebastian Koenig to see if he could reconstruct the manikin using Blender. The resulting movie is now on display in the museum next to the actual manikin. I found this research amazing both from a technical and a historical point of view! And as icing on the cake, since they had the mesh to reconstruct the animation in Blender, they have also been able to make a 3D printed replica of the manikin, giving it back its ability to move.

On the last day, I was able to attend a workshop offered by Tobias Ellinghaus, a darktable developer, and Patrick David about image manipulation in darktable and gimp. I had to leave before the end but the workshop was really, really interesting. Patrick did a live image editing demo in gimp, performing it on a photo taken just a few minutes before. I need to practice this a little bit otherwise I'm going to forget all the good tips that were given.

I also discovered some cool projects related to DSLRs such as Magic Lantern, which provides a new OS for Canon DSLRs, awesome right? There is also the entangle project, which lets you remotely control your DSLR, quite handy for macro or astro photos.

I was also given the opportunity to contribute to the party. gnokii and I gave a presentation about nuancier as a FOSS contributing and voting application for wallpapers. I think it was well-received and there is apparently already interest in updating it to support fonts. In addition, I took the opportunity, during a lightning talk session, to present HyperKitty and its demo instance, which also seems to have attracted quite some interest.



All in all, this was my first LGM; I learned a lot about the whole libre graphics ecosystem, met a lot of new people and was given the opportunity to introduce a couple of projects dear to me. I really enjoyed it and would recommend it to anyone interested, even remotely, in libre graphics.

Thank you to everyone who helped me attend or enjoy it (Gnokii, Ryan, Garrett, Chris and all the others ;-))!

Thursday, March 20 2014

Introducing dgroc

dgroc stands for "Daily Git Rebuild On Copr".

copr is a build system made publicly available to Fedora contributors, allowing them to provide package repositories for packages that are not or cannot be part of the standard Fedora repositories. There are multiple reasons a package may be allowed in copr but not in the standard repositories, for example:

  • bundled libraries in the sources that have not been cleaned
  • unstable version
  • version introducing too many changes to be introduced to a stable Fedora release
  • packages that are in the process of being integrated into Fedora but have not yet been approved


The use-case for dgroc is the second point on this list: unstable version.

I know some of us out here are crazy testers, and for two projects I was interested in having daily builds, which allow easy install/update (just run yum/dnf) and easy testing.

What dgroc does is provide an easy way to automatically build packages on copr from a git repository.

It works fairly simply:

  • Create a ~/.config/dgroc file and include in it some basic, generic information that will be needed either to update the spec file, make the source rpm available or build on copr:
[main]
username = me
email = my_email@example.com
copr_url = https://copr.fedoraproject.org/
upload_command = cp %s /var/www/html/subsurface/
upload_url = http://my_server/subsurface/%s
#no_ssl_check = True # no longer needed now that copr has a valid ssl cert


  • Then for each project you have to define at least three pieces of information, for example for subsurface:
[subsurface]
git_url = git://subsurface.hohndel.org/subsurface.git
git_folder = /tmp/subsurface/
spec_file = ~/dgroc/subsurface.spec

Optionally, you can specify a patch_files argument that will be a comma-separated list of patches that are needed to build the project.
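
To illustrate how a file with this layout can be read, here is a minimal sketch using Python's configparser. It is not taken from dgroc's sources, and the source rpm file name is hypothetical. The one detail worth noting is that the %s placeholders require interpolation to be disabled.

```python
import configparser

CONFIG = """\
[main]
username = me
email = my_email@example.com
copr_url = https://copr.fedoraproject.org/
upload_command = cp %s /var/www/html/subsurface/
upload_url = http://my_server/subsurface/%s

[subsurface]
git_url = git://subsurface.hohndel.org/subsurface.git
git_folder = /tmp/subsurface/
spec_file = ~/dgroc/subsurface.spec
"""

# interpolation=None keeps the literal %s placeholders; the default
# interpolation would reject them when the value is read.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(CONFIG)

# Every section other than [main] describes one project.
projects = [name for name in parser.sections() if name != 'main']
for name in projects:
    section = parser[name]
    # patch_files is optional: a comma-separated list of patches
    raw_patches = section.get('patch_files', '')
    patches = [p.strip() for p in raw_patches.split(',') if p.strip()]
    print(name, section['git_url'], patches)

srpm = 'subsurface-0.1-1.src.rpm'  # hypothetical file name
upload = parser['main']['upload_command'] % srpm
print(upload)
```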

All that dgroc does from there is:

  • clone the git repo if it is not already in the filesystem
  • run a git pull to get the latest changes
  • generate a new tarball (in the rpm %_sourcedir)
  • update the spec file (release, source0 and changelog)
  • generate the source rpm
  • move that source rpm somewhere to make it available to copr (see the upload_command in the config file)
  • start the build on copr



I have been running dgroc for both subsurface and guake and it seems to work fine :)

The project isn't packaged yet but I thought I would announce it in case there are people interested in testing it and reporting bugs and RFEs.

Hope you like it! :)

Tuesday, February 4 2014

Evolution of our contributors in Fedora

As you may know, Fedora is undergoing a rather large change with the Fedora.Next proposition/evolution. One of the points that Fedora.Next addresses is the loss of users observed in Fedora for a few years.

The statistics page on the wiki as well as my own representation of the same numbers are both outdated so we don't really have a clear view on this.

However, since October 2012, all the messages sent onto the fedmsg bus are being stored in datanommer. And that's information we can use to see how we're doing with regards to our contributors.

I asked and got a dump of the datanommer database (the data is anyway publicly accessible in datagrepper) and ran my traditional script on it to gather some numbers.

I generated on a daily, weekly and monthly basis the graph of the number of (distinct) active contributors we have.

Here are the results :-)

contributors_daily_stats.png

contributors_weekly_stats.png

contributors_monthly_stats.png

Three interesting periods:

  • the period without any messages in March 2013 is the period where the bus was down and we did not realize it (since then we have a nagios check for the bus)
  • on the daily graph you can really see the dip created by Christmas and its holidays
  • on the monthly graph, the peak in August 2013 coincides with Flock and the launch of badges!
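
For reference, counting distinct active contributors per day, week or month can be sketched like this. It is not the script used for the graphs: the (timestamp, user) message layout and the user names are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime


def period_key(timestamp, period):
    """Return the label of the day, ISO week or month of a timestamp."""
    if period == 'daily':
        return timestamp.strftime('%Y-%m-%d')
    if period == 'weekly':
        year, week, _ = timestamp.isocalendar()
        return '%d-W%02d' % (year, week)
    if period == 'monthly':
        return timestamp.strftime('%Y-%m')
    raise ValueError('unknown period: %s' % period)


def active_contributors(messages, period):
    """Count the distinct users seen in each period."""
    users = defaultdict(set)
    for timestamp, user in messages:
        users[period_key(timestamp, period)].add(user)
    return {key: len(names) for key, names in sorted(users.items())}


# Hypothetical messages: (timestamp, user)
messages = [
    (datetime(2014, 1, 27, 9, 0), 'alice'),
    (datetime(2014, 1, 27, 10, 0), 'bob'),
    (datetime(2014, 1, 28, 9, 0), 'alice'),
    (datetime(2014, 2, 3, 9, 0), 'carol'),
    (datetime(2014, 2, 3, 11, 0), 'alice'),
]

for period in ('daily', 'weekly', 'monthly'):
    print(period, active_contributors(messages, period))
```

Using a set per period is what makes the count "distinct": a user sending a thousand messages in a day still counts once for that day.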


Looking at the graph generated by this week in fedora, in December we launched COPR and it seems that the number of posts on the planet has also quite increased in December and January. Does this alone explain the bump we observe here?

Monday, February 3 2014

Dynamic point of contact assignment (2)

Recently I spoke about dynamic point of contact assignment in pkgdb2, I had generated some global stats about general changes but nothing specific, so here is a little more information about the impact this would have for packagers:

 652 packagers lose at least one package
 In average they lose 6.82668711656 packages
 392 packagers gain at least one package
 In average they gain 11.7168367347 packages
 
 Top 10 packagers losing packages:
    iarnell loses 305 packages : 385 -> 80
    spot loses 185 packages : 348 -> 163
    jplesnik loses 173 packages : 214 -> 41
    than loses 167 packages : 185 -> 18
    steve loses 129 packages : 137 -> 8
    eseyman loses 97 packages : 190 -> 93
    psabata loses 76 packages : 150 -> 74
    jwrdegoede loses 70 packages : 173 -> 103
    corsepiu loses 53 packages : 112 -> 59
    mathstuf loses 51 packages : 63 -> 12
 
 Top 10 packagers gaining packages:
    ppisar gains 1353 packages : 295 -> 1648
    rdieter gains 241 packages : 135 -> 376
    kalev gains 127 packages : 43 -> 170
    limb gains 123 packages : 234 -> 357
    pghmcfc gains 114 packages : 176 -> 290
    pbrobinson gains 106 packages : 58 -> 164
    remi gains 101 packages : 212 -> 313
    mizdebsk gains 97 packages : 54 -> 151
    spot gains 93 packages : 348 -> 441
    petersen gains 75 packages : 141 -> 216

Graphically this is how it looks:

packages_lost_per_packagers.png

packages_won_per_packagers2.png

What about the relative loss/gain? Instead of the number of packages, let's look at the % of packages lost/gained relative to the number of packages for which this user is currently the POC:

 Top 10 packagers losing packages (in %):
    <... 142 packagers...> lose 100.0% packages
    steve loses 94% packages
    than loses 90.2% packages
    silfreed loses 90% packages
    packaging-team loses 89% packages
    yyang, sxw lose 87.5% packages
    amdunn, rocha, kylev lose 85% packages
    pvrabec, huwang, jdornak, hvad lose 83% packages
    timn loses 82% packages
    mathstuf loses 81% packages
 
 Top 10 packagers gaining packages (in %):
    paragn gains 1325.0% packages
    atkac gains 1100.0% packages
    crobinso gains 1000.0% packages
    hegdevasant gains 700.0% packages
    aalvarez gains 550.0% packages
    ppisar gains 459% packages
    airlied gains 433% packages
    pmachata gains 417% packages
    jkastner, jpeeler gain 400% packages
    adamwill gains 350% packages
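
As a quick worked example of these percentages, here is a small sketch; relative_change is a name made up for this example, and the formula is the usual relative change, which is consistent with the figures listed above for ppisar (295 -> 1648) and steve (137 -> 8).

```python
def relative_change(before, after):
    """Percentage of packages gained (positive) or lost (negative)."""
    if before == 0:
        raise ValueError('no package to compare against')
    return 100.0 * (after - before) / before


print(round(relative_change(295, 1648)))  # ppisar
print(round(relative_change(137, 8)))     # steve
```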

Below is a graphical distribution of the packagers by the percentage of packages they lose:

packages_lost_per_packagers_pc.png

So from this we can conclude:

  • We have two packagers standing out with regards to the number of packages they would gain with a dynamic POC assignment
  • We have one packager standing out as it appears he/she has a lot of packages but is not working on them so much anymore
  • Some packagers are both losing and gaining packages
  • We need to filter out the groups maintaining packages, but with the move to pkgdb2 it will be easier to filter them out
  • Quite a number of people are losing all their packages with dynamic POC assignment. There are also two peaks around 50% and 33%.
  • Amusingly, ppisar, who gains the most packages, is not the one gaining the most packages relative to the number of packages he already owns


As before, I put the script I wrote to gather these stats on my cgit.

Friday, January 31 2014

Conferences & talks -- Part 1: FOSDEM 2014

February is always a busy month and this year will be no exception.

Later today I am leaving for Brussels to attend FOSDEM over the week-end, with as an added bonus a presentation on Sunday.

The presentation will be about fedmsg and its ecosystem, and one particularity of this talk is that it will be given together with Nicolas Dandrimont (his blog) who is a Debian developer.

Think about how awesome it is to have in the distribution track a talk about a technology and its possibilities given by two people from two different distributions.

I must say I am looking forward to this presentation :-)

Thursday, January 30 2014

Fedocal upgrade

This morning I updated fedocal to its latest version, 0.4.2. This update brought quite a number of changes, among them:

  • New UI, closer to pkgdb2, nuancier and koji
  • Add location to meeting, so that #fedora-meeting should no longer be a calendar but a location
  • Improved list view
  • Improved window to add a meeting
  • Improved calendar view where the full day meetings are separated from the other meetings
  • Store the meeting in the specified timezone instead of UTC (allows having a meeting at 2pm/14:00 Paris time all year long, despite DST)
  • Enable viewing the agenda in another timezone
  • Enable browsing the dates without using the small monthly calendar



I took the opportunity to re-generate the database to make sure all the fields were in sync with the DB model planned. The data was then copied over from the old DB to the new one, which gave some stats about fedocal:

10 calendars added
18 reminders added
236 meetings added



Check it out!

Thursday, January 23 2014

Dynamic point of contact assignment

Recently, while working on pkgdb2 I had an RFE for "dynamic" ownership.

The idea was to automatically change the owner based on who is actually working on the package.

With the change from "owner" to "point of contact" of a package, I thought that this might be an interesting idea. Of course in order to assess the feasibility and to investigate if it is really a good idea, we need some stats.

So I wrote a script that retrieves all the packages present in rawhide in Fedora. For each package it takes the last 100 actions (git commits, koji builds and bodhi updates) and orders the contributors from the most to the least active. The script then checks the most active user versus the owner/point of contact of the package.

Here is the output after running for 6h35:

14546 packages retrieved
14546 packages checked
85 packages w/ no package information
2877 packages w/ ausil as active point of contact
7132 packages won't change their point of contact
4451 packages will change of point of contact

I had to set ausil apart as he is the one doing the Mass-Rebuild and as such would become the point of contact of too many packages that have no other activity than Mass-Rebuild.

I still have the matrix of data available to extract more information about the distribution of the packages among the packagers but I thought I would share this first.
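
The core of the comparison can be sketched in a few lines. This is not the actual script: suggest_poc is a name made up for this example, the user names other than ausil are invented, and the actions are assumed to be a list of user names ordered from oldest to newest.

```python
from collections import Counter


def suggest_poc(actions, current_poc, ignored=()):
    """Return the most active user over the last 100 actions.

    Users in `ignored` (for example whoever runs the mass rebuilds) are
    skipped; if nobody else is left, the current point of contact is kept.
    """
    counts = Counter(user for user in actions[-100:] if user not in ignored)
    if not counts:
        return current_poc
    return counts.most_common(1)[0][0]


# Hypothetical history of one package
actions = ['alice'] * 3 + ['ausil'] * 5 + ['bob'] * 4

print(suggest_poc(actions, 'alice'))
print(suggest_poc(actions, 'alice', ignored={'ausil'}))
print(suggest_poc(['ausil'] * 2, 'alice', ignored={'ausil'}))
```

In this sketch a package changes its point of contact whenever the returned user differs from the current one. What to do when two users are equally active is left open here.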



What do you think?

Wednesday, December 18 2013

Fedora packagers activity

Following up on the thoughts about activity on our packages using the last build date I was curious to investigate the activity of our packagers.

So here again, I wrote a script that uses FAS to retrieve the list of people in the packager group. For each of these people, it then queries datagrepper for their last fedmsg message, thus retrieving the date of their last activity.

Graphically it looks like this: On the X axis is presented the number of packagers whose last activity was on that day, on the Y axis is how many days ago that day was.

last_packager.png

Converted to a log scale, we get: On the X axis is the log of the number of packagers whose last activity was on that day, on the Y axis is how many days ago that day was.

last_packager_log.png

On both graphs the peak at the end represents the number of packagers for which no activity could be found on datagrepper.



To provide some more numbers:

  • There are 1476 users in the packager group
  • 224 were active today (day 0)
  • 878 (59.5%) were active over the last 30 days
  • 386 (26.2%) were not active for the last 100 days
  • 296 (20%) were not active for the last 200 days
  • 253 (17%) had no activity registered by fedmsg.
  • The oldest activity registered is from 308 days ago.
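
As a rough sketch of how such numbers can be derived once the date of the last activity of each packager is known (the user names and dates below are invented, and None stands for "no activity found"):

```python
from datetime import date


def days_since(last_seen, today):
    """Map each user to the age in days of their last activity (or None)."""
    return {
        user: (today - day).days if day is not None else None
        for user, day in last_seen.items()
    }


def share_active(ages, within):
    """Percentage of users whose last activity is at most `within` days old."""
    active = sum(
        1 for age in ages.values() if age is not None and age <= within
    )
    return 100.0 * active / len(ages)


# Hypothetical data
today = date(2013, 12, 18)
last_seen = {
    'alice': date(2013, 12, 18),
    'bob': date(2013, 12, 1),
    'carol': date(2013, 6, 1),
    'dave': None,
}

ages = days_since(last_seen, today)
print(ages)
print(share_active(ages, 30))
```

Applied to the real data, the same ratio gives the 59.5% above: 878 active packagers out of 1476.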

Tuesday, December 17 2013

Fedora package build history

Recently I have been thinking about a way to do mass-rebuild but only of packages that have not been built in a while (since the last release?).

At the moment, we only do mass-rebuild when there is a specific need to, for example a new version of GCC.

This is a very specific process which is run over multiple days and just rebuilds all the packages. As a result, some packages that are of very low maintenance may just sit around, untouched until the next mass-rebuild.

I was wondering if we could not simply take all the packages on rawhide and, say once a month (or once a week, every day?), check when their last successful build was and, if it is older than X (to be defined), do a simple scratch build of the package. We could query koji or fedmsg via datagrepper to get the date of the last successful build of the package.

So technically it is doable, in theory it makes sense, but the question is: in practice, does it?

The first check to assess this is simply looking at the distribution of the last successful build dates of the packages.

So I wrote a small script querying the packagedb to get the list of all the packages and then querying datagrepper to retrieve the date of the last successful build. The number of days between this date and today is then computed and the output provides the number of packages that were last rebuilt on each day.

Graphically it looks like this: On the X axis is presented the number of packages built on that day, on the Y axis is how many days ago that day was.

last_build.png

Converted to a log scale, we get: On the X axis is a log of the number of packages built on that day, on the Y axis is how many days ago that day was.

last_build_log.png

To provide some more statistics:

  • 14397 packages were checked
  • 49 packages were built yesterday (day 0, when the data was gathered),
  • 1 package has not been successfully built since 271 days ago
  • 66 packages have not been successfully re-built for 200 days or more
  • 11418 packages have not been successfully re-built for 100 days or more
  • The two peaks that can be seen are from 132 and 133 days ago (last mass-rebuild?)
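
The selection step of the idea (finding the packages whose last successful build is older than X days) could look like the sketch below. The package names and build dates are invented, and actually starting the scratch builds is left out.

```python
from datetime import date, timedelta


def stale_packages(last_builds, today, max_age_days):
    """Return the packages last built more than max_age_days ago."""
    limit = today - timedelta(days=max_age_days)
    return sorted(
        name for name, built in last_builds.items() if built < limit
    )


# Hypothetical last successful build dates
today = date(2013, 12, 17)
last_builds = {
    'foo': date(2013, 12, 16),  # built yesterday
    'bar': date(2013, 8, 6),    # 133 days ago
    'baz': date(2013, 3, 21),   # 271 days ago
}

print(stale_packages(last_builds, today, 100))
print(stale_packages(last_builds, today, 200))
```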



Is this something worth pursuing? Should we automatically re-build packages after a while and report in case the build fails?

What do you think?
