Recently I have been thinking about a way to do mass-rebuild but only of packages that have not been built in a while (since the last release?).

At the moment, we only do mass-rebuild when there is a specific need to, for example a new version of GCC.

This is a very specific process which is ran over multiple days and just rebuilds all the packages. As a results, some packages that are of very low maintenance may just seat around, un-touched until the next mass-rebuild.

I was wondering if we could not simply take all the packages on rawhide and run, say once a month (or once a week, every day?), check when their last successfull build was and if older than X (to be defined), do a simple scratch build of the package. We could query koji or fedmsg via datagrepper to get the date of the last successful build of the package.

So technically it is duable, in theory it makes sense but the question is, in practice does it?

The first check to assess this is simply looking at the distribution last successfull dates of the packages.

So I wrote a small script querying the packagedb to get the list of all the packages and then queries datagrepper to retrieve the date of the last successful build. The number of days between this date and today is then computed and the output provides the number of packages that have been rebuild on each day.

Graphically it looks like this: On the X axis is presented the number of packages built on that day, on the Y axis is how many days ago that day was.

last_build.png

Converted to a log scale, we get: On the X axis is a log of the number of packages built on that day, on the Y axis is how many days ago that day was.

last_build_log.png

To provide some more statistics:

  • 14397 packages were checked
  • 49 packages were built yesterday (day 0, when the data was gathered),
  • 1 package has not been successfully built since 271 days ago
  • 66 packages have not been sucessfully re-built for 200 days or more
  • 11418 packages have not been sucessfully re-built for 100 days or more
  • The two peaks that can be seen are from 132 and 133 days ago (last mass-rebuild?)



Is this something worth persuing? Should we automatically re-build packages after a while and report in case the build fails?

What do you think?