About SourceForge and anitya
By Pierre-Yves on Wednesday, February 25 2015, 10:17 - General - Permalink
There are a couple of reports (1 and 2) about anitya not doing its job properly for projects hosted on sourceforge.net.
So here is a summary of the situation:
A project X on sourceforge.net, for example with the homepage sourceforge.net/projects/X, releases multiple tarballs named X-1.2.tar.gz, libX-0.3.tar.gz and libY-2.0.tar.gz.
So how should we model this?
The original approach was: the project is named X on SourceForge, so in anitya we should name it X as well. The sourceforge backend in anitya then lets you specify what to search for (X, libX or libY) in the RSS feed of the X project on SourceForge. The problem: when adding libX or libY to anitya, the name and homepage are still X and sourceforge.net/projects/X, and these two fields are exactly what anitya uses to make projects unique. In other words, adding libX and libY won't be allowed.
So this is the current situation and as you can see, it has problems (which explains the two issues reported).
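To make the collision concrete, here is a minimal sketch of the uniqueness rule described above. It is not anitya's actual code: the Registry class, the DuplicateProject exception and the in-memory set are all hypothetical stand-ins for the database constraint on name+homepage.

```python
# Hypothetical in-memory model of the name+homepage uniqueness rule.
# This is an illustration only, not anitya's real implementation.
class DuplicateProject(Exception):
    pass


class Registry:
    def __init__(self):
        self._keys = set()

    def add(self, name, homepage, tarball):
        # The tarball name is stored with the project but is NOT part of the key.
        key = (name, homepage)
        if key in self._keys:
            raise DuplicateProject(key)
        self._keys.add(key)


registry = Registry()
homepage = "sourceforge.net/projects/X"

# The first entry for project X is accepted.
registry.add("X", homepage, tarball="X")

# A second entry that differs only by tarball name is refused.
try:
    registry.add("X", homepage, tarball="libX")
    rejected = False
except DuplicateProject:
    rejected = True
```

With this rule, rejected ends up True: libX cannot be tracked next to X because both share the same name and homepage.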
What are the potential solutions?
1/ Extend the unique constraint
We could include the tarball name to search for in the unique constraint, which would then change from: name+homepage to name+homepage+tarball
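As a rough sketch of what this option changes, the snippet below compares the two keys for the X, libX and libY example. The unique_key helper and the tuple layout are assumptions made for illustration; they are not taken from anitya's schema.

```python
# Hypothetical helper: the extended key proposed in option 1.
def unique_key(name, homepage, tarball):
    return (name, homepage, tarball)


homepage = "sourceforge.net/projects/X"
entries = [
    ("X", homepage, "X"),
    ("X", homepage, "libX"),
    ("X", homepage, "libY"),
]

# Current rule (name+homepage): all three entries collapse into one key.
old_keys = {(name, hp) for name, hp, _tarball in entries}

# Extended rule (name+homepage+tarball): each entry gets its own key.
new_keys = {unique_key(*entry) for entry in entries}
```

Here old_keys holds a single key while new_keys holds three, so all three tarballs could be tracked under the project name X.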
2/ Invert the use of name and tarball
Instead of having the project name be X with a tarball name libX, we could make the project be libX and the tarball be X.
This sounds quite nice and easy, but looking at the projects currently in anitya's database, I found projects like:

name                | homepage                                      | tarball
--------------------+-----------------------------------------------+--------------------
linuxwacom          | http://sf.net/projects/linuxwacom/            | xf86-input-wacom
brutalchess (alpha) | http://sourceforge.net/p/brutalchess          | brutalchess-alpha
chemical-mime       | http://sourceforge.net/projects/chemical-mime | chemical-mime-data
So for these, the tarball name would become the project name and they would be pretty ugly.
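A small sketch of that renaming, applied to the three rows listed above. The invert function is hypothetical; it only shows which name+homepage pair each project would end up with if the tarball name were promoted to project name.

```python
# Rows copied from the listing above: (name, homepage, tarball).
rows = [
    ("linuxwacom", "http://sf.net/projects/linuxwacom/", "xf86-input-wacom"),
    ("brutalchess (alpha)", "http://sourceforge.net/p/brutalchess", "brutalchess-alpha"),
    ("chemical-mime", "http://sourceforge.net/projects/chemical-mime", "chemical-mime-data"),
]


def invert(name, homepage, tarball):
    # Hypothetical illustration of option 2: the tarball name becomes the
    # project name, and the homepage is left unchanged.
    return (tarball, homepage)


new_identities = [invert(*row) for row in rows]
new_names = [name for name, _homepage in new_identities]
```

The resulting project names are xf86-input-wacom, brutalchess-alpha and chemical-mime-data, which is where the ugliness comes from.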
I am not quite sure which approach is best here.
What do you think?
Comments
I think I like option #2 but I don't think I fully understand the scope of it yet. Would making that change *only* affect sourceforge project listings? Or would it affect all project listings?
#1 seems like it would introduce confusion in the code since (some number of) things have been written expecting the unique constraint to combine only the project name and the homepage.
Yes, option #2 would only affect the sourceforge backend; it's the only backend (iirc) that currently uses the version_url field to store the name of the tarball.
It's the easiest option and clearly it would work for most things, but that does give us some ugly project names :-/
Did you see the proposal from scop at: https://github.com/fedora-infra/ani... ?