Dear Open Hub Users,
We’re excited to announce that we will be moving the Open Hub Forum to https://community.blackduck.com/s/black-duck-open-hub. Beginning immediately, users can head over, register, get technical help, and discuss issues pertinent to the Open Hub. Registered users can also subscribe to Open Hub announcements here.
On May 1, 2020, we will be freezing https://www.openhub.net/forums and users will not be able to create new discussions. If you have any questions or concerns, please email us at info@openhub.net.
To anyone following this discussion, the issue turned out to be Ohloh's infrastructure attempting to determine where an enlistment begins by looking through the commit log and ultimately failing.
In our case, what failed was an svn log call that was taking so long on the server side that eventually the network connection would time out and fail. Either the server (Sourceforge hardware) was too slow, our history was too big (50k+ revisions), SVN is poorly optimized for reverse logs, or some combination thereof. The request was taking more than 5 minutes, so the network socket would shut down.
Ohloh's request asks svn to extract all commits in reverse order (-rHEAD:1) so that the backend server can just send the oldest (--limit 1). This is commonly documented as the way to obtain the first revision for a branch. However, sorting all commits in reverse order ends up taking too long. I created a pull request on GitHub that should fix this issue and will hopefully help other big projects that might be running into it as well. It's not fast, but it should succeed by requesting the log and having the receiver identify the last entry.
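For anyone curious what "having the receiver identify the last entry" might look like, here is a minimal sketch. It is not the code from the pull request: the function name and the sample data are made up for illustration, and it assumes the log was requested with svn's --xml output so that each commit arrives as a <logentry> element. The parsing is incremental, so a 50k+ revision log never has to sit in memory all at once.

```python
import io
import xml.etree.ElementTree as ET


def last_log_entry_revision(xml_stream):
    """Return the revision number of the final <logentry> in a stream of
    `svn log --xml` output, or None if the log has no entries.

    Hypothetical helper for illustration; not Ohloh's actual code.
    """
    last = None
    # iterparse yields each element as soon as its closing tag has been read.
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if elem.tag == "logentry":
            last = int(elem.get("revision"))
            elem.clear()  # drop the entry's children to keep memory use flat
    return last


# Made-up sample in the shape of `svn log --xml` output.
SAMPLE_LOG = b"""<?xml version="1.0" encoding="UTF-8"?>
<log>
<logentry revision="50012"><author>a</author><msg>newest</msg></logentry>
<logentry revision="31000"><author>b</author><msg>middle</msg></logentry>
<logentry revision="28701"><author>c</author><msg>first on branch</msg></logentry>
</log>
"""

if __name__ == "__main__":
    print(last_log_entry_revision(io.BytesIO(SAMPLE_LOG)))
```

The trade-off is the one described above: the whole log still has to cross the wire, but the server no longer has to do extra work before it can send the first byte.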
Cheers!
[edited]
Sorry, read your message wrong the first time and see now that you added the reverse alias for us.
I see now that rbowers is listed in the "will be attributed to" list, but just not in the "Contributions by" list. I'd missed that detail because we wanted the attribution to go the other way: rbowers -> ronaldbowers
I'm not sure I see the committer vs. ohloh ID distinction, though, because most of our other "Contributions by" aliases are most definitely not pointing to ohloh IDs. Moreover, rbowers is a committer, as shown by the 68 commits, so I'd think he should be listed, no?
Okay, I've investigated the differences between the CVS and SVN enlistment, compared them to my notes before the SVN enlistment, and think I now have a pretty good grasp of all the differences, possibly even explaining the 7% increase (34 years) of overall effort that is reported for the SVN repository. None of the differences are of concern, fortunately (kudos!), but I did think you might find the differences interesting as they pertain to how Ohloh indexes and attributes commits for an enlistment -- there are some rather interesting differences between having a CVS repository and an SVN repository. In brief, the things I noted are as follows:
SVN property changes are being attributed as source code changes. This is possibly the bulk of the 7% difference as it adds minimally a +1-1 change to every commit due to the CVS revision numbers being stored as SVN properties (an option we opted for during cvs2svn). With 28700 SVN commits, that's minimally -28700+28700 line changes being counted.
Ohloh does a much better job at collapsing related CVS commits into one change event (presumably based on time and commit message) than cvs2svn does. Ohloh counted 27661 CVS commits and reports approximately 28689 SVN commits. I would presume you're taking SVN commits as-is given SVN performs atomic commit transactions, so you don't need/try to collapse them. Might be useful to do the same collapse for all repository types based on the log/timeframe for consistency, though it's certainly a minor difference.
Perhaps entirely unrelated to another change that was just made, but Ohloh now correctly finds our BSD licensed files. All of our file counts doubled perfectly except for our BSD license count. Before with CVS, it counted 2 (which was quite wrong). Now it counts 171 files, presuming Ohloh's BSD detection didn't change this week. That number is a bit higher than my back-of-the-envelope quick grep counts, but it's within the ballpark.
As already noted, it's interesting that the SVN commit log messages all seem to include a trailing newline whereas the CVS commit log messages do not.
The Lines Modified metric was one that I really couldn't account for, but I did find the magnitude of the differences exceptionally interesting. Most of the user commit counts were about +3% and line change counts were within about ±3%. I would hypothesize that the difference is simply the uncollapsed SVN commits. The only one that raised an eyebrow was where one user (johnranderson) gained an astonishing +269K lines credited to them, where everyone else was over or under by a few thousand lines or less for the SVN enlistment.
Cheers!
Sean
Howdy Robin!
Actually, I noticed about a week ago right after you'd started that all the stats were updated! I seem to find myself visiting the site for various statistics more and more frequently.
Needless to say, I was quite delighted. The stats for BRL-CAD look much better now. THANK YOU so much for all your hard work on running the update.
Comparing the final ohloh user stats that you pulled with the stats that I extracted, the numbers were very close. They weren't matching exactly, though, as I would have expected for at least the older devs that are no longer active, so I dug deeper. Fortunately, when I did, I was able to account for all of the differences and got exactly the same counts for all the devs I compared. The difference was that my stats collapsed all identical consecutive log messages, in contrast to ohloh's (better) time-based collapse. Once I took that into account, the numbers matched up.
From a larger sanity check standpoint, I counted 25836 unique commits where ohloh has 26554 identified, which is pretty much what I would expect given the time-based culling differences. Another sanity check was to look at the year's contribution of the developers (which started this whole inquiry), which also seems much more reasonable now.
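To make the two collapse strategies concrete, here is a rough sketch of how each might count commits from per-file CVS revisions. This is an assumption-laden illustration, not ohloh's implementation: the function names, the five-minute window, the grouping by author plus message, and the sample revisions are all hypothetical.

```python
from datetime import datetime, timedelta


def collapse_by_message(revisions):
    """Count commits by merging runs of consecutive identical log messages."""
    count = 0
    previous = None
    for _when, _author, message in sorted(revisions):
        if message != previous:
            count += 1
        previous = message
    return count


def collapse_by_time(revisions, window=timedelta(minutes=5)):
    """Count commits by merging revisions that share an author and message
    and fall within `window` of the previous revision in that group.
    The window length is an assumed value, not a documented one."""
    count = 0
    last_seen = {}  # (author, message) -> timestamp of the latest revision
    for when, author, message in sorted(revisions):
        key = (author, message)
        if key not in last_seen or when - last_seen[key] > window:
            count += 1
        last_seen[key] = when
    return count


# Made-up per-file revisions as (timestamp, author, message).
SAMPLE_REVISIONS = [
    (datetime(2000, 1, 3, 9, 0, 0), "dev", "cleanup"),
    (datetime(2000, 1, 3, 9, 0, 20), "dev", "cleanup"),  # same commit, second file
    (datetime(2000, 1, 5, 14, 30, 0), "dev", "cleanup"),  # same message, two days later
    (datetime(2000, 1, 6, 8, 15, 0), "dev", "fix typo"),
]

if __name__ == "__main__":
    print(collapse_by_message(SAMPLE_REVISIONS))
    print(collapse_by_time(SAMPLE_REVISIONS))
```

On this sample the message-based collapse merges the reused "cleanup" message into one commit and so reports fewer commits than the time-based collapse, which is the same direction as the 25836 vs. 26554 difference.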
Thank you again for putting the effort into this and getting the update rolled in. If there is a donate now button or other means to convey my gratitude and appreciation, please let me know.
Cheers!
Sean
ogourmet,
There is nothing that prevents you or anyone else from adding the Tcl/Java project yourself. Having the ohloh project admins/devs add other people's projects when the site has been so openly and flexibly designed for community involvement would be a certain waste of their time. If you want the project added, add it.
Cheers!
Robin,
Thanks for the quick reply and reference to the commits listing. I do see now that there is a speckling of Attic files being counted from that far back, but it is missing a plethora of commits. For example, the commits listing shows exactly one commit for 1984 (second to last page). However, if I ask CVS for history on just one file:
.. [ from an existing unpruned checkout ] ..
~/brlcad morrison$ cvs log rt/main.c | grep 1984
date: 1984/11/30 04:04:13; author: mike; state: Exp; lines: +12 -12
date: 1984/11/29 07:04:33; author: mike; state: Exp; lines: +6 -6
date: 1984/11/27 06:59:47; author: mike; state: Exp; lines: +58 -32
... [ trimmed ] ...
~/brlcad morrison$ cvs log rt/main.c | grep 1984 | wc
27 267 1881
There are 27 commits in 1984 on just that one file from mike. There are veritably hundreds of files like that with commit traffic missing for most of the '80s and some of the '90s, spanning several authors' activity. It's not clear to me what the correlation is where brlcad/jove/ would be read but not (any of) the files in brlcad/rt/ nor those in a couple dozen other directories that are similarly in the Attic. Hopefully it might give you a lead on where to look, though.
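The grep-and-count check above can be scripted so the per-year, per-author totals from cvs log are easy to compare against the commits listing. Below is a small sketch, assuming the "date: ...; author: ...;" line layout shown in the output above; the function name is made up, and the sample text is just the three lines quoted earlier.

```python
import re
from collections import Counter

# Matches lines such as:
#   date: 1984/11/30 04:04:13; author: mike; state: Exp; lines: +12 -12
DATE_LINE = re.compile(
    r"^date: (\d{4})[/-]\d{2}[/-]\d{2} [^;]*;\s*author: ([^;]+);"
)


def revisions_per_year(log_text):
    """Tally revisions by (year, author) from `cvs log` output."""
    counts = Counter()
    for line in log_text.splitlines():
        match = DATE_LINE.match(line.strip())
        if match:
            year, author = match.groups()
            counts[(int(year), author)] += 1
    return counts


# The three revision lines quoted in the post above.
SAMPLE_CVS_LOG = """\
date: 1984/11/30 04:04:13; author: mike; state: Exp; lines: +12 -12
date: 1984/11/29 07:04:33; author: mike; state: Exp; lines: +6 -6
date: 1984/11/27 06:59:47; author: mike; state: Exp; lines: +58 -32
"""

if __name__ == "__main__":
    for (year, author), n in sorted(revisions_per_year(SAMPLE_CVS_LOG).items()):
        print(year, author, n)
```

Note that this counts per-file revisions, the same thing the grep counts, so the totals will run higher than a listing that collapses related revisions into single commits.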
Thanks again for the reply and great work. The developer analysis idea you mentioned sounds like a great feature to say the least. :-)
Cheers!
Sean
[edited due to formatting issues, apologies on all the mods.. a preview option would be nice.. ;-)]