Something wrong with the line counter for F#

The line counter for F# says that -4000 lines have been submitted and there is a 200% comment ratio. That doesn't make any sense.

Seems like a bug to me!

alecgorge about 15 years ago
 

Hi alecgorge,

I don't (yet) think there is a problem in the F# counter itself. Rather, this unusual total is a result of Ohloh's delta-based line counter.

Ohloh's line counter is cumulative: as each new commit is processed, we compute the delta to the lines of code, and add this to a running total for each project. We do it this way for efficiency.

The problem here is that Ohloh's F# parser is relatively new. Most F# projects were added to Ohloh before we had a parser for them. So when all of the old commits in these repositories were initially counted, Ohloh did not recognize any F#, and incorrectly found 0 total lines of code.

Now, as time goes forward and new commits arrive, we do correctly count the deltas to the F#, but since we never counted the original code, the resulting totals on our system are incorrect, and possibly even negative (which would indicate that more F# has been deleted than added in the last few weeks).
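
To illustrate the effect described above, here is a minimal Python sketch of a cumulative, delta-based counter. It is not Ohloh's actual code: the function name `running_total`, the commit history, and the 5,000-line starting size are all invented for the example.

```python
def running_total(deltas, start=0):
    """Add per-commit (added, removed) line deltas onto a starting total."""
    total = start
    for added, removed in deltas:
        total += added - removed
    return total


# Hypothetical history: an initial import of 5,000 lines of F#,
# followed by two later commits that mostly delete code.
history = [(5000, 0), (200, 1500), (100, 2800)]

# Counting every commit gives the true size of the code base.
full_count = running_total(history)

# Assumption: the first commit was processed before an F# parser existed,
# so its delta was recorded as zero and only later deltas reach the total.
partial_count = running_total(history[1:])

print(full_count)     # 1000
print(partial_count)  # -4000
```

In this made-up history the code base really holds 1,000 lines, but a counter that never saw the initial import reports a negative total, because it only ever recorded the later deletions.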

The fix is for us to fully re-process the entire commit history for all of the repositories that contain any F#. This is a big job, and it will take a while. And, admittedly, it's a partially manual process, and we haven't been staying on top of it.

Once the full recount of all F# projects is complete, the grand total should correct itself.

Thanks,
Robin

Robin Luckey about 15 years ago
 

Well, that makes a lot of sense!

If you can, you might want to put a little info link after the count when it is negative, so that people can understand why.

alecgorge about 15 years ago
 

This analysis does not explain what can be observed.

If it were indeed a result of deltas, then a new project with only a single commit should not decrease the line count for a language: it has only positive deltas, since no code has ever been deleted from the project.

I have added a test project with 10K lines in a single commit to see what happens to the total line count of the language. Whilst the project itself shows a positive 10K line count, the line count for the language decreased by 10K.

https://www.ohloh.net/topics/4690

This shows that the above analysis is not correct. Something else is wrong.

As an interim solution, I suggest that the language totals should be calculated from positive subtotals only.

foreach project {
    foreach language {
        if (project->language->loc >= 0)
            language->loc = language->loc + project->language->loc;
        else
            log("project %i has negative line count and was excluded from totals\n", project->id);
    }
}

This way all the data that is not consistent is excluded from the totals, which makes perfect sense. You should never use data that is verifiably incorrect for your totals.
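
The pseudo-code above could be written as runnable Python along the following lines. This is only a sketch under assumptions: the data layout (a dict mapping project id to per-language line counts), the function name `language_totals`, and the sample numbers are hypothetical and do not reflect Ohloh's real schema.

```python
import logging
from collections import defaultdict


def language_totals(projects):
    """Sum per-language line counts, skipping negative subtotals.

    `projects` maps a project id to a {language: lines_of_code} dict
    (an assumed layout, chosen for the example).
    """
    totals = defaultdict(int)
    for project_id, per_language in projects.items():
        for language, loc in per_language.items():
            if loc < 0:
                # Inconsistent data: log it and leave it out of the totals.
                logging.warning(
                    "project %s has negative line count for %s "
                    "and was excluded from totals",
                    project_id, language,
                )
                continue
            totals[language] += loc
    return dict(totals)


# Invented sample data: project 2 carries a negative F# subtotal.
projects = {
    1: {"F#": 10000, "C": 250},
    2: {"F#": -14000},
}
totals = language_totals(projects)
print(totals)  # {'F#': 10000, 'C': 250}
```

With the negative subtotal excluded, the F# total stays at the 10,000 lines contributed by the consistent project instead of dropping below zero.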

Most importantly, the comment ratio for the language will no longer be 0% when it isn't 0%.

trijezdci almost 15 years ago
 

I favour a harsher system. Throw an error if the line count is negative. Hard. Don't catch it. That way when the system fails you'll:

  1. know exactly where the error occurs;
  2. be forced to fix it without making excuses or putting it off.
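
A fail-fast variant of this idea might look like the Python sketch below. The exception class `NegativeLineCountError` and the function `checked_total` are hypothetical names chosen for the example; nothing here is taken from Ohloh's code base.

```python
class NegativeLineCountError(Exception):
    """Raised when a running line count would drop below zero."""


def checked_total(current_total, delta, language):
    """Apply a delta to a running total, refusing to go negative."""
    new_total = current_total + delta
    if new_total < 0:
        # Deliberately not caught anywhere: the failure should be loud
        # and should point at the exact update that broke the invariant.
        raise NegativeLineCountError(
            f"{language}: total would become {new_total} "
            f"(was {current_total}, delta {delta})"
        )
    return new_total


print(checked_total(100, -40, "F#"))  # 60
```

Calling `checked_total(100, -4100, "F#")` raises `NegativeLineCountError` rather than quietly storing -4000.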

ttmrichter almost 15 years ago
 

ttmrichter: see my amended pseudo-code above (adding a log statement)

trijezdci almost 15 years ago
 

I prefer thrown hard errors. The Erlang approach, dontcha know. ;)

ttmrichter almost 15 years ago
 

To me the most important thing is that the comment ratios work. If there are some troubled projects that are excluded from the analysis, then why should everybody else whose project doesn't have troubles be impacted by that?

There should however be an Ohloh.net status page that shows the troubled projects and their statistics per language, so that people who do not have any access to the log files will be able to look into the problem and perhaps contribute to fixing it.
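
The data behind such a status page could be gathered roughly as in the Python sketch below. The function name `troubled_projects`, the input layout (project id mapped to per-language line counts), and the sample figures are assumptions made for illustration only.

```python
def troubled_projects(projects):
    """Group projects with negative line counts by language.

    Returns {language: [(project_id, lines_of_code), ...]}.
    """
    report = {}
    for project_id, per_language in projects.items():
        for language, loc in per_language.items():
            if loc < 0:
                report.setdefault(language, []).append((project_id, loc))
    return report


# Invented sample data for the example.
projects = {
    1: {"F#": 10000, "C": 250},
    2: {"F#": -14000},
    3: {"F#": -300, "C": 40},
}
report = troubled_projects(projects)
print(report)  # {'F#': [(2, -14000), (3, -300)]}
```

Each entry names a project and the inconsistent count, which is the kind of information someone without log access would need in order to investigate.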

trijezdci almost 15 years ago