We have data files in our project that are directly #included by C code. This causes the LOC count to be overestimated, and as a result the project is improperly shown as having "Few source code comments".
Some time ago I tried to rename the .h files to .data but it didn't change the LOC count. Is there a way (by using a different extension, or by means of a special comment marker, for instance) to tell ohloh to ignore a given file?
Hmm, that's odd. Renaming a file to *.data should cause that file to be missed by our line counter, since that's not an extension Ohloh recognizes.
There is currently no way to instruct Ohloh to ignore certain files. We've brainstormed around a robots.txt-style file, but haven't had the time to implement this.
You can try using our line counter (labs.ohloh.net) on your local drive and see what numbers it comes back with. It has some pretty detailed capabilities which might help you figure out where the overcount is coming from, and why the *.data rename did not change the totals.
ohcount actually gives the correct answer on a fresh trunk checkout (the project is libcaca): around 22,000 lines of code. The .h -> .data change happened back in October 2007 (revision 1445) and ohcount gives seemingly correct answers in both cases (31,000 lines for r1444 and 19,000 lines for r1445). However, Ohloh's codebase history graph shows an increase in lines of code instead of the expected decrease for that period.
I am afraid it might prove difficult to debug the issue, because the SVN repository recently changed during a merge with other projects (the history was kept, though). If a dump of the previous repository might be helpful, let me know. Otherwise, don't bother: it's not that important after all :-)
Also, ohloh can misidentify a small C project as being mostly shell script if it happens to be using autoconf tools (it sees that big ol' configure script, I guess).
Ohloh should probably ignore autoconf stuff, at least the machine generated parts, or maybe notice that it's autoconfiscated, as this can be a sign that something has been used on more than one platform, which may be worth noticing. Instead it says, mostly shellscript.
Heh.
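For what it's worth, spotting the machine-generated configure script could be fairly cheap. Here is a minimal sketch in Python, assuming the script still carries the "Generated by GNU Autoconf" banner comment that autoconf writes near the top of its output. The function name and the 20-line window are made up for illustration; this is not how Ohloh or ohcount actually works.

```python
def looks_autoconf_generated(text, max_lines=20):
    """Heuristic: True if an early comment line carries the GNU Autoconf banner.

    `max_lines` is an arbitrary window chosen for this sketch; the banner
    normally appears within the first few lines of a generated configure script.
    """
    for line in text.splitlines()[:max_lines]:
        if line.startswith("#") and "Generated by GNU Autoconf" in line:
            return True
    return False


def countable_files(files):
    """Given {path: contents}, return the paths that are not autoconf output."""
    return [path for path, text in files.items()
            if not looks_autoconf_generated(text)]
```

A counter could run something like countable_files() before tallying languages, so a hand-written shell script still counts but the generated configure does not.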
Re a robots.txt-style file for code, check out: http://www.google.com/help/codesearch_packagemap.html
It's Google's 'packagemap' format for their code-search engine.
Perhaps extend the 'type' tag with an autogenerated=true attribute. Now you can exclude autogenerated files from code/comment ratio calculations.
Another thing which might be useful to Ohloh is the 'license' tag, which is more fine grained as it can apply to sub-directories and files within a project. (Currently Ohloh complains when a project contains files with different licenses.)
The license thing is a good point. Several of my packages contain ogg audio files licensed under a CC license, while the program that uses them is GPL'ed. They're also tagged with vorbiscomment with a LICENSE= attribute. Might be worth making ohloh able to read vorbis, mp3, and whatever other audio/video file tags are commonly used.
In my opinion, a simple file like Git's .gitignore or the contents of the SVN ignore property would be preferable, because the format is simple and fairly common, especially among Ohloh users.
For instance, if I had a folder called dependencies/ that I used but didn't develop, and that I didn't want counted toward the code count, I would put a simple .ohlohignore file in the containing directory with the following content:
dependencies/
It would be simple to just use glob syntax for files to ignore.
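To show how little code the matching would need, here is a rough sketch in Python using the standard fnmatch module. Everything in it is an assumption for illustration: .ohlohignore does not exist, and the rules (one glob per line, # for comments, a trailing slash meaning "any directory with this name") are just my guess at a sensible subset of .gitignore behaviour.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def load_patterns(text):
    """Parse ignore-file text: one glob per line, blank lines and # comments skipped."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def is_ignored(path, patterns):
    """Return True if a repository-relative POSIX path matches any pattern."""
    parts = PurePosixPath(path).parts
    if not parts:
        return False
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: ignore anything below a directory of that name.
            name = pattern.rstrip("/")
            if any(fnmatch(part, name) for part in parts[:-1]):
                return True
        elif fnmatch(path, pattern) or fnmatch(parts[-1], pattern):
            # File pattern: match against the full path or just the file name.
            return True
    return False
```

With a file containing dependencies/ and *.data, is_ignored("dependencies/zlib/inflate.c", patterns) would be true while is_ignored("src/main.c", patterns) would not. The files stay under version control either way; only the counter skips them.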
Extremely Late Reply:
Matt,
while .gitignore seems appropriate at first glance, wouldn't you also lose version control over those files?
My personal problem (which is why I'm scrolling through the forums right now) is that I have a directory stock beneath trunk, where all the garbage from my projects lands, but which might become useful again one day. I'd like ohloh to ignore that folder, but I wouldn't like to delete it from the repository or gitignore it.
Greetings, seb.