Statistics for Drupal are flawed

Hey,

the LOC statistics for Drupal are definitely flawed somehow.
Ohloh reports 21,103 LOC for Drupal core but there are definitely more.

The reason might be that many PHP files in Drupal have a different extension - namely .module, .engine and .theme.
Without counting these, the statistics for Drupal are worthless, unfortunately, as more than half of Drupal's code lives in .module files.

As of today's Drupal CVS HEAD:

find drupal -type f \( -name '*.php' -o -name '*.inc' \) -exec egrep -vh '^$' {} \; | wc -l

27220

(= number of non-blank lines in .php and .inc files)

find drupal -type f \( -name '*.module' -o -name '*.theme' -o -name '*.engine' \) -exec egrep -vh '^$' {} \; | wc -l

27351

(= number of non-blank lines in .module, .theme and .engine files)

So, Ohloh is basically ignoring half of Drupal's code.
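
For anyone who wants to reproduce the two counts above without find and egrep, here is a minimal Python sketch. The function name and the per-extension grouping are my own choices for illustration; this is not how Ohloh counts lines. Like egrep -v '^$', it skips only lines that are completely empty, so comments and whitespace-only lines are still counted.

```python
import os
from collections import Counter


def count_nonblank_lines(root, extensions):
    """Count non-empty lines per extension for files under root.

    `extensions` is a set such as {".php", ".module"} (hypothetical helper,
    not part of any existing tool).
    """
    totals = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1]
            if ext not in extensions:
                continue
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                # Mirror egrep -v '^$': drop only lines that are exactly a newline.
                totals[ext] += sum(1 for line in handle if line != b"\n")
    return totals
```

Calling count_nonblank_lines("drupal", {".php", ".inc", ".module", ".theme", ".engine"}) on a checkout gives one total per extension, which makes it easy to see how much code sits outside .php and .inc files.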

One solution would be to make either the file types that are used to calculate the LOC or the filetype->language mapping a project-specific setting.

Frando over 17 years ago
 

I'm afraid making file extensions project-specific would allow many people to cheat. Since PHP files always contain <?php, could a solution be to search for that opening tag in non-.php files? Of course, there are also <? and <%, but these are disabled by default and do not guarantee that the file contains PHP (it could also be XML or ASP).

Dietrich Moerman over 17 years ago
 

Greetings all,

Our detector uses file extensions and file contents to try to determine the language contained. As Frando suspects, we do NOT currently recognize .module, .theme and .engine files as PHP.

Dietrich - we have some disambiguation logic to try and tell if a file should be treated as X or Y. So, the rule COULD be something like:


if extension =~ /\.(module|theme|engine)$/ AND file.contents =~ /<\?php/

I'm willing to try it out. These changes are always tricky because we run this stuff against millions of files, and there are always outliers that make life difficult. Frando, Dietrich - what do you think?
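
To make the proposed rule concrete, here is a rough Python sketch of it. This is only an illustration of the extension-plus-contents check described above, not the actual detector code; the names DRUPAL_PHP_EXTENSIONS, PHP_OPEN_TAG and looks_like_php are made up for the example.

```python
import re

# Extensions from the rule above; the dot is escaped and the match is anchored
# at the end of the name so that "foo.modules" does not qualify.
DRUPAL_PHP_EXTENSIONS = re.compile(r"\.(module|theme|engine)$")

# Plain byte-string search, so the "?" needs no regex escaping here.
PHP_OPEN_TAG = b"<?php"


def looks_like_php(filename, contents):
    """Return True if the extension matches AND the contents hold a PHP open tag.

    Hypothetical helper: `contents` is the raw bytes of the file.
    """
    has_extension = DRUPAL_PHP_EXTENSIONS.search(filename) is not None
    return has_extension and PHP_OPEN_TAG in contents
```

Requiring both conditions keeps the rule narrow: a .module file that is not PHP is left alone, and so is a stray text file that merely mentions the open tag.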

Jason Allen over 17 years ago
 

I think this would be a nice solution. :)

Dietrich Moerman over 17 years ago
 

Yup, that should work. All PHP files must contain <?php, so checking against that sounds like the best thing to do.

Here's a complete list of file endings that Drupal uses at the moment for PHP files:

.php
.inc
.module
.theme
.engine
.schema
.install
.profile

This applies to both Drupal (core) and Drupal (contributions).

Maybe just checking all text files against <?php would be the easiest and most future-proof?
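
As a rough illustration of that idea, the Python sketch below walks a tree, skips files that look binary, and reports which extensions contain the open tag. The function name, the 8 KB sniff size and the "a NUL byte means binary" heuristic are all assumptions of mine, not anything Ohloh is known to do.

```python
import os
from collections import Counter


def extensions_with_php_tag(root, sniff_bytes=8192):
    """Count, per extension, the text files whose first bytes contain <?php."""
    found = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                head = handle.read(sniff_bytes)
            if b"\0" in head:
                # Assumption: a NUL byte in the first chunk means a binary file.
                continue
            if b"<?php" in head:
                found[os.path.splitext(name)[1]] += 1
    return found
```

Run against a Drupal checkout, this should surface the extensions listed above without anyone having to maintain the list by hand. It only looks at the start of each file, so a file that opens the tag after the first 8 KB would be missed.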

Thanks for your efforts in fixing this!

Frando over 17 years ago
 

Wouldn't it be easier to use some mime magic on the non-binary files to figure out what they are?
The unix 'file' utility does a good job in figuring out the file type:

file index.php includes/common.inc modules/system/system.module

Gives

index.php: PHP script text

includes/common.inc: PHP script text

modules/system/system.module: PHP script text

elmuerte over 17 years ago
 

The file utility does exactly what the fix proposed for Ohloh does: it reads the file looking for a PHP open tag. I tried this out myself.

$ file tagadelic.module
tagadelic.module: PHP script text

After removing the PHP open tag:

$ file tagadelic.module
tagadelic.module: ASCII C++ program text, with very long lines

So, I think the original solution is the best (no need to use third-party, *NIX-only binaries).

Dietrich Moerman over 17 years ago
 

Any news here?

Frando over 17 years ago
 

bump

Frando over 17 years ago