Discussion:
How to find out how often a port file has been downloaded?
Marko Käning
2010-04-17 23:55:10 UTC
Hi,

I was just thinking that it might be helpful to know which ports are most actively used by the community.

Is there a way to determine how often ports actually get installed?

That would perhaps be some help when it comes to deciding which ports need special care, don't you think?

Greets,
Marko
Ryan Schmidt
2010-04-18 00:48:22 UTC
On Apr 17, 2010, at 18:55, Marko Käning wrote:

> I was just thinking that it might be helpful to know which ports are most actively used by the community.
>
> Is there a way to determine how often ports actually get installed?
>
> That would perhaps be some help when it comes to deciding which ports need special care, don't you think?

Yes, I think it would be useful. When I suggested it a few years ago someone thought this would be an invasion of privacy (tracking who installed what ports) so nothing was done. So currently there isn't a good way to track this. Our server admin could look at the web server logs for distfiles.macports.org to see how often a distfile was downloaded, but this would not include users who happened to download the distfile from a different server, nor would it account for users who downloaded the software but never installed it, or who subsequently uninstalled it.

If and when MacPorts ever gets the functionality to distribute binary archives, the first part of that problem may become a little easier, since those binary archives will certainly be downloaded from a MacPorts server, and not the software developer's server. Though, if we mirror the binary archives on several servers, then we would have to collate that information from those multiple sources.

Or we could channel all downloads through a central redirection script, like SourceForge does. Thus we could count downloads (like SF), and perhaps even implement a better geolocation system for downloading from nearby servers (like SF). (Our current ping-based approach has some drawbacks.) Tracking "total number of downloads" for a port isn't perhaps the most useful, but breaking it out by version would help, as would being able to see, say, how many downloads today, in the past week, in the past month, etc. This would help gauge a port's popularity.
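
As a rough sketch of the per-version, per-period counting described above (Python; the log format, port names, versions, and dates are all made up for illustration, and no such log exists in MacPorts today):

```python
from collections import Counter
from datetime import datetime, timedelta


def count_downloads(events, now, window):
    """Count downloads per (port, version) that fall within `window` before `now`."""
    cutoff = now - window
    counts = Counter()
    for port, version, when in events:
        if cutoff <= when <= now:
            counts[(port, version)] += 1
    return counts


# Hypothetical redirector log entries: (port, version, download time).
now = datetime(2010, 4, 18, 12, 0)
events = [
    ("apache2", "2.2.15", datetime(2010, 4, 18, 9, 0)),
    ("apache2", "2.2.15", datetime(2010, 4, 15, 9, 0)),
    ("apache2", "2.2.14", datetime(2010, 3, 30, 9, 0)),
    ("php5", "5.3.2", datetime(2010, 4, 17, 20, 0)),
]

today = count_downloads(events, now, timedelta(days=1))
past_week = count_downloads(events, now, timedelta(days=7))
past_month = count_downloads(events, now, timedelta(days=30))
```

Keeping the raw (port, version, time) entries rather than a single running total is what makes the per-version and per-period views possible.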
Joseph Holsten
2010-04-18 01:59:53 UTC
Ryan Schmidt wrote:
> Marko Käning wrote:
>>
>> Is there a way to determine how often ports actually get installed?
>
> Or we could channel all downloads through a central redirection script, like SourceForge does. Thus we could count downloads (like SF), and perhaps even implement a better geolocation system for downloading from nearby servers (like SF). (Our current ping-based approach has some drawbacks.) Tracking "total number of downloads" for a port isn't perhaps the most useful, but breaking it out by version would help, as would being able to see, say, how many downloads today, in the past week, in the past month, etc. This would help gauge a port's popularity.

Sounds a bit like the new RubyGems (née Gemcutter) distribution setup. There's a minimal redirector that gives a hook for tracking, and that passes the user on to Amazon CloudFront or S3 to host the actual file.

RubyGems in action: http://rubygems.org/
the man^H^H^Hcode behind the curtain: http://github.com/qrush/gemcutter

The actual bits are in app/metal/hostess.rb, which determines whether to use the CDN (cf) or static (s3) file hosting.
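
To illustrate the general shape of such a redirector (a Python sketch of the idea only, not a translation of hostess.rb; the base URLs, the `use_cdn` switch, and the in-memory counter are assumptions made for the example):

```python
from collections import Counter

# Hypothetical base URLs; a real deployment would point at its own hosts.
CDN_BASE = "http://cdn.example.org/distfiles/"
STATIC_BASE = "http://static.example.org/distfiles/"

# In-memory tally standing in for whatever store a real service would use.
hits = Counter()


def redirect_target(filename, use_cdn=True):
    """Record one request for `filename` and return the URL to redirect to."""
    hits[filename] += 1  # tracking hook: count the request before handing off
    base = CDN_BASE if use_cdn else STATIC_BASE
    return base + filename


url = redirect_target("foo-1.0.tar.gz")
fallback = redirect_target("foo-1.0.tar.gz", use_cdn=False)
```

The redirector never serves the file itself; it only counts the request and points the client at wherever the bits actually live.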

Frankly, I'd love to make a derivative of RubyGems just for end user access. Might be good for winning back some user base from Homebrew also.
--
http://josephholsten.com
Scott Haneda
2010-04-18 05:19:08 UTC
On Apr 17, 2010, at 5:48 PM, Ryan Schmidt wrote:

> Yes, I think it would be useful. When I suggested it a few years ago someone thought this would be an invasion of privacy (tracking who installed what ports) so nothing was done.

Maybe it could be a first-run option? The conf file for MacPorts is consulted on every run, I would imagine, so there could be a flag, report_stats = BOOL. If the flag did not exist, the user would be asked a question, and the flag would be set from the answer.
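
Roughly, the first-run logic could look like this (a Python sketch; a plain dict stands in for the parsed conf file, and the `ask` callback and prompt text are hypothetical, since MacPorts has no such option today):

```python
def stats_enabled(settings, ask):
    """Return the report_stats choice, asking the user once if it is not set yet."""
    if "report_stats" not in settings:
        # First run: ask the user and remember the answer in the settings.
        settings["report_stats"] = bool(ask("Report anonymous install stats?"))
    return settings["report_stats"]


def must_not_ask(prompt):
    """Stand-in prompt that fails loudly if the user would be asked again."""
    raise AssertionError("user was asked a second time")


settings = {}  # stands in for the parsed conf file
first = stats_enabled(settings, ask=lambda prompt: True)  # user answers yes
second = stats_enabled(settings, ask=must_not_ask)  # flag already set, no prompt
```

A real implementation would also have to write the flag back to the conf file so the answer survives between runs.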

It could not give exact counts (some will opt out), but if the data were shown as percentages, it would be valuable as a relative measure, to see which ports were downloaded compared to others.
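
For instance, turning whatever counts do get reported into shares (a Python sketch; the numbers are invented, and it assumes opt-in users are roughly representative of everyone else):

```python
def install_shares(counts):
    """Convert raw per-port counts into percentages of all reported installs."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {port: 100.0 * n / total for port, n in counts.items()}


# Made-up counts from opted-in users only.
reported = {"apache2": 50, "php5": 30, "mysql5": 20}
shares = install_shares(reported)
```

The absolute numbers undercount real usage, but the ratios between ports stay meaningful as long as opting in does not correlate with which ports people install.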

This could be very valuable. If it were determined that what most think is an obscure port was heavily used, it could be looked at more closely, and perhaps made as perfect as could be as far as ease of install, up-to-dateness, etc.

My gut tells me Apache, PHP, and MySQL are at the top, but who knows, that could be completely wrong.

* Maybe it would be possible to log install failures as well. If a port were seen to fail in a high percentage of installs, and the stats were of course public, someone could then preemptively look at the port and possibly fix it before the problem even hits the mailing lists or bug tracker.
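
A sketch of how failing ports might be surfaced from such stats (Python; the port names, the counts, and the `min_attempts` threshold are hypothetical):

```python
def failure_rates(attempts, min_attempts=10):
    """Rank ports by install failure rate, worst first.

    `attempts` maps port name to (successes, failures). Ports with fewer
    than `min_attempts` reports are skipped as too noisy to rank.
    """
    rates = {}
    for port, (ok, failed) in attempts.items():
        total = ok + failed
        if total >= min_attempts:
            rates[port] = failed / total
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


# Made-up reports: (successful installs, failed installs) per port.
attempts = {"foo": (5, 15), "bar": (95, 5), "baz": (1, 1)}
ranking = failure_rates(attempts)
```

The minimum-attempts cutoff keeps a port with one failure out of two installs from topping the list.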

I would certainly opt in. I see no reason to capture the user's IP unless you want stats on location, which is probably not at all worth the privacy discussions we would then have to have. Without user-identifiable information sent over the wire, I would not even flinch were I not asked to opt into this.
--
Scott * If you contact me off list replace talklists@ with scott@ *
Marko Käning
2010-04-18 18:13:07 UTC
> This could be very valuable. If it were determined that what most think is an obscure port was heavily used, it could be looked at more closely, and perhaps made as perfect as could be as far as ease of install, up-to-dateness, etc.
Yep ...

> My gut tells me Apache, PHP, and MySQL are at the top, but who knows, that could be completely wrong.
Who knows...

> * Maybe it would be possible to log install failures as well. If a port were seen to fail in a high percentage of installs, and the stats were of course public, someone could then preemptively look at the port and possibly fix it before the problem even hits the mailing lists or bug tracker.
Exactly.

I see that I hit a nerve with this post. :)

But well, if there is no support built into port at the moment I guess this might be quite a task...

But I figure it might be worth the effort, because it would really be a good guide for MacPorts' maintainers and admins.
Bradley Giesbrecht
2010-04-18 15:29:41 UTC
On Apr 17, 2010, at 5:48 PM, Ryan Schmidt wrote:

>
> On Apr 17, 2010, at 18:55, Marko Käning wrote:
>
>> I was just thinking that it might be helpful to know which ports
>> are most actively used by the community.
>>
>> Is there a way to determine how often ports actually get installed?
>>
>> That would perhaps be some help when it comes to deciding which ports
>> need special care, don't you think?
>
> Yes, I think it would be useful. When I suggested it a few years ago
> someone thought this would be an invasion of privacy (tracking who
> installed what ports) so nothing was done. So currently there isn't
> a good way to track this.

Wouldn't doing an MD5 of an identity (machine serial or something) or
some other identity protection be adequate?
You have to register with Apple to get dev tools.
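
Something along these lines, as a Python sketch (the serial number is made up, and how MacPorts would actually obtain a machine identifier is left open):

```python
import hashlib


def anonymous_id(serial):
    """Return the hex MD5 digest of a machine identifier string."""
    return hashlib.md5(serial.encode("utf-8")).hexdigest()


token = anonymous_id("W80000000AA")  # made-up serial number
```

The same machine always maps to the same token, so repeat installs can be deduplicated without sending the serial itself. One caveat: an unsalted hash of a short, guessable identifier can be reversed by hashing every candidate value, so this obscures the identity rather than fully protecting it.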

I would think that knowing this information would be helpful enough in
prioritizing ticket fixes that the privacy tradeoff would be an
acceptable cost.


// Brad
Jeremy Lavergne
2010-04-18 15:36:55 UTC
Just make it opt in.

"Bradley Giesbrecht" <***@pixilla.com> wrote:

>
>On Apr 17, 2010, at 5:48 PM, Ryan Schmidt wrote:
>
>>
>> On Apr 17, 2010, at 18:55, Marko Käning wrote:
>>
>>> I was just thinking that it might be helpful to know which ports
>>> are most actively used by the community.
>>>
>>> Is there a way to determine how often ports actually get installed?
>>>
>>> That would perhaps be some help when it comes to deciding which ports
>>> need special care, don't you think?
>>
>> Yes, I think it would be useful. When I suggested it a few years ago
>> someone thought this would be an invasion of privacy (tracking who
>> installed what ports) so nothing was done. So currently there isn't
>> a good way to track this.
>
>Wouldn't doing an MD5 of an identity (machine serial or something) or
>some other identity protection be adequate?
>You have to register with Apple to get dev tools.
>
>I would think that knowing this information would be helpful enough in
>prioritizing ticket fixes that the privacy tradeoff would be an
>acceptable cost.
>
>
>// Brad