The Programming Language Tag Clouds

Hi there.

Not a very serious project, but I'm currently building tag clouds describing various programming languages at http://plwords.herokuapp.com/.

If any of you want to contribute your feelings / knowledge about common programming languages, some tag clouds really need more contributions!

Sweet (Everyone here should try to contribute)

That's a very neat project to help document and differentiate languages. I've contributed a couple for Haskell and E. I look forward to seeing the tag clouds you can generate as it grows.

what a painful user interface, not doing that any more

proving i'm human should not be required once per language for which i submit tags.

They might be mining

They might be mining captchas for resale. I've been reading too many conspiracy theories lately.

good point, I'll try to

good point, I'll try to address this soon.

Breaking up phrases seems

Breaking up phrases does not seem like such a good idea. Phrases like "strongly typed" get broken up into "strongly" and "typed", "ad hoc" into "ad" and "hoc", "very wonderful" into "very" and "wonderful", and so forth. But what's the use of "strongly", "ad" or "very" as tag words?! Either restrict entries to single words, or keep the words of a phrase together.

You are right... and not the

You are right... and not the first to identify the issue.

The fact is that I originally made the mistake (or is it one?) of allowing people to use "short sentences" in addition to single words.

At this stage (> 150 entries), not breaking up short sentences 1) gives very bad visual results and 2) reduces the overall frequency of the most common words (tag clouds are no longer interesting in such a scenario).

So, I'm still looking for a solution here. I don't like the idea of requiring single words only, and I doubt people will actually comply (unless the form forces them to, of course). What about filtering adverbs, coordinating conjunctions and the like?
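
To make the filtering idea concrete, here is a minimal sketch, assuming a small hand-written stop list. The word list and the function name `count_tags` are hypothetical and only for illustration; they are not taken from the site's code.

```python
from collections import Counter

# Hypothetical stop list: a few adverbs, conjunctions and other filler words.
STOP_WORDS = {"very", "strongly", "ad", "and", "or", "but", "a", "the", "of"}

def count_tags(entries):
    """Split each entry on whitespace, drop stop words, and count the rest."""
    counts = Counter()
    for entry in entries:
        for word in entry.lower().split():
            if word not in STOP_WORDS:
                counts[word] += 1
    return counts

entries = ["strongly typed", "very wonderful", "typed", "ad hoc"]
print(count_tags(entries).most_common())
```

Note that "hoc" still survives in this example, so filtering alone may not be enough to clean up phrases that have been split apart.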

I would do two things. For a

I would do two things. For a specific language I would keep phrases together, maybe breaking them up on conjunctions, even if the visual entries get bigger that way. For me, color and spatial orientation do enough to keep things readable and interesting, while nonsensical entries annoy.
For the overall frequency I would filter much more (adverbs etc.) to restrict to only the essential words, even if that incurs some false negatives.
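
The two treatments could be sketched roughly as follows. This is only an illustration under assumptions: the word lists are tiny and hypothetical, and the function names `language_tags` and `overall_words` are made up here.

```python
from collections import Counter

# Hypothetical word lists; a real version would need much larger ones.
CONJUNCTIONS = {"and", "or", "but"}
FILLER = {"very", "strongly", "quite", "really"}

def language_tags(entry):
    """Per-language view: keep phrases together, splitting only on conjunctions."""
    phrases, current = [], []
    for word in entry.lower().split():
        if word in CONJUNCTIONS:
            if current:
                phrases.append(" ".join(current))
                current = []
        else:
            current.append(word)
    if current:
        phrases.append(" ".join(current))
    return phrases

def overall_words(entry):
    """Overall view: split into words and drop conjunctions and filler words."""
    return [w for w in entry.lower().split()
            if w not in CONJUNCTIONS and w not in FILLER]

entries = ["strongly typed and lazy", "very wonderful", "lazy"]
per_language = Counter(p for e in entries for p in language_tags(e))
overall = Counter(w for e in entries for w in overall_words(e))
print(per_language.most_common())
print(overall.most_common())
```

In this sketch the per-language cloud keeps "strongly typed" as one entry, while the overall count only sees "typed".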