WordPress plugins
Yesterday I wrote my first couple of WordPress plugins. One of them you can already find at the bottom of every post. If you click ‘Show wikified noun phrases’ you’ll see a select box appear with all the noun phrases my little script extracted. Clicking on a noun phrase takes you to Wikipedia. The script is pretty simple: it uses regular expressions to pull out nouns and noun phrases (yeah I know, it’s not optimal yet). Those nouns are compared to a (for now outdated) local database of wiki entries. If a matching entry is found, the noun (phrase) is turned into a link and placed in the select box. Pretty cool, huh? I think something like this can give a reader more information about things related to the post. Any ideas and suggestions are more than welcome. The source will be made available soon, after some more tweaking.
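To make the idea a bit more concrete, here is a rough sketch of the extract-and-look-up step. It is written in Python rather than the plugin's PHP, and it is not the plugin's actual code: the pattern, the `WIKI_TITLES` set (a made-up stand-in for the local database of wiki entries) and the function name `wikify_candidates` are all assumptions for illustration.

```python
import re

# Hypothetical stand-in for the local database of wiki entries.
WIKI_TITLES = {"WordPress", "Regular Expression", "Firefox"}

# One or more capitalized words in a row, e.g. "Regular Expression".
# Each word needs at least two letters, so a lone "I" is skipped.
NOUN_PHRASE = re.compile(r"[A-Z][a-zA-Z]+(?: [A-Z][a-zA-Z]+)*")


def wikify_candidates(text):
    """Return (phrase, url) pairs for phrases that have a known wiki entry."""
    seen = set()
    links = []
    for phrase in NOUN_PHRASE.findall(text):
        # Keep only phrases with a matching title, and list each one once.
        if phrase in WIKI_TITLES and phrase not in seen:
            seen.add(phrase)
            url = "http://en.wikipedia.org/wiki/" + phrase.replace(" ", "_")
            links.append((phrase, url))
    return links
```

Calling `wikify_candidates` on a post body gives the list of links that would go into the select box. A capital-letter pattern like this misses ordinary lowercase nouns, which is one reason the approach is "not optimal yet".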
As a second plugin I made a simple script which builds an array of the words in a post and the number of times each one appears. The idea behind this is that recurring words might be good tags to classify the post under. I’m thinking about how to let my WordPress make good use of tags. Maybe tags could replace categories and could, through something like TouchGraph, make it pretty easy to find related information on our blog. This script could also be used to insert tag statements for Technorati, to link to del.icio.us … The possibilities are endless.
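The counting itself is tiny. A minimal sketch, again in Python and not the plugin's real code (the function name `suggest_tags` and the `max_tags` and `min_count` parameters are invented for this example):

```python
import re
from collections import Counter


def suggest_tags(post_text, max_tags=5, min_count=2):
    """Count the words in a post and return the most frequent ones as tag candidates."""
    # Lower-case first so "Tags" and "tags" are counted as the same word.
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", post_text.lower())
    counts = Counter(words)
    # most_common() sorts by count, highest first.
    return [word for word, count in counts.most_common(max_tags) if count >= min_count]
```

On real posts the top of the list will mostly be words like "the" and "and", so raw counts only become useful as tags once very common words are filtered out.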
Again, any ideas are welcome!
18 Comments »
grmbl, the showHide script doesn’t seem to work on firefox windows. It does work on firefox mac and linux … Strange …
Dude, really cool! I’m thinking it might be an idea to remove common nouns from the words that will be wikified. I like your solution though because it solves the problem I raised before that it might drown relevant links in wikipedia links.
Also, I’m not sure what you mean by “the showHide script doesn’t seem to work on firefox windows” … it seems to be working for me, unless I am missing something.
Anyway, let’s talk about it l8r
http://esl.about.com/library/vocabulary/bl1000_list1.htm
well, i thought it didn’t work on win firefox but it was just on http://www.justlol.org where the javascript wasn’t included. It is solved now.
I was indeed planning to filter out words from standard stoplists and maybe also the most common words in English. Also, I should have noun phrases extracted by a system trained with a part-of-speech tagger, so it finds more noun phrases than it currently does (right now it only catches a multi-word phrase if two consecutive nouns are capitalized).
Another thing you could do is compare noun phrases to a list of existing wikipedia articles. Check out:
http://en.wikipedia.org/wiki/Wikipedia:Quick_index
Other ways to browse:
http://en.wikipedia.org/wiki/Wikipedia:Browse (for example if you only want to access certain topics)
dude, that’s exactly what i’m doing …
ok, and i’m sure you looked at http://www.antisleep.com/wikipedizer/api/wikipedizer.php.txt as well:
// Match proper noun phrases.
preg_match_all("/[A-Z][a-zA-Z]+(\s[A-Z][-a-zA-Z]+)+/ms", $result, $propernounphrases);
// Match acronyms. (performance seems to go through the floor if we do these in one pass.)
preg_match_all("/[A-Z][A-Z][A-Z]+/ms", $result, $acronyms);
// Merge and de-duplicate.
$phrases = array_unique($propernounphrases[0] + $acronyms[0]);
// Open up a db connection and whittle our list down against the real titles.
$connection = mysql_connect ("localhost", "wikipedia");
if ($connection == null)
that’s where i got the idea
i just grabbed a list of common words from the net and let the script filter them out of the noun phrases. As you wished. Good idea.
dude, it’s listing all the words above the article now, that can’t be the idea…
hehe now there’s an error, i guess you’re working on it
yep, working on it, i should really have a development WordPress set up somewhere.
The script works pretty neat now but for some strange reason it doesn’t get rid of ‘the’ although it is the most common word … mmhhh … dinner first
also it should actually stem the words or something like that. e.g. ‘thing’ is in the most common word list but ‘things’ isn’t …
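Both problems look like normalization issues. One guess (not confirmed anywhere in this thread) is that ‘the’ slips through because it shows up as ‘The’ or with stray whitespace, so an exact string comparison against the list fails. A small Python sketch of a more forgiving check follows; `COMMON_WORDS`, `crude_stem`, `is_common` and `filter_common` are hypothetical names, and the plural stripping is a deliberately naive stand-in for a real stemmer.

```python
# Hypothetical tiny stoplist; a real one would be loaded from a file.
COMMON_WORDS = {"the", "thing", "a", "of"}


def crude_stem(word):
    """Very naive plural stripping: 'things' -> 'thing'. Not a real stemmer."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def is_common(word):
    """Check a word against the stoplist after trimming, lower-casing and crude stemming."""
    normalized = word.strip().lower()
    return normalized in COMMON_WORDS or crude_stem(normalized) in COMMON_WORDS


def filter_common(words):
    """Drop every word that the stoplist covers, keeping the rest in order."""
    return [w for w in words if not is_common(w)]
```

With this, `filter_common(["The", "things", "Wikipedia"])` keeps only `"Wikipedia"`. A proper stemming algorithm such as Porter's would handle far more cases than chopping off a trailing "s".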
How about a plugin that rejects texas-holdem posts :))
Well, i had to change ip, i got plugged out, now i am plugged in
No more texas sh*t again. Oh, there is this guy called bush …
since you’re thinking about a redesign, i kind of like this design:
http://www.stopdesign.com/log/
not bad at all
Instead of writing a website where people could post urls and wikify them, i think it would be awesome if you wrote a firefox extension that would take the current page and wikify it. Of course if you did this and many people start using it then that could become a real bandwidth bitch and mess up your server.
maybe if you write the plugin and it works well the people at mozilla can help you out.
of course if i can help you with this i’d be more than willing