Boy, have I got a project for the weekend!
While running through ideas and vague concepts related to my tag-categorization wishlist of the other day, I figured it was worth poking around in the Movable Type Support Forums to see if I could find anything of use. A search for ‘keywords’ led me to one thread, which then led me to these posts by ishbadiddle, and that looks to be (nearly) exactly what I’ve been looking for!
Here’s his blog entry on his keyword subject indexing work:
My thinking about the Semantic Web was influenced by Paul Ford’s piece on the subject, which imagines the power of Google harnessing the Semantic Web to make even more money. There’s a good article on the Semantic Web on Wikipedia. Basically, it’s adding metadata (data about the data) to web pages. In our case, it’s simply adding “subject” data to each blog post, and then harnessing that to create an index of posts that relate to that subject. Think of it this way: the Category system is like the Table of Contents of a book, listing chapter headings. The Keyword system is like the Index of a book, one that is constantly updated.
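To make the book-index comparison a bit more concrete, here is a rough sketch in Python. It is not ishbadiddle’s actual Movable Type templates or plugins, just an illustration of the idea: take the keywords attached to each entry and turn them into an alphabetical index that points back at the entries. The sample entries, the comma-separated keyword format, and the function name `build_subject_index` are all assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical sample entries: (title, free-text keywords field).
# The comma-separated format is an assumption for this sketch.
entries = [
    ("Weekend project", "movable type, keywords, plugins"),
    ("Photo walk", "flickr, photography, Keywords"),
    ("Template hacking", "movable type, templates"),
]

def build_subject_index(entries):
    """Map each normalized keyword to the titles of entries that use it."""
    index = defaultdict(list)
    for title, keyword_field in entries:
        for raw in keyword_field.split(","):
            keyword = raw.strip().lower()  # treat "Keywords" and "keywords" alike
            if keyword and title not in index[keyword]:
                index[keyword].append(title)
    # Sort alphabetically, the way a book index is ordered.
    return dict(sorted(index.items()))

subject_index = build_subject_index(entries)
for keyword, titles in subject_index.items():
    print(f"{keyword}: {', '.join(titles)}")
```

Every time a new entry is added, the index is simply rebuilt, which is the “constantly updated” part of the comparison.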
So, plan for the upcoming weekend:
Print out ishbadiddle’s instructions, download and install the required plugins (ifEmpty, Loop, Compare, Collate, and Regex), hack the search functions, and then start pounding away on my templates.
About the only downside I can see to this is that I may have to go back to static rendering of my pages rather than the dynamic rendering I’m using now, but I’m okay with that (it’s all a tradeoff anyway; there are pros and cons to each approach).
It’ll be fun to get into geek mode for a little while as I work on this. I just hope I don’t break anything while I’m working on it…
“Steamroller (Steaming Pig)” by Pigface from the album In Dust We Trust (1997, 3:22).
2 Trackbacks
Progress: Related Entries
The keyword index will work, but I’ve got a lot of work to do on my keywords before I can bring it live. In the meantime, I’ve re-implemented Adam Kalsey’s ‘related entries’ code, listing five similar entries in the sidebar of each individual entry page.
Technorati Tags
Change of plans as far as my keywords/tags project goes. Thanks to George’s TechnoratiTags plugin, I’m now listing tags in the metadata for each post, just underneath the title. The tags are drawn from the keywords for each entry, and clicking on any on…
9 Comments
Today is another day when I am thankful I came across your blog. I have been fiddling with my own fledgling blog and leaving aside my current technical hitches, I am really getting into this stuff. I have even started learning basic html (very very late starter!). So, thanks! Again!
That’s a pretty slick idea. But my problem with it is that you have to manually enter every keyword, probably resulting in at least a few typos and misspellings and such. So, an idea: use subcategories.
More specifically, have two top-level categories, labeled “categories” and “keywords.” Beneath the “categories” category is the normal category structure you use. Beneath the “keywords” category are all of your keywords, and you can even sort them into subcategories to perhaps make finding/sorting them easier. You’d have to create two different category archives, one for “categories” and one for “keywords,” too.
And then applying keywords to each post becomes quite a bit more consistent and a little easier.
Of course, I haven’t tried any of this!
The problem with using categories or subcategories for what I’ve got in mind is the near-infinite number of potential keywords. My goal is to be able to classify posts in much the same way as I do my photos on flickr (adding keywords for subjects, people mentioned, etc.) — for an idea of how many possible keywords that can lead to, take a look at the tags I’ve used so far on flickr, or ishbadiddle’s subject index. That’s a lot of keywords, and it’s an ever-expanding list as new topics come up. While it may be possible to just keep adding sub-category after sub-category, it’d soon make for a ridiculously large menu to have to scroll through.
You’re right, the list of categories would become unmanageable pretty quick. That said, I’m still stuck on using categories to do this. (At least for me) it would make applying exactly the same tag much easier. Plus, I’m already using the keyword field!
I found a piece of JavaScript AutoComplete code that could be integrated on the Edit Entry page fairly easily (I think), but that would only be helpful for the first category/keyword. You’d have to venture into the Assign Multiple Categories dialog to add more, and that would have the same scrolling problem. Unless that were hacked, too. Maybe the “available” category pane could become the AutoComplete JavaScript control, then hit the “add” arrow to add multiple categories/keywords.
But if you’re going to be adding multiple categories/keywords to every entry, you don’t want to have to always enter that Assign Multiple Categories dialog, even if it is hacked to work well. I guess an Assign Multiple Categories section needs to be added to the Edit Entry screen.
For a while I’ve been thinking about instituting a keyword search/subject index on my site, but I haven’t come up with a good way to do it. Maybe I’m headed in the right direction?
Admittedly, it’d be nice if there were some form of auto-completion (whether that be through categories or through some other method), but I’m really not terribly concerned about mangling the keywords I put in. I’m by nature somewhat anal about categorizing things, and if I ever do fat-finger something and mistype a keyword, it wouldn’t be terribly difficult to fix it later on.
Ooh, idea, though I have no idea how easy or realistic it would be to hack into the MT interface.
Just above the keywords field on the entry editing screen, add a short ‘input’ field that has the JavaScript Autocomplete code attached to it. That JavaScript code would need to be populated with the entire list of keywords of course (one of the potential bottlenecks). Next to the ‘input’ field is an ‘add keyword’ button (that would be the default button when you hit the ‘return’ key if you’re in that ‘input’ field, so you don’t end up triggering the ‘post’ button while trying to input keywords).
At that point, you could click into the ‘input’ field, start typing, and the auto-complete code would fill in the rest of the keyword. Once you have the right keyword selected, hit ‘return’, and it gets dumped into the main Keywords field.
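The logic behind that idea is small enough to sketch. What follows is only the matching and “dump into the Keywords field” steps, written in Python for clarity rather than as the browser-side JavaScript it would need to be in the MT interface. The function names `complete` and `add_keyword`, the sample keyword list, and the comma-separated field format are all hypothetical.

```python
def complete(prefix, known_keywords, limit=5):
    """Return up to `limit` known keywords that start with `prefix` (case-insensitive)."""
    needle = prefix.strip().lower()
    if not needle:
        return []
    ordered = sorted(known_keywords, key=str.lower)
    return [k for k in ordered if k.lower().startswith(needle)][:limit]

def add_keyword(keywords_field, keyword):
    """Append `keyword` to a comma-separated keywords field, skipping duplicates."""
    current = [k.strip() for k in keywords_field.split(",") if k.strip()]
    if keyword.lower() not in (k.lower() for k in current):
        current.append(keyword)
    return ", ".join(current)

# Hypothetical list of keywords already in use on the site.
known = ["movable type", "music", "seattle", "semantic web", "movies"]

suggestions = complete("mo", known)                      # what the input field would offer
field = add_keyword("seattle, music", suggestions[0])    # what 'return' would do
print(suggestions, "->", field)
```

The duplicate check in `add_keyword` is a design choice, not part of the original idea: it keeps the same keyword from landing in the field twice if ‘return’ gets hit more than once.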
Any thoughts? Doable?
It might be doable, but that gets into stuff well beyond my abilities, too. I really don’t have a grasp on Perl.
Actually, I think it would all be pretty easy (and slick)—except for the part about the entire list of keywords being available. You’d have to extract all of the keywords from the database for all (or just some? and which ones?) of the weblogs in your installation—which I think means you have to start thinking about Author permissions and such… The more I think about it, the more complex it gets!
I wonder… maybe it could be simpler. How about if there were an Index Template that just listed all the keywords in all the blogs you selected, which gets published as just plain text. Then, in the Edit Entry template, the JavaScript AutoComplete loads that keyword list! The only part of that I’m unsure of is if/how JavaScript can open and parse a file like that. Hmm. That might work.
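On the open question: browser JavaScript can fetch a plain-text file from the same site with XMLHttpRequest, so the loading part is feasible. The parsing part is simple either way. Here is a minimal sketch of it in Python, assuming the index template publishes one keyword per line (that format, the function name `load_keywords`, and the sample data are assumptions, not anything Movable Type dictates).

```python
import io

def load_keywords(stream):
    """Read one keyword per line, dropping blank lines and case-insensitive duplicates."""
    seen = set()
    keywords = []
    for line in stream:
        keyword = line.strip()
        if keyword and keyword.lower() not in seen:
            seen.add(keyword.lower())
            keywords.append(keyword)
    return sorted(keywords, key=str.lower)

# Stand-in for the published file; a real one would be opened or fetched instead.
published = io.StringIO("seattle\nmovable type\n\nSeattle\nmusic\n")
keyword_list = load_keywords(published)
print(keyword_list)
```

Deduplicating at load time matters here because the same keyword will show up once for every entry (and every weblog) that uses it.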
Regarding your comments on my site about the weighted list, yes, the list can be organized as you want; I think without much work at all.
Let me know if you need any help with the subject coding — I’ve tried to be complete with the instructions but I don’t think anyone else has tried them yet. Good luck!
One way to make autocomplete work better would be to use “Google Suggest”-like JavaScript/XMLHttpRequest magic, but I think there is an even simpler way to do it.
I have my own implementation of “related entries”, but instead of categories/keywords I settled on topics/tags. A topic has actual content and is like a WikiWiki topic (e.g. LocDigitalFuture), while a tag is a topic that has no page yet and works similarly to del.icio.us or flickr tags (a fun project would be to integrate it so tags are automatically propagated to del.icio.us…).
When I edit or add new content I can add a set of tags manually, but there is something that works even better: I made my engine suggest a list of possibly related tags to add, based on analysis of existing posts and their tags. It works very well - I now rarely have to enter tags manually.
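The comment doesn’t say how the engine does its analysis, so the following is only one plausible way to get that behaviour, not a description of the actual implementation: count how often tags appear together on existing posts, then suggest the tags that most often accompany the ones already chosen. The sample posts and the names `build_cooccurrence` and `suggest_tags` are hypothetical.

```python
from collections import Counter
from itertools import permutations

# Hypothetical tag sets taken from existing posts.
posts = [
    {"movable type", "plugins", "keywords"},
    {"movable type", "templates", "keywords"},
    {"flickr", "photography"},
    {"movable type", "plugins"},
]

def build_cooccurrence(posts):
    """For each tag, count how often every other tag appears on the same post."""
    counts = {}
    for tags in posts:
        for a, b in permutations(tags, 2):
            counts.setdefault(a, Counter())[b] += 1
    return counts

def suggest_tags(current_tags, cooccurrence, limit=3):
    """Suggest tags that most often accompany the tags already on the entry."""
    scores = Counter()
    for tag in current_tags:
        for other, count in cooccurrence.get(tag, {}).items():
            if other not in current_tags:
                scores[other] += count
    # Highest score first; ties broken alphabetically so results are stable.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:limit]]

cooccurrence = build_cooccurrence(posts)
print(suggest_tags({"movable type"}, cooccurrence))
```

A scheme like this only gets useful once there is a decent back catalogue of tagged posts to learn from, which fits the observation that manual entry becomes rare over time.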