The subject says it all.

Let's say I would like to know which tags are the most common in my collection, but I have got 2000 books and I don't feel like tagging them all just to find out.

Hence "Auto-tagging", that is, to grab the tags that others have already assigned to a book and tag your whole collection from them.

As auto-tagging is not available at the moment, has anybody programmed any sort of automated script (Greasemonkey, Perl script...) that does this task?

jan 5, 2007, 7:36pm

It sounds like you're asking for a few different things here.

If you want to find your most-frequently used tags, just click on the "tags" tab at the top of the page. Your tags are listed by frequency on the left.

I'm not sure why you'd want to take someone else's tag and assign it to your whole collection. But you could just use the "power edit" function to add a tag to any number of books in your catalogue.

jan 5, 2007, 10:43pm

I will try to explain myself better.

I don't want to find out what my most frequently used tags are based on my own tags, because I haven't entered any.

What I would like is to assign each book of my collection the tags that others have already assigned to it.

For example, let's say that I have got The C Programming Language, but I'm lazy and I don't want to tag it. Instead I would like to use the tags that others have used for it:

c(37) C (Programming Language)(1) C Language(1) c programming(2) c++(1) c/c++(1) classic(3) computer programming(2) computer science(11) computers(23) Computers - Programming(1) cs(2) deliciouslibraryexport(1) electronics(1) english(1) file#3(1) geek(1) Language Manual(1) loc:localhardcopy(1) nonfiction(17) own(2) programming(58) Programming C(2) programming languages(4) read(3) reference(11) science(2) soft(1) software development(2) Technical(3) textbook(6) wishlist(1) work

The idea would be to automate that to "auto-tag" my entire collection, so that, in the end, I could find out what tags are the most common in my books.
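The aggregation step described here can be sketched roughly as follows. This is only an illustration, not an existing LibraryThing feature or API: the `books` mapping, the function name `collection_tag_frequency`, and the two "Hypothetical Book" entries are made up for the example, and it assumes the tags others assigned to each book have already been collected somehow.

```python
from collections import Counter


def collection_tag_frequency(books):
    """Count, for each tag, how many books in the collection carry it.

    `books` maps a title to the list of tags copied from other users
    (hypothetical input; how the tags are fetched is out of scope here).
    """
    frequency = Counter()
    for tags in books.values():
        # dict.fromkeys drops duplicate tags within one book, keeping order.
        for tag in dict.fromkeys(tags):
            frequency[tag] += 1
    return frequency


# Illustrative data only: the first entry uses tags quoted in this thread,
# the other two books and their tags are invented placeholders.
books = {
    "The C Programming Language": ["programming", "c", "computers", "nonfiction"],
    "Hypothetical Book B": ["programming", "nonfiction", "reference"],
    "Hypothetical Book C": ["fiction", "classic"],
}

freq = collection_tag_frequency(books)
for tag, count in freq.most_common(3):
    print(tag, count)
```

Counting each tag once per book, rather than summing how many users applied it, keeps one heavily tagged book from dominating the result.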

jan 5, 2007, 11:14pm

Okay, I understand now.

The only difficulty I can see is that many (maybe even most) people don't confine their tags to purely descriptive ones. For instance, in your example, the tag "read" is used more than "computer programming". Now, the latter will be true for all copies of the book, including yours, but you may have just purchased it and not read it yet. On top of that, you'll have tags that may have the same meaning, but are worded differently (as, in your example, "computer programming" and "Computers - Programming"), creating a situation where you'll wind up with duplicative tags.

So wouldn't you still have to go through all your books and remove irrelevant and redundant tags? And might that not be more work than entering your books and then tagging them via the Power Edit function?

jan 6, 2007, 11:21am

Thanks for your reply lilithcat,

It's true, there would be some duplicated tags, and even non-descriptive tags such as "read" or deliciouslibraryexport(1), as is the case above with The C Programming Language.

I wouldn't mind that though because, in the grand scheme of things, after auto-tagging several thousand books, those wouldn't show up as relevant. Let's say that I am happy with 90% accuracy, not aiming at 100%. :)

A possible workaround would be to fine-tune the auto-tagging algorithm to only grab, let's say, the 5 most used tags for each book. In the case above (six tags, since the last two are tied):

programming(58), c(37), computers(23), nonfiction(17), computer science(11), reference(11)

That would be really descriptive. Not perfect, but 90% perfect. :)

Just an idea, I wonder if someone has ever considered the usefulness of this and might have programmed it already.
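A minimal sketch of that "top 5" cutoff, assuming the per-book tag counts are already available as a dictionary. The function name `top_tags` and the choice to keep tags tied with the fifth one are assumptions made for this example, not anything LibraryThing provides; the counts are the ones quoted earlier in this thread.

```python
def top_tags(tag_counts, n=5):
    """Return the n most-used (tag, count) pairs, keeping ties at the cutoff."""
    if n <= 0:
        return []
    # Sort by descending count, then alphabetically for a stable order.
    ranked = sorted(tag_counts.items(), key=lambda item: (-item[1], item[0]))
    if len(ranked) <= n:
        return ranked
    cutoff = ranked[n - 1][1]  # count of the n-th ranked tag
    return [(tag, count) for tag, count in ranked if count >= cutoff]


# A subset of the tag counts quoted for The C Programming Language.
c_book_tags = {
    "programming": 58, "c": 37, "computers": 23, "nonfiction": 17,
    "computer science": 11, "reference": 11, "textbook": 6,
    "programming languages": 4, "read": 3, "classic": 3,
}

for tag, count in top_tags(c_book_tags):
    print(f"{tag}({count})")
```

With these counts the cutoff lands on a tie at 11, so six tags come back rather than five; low-count tags such as "read" are dropped.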


jan 15, 2007, 10:28am

Wouldn't autotagging partially defeat the purpose of the tagging feature, at least from the perspective of other users? Your tags would no longer be a part of the aggregate whole of all tags, but rather an amplification of the already existing aggregate.

If tags on a particular book could be marked as "authoritative" (i.e. programming, c, programming languages on The C Programming Language, for example), and auto-tagging only appended those tags to your books, then I'd love to have autotagging!

Otherwise I'm not as sure.

jan 17, 2007, 1:29pm

It is a GOOD question, because it shows the implicit fear that allowing autotagging would decrease the quality of the tags.

2 things here.

1 - People who don't care about tagging wouldn't tag anyway. So autotagging wouldn't prevent them from doing something they would never have done.
2 - If 1000 people autotagged using the tags that only 100 cared to input, that would have absolutely *NO* effect on the quality of the tags used to describe the books. Statistically speaking.

As only the top 5 most used tags would be used to autotag, and those would be actually the best tags agreed by the minority who care to tag, it would actually be that minority who would still hold the AUTHORITY to properly describe a book.
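The argument above can be checked with a toy model. This is a rough sketch under a strong assumption: every autotagger copies exactly the current top `n` tags of a book and adds nothing else. The function names are made up for the illustration, and the counts are a subset of those quoted earlier in the thread.

```python
def ranking(tag_counts):
    """Tags ordered by descending count, ties broken alphabetically."""
    return sorted(tag_counts, key=lambda tag: (-tag_counts[tag], tag))


def simulate_autotag(tag_counts, autotaggers, n=5):
    """Return new counts after `autotaggers` users each copy the top n tags."""
    copied = set(ranking(tag_counts)[:n])
    return {
        tag: count + (autotaggers if tag in copied else 0)
        for tag, count in tag_counts.items()
    }


before = {
    "programming": 58, "c": 37, "computers": 23, "nonfiction": 17,
    "computer science": 11, "reference": 11, "textbook": 6, "read": 3,
}
after = simulate_autotag(before, autotaggers=1000)

print(ranking(before) == ranking(after))  # is the order of tags preserved?
print(after["programming"], after["reference"])
```

In this model the order of the tags does not change, which supports the point that the taggers still decide what the top tags are. The counts of the copied tags grow by 1000 each while the rest stay put, though, which is the amplification raised in the earlier post.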

There you go.

mar 20, 2007, 1:26am

Instead of using an auto-tagger tool, wouldn't it make more sense just to ask Tim to build you a Tag Cloud page that shows your books and everyone's tags? I would assume that the code to do that would be fairly similar to the existing cloud pages for everyone's tags on every book and your tags on your books.

mar 20, 2007, 12:50pm

It seems from your question that you don't actually want to tag your books, but you just want to see a tag cloud of all the tags that others have for your books? If that is the case, then I agree with 8, ask Tim for that feature (because he has already disagreed with the idea of auto-tagging).

mar 20, 2007, 12:56pm

There is already a tag cloud for the book on the social information page for the book. I doubt whether making it available on the catalogue page would have much support.

mar 20, 2007, 1:14pm

He doesn't want a tag cloud for a particular book, he wants one for all the books in his library.

mar 20, 2007, 4:02pm

If I recall correctly, Tim opposes the autotagging feature suggested above because it would reduce the statistical values of the tags. I am sure I read a post of his explaining this somewhere but I do not care to search for it just now.