Sorting changed, but the same

Discussion: New features


1timspalding
Edited: nov 25, 2007, 2:33 am

I've changed how book titles are sorted. The proximate reason was to allow members to correct for special cases, when simple rules about not alphabetizing "the" don't work well (see this post and the example A is for Ox). Anyway, once I know the system is working, we can start allowing that. I think the best method will be to allow members to insert some sign within the title, showing where the sorting should start, if not the default. I'm thinking it should be two vertical lines, e.g., "||A is for Ox." Your suggestions for user-interface would be appreciated.

The system is now in place. Nothing should have changed. If it has, I'll notice it after a day or two of running, or when someone tells me about bad sorting behavior.

Anyway, although this will make a few users very happy, and underscore LT's commitment to quality cataloging, the real reason is something else. By not storing two versions of the title (one for real and one to sort by) but just storing the title and a character number, we'll save a few gigabytes on the database. Considering our relentless growth, jumping back a month or two on file size is going to be a welcome relief for the monkeys.
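A minimal sketch of the "title plus character number" idea described above, in Python. The names `Book`, `sort_offset`, and `default_sort_offset`, and the list of English articles, are assumptions made for illustration; nothing here reflects LibraryThing's actual schema or code.

```python
from dataclasses import dataclass

# Assumed default rule: skip a leading English article (hypothetical list).
LEADING_ARTICLES = ("the ", "a ", "an ")


def default_sort_offset(title: str) -> int:
    """Return the character position where sorting starts by default."""
    lowered = title.lower()
    for article in LEADING_ARTICLES:
        if lowered.startswith(article):
            return len(article)
    return 0


@dataclass
class Book:
    title: str
    sort_offset: int  # the stored "character number"; no second title copy

    def sort_key(self) -> str:
        # The sort form is derived on demand from the title and the offset.
        return self.title[self.sort_offset:].lower()


books = [
    Book("The Magic Mountain", default_sort_offset("The Magic Mountain")),
    Book("A is for Ox", 0),  # special case: override the default offset of 2
    Book("Pnin", default_sort_offset("Pnin")),
]
ordered = [book.title for book in sorted(books, key=Book.sort_key)]
```

Only one small integer is stored per book, and a special case such as A is for Ox is handled by storing a different number rather than a different string.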

2thorold
nov 25, 2007, 4:41 am

3timspalding
nov 25, 2007, 8:51 am

Ah, good. I got it. It had to do with an extra space at the end, which shouldn't have been counted.

I'm fixing any that suffered from this problem. Should be good in a half-hour.

4rebeccanyc
Edited: nov 25, 2007, 9:50 am

I am getting books starting with the word "I" (e.g., I Married a Communist by Philip Roth) sorting before titles beginning with the number "1". And for some reason, the guidebook Montana (Compass American Guides) is sorting between two titles starting with the word "I", and The Magic Mountain is sorting between The Age of Napoleon and, in this order, The Baked Apple, Tales of Mendele the Book Peddler, and The Alhambra. These are just some of the errors on just the first two pages of my catalog.

Reverse order is just as bizarre, with Mythology and The Mystery of Numbers appearing between Zen and the Art of Motorcycle Maintenance and Youth and the Bright Medusa.

Edited to add: Interestingly, this is not reproducible, although there are still numerous errors. When I resorted on title, I got a guidebook on Rome, Rome and Environs (Blue Guide), 3d ed. and The Rubber Band sorting between two titles beginning with the word "I".

5AngelaB86
nov 25, 2007, 11:42 am

I think this new system may be responsible for why my ...And Now Miguel is listed first in my catalog, when it was originally listed in the A's. Aaaack!

6AnnaClaire
nov 25, 2007, 12:52 pm

Wait, how do I fix my alphabetization? I tried changing the title of Les Misérables to "Les ||Misérables" per my understanding of the first message, but that just added a double pipe to the title and didn't move it.

7khms
Edited: nov 25, 2007, 1:27 pm

I think Tim meant that this part isn't implemented yet. (That's why "Nothing should have changed" and "I'm thinking it should be ... Your suggestions for user-interface would be appreciated".)

ETA He already has two suggestions from me in the other thread he mentioned. And one from himself that he's apparently abandoned after I didn't like it (to put it mildly).

8timspalding
nov 26, 2007, 3:24 am

Hey. I'm working on this. Most problems should be solved, but clearly there are some odd wrinkles. For example, rebeccanyc's titles had extra spaces between words, which I didn't consider but am now.

9rebeccanyc
nov 26, 2007, 8:00 am

Tim, it looks fine now. Not sure what you meant by extra spaces between words, but I will certainly try to look for these in the future. (By the way, titles added through the LOC usually have a space between the title and the colon that separates the title from the subtitle; e.g., "This is the title : and this is the subtitle." When I remember, I take this out during editing.)

10timspalding
nov 27, 2007, 12:27 am

>9 rebeccanyc:

No, don't worry about it. Fundamentally, the forms allow you to put two spaces between words. HTML always removes the extra space visually, but I wasn't accounting for the possibility.

The main problem now is that, fundamentally, these should be the same. They're not always now:

10,000 Maniacs
10000 Maniacs

Pnin : A Novel
Pnin. A Novel

T
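One way to make the pairs above compare as equal is to normalize each title before sorting. The sketch below is an illustration in Python, not the code LibraryThing uses; the function name `normalize_for_sort` and the exact rules (drop thousands-separator commas, turn other punctuation into spaces, collapse runs of whitespace) are assumptions.

```python
import re


def normalize_for_sort(title: str) -> str:
    """Build a comparison form of a title (hypothetical rules)."""
    text = title.lower()
    # Drop commas used as thousands separators: "10,000" -> "10000".
    text = re.sub(r"(?<=\d),(?=\d{3}\b)", "", text)
    # Treat all remaining punctuation as a space: " : " and ". " become blanks.
    text = re.sub(r"[^\w\s]", " ", text)
    # Collapse doubled, leading, and trailing whitespace.
    return " ".join(text.split())


same_number = normalize_for_sort("10,000 Maniacs") == normalize_for_sort("10000 Maniacs")
same_subtitle = normalize_for_sort("Pnin : A Novel") == normalize_for_sort("Pnin. A Novel")
```

The final `split`/`join` step also covers the earlier wrinkles in this thread: a trailing space and two spaces between words no longer change the result.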

11henkl
nov 27, 2007, 3:36 am

>9 rebeccanyc:

When there is no space between the title and the colon, I add one.

12GreyHead
nov 27, 2007, 4:41 am

Just wondering if this has also fixed the feature where LT treated 'Pnin' and 'pnin' as different works?

13khms
nov 27, 2007, 2:14 pm

Hmm. Is there a reason Pnin wasn't touchstoned?

14timspalding
nov 27, 2007, 2:54 pm

Actually, yes. I prefer only to use touchstones when it enriches the experience. In this case, you're not really motivated to check out a title because I use it as an example of alphabetization. Worse, it mucks up the "conversations" part of the work page. When you are on that page you want to know what people are saying about some interesting work, not what discussions about alphabetization touch on the title of the work.

15khms
Edited: nov 27, 2007, 3:05 pm

Well, in this case, *I* wanted to know what this book was everyone mentioned, because I had never heard of it before. That is, getting from the discussion to the work. Turned out to be something completely unexpected, too.

I looked at "discussions about your books", but as the result seems to be about half of all talk threads (probably less, but it feels like that), I judged that one to be pretty useless for me.

I followed the "discussions about this work" link a few times, but found that usually those discussions I looked at didn't say anything interesting about the work, so this also failed to register under "interesting features" for me.

Which leaves the one in the first paragraph, which *is* something that (usually) looks useful to me.

ETA: I've just started to register those books, and I'm freaking 426 of over 300,000 members? That doesn't feel right!

16AngelaB86
nov 27, 2007, 3:14 pm

Could someone please explain to me (in idiot-proof "a 5 year old could do this" words) how to make ...And Now Miguel get off the top of my catalog (before the # titles) and back in the As where it belongs? I don't know how to do that tall lines thing...

17readafew
nov 27, 2007, 3:22 pm

On most US keyboards it's the key above the Enter key: Shift + '\'.

18reading_fox
nov 28, 2007, 8:59 am

So where should 20,000 leagues under the sea be filed?

Currently it sorts as 20
rather than 20000
or the probable ideal option of twe - this latter might be a bit difficult to code though?

19lorax
nov 28, 2007, 2:02 pm

18

"currently it sorts as 20"

So strip the comma from the title of your copy.

"this latter might be a bit difficult to code"

Not sure whether it's really ideal or not, but it looks as if it's been done, sort of (the Perl module Math::BigInt::Named). It's not perfect (you need to fiddle with your code a bit to make it work, it only supports English and German, and "twelve" is spelled wrong) but at least someone trying to do this wouldn't need to reinvent the wheel.
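For a sense of what spelling out a leading number involves, here is a small English-only sketch in Python. It does not use or imitate the Perl module mentioned above; `number_to_words` and `spelled_sort_key` are hypothetical helpers, and the sketch only handles values below one billion.

```python
import re

ONES = ["", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def below_thousand(n: int) -> str:
    """Spell out 1..999 in English words."""
    words = []
    if n >= 100:
        words += [ONES[n // 100], "hundred"]
        n %= 100
    if n >= 20:
        words.append(TENS[n // 10])
        n %= 10
    if n:
        words.append(ONES[n])
    return " ".join(words)


def number_to_words(n: int) -> str:
    """Spell out 0..999,999,999 in English words."""
    if not 0 <= n < 10**9:
        raise ValueError("only 0 <= n < one billion is supported")
    if n == 0:
        return "zero"
    parts = []
    for value, name in ((1_000_000, "million"), (1_000, "thousand")):
        if n >= value:
            parts.append(below_thousand(n // value) + " " + name)
            n %= value
    if n:
        parts.append(below_thousand(n))
    return " ".join(parts)


def spelled_sort_key(title: str) -> str:
    """Replace a leading number (with or without commas) by its words."""
    match = re.match(r"\d{1,3}(?:,\d{3})+|\d+", title)
    if not match:
        return title.lower()
    value = int(match.group().replace(",", ""))
    if value >= 10**9:
        return title.lower()  # out of range for this sketch: leave as-is
    return (number_to_words(value) + title[match.end():]).lower()


key = spelled_sort_key("20,000 Leagues Under the Sea")
```

With this key the title files under "twe", with or without the comma. It is language-specific and naive about years: 1984 becomes "one thousand nine hundred eighty four", which is probably not where a reader would look for it.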

Unless of course they're coding in something other than God's Chosen Language. :)

20timspalding
nov 28, 2007, 2:24 pm

Could someone please explain to me (in idiot-proof "a 5 year old could do this" words) how to make ...And Now Miguel get off the top of my catalog (before the # titles) and back in the As where it belongs? I don't know how to do that tall lines thing...

Don't do anything for today, okay? I'm going to play. It should work for that title, but it's not. I need to figure out why.

Unless of course they're coding in something other than God's Chosen Language. :)

If Perl is God's chosen language, Nietzsche was right.

Sorry, you walked into that one ;)

21Lumilyhty
dec 1, 2007, 6:37 am

Tim, I'm glad you always have an ear ready to listen to LT users' suggestions.

The solutions put forward in this thread and others seem awfully cumbersome to me. To me, a perfectly simple, useful and easy-to-use way is what the music software iTunes has: just an extra field for sorting the titles. So, taking an example from the field of music, if I have an aria called 'Martern aller Arten' from the opera 'Die Entführung aus dem Serail' by Mozart, I can make it sort itself as 'Die Entführung...', 'Entführung...' (scrapping the German definite article) or 'Martern...' (just going with the more specific part in the work), depending on what I've grown to regard as natural. Users also use this to eliminate the confusing factor of different scripts from their music library. Thus, I have all music by Руслана (Ruslana) conveniently listed between 'Ro' and 'S'.

I refuse to believe this is hard to do programming-wise. Thinking of LT, aside from the already mentioned problems with non-English definite articles and titles starting with non-letter characters, this would also sort out the following problems:

*The above-mentioned different scripts
*Problems with accented Latin characters (right now LT does not distinguish between, for instance, e and é, which messes up the order)
*Titles beginning with numerals (I suspect somebody's bound to want them listed as 'Twenty-thousand leagues', not 20,000 leagues - and need I add this is language-specific)

So tell me what you think of these ideas, people...

22timspalding
dec 1, 2007, 11:55 am

>21 Lumilyhty:

The problem with your idea, as I see it, is that it puts the information in two places. 99% are totally uninterested in this issue. I don't think they want to fuss with it, or even be confronted with it. And if they change the title for some non-sorting related issue, they're going to be very weirded out if the old position still applies. I suppose we could have a second field and have it be empty by default, and when it was empty it would sort by the actual title.

That might work and I'll consider it. From a db point of view, I prefer storing the title and a sorting number. Storing the title twice doubles the storage. LT is already storing more bibliographic data than all but the two or three largest libraries in the world, and with far less technical infrastructure. So where I can save space, I will. The amounts are non-trivial. If I had to raise the membership fee for everyone by $1 to cover this, would it be worth it?
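The "second field, empty by default" variant could look roughly like this in Python. The class and field names (`Book`, `sort_title`, `effective_sort_title`) are invented for the example and the sample titles are only illustrative.

```python
from dataclasses import dataclass


@dataclass
class Book:
    title: str
    sort_title: str = ""  # empty by default; most members never touch it

    def effective_sort_title(self) -> str:
        # Fall back to the real title whenever the sort field is empty.
        return (self.sort_title or self.title).casefold()


library = [
    Book("Solaris"),
    Book("Руслана", sort_title="Ruslana"),
    Book("Die Entführung aus dem Serail", sort_title="Entführung aus dem Serail"),
    Book("Roots"),
]
ordered = [book.title for book in sorted(library, key=Book.effective_sort_title)]
```

The trade-off raised in this post still applies: every book with a filled-in sort field stores a second string, and an edit to `title` leaves `sort_title` untouched unless the code clears or updates it.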

The non-English sorting is very, very difficult. LT does treat e and é as the same letter from a sorting standpoint. That's how English does it. But letter order and alphabetization differ in all sorts of interesting ways between languages. In theory it might be possible for users to set one standard for their library, but I don't think you could have all the standards at the same time. That is, the system needs to decide if ç is a separate letter from c (Turkish) or not (French). What's possible is not, however, going to happen. Doing different language sort is not easy, and would move lots of the "work" from the database into PHP, which is not desirable here.
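Treating e and é as the same letter for sorting can be sketched with Python's standard `unicodedata` module: decompose each character, then drop the combining marks. The helper names `fold_diacritics` and `english_sort_key` are assumptions for illustration, and this is an English-style rule only, not a general solution.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Remove combining marks so that 'é' compares like 'e'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def english_sort_key(title: str) -> str:
    # casefold() also handles case-insensitive comparison.
    return fold_diacritics(title).casefold()


titles = ["Zen", "Émile", "Emma"]
plain_order = sorted(titles)                         # by raw code point
folded_order = sorted(titles, key=english_sort_key)  # accents ignored
```

Letters that have no decomposition, such as ø or æ, pass through unchanged, and a language that treats an accented letter as a separate letter would need different rules.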

23vpfluke
dec 1, 2007, 2:41 pm

Tim:
Are the three largest libraries in the U.S. the Library of Congress, Harvard University, and New York Public? How does the national British Library rank?

On ordering letters, I think treating e and é as the same is quite OK with me. I don't remember much beyond French.

Sometimes, it is hard to do joint cataloging (which we all do in LT) when big libraries have a master cataloger and that person is a final arbiter of the way the information is presented.

24timspalding
dec 1, 2007, 2:59 pm

Yes, that's right. Actually, I've never found a good list of world libraries by size. Lists of that sort are always half-bogus. You can count them in a dozen different ways.

25DouglasAtEik
dec 10, 2007, 4:47 pm

>1 timspalding:
Has this change been implemented (i.e. book title "Der ||Something" to sort by "Something" rather than "Der") ?
I understand your post to mean that it is implemented, whereas a quick test appears to demonstrate the contrary ...
?

26timspalding
dec 10, 2007, 10:49 pm

No, just the intellectual structure for it. Let me look at it right now.

27AnnaClaire
Edited: dec 11, 2007, 10:10 am

Hang on, DouglasAtEik did the same thing I did in message 6. I'm starting to think there was something confusing in how it was presented, the most likely culprit being:
The system is now in place. (#1)


The sentence after probably didn't help, either (if the system is in place, why would nothing have changed?).

Please, remember that we speak English, not whatever jargonese-sub-variant-dialect you were using in that first message!

28timspalding
dec 12, 2007, 1:07 am

I know. Apologies.

29timspalding
dec 12, 2007, 1:44 am

Okay, it's in there. You can do:

||A is for Ox

Or:

Ta ||Indika

The trick is that the || is going to show in various places. I think it SHOULD show in the detail screen, at least if it's your book. But not everywhere. I've nuked it from the "your library" view.
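A rough Python sketch of how a "||" marker might be handled: find it, remember its position as the sort offset, and strip it from the title shown in list views. The function name `split_sort_marker` and the choice to return `None` when no marker is present are assumptions, not a description of the real implementation.

```python
from typing import Optional, Tuple

MARKER = "||"


def split_sort_marker(raw_title: str) -> Tuple[str, Optional[int]]:
    """Return (display_title, sort_offset); offset is None without a marker."""
    position = raw_title.find(MARKER)
    if position == -1:
        return raw_title, None  # caller falls back to the default rule
    display = raw_title[:position] + raw_title[position + len(MARKER):]
    return display, position


display, offset = split_sort_marker("Ta ||Indika")
sort_form = display[offset:]
```

Here `display` is the title without the pipes and `sort_form` is the part the catalog would sort on. A marker at the very start, as in "||A is for Ox", yields offset 0 and so overrides any default article skipping.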

30koffieyahoo
dec 12, 2007, 3:25 am

Before I start adding these to the books in my library: Since this effectively changes the title of a book and since changing a title may uncombine a book from a work. What happens in this case?

31koffieyahoo
Edited: dec 12, 2007, 3:46 am

Right, definitely not working correctly yet:

* My copy of De bonkige baarden seems to have become uncombined (at least my library gives 0 shared, while before the edit I used to share it with snellius). Actually this is really weird: it doesn't seem to have become uncombined, but the member count is now off by 1.

* Adding the double lines to my copy of Kafka's Das Schloß. Didn't seem to work at first. Now it works, but no double lines show up when I edit the work (either in place in my library or on the details page)

32AnnaClaire
dec 12, 2007, 10:56 am

OK, so I've double-piped Le Morte d'Arthur to Le ||Morte d'Arthur, Les Misérables (Signet Classics) to Les ||Misérables (Signet Classics), Les Précieuses Ridicules (Petits Classiques Larousse) to Les ||Précieuses Ridicules (Petits Classiques Larousse).

They're not showing up in the L's anymore, but they're not showing up in the M's and P's, either. Les Misérables is showing up at the end of the list. I can't find the other two.

I'll wait and look again. In the meantime, ?????

33thorold
Edited: dec 12, 2007, 6:01 pm

I've managed to get ||A bord de l'Etoile-Matutine and De ||Aanslag up to the top of the list, but Die ||Harzer Schmalspurbahnen. is appearing in the B's ahead of Die ||Blechtrommel. Is the re-sort still running?

ETA: ...and Le ||Chant de l'équipage is now up there in front of all the a's and the numbers.

34AnnaClaire
Edited: dec 12, 2007, 6:41 pm

Well, the three books I double-piped this morning are all in the right places now. So, look again a bit later -- it might fix itself.

35koffieyahoo
Edited: dec 13, 2007, 3:00 am

31>

Mmm, the problem with "De bonkige baarden" may already have been there now I think about it and the sorting of Das Schloss seems to have recovered. However, changing the title of a work sometimes seems to mess up the member count...

What is definitely a problem is that when I do in-library editing of the title of a book, i.e. double click on a title in my library to change it, the double vertical lines don't show up.

Edit: The double verticals are showing up in the "random books from my library"-section on my profile page. I don't think I want to see them there.

36thorold
dec 13, 2007, 2:28 am

Looks fine for me now, except that Die ||Blödsinnigen / The Idiots appears before Die ||Blechtrommel. Since the rules for sorting accented characters are inconsistent from one language to another, there's probably no solution to that type of problem that won't upset another group of users somewhere else...

37khms
dec 13, 2007, 2:29 pm

There's no general solution that works for everybody (only picking language-sensitive versions would possibly do that), but there's a fairly good compromise solution described ... just let me get the other window ... here: http://www.unicode.org/reports/tr10/ (or you can go the extra mile and implement the tailoring explained there and let every user specify a locale for sorting - but I'd call that overkill at the present time, unless you can find a library that already does all that).

38Melanie_Green-Ar6368
jul 12, 2021, 4:43 pm

How can you sort your books by author? Sorry if this was already asked

39MarthaJeanne
jul 12, 2021, 4:52 pm

When you have 'Your Books' open, click on the Author header. If you want the books sorted by title in each author, click first on the Title header, then on the author header.

OR

Click on the sort icon at the top of the page. This is up and down arrows. Go down to the pop up and set sort and subsort.

40al.vick
Edited: jul 13, 2021, 9:03 am

What about special characters in such titles as Aïda? That is the first A title listed in my catalog, (well, after AAAA Wizardry apparently, so that's strange) but it would be nice if it sorted as Aida, or at least after Ah. How are such characters handled, and how should they be handled?

41Nicole_VanK
Edited: jul 13, 2021, 9:47 am

>40 al.vick: I agree I would like to see them sort somewhere with the base letter. The tricky thing is that the actual rules vary from country to country. And that there is more than one way to type diacritics. So we will probably never be able to please all. (Same thing for author names, by the way).
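The point about there being more than one way to type diacritics can be shown with Python's standard `unicodedata` module. The helper name `canonical` is an assumption; the two spellings of Aïda below look identical on screen but are different strings until they are normalized.

```python
import unicodedata

precomposed = "A\u00efda"  # "ï" typed as a single code point
combining = "Ai\u0308da"   # "i" followed by a combining diaeresis


def canonical(title: str) -> str:
    """Bring both spellings to one form (NFC) before comparing or sorting."""
    return unicodedata.normalize("NFC", title)


differ_raw = precomposed != combining
match_after = canonical(precomposed) == canonical(combining)
```

Normalizing first means the two spellings at least sort together, whatever rule is then applied to the accented letter itself.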

42anglemark
jul 13, 2021, 9:56 am

Cré na cille : The dirty dust sorts first for me, right now, which is clearly wrong.

43spiphany
jul 13, 2021, 10:00 am

This thread was started in 2007....just in case anyone (like me) started reading this thread and was very confused about why we were suddenly discussing using the pipe character for sorting and wondering what happened to the "sort character" field.

44MrAndrew
jul 13, 2021, 10:03 am

>19 lorax: (re: where should 20,000 leagues under the sea be filed?) So strip the comma from the title of your copy.

I would die before i would do that.

45al.vick
jul 13, 2021, 10:04 am

>43 spiphany: Thanks for that note!

46anglemark
jul 13, 2021, 10:21 am

>43 spiphany: I completely failed to notice. Thanks!

47MarthaJeanne
jul 13, 2021, 10:23 am

>42 anglemark: If you go to the edit book page for that work, what shows in the sort field?

48anglemark
jul 13, 2021, 10:28 am

>47 MarthaJeanne: Yeah, it was wrong. Once I realised that this was an old thread and that Tim hadn't changed the sorting system again, I knew how to fix it, of course.

49paradoxosalpha
jul 13, 2021, 10:32 am

This is one of those cases where thread necromancy has real hazards.

50PawsforThought
jul 13, 2021, 11:09 am

>41 Nicole_VanK: Yeah, for my library it drives me somewhat up the wall to see Å and Ä sorted with/just after A (same with Æ, but that’s a very small issue for me personally) and Ö (as well as Ø/Œ) with O - in my world they both go after Z.
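For illustration, a per-language ordering like the one described here (Å, Ä, Ö after Z) can be approximated with a simple alphabet table in Python. This is a toy sketch, not the Unicode Collation Algorithm and not anything LibraryThing does; the names `SWEDISH_ALPHABET` and `swedish_sort_key` are made up, the sample words are arbitrary, and characters outside the table (spaces, digits, punctuation) are simply ranked first.

```python
import unicodedata

# Assumed letter order for this sketch: å, ä, ö follow z.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
RANK = {letter: index for index, letter in enumerate(SWEDISH_ALPHABET)}


def swedish_sort_key(title: str) -> list:
    """Map each character to its rank in the table above."""
    text = unicodedata.normalize("NFC", title).lower()
    # Characters not in the table get rank -1 and therefore sort first.
    return [RANK.get(ch, -1) for ch in text]


titles = ["Öland", "Zebra", "Ära", "Apa", "Ål"]
ordered = sorted(titles, key=swedish_sort_key)
```

A different member could supply a different table, which is the per-user setting discussed earlier in the thread; doing that properly for every language is what the Unicode collation report linked above is for.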

51spiphany
jul 13, 2021, 11:23 am

Perhaps we could take the discussion of sorting non-Latin characters and characters with diacritics to another thread (say, here: https://www.librarything.com/topic/57317)
And let this one return to the grave?

52MrAndrew
jul 14, 2021, 3:56 am

crossroads, midnight, stake through heart.

53Nicole_VanK
jul 14, 2021, 7:18 am

>43 spiphany: Oh, yikes - I failed to notice that