this post was submitted on 30 Aug 2025
        
      
      220 points (98.7% liked)
      Fediverse

 
          
          
Lemmy's search is a lot better if you don't use the 'All' type, like just search comments or posts. With 'All', it is basically useless. Like sorting by controversial on a user's profile.
Here's a search I did just now. Despite trying to restrict it to the last month ("Top Month"), none of the results on the first page are within the last month.
Ah, right, comments don't actually support Top with time ranges in 0.19: https://join-lemmy.org/docs/users/03-votes-and-ranking.html#sorting-comments, my bad. The fact that lemmy-ui still shows all of them is very confusing, I'll admit.
This is different in 1.0, the logic was changed for both posts and comments so these will work in future.
That's a page that really shouldn't be relevant, and it concerns me if it is, because the search page should not be working on the same ordering logic that comment pages or the front page use.
Anyway, I mainly only use the "top month" option as a proxy for what I really want, which is "filter by posts/comments in the last month", usually because I'm searching for something I saw recently, and I have a rough idea of when it was posted.
Eh, I don't think it's that surprising. Getting a list of comments on a post vs getting them from a search term are very similar operations, so it doesn't make too much sense for these to have different queries in the backend. One thing you could do, but no client to my knowledge does, is add a search bar to a post that searches through the comments only within that thread.
Everything in the backend uses the same sorting as the posts do on that page except comments, which is frustrating. Comments do need a different sort enum as there are some options that don't apply to comments (scaled, new comments, etc.), but yeah the fact the top options don't work for comment search when they should is opaque and not user friendly.
I can't wait for 1.0 to actually come out because I feel like a broken record, but this is fixed there.
Sure, but one would have thought that the ordering in a search is fundamentally different from the ordering in other places. You want something that contains the words you've searched for near each other to appear ahead of a post that has those words scattered at random because it's a 500 word essay. You want exact word matches prioritised ahead of entirely unrelated words that include the same characters. Like "enum" should turn up your comment, but rank a comment that contains the text "renumbers" much lower. A particularly smart search page might keep "enumerate" high while rejecting "renumbers", though.
Of course, it's true that at least in the current latest release, Lemmy fails at all of this. I hope 1.0 is at least fixing some of it?
This doesn't have anything to do with sort ordering though, which is based on time and votes. Text search is just a filter on top of sorting.
Lemmy does text search via pg_trgm, which works by breaking both the content text and the search text into trigrams*. If the content contains enough of the search trigrams, it's considered to match the search term.
* A trigram is just a 3-character 'word'. For example, the trigrams of 'enum' are {"  e"," en",enu,num,"um "}.
What you're describing is closer to a tsvector, so you could open an issue on Lemmy's GitHub to move from trigrams to tsvectors. One advantage trigrams have, though, is that they're language-agnostic, while tsvectors need both a dictionary and to know the language (thankfully, Lemmy already has this info via the language setting, though the way it's stored will need to be changed to accommodate this). But tsvectors do provide much more intuitive language matching, like what you outlined.
That doesn't feel like how search should work. It should be ranking results that fit the search query better higher than ones that fit it less. Regardless of how the search is done, that should remain true. So if you're using trigram matching, instead of a binary "does the comment contain 80% of the trigrams in the search query", it should be "if it contains 100% of the trigrams from the search query, rank it higher than something with a 90% match, which is higher than 80%." Or maybe not that precisely, but something so that more relevant results appear above less relevant ones.
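To make the graded-ranking idea concrete, here is a minimal Python sketch of trigram similarity. It assumes pg_trgm-style padding (two leading spaces and one trailing space per word) and scores a pair as shared trigrams over all distinct trigrams. It is only an illustration, not Lemmy's actual query, and the names `trigrams`, `similarity`, and `rank_words` are made up for this example.

```python
import re


def trigrams(text: str) -> set[str]:
    """Approximate pg_trgm trigram extraction for plain ASCII text."""
    grams: set[str] = set()
    # Lowercase, split into alphanumeric words, then pad each word with
    # two leading spaces and one trailing space before slicing.
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = f"  {word} "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def rank_words(query: str, candidates: list[str]) -> list[tuple[str, float]]:
    """Order candidates from most to least similar instead of a yes/no match."""
    scored = [(c, similarity(query, c)) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for word, score in rank_words("enum", ["renumbers", "enumerate", "enum"]):
        print(f"{word}: {score:.2f}")
```

For the query "enum", this puts "enum" first, then "enumerate", then "renumbers", which is the ordering asked for earlier in the thread. Real comments are longer than one word, so an actual implementation would need a word-level comparison rather than whole-text similarity.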
Without doing something like that, it's just...not very useful. Which is the observed behaviour of search on Lemmy right now which started this whole conversation.
But the existing filters already prescribe an order independent of how closely the search term matches. You brought up top month, and I don't see how you'd want that to work other than as a binary filter sorted by votes.
What you're describing would be a new sort order, analogous to Reddit's 'Relevance' sort. It's certainly doable with Postgres' built-in distance operators, though it would be slower.
The truth is I don't want "top month". What I really want is "best result, filtered by this month". But unfortunately that doesn't exist, and in the absence of that, I use "top month".
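For what it's worth, "best result, filtered by this month" could be sketched roughly like this in Python: a hard cutoff on the publish date, then ordering by how well the text matches, with votes only as a tie-breaker. Everything here is hypothetical (the `Comment` class, `relevance`, `search_recent`, the 30-day window, the 0.3 cutoff, the sample data), and the relevance score is a simple best-matching-word trigram similarity, not what Lemmy or Postgres actually computes.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Comment:
    text: str
    published: datetime
    score: int  # net votes


def trigrams(text: str) -> set[str]:
    """pg_trgm-style trigrams: pad each word with two spaces before, one after."""
    grams: set[str] = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = f"  {word} "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def relevance(query: str, text: str) -> float:
    """Best trigram similarity between the query and any single word of the text."""
    q = trigrams(query)
    best = 0.0
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        w = trigrams(word)
        best = max(best, len(q & w) / len(q | w))
    return best


def search_recent(query: str, comments: list[Comment], now: datetime,
                  days: int = 30, min_relevance: float = 0.3) -> list[Comment]:
    """Filter by time window first, then sort by relevance (votes break ties)."""
    cutoff = now - timedelta(days=days)
    hits = [(relevance(query, c.text), c) for c in comments if c.published >= cutoff]
    hits = [(r, c) for r, c in hits if r >= min_relevance]
    hits.sort(key=lambda rc: (rc[0], rc[1].score), reverse=True)
    return [c for _, c in hits]


def utc(year: int, month: int, day: int) -> datetime:
    return datetime(year, month, day, tzinfo=timezone.utc)


# Made-up sample data for illustration only.
SAMPLE = [
    Comment("This renumbers every item", utc(2025, 8, 28), 50),
    Comment("You can enumerate the options", utc(2025, 8, 20), 10),
    Comment("Comments need a different sort enum", utc(2025, 8, 25), 3),
    Comment("An old enum discussion", utc(2025, 6, 1), 100),
]

if __name__ == "__main__":
    for c in search_recent("enum", SAMPLE, now=utc(2025, 8, 30)):
        print(c.published.date(), c.score, c.text)
```

In this sketch the highly upvoted "renumbers" comment is dropped as a weak match and the old comment is dropped by date, so the exact "enum" match comes first despite having the fewest votes. A plain "top month" sort would instead order whatever passes the filter by votes alone.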