• 22 Posts
  • 336 Comments
Joined 5 months ago
Cake day: February 17th, 2025








  • I avoid news and politics; that probably helps a lot in that regard

    Honestly, same. As an ML user (this is my UK alt), I never really saw any of the toxicity described by so many others, but I guess I just learned early on to put my filters up and have been living in my own bubble within ML.

    If a place gets toxic, I tend to just leave instead of doubling down, and I think that pattern might be alien to some because of how Reddit used to function: you couldn’t just leave Reddit politics-wise because the frontpage literally hammered you with it, so you’d have to stand your ground and fight until users or mods waded in. But here? You can leave Lemmy politics-wise because the frontpage is either your subs or local posts filtered through your block lists.

    I’m very happy in my bubble, and it does genuinely confuse me when someone says “oh, you’re from ML, are you?” and I think “yeah… and it’s mostly quiet, just the way I like it”




  • tetris11@feddit.uk to Programmer Humor@programming.dev: Peak UI Design
    1 day ago

    Hierarchical letter clustering would be my guess, or graph-based clustering using n-grams of length 2-4 as nodes and maximising for connections.

    Or using an optimized Regex and printing out the DFA?

    Edit: Quick N-gram analysis (min=3, max=num letters in that month)

    R-code
    library(ngram)
    
    tmonths = c("january", "february", "march",
               "april", "may", "june", "july",
               "august", "september", "october",
               "november", "december")
    
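    # For each month, list every character n-gram of length 3 up to the full word
    zzz = lapply(tmonths, function(mon){
      # ngram_asweka() tokenises on spaces, so space-separate the letters first
      ng = ngram::ngram_asweka(paste(unlist(strsplit(mon, split="")), collapse=" "), min=3, max=nchar(mon))
      # strip the spaces again to recover the letter sequences
      return(gsub(" ", "", ng))
    })
    # Count each n-gram across all months and keep those seen more than once
    res = sort(table(unlist(zzz)))
    res[res > 1]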
    

    This gives the following 9 n-grams with a frequency greater than 1:

      ary   uar  uary   emb  embe ember   mbe  mber   ber 
        2     2     2     3     3     3     3     3     4 
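
    For anyone without the ngram package installed, the same count can be cross-checked in base R. This is only a rough sketch: `char_ngrams` is a made-up helper name, and it simply enumerates every contiguous substring of length 3 or more rather than calling `ngram_asweka()`.

    ```r
    # Month names, as in the snippet above
    tmonths <- c("january", "february", "march", "april", "may", "june",
                 "july", "august", "september", "october", "november", "december")

    # Hypothetical helper: all contiguous substrings of `word` with length >= min_len
    char_ngrams <- function(word, min_len = 3) {
      n <- nchar(word)
      if (n < min_len) return(character(0))
      out <- character(0)
      for (len in min_len:n) {
        starts <- seq_len(n - len + 1)
        out <- c(out, substring(word, starts, starts + len - 1))
      }
      out
    }

    # Count every n-gram across all months and keep those seen more than once
    counts <- sort(table(unlist(lapply(tmonths, char_ngrams))))
    repeated <- counts[counts > 1]
    print(repeated)
    ```

    It should list the same nine n-grams, though the order of tied counts may differ.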
    

    As you can see, the two longest, most common motifs are “em-ber” and “uar-y”.

    Using this I propose the following graph

    Mermaid
    stateDiagram
        direction LR
        sept --> em
        nov --> em
        dec --> em
        em --> ber
        oc --> to
        to --> ber
        febr --> uar
        uar --> y
        jan --> uar
        ju --> ne
        ju --> l
        l --> y
        ma --> r
        ma --> y
        r --> ch
        
        a --> p 
        p --> r
        r --> il
        a --> u
        u --> gust
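
    A quick way to sanity-check a graph like this is to walk every path and see which words it spells. The sketch below is only illustrative: it uses a subset of the edges above (the "uar" branch is left out for brevity), and `spell` is a made-up helper name.

    ```r
    # Subset of the diagram's edges as a two-column matrix: from, to
    edges <- matrix(c(
      "sept", "em",  "nov", "em",  "dec", "em",  "em", "ber",
      "oc",   "to",  "to",  "ber",
      "ju",   "ne",  "ju",  "l",   "l",   "y",
      "ma",   "r",   "ma",  "y",   "r",   "ch",
      "a",    "p",   "p",   "r",   "r",   "il",
      "a",    "u",   "u",   "gust"
    ), ncol = 2, byrow = TRUE)

    # Hypothetical helper: depth-first walk that concatenates node labels
    # from `node` down to every leaf reachable from it
    spell <- function(node) {
      nxt <- edges[edges[, 1] == node, 2]
      if (length(nxt) == 0) return(node)
      unlist(lapply(nxt, function(n) paste0(node, spell(n))))
    }

    # Start nodes are those that never appear as a target
    roots <- setdiff(unique(edges[, 1]), edges[, 2])
    words <- unlist(lapply(roots, spell))

    # Words the graph spells that are not month names
    extras <- setdiff(words, tolower(month.name))
    print(words)
    print(extras)
    ```

    On this subset the walk also spells “maril” and “aprch”, because march and april share the “r” node; splitting it into two nodes would remove those paths at the cost of one extra state.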
    
    




  • The issue with karma and tagging is that we all have one topic that we are toxic about, and we then carry that label everywhere, even on topics that we are completely rational about.

    To quote an old joke:

    So a man walks into a bar, and sits down. He starts a conversation with an old guy next to him. The old guy has obviously had a few. He says to the man:

    “You see that dock out there? Built it myself, hand crafted each piece, and it’s the best dock in town! But do they call me ‘McGregor the dock builder’? No! And you see that bridge over there? I built that, took me two months, through rain, sleet and scorching weather, but do they call me ‘McGregor the bridge builder’? No! And you see that pier over there, I built that, best pier in the county! But do they call me ‘McGregor the pier builder’? No!”

    The old guy looks around, makes sure that nobody is listening, leans in to the man, and says: “but you fuck one sheep…”