  • There’s a lot to cover here but I’ll try to touch on each point:

    The key requirement is fast memory that can be addressed by your GPU, and ideally a lot of it - hence the insane cost of this hardware right now.

    Remember that you need space for the model’s weights (think of this as its ‘knowledge base’) and the context window, which is the data the LLM needs to keep track of your current conversation with it (effectively its short-term memory).
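
    As a very rough back-of-envelope sketch (every number below is an illustrative assumption, not an exact figure): weights memory is roughly parameter count × bytes per parameter at your chosen quantisation, plus an allowance for the context’s KV cache.

    ```csharp
    using System;

    // Rough VRAM estimate; all numbers here are illustrative assumptions.
    double paramCountBillions = 8;   // e.g. an 8B-parameter model
    double bytesPerParam = 0.5;      // ~4-bit quantisation ≈ 0.5 bytes per weight
    double weightsGb = paramCountBillions * bytesPerParam;   // ≈ 4 GB of weights
    double kvCacheGb = 2;            // ballpark for a few thousand tokens of context
    Console.WriteLine($"Estimated VRAM: ~{weightsGb + kvCacheGb} GB");   // ≈ 6 GB
    ```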

    With smaller pools of VRAM (8-16GB) you will have to compromise: either a more capable model whose larger weights leave little room for context, so it loses track of the conversation quickly and starts hallucinating, or a less capable model that can hold a session together for longer but is overall less ‘smart’.

    For software - there are a couple of options for running the LLM itself. llama.cpp is one of the more popular tools and is the one that I use. It has a web UI with the usual chat interface, and also exposes an API that you can plug other tools (e.g. opencode) into, depending on your use case.
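
    To give a flavour of the API side: llama.cpp’s bundled llama-server speaks an OpenAI-compatible HTTP API, so a minimal client can look something like the sketch below (this assumes the server is running locally on its default port, 8080 - adjust to your setup):

    ```csharp
    using System;
    using System.Net.Http;
    using System.Text;

    // Minimal sketch: POST a chat request to a local llama-server instance.
    using var http = new HttpClient();
    var request = """
    {
      "messages": [{ "role": "user", "content": "Hello from my homelab!" }],
      "max_tokens": 128
    }
    """;
    var response = await http.PostAsync(
        "http://localhost:8080/v1/chat/completions",
        new StringContent(request, Encoding.UTF8, "application/json"));
    Console.WriteLine(await response.Content.ReadAsStringAsync());
    ```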

    In terms of hardware recommendations, at 20GB+ of VRAM you do have a bit more headroom than most consumer-grade GPUs offer, but to be honest the most cost-effective way to get a shitload of VRAM is likely not a dedicated GPU at all, but a system built around a recent APU.

    I got a Minisforum MS-S1 last year for exactly this purpose. It is based on AMD’s Strix Halo platform, which it shares with the Framework Desktop and a couple of other similar devices.

    It has 128GB of unified RAM which can be divided between the GPU and CPU however you like, so there is plenty of capacity for even fairly chunky models. It also draws a fraction of the power of a more traditional system with a dedicated GPU, while giving really reasonable performance for most AI workloads - more than enough for a homelab.

    For cloud rental - doable, but pricing is a factor, and of course this will not actually be running locally.

    Usability - manage your expectations, but for a lot of use cases (and depending on the model you are running and the resources you throw at it) it can be comparable to older iterations of ChatGPT, Gemini, etc.

    But remember, you are not a Google or an Anthropic and do not have an infinite pool of compute to throw at your model, nor do you have access to the specific models they are using.


  • I’ve switched both my laptop and desktop over to Linux (Bazzite and Fedora respectively) in the last 6 months.

    The last time I tried to daily Linux (over a decade ago) I ended up switching back eventually, but this time I really don’t think I’ll need to. All of the games I play most often work perfectly, the dev tooling is even better than it is on Windows, and the hardware compatibility side has been completely flawless.

    Gone are the days of having to hunt down obscure Linux drivers for your touchpad or webcam. Everything just works out of the box.


  • Well, I’m currently writing a service and frontend, both in C# (Blazor for the UI), and using docker-compose to build and deploy them to a Raspberry Pi running Linux. So not only cross-platform, but cross-architecture as well.

    This is not a new thing either: .NET has supported cross-platform development since .NET Core was released almost 10 years ago.
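
    For anyone curious, the deployment is roughly the shape sketched below (the service names, paths and ports are placeholders, not my actual config); the compose `platform` key is what targets the Pi’s arm64 architecture:

    ```yaml
    # Illustrative docker-compose.yml; names, paths and ports are placeholders.
    services:
      api:                      # the C# backend service
        build: ./Api            # assumes a Dockerfile in ./Api
        platform: linux/arm64   # cross-build/run for the Raspberry Pi
        ports:
          - "5000:8080"
      web:                      # the Blazor frontend
        build: ./Web
        platform: linux/arm64
        ports:
          - "80:8080"
    ```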


  • Baldur’s Gate 3 (~600 hours), BeamNG.drive (~550), Cities Skylines (~300), Space Engineers (~300), 7 Days to Die (~250) and Satisfactory (~230).

    These are all stats from Steam and probably not fully representative. Satisfactory, for example, I used to play on Epic (I got it as a free game over there), and I probably logged at least another 500 hours or so on that platform.

    My most played game of all time is most likely TES: Oblivion, which I started playing at release back when I was a teenager and had almost infinite free time. I’m not sure if I still have my oldest save to confirm, but I suspect it would be at least 1,500 hours, probably more across several characters.


  • I decided to set up Fedora on my new laptop as it was either take a chance on that or spend like 3 hours debloating a Win11 install.

    It’s been over 10 years since I last tried dailying Linux, and we have come a long way in that time. Everything just worked out of the box. No fucking around needed.

    Even relatively niche stuff like my Thunderbolt dock and the laptop’s fingerprint sensor was picked up. And, thanks to the investment Valve has been putting into Wine and Proton, pretty much every game I’ve tried has worked with no issue.

    Next time my desktop is due for a clean install I’ll definitely be doing the same there.


    Not exactly crazy, just mysterious… this was at a software company I worked at many years ago. It was one of the developers on the team adjacent to ours, someone I worked with occasionally - a nice enough person, really friendly and helpful. Everyone seemed to get on with them well, and they came across as a pretty competent developer. Nothing suggested any kind of gross misconduct was happening.

    Anyway, we all went off to get lunch one day and came back to an email saying that this person no longer worked at the company, effective immediately. Never saw them again.

    No idea what went down - but the culture at that place actually became pretty toxic after a while, which led to a few people (including me) quitting - so maybe they dodged a bullet.