MIT CSAIL researchers used a natural language-based logical inference dataset to create smaller language models that outperformed much larger counterparts.
It’s interesting that they were able to get a model with 350M parameters to outperform others with 175B parameters.
Interesting indeed, even though it seems to work only on specific tasks. I definitely support this direction, though. LLMs are getting out of hand (and have been for a while now), having slipped from researchers’ grasp into that of big tech companies. I think the work the open source and research community is already doing with the ChatGPT lookalike models is incredible.