Right now I am testing some LLMs that were trained specifically on Dutch-language training sets. I can test them offline, on my own machine, in the terminal. It’s extremely easy to try out these models. And after some digging, I found the dataset they are based on: the Gigacorpus, with Dutch forum posts, books, legal texts, Wikipedia, etcetera. It’s fascinating to see how many researchers and enthusiasts are working on AI models that are private, local and open source. What a difference from the ongoing and growing hype we see around OpenAI and Californian Big Tech…
