OpenAI transcribed over a million hours of YouTube videos to train GPT-4

All of these models are based on privatizing the commons, literally the whole of the internet.

This is a very different approach to Google, which basically lives on crediting authors and sending them traffic.

Also, if you ask a model to help you scrape a website, it’ll go on a ethics tirade about how questionable scraping is.

The hypocrisy is palatable 😂 and this is one of the big elephants in the room that nobody talks about.