According to OpenAI, GPT-2 is "a large-scale unsupervised language model which generates coherent paragraphs of text, achieves state-of-the-art performance on many language modeling benchmarks, and performs rudimentary reading comprehension, machine translation, question answering, and summarization — all without task-specific training."
GPT-2 is a deep neural network trained on 40GB of internet data (about 8 million web pages), yielding a language model with about 1.5 billion parameters. However, due to concerns about misuse, OpenAI initially released only a smaller model with 117 million parameters to the public.
Join the seminar to learn:
- What is a language model? Where does GPT-2 fit in the broader Natural Language Processing (NLP) landscape? What types of tasks is it good for? Why are language models important?
- What's special about the language model OpenAI recently released?
- Why did the announcement create such a stir?
- Why is it a problem that they didn't release the full model?
- What are the true capabilities of this new model?
- What should OpenAI have done differently?
- How can you quantitatively evaluate the negative impacts that your software might have?
- What can the machine learning and artificial intelligence (AI) communities do differently? What conversations need to take place and where?
- What are best practices in honest reporting of new results?
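As a warm-up to the first question above, here is a minimal sketch of what "language model" means: a model that assigns probabilities to the next token given the preceding context. This toy word-level bigram model (an illustrative assumption, not GPT-2's architecture — GPT-2 is a large neural Transformer) captures the same core task at the smallest possible scale.

```python
from collections import Counter, defaultdict

# Toy word-level bigram language model: estimates P(next word | current word)
# from raw co-occurrence counts. GPT-2 does the same job with a neural network
# and a context far longer than one word.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Count how often each word follows each other word.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_word_probs(word):
    """Return the conditional distribution over the next word."""
    total = sum(counts[word].values())
    return {w: c / total for w, c in counts[word].items()}

# "the" is followed once each by "cat", "mat", "dog", "rug",
# so each continuation gets probability 0.25.
probs = next_word_probs("the")
```

Sampling repeatedly from such a distribution generates text; scaling the same idea up to billions of parameters and web-scale data is, in essence, what makes GPT-2's generated paragraphs coherent.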
Speaker: Rauf Kurbanov.
Presentation language: Russian.
Date and time: February 27th, 18:30-20:00.
Location: Times, room 204.
Videos from previous seminars are available at http://bit.ly/MLJBSeminars