Iahlt

Resources

The Mastermind Behind GPT-4 and the Future of AI | Ilya Sutskever

In this podcast episode, Ilya Sutskever, co-founder and chief scientist at OpenAI, discusses his vision for the future of artificial intelligence (AI), including large language models like GPT-4. Sutskever starts by explaining the importance of AI research and how OpenAI is working to advance the field. He shares his views on the ethical considerations of AI development and the potential impact of AI on society. The conversation then moves on to large language models and their capabilities. Sutskever talks about the challenges of developing GPT-4 and the limitations of current models. He discusses the potential for large language models to generate text that is indistinguishable from human writing and how this technology could be used in the future. Sutskever also shares his views on AI-aided democracy and how AI could help solve global problems such as climate change and poverty. He emphasises the importance of building AI systems that are transparent, ethical, and aligned with human values. Throughout the conversation, Sutskever provides insights into the current state of AI research, the challenges facing the field, and his vision for the future of AI. This podcast episode is a must-listen for anyone interested in the intersection of AI, language, and society.

Timestamps:
00:04 Introduction of Craig Smith and Ilya Sutskever.
01:00 Sutskever's AI and consciousness interests.
02:30 Sutskever's start in machine learning with Hinton.
03:45 Realization about training large neural networks.
06:33 Convolutional neural network breakthroughs and ImageNet.
08:36 Predicting the next thing for unsupervised learning.
10:24 Development of GPT-3 and scaling in deep learning.
11:42 Specific scaling in deep learning and potential discovery.
13:01 Small changes can have big impact.
13:46 Limits of large language models and lack of understanding.
14:32 Difficulty in discussing limits of language models.
15:13 Statistical regularities lead to better understanding of world.
16:33 Limitations of language models and hope for reinforcement learning.
17:52 Teaching neural nets through interaction with humans.
21:44 Multimodal understanding not necessary for language models.
25:28 Autoregressive transformers and high-dimensional distributions.
26:02 Autoregressive transformers work well on images.
27:09 Pixels represented like a string of text.
29:40 Large generative models learn compressed representations of real-world processes.
31:31 Human teachers needed to guide reinforcement learning process.
35:10 Opportunity to teach AI models more skills with less data.
39:57 Desirable to have democratic process for providing information.
41:15 Impossible to understand everything in complicated situations.

Craig Smith Twitter: https://twitter.com/craigss
Eye on A.I. Twitter: https://twitter.com/EyeOn_AI

Steven Pinker: Linguistics as a Window to Understanding the Brain | Big Think

In this lecture, Steven Pinker, renowned linguist and Harvard Psychology Professor, discusses linguistics as a window to understanding the human brain.

How is it that human beings have come to acquire language? Steven Pinker's introduction to the field includes thoughts on the evolution of spoken language and the debate over the existence of an innate universal grammar, as well as an exploration of why language is such a fundamental part of social relationships, human biology, and human evolution. Finally, Pinker touches on the wide variety of applications for linguistics, from improving how we teach reading and writing to how we interpret law, politics, and literature.

Read the full transcript on: https://bigthink.com/videos/how-we-speak-reveals-how-we-think-with-steven-pinker

Steven Pinker is an experimental psychologist who conducts research in visual cognition, psycholinguistics, and social relations. He grew up in Montreal and earned his BA from McGill and his PhD from Harvard. Currently Johnstone Professor of Psychology at Harvard, he has also taught at Stanford and MIT. He has won numerous prizes for his research, his teaching, and his nine books, including The Language Instinct, How the Mind Works, The Blank Slate, The Better Angels of Our Nature, The Sense of Style, and Enlightenment Now: The Case for Reason, Science, Humanism, and Progress.

About Big Think: Big Think is the leading source of expert-driven, actionable, educational content. With thousands of videos, featuring experts ranging from Bill Clinton to Bill Nye, we help you get smarter, faster. Our experts are either disrupting or leading their respective fields. We aim to help you explore the big ideas and core skills that define knowledge in the 21st century, so you can apply them to the questions and challenges in your own life. Other frequent contributors include Michio Kaku and Neil deGrasse Tyson.

Follow Big Think:
BigThink.com: https://bigth.ink
Facebook: https://bigth.ink/facebook
Twitter: https://bigth.ink/twitter
Instagram: https://bigth.ink/Instragram
YouTube: https://bigth.ink/youtube
E-mail: info@bigthink.com

GPT-3: Language Models are Few-Shot Learners (Paper Explained)

How far can you go with ONLY language modeling? Can a large enough language model perform NLP tasks out of the box? OpenAI takes on these and other questions by training a transformer that is an order of magnitude larger than anything that has ever been built before, and the results are astounding.

OUTLINE:
0:00 - Intro & Overview
1:20 - Language Models
2:45 - Language Modeling Datasets
3:20 - Model Size
5:35 - Transformer Models
7:25 - Fine Tuning
10:15 - In-Context Learning
17:15 - Start of Experimental Results
19:10 - Question Answering
23:10 - What I think is happening
28:50 - Translation
31:30 - Winograd Schemes
33:00 - Commonsense Reasoning
37:00 - Reading Comprehension
37:30 - SuperGLUE
40:40 - NLI
41:40 - Arithmetic Expressions
48:30 - Word Unscrambling
50:30 - SAT Analogies
52:10 - News Article Generation
58:10 - Made-up Words
1:01:10 - Training Set Contamination
1:03:10 - Task Examples

Paper: https://arxiv.org/abs/2005.14165
Code: https://github.com/openai/gpt-3

Abstract: Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few examples or from simple instructions - something which current NLP systems still largely struggle to do. Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and test its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model. GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic. At the same time, we also identify some datasets where GPT-3's few-shot learning still struggles, as well as some datasets where GPT-3 faces methodological issues related to training on large web corpora. Finally, we find that GPT-3 can generate samples of news articles which human evaluators have difficulty distinguishing from articles written by humans. We discuss broader societal impacts of this finding and of GPT-3 in general.

Authors: Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei

Links:
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
BitChute: https://www.bitchute.com/channel/yannic-kilcher
Minds: https://www.minds.com/ykilcher

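The abstract above says that tasks and few-shot demonstrations are specified purely as text. A minimal sketch of what such a prompt might look like is shown here. The function name `build_few_shot_prompt` and the `Input:`/`Output:` layout are assumptions chosen for illustration, not the format used in the paper, and no model is called.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    instruction: str,
    demonstrations: List[Tuple[str, str]],
    query: str,
) -> str:
    """Assemble a text-only few-shot prompt.

    Hypothetical layout: a task instruction, K solved examples,
    then the unsolved query for the model to complete.
    """
    lines = [instruction, ""]
    for source, target in demonstrations:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # The final "Output:" is left open; a language model would continue from here.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


# Word unscrambling is one of the tasks named in the abstract.
prompt = build_few_shot_prompt(
    "Unscramble the letters into an English word.",
    [("lpaep", "apple"), ("odg", "dog")],
    "tca",
)
print(prompt)
```

No gradient update happens anywhere in this sketch: the demonstrations only change the text the model conditions on, which is the sense in which the abstract calls the setting few-shot.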
Arabic Influence on Modern Hebrew!!

This video is all about how the Arabic language has influenced Modern Hebrew! 🚩 Learn Hebrew and Arabic with HebrewPod101 ( http://bit.ly/HebrewPod ) and ArabicPod101 ( http://bit.ly/arabicpod101 ). (Full disclosure: if you sign up for a premium account, Langfocus receives a small referral fee. But the free account is great too!) Special thanks to Daniel Shakarov for his Hebrew audio samples, and Ahmed Souhad for his Arabic audio samples! 🚩 Support Langfocus on Patreon: http://patreon.com/langfocus Current Patrons include: Andres Resendez Borgia, Andrew Heckenberg, Anjo Barnes, Auguste Fields, Behnam Esfahbod, Bennett Seacrist, Brandon Gonzalez, Can Cetinyilmaz, Clark Roth, Fiona de Visser, Guillermo Jimenez, Jacob Madsen, John Moffat, Marcelo Loureiro, Matthew Etter, Michael Arbagi, Michael Cuomo, Nobbi Lampe-Strang, Patrick W., Rosalind Resnick, Ruben Sanchez Jr, Sebastian Langshaw, ShadowCrossZero, Victoria Goh, Vincent David, Yuko Sunda, Adam Powell, Adam Vanderpluym, Alberto del Angel, Alen, Alex Hanselka, Ali Muhammed Alshehri, Alvin Quiñones, Andrew Woods, Angeline Biot, Aous Mansouri, Ashley Dierolf, Atsushi Yoshida, Avital Levant, Bartosz Czarnotta, Brent Warner, Brian Begnoche, Brian Morton, Bruce Stark, Carl saloga, Charis T'Rukh, Chelsea Boudreau, Christian Langreiter, Christopher Lowell, David LeCount, Debbie Levitt, Diane Young, DickyBoa, divad, Divadrax, Don Ross, Donald Tilley, Edward Wilson, Eric Loewenthal, Erin Robinson Swink, Fabio Martini, fatimahl, Grace Wagner, Gus Polly, Hannes Egli, Harry Kek, Henri Saussure, Herr K, Ina Mwanda, Jack Jackson, James and Amanda Soderling, James Lillis, Jay Bernard, Jens Aksel Takle, JESUS FERNANDO MIRANDA BARBOSA, JK Nair, JL Bumgarner, Justin Faist, Kevin J. 
Baron, Klaw117, Konrad, Kristian Erickson, Krzysztof Dobrzanski, Laura Morland, Lee Dedmon, Leo Coyne, Leo Barudi, Lincoln Hutton, Lorraine Inez Lil, Luke Jensen, M.Aqeel Afzal, Mahmoud Hashemi, Margaret Langendorf, Maria Comninou, Mariana Bentancor, Mark, Mark Grigoleit, Mark Kemp, Markzipan, Maurice Chou, Merrick Bobb, Michael Regal, Mike Frysinger, mimichi, Mohammed A. Abahussain, Nicholas Gentry, Nicole Tovar, Oleksandr Ivanov, Oto Kohulák, Panot, Papp Roland, Patrick smith, Patriot Nurse, Paul Shutler, Pauline Pavon, Paulla Fetzek, Peter Andersson, Peter Nikitin, Peter Scollar, Pomax, Raymond Thomas, Renato Paroni de Castro, Robert Sheehan, Robert Williams, Roland Seuhs, Ronald Brady, Ryan Lanham, Saffo Papantonopoulou, Samuel Croes, Scott Irons, Scott Russell, Sergio Pascalin, Shoji AKAO, ShrrgDas, Sierra Rooney, Simon Blanchet, Simon G, Spartak Kagramanyan, Steeven Lapointe, Stefan Reichenberger, Steven Severance, Suzanne Jacobs, Theophagous, Thomas Chapel, Tomáš Pauliček, Tryggurhavn, veleum, William MacKenzie, William O Beeman, William Shields, yasmine jaafar, Yeshar Hadi, Éric Martin. Sources include: The Renaissance of Modern Hebrew and Modern Standard Arabic: Parallels and Differences in the Revival of Two Semitic Languages. Joshua Blau. 40-42. “Arabic Loanwords in Modern Hebrew". Haseeb Shehadeh. ENCYCLOPEDIA OF HEBREW LANGUAGE AND LINGUISTICS Volume 1 (A-F). 149-152. Rasmī or aslī?: Arabic’s impact on Israeli Hebrew. D Gershon Lewental, DGLnotes, 27 January 2012. http://dglnotes.com/notes/arabic-hebrew.htm Moroccan Arabic's Influence on Modern Hebrew. "Foreigncy" podcast, Oct. 14 2018. Guest: Dr. Jonas Sibony, professor of Modern Hebrew, University of Strasbourg. Arabic Influence: Modern Period. Roni Henkin. ENCYCLOPEDIA OF HEBREW LANGUAGE AND LINGUISTICS Volume 1 (A-F). 143-149. https://www.academia.edu/6747639/Arabic_influence_Modern_period.pdf. Eliezer Ben-Yehuda Is Turning in His Grave Over Israel’s Humiliation of Arabic. Seraj Assi. 
https://www.haaretz.com/opinion/.premium-eliezer-ben-yehuda-is-turning-in-his-grave-over-israels-humiliation-of-arabic-1.5472510 Music: "Time Illusionist" by Asher Fulero. The following images were used under Creative Commons Sharealike 3.0 license: https://en.wikipedia.org/wiki/Afroasiatic_languages#/media/File:Hamito-Semitic_languages.jpg. Author: Listorien, Anak 1. https://commons.wikimedia.org/wiki/Category:Ashkelon#/media/File:Ashqelon2011-2.jpg. Author: Oyoyoy Still images which include the above images are available for use under the same Creative Commons Sharealike 3.0 license.
The ARABIC Language (Its Amazing History and Features)

This video is all about the Arabic language, from its early origins on the Arabian peninsula, to its current status as the 5th most spoken language on Earth. I also examine a number of features of Arabic. ▶ Learn Arabic: http://bit.ly/arabicpod101 ◀ (Full disclosure: if you sign up for a paid membership, Langfocus receives a small referral fee.) Special thanks to Murjana Shabaneh and Mohammad Abd Al Qadr for the audio samples and feedback! 🔹🔷 Check out Langfocus on Patreon http://patreon.com/langfocus 🔷🔹 Current Patreon members include these fantastic people: Brandon Gonzalez, Виктор Павлов, Mark Thesing, Jiajun "Jeremy" Liu, иктор Павлов, Guillermo Jimenez, Sidney Frattini Junior, Bennett Seacrist, Ruben Sanchez, Michael Cuomo, Eric Garland, Brian Michalowski, Sebastian Langshaw, Vadim Sobolev, FRANCISCO, Mohammed A. Abahussain, Fred, UlasYesil, JL Bumgarner, Rob Hoskins, Thomas A. McCloud, Ian Smith, Maurice Chow, Matthew Cockburn, Raymond Thomas, Simon Blanchet, Ryan Marquardt, Sky Vied, Romain Paulus, Panot, Erik Edelmann, Bennet, James Zavaleta, Ulrike Baumann, Ian Martyn, Justin Faist, Jeff Miller, Stephen Lawson, Howard Stratton, George Greene, Panthea Madjidi, Nicholas Gentry, Sergios Tsakatikas, Bruno Filippi, Sergio Tsakatikas, Qarion, Pedro Flores, Raymond Thomas, Marco Antonio Barcellos Junior, David Beitler, Rick Gerritzen, Sailcat, Mark Kemp, Éric Martin, Leo Barudi, Piotr Chmielowski, Suzanne Jacobs, Johann Goergen, Darren Rennels, Caio Fernandes, Iddo Berger, Peter Nikitin, Brent Werner, Fiona de Visser, Carl Saloga, Edward Wilson, Kevin Law, David Lecount, Joshua Philgarlic, for their generous Patreon support. 
Video chapters:
00:00 Introduction
00:32 General Information about the Arabic Language
01:07 Varieties of Arabic
02:06 Arabic is Semitic language
02:22 Old Arabic
03:51 Classical Arabic
05:04 Neo-Arabic & Middle Arabic
06:02 Modern Arabic
06:47 Diglossia in Arabic
08:21 The Arabic script
09:24 Arabic phonology
10:30 Morphology in the Arabic language
11:36 Verbs in Arabic
13:05 Word order in Arabic
14:00 Cases in Arabic
15:05 Sentence breakdown
16:30 Final comments
17:22 The Question of the Day

Music:
Ibn Al-Noor by Kevin MacLeod is licensed under a Creative Commons Attribution license (https://creativecommons.org/licenses/by/4.0/)
Source: http://incompetech.com/music/royalty-free/index.html?isrc=USUAN1100706
Artist: http://incompetech.com/
"Raw Deal" by Gunnar Olsen.
"In Case You Forgot" by Otis McDonald.
Drum beat from: https://www.youtube.com/watch?v=fVvWgpBHNL0

Images:
"Arabic Speaking World" map courtesy of Keteracel at English Wikipedia. https://commons.wikimedia.org/wiki/File:Arabic_speaking_world.svg

Natural Language Processing

Natural Language Processing (NLP) is a field of Artificial Intelligence dedicated to enabling computers to understand and communicate in human language. NLP is only a few decades old, but we've made significant progress in that time. I'll cover how it's changed over the years, then show you how you can easily build an NLP app that can either classify or summarize text. This is incredibly powerful technology that anyone can freely use, and I'll show you how to do it. Enjoy!

Code for this video: https://github.com/llSourcell/bert-as-service

Connect with me here:
Twitter: https://twitter.com/sirajraval
Facebook: https://www.facebook.com/sirajology
Instagram: https://www.instagram.com/sirajraval

More learning resources:
https://www.youtube.com/watch?v=0n95f-eqZdw
http://mlexplained.com/2019/01/30/an-in-depth-tutorial-to-allennlp-from-basics-to-elmo-and-bert/
https://towardsdatascience.com/beyond-word-embeddings-part-2-word-vectors-nlp-modeling-from-bow-to-bert-4ebd4711d0ec
https://gluon-nlp.mxnet.io/examples/sentence_embedding/bert.html
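
To make the "summarize text" idea in the description above concrete, here is a small sketch of extractive summarization by word frequency, using only the Python standard library. This is a deliberately simple stand-in technique, not the BERT-based code from the video. The stopword list, the scoring rule, and the name `summarize` are all assumptions made for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real systems use larger lists).
STOPWORDS = {"a", "an", "and", "the", "is", "of", "to", "in", "it", "that", "this"}


def content_words(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarize(text, max_sentences=1):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        # Average corpus frequency of the sentence's content words.
        return sum(freq[w] for w in tokens) / max(len(tokens), 1)

    top = sorted(sentences, key=score, reverse=True)[:max_sentences]
    return " ".join(s for s in sentences if s in top)


text = (
    "NLP helps computers read text. "
    "The weather was pleasant. "
    "NLP models classify text and summarize text."
)
summary = summarize(text)
print(summary)
```

Sentences whose words recur across the whole text score higher, so the off-topic weather sentence is dropped. Modern systems replace these raw counts with learned representations such as the ones BERT produces.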

Consolidating and Exploring Open Textual Knowledge, Prof. Ido Dagan, Bar Ilan University

Introduction to Language: Computational Processing of Human Language, with Professor Ido Dagan (on Spotify)

Start with NLP

Recommended courses: 
https://www.coursera.org/specializations/natural-language-processing

 

Recommended textbook, available online: 
https://web.stanford.edu/~jurafsky/slp3/

 

The textbook also provides short introductions to many fields of linguistics before you move on to the computational part.

NLP Tutorials Part -I from Basics to Advance

https://www.analyticsvidhya.com/blog/2022/01/nlp-tutorials-part-i-from-basics-to-advance/ 

Hebrew NLP Resources

https://github.com/NNLP-IL/Resources

 

Databases and possible collaborations
https://docs.google.com/spreadsheets/d/1fGYKyA5Jf_KPCXPCpRWGfRzjDc6ALp9dgKnbIXqxM_Y/edit#gid=0

 

Legal opinion: Uses of copyright-protected content for machine learning

https://www.gov.il/he/departments/legalInfo/machine-learning

Open Source

GitHub

NLP
https://github.com/topics/natural-language-processing

 

Speech

https://github.com/topics/speech

spaCy · Industrial-strength Natural Language Processing in Python
https://spacy.io/

Stanza – A Python NLP Package for Many Human Languages

Created by the Stanford NLP Group

https://stanfordnlp.github.io/stanza/

Unsupervised

Large language model (LLM)

Open LLMs List
https://github.com/eugeneyan/open-llms

What’s before GPT-4? A deep dive into ChatGPT

https://medium.com/digital-sense-ai/whats-before-gpt-4-a-deep-dive-into-chatgpt-dfce9db49956

GPT-4 Training process

Like previous GPT models, the GPT-4 base model was trained to predict the next word in a document, and was trained using publicly available data (such as internet data) as well as data we’ve licensed. The data is a web-scale corpus of data including correct and incorrect solutions to math problems, weak and strong reasoning, self-contradictory and consistent statements, and representing a great variety of ideologies and ideas.

So when prompted with a question, the base model can respond in a wide variety of ways that might be far from a user’s intent. To align it with the user’s intent within guardrails, we fine-tune the model’s behavior using reinforcement learning with human feedback (RLHF).

Note that the model’s capabilities seem to come primarily from the pre-training process—RLHF does not improve exam performance (without active effort, it actually degrades it). But steering of the model comes from the post-training process—the base model requires prompt engineering to even know that it should answer the questions.

https://openai.com/research/gpt-4
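
The GPT-4 notes above describe a base model "trained to predict the next word in a document". As a toy illustration of that objective only, the sketch below counts which word follows which in a tiny corpus and predicts the most frequent follower. A bigram counter is a stand-in assumption here. GPT models use neural networks over much longer contexts, and the names `train_bigram_model` and `predict_next` are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for every word, how often each other word follows it."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if the word is unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Toy corpus (an assumption for illustration, not real training data).
corpus = "the cat sat on the mat . the cat ate the fish ."
model = train_bigram_model(corpus)
print(predict_next(model, "the"))
```

The model simply reproduces whatever its training text contains, good or bad, which is in line with the notes' point that steering toward the user's intent comes from a separate post-training step.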

BERT
https://github.com/google-research/bert

AlephBERT

https://github.com/OnlpLab/AlephBERT
https://arxiv.org/pdf/2104.04052.pdf

Multi-language Aspects
How Language-Neutral is Multilingual BERT?

https://arxiv.org/pdf/1911.03310.pdf

AraBERT: Transformer-based Model for Arabic Language Understanding
https://arxiv.org/pdf/2003.00104.pdf
 
ELMo

https://allennlp.org/elmo    
