Stepfunction Podcast

Episode 16 - Midjourney vs Google SGE vs OpenAI DALL-E 3

23 min · Oct 22, 2023

Description

Seymour and Jeff discuss the recently announced updates from OpenAI, especially regarding image generation in GPT-4 and DALL-E 3. Our ranking of image generation AIs from best to worst: (1) Midjourney, (2) Google Search Generative Experience (SGE), and finally (3) DALL-E. Jeff closes by talking about the recent LLM workshop he conducted for junior high and middle school students.

Links:
* OpenAI announces new voice chat and image features [https://www.theverge.com/2023/9/25/23886699/chatgpt-pictures-voice-commands-ai-chatbot-openai] for ChatGPT.
* DALL-E 3 update [https://techcrunch.com/2023/09/20/openai-unveils-dall-e-3-allows-artists-to-opt-out-of-training/].
* Google Converse [https://www.gadgetsnow.com/featured/how-googles-converse-generative-ai-is-different-from-google-search/articleshow/103252562.cms] aka Google SGE [https://blog.google/products/search/generative-ai-search/] is still better than DALL-E.
* Midjourney [https://zapier.com/blog/how-to-use-midjourney/] is still the best.
* Regarding earlier deep learning methods of translating sketches into finished drawings, Jeff was thinking of NVIDIA's GauGAN, based on SPatially-Adaptive DEnormalization (SPADE).
  * 2019 blog post [https://blogs.nvidia.com/blog/2019/03/18/gaugan-photorealistic-landscapes-nvidia-research/] by NVIDIA.
  * Associated paper at arXiv [https://arxiv.org/abs/1903.07291] and code at GitHub [https://github.com/NVlabs/SPADE].
* From Jeff's workshop:
  * Definitions for the G, P, and T in "ChatGPT":
    * Generative (as in generative AI; see this entire podcast 😉).
    * Pre-trained [https://stats.stackexchange.com/questions/193082/what-is-pre-training-a-neural-network].
    * Transformer [https://en.wikipedia.org/wiki/Transformer_(machine_learning_model)].
  * Meta/FB's Llama2 [https://ai.meta.com/llama/] (7 billion parameters).
  * Fine-tuning: one of many methods to optimize a base model. See charts in this NVIDIA article [https://developer.nvidia.com/blog/selecting-large-language-model-customization-techniques/].
  * Low-Rank Adaptation:
    * Conceptual article about LoRA [https://huggingface.co/docs/peft/conceptual_guides/lora] at HuggingFace.
    * Original LoRA 2021 paper [https://arxiv.org/abs/2106.09685].
    * May 2023 QLoRA paper [https://arxiv.org/abs/2305.14314].
    * August 2023 LoRA-FA paper [https://arxiv.org/abs/2308.03303].
    * Short Wikipedia description of LoRA [https://en.wikipedia.org/wiki/Fine-tuning_(deep_learning)#Low-rank_adaption].
* 2019 programmer joke [https://www.reddit.com/r/ProgrammerHumor/comments/bgrlu4/stackoverflow_in_a_nutshell/] about using Google and StackOverflow [https://en.wikipedia.org/wiki/Stack_Overflow].

Send questions/comments to stepfunctionpod@gmail.com and find us on the web at www.stepfunction.org

All episodes

19 episodes

Episode 19 - Spaghetti Delivery by AI Agents

Jeff and Seymour kick off 2024 with a discussion of the three phases of generative AI. Phase 1: The launch of ChatGPT [https://openai.com/blog/chatgpt] in November 2022. Phase 2: The rise and fall of AI wrapper companies [https://techstartups.com/2023/08/18/ai-wrappers-the-rise-of-ai-wrappers-and-the-challenges-ahead/] during early- to mid-2023. Phase 3: The current emergence of AI Agents [https://aibusiness.com/nlp/openai-is-developing-ai-agents] that can automatically chain together multiple steps, drastically changing the kinds of products that can be built and the software development work behind them.

Links:
* The Information [https://en.wikipedia.org/wiki/The_Information_(website)]'s article "OpenAI Shifts AI Battleground to Software That Operates Devices, Automates Tasks" [https://www.theinformation.com/articles/openai-shifts-ai-battleground-to-software-that-operates-devices-automates-tasks]
* Links from Google's 2018 launch of Google Duplex:
  * Blog post [https://blog.research.google/2018/05/duplex-ai-system-for-natural-conversation.html] by Google Research
  * Initial 2018 review [https://www.theverge.com/2018/5/8/17332070/google-assistant-makes-phone-call-demo-duplex-io-2018]
  * 2019 retrospective from The Verge [https://en.wikipedia.org/wiki/The_Verge]

Send questions/comments to stepfunctionpod@gmail.com and find us on the web at www.stepfunction.org

Feb 20, 2024 · 24 min

Episode 18 - OpenAI Over Board

We re-examine the swirl of events at OpenAI in November, starting with the Nov 6th Dev Day in San Francisco [https://openai.com/blog/announcing-openai-devday]. We then turn to the tumultuous firing and re-hiring of CEO Sam Altman, President Greg Brockman, and related Board ups and downs.

Links:
* History of OpenAI Board [https://loeber.substack.com/p/a-timeline-of-the-openai-board].
  * Board as of Nov 1, 2023: Sam Altman, Greg Brockman, Ilya Sutskever, Helen Toner, Tasha McCauley, Adam D'Angelo
  * Board as of Dec 1, 2023: Adam D'Angelo, Larry Summers, Bret Taylor
* May 2023 - Reid Hoffman left the OpenAI Board and announced an investment in InflectionAI [https://www.reuters.com/technology/reid-hoffmans-new-ai-startup-inflection-launches-chatgpt-like-chatbot-2023-05-02/].
* History of impeachments at the US Supreme Court [https://www.history.com/news/has-a-u-s-supreme-court-justice-ever-been-impeached].
* Casey Newton's recent Platformer [https://www.platformer.news/p/openais-alignment-problem] newsletters [https://www.platformer.news/p/the-openai-saga-isnt-over-just-yet] summarizing events at OpenAI.

Send questions/comments to stepfunctionpod@gmail.com and find us on the web at www.stepfunction.org

Dec 5, 2023 · 27 min

Episode 17 - Ten years from Her (2013) to Humane

Jeff and Seymour discuss the interplay among sci-fi movies, speculative fiction, and 'real world AI' through the lens of Spike Jonze's [https://en.wikipedia.org/wiki/Spike_Jonze_filmography] 2013 movie Her [https://en.wikipedia.org/wiki/Her_(film)]. Also some conversation about the upcoming Humane Ai Pin [https://www.theverge.com/2023/10/27/23935644/humane-ai-pin-price-subscription].

Links:
* Biden's Executive Order and related AP article quoting White House Deputy Chief of Staff Bruce Reed [https://apnews.com/article/biden-ai-artificial-intelligence-executive-order-cb86162000d894f238f28ac029005059] on the impact of Tom Cruise's recent Mission Impossible movie [https://en.wikipedia.org/wiki/Mission:_Impossible_%E2%80%93_Dead_Reckoning_Part_One].
* Quote [https://mcluhangalaxy.wordpress.com/2013/04/01/we-shape-our-tools-and-thereafter-our-tools-shape-us/]: "We become what we behold. We shape our tools and then our tools shape us."
* The deep learning revolution began getting significant attention in 2012 when AlexNet [https://en.wikipedia.org/wiki/AlexNet] won that year's ImageNet competition [https://en.wikipedia.org/wiki/ImageNet#History_of_the_ImageNet_challenge].
* 2013 interview with Spike Jonze [https://www.theguardian.com/film/filmblog/2013/sep/09/spike-jonze-her-scarlett-johansson] in which he describes his early 2000s experience instant messaging with a chatbot similar to ELIZA [https://en.wikipedia.org/wiki/ELIZA].
* The original voice behind Her's Samantha AI was Samantha Morton [https://en.wikipedia.org/wiki/Samantha_Morton]; her role was completely re-recorded by Scarlett Johansson [https://en.wikipedia.org/wiki/Scarlett_Johansson] during post-production.
* Samantha Morton also starred with Tom Cruise in an earlier influential sci-fi film, Minority Report [https://en.wikipedia.org/wiki/Minority_Report_(film)].
* Rabbit [https://techcrunch.com/2023/10/04/rabbit-is-building-an-ai-model-that-understands-how-software-works/] might be building an AI-based OS with an investment by Sam Altman [https://en.wikipedia.org/wiki/Sam_Altman].
* NY Times article [https://www.nytimes.com/2023/09/28/technology/openai-apple-silicon-valley-supergroup-create-ai-device.html] about a future hardware device mixing contributions from Altman, Jony Ive [https://en.wikipedia.org/wiki/Jony_Ive]'s LoveFrom [https://www.lovefrom.com], ARM [https://en.wikipedia.org/wiki/Arm_Holdings], and Masayoshi Son [https://en.wikipedia.org/wiki/Masayoshi_Son]'s Softbank [https://en.wikipedia.org/wiki/SoftBank_Group].
* Humane's website [https://hu.ma.ne/] where you can view their product photography and pics from their fashion show.
* Preview [https://gizmodo.com/everything-we-know-about-humane-ai-pin-openai-1850977325] before Humane's November 9th product launch this week.
* Form factor reminded Jeff a little of the 3rd Gen iPod nano [https://www.macworld.com/article/187310/3gipodnano.html] from 2007.

Send questions/comments to stepfunctionpod@gmail.com and find us on the web at www.stepfunction.org

Nov 6, 2023 · 28 min

Episode 16 - Midjourney vs Google SGE vs OpenAI DALL-E 3

Oct 22, 2023 · 23 min

Episode 15 - The "Who Is Jeff Hwang?" Test

We discuss the September 2023 AI Conf [https://aiconference.com] in San Francisco, Anthropic AI and Claude, and how to test LLMs with "Who Is Jeff Hwang?".

Links:
* 2023 Google Pixel Launch Event news [https://www.theverge.com/23902026/google-pixel-8-launch-event-biggest-announcements-watch-buds-pro] from The Verge.
* 2023 Facebook Connect [https://www.theverge.com/2023/9/27/23889627/meta-connect-quest-3-developer-conference-announcements-news] on Metaverse, Quest VR hardware, and AI.
* Schedule of speakers and talks [https://aiconference.com/agenda/] at AI Conf in SF, September 2023.
* Amazon invests $1.25B [https://techcrunch.com/2023/09/25/amazon-to-invest-up-to-4-billion-in-ai-startup-anthropic/] in Anthropic with option to invest [https://www.theverge.com/2023/9/25/23888841/amazon-4-billion-investment-anthropic-claude-ai-openai-microsoft] up to $4B.
* Founders of Anthropic [https://en.wikipedia.org/wiki/Anthropic] are alums of OpenAI.
* Better answers for "Who is Jeff Hwang" from www.phind.com [https://www.phind.com], www.pi.ai [https://www.pi.ai], Bard [https://bard.google.com/] from Google, and Microsoft Bing Chat [https://www.bing.com/search?form=MA13J8&OCID=MA13J8&q=Bing+AI&showconv=1].
* IBM Video 1 [https://www.youtube.com/watch?v=hfIUstzHs9A&t=165s]: Simple explanation of fine-tuning and prompting LLMs (start at 2:45)
* IBM Video 2 [https://www.youtube.com/watch?v=T-D1OfcDW1M]: Explanation of Retrieval Augmented Generation (RAG)
* Hugging Face explainer [https://huggingface.co/blog/peft] on PEFT (Parameter Efficient Fine-Tuning)

Send questions/comments to stepfunctionpod@gmail.com and find us on the web at www.stepfunction.org

Oct 10, 2023 · 28 min