Chain-of-Thought Prompting – A Simple Yet Powerful Technique to Harness GPT-3

The emergence of large language models, such as ChatGPT, has revolutionized the field of natural language processing and has paved the way for new applications and advancements in artificial intelligence.

In the past, language models were limited by the size of the data they were trained on and the computational resources available. However, with advances in hardware and the availability of large amounts of data, researchers have been able to train much larger language models that can generate human-like text and perform a wide range of language tasks with unprecedented accuracy.

One of the most well-known large language models is ChatGPT, developed by OpenAI. ChatGPT is a conversational AI model trained on a diverse range of text, including books, websites, and social media. It uses a transformer architecture and is capable of generating text that is coherent and context-sensitive.

ChatGPT has been used for a variety of applications, including chatbots, language translation, question-answering, and summarization. It has been integrated into many platforms, such as customer service chatbots, and has proven to be an effective tool for automating and streamlining communication.

A simple technique to improve the problem-solving capabilities of LLMs such as GPT-3 is chain-of-thought prompting. In this technique, a human labeler writes out intermediate reasoning steps in the example responses (see the sketch below), just as a teacher encourages a student to explain their reasoning. The model can then learn to reason about unseen problems, improving its ability to answer questions that require multi-step reasoning. This also helps the user understand the thought process and how the model derived its answer.
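As a concrete illustration, here is a minimal sketch in Python of how such a prompt can be assembled. The worked exemplar and the test question are the arithmetic examples from the Wei et al. paper linked below; the `build_cot_prompt` helper is a hypothetical name, and the resulting string would be sent to a GPT-3-style text-completion endpoint.

```python
# Minimal sketch of chain-of-thought prompting (standard library only;
# exemplar text adapted from Wei et al., 2022).

# Few-shot exemplar: the answer spells out the intermediate steps
# instead of jumping straight to the final number.
COT_EXEMPLAR = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is "
    "6 tennis balls. 5 + 6 = 11. The answer is 11.\n"
)

def build_cot_prompt(question: str) -> str:
    """Prepend the worked exemplar so the model imitates the
    step-by-step reasoning style on a new, unseen question."""
    return f"{COT_EXEMPLAR}\nQ: {question}\nA:"

prompt = build_cot_prompt(
    "The cafeteria had 23 apples. If they used 20 to make lunch "
    "and bought 6 more, how many apples do they have?"
)
print(prompt)  # send this string to a completion endpoint of your choice
```

Prompted this way, the model tends to generate the intermediate steps first (roughly: "They used 20 of 23 apples, so 23 - 20 = 3. They bought 6 more, so 3 + 6 = 9. The answer is 9.") rather than a bare, and often wrong, final answer.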

https://ai.googleblog.com/2022/05/language-models-perform-reasoning-via.html

Blog post by Jeff Dean that mentions the power of chain-of-thought reasoning:

https://ai.googleblog.com/2023/01/google-research-2022-beyond-language.html

https://arxiv.org/abs/2201.11903 – Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (Wei et al., 2022)

ICML 2016 – International Conference on Machine Learning Notes

Notes from ICML 2016, held in New York

David Silver (DeepMind), Yoshua Bengio (University of Montreal)

Summary

Attended the biggest machine learning conference ever in terms of participants and papers. There is red-hot interest in deep learning and reinforcement learning. Great advancements in vision (Microsoft's deep residual networks, up to 1,000 layers deep), speech-to-text (Baidu's Deep Speech 2), and reinforcement learning (DeepMind's A3C algorithm, where an AI agent learns to explore and play in a 3D Labyrinth maze, from the folks who developed AlphaGo). Image captioning/understanding is getting even more sophisticated (dense captioning work by Fei-Fei Li and team). Language understanding is still lagging and needs a breakthrough; however, a couple of papers from MetaMind about question-answering systems on text, and especially on images, seemed promising.

Active areas that need more digging

  • Memory/attention
  • Ways to teach machines with less data; deep learning is currently data-hungry and needs lots of annotated data
  • Understanding the story in an image (Dr. Fei-Fei Li's work)
  • Text understanding, which lags behind image and speech

My personal conclusion is that there is still a long way to go toward the goal of strong AI. Though AlphaGo (the DeepMind system that beat a top human Go player) and deep Q-learning are great strides in AI, these systems only learn by intuition encoded in neural network weights backed by huge compute resources, and this learning seems different from the way humans learn. A true AI system should be able to use the same architecture and apply it to driving a car, playing chess, learning a new language, or cooking. I feel that if breakthroughs are not made in a few more years, there could be another AI winter coming. Yet at the same time, it feels like we are almost there in the quest for true AI!

Industry

  • MetaMind acquired by Salesforce. Should watch the Salesforce conference announcements for how they intend to use deep learning technologies.
  • NVIDIA and NYU partner to develop an end-to-end neural network for autonomous cars.
  • Clarifai – NY-based startup for image captioning. Interesting use case for CMS and for accessibility.
  • Netflix – Patterns for machine learning. Netflix uses Time Machine, an interesting architecture for training models on production data.
  • Maluuba – Up-and-coming Canadian startup that specializes in natural language processing. Claimed that their results are better than Google's and Facebook's.

Reading List for Papers Presented

All papers presented at ICML 2016

My synthesized list to read over

Important List of Papers Referenced from Previous Conferences

People Met

  • Dr. Fei-Fei Li (Stanford), after her keynote. Her work on image captioning has been covered in the NYTimes. Interesting talk about dense captioning, her latest work on understanding the story in an image.
  • Yann LeCun (NYU), after his workshop discussion. Asked about meta-thinking, learning to think. Also asked if he will be teaching the deep learning course at NYU next spring, which he affirmed.
  • David Silver (Google DeepMind). Excellent tutorial on deep reinforcement learning, covering agents that learned to play arcade games just from raw pixel data, and AlphaGo. Asked him what the limitations are, and he told me the challenges are in robotics, where decisions have to be made quicker, and with rewards that lie far in the future, e.g., needle-in-the-haystack rewards.
  • Richard Socher (MetaMind CEO; acquired by Salesforce). Chat at the poster session about his paper on question-answering systems on text and images. Am curious to know how Salesforce intends to use deep learning. Wonder if SugarCRM is diving into machine learning.
  • Matthew Zeiler (Clarifai CEO). Met at the Intrepid after-party. Clarifai provides an API for image analysis. Discussion on interesting use cases for the news industry.
  • Justin Basilico (Machine Learning, Netflix). Movie recommendations, which rows and positions movies appear in, etc., are all driven by machine learning. Netflix has a catalog of machine learning design patterns. Discussion about the Time Machine design pattern.
  • Adam Trischler (Maluuba researcher). Talk about their question-answering system. Maluuba is a Canadian startup soon to release products, and claims better results than Facebook and Google on public datasets.
  • Howard Mansell (Facebook AI). Chat about Torch usage at Facebook. His talk was about Torch as a deep learning tool for research.
  • James Zhang (Bloomberg machine learning researcher). Discussion about how to use news in time-series prediction.
  • Yan Xu (SAS). Talk about how deep learning can be used in marketing automation. SAS is working on predictive modeling.

Pictures

David Silver (DeepMind)

Yann LeCun at ICML 2016 Workshop

Dr. Fei-Fei Li Keynote

Google Arm Robot Researcher (the Google I/O one)