Generative Machine Learning Models #

What are Generative LMs * A language model is a representation of how language is structured * A familiar example is predictive text * given a short piece of text for context the language model suggests the most likely continuations * The current generation of Large Language Models (LLMs) do much the same thing, but they are initially trained on vast amounts of text using an enormous amount of compute. * LLMs have developed very sophisticated representations of the structure of text including the ability to extract and refer to information in the text and to follow instructions * LLMs have been combined with algorithms for image analysis and descriptions of images leading to a number of algorithms to generate and modify images based on instructions * I use the term generative LM rather than Generative AI * The term AI can mean many things to different people * All the recent breakthroughs I’m familiar with are based on LM technology * We have a very poor understanding of how LLMs are able to do what they dp

Why the sudden focus
- Natural Language Processing (NLP) developed slowly from the 60s through to 2010s, but in the past 5 years have shown huge progress
- One of the biggest improvements was from the release of GPT-2 around 2018. This was one of the first of these types of models where concerns about malicious use led OpenAI to restrict the full release initially
- GPT-3 (2020) showed close to human performance in many areas. Dall-E and Stable Diffusion showed incredible ability in creating images
- Since Nov 2022 Open AI released Chat GPT, the chat interface has proven more accessible for many users
- Since the start of 2023 Microsoft released Bing Chat and Google released Bard. These got attention both for their capabilities and for issues after their launch, there are concerns the companies see themselves in a race and are not sufficiently addressing AI Safety concerns
What can they do
- Generate text, images, music based on instructions
- Write code and SQL queries based on descriptions, and describe code
- Make use of external tools - Microsoft integrated GPT algorithm with bing search, algorithm
- Education - describe complex topics in terms familiar to a user, and answer questions
- Language Translation
- Voiceovers
- Examples
  - Dall-E
  - Stable Diffusion
  - Github Copilot
  - ElevenLabs voice cloning
  - Midjourney
  - D-ID
Caution/Risks
- Technical
  - Hallucination
    - At the core these models generate meaningful text. They have developed the ability to extract facts from text and present them back when appropriate as a side effect of this task, but when such facts are not available the algorithm will generate text that looks correct and from then on treat that as fact. We have very little understanding of how the algorithm accesses and stores information, and little ability to only respond based on knowledge rather than language patterns
  - Context Window
  - Lack of Direction
- Prompt Injection
  - Based on the term SQL Injection.
  - Many LLM based applications can be tricked into behaving differently than intended
- Eliza effect
  - cherry picking results
  - Since the easiest way to interact with the model is by conversing with it - it becomes tempting to think of it as a person
  - It often takes a while for the limitations of model output to become apparent
- Deep Fakes
- Bias - (lack of accountability)
- Employment
  - Image generation algorithms already impact designers and artists
- Alignment
Opportunities
- For us as Developers
- Supporting Business Users
- Supporting enterprise use cases
Approach
- Understand and become familiar with the technologies. Accept the technology for what it is, understand their limitations and be imaginative in how they can be used.
- Watch legislation
- Facilitate change
- Support organisations