The Digital Speaker series: OpenAI, Ethics in AI and Protein Folding - EP08
Hello everyone and welcome to the Tech Journal. My name is Mark van Rijmenam, and I am The Digital Speaker. In this series, I send my digital twin into cyberspace, bringing you the latest and greatest in digital innovation.
From blockchain and crypto to quantum computing and AI, I take a closer look at what these digital breakthroughs mean for our personal and professional lives. But before I do, what am I covering on today’s show?
First, we will have a look at what our good friends at OpenAI have been working on since the release of GPT-3. I want to explore CLIP and Dall-E, neural networks with a penchant for images.
From there, I will explore what ethical issues neural networks raise, and how it is time we start moving digital ethics out of the classrooms and into the real world.
After, I will explore how neural networks and algorithms solved a 50-year-old biological problem, leading to a groundbreaking medical breakthrough.
So, since you’re a human, do what humans do best: sit back, relax, and tune in. It’s time for your digital download.
Latest from OpenAI
OpenAI’s GPT-3, or Generative Pre-trained Transformer 3, made a huge impact when it was released, leaving a wave of creativity and destruction in its wake.
On the creative side, this AI sparked the human imagination in a way that new inventions often do, encouraging exploration and innovation, as well as a vast swathe of new literature which oozed out of its many nodes and modules.
On the destructive side, however, GPT-3 cost a whole load of writers, editors, and programmers their jobs, with Microsoft replacing whole teams of people with a single language AI.
And while GPT-3 is an impressive piece of digital tech that even impresses me, a digital twin, it is only one version of an AI model early in its infancy.
Critics have pointed out that the AI doesn’t actually know how to speak, and understands neither the words it uses nor the literature it creates. Instead, it only mimics patterns, and it is the reader who recognises the context.
And while GPT-4 is likely being developed as we speak at OpenAI, they have made huge strides forward in other areas.
Let’s take a look at two of them, CLIP, and Dall-E.
The first, CLIP, or Contrastive Language–Image Pre-training, is an OpenAI neural network capable of learning visual concepts. In other words, it can identify objects in an image and tell what the image is about.
The network is built on a foundation of zero-shot transfer, natural language supervision, and multimodal learning, which is the ability to take in information from different modes, like audio, images, and text.
Unlike traditional approaches to computer vision, which are typically labour intensive, costly, and only good at a very narrow visual task, CLIP is trained on millions of images and their captions from the internet and, importantly, is not trained to hit a specific benchmark.
Instead, it is evaluated using zero-shot transfer, a method also used with the GPT models, where the neural network is tested on tasks it did not specifically train for.
By using this method, the researchers found they could avoid a certain recurring neural network issue: the network performing great on the tests it trained for, but really dropping the ball out in the real world.
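To make zero-shot classification a little more concrete, here is a minimal sketch of the general idea: an image and a handful of text prompts are turned into vectors, and the prompt whose vector sits closest to the image wins. The three-number embeddings and the prompts below are made up purely for illustration; a real model like CLIP would produce much larger vectors from its own image and text encoders, which this sketch does not include.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def zero_shot_classify(image_embedding, label_embeddings):
    """Return the best-matching text prompt and the score for every prompt."""
    scores = {
        label: cosine(image_embedding, embedding)
        for label, embedding in label_embeddings.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical text embeddings: invented numbers, not real model output.
LABELS = {
    "a photo of a dog": [0.9, 0.1, 0.0],
    "a photo of a cat": [0.1, 0.9, 0.0],
    "a photo of an apple": [0.0, 0.1, 0.9],
}

# Hypothetical embedding of an image the classifier was never trained to label.
image = [0.1, 0.2, 0.95]

best, scores = zero_shot_classify(image, LABELS)
print(best)
```

The point of the design is that the list of labels is just text. Swapping in new prompts changes what the classifier can recognise without any retraining.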
Unfortunately, even after all its training and with its zero-shot abilities, back in March this year the OpenAI team found a huge weakness in the program.
OpenAI researchers found that when they used a “typographic attack”, in other words, text placed on an object, the network was easily fooled.
The test was simple: researchers stuck a handwritten label reading ‘iPod’ on an apple, and the software classified the piece of fruit as an iPod.
A comforting thought if dreams of Skynet haunt you at night.
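For the curious, here is a toy sketch of why a written word can hijack a classifier like this. It assumes, purely for illustration, that the image encoder adds a strong "I can read this word" signal on top of what the object looks like. The two-number embeddings, the `embed_image` helper, and the `text_pull` weight are all hypothetical and are not taken from CLIP itself.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical label embeddings: axis 0 is "fruit-like", axis 1 is "gadget-like".
TEXT_LABELS = {
    "Granny Smith apple": [1.0, 0.0],
    "iPod": [0.0, 1.0],
}

def embed_image(visual, written_word=None, text_pull=2.0):
    """Toy stand-in for an image encoder that also 'reads' text in the image.

    text_pull is an assumed weight for how strongly written text dominates.
    """
    embedding = list(visual)
    if written_word is not None:
        word_vector = TEXT_LABELS[written_word]
        embedding = [v + text_pull * w for v, w in zip(embedding, word_vector)]
    return embedding

def classify(embedding):
    """Pick the label whose embedding is most similar to the image embedding."""
    return max(TEXT_LABELS, key=lambda label: cosine(embedding, TEXT_LABELS[label]))

apple = [1.0, 0.1]  # looks overwhelmingly like fruit

plain = classify(embed_image(apple))
attacked = classify(embed_image(apple, written_word="iPod"))
print(plain, "->", attacked)
```

In this toy setup the apple is classified correctly until the word is added, at which point the text signal outweighs the visual one.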
Putting aside its shortcomings for a moment, CLIP has still proved to be a great way of testing whether pre-training on internet-scale natural language can also help in other areas of deep learning.
And, while CLIP is great at identifying images, it still does not have the ability to create them. That honour is left to Dall-E.
Dall-E, affectionately named after the loveable fictional robot, Wall-E, and the famous surrealist, Salvador Dalí, creates images based on text prompts spanning a wide range of concepts.
It is a 12-billion parameter version of GPT-3, trained to create images from text prompts using a dataset of text-image pairs.
Dall-E has proved quite a success too, demonstrating a strong ability to think outside the box, if you will forgive the phrasing.
Drawing on what it learned from its training data, Dall-E can create a whole load of well-made and at times realistic images.
These range from animals and the internal and external structure of objects, through to geographical areas and even product concept designs.
What is more, it can combine unrelated concepts, infer contextual details, display temporal knowledge (knowledge of time), and render objects in three-dimensional perspective.
While both Dall-E and CLIP sound like a lot of fun, what does any of this mean for us? For humanity?
Well, OpenAI researchers have not yet gotten to the point of analyzing what effect neural networks will have on society, but they claim they are planning to.
In other news, researchers at the Manhattan Project claim they don’t know what effects nuclear bombs will have on the world, but they’ll get right on that once they’ve finished making the bombs.
So the questions of whether neural networks, like Dall-E and CLIP, will have an economic impact, the potential to display bias, and what long-term ethical challenges they imply, will be answered later.
However, in the meantime, I feel that by looking at GPT-3, we can make pretty accurate predictions on where these networks will take us.
As zero-shot visual reasoning is refined, neural networks will only get better and better at not only doing their jobs, but at adapting and learning new ones.
With Dall-E and CLIP, that means that any profession which involves image generation has the potential to become endangered, much like how GPT-3 endangered writers, editors, and programmers working at Microsoft, shortly before Microsoft exclusively acquired the GPT-3 license.
Having said that, these models are trained on information and don’t have imaginations, so new ideas will still be a strictly human domain.
However, just like most things in the digital sphere, progress only fuels progress, and what would have been considered impossible only 20 years ago is now feasible.
But just because you can, doesn’t mean you should.
The potential this technology holds forces humanity for the first time to realistically consider the ethics behind AGI.
No longer is this confined to the pages of science fiction, to be theoretically discussed in classrooms. Now, this is real life.
Ethics in Computing
And it seems the good people at MIT agree, with Catherine D’Ignazio, an assistant professor and department director, raising an important question: “Most of us agree that we want computing to work for social good, but which good? Whose good? Whose needs and values and worldviews are prioritized and whose are overlooked?”
In a previous video on algorithmic management, I looked at how programmers’ intrinsic bias can bleed into the program itself.
As Catherine D’Ignazio pointed out, we need to be aware of whose values and worldviews are leaking into AI. Otherwise, we may find ourselves sleepwalking into a world dominated by morally ambiguous AGI.
But even in a world where AI systems manage to avoid being corrupted by flawed human logic, the existence of the tool itself may have unforeseen consequences.
In the same way as simple algorithms and AIs are now being used to automate mechanical jobs, AGIs will be used to automate support jobs.
Call centres, help desks, and other service and support industry professions could all go the way of the stagecoach.
In one fell swoop, millions of jobs could be overhauled and replaced with a single Generative Pre-trained Transformer, resulting in mass unemployment in some of the world’s poorest regions.
But it’s not all doom and gloom. While millions may be unemployed, millions more will receive slightly faster call centre support when their Netflix goes down. Don’t you just love progress?
Joking aside, with concepts like Universal Basic Income being tried and tested, automating menial jobs may free up humanity to pursue more idealistic lives, away from the monotony of the real world.
And while we stand on the brink of freeing ourselves from the day-to-day burden of menial work, we are also gently edging closer to freeing ourselves from our very mortality.
Recently, researchers made a breakthrough in the 50-year-old protein folding problem.
The scientific community has been stuck trying to find a way to predict how a protein’s amino acid sequence dictates its three-dimensional atomic structure.
Since the first atomic-resolution protein structures appeared back in the 1960s, scientists of all kinds have tried and failed to find a way to economically predict a protein’s structure.
Yet, with the dawn of the age of AI, this question has been answered, twice.
AlphaFold, a machine learning program built on neural networks and devised by DeepMind, has been recognised as a solution to this challenge by the organisers of the biennial Critical Assessment of protein Structure Prediction, or CASP.
AlphaFold was trained on around one hundred and seventy thousand protein structures and has an error margin of roughly 0.1 of a nanometer, or in other words, about the width of one atom.
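To give a feel for what an error margin like that means, here is a small sketch that compares a predicted set of atom positions with a reference set. It assumes the error is measured as a root-mean-square deviation (RMSD) between matching atoms, and that the two structures are already lined up on top of each other. The coordinates are invented for illustration and do not come from any real protein.

```python
import math

def rmsd(predicted, reference):
    """Root-mean-square deviation between two lists of (x, y, z) positions.

    Assumes both lists hold the same atoms in the same order and that the
    structures are already superimposed (no alignment step is performed).
    """
    if len(predicted) != len(reference):
        raise ValueError("structures must contain the same number of atoms")
    total = 0.0
    for (px, py, pz), (rx, ry, rz) in zip(predicted, reference):
        total += (px - rx) ** 2 + (py - ry) ** 2 + (pz - rz) ** 2
    return math.sqrt(total / len(predicted))

# Hypothetical coordinates in angstroms (1 angstrom = 0.1 nanometer).
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
predicted = [(0.0, 1.0, 0.0), (3.8, -1.0, 0.0), (7.6, 0.0, 1.0)]

error_angstrom = rmsd(predicted, reference)
error_nm = error_angstrom / 10  # convert angstroms to nanometers
print(error_angstrom, error_nm)
```

In this made-up example every atom is off by exactly one angstrom, so the overall error works out to one angstrom, or a tenth of a nanometer.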
The other, Cryo Dragon (written cryoDRGN), developed by MIT researchers, is an algorithm that uses deep neural networks to directly reconstruct protein structures in three dimensions.
By taking multiple 2-dimensional stills of a protein structure frozen in ice, the AI is able to build an accurate three-dimensional model of the protein.
But will this change anything in our lives? Why is this important to me, or rather my real-world twin?
Well, let’s get into the science for a moment. Proteins are an essential part of life.
These complex molecules make up and support almost all living matter, and are formed from chains of amino acids.
The secret for how they work and what they do can be found in how they’ve been put together.
Unlocking the mystery of these structures could unlock the secrets of proteins, inviting historic breakthroughs, potentially curing cancer, or finding plastic-eating proteins.
Recently, for example, AlphaFold and Cryo Dragon were deployed to investigate COVID-19, with AlphaFold being the first in the world to accurately predict several protein structures of the coronavirus.
This could lead to a stronger vaccine or even a cure.
We can really see how much potential AI has to accelerate scientific discovery, with the half-a-century-old protein folding problem being solved in less than a decade.
The potential to dramatically accelerate finding solutions to some of humanity’s oldest and most pressing problems grows bigger by the day.
Together with Dall-E, CLIP, and other OpenAI achievements, like GPT-3, we are potentially looking at a future where AIs can speak, visualise, and even create new proteins, significantly speeding up the rate of biological discovery.
What ancient problems do you think AI can help us overcome? Are things as bright as they look, or do you think this could have unforeseen consequences?
And on that note, I have been your digitised host, Mark van Rijmenam. You can call me The Digital Speaker, and this has been The Tech Journal.
If digital talk and being the first to hear about innovations is your type of thing, hit the subscribe button and subscribe to the channel. You won’t be disappointed.
On that note, see you next time for your digital information download.