I wrote about text-to-image AIs here last summer. Well, the latest bleeding-edge model is out; it is way more interesting, so I thought I’d drop an update.
It’s called DALL-E 2, and it’s made by OpenAI, which also produces the GPT text generation AIs I’ve written about. They’ve given early access to a bunch of people who have been posting the results on Twitter. It’s not super surprising that the shared images are impressive; AI enthusiasts aren’t going to be sharing the failures, which I assume are frequent. Still–the successes are very, very successful.
Here’s one, and then we can move below the fold so the blog doesn’t drown in tweets.
As we lead up to #MicrosoftBuild, I’m getting excited with “30 days of DALL-E 2”! Send me your (respectful) prompts, and every day from April 24-May 24, I’ll post an AI-generated #dalle2 image. #30DaysOfDalle2
“Oil painting of a sad girl looking out the window of a school bus” pic.twitter.com/NApsb1kOmi
— Jennifer Marsman (@jennifermarsman) April 22, 2022
Made from the prompt “Pre-modern Japanese scroll about Bitcoin” #dalle2 #dalle pic.twitter.com/CLUgv27aX2
— Dalle2 Pics (@Dalle2Pics) April 22, 2022
And here’s a whole thread of variations on ‘a lion holding a globe’:
— Ryan Petersen (@typesfast) April 27, 2022
Here’s a lay-technical description of how it works that is better than anything I could write.
As with all generative AIs, this raises lots of concerns. For example, generative AIs are biased by their training set. If most of the images of lawyers you feed it are older white men, it will mostly generate images of older white men when you ask it for lawyers; conversely, flight attendants seem to all be smiling Asian women. (As the linked article notes, OpenAI does write about this, which is more than you can say for a lot of AI creators.) Then there’s the question of intellectual property. We’ve made a chunk of math that indiscriminately remixes copyrighted content; what would the lawyers (white men or otherwise) say? Finally, these things are inevitably resource-intensive, so there are the usual questions of cost/scaling/energy when we think about widespread adoption. (Hey, at least it’s a better use of resources than bitcoin!)
I don’t have access to this AI yet, so I don’t feel like speculating too much on its role in society, but it seems like it would work great as a jumping-off point for visual artists toying with a new idea, or non-artists/hobbyists looking to bootstrap a project. It’s likely better for this than the text generators are. GPT-3 can come up with some pretty neat turns of phrase–“it was the kind of place where you could fall in love, or die, or maybe even do both at the same time”–but it also spits out a lot of blah, and is often nonsensical in a way that works better for images.
As always, I do not have a strong conclusion. Open thread!
Martin
Regarding the IP, current law says that copyright only applies to human-made work. Which sort of raises the question of where the human agency starts and ends. Is the author of an algorithm that generates art the creator? Can they copyright it?
So my guess is that nothing that DALL-E creates can be copyrighted.
Who is going to take the output of all of this to create video game assets and procedural storytelling?
Roger Moore
I’m not a lawyer, but I think copyright law has already hashed a lot of the details out. As far as I understand it:
ian
Great, now the machines can make art better than we can. Bring back the Luddites!
Major Major Major Major
@Martin: it might not be copyrightable, but does that by definition mean it can’t be infringement?
frosty
So there’s dozens of images of a lion holding a globe. Are they all from the 5-word phrase? Can artists control what the AI does or do they have to wade through hundreds of images to find one they like?
ETA: no, I didn’t RTFM that you linked to. I’ll do it now.
Roger Moore
@frosty:
It sounds like they can specify more detail. For example, the picture at the top asks for an oil painting and says the girl should be sad. I think the thing with the lion holding a globe is to show off the range of the AI when given a very vague request.
NotMax
The culmination of GIGO?
//
lowtechcyclist
I think that “pre-modern Japanese scroll about Bitcoin” desperately needs to be made into an NFT!
CaseyL
I looked through the “lion holding a globe” thread, and there is a strong flavor of “found this image online” for a few of them. Are these really generated fresh by the AI, or does it go off on its own internet searches to find likely images and then bring them back? Because if it’s doing the latter, that would definitely be a copyright infringement!
Major Major Major Major
@CaseyL: It definitely doesn’t search the web, but strictly speaking there’s no reason that any given generated file can’t be pretty similar to a single other image. GPT-3 definitely repeats snippets verbatim sometimes, depending on the parameters.
Owing to the way these things work though… I’m not sure how reverse-engineerable a generated image is.
VeniceRiley
Okay, let’s hook this up to a 3D printer.
Old School
@lowtechcyclist:
Although if there’s no copyright, then what would make your NFT more valuable than my NFT? Or the millions that I could make?
Roger Moore
@Major Major Major Major:
It seems like you could do a search using the DALL-E 2-generated image and see if there was an existing image that was suspiciously close. It’s not quite reverse engineering, but it would tell you if it was stealing wholesale from something available online.
Chief Oshkosh
They’re all male lions.
Ben Cisco
@VeniceRiley: And inject nanobots!
I mean, if we’re going to ignore all AI-related sci-fi, might as well go for the gold here…
NotMax
@Ben Cisco
Damn nannybots, force feeding me kale.
;)
RaflW
I don’t actually think humanity is at all prepared for how f**ked up everything is going to be when AIs can deepfake anything.
“Fake news” that’s just text-based lying is already poisoning much of political discourse and heightening people’s receptiveness to authoritarian rule. When any leader’s image and voice can be altered into appearing to say any old reckless, insane (or even positive) thing, we cannot trust sources of info for anything.
Maybe there’s a functional way through all this, but I feel like my trove of William Gibson novels is lurching ever closer to being a set of predictive roadmaps.
David Fud
Maybe they should take the text generations and make the AIs create art with it. For instance, I wonder what it would do with
A feedback loop of computer Frankensteins murmuring and virtually brushing for our entertainment.
Ben Cisco
@NotMax: LOL
Brachiator
Dealing with the bias and racism and sexism in generative AI is a huge challenge. The lion immediately reminded me of the stone lions at a library, the MGM logo and a host of other lion images, all male. And yeah, the Cowardly Lion from The Wizard of Oz.
The libraries of images are hugely biased. I think I read about this before, and did a quick check; so if you do a Google search for “shaking hands,” the result will overwhelmingly be of white men. Occasionally there will be a photo of two men shaking hands with a smiling white woman looking on with approval.
And there is a kind of weird defiance where the people who are working on AI projects seem to be happy to deal in stereotypes. And even some women and people of color who work in the industry conform, maybe because they don’t want to make waves.
ETA. Even when the image of “shaking hands” is a person who is ill, the result is a white male.
CaseyL
@RaflW: Humanity is constantly racing along with doing things “because we can” and has rarely if ever asked “… but should we?”
(Or at least, the ones who do ask that are never the ones who could actually stop whatever-it-is from being invented or developed.)
I was in a Twitter discussion a few weeks ago about the apparent lack of other “higher” intelligent life out in the universe.
My position is that sapience (the capacity for abstract thought and moral reasoning) may be a very overrated thing, in that it appears – in humans, at least – to lead to moral, political, and environmental disaster. Maybe there have been quite a few “sapient” species in the cosmos, and maybe every single one of them has already died out, from the same destructive selfishness and hubris that characterizes humankind.
“Sapience” allows for a level of abstraction that alienates us from one another, and from our organic, natural foundation.
I think the species who are, at most, sentient (less abstract, more concrete problem-solving) have a better chance of surviving over millions of years. Like, f’rex, almost everything that has lived on Earth other than modern mammals.
cain
If the artist is an AI generator – imagine trying to actually apply copyright law, and realize that it lasts, what, 75 years after the image was produced or after someone shut off the AI? :-)
different-church-lady
So why are we doing this?
Brachiator
Also, is DALL-E 2 a play on Salvador Dali?
different-church-lady
@RaflW: I’ve been saying this for the past five years: It’s like a bunch of tech bros found a stash of dystopian sci-fi novels from the ’50s and said, “Yeah, let’s do that!”
Brachiator
@CaseyL:
Not too sure about the idea of “natural foundation.” We primates have always been social. We can be alienated from one another. But we can also care for one another.
Most successful life forms, probably bacteria.
Major Major Major Major
@Brachiator: yep. Human bias in, human bias out.
Major Major Major Major
@CaseyL: humans are awesome, definitely in the top fifty animals.
we haven’t detected aliens yet because “Space is big. Really big. You just won’t believe how vastly hugely mind-bogglingly big it is. I mean, you may think it’s a long way down the road to the chemist, but that’s just peanuts to space.”
Steve in the ATL
@CaseyL:
Does anyone other than fundamentalist evangelicals seriously think that humans on Earth are the only higher intelligent life in the infinite universe? That’s really small-minded.
Martin
@Major Major Major Major: It can be infringement, but that’s going to be case by case. It’s possible for DALL-E to make infringing art, but in most cases it should be non-infringing (just statistically). And that makes for an interesting problem for the community to sort out – can DALL-E be augmented by an assessment of how derivative that art is, both of the content that it learns from, but also from other content. And then to reject that result as infringing. That alone would be useful if they could figure it out.
Major Major Major Major
@Steve in the ATL: fundie Mormons are big believers in extraterrestrial life!
Martin
@VeniceRiley: I’m waiting for generative design to take off.
CaseyL
@Brachiator: Bacteria, for sure, but I really can’t help wondering what life on Earth would be like if that damned asteroid hadn’t hit. There were many mass extinctions before that one, but it seems to be the only one that was primarily caused by an externality.
If not for the asteroid, “saurians” might still be around, in some form, even if there was a less dramatic extinction event.
Anyway, you don’t have to limit the successful species to bacteria. Reptilians have done quite well (crocodiles essentially unchanged for 40 million years).
Major Major Major Major
@CaseyL: intelligence evolved independently at least twice, possibly three times depending on how you count birds, there’s no real reason to think it couldn’t have happened again with something other than vertebrates and cephalopods.
LeftCoastYankee
@different-church-lady:
This!
I can’t wait for the version of Twitter where everyone’s sending each other pictures based on some limited number of words.
Conceptually cool and interesting until some grifter decides it can be monetized in a way that is worse than useless
different-church-lady
@Major Major Major Major: After a while the style settles down a bit…
different-church-lady
@LeftCoastYankee:
Roger Moore
@cain:
US copyright law already restricts copyright to humans and groups of humans. By definition, if something is created by a computer program, animal, random process, etc., it lacks the degree of creativity necessary to qualify. That seems unlikely to change. Also, corporate copyright (i.e. copyright held by a group rather than an individual) is already capped at 95 years from publication, not 70 years from the death of the creator. In the unlikely event copyright is extended to works created by AI, it will presumably be for that 95 years, not life plus 70 years.
Roger Moore
@Brachiator:
I assume it’s a play on Dali and WALL-E from the Pixar movie.
different-church-lady
@Roger Moore: You’re forgetting to consider AI lawmaking.
CaseyL
@Steve in the ATL: It’s not small-mindedness on my part. I would LOVE for there to be sapient life elsewhere in the galaxy. I grew up on Star Trek (the original series), and for most of my young life I expected us to be exploring new worlds and finding new civilizations by now.
But the more I learn about life on Earth (5 great extinctions over 3 billion years, with the evolutionary clock essentially resetting each time), and the more we learn about other planets orbiting other stars (too hot, too cold, no water, tidally locked, no tectonic activity), it just seems increasingly unlikely there are other species out there like us; i.e., highly evolved and specialized land mammals.
Or there were, but they rose millions of years before we did, and died off long, long ago.
The best bet for extraterrestrial life elsewhere in this solar system is to find thriving deep sea ecosystems (similar to the wide range of life we have around the deep ocean vents) under the ice of Europa, Titan, and a few other moons.
Roger Moore
@Major Major Major Major:
Humans are clearly S Tier.
Steeplejack
@RaflW:
Maybe we can skip through the bad bits to the AI realm of Iain M. Banks’s Culture.
different-church-lady
Artificial humanity.
Steve in the ATL
@CaseyL:
What do you have against the Greys and Pleiadians?
ETA: caught a show in a hotel room last week with that “I’m not saying it’s aliens…” guy.
SFBayAreaGal
I noticed the sad girl is white.
Steve in the ATL
@Roger Moore:
Surely you don’t think Disney is done extending this, do you?
Brachiator
@Steve in the ATL:
When I was a kid, the scientific consensus was that our solar system was the only one with planets. Period. End of issue.
Intelligent life elsewhere seems implausible, but we have barely begun seriously studying the universe.
sab
@Steve in the ATL: It’s only fair. Corporations live longer than people.
sab
@SFBayAreaGal: I thought she was brown.
Steve in the ATL
@sab: even though they are people!
sab
@Steve in the ATL: Cockroaches are nearly as tough as tardigrades.
Brachiator
@Roger Moore:
Capping AI copyright at 95 years seems crazily arbitrary.
This discussion also reminds me of some hard core anti-copyright people who don’t even believe that an artist’s children should be compensated for the creator’s work. They also want to see fans as possible co-creators.
Matt McIrvin
@Roger Moore: I could see copyright trolls attempting to copyright entire categories of images by using an AI they own to generate every possible variation, or something close enough for a copyright strike. I suspect versions of this have already happened.
Matt McIrvin
@Steve in the ATL: Disney may actually be done extending it–the politics around the issue has changed not in their favor.
different-church-lady
@Brachiator: The world appears to be breaking down into two camps: those who think everything should belong to them, and those who think everything should be free.
Sure Lurkalot
I’m failing to see the purpose of this while at the same time having a distinct fear that people will use this for nefarious reasons.
Steve in the ATL
@different-church-lady: we are caught between republicans and Bernie bros!
Roger Moore
@Steve in the ATL:
The sense I’ve gotten is that Congress is less interested in extending copyright than they have been in the past. I certainly won’t discount the possibility of it being extended yet again, but it doesn’t seem like the done deal it was before.
Brachiator
@different-church-lady:
This is especially true of entertainment. And the ability to easily reproduce works has exacerbated the problem. Stuff might just as well be free if everyone can get a copy.
When I studied literature, I noted but didn’t pay much attention to how people would be paid to transcribe the dialog of Elizabethan plays, and then rival companies would print the scripts and perform them or variations.
I had to look up the development of copyright in the West.
And of course later Dickens gave readings of his works because his copyrights were not respected, especially in America. Libraries grew in part thanks to people disrespecting copyright.
And so it goes. One unexpected development may be how the rapid development of inexpensive creative tools has led to fans being able to produce work that can be enjoyed on the same level as the product of original authors. And then things really get crazy when fans are able to monetize their efforts.
Roger Moore
@Brachiator:
Don’t think of it as capping AI copyright at some arbitrary length. Think of it as treating AI copyright the same as corporate copyright. Given that both AI and corporations have the same fuzzy concept of “life”, it seems like a reasonable approach.
I am one of the people who thinks our current copyright terms are way too long. I don’t have a great problem with the children of artists being able to get something from their parents’ work after their death. But I think 70 years after the author’s death is unreasonably long. It’s probably longer than the lifespan of any children, and frequently longer than the life of any grandchildren.
The key to me is that copyright was always intended to be for a limited term. That’s written into the Constitutional discussion of copyright. The idea is that the copyright holder gets a temporary monopoly on their work in exchange for making it public. After that term is over, the work goes into the public domain so other creators can adapt it. Making the term of copyright unreasonably long makes a mockery of that concept. It locks up creative works so nobody else can do anything with them.
You only have to look at the works that are out of copyright, like the plays of Shakespeare, or Alice in Wonderland, or the early Sherlock Holmes stories, to see how much really creative stuff can happen to works once they pass into the public domain. By constantly extending copyright, we’re ensuring that doesn’t happen to the great works of the 20th Century, and we’re making the whole world poorer in the process.
RSA
Much of what OpenAI does is basic scientific research; this is a speculative application. See their “about” page:
Basic research is often exploratory, to help us understand some phenomenon like general intelligence or creativity.
Brachiator
@Roger Moore:
Re: Capping AI copyright at 95 years seems crazily arbitrary.
I can see how the parity provides neat conformity. But even though corporations are not really alive, the owners and shareholders reasonably want to get all the value they can out of a work. I don’t see the equivalent expectation on the part of AI.
I think there is a reasonable compromise here. Artists often make bad deals before they even have families or children. And too often creators don’t get paid well, get poor profit participation or lose rights to their work. I tend to favor rules that are on the side of creators.
And some artists have children or second families late in life. Seventy years is not always a long period of time.
I previously noted one of the first copyright statutes.
The original term was 14 years. It’s interesting that the publisher, not the author originally held the copyright.
To the contrary, we often see corporations exploiting works so thoroughly that all that’s left is parody or the dregs of commercial advertising. And obsessive fan nostalgia often results in the lame recycling of the least creative aspects of the original work.
And yet here I am ultimately on your side. You have to let succeeding generations have access to works.
different-church-lady
@RSA:
Oh. Everything is fine then.
Eolirin
@different-church-lady: AI like this, but much more sophisticated, will eventually be necessary for entertainment industries (video games especially, but also film) to generate the output their projects need on a viable timescale, as audience demand grows in both scale and expected quality.
What this specific AI currently does is a bit like a toy, but it’s very early days.
Also, the natural language understanding part of this is just as impressive as the image generation, and has wildly more applicability.
On the broader topic, and not in response to you specifically, I think the copyright discussion is maybe missing something here. It’s unlikely that any of these generated images would be used as is, and even generating them involves a significant level of human intervention in choosing the right parameters. Going forward, there’s going to need to be a much greater discussion around what counts as a tool used by a human to create a copyrightable work, what counts as an uncopyrightable generation method, and what level of direct tweaking turns the latter into the former. Otherwise there’ll be weird edge cases in the law.
Stuff is going to be complicated. Especially when a piece with intensive human driven modification that would be viewed as human work and subject to copyright law may be difficult to distinguish from the auto generated work that isn’t.
different-church-lady
@Eolirin:
The problem here is that CGI is already making films suck ass.
RSA
There are many significant long-term risks with AI. If you can explain the potential harm here and possibilities for mitigation, that would be useful.
different-church-lady
@RSA:
I believe the potential for harm has been adequately described up-thread.
Why I should be the one on whom it is incumbent to explain how the harm can be mitigated is a question I do not follow.
Major Major Major Major
@Brachiator:
In the infinite universe, really?
RSA
@different-church-lady:
None of the concrete risks mentioned are related to the specific technology Major^4 describes. Deep fakes are a problem that AI makes possible. Bias is a problem in AI output. DALL-E 2 isn’t contributing to either of these problems. If people are using this post to say AI in general has risks, fine, but I was trying to get at how DALL-E 2 poses concrete risks, rather than maybe-novel challenges to copyright laws and such.
But you’re right, it’s not incumbent on you to come up with possible solutions. I don’t know if there are general ways to prevent private actors from developing and releasing software that others can use to do harm, except on a case-by-case basis. It may not be a solvable problem.