Holly Herndon on the power of machine learning and developing her digital twin Holly+

The FADER: Holly Herndon, thank you for joining us today for The FADER interview.

Holly Herndon: Thanks for having me.

So Holly+ has been live for about 24 hours. How have you felt about its reception so far?

Honestly, I've been really pleased with it. At one point there were 10 hits to the server a second, which means people were going a little insane uploading stuff, and that's basically what I wanted to happen. I'm also happy that people understand this is still a nascent technology, so it's not a perfect rendition of my voice, but it's still, I think, a really interesting and powerful tool. Most people really got that, so I've been really pleased.

One of the things that drew me to Holly+ when I first read about it in a press release was that the technology seems to be developed specifically for this moment. It's nascent and still growing, but it feels like an attempt to get in on the ground floor of something that's already happening in a lot of different sectors of technology.

I like to say machine learning rather than artificial intelligence, because artificial intelligence is such a loaded term. People imagine something like Skynet, something sentient. So I'm going to use machine learning for this conversation. But I've been working with machine learning for several years now. On the last album that I made, PROTO, I was creating early models of my voice and also the voices of my ensemble members, trying to create a weird hybrid ensemble where people are singing with models of themselves. So it's been going on for a while, and of course machine learning has been around for decades, but there have been some really interesting breakthroughs in the last several years, which I think is why you see so much activity in this space now.

It's just so much more powerful now, I think, than it was a couple decades back. Some really interesting style-transfer white papers were released. So I think it's an exciting time to be involved with it, and I really wanted to release a public version of a technique similar to the one I was using on my album, something people could just play with and have fun with. I was actually reaching out to people on social media, and Yotam sent me back one of my videos, but with the audio translated into a kind of orchestral passage. He was like, "I'm working on this exact thing right now." So it was perfect timing, and we linked up and started working on this specific Holly+ model.

Talk to me a little bit about some of those really powerful developments in machine learning that have informed Holly+.

Oh gosh, there's a whole history to go into. A lot of the research that was happening previously used MIDI: people were analyzing MIDI scores to create automatic compositions in the style of an existing composer, or a combination of existing composers. And I found that not so interesting. I'm really interested in the sensual quality of audio itself, and I feel like so much is lost in a MIDI file, so much is lost even in a score. It's the actual audio as material that I find really interesting. So when the style transfers started to happen and we could deal with audio material rather than a representation of audio through MIDI or score material, that's when things got to be really interesting.

Imagine if you could do a style transfer of any instrument onto any other instrument. The musical decisions one might make as a vocalist or a trombonist or a guitarist are very different depending on the physical instrument you're playing. If you can translate that to another instrument that would never make those same decisions, and I'm giving the instrument sentience here, but you know what I mean. If you can take the kinds of decisions a musician would make playing a specific instrument and translate them onto others, I find that a really interesting new way to make music and to find expression through sound generation. And I do think it is actually new for that reason.
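[To make the concept concrete, here is a minimal sketch of spectrogram-based audio style transfer in the spirit of the white papers Herndon mentions. It follows the widely known random-CNN method, not the unpublished Holly+ pipeline, and the file names are hypothetical.]

```python
# A minimal sketch of audio-to-audio style transfer. This is NOT the
# Holly+ pipeline (Never Before Heard Sounds hasn't published theirs);
# it follows the well-known random-CNN spectrogram approach.
import numpy as np
import librosa
import torch
import torch.nn.functional as F

SR, N_FFT = 22050, 1024

def log_spectrogram(path):
    y, _ = librosa.load(path, sr=SR, duration=10.0)
    S = np.abs(librosa.stft(y, n_fft=N_FFT))             # (freq, time)
    return torch.from_numpy(np.log1p(S)).float()[None]   # (1, freq, time)

content = log_spectrogram("content.wav")  # e.g. a vocal line (hypothetical file)
style = log_spectrogram("style.wav")      # e.g. a trombone recording

# A fixed random 1-D convolution over time (frequency bins as channels)
# serves as the feature extractor -- no training needed.
torch.manual_seed(0)
conv = torch.nn.Conv1d(N_FFT // 2 + 1, 512, kernel_size=11)
for p in conv.parameters():
    p.requires_grad_(False)

def gram(feats):
    f = feats.squeeze(0)            # (channels, time)
    return (f @ f.t()) / f.numel()  # correlations capture "texture"

target_content = conv(content)
target_style = gram(conv(style))

# Optimize the spectrogram itself to match content features and style texture.
x = content.clone().requires_grad_(True)
opt = torch.optim.Adam([x], lr=0.05)
for _ in range(300):
    feats = conv(x)
    loss = F.mse_loss(feats, target_content) + 1e3 * F.mse_loss(gram(feats), target_style)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Invert the optimized magnitude spectrogram back to audio.
audio = librosa.griffinlim(np.expm1(x.detach().clamp(min=0).squeeze(0).numpy()), n_fft=N_FFT)
```

[Real systems use learned models rather than a fixed random filter, but even this toy version moves the texture of one recording onto the phrasing of another.]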

I also wanted to talk about some of the ethical discussions around machine learning and some of the developments that have happened over the past year. The last time we spoke, it was about an AI-generated Travis Scott song that was created from his music without his consent. Over the past year, a Korean company also developed an AI based on a deceased popular singer, and an entire reality show called AI vs. Human was created around it. I was wondering if these sorts of developments informed how you approached Holly+ and the more managerial aspects of how you wanted to present it to the world.

This is something that I think about quite a lot. Voice models, or even physical-likeness models or style emulation, open up a whole new set of questions for how we deal with IP. We've been able to re-animate our dead through moving pictures or through samples, but this is a brand-new field in that you can have the person do something that they never did. It's not just replaying something they've done in the past. You can re-animate them and give them entirely new phrases that they may never have approved of in their lifetime, or, for living artists, that they might not approve of now. So I think it opens up a kind of Pandora's box.

And I think we're already there. If you saw the Jukebox project, it was super impressive: they could start with a known song and then continue it with new phrases and new passages in a way that fit the original style. It's really powerful. And we see some of the really convincing Tom Cruise deepfakes and things like that. These are part of our new reality, I think. So I wanted to jump in front of that a little bit. There are different ways you could take it. You could try to be really protective over yourself and your likeness, and we could get into this kind of IP war where you're doing takedowns all the time and trying to hyper-control what happens with your voice or your likeness.

And I think that's going to be a really difficult thing for most people to do unless you have a team of lawyers, and I'm sure that's probably already happening for people who do. But I think the more interesting way is to open it up and let people play with it, have fun with it, and experiment. Then, if people want an officially approved version of something, that goes through myself and my collective, which is represented through a DAO, and we can vote together on the stewardship of the voice and the likeness. It really goes back to fundamental questions: who owns a voice? What does vocal sovereignty mean?

These are huge questions, because in a way a voice is inherently communal. I learned how to use my voice by mimicking the people around me, through language, through centuries of evolution on that, and even through vocal styles. With a pop vocal, you're often emulating something that came before and then performing your individuality through that communal voice. So I wanted to find a way to reflect that communal ownership, and that's why we decided to set up the DAO: to steward it as a community, essentially.

I saw on Twitter that Professor Aaron Wright described DAOs as "subreddits with bank accounts and governance that can encourage coordination rather than shitposting and mobs." So how did you choose the different stewards that make up the DAO?

That's a really good question, and it's an ongoing thing that's evolving. It's easy to say, "We figured out the DAO and it's all set up and ready to go," but it's actually in process, and we're working through the mechanics as we go. The legal structures around it are also unfolding in real time. Aaron, who you mentioned, was part of the OpenLaw team that helped pass legislation in Wyoming recently to allow DAOs to be legally recognized entities, kind of like an LLC, because there are all kinds of complications, really boring to most people probably, around what happens when a group of people ingests funds: who is liable for tax, and so on? So there are all kinds of regulatory frameworks that have to come together to make this a viable thing.

And Aaron's done a lot of the really heavy lifting on making some of that come about. In terms of our specific DAO, we're starting it out with me and Mat. We started the project together, and we've also invited in our management team from RVNG and also Chris and Yotam from Never Before Heard Sounds, who created the voice model with us. We also plan on having a gallery that we're working on with Zora. The idea is that people can make works with Holly+ and submit those works to the gallery. For the works that are approved or selected, there's a split between the artist and the gallery, the gallery actually being the DAO. And any artist who presents in that way will also be invited into the DAO. So it's ongoing. There will probably be other ways to onboard onto the DAO as we go, but we want to keep it really simple as we start and not put the cart before the horse.

Now, of course, Holly+ is free to use for anyone who visits the website. I was hoping you could explain how the average listener or consumer of art can discern the difference between an official artwork that's been certified by the DAO versus something that was just uploaded to the website and put into a track or a piece of art?

This is something we had to think about for a long time: "Do we want people to ask for permission to use it in their tracks to release on Spotify or upload?" We came to the conclusion that we just wanted people to use it. It's not about trying to collect royalties that way; I just want people to have fun with it and use it. So in terms of creating works and publishing them, it's completely free and open for anyone to use. We're treating it almost like a free VST at this point. You can use it on anything, and what you make with it is 100% yours, and you can publish that.

We do have this gallery that we're launching on Zora. That space is a little bit different, in that you can propose a work to the DAO, and then the DAO votes on which works we want to include in the gallery. For those works there would be a profit split between the DAO and the artists, and the funds ingested from that, if those works do sell, basically go back to producing more tools for Holly+. It's not about trying to make any kind of financial gain, really. It's about trying to continue the development of this project.
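[As a toy illustration of the gallery mechanics described here: the interview doesn't specify the actual split terms or any contract, so the function and the 50/50 default below are purely hypothetical.]

```python
# A toy model of the artist/DAO gallery split described above. The real
# Holly+ terms aren't given in the interview; the 50/50 default is
# purely illustrative.
def split_sale(proceeds: float, artist_share: float = 0.5) -> dict:
    """Divide a gallery sale between the artist and the DAO treasury,
    with the treasury's share earmarked for further Holly+ tooling."""
    artist_cut = proceeds * artist_share
    return {"artist": artist_cut, "dao_treasury": proceeds - artist_cut}

print(split_sale(2.0))  # {'artist': 1.0, 'dao_treasury': 1.0}
```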

Do you have any idea of what those future tools could look like right now?

Well, I don't want to give too much away, but there will be future iterations. There might be some real-time situations, there might be some plugin situations; there are all kinds of things we're working on. With this first version, Chris and Yotam figured out how to transfer polyphonic audio into a model, which is... I'm a major machine learning nerd, so for me, I'm like, "Oh my God, I can't believe you all figured that out." That's been such a difficult thing for people to figure out. Usually people are doing monophonic, simple one-instrument lines. But you can just put in a full track and it will translate it back. And what you get back still has that scratchy machine-learning, neural-net sound to it.
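[For context on why polyphonic transfer is notable: monophonic pipelines, DDSP-style systems for example, typically reduce the input to a single pitch-and-loudness contour before resynthesis, which is meaningless for a chord or a full mix. Below is a rough sketch of that reduction step, assuming librosa; "input.wav" is a hypothetical file, and this illustrates the general approach, not the Holly+ model.]

```python
# Sketch of the monophonic reduction step that polyphonic input breaks.
import numpy as np
import librosa

y, sr = librosa.load("input.wav", sr=16000)  # hypothetical input file

# pyin tracks ONE fundamental frequency per frame -- fine for a solo
# voice, undefined for a chord or a full mix.
f0, voiced_flag, voiced_prob = librosa.pyin(
    y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C6"), sr=sr
)
loudness = librosa.feature.rms(y=y)[0]

# A monophonic model can be driven by these two contours alone; a system
# that accepts full polyphonic tracks has to work from the raw audio.
print(np.nanmean(f0), loudness.mean())
```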

Because it has that quality, it's easier for me to just open it up and allow anyone to use it freely. As the tools evolve, and speech and a more naturalistic likeness of my voice become possible, I think that opens up a whole new set of questions around how that IP should be treated. And I certainly don't have all of the answers. It's something that I'm learning in public, doing and figuring out along the way. But I see this coming along the horizon, and I wanted to try to find cool and interesting and somehow fair ways to work this out as we go.
