NVIDIA’s incredible new AI that can create images, and more

07.03.2023 By admin

Today we are going to look at NVIDIA’s incredible new AI that can create images, and more.

Now, wait a second.

Stop right there.

Every Fellow Scholar knows that today, there are plenty of text-to-image AIs out there, where in goes a piece of text, and out comes an image.

They come in all kinds of flavors these days.

Everyone knows.

So our question today is why publish this paper? Do we really need more of these? Well, this new paper is called StyleGAN-T.

Keep your eyes on this part, because this means that this is a GAN-based technique.

A GAN is a Generative Adversarial Network.

This roughly means that we have two neural networks competing against each other, and as they compete, they get better together.

Okay, that all sounds great, but I am still not convinced.

What does this give us? Why would we even use this? Well, there are two excellent reasons.

Reason number one, GANs are excellent at latent-space interpolation.

What does that mean? It means that we can create these interesting 2D spaces and choose a point on this plane, which, in this case, corresponds to a font.

And the points nearby hide other fonts that are similar to this one.

So as we start exploring nearby, we get a beautiful, smooth morphing animation between these fonts.
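
For the Fellow Scholars who like to see this in code, here is a minimal sketch of what walking between two points of a latent space could look like. Note that this is not the paper's implementation: `toy_generator` is a hypothetical stand-in for a trained generator network, the 8-dimensional latent vectors are made up for illustration, and spherical interpolation (`slerp`) is just one common way to move between Gaussian latent vectors.

```python
import math
import random


def lerp(z0, z1, t):
    """Linear interpolation between two latent vectors, with t in [0, 1]."""
    return [(1.0 - t) * a + t * b for a, b in zip(z0, z1)]


def slerp(z0, z1, t):
    """Spherical interpolation between two non-zero latent vectors."""
    norm0 = math.sqrt(sum(a * a for a in z0))
    norm1 = math.sqrt(sum(b * b for b in z1))
    cos_omega = sum(a * b for a, b in zip(z0, z1)) / (norm0 * norm1)
    cos_omega = max(-1.0, min(1.0, cos_omega))  # guard against rounding error
    omega = math.acos(cos_omega)
    s = math.sin(omega)
    if s < 1e-6:
        # Vectors are (anti)parallel: fall back to a straight line.
        return lerp(z0, z1, t)
    w0 = math.sin((1.0 - t) * omega) / s
    w1 = math.sin(t * omega) / s
    return [w0 * a + w1 * b for a, b in zip(z0, z1)]


def toy_generator(z):
    """Hypothetical stand-in for a trained generator: latent -> tiny 'image'."""
    return [math.tanh(v) for v in z]


def morph(z_start, z_end, num_frames, interpolate=slerp):
    """Generate num_frames (>= 2) outputs along the path from z_start to z_end."""
    frames = []
    for i in range(num_frames):
        t = i / (num_frames - 1)
        frames.append(toy_generator(interpolate(z_start, z_end, t)))
    return frames


rng = random.Random(0)  # fixed seed so the sketch is repeatable
z_a = [rng.gauss(0.0, 1.0) for _ in range(8)]
z_b = [rng.gauss(0.0, 1.0) for _ in range(8)]
frames = morph(z_a, z_b, num_frames=5)
```

The first and last frames correspond to the two chosen points, and the frames in between are the nearby points we pass through. With a real generator, each of those would be an image, and playing them in order would give the morphing animation.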

In our earlier paper, we did something similar with photorealistic material models, so artists can find, or even better, adjust a material so that it fits their virtual worlds best.

So this new technique supposedly can do proper latent-space exploration for text-to-image.

Supposedly.

Now let’s see if it is true in practice here too.

Here is a previous technique, the crowd favorite, Stable Diffusion.

This can make an interesting video, but as you see, the results are quite jumpy.

It doesn’t feel like one result morphs into the next one.

And now, let’s see the new technique.

Oh yes, now that’s what I am talking about! With this, we get more continuous results and can explore these latent spaces as much as we desire, and that is going to be super useful.

You see, what we can do with this is that we write a prompt, for instance, “A corgi’s head depicted as an explosion of a nebula.”

And we don’t just get an image anymore.

No-no, due to its amazing interpolation capabilities, we get an opportunity not only to witness the birth of the universe, but also to choose the good boy that we find to be the most adorable.

I choose this one.

Right before it morphs into a cat.

Yes, this one will do.

Which one is your favorite? Let me know in the comments below.

So its latent-space exploration capability is not just an afterthought here, it is one of the new technique’s key features.

Now, remember, I mentioned that this is reason number one for why we should use it.

So what is reason number two? Well, two, it is fast.

Real fast.

But to know how fast exactly, let’s pop the hood and have a look.

Now hold on to your papers, Fellow Scholars, and…what? 0.1 seconds per image? Is that really possible? Wow.

These animations can be made practically in real time! The age of real-time AI image synthesis, and even video synthesis, is here.

My goodness! It did not take decades, it didn’t even take years.

Less than a year after OpenAI’s DALL-E 2, which took approximately 10-15 seconds per image, we are here.

Real time.

I can’t believe it.

Wow! This is truly incredible.
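
If you would like to check the arithmetic yourself, here is a quick back-of-the-envelope sketch. It uses only the rough per-image timings quoted in this video (0.1 seconds for the new technique, approximately 10-15 seconds for DALL-E 2); these are not measurements of mine, and the function and constant names are made up for illustration.

```python
def images_per_second(seconds_per_image):
    """Convert a per-image time into a throughput."""
    return 1.0 / seconds_per_image


def speedup(slow_seconds, fast_seconds):
    """How many times faster the fast method is than the slow one."""
    return slow_seconds / fast_seconds


NEW_TECHNIQUE_S = 0.1    # seconds per image, as quoted in the video
DALLE2_S = (10.0, 15.0)  # approximate range, as quoted in the video

fps = images_per_second(NEW_TECHNIQUE_S)
speedups = tuple(speedup(s, NEW_TECHNIQUE_S) for s in DALLE2_S)

print(f"{fps:.0f} images per second")
print(f"{speedups[0]:.0f}x to {speedups[1]:.0f}x faster than the quoted DALL-E 2 range")
```

So, taking those quoted figures at face value, we are looking at roughly 10 images per second, or about a hundred times or more faster per image.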

However, not even this technique is perfect.

Let’s see a failure case.

A sign that says deep learning.

Come on, this one again? Remember our moment with DALL-E 2? It had the same issue.

There are techniques out there that do much better on text. For instance, Imagen Video is better for this; however, it is not nearly as fast as this.

Yes, that one is about a hundred times slower per image.

So, the perfect text-to-image AI still doesn’t exist. Every technique offers its own little tradeoff, but man, are they all getting better and better at an insane pace.

Amazing new papers are popping up every week.

So, what do you think? What would you use this for? Let me know in the comments below! Thanks for watching and for your generous support, and I’ll see you next time!