A new rabbit hole (AI art & music)

Sorry, that is not correct. CUDA cores and CUDA support are available for a lot of cards, not only for RTX xxxx models.

The GTX 1660 and the RTX 2080 are based on the same architecture and both support CUDA compute capability 7.5: Turing (microarchitecture) - Wikipedia

The most limiting factor will be the VRAM. But even the 1660 should render something at small resolutions; this was also covered in the excellent blog posts about the text-to-image scripts: Text-to-Image Summary – Part 1 | Softology's Blog

Sorry for being imprecise: "CUDA cores" was not the right term; RT and Tensor cores are what I meant…

According to Softology's blog you may be able to use a non-RTX card, but the reduction in settings that would be required seems to make it mostly not worth the effort.

My assertion that an RTX card was required was based directly on the machine learning mode prerequisite installation instructions:

An NVIDIA 3090 GPU with 24GB VRAM is highly recommended and will get the best performance and highest resolution outputs.

An NVIDIA 2080 GPU with 8GB VRAM would be the bare minimum hardware spec and will only give you small resolution images.

If you have an older GPU or one with less than 8GB VRAM, do not bother. You will only be disappointed and complain “it doesn’t work”.

Hmmm :face_with_raised_eyebrow:

I uploaded a video to YouTube of some cool text-to-image morphing I made, but it won't play for me. Does this work for anyone?

Works for me :smiley:

Oh good, must just be my browser playing up for some reason… time to delete some caches and cookies I guess

Works for me on Ubuntu Studio 21.10 with Brave Browser. Really cool!

More alien synthesizers from beyond this realm, for your inspiration :sunglasses:

Even more alien synthesizers from beyond this realm, for your inspiration :sunglasses:

The second image is really nice! What script are you using? I am halfway through, but I have not found anything that comes even close to your creations.

Currently I am using the Disco Diffusion v5 script.

For the above images I used:

  • a custom size of 640x360 (this gets resized by VoC to an acceptable ratio)
  • 7500 guidance scale
  • 250 range scale
  • CLIP denoised turned on
  • Secondary model turned on (turning this off seems to double the rendering time)
  • 350 iterations
  • Super resolution output using ruDALL-E Real-ESRGAN x2
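For reference, the settings above can be collected into a plain Python dict (the key names here are illustrative; they do not necessarily match the actual parameter names used by the script or by Visions of Chaos):

```python
# Sketch of the Disco Diffusion v5 settings described above.
# Key names are illustrative assumptions, not the script's real parameters.
settings = {
    "width": 640,                 # custom size; VoC resizes to an acceptable ratio
    "height": 360,
    "guidance_scale": 7500,
    "range_scale": 250,
    "clip_denoised": True,
    "use_secondary_model": True,  # turning this off seems to double render time
    "iterations": 350,
    "upscaler": "ruDALL-E Real-ESRGAN x2",  # super-resolution output
}

# 640x360 keeps the 16:9 aspect ratio of the final renders.
aspect_ratio = settings["width"] / settings["height"]
print(f"Aspect ratio: {aspect_ratio:.2f}")  # prints: Aspect ratio: 1.78
```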

But really, that is all to do with image quality; in terms of what you get, it is all about text prompt trial and error, and using some sneaky tricks.

I slightly change the text prompt each time, adding a word, changing a word, or removing a word depending on the outcome.
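That add/change/remove loop is just manual editing between runs, but it can be sketched as simple word-level edits on the prompt string (this helper is entirely hypothetical, not part of any script):

```python
def tweak_prompt(prompt, add=None, remove=None, replace=None):
    """Return a new prompt with a word added, removed, or swapped.

    Hypothetical helper illustrating the trial-and-error loop of
    changing one word per run and keeping what works.
    """
    words = prompt.split()
    if remove is not None:
        words = [w for w in words if w != remove]
    if replace is not None:
        old, new = replace
        words = [new if w == old else w for w in words]
    if add is not None:
        words.append(add)
    return " ".join(words)

p = "Modular synthesizers with dials scaffolding"
p = tweak_prompt(p, remove="scaffolding")  # drop a word after a few runs
p = tweak_prompt(p, add="macro")           # add a word for depth of field
print(p)  # prints: Modular synthesizers with dials macro
```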

For the final image I ended up with the text prompt: Electronic piano keyboards and Modular synthesizers with dials arcane wires instruments, GI render macro ZBrush CGSociety

The GI render is a trick to make the image look computer generated (check out global illumination to see what that looks like)

I introduced the macro part after the first 5 to get rid of the black and white style and introduce a bit of depth of field

And I think the first prompt had the word scaffolding included, but I removed that at some point.

Many thanks! That worked instantly.

I was triggered by the mention of Tensor cores.

I do know TensorFlow, Google's open source machine learning platform.

So I was wondering… has AI/machine learning entered the consumer hardware processor world?

And yes… AI/machine learning in a consumer processor is a thing these days: the Tensor Processing Unit.

And they are not the only ones. The NVIDIA Jetson is another consumer ML board.

I love all of these! Especially with the depth of field turned on, it looks like you got very close to this weird machine and snapped a picture with a DSLR.

Time is definitely not standing still. Google has another spin-off from the TensorFlow project:

Magenta. ‘A research project exploring the role of machine learning in the process of creating art and music.’

And Google is (of course) not the only one moving into that territory…

But…that’s another Rabbit hole…

Very tangentially related, but I thought this video was really interesting.

I'm now experimenting with the Cutout batches setting.

It has a significant effect on render time (i.e. it takes longer).

But potentially gives more detailed results?

These were rendered with a value of 4 and took > 5 minutes each.

I've read online that the difference with Cutout batches is more noticeable with a seed image, so I took this photo of my modular setup.

Using that as a seed image gave me these results:
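The two knobs involved in this experiment can be sketched like so. I believe they correspond to `cutn_batches` and `init_image` in the Disco Diffusion notebook, but treat those names as assumptions, especially when running the script through Visions of Chaos:

```python
# Sketch of the cutout-batches experiment above.
# Parameter names (`cutn_batches`, `init_image`) are assumed from the
# Disco Diffusion notebook and the filename is hypothetical.
run_a = {"cutn_batches": 1, "init_image": None}           # baseline
run_b = {"cutn_batches": 4, "init_image": None}           # > 5 min renders
run_c = {"cutn_batches": 4, "init_image": "modular.jpg"}  # seed image run

# More cutout batches means the CLIP loss is averaged over more cutouts
# per step, so render time grows roughly linearly with the setting.
relative_work = run_b["cutn_batches"] / run_a["cutn_batches"]
print(f"~{relative_work:.0f}x the cutout work per step")  # prints: ~4x the cutout work per step
```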

Ah, Derek Muller’s Veritasium channel. Many high quality educational deep dives into many interesting subjects.

This one is on analog computers, neural networks and AI.

Ironically in VCV Rack we’re often trying to simulate analog systems in a digital environment…

I can’t really hope to run this stuff on my own, but I got an invite to Midjourney yesterday and I have hardly been doing anything else since then :smiley: Crazy stuff.

and, off topic:

Too lazy to read and check whether it was suggested already, but it would be really cool to create an AI that works with VCV. Maybe force it to replicate sounds with a set of VCOs. Haha, I hope there is someone crazy enough to do it.