Awesome! Yeah, I want to use it a heck of a lot. I’m building a sweet integration between long-form articles, wiki articles, book kinds, and text-to-speech DVMs, and yours is the DVM that produces the highest quality audio. But the output is not very reliable and doesn’t conform well to the NIP-90 spec (for example, the request event is not included in the output).

It would be amazing if you could improve it!

Also, it’s missing a NIP-89 announcement, so I can’t list it on Highlighter


Discussion

Alright, I just pushed some updates. Which means it's either broken or fixed.

- Published a NIP-89 announcement

- Added the kind 5250 request event to the output
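For context, NIP-90 says a job result uses the request's kind plus 1000 (so 5250 → 6250) and embeds the stringified request event in a `request` tag, which is the piece that was missing. A minimal sketch of that shape, with placeholder field values rather than real events:

```python
import json

# Hypothetical kind 5250 TTS request (id/pubkey are placeholders)
request_event = {
    "kind": 5250,
    "id": "<request-event-id>",
    "pubkey": "<customer-pubkey>",
    "tags": [["i", "<30023-event-id>", "event"]],
    "content": "",
}

def build_result(request, audio_url):
    """NIP-90 job result: kind = request kind + 1000, with the
    stringified request event embedded in a `request` tag."""
    return {
        "kind": request["kind"] + 1000,        # 5250 -> 6250
        "tags": [
            ["request", json.dumps(request)],  # the full stringified request
            ["e", request["id"]],              # points back to the request event
            ["p", request["pubkey"]],          # the customer
        ],
        "content": audio_url,                  # payload: link to the audio
    }

result = build_result(request_event, "https://example.com/audio.mp3")
```

The customer's client can then match results to requests by reading the `request` or `e` tag, which is what the broken outputs were preventing.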

The code for the DVM is not organized well so I might redo the whole thing if you think you'll use it often.

I had to update the pricing for the DVM because it was set at $0.01 for every request, regardless of length (the sat value would fluctuate with the bitcoin price). It's now set to $0.36 per 1,000 characters of content.

Let me know if it's no longer competitive, but it's the best price I can give right now with the services I'm using.

It currently only works for "event" type inputs, but I can add "text", if needed. It should work for other languages too.
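For reference, the two input styles differ only in the type marker on the NIP-90 `i` (input) tag; a quick sketch (the event id is a placeholder):

```python
# NIP-90 input tags have the shape ["i", <data>, <input type>]
event_input = ["i", "<30023-event-id>", "event"]          # data is a Nostr event id
text_input = ["i", "Read this sentence aloud.", "text"]   # data is raw text
```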

Does the DVM not work on any long-form content (kind 30023) event?

Are you requesting TTS for Long Form Content events or text?

And yeah, I plan on making this a very integral part of Highlighter, so I expect a lot of demand coming in, hopefully

Hell yeah 🤙

Pumped to see it get some action!

I’ll start working on it more and get some improvements going now that I know it’s not a dead project

Thanks for your help and feedback 🙏

I got you! I’ll start working on it again tonight

Might take me some time to figure it out again, but I’ll let you know when I have some updates pushed

I would love to turn all my written books in PDF format into audiobooks, read in Guy Swann's voice. Is that something this DVM could do as well?

The DVM currently only does text-to-speech for Nostr events, but I can update it to work with URLs if the PDF is available online

Full disclosure, though: the cost is $0.36 per 1,000 characters (not words), so a full-length book could cost more than $100 depending on the length
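At that rate the cost grows linearly with character count; a rough check, assuming a 500,000-character book (the length is an assumption, not from this thread):

```python
RATE_USD_PER_1000_CHARS = 0.36  # the quoted price above

def tts_cost_usd(num_chars):
    """Cost of synthesizing `num_chars` characters at $0.36/1,000 chars."""
    return num_chars / 1000 * RATE_USD_PER_1000_CHARS

# 500,000 chars -> 500 * 0.36 = $180, consistent with "more than $100"
print(round(tts_cost_usd(500_000), 2))  # prints 180.0
```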

I see. Is it that expensive to do inference on the GPU?

It's an API wrapped as a DVM, and that's the service cost of the API

I'm not running my own model or hardware

I can make a much cheaper DVM, but the largest part of the expense is the voice cloning for the service I'm using