Cost-effective PyTorch model inference by…

Nov 19, 2023

How hard is it to deploy pretrained models on GPUs without tons of YAML files and unmanaged instances?

4 Comments

Re: pricing. I think you're comparing Modal's GPU-only price (which excludes CPU and memory) with AWS's bundled GPU+CPU+memory price.
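
To make the point concrete, here is a minimal sketch of a like-for-like comparison, assuming the provider bills GPU, CPU, and memory as separate line items. The function name and every rate below are hypothetical placeholders, not real Modal or AWS prices; only the shape of the arithmetic matters.

```python
def bundled_hourly_cost(gpu_rate, cpu_rate_per_core, mem_rate_per_gib, cores, mem_gib):
    """Total hourly cost when GPU, CPU, and memory are billed separately."""
    return gpu_rate + cpu_rate_per_core * cores + mem_rate_per_gib * mem_gib


# Hypothetical placeholder rates in USD per hour (NOT real provider prices).
gpu_only = 1.00
total = bundled_hourly_cost(
    gpu_rate=gpu_only,
    cpu_rate_per_core=0.10,
    mem_rate_per_gib=0.01,
    cores=4,
    mem_gib=16,
)

# The bundled figure, not the GPU-only one, is what should be compared
# against an instance price that already includes CPU and memory.
print(f"GPU only: ${gpu_only:.2f}/h, GPU+CPU+memory: ${total:.2f}/h")
```

With per-resource billing, the GPU-only figure understates the real cost by whatever the CPU and memory add-ons come to for the chosen container size.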

Good point. I will update the calculations when I have more time :)

Thanks for writing this, it was a great read. What are some things you think Modal is still lacking?

Maybe fine-grained control over the container scheduling algorithm? There's an experimental feature for this: https://modal.com/docs/guide/concurrent-inputs

As for storage, there is support for persistent file systems, but it works a bit differently from S3, so maybe object storage support as well?

Fikisipi by Filip Dimitrovski