Memory Efficient Routing of Large Language Model Inference Requests

Published in US Patent, 2025

Filed on 18/06/2025