Google is intensifying its efforts to run generative artificial intelligence (GenAI) directly on consumer devices. The company is championing LiteRT, a framework billed as a "universal" solution for on-device machine learning. Its variant for language models, LiteRT-LM, is being highlighted for enabling rapid, local processing of AI tasks and reducing dependence on the cloud.
LiteRT-LM is now powering GenAI features within Google's own Chrome browser, Chromebook Plus devices, and the Pixel Watch. This marks a significant step in making advanced AI capabilities accessible without a constant network connection. The framework promises "blazing fast" performance and is claimed to outperform established alternatives such as llama.cpp on various hardware configurations.
FRAMEWORK UNDERPINNINGS
The LiteRT framework itself is presented as an evolution of Google's earlier TensorFlow Lite (TFLite), aiming for "high-performance deployment." It offers acceleration across multiple hardware types, including CPUs, GPUs, and the dedicated Neural Processing Units (NPUs) found in modern chipsets. Qualcomm and MediaTek are explicitly named as partners in this effort, integrating LiteRT to harness their NPUs for on-device AI.
LiteRT is designed to be cross-platform, supporting operating systems such as Android and iOS, alongside desktop and even Internet of Things (IoT) devices like the Raspberry Pi. The framework facilitates the deployment of popular open-weight GenAI models, including Google's own Gemma family, which offers various sizes suitable for different device capabilities.
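To make "sizes suitable for different device capabilities" concrete, here is a rough Python sketch of the arithmetic involved: weight storage is approximately the parameter count times the bits per weight, divided by eight. The variant names, parameter counts, and memory budgets below are hypothetical placeholders chosen for illustration, not published Gemma figures, and the estimate ignores activations, caches, and runtime overhead.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ModelVariant:
    name: str             # hypothetical label, not a real model identifier
    parameters: int       # number of weights in the model
    bits_per_weight: int  # e.g. 4 for 4-bit quantization, 16 for fp16


def weight_bytes(variant: ModelVariant) -> int:
    """Approximate weight storage only: parameters * bits / 8."""
    return variant.parameters * variant.bits_per_weight // 8


def pick_variant(
    variants: Sequence[ModelVariant], budget_bytes: int
) -> Optional[ModelVariant]:
    """Return the largest variant whose weights fit in the budget.

    Ties on parameter count are broken in favor of higher precision.
    Returns None when nothing fits.
    """
    fitting = [v for v in variants if weight_bytes(v) <= budget_bytes]
    return max(
        fitting,
        key=lambda v: (v.parameters, v.bits_per_weight),
        default=None,
    )


# Illustrative values only (assumptions, not vendor data).
VARIANTS = [
    ModelVariant("small-int4", 1_000_000_000, 4),
    ModelVariant("medium-int4", 4_000_000_000, 4),
    ModelVariant("medium-fp16", 4_000_000_000, 16),
]
```

Under these assumed numbers, a device that can spare about 3 GB for weights would get the 4-bit medium variant, while one with only 100 MB available would get nothing, which is the kind of trade-off a family of model sizes is meant to address.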
DEVELOPMENT AND DEPLOYMENT
For developers, Google provides tools and resources to convert existing PyTorch GenAI models into formats compatible with LiteRT. This includes a Python module called the Torch Generative API and a dedicated LiteRT GenAI Model Zoo hosted on platforms like Hugging Face. A command-line interface (CLI) is also available, allowing for more direct interaction and deployment.
The company emphasizes the production-readiness of LiteRT-LM, citing its open-source nature and active development. Recent updates have introduced capabilities such as Swift APIs for iOS integration, support for running models on Windows, and NPU acceleration specifically for Gemma models.
BACKGROUND
The push for on-device AI reflects a broader industry trend seeking to enhance user privacy, reduce latency, and improve the reliability of AI-powered features. By processing data locally, devices can potentially offer more immediate responses and keep sensitive information from leaving the user's hardware. The development of frameworks like LiteRT aims to democratize access to these sophisticated AI capabilities, making them a standard component of everyday technology.