All Posts

  • Published on
Last week I blogged about how quantization can help you run your models on lower-powered hardware. In today's blog, I extend that discussion to ONNX (Open Neural Network Exchange), which provides a standard format for representing machine learning models. This enables interoperability between frameworks and simplifies deployment across diverse hardware, including browser-based inference with onnxruntime-web. I have also included a demo that runs a model in the browser.
  • Published on
When storing data in memory, the data type used to represent it affects both memory usage and overall system performance. Consider saving a number: at a high level, it can be either an integer (a whole number) or a floating-point number (one with a decimal part). Floating-point numbers can represent a larger range of values with higher precision. Weights and biases in a large language model, which are learned during training and used to make predictions, are stored as floating-point numbers to maintain that precision. The number of these parameters determines the model's size, its memory footprint, and how much computational power is needed to run it. In this post, we will discuss how quantization can reduce the memory usage of models and improve performance (assuming the loss of precision is acceptable).
  • Published on
In one of my previous articles, I discussed why and how to adopt Infrastructure as Code (IaC) to manage your cloud infrastructure efficiently. There are several tools and frameworks available for IaC, most notably Terraform, Pulumi, Ansible, and Puppet. These tools allow you to define and manage your infrastructure as code, enabling automation, repeatability, and scalability in your cloud environment. In this article, I want to discuss OpenTofu, an open-source alternative to Terraform that has gained popularity recently.
  • Published on
Storage costs can quickly add up as data volumes grow. Automatic tiering is a powerful technique for optimizing storage expenses: it moves data between storage tiers based on access patterns and business requirements. In multi-cloud environments, tiering is even more important, as it lets you leverage the best storage options across different cloud providers. In this article, I will discuss building a solution around automatic tiering using MinIO as the storage backend.
  • Published on
Humans are capable of complex reasoning. When posed with a problem, we break it into smaller steps and work through them iteratively, learning and solving until we reach the end goal. Most AI models until now haven't been capable of complex reasoning tasks that require multi-step thinking and adaptive learning. Last month, OpenAI released o1, which addresses these challenges by incorporating Chain-of-Thought reasoning and reinforcement learning to achieve near-human reasoning capabilities.
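The memory-versus-precision tradeoff described in the quantization post above can be sketched in a few lines. This is a toy illustration, not the post's actual method: the weight values and the symmetric int8 scaling scheme are made up for the example.

```python
import numpy as np

# Hypothetical float32 values standing in for model weights.
weights = np.array([-0.42, 0.07, 0.31, -0.18, 0.25], dtype=np.float32)

# Symmetric int8 quantization: map the observed float range onto [-127, 127].
scale = np.abs(weights).max() / 127.0
quantized = np.round(weights / scale).astype(np.int8)

# Dequantize to see what precision was traded for the 4x memory saving.
restored = quantized.astype(np.float32) * scale

print(quantized)  # 1 byte per value instead of 4
print(restored)   # close to the originals, but not exact
```

Each value now takes one byte instead of four, and the round-trip error is bounded by half the scale step, which is the "acceptable loss of precision" the post refers to.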