BitNet.cpp is Microsoft’s official inference framework for 1-bit LLMs such as BitNet b1.58. It runs large low-bit models on standard CPUs, with no GPU required. The framework provides optimized kernels for fast, lossless inference of 1.58-bit models, in which each weight takes one of three values (-1, 0, or +1), or about log2(3) ≈ 1.58 bits per weight. The first release focuses on CPU inference, with GPU and NPU support planned. This lets you run models that...
