5 repos
Database Engines — Data & Databases
We curate 5 GitHub repositories matching data & databases · Database Engines. Refine with filters or upvote what's useful.
Database Engines — Data & Databases
- denoland/deno
denoland/deno
106,258Deno is a high-performance runtime for JavaScript and TypeScript that prioritizes security and developer productivity. Built on the V8 engine, it provides a secure execution environment that enforces a default-deny security model, requiring explicit user authorization for access to system resources like the file system, network, and environment variables. The runtime natively supports modern web-standard APIs, ensuring consistent behavior and portability across different environments. What distinguishes Deno is its integrated approach to the software development lifecycle. It bundles essential utilities—including a formatter, linter, test runner, and dependency manager—directly into the runtime, eliminating the need for external build tools or complex transpilation steps. The platform features a universal module resolution system that supports remote HTTPS URLs, local paths, and standard package registries, all backed by lockfiles to ensure build determinism and supply chain security. Beyond its core runtime capabilities, Deno includes a built-in, persistent key-value database engine that supports atomic transactions and reactive data monitoring. It also provides a robust compatibility layer for the Node.js ecosystem, allowing for the seamless execution of legacy modules and native binary addons. For multi-tenant or distributed applications, the runtime offers isolated sandbox environments that manage resource constraints and security boundaries, facilitating secure code execution in shared infrastructure. The project is distributed as a single binary, providing a unified toolchain for managing dependencies, executing tasks, and configuring runtime security policies.
Rustdenojavascriptrust - nomic-ai/gpt4all
nomic-ai/gpt4all
77,146GPT4All is a cross-platform runtime environment designed to execute large language models directly on local consumer hardware. By leveraging an optimized C++ inference backend, it enables private, offline AI interactions without requiring an internet connection or external cloud services. The project provides a comprehensive ecosystem for managing the entire model lifecycle, including discovery, downloading, and configuration of local weights. What distinguishes the platform is its integrated retrieval-augmented generation engine, which allows users to index local documents into semantic vector spaces. This capability enables context-aware chat sessions where the model can reference private files, notes, and spreadsheets to provide grounded, relevant responses. The system also features a local HTTP server that exposes an OpenAI-compatible API, allowing developers to integrate these private, self-hosted models into existing applications and workflows. Beyond its core inference and retrieval capabilities, the project includes a graphical desktop interface for end-user interaction and a Python software development kit for programmatic access. These tools support advanced configuration of model parameters, performance monitoring, and the management of local embedding pipelines for custom semantic search tasks. The software is distributed as a unified application package, with documentation available to guide users through installation and local environment setup.
C++ai-chatllm-inference - mlabonne/llm-course
mlabonne/llm-course
75,340This project is a comprehensive educational curriculum and engineering handbook focused on the lifecycle of large language models. It serves as a structured knowledge base for machine learning practitioners, covering the fundamental mathematical and architectural principles of transformer-based sequence modeling, as well as the practical implementation of supervised instruction fine-tuning and preference-based model alignment. The repository distinguishes itself by providing a deep dive into advanced model composition and optimization techniques. It details methodologies for weight-space model merging and mixture-of-experts strategies, alongside practical guidance on low-precision parameter quantization and inference optimization to manage hardware requirements. Furthermore, it explores the development of autonomous agentic systems capable of tool-use orchestration and the construction of retrieval-augmented generation pipelines to ground model outputs in external data. The content spans the entire technical stack, from foundational deep learning concepts and neural network design to the complexities of deploying, evaluating, and securing models in production environments. It includes a curated collection of technical articles, blog posts, and interactive notebooks that track state-of-the-art research trends and experimental methodologies in generative artificial intelligence.
courselarge-language-modelsllm - redis/redis
redis/redis
73,096Redis is an in-memory, key-value database designed to provide sub-millisecond latency for read and write operations. It functions as a versatile data platform, serving as a distributed cache, a message broker, a NoSQL document store, and a vector database. The system utilizes an event-driven, single-threaded loop to process requests efficiently, while maintaining data durability through append-only persistence logs and asynchronous snapshotting mechanisms. What distinguishes Redis is its ability to handle complex data structures—including strings, hashes, lists, sets, and sorted sets—alongside hierarchical JSON documents and high-dimensional vector embeddings. It supports advanced operational patterns such as active-active database deployment for global distribution, real-time data streaming, and probabilistic statistics for large-scale data analysis. These capabilities are complemented by a pluggable indexing engine that enables semantic similarity matching and full-text retrieval. The platform offers a comprehensive ecosystem for managing distributed state, including master-replica replication, automated cluster management, and granular security controls like access control lists and TLS encryption. Developers can interact with the database through language-specific client libraries that support connection multiplexing and object mapping, or via a command-line interface for direct administrative tasks and scripting. Redis is deployed through standard package managers and supports both self-managed clusters and managed cloud instances. Observability is provided through integrated tools for performance analysis, slow log monitoring, and bulk data management.
Ccachecachingdatabase - twitter/the-algorithm
twitter/the-algorithm
72,764The algorithm is a distributed recommendation engine pipeline designed to construct and serve personalized content timelines. It functions as a multi-stage orchestration layer that aggregates candidate content from diverse social graphs and high-dimensional embedding spaces, processing user interaction data to deliver a unified, ranked experience. The system utilizes a high-performance machine learning serving infrastructure to execute deep learning models that predict engagement probabilities in real-time. It distinguishes itself through a hybrid retrieval strategy that combines graph-traversal techniques for discovering content outside of a user's immediate network with vector-based similarity searches to identify relevant interests. Beyond core ranking, the platform incorporates a post-ranking processing layer that applies heuristic filters to ensure content diversity, visibility preferences, and social quality safeguards. This architecture also supports multi-task learning to optimize relevance across various platform surfaces, including the integration of non-content items and personalized notifications.
Scala