Awesome List Updates on Jan 22 - Jan 28, 2024
34 awesome lists updated this week.
1. Awesome Flame
App Releases / Casual
- Save The Potato source-code (⭐32) - 🥇 Winner of Flame Game Jam 3.0 - Rotate the shields and save the potato from incoming orbs! By imaNNeo. for Android or iOS
2. Awesome Software Patreons
Open Source Projects
- Bottles - Easily manage and run Windows apps on Linux.
- Sonic Pi - Code-based music creation and performance tool.
Open Source Projects / Operating Systems
- PostmarketOS - A real Linux distribution for phones.
3. Awesome Vue
Projects Using Vue.js / Open Source
- vue3-realworld-app (⭐36) - 🖖 Best practices for building RealWorld with Vue3
4. Awesome Git Hooks
Git Hook Scripts / pre-commit
- dotenvx (⭐919) - Prevent committing your .env file(s) to code.
5. Awesome Kotlin
Android / Projects
- inorichi/tachiyomi - Free and open source manga reader for Android.
6. Awesome Datascience
Deep Learning Packages / Visualization Tools
7. Awesome Rails
Gems / Other external resources
- solid_queue (⭐1.7k) - A database-backed Active Job backend. 🔴
8. Awesome Zsh Plugins
Plugins / superconsole - Windows-only
- hypnosnek (⭐0) - Simple utilities with p10k integration for managing python environments.
Themes / superconsole - Windows-only
- zap-robbyrussell (⭐1) - The OMZ robbyrussell theme, patched to add compatibility with zap.
9. Awesome Azure Openai Llm
What is the RAG (Retrieval-Augmented Generation)?
RAG (Retrieval-Augmented Generation): Integrates retrieval (searching) into LLM text generation. RAG helps the model “look up” external information to improve its responses. cite [25 Aug 2023]
Retrieval-Augmented Generation: Research Papers
- Benchmarking Large Language Models in Retrieval-Augmented Generation: [cnt]: Retrieval-Augmented Generation Benchmark (RGB) is proposed to assess LLMs on 4 key abilities [4 Sep 2023]:
- Active Retrieval Augmented Generation: [cnt]: Forward-Looking Active REtrieval augmented generation (FLARE): FLARE iteratively generates a temporary next sentence and checks whether it contains low-probability tokens. If so, the system retrieves relevant documents and regenerates the sentence. Low-probability tokens are identified from token_logprobs in the OpenAI API response. git (⭐562) [11 May 2023]
- Self-RAG: [cnt] 1. Critic model C: Generates reflection tokens (IsREL (relevant, irrelevant), IsSUP (fully supported, partially supported, no support), IsUse (is useful: 5, 4, 3, 2, 1)). It is pretrained on data labeled by GPT-4. 2. Generator model M: The main language model that generates task outputs and reflection tokens. It leverages the data labeled by the critic model during training. 3. Retriever model R: Retrieves relevant passages. The LM decides if external passages (retriever) are needed for text generation. git (⭐1.7k) [17 Oct 2023]
- A Survey on Retrieval-Augmented Text Generation: [cnt]: This paper conducts a survey on retrieval-augmented text generation, highlighting its advantages and state-of-the-art performance in many NLP tasks. These tasks include dialogue response generation, machine translation, summarization, paraphrase generation, text style transfer, and data-to-text generation. [2 Feb 2022]
- Retrieval meets Long Context LLMs: [cnt]: We demonstrate that retrieval-augmentation significantly improves the performance of 4K context LLMs. Perhaps surprisingly, we find this simple retrieval-augmented baseline can perform comparable to 16K long context LLMs. [4 Oct 2023]
- FreshLLMs: [cnt]: Fresh Prompt, Google search first, then use results in prompt. Our experiments show that FreshPrompt outperforms both competing search engine-augmented prompting methods such as Self-Ask (Press et al., 2022) as well as commercial systems such as Perplexity.AI. git [5 Oct 2023]
- RECOMP: Improving Retrieval-Augmented LMs with Compressors: [cnt]: 1. We propose RECOMP (Retrieve, Compress, Prepend), an intermediate step which compresses retrieved documents into a textual summary prior to prepending them to improve retrieval-augmented language models (RALMs). 2. We present two compressors: an extractive compressor, which selects useful sentences from retrieved documents, and an abstractive compressor, which generates summaries by synthesizing information from multiple documents. 3. Both compressors are trained. [6 Oct 2023]
- Retrieval-Augmentation for Long-form Question Answering: [cnt]: 1. The order of evidence documents affects the order of generated answers. 2. The last sentence of the answer is more likely to be unsupported by evidence. 3. Automatic methods for detecting attribution can achieve reasonable performance, but still lag behind human agreement. Attribution in the paper assesses how well answers are based on provided evidence and avoid creating non-existent information. [18 Oct 2023]
- INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning: INTERS covers 21 search tasks across three categories: query understanding, document understanding, and query-document relationship understanding. The dataset is designed for instruction tuning, a method that fine-tunes LLMs on natural language instructions. git (⭐194) [12 Jan 2024]
- RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture. [16 Jan 2024]
- The Power of Noise: Redefining Retrieval for RAG Systems: No more than 2-5 relevant docs + some amount of random noise to the LLM context maximizes the accuracy of the RAG. [26 Jan 2024]
- Corrective Retrieval Augmented Generation (CRAG): A Retrieval Evaluator assesses the retrieved documents and categorizes them as Correct, Ambiguous, or Incorrect. For Ambiguous and Incorrect documents, the method uses Web Search to improve the quality of the information. The refined and distilled documents are then used to generate the final output. [29 Jan 2024] CRAG implementation by LangGraph git (⭐5.2k)
- RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval: Introduces a novel approach to retrieval-augmented language models by constructing a recursive tree structure from documents. git (⭐35k) pip install llama-index-packs-raptor / git (⭐27) [31 Jan 2024]
- CRAG: Comprehensive RAG Benchmark: a factual question answering benchmark of 4,409 question-answer pairs and mock APIs to simulate web and Knowledge Graph (KG) search ref [7 Jun 2024]
- PlanRAG: Decision Making. Decision QA benchmark, DQA. Plan -> Retrieve -> Make a decision (PlanRAG) git (⭐112) [18 Jun 2024]
- Searching for Best Practices in Retrieval-Augmented Generation: Best Performance Practice: Query Classification, Hybrid with HyDE (retrieval), monoT5 (reranking), Reverse (repacking), Recomp (summarization). Balanced Efficiency Practice: Query Classification, Hybrid (retrieval), TILDEv2 (reranking), Reverse (repacking), Recomp (summarization). [1 Jul 2024]
- Retrieval Augmented Generation or Long-Context LLMs?: Long-Context consistently outperforms RAG in terms of average performance. However, RAG's significantly lower cost remains a distinct advantage. [23 Jul 2024]
- Graph Retrieval-Augmented Generation: A Survey [15 Aug 2024]
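The FLARE entry in the list above describes a loop: draft a sentence, check its token log-probabilities, and retrieve only when confidence is low. The sketch below illustrates that control flow under stated assumptions. The `generate` and `retrieve` callables, the `flare_step` name, and the `-1.5` threshold are all hypothetical placeholders, not the paper's code or any library's API; a real setup would read the log-probabilities from the model response.

```python
from typing import Callable, List, Tuple

# One tentative sentence as (token, log-probability) pairs.
TokenLogprobs = List[Tuple[str, float]]


def needs_retrieval(token_logprobs: TokenLogprobs, threshold: float = -1.5) -> bool:
    """Return True if any token's log-probability falls below the threshold."""
    return any(logprob < threshold for _, logprob in token_logprobs)


def flare_step(
    generate: Callable[[str], Tuple[str, TokenLogprobs]],  # hypothetical LLM wrapper
    retrieve: Callable[[str], List[str]],                  # hypothetical retriever
    prompt: str,
    threshold: float = -1.5,                               # assumed cut-off
) -> str:
    """Produce one sentence, regenerating with retrieved context if confidence is low."""
    sentence, logprobs = generate(prompt)
    if not needs_retrieval(logprobs, threshold):
        return sentence  # confident enough: keep the draft

    # Low confidence: use the draft sentence as the search query,
    # then regenerate with the retrieved passages prepended.
    passages = retrieve(sentence)
    augmented_prompt = "\n".join(passages) + "\n" + prompt
    sentence, _ = generate(augmented_prompt)
    return sentence
```

Calling `flare_step` repeatedly, appending each returned sentence to the prompt, would give the iterative behaviour the entry describes.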
RAG Pipeline & Advanced RAG
- How to optimize RAG pipeline: Indexing optimization [24 Oct 2023]
Microsoft Azure OpenAI relevant LLM Framework / Lucene based search engine with OpenAI Embedding
- JARVIS (⭐23k): an interface for LLMs to connect numerous AI models for solving complicated AI tasks! [Mar 2023]
- Microsoft Fabric: Fabric integrates technologies like Azure Data Factory, Azure Synapse Analytics, and Power BI into a single unified product [May 2023]
Azure Reference Architectures / Azure AI Search
- A set of capabilities designed to improve relevance in these scenarios. We use a combination of hybrid retrieval (vector search + keyword search) + semantic ranking as the most effective approach for improved relevance out-of-the-box.
TL;DR: Retrieval Performance; Hybrid search + Semantic rank > Hybrid search > Vector only search > Keyword only
ref [18 Sep 2023]
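Hybrid retrieval has to merge a keyword ranking and a vector ranking into one list. One common way to do that is Reciprocal Rank Fusion (RRF), sketched below. This is an illustration only and is not taken from the linked article: the function name, the document IDs, and the constant `k = 60` are assumptions, and the semantic ranking step mentioned above is not shown.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document IDs into one fused ranking.

    Each document scores 1 / (k + rank) per list it appears in (rank starts at 1);
    documents are returned by descending total score, ties broken by ID.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword search and a vector search.
keyword_hits = ["a", "b", "c"]
vector_hits = ["b", "c", "d"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

Documents found by both searches accumulate two contributions, so they tend to rise above documents found by only one.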
Semantic Kernel / Feature Roadmap
- .NET Semantic Kernel SDK: 1. Renamed packages and classes that used the term “Skill” to now use “Plugin”. 2. Reworked OpenAI-specific parts of the Semantic Kernel core to be AI service agnostic. 3. Consolidated the planner implementations into a single package. ref [10 Oct 2023]
Semantic Kernel / Code Recipes
- Chat Copilot Sample Application: A reference application for building a chat experience using Semantic Kernel. Leveraging plugins, planners, and AI memories. git (⭐2k) [Apr 2023]
- Semantic Kernel Recipes: A collection of C# notebooks git (⭐165) [Mar 2023]
- Semantic Kernel-Powered OpenAI Plugin Development Lifecycle ref [30 Oct 2023]
- SemanticKernel implementation sample to overcome the token limits of OpenAI models. "With Semantic Kernel, I want to split long text that exceeds the token limit, pass the pieces to a skill, and combine the results" (zenn.dev) ref [06 May 2023]
Semantic Kernel / Semantic Kernel Planner
Semantic Kernel Planner ref [24 Jul 2023]
Semantic Kernel / Semantic Function
- Prompt Template language Key takeaways
LangChain vs Competitors / Prompting Frameworks
- Prompting Framework (PF): Prompting Frameworks for Large Language Models: A Survey git (⭐72)
Prompt Guide & Leaked prompts / Prompt Template Language
- Prompt Engineering: Prompt Engineering, also known as In-Context Prompting ... [Mar 2023]
Finetuning / PEFT: Parameter-Efficient Fine-Tuning (Youtube) [24 Apr 2023]
- Fine-tuning a GPT - LoRA: Comprehensive guide for LoRA doc [20 Jun 2023]
OpenAI's Roadmap and Products / OpenAI's plans according to Sam Altman
- Humanloop Interview 2023 : doc [29 May 2023]
Numbers LLM / GPT series release date
MLLM (multimodal large language model) / GPT series release date
- Benchmarking Multimodal LLMs.
LLaVA-1.5 achieves SoTA on a broad range of 11 tasks incl. SEED-Bench.
SEED-Bench: [cnt]: Benchmarking Multimodal LLMs git (⭐289) [30 Jul 2023]
Learning and Supplementary Materials / Korean
- Large Language Models: Application through Production (⭐727): A course on edX & Databricks Academy
- Large Language Model Course (⭐36k): Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks. [Jun 2023]
Caching / OSS Alternatives for OpenAI Code Interpreter (aka. Advanced Data Analytics)
- Caching: A technique to store data that has been previously retrieved or computed, so that future requests for the same data can be served faster.
- To reduce latency, cost, and LLM requests by serving pre-computed or previously served responses.
- Strategies for caching: Caching can be based on item IDs, pairs of item IDs, constrained input, or pre-computation. Caching can also leverage embedding-based retrieval, approximate nearest neighbor search, and LLM-based evaluation. ref
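As a minimal sketch of the simplest strategy above, the class below caches responses keyed on a normalized prompt, so a repeated request is served without a second model call. The `ResponseCache` name, the normalization rule (lowercase, collapsed whitespace), and the `llm` callable are assumptions made for illustration; embedding-based or approximate nearest neighbor matching would need a vector index and is not shown.

```python
import hashlib
from typing import Callable, Dict


class ResponseCache:
    """Exact-match cache in front of a (hypothetical) LLM callable."""

    def __init__(self, llm: Callable[[str], str]) -> None:
        self._llm = llm
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(prompt: str) -> str:
        # Lowercase and collapse whitespace so trivially different
        # spellings of the same prompt share one cache entry.
        normalized = " ".join(prompt.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def ask(self, prompt: str) -> str:
        key = self._key(prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]  # served from cache, no LLM call
        self.misses += 1
        answer = self._llm(prompt)
        self._store[key] = answer
        return answer
```

A production cache would also need an eviction policy and some way to expire stale answers, which this sketch leaves out.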
Defensive UX / OSS Alternatives for OpenAI Code Interpreter (aka. Advanced Data Analytics)
- Defensive UX: A design strategy that aims to prevent and handle errors in user interactions with machine learning or LLM-based products.
- Why defensive UX?: Machine learning and LLMs can produce inaccurate or inconsistent output, which can affect user trust and satisfaction. Defensive UX can help by increasing accessibility, trust, and UX quality.
- Guidelines for Human-AI Interaction: Microsoft: Based on a survey of 168 potential guidelines from various sources, they narrowed it down to 18 action rules organized by user interaction stages.
- People + AI Guidebook: Google: Based on Google’s product teams and academic research, they provide 23 patterns grouped by common questions during the product development process.
- Human Interface Guidelines for Machine Learning: Apple: Based on practitioner knowledge and experience, emphasizing aspects of UI rather than model functionality.
10. Awesome Machine Learning
C++ / General-Purpose Machine Learning
- Truss - An open source framework for packaging and serving ML models.
Python / Computer Vision
- MLX (⭐16k) - MLX is an array framework for machine learning on Apple silicon, developed by Apple machine learning research.
Python / Neural Networks
- Kinho (⭐31) - Simple API for Neural Network. Better for image processing with CPU/GPU + Transfer Learning.
Tools / Misc
- Infinity (⭐2.2k) - The AI-native database built for LLM applications, providing incredibly fast vector and full-text search. Developed using C++20
11. Awesome Ipfs
Pinning services
- Gateway3 - A decentralized IPFS pinning service designed for developers. Supports content pinning, IPNS hosting, DAG operations, pinning tweets, and web hosting.
12. Free for Dev
Testing
- webhookbeam.com - Set up webhooks and monitor them via push notifications and emails.
Generative AI
- Portkey - Control panel for Gen AI apps featuring an observability suite & an AI gateway. Send & log up to 10,000 requests for free every month.
- OpenPipe - Fully managed fine-tuning for developers. Free plan lets you fine-tune one model with up to 2,000 rows per dataset.
- Braintrust - Evals, prompt playground, and data management for Gen AI. Free plan gives up to 1,000 private eval rows/week.
Analytics, Events and Statistics
- LogSpot - Full unified web and product analytics platform, including embeddable analytics widgets and automated robots (slack, telegram, and webhooks). Free plan includes 10,000 events per month.
Privacy Management
- Concord - Full data privacy platform, including consent management, privacy request handling (DSARs), and data mapping. Free tier includes core consent management features and they also provide a more advanced plan for free to verified open source projects.
13. Free Programming Books (English, By Subjects)
Security & Privacy
- The MoonMath Manual to zk-SNARKs - Least Authority
14. Awesome Theoretical Computer Science
Lecture Notes / Monograph
- Chekuri. Approximation Algorithms. Illinois - A broad introduction to results and techniques with an emphasis on fundamental problems and widely applicable tools. Also more advanced and specialized topics.
- Dinitz. Approximation Algorithms. Johns Hopkins - It includes greedy, local search, dynamic programming, randomized rounding, tree embeddings, and semidefinite programming.
- Gupta & Ravi. Approximation Algorithms. CMU - It includes convex programming-based, randomness, and metric methods.
Books / Monograph
- Williamson & Shmoys. The Design of Approximation Algorithms - It includes greedy, local search algorithms, dynamic programming, linear and semidefinite programming, and randomization.
- Du & Ko. Design and Analysis of Approximation Algorithms - A technique-oriented approach provides a unified view. It includes detailed algorithms, proofs, analyses, examples, and applications from research papers.
- Fedor Fomin. Parameterized Algorithms - Modern comprehensive explanation of recent tools and techniques with exercises, for graduate students.
15. Urban and Regional Planning Resources
Public Data Resources / Equity and Environmental Justice
- STEAP - The Screening Tool for Equity Analysis of Projects (STEAP) is a census sampling tool that allows rapid screening of potential project locations anywhere in the United States to support Title VI, environmental justice, and other socioeconomic data analyses.
Vendor Data Resources / Infrastructure
- Geomate - Geomate provides HD vector maps from high-resolution aerial imagery to support autonomous vehicles and urban planning use cases.
16. Awesome Selfhosted
Software / Communication - Custom Communication Systems
- Mattermost - Platform for secure collaboration across the entire software development lifecycle, can be integrated with Gitlab (alternative to Slack). (Source Code (⭐29k)) AGPL-3.0/Apache-2.0 Go/Docker/K8S
Software / Groupware
- Tine - Software for digital collaboration in companies and organizations. From powerful groupware functionalities to clever add-ons, tine combines everything to make daily team collaboration easier. (Source Code (⭐11)) AGPL-3.0 Docker
17. Awesome Digital History
Archives and primary sources / Africa
- West African Arabic Manuscript Database - A comprehensive collection of manuscripts that provides insight into the Islamic scholarly tradition in West Africa.
Learning / Switzerland
- Introduction to Python for Humanists - A textbook offering a comprehensive introduction to Python programming, tailored for researchers and students in the humanities.
18. Awesome Terraform
Managed Registries / Miscellaneous
- cloudsmith - Managed package hoster for internal and external clients. 💲
19. Awesome Neovim
(requires Neovim 0.5)
- lopi-py/luau-lsp.nvim (⭐38) - A luau-lsp extension to improve your experience.
AI / Diagnostics
- gsuuon/model.nvim (⭐313) - Integrate LLMs via a prompt builder interface. Multi-providers including OpenAI (+ compatibles), PaLM, HuggingFace and local engines like llamacpp.
Marks / Diagnostics
- otavioschwanck/arrow.nvim (⭐442) - Like harpoon, but with a different UX, single keybinding needed and statusline support.
Project / Diagnostics
- LintaoAmons/cd-project.nvim (⭐95) - All you need is just an easier way to cd to another project directory.
20. Awesome Developer First
Authentication & Identity
- Kinde - Authentication and user management as a service.
Backend-as-a-Service
- Appwrite - End-to-end backend server for frontend and mobile developers.
21. Awesome Capacitorjs
Plugins / Community Plugins
- @capawesome-team/capacitor-android-dark-mode-support - Capacitor plugin to support dark mode on Android.
- @capawesome-team/capacitor-android-foreground-service - Capacitor plugin to run a foreground service on Android.
- @capawesome-team/capacitor-datetime-picker - Capacitor plugin that lets the user easily enter both a date and a time.
- @capawesome-team/capacitor-file-compressor - Capacitor plugin for compressing files.
- @capawesome-team/capacitor-file-opener - Capacitor plugin to open a file with the default application.
- @capawesome-team/capacitor-nfc - Capacitor plugin for reading and writing NFC tags.
- @capawesome-team/capacitor-printer - Capacitor plugin for printing.
22. Awesome Ai4lam
Learning Resources / Generative AI
- What are large language models (LLMs)? – (YouTube) by Google for Developers
- Generative AI for Everyone – free Coursera course by Andrew Ng
- What Is ChatGPT Doing … and Why Does It Work? – by Stephen Wolfram
Publications and News Sources / Journals and Magazines
23. Awesome Django
Third-Party Packages / APIs
- django-webhook (⭐177) - A plug-and-play Django app for sending outgoing webhooks on model changes.
Hosting / PaaS (Platforms-as-a-Service)
24. Awesome Raspberry Pi
Tools
- PiKISS (⭐872) - A bunch of scripts with menu to make your life easier.
25. Awesome Video
Learning / Books
- Fundamentals of Multimedia - 2022-02-17 (3rd Edition). Ze-Nian Li (Author), Mark S. Drew (Author), Jiangchuan Liu.
Encoding / Talks Presentations Podcasts
- realeyes-media/demo-encoder (⭐56) - A nodejs encoding system based on ffmpeg and configured to write HLS streaming files to S3.
Streaming Server and Storage / SRT
- prologic/tube (⭐22) - 📺 a Youtube-like (without censorship and features you don't need!) Video Sharing App written in Go which also supports automatic transcoding to MP4 H.265 AAC, multiple collections and R...
Players / Android
- google/ExoPlayer (⭐22k) - ExoPlayer is an application level media player for Android.
FFMPEG / Web
- cuda/ubuntu16.04/ffmpeg-gpu/Dockerfile · master · nvidia / container-images / samples - Sample Dockerfiles for Docker Hub images
26. Awesome Generative Deep Art
Generative AI history, timelines, maps, and definitions
- [🔥🔥🔥] Generative AI in a nutshell: a map of the most common Generative AI concepts, by Henrik Kniberg. YouTube video explaining the map
Online Tools and Applications
- Recast Studio: AI-powered podcast marketing assistant.
Auxiliary tools and concepts / Deforum
- Marblism: Generate a SaaS boilerplate from a prompt
27. Awesome Fantasy
Epic Fantasy / The Chronicles of Prydain 1964 by Lloyd Alexander [4.42]
Epic Fantasy / The Daevabad Trilogy 2017 by S. A. Chakraborty [4.3]
28. Awesome Privacy
Android Gallery
- Google Photos has privacy issues. They collect a lot of data about you, which you can see in their privacy policy. Google can scan your photos and might flag them for different reasons, as shown in this incident. They also use your photos to improve their AI technology.
- Amazon Photos also has similar privacy problems. Like Google Photos, it gathers a lot of information from your photo gallery. You can see a bit of what kind of data they collect in their examples list.
- Samsung, Huawei, Xiaomi, etc. Gallery
- Aves (⭐2.4k) - Beautiful gallery and metadata explorer app, built for Android with Flutter.
- Fossify Gallery (⭐1.5k) - Fork of Simple Gallery. Browse your memories without any interruptions with this photo and video gallery.
29. Awesome Embedded Rust
Peripheral Access Crates / StarFive
HAL implementation crates / StarFive
30. Awesome React
React Frameworks
- remix (⭐27k) - Full stack web Framework that lets you focus on the user interface
React Libraries
- react-error-boundary (⭐6.4k) - A React error boundary component that lets you catch errors
31. Awesome Vite
Templates / React
- react-component-library-vite (⭐3) - A library template with React, Javascript, Styled-Components, Vitest, React Testing Library, and Storybook.
32. Awesome Agi Cocosci
Domain Specific Language / Design Theory
- Domain-Specific Language - Wikipedia. Wikipedia encyclopedia entry on Domain Specific Languages.
33. Awesome Cl
C, C++
- stacks-api (⭐1) - a Stacks API client. AGPL-3.0
Web frameworks / Isomorphic web frameworks
- Weblocks (Reblocks) (⭐48) - A widgets-based framework with a built-in ajax update mechanism that "solves the JavaScript problem". LLGPL.
- example code bases: Ultralisp (⭐228), krasnodar (⭐6), a dashboard made for a hackathon (2024) (demo video).
34. Awesome Docker
IDE integrations
- denops-docker.vim (⭐82) - Manage docker containers and images in Vim. By @skanehira
Web / Other
- Mafl (⭐272) - Minimalistic flexible homepage by @hywax