Awesome List Updates on Dec 11 - Dec 17, 2023
9 awesome lists updated this week.
1. Awesome Rails
Open Source Rails Apps / Other external resources
- chronlife (⭐6) - A social platform for people with chronic diseases (using Rails 7.0).
2. Awesome Datascience
Other Awesome Lists / Book Deals (Affiliated) 🛍
3. Free Programming Books (English, By Subjects)
Embedded Systems
- Control and Embedded Systems (HTML)
- First Steps with Embedded Systems - Byte Craft Limited (PDF)
- Introduction to Embedded Systems, Second Edition - Edward Ashford Lee, Sanjit Arunkumar Seshia (PDF)
- Mastering the FreeRTOS Real Time Kernel - a Hands On Tutorial Guide - freertos.org (PDF)
4. Static Analysis
Programming Languages / Other
- JET (⭐726) — Static type inference system to detect bugs and type instabilities.
- cargo-semver-checks — Scan your Rust crate releases for semver violations. It can be used either directly via the CLI, as a GitHub Action in CI, or via release managers like release-plz. It found semver violations in more than 1 in 6 of the top 1000 most-downloaded crates on crates.io.
5. Awesome Ai4lam
Learning Resources / Introductions to AI
- DeepLearning.AI Short Courses, free short courses from the platform created by Andrew Ng
- Introduction to Hugging Face, a free course by Codecademy
Learning Resources / AI in galleries, libraries, archives and museums
- The AI4LAM YouTube channel has introductory presentations on many topics
Policies and recommendations / Frameworks
- A Framework for U.S. AI Governance: Creating a Safe and Thriving AI Sector white paper by the MIT Schwarzman College of Computing, Dec. 11, 2023. (See also related article in MIT News.)
6. Awesome Algorand
Other Development Tools / Smart Contracts
- puya (⭐81) - PuyaPy is an official Python to TEAL compiler that allows you to write code to execute on the Algorand Virtual Machine (AVM) with Python syntax.
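A minimal contract sketch in PuyaPy's hello-world style. It follows the algopy stub API from later releases; the package name and API were still settling in these early versions, so treat the names as illustrative rather than exact.

```python
# Minimal PuyaPy contract sketch (illustrative; early puya releases used
# different package names than the later algopy stubs shown here).
from algopy import ARC4Contract, String, arc4


class HelloWorldContract(ARC4Contract):
    @arc4.abimethod()
    def hello(self, name: String) -> String:
        # The compiler translates this Python method into TEAL for the AVM.
        return "Hello, " + name
```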
7. Awesome Iam
Authorization / Policy models
- Authorization Academy - An in-depth, vendor-agnostic treatment of authorization that emphasizes mental models. This guide shows the reader how to think about their authorization needs in order to make good decisions about their authorization architecture and model.
8. Awesome Azure Openai Llm
What is RAG (Retrieval-Augmented Generation)?
In a 2020 paper, Meta (Facebook) came up with a framework called retrieval-augmented generation to give LLMs access to information beyond their training data. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks: [cnt] [22 May 2020]
- RAG-sequence — We retrieve k documents, and use them to generate all the output tokens that answer a user query.
- RAG-token — We retrieve k documents, use them to generate the next token, then retrieve k more documents, use them to generate the next token, and so on. This means that we could end up retrieving several different sets of documents in the generation of a single answer to a user’s query.
- Of the two approaches proposed in the paper, the RAG-sequence implementation is pretty much always used in the industry. It’s cheaper and simpler to run than the alternative, and it produces great results. cite [30 Sep 2023]
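A sketch of the difference between the two decoding strategies; `retrieve()` and `generate_token()` are hypothetical stubs standing in for a retriever and a generator, not a real API.

```python
# Contrast of the paper's two RAG decoding strategies (stub functions only).

def retrieve(text, k):
    return [f"doc{i}" for i in range(k)]          # stub retriever

def generate_token(query, docs, prefix):
    return "<eos>"                                 # stub generator

def rag_sequence(query, k=5, max_tokens=256):
    docs = retrieve(query, k)                      # one retrieval per answer
    answer = []
    for _ in range(max_tokens):
        token = generate_token(query, docs, answer)
        if token == "<eos>":
            break
        answer.append(token)
    return "".join(answer)

def rag_token(query, k=5, max_tokens=256):
    answer = []
    for _ in range(max_tokens):
        docs = retrieve(query + "".join(answer), k)  # fresh retrieval per token
        token = generate_token(query, docs, answer)
        if token == "<eos>":
            break
        answer.append(token)
    return "".join(answer)
```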
RAG Pipeline & Advanced RAG
- Demystifying Advanced RAG Pipelines: An LLM-powered advanced RAG pipeline built from scratch git (⭐776) [19 Oct 2023]
LlamaIndex
- LlamaIndex Overview (Japanese) [17 Jul 2023]
- LlamaIndex Tutorial: A Complete LlamaIndex Guide [18 Oct 2023]
- Multimodal RAG Pipeline ref [Nov 2023]
Vector Database Comparison
- Not All Vector Databases Are Made Equal: A printable version is provided to work around Medium's article limits. doc [2 Oct 2021]
- Faiss: Facebook AI Similarity Search (Faiss) is a library for efficient similarity search and clustering of dense vectors. It can be used as an alternative to a vector database, or as the algorithmic core for building one. It is developed by Facebook AI Research. git (⭐30k) [Feb 2017]
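A minimal Faiss usage sketch: build an exact (flat L2) index and run a k-nearest-neighbour search. The dimensionality and sizes are illustrative.

```python
# Minimal Faiss usage: flat L2 index + k-NN search over random vectors.
import faiss
import numpy as np

d = 128                                            # vector dimensionality
xb = np.random.random((10_000, d)).astype("float32")   # database vectors
xq = np.random.random((5, d)).astype("float32")        # query vectors

index = faiss.IndexFlatL2(d)                       # exact search; no training
index.add(xb)                                      # add database vectors
distances, ids = index.search(xq, 4)               # 4 nearest neighbours each
print(ids.shape)                                   # (5, 4)
```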
Vector Database Comparison / Vector Database Options for Azure
- Pgvector extension on Azure Cosmos DB for PostgreSQL: ref [13 Jun 2023]
- Vector Search in Azure Cosmos DB for MongoDB vCore [23 May 2023]
- Azure Cache for Redis Enterprise: Enterprise Redis Vector Search Demo [22 May 2023]
Vector Database Comparison / Lucene based search engine with OpenAI Embedding
- Vector Search with OpenAI Embeddings: Lucene Is All You Need: Our experiments were based on Lucene 9.5.0, but indexing was a bit tricky because the HNSW implementation in Lucene restricts vectors to 1024 dimensions, which was not sufficient for OpenAI’s 1536-dimensional embeddings. Although the resolution of this issue, which is to make vector dimensions configurable on a per codec basis, has been merged to the Lucene source trunk git (⭐2.5k), this feature has not been folded into a Lucene release (yet) as of early August 2023. [29 Aug 2023]
Microsoft Azure OpenAI relevant LLM Framework / Lucene based search engine with OpenAI Embedding
- Kernel Memory (⭐1.4k): Kernel Memory (FKA. Semantic Memory (SM)) is an open-source service and plugin specialized in the efficient indexing of datasets through custom continuous data hybrid pipelines. [Jul 2023]
- FLAML (⭐3.8k): A lightweight Python library for efficient automation of machine learning and AI operations. FLAML provides a seamless interface for AutoGen, AutoML, and generic hyperparameter tuning. [Dec 2020]
- A Memory in Semantic Kernel vs Kernel Memory (FKA. Semantic Memory (SM)): Kernel Memory is designed to efficiently handle large datasets and extended conversations. Deploying the memory pipeline as a separate service can be beneficial when dealing with large documents or long bot conversations. ref (⭐2k)
Azure Reference Architectures / Azure AI Search
- Azure Cognitive Search has been rebranded as Azure AI Search; it supports vector search and a semantic ranker. [16 Nov 2023]
Azure Enterprise Services / Azure AI Search
- Azure OpenAI Service On Your Data in Public Preview ref [19 Jun 2023]
- Azure OpenAI Finetuning: Babbage-002 is $34/hour, Davinci-002 is $68/hour, and Turbo is $102/hour. ref [16 Oct 2023]
- Customer Copyright Commitment: protects customers from certain IP claims related to AI-generated content. ref [16 Nov 2023]
Semantic Kernel / Azure AI Search
- Semantic Kernel, Microsoft's LangChain-style library, supports C# and Python and offers several features, some of which are still in development and not yet clearly documented. However, it is simple, stable, and faster than the Python-based open-source alternatives. The features listed on the link include: Semantic Kernel Feature Matrix / doc:ref / blog:ref / git [Feb 2023]
Semantic Kernel / Semantic Kernel Planner
- Stepwise Planner released. The Stepwise Planner features the "CreateScratchPad" function, acting as a 'Scratch Pad' to aggregate goal-oriented steps. [16 Aug 2023]
LangChain features and related libraries / DSPy optimizer
- LangChain Expression Language: A declarative way to easily compose chains together (see the sketch after this list) [Aug 2023]
- OpenGPTs (⭐6.4k): An open source effort to create a similar experience to OpenAI's GPTs [Nov 2023]
- langflow (⭐24k): LangFlow is a UI for LangChain, designed with react-flow. [Feb 2023]
- Flowise (⭐29k) Drag & drop UI to build your customized LLM flow [Apr 2023]
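A minimal LCEL sketch composing a prompt, model, and output parser with the `|` operator. The `langchain_core`/`langchain_openai` import paths reflect the package split around the LangChain 0.1 era and may differ in other versions; the model name is illustrative.

```python
# Minimal LangChain Expression Language (LCEL) pipeline: prompt | model | parser.
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_template("Summarize in one sentence: {text}")
chain = prompt | ChatOpenAI(model="gpt-3.5-turbo") | StrOutputParser()

print(chain.invoke({"text": "LCEL composes runnables declaratively."}))
```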
Prompt Engineering / Prompt Template Language
- ReAct (Reason and Act): [cnt]: Grounding with external sources; combines reasoning and acting. ref [6 Oct 2022]
- Zero-shot
- Large Language Models are Zero-Shot Reasoners: [cnt]: Let’s think step by step. [24 May 2022]
- Few-shot Learning
- Open AI: Language Models are Few-Shot Learners: [cnt] [28 May 2020]
- Retrieval Augmented Generation (RAG): [cnt]: To address such knowledge-intensive tasks. RAG combines an information retrieval component with a text generator model. [22 May 2020]
- Chain-of-Verification reduces Hallucination in LLMs: [cnt]: A four-step process that consists of generating a baseline response, planning verification questions, executing verification questions, and generating a final verified response based on the verification results. [20 Sep 2023]
- Reflexion: [cnt]: Language Agents with Verbal Reinforcement Learning. 1. Reflexion uses verbal reinforcement to help agents learn from prior failings. 2. Reflexion converts binary or scalar feedback from the environment into verbal feedback in the form of a textual summary, which is then added as additional context for the LLM agent in the next episode. 3. It is lightweight and doesn’t require finetuning the LLM. [20 Mar 2023] / git (⭐2.2k)
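A minimal sketch of the Reflexion loop described above: act, evaluate, verbalize the feedback, retry with the reflection as added context. `llm()`, `run_task()`, and `evaluate()` are hypothetical stubs, not the reference implementation.

```python
# Sketch of the Reflexion loop (stub functions stand in for an LLM and a task).

def llm(prompt):
    return "..."                          # stand-in for a chat-completion call

def run_task(task, context):
    return llm(f"{context}\nSolve: {task}")

def evaluate(task, attempt):
    return False, "unit test failed"      # stub (success flag, raw feedback)

def reflexion(task, max_trials=3):
    memory = []                           # verbal self-reflections, not weights
    for _ in range(max_trials):
        attempt = run_task(task, "\n".join(memory))
        ok, feedback = evaluate(task, attempt)
        if ok:
            return attempt
        # Convert binary/scalar feedback into a textual summary for next trial.
        memory.append(llm(f"Attempt: {attempt}\nFeedback: {feedback}\n"
                          f"Reflect on what went wrong and how to fix it."))
    return attempt
```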
Power of Prompting
- GPT-4 with Medprompt: GPT-4, using a method called Medprompt that combines several prompting strategies, has surpassed MedPaLM 2 on the MedQA dataset without the need for fine-tuning (sketched after this list). ref [28 Nov 2023]
- promptbase (⭐5.3k): Scripts demonstrating the Medprompt methodology [Dec 2023]
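A pseudocode sketch of how the Medprompt components compose, per the description in the ref: dynamic kNN few-shot selection, self-generated chain of thought, and a choice-shuffling ensemble with majority vote. The helpers are hypothetical stubs, not the promptbase API.

```python
# Pseudocode sketch of the Medprompt strategy (stub helpers only).
import random
from collections import Counter

def knn_examples(question, train_set, k=5):
    return train_set[:k]                  # stub: real version embeds + runs kNN

def llm_cot_answer(question, examples, choices):
    return choices[0]                     # stub: real version prompts with CoT

def medprompt(question, choices, train_set, ensemble=5):
    examples = knn_examples(question, train_set)  # dynamic few-shot selection
    votes = Counter()
    for _ in range(ensemble):             # choice-shuffling ensemble
        shuffled = random.sample(choices, len(choices))
        votes[llm_cot_answer(question, examples, shuffled)] += 1
    return votes.most_common(1)[0][0]     # majority vote over the runs
```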
LangChain Agent & Memory / Criticism to LangChain
- What’s your biggest complaint about langchain?: ref [May 2023]
LangChain vs Competitors / LangChain vs LlamaIndex
- Basically LlamaIndex is a smart storage mechanism, while LangChain is a tool to bring multiple tools together. cite [14 Apr 2023]
LangChain vs Competitors / LangChain vs Semantic Kernel vs Azure Machine Learning Prompt flow
What's the difference between LangChain and Semantic Kernel?
LangChain has many agents, tools, plugins etc. out of the box. Moreover, LangChain has roughly 10x the popularity, and therefore about 10x more developer activity improving it. On the other hand, Semantic Kernel's architecture and quality are better, which is quite promising for Semantic Kernel. ref (⭐21k) [11 May 2023]
- Using Prompt flow with Semantic Kernel: ref [07 Sep 2023]
Prompt Guide & Leaked prompts / Prompt Template Language
- Prompts for Education (⭐1.5k): Microsoft Prompts for Education [Jul 2023]
Finetuning / PEFT: Parameter-Efficient Fine-Tuning (Youtube) [24 Apr 2023]
- PEFT: Parameter-Efficient Fine-Tuning. PEFT is an approach to fine-tuning only a small number of parameters (see the sketch after this list). [10 Feb 2023]
- Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning: [cnt] [28 Mar 2023]
- QLoRA: Efficient Finetuning of Quantized LLMs: [cnt]: 4-bit quantized pre-trained language model into Low Rank Adapters (LoRA). git (⭐9.8k) [23 May 2023]
- LIMA: Less Is More for Alignment: [cnt]: fine-tuned with the standard supervised loss on only 1,000 carefully curated prompts and responses, without any reinforcement learning or human preference modeling. LIMA demonstrates remarkably strong performance, either equivalent or strictly preferred to GPT-4 in 43% of cases. [18 May 2023]
- LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models: [cnt]: A combination of sparse local attention and LoRA git (⭐2.6k) [21 Sep 2023]
  - Key takeaways from LongLoRA:
    1. LoRA alone is not sufficient for long context extension.
    2. Although dense global attention is needed during inference, fine-tuning can be done with sparse local attention, shift short attention (S2-Attn).
    3. S2-Attn can be implemented with only two lines of code in training.
- QA-LoRA: [cnt]: Quantization-Aware Low-Rank Adaptation of Large Language Models. A method that integrates quantization and low-rank adaptation for large language models. git (⭐107) [26 Sep 2023]
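A hedged sketch of a LoRA/QLoRA-style fine-tuning setup using the Hugging Face transformers, peft, and bitsandbytes libraries; the model name and hyperparameters are illustrative only.

```python
# Illustrative LoRA + 4-bit (QLoRA-style) setup; model and values are examples.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(          # QLoRA: 4-bit NF4 base weights
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",           # illustrative, gated model
    quantization_config=bnb_config,
)

lora_config = LoraConfig(                 # train only small low-rank adapters
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()        # typically <1% of total parameters
```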
Finetuning / Llama Finetuning
- Multi-query attention (MQA): [cnt] [22 May 2023]
- Comprehensive Guide for LLaMA with RLHF: StackLLaMA: A hands-on guide to train LLaMA with RLHF [5 Apr 2023]
RLHF (Reinforcement Learning from Human Feedback) & SFT (Supervised Fine-Tuning) / Llama Finetuning
- Libraries: TRL, trlX (⭐4.4k), Argilla
TRL: from the Supervised Fine-tuning (SFT) step and the Reward Modeling (RM) step to the Proximal Policy Optimization (PPO) step (SFT sketch after this list)
The three steps in the process: 1. pre-training on large web-scale data, 2. supervised fine-tuning on instruction data (instruction tuning), and 3. RLHF. ref [ⓒ 2023]
- Reinforcement Learning from AI Feedback (RLAIF): [cnt]: Uses AI feedback to generate instructions for the model. TL;DR: CoT (Chain-of-Thought, improved), Few-shot (not improved). Only explores the task of summarization. After training on a few thousand examples, performance is close to training on the full dataset. RLAIF vs RLHF: in many cases, the two policies produced similar summaries. [1 Sep 2023]
- OpenAI Spinning Up in Deep RL!: An educational resource to help anyone learn deep reinforcement learning. git (⭐9.9k) [Nov 2018]
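A minimal SFT sketch with TRL's SFTTrainer, following the library's basic example; the dataset and model are illustrative, and the trainer's signature has shifted across TRL versions.

```python
# Minimal supervised fine-tuning (SFT) step with TRL (illustrative example).
from datasets import load_dataset
from trl import SFTTrainer

dataset = load_dataset("imdb", split="train")

trainer = SFTTrainer(
    "facebook/opt-350m",                  # model name or a loaded model object
    train_dataset=dataset,
    dataset_text_field="text",            # column holding the training text
    max_seq_length=512,
)
trainer.train()
```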
Model Compression for Large Language Models / Llama Finetuning
- A Survey on Model Compression for Large Language Models ref [15 Aug 2023]
Quantization Techniques / Llama Finetuning
- bitsandbytes: 8-bit optimizers git (⭐5.9k) [Oct 2021]
Pruning and Sparsification / Llama Finetuning
- Wanda Pruning: [cnt]: A Simple and Effective Pruning Approach for Large Language Models [20 Jun 2023] ref
Knowledge Distillation: Reducing Model Size with Textbooks / Llama Finetuning
- Orca 2: [cnt]: Orca learns from rich signals from GPT 4 including explanation traces; step-by-step thought processes; and other complex instructions, guided by teacher assistance from ChatGPT. ref [18 Nov 2023]
- Distilled Supervised Fine-Tuning (dSFT)
- Zephyr 7B: [cnt] Zephyr-7B-β is the second model in the series, and is a fine-tuned version of mistralai/Mistral-7B-v0.1 that was trained on a mix of publicly available, synthetic datasets using Direct Preference Optimization (DPO). ref [25 Oct 2023]
- Mistral 7B: [cnt]: Outperforms Llama 2 13B on all benchmarks. Uses Grouped-query attention (GQA) for faster inference. Uses Sliding Window Attention (SWA) to handle longer sequences at smaller cost. ref [10 Oct 2023]
Other techniques and LLM patterns / Llama Finetuning
- Large Transformer Model Inference Optimization: Besides the increasing size of SoTA models, there are two main factors contributing to the inference challenge ... [10 Jan 2023]
- Mixture of experts models: Mixtral 8x7B: Sparse mixture of experts models (SMoE) magnet [Dec 2023]
- Huggingface Mixture of Experts Explained: Mixture of Experts, or MoEs for short (toy routing sketch after this list) [Dec 2023]
- Simplifying Transformer Blocks: Simplified Transformer. Removes several block components (skip connections, projection/value matrices, sequential sub-blocks, and normalisation layers) without loss of training speed. [3 Nov 2023]
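A toy illustration of sparse MoE routing in PyTorch, echoing Mixtral's setup of routing each token to the top 2 of 8 expert FFNs; all sizes here are illustrative.

```python
# Toy sparse MoE routing: a router scores experts, only the top-k run per token.
import torch
import torch.nn.functional as F

n_experts, top_k, d_model = 8, 2, 16
experts = [torch.nn.Linear(d_model, d_model) for _ in range(n_experts)]
router = torch.nn.Linear(d_model, n_experts)

x = torch.randn(4, d_model)               # 4 tokens
logits = router(x)                        # (4, 8) routing scores per token
weights, idx = torch.topk(logits, top_k)  # pick top-2 experts per token
weights = F.softmax(weights, dim=-1)      # normalize over the chosen experts

out = torch.zeros_like(x)
for t in range(x.size(0)):                # only 2 of 8 experts run per token
    for slot in range(top_k):
        expert = experts[idx[t, slot].item()]
        out[t] += weights[t, slot] * expert(x[t])
```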
Visual Prompting & Visual Grounding / Llama Finetuning
- Visual Prompting [21 Nov 2022]
- Andrew Ng’s Visual Prompting Livestream [24 Apr 2023]
OpenAI's Roadmap and Products / OpenAI's plans according to Sam Altman
- OpenAI’s CEO Says the Age of Giant AI Models Is Already Over ref [17 Apr 2023]
- Q* (pronounced Q-Star): The model, called Q*, was able to solve basic math problems it had not seen before, according to the tech news site The Information. ref [23 Nov 2023]
OpenAI's Roadmap and Products / GPT-4 details leaked unverified
- The Dawn of LMMs: [cnt]: Preliminary Explorations with GPT-4V(ision) [29 Sep 2023]
- GPT-4 details leaked
- GPT-4 is a language model with approximately 1.8 trillion parameters across 120 layers, 10x larger than GPT-3. It uses a Mixture of Experts (MoE) model with 16 experts, each having about 111 billion parameters. Utilizing MoE allows for more efficient use of resources during inference, needing only about 280 billion parameters and 560 TFLOPs, compared to the 1.8 trillion parameters and 3,700 TFLOPs required for a purely dense model.
- The model is trained on approximately 13 trillion tokens from various sources, including internet data, books, and research papers. To reduce training costs, OpenAI employs tensor and pipeline parallelism, and a large batch size of 60 million. The estimated training cost for GPT-4 is around $63 million. ref [Jul 2023]
OpenAI's Roadmap and Products / OpenAI Products
- OpenAI DevDay 2023: GPT-4 Turbo with 128K context, Assistants API (Code interpreter, Retrieval, and function calling), GPTs (Custom versions of ChatGPT: ref), Copyright Shield, Parallel Function Calling, JSON Mode, Reproducible outputs [6 Nov 2023]
- ChatGPT can now see, hear, and speak: It has recently been updated to support multimodal capabilities, including voice and image. [25 Sep 2023] Whisper (⭐67k) / CLIP (⭐24k)
- GPT-3.5 Turbo Fine-tuning: Fine-tuning for GPT-3.5 Turbo is now available, with fine-tuning for GPT-4 coming this fall. [22 Aug 2023]
- DALL·E 3: In September 2023, OpenAI announced their latest image model, DALL·E 3 git (⭐11k) [Sep 2023]
- OpenAI Enterprise: Removes GPT-4 usage caps, and performs up to two times faster ref [28 Aug 2023]
- Custom instructions: In a nutshell, the Custom Instructions feature is a cross-session memory that allows ChatGPT to retain key instructions across chat sessions. [20 Jul 2023]
Numbers LLM / GPT series release date
- tiktoken (⭐12k): BPE tokeniser for use with OpenAI's models. Token counting. [Dec 2022]
- What are tokens and how to count them?: OpenAI Articles
- Byte-Pair Encoding (BPE): Introduced in 2015, it is the most widely used tokenization algorithm for text today. BPE appends an end-of-word token, splits words into characters, and iteratively merges the most frequent pair of symbols until a stop criterion is met. The final merge vocabulary is used to encode and decode new data (toy merge loop sketched below). [31 Aug 2015] / ref [13 Aug 2021]
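A toy sketch of the BPE merge loop described above (it ignores boundary edge cases, so it is illustrative only), followed by token counting with the real tiktoken library.

```python
# Toy BPE merges over a tiny corpus, plus token counting with tiktoken.
from collections import Counter

import tiktoken

def most_frequent_pair(words):
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0]

# Words split into characters with an end-of-word marker, as in the BPE paper.
words = {("l", "o", "w", "</w>"): 5, ("l", "o", "w", "e", "r", "</w>"): 2}
for _ in range(3):                        # run a few merges
    a, b = most_frequent_pair(words)      # e.g. ("l","o") -> "lo"
    words = {tuple(" ".join(s).replace(f"{a} {b}", a + b).split()): f
             for s, f in words.items()}
print(words)                              # merged symbol sequences

enc = tiktoken.get_encoding("cl100k_base")   # GPT-4 / GPT-3.5 tokenizer
print(len(enc.encode("How many tokens is this?")))
```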
Trustworthy, Safe and Secure LLM / GPT series release date
- NeMo Guardrails (⭐3.9k): Building Trustworthy, Safe and Secure LLM Conversational Systems [Apr 2023]
- The Foundation Model Transparency Index: [cnt]: A comprehensive assessment of the transparency of foundation model developers ref [19 Oct 2023]
- Hallucinations: [cnt]: A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions [9 Nov 2023]
- Hallucination Leaderboard (⭐1.1k): Evaluate how often an LLM introduces hallucinations when summarizing a document. [Nov 2023]
Large Language Model Is: Abilities / GPT series release date
- Emergent Abilities of Large Language Models: [cnt]: Large language models can develop emergent abilities, which are not explicitly trained but appear at scale and are not present in smaller models. These abilities can be enhanced using few-shot and augmented prompting techniques. ref [15 Jun 2022]
- Multitask Prompted Training Enables Zero-Shot Task Generalization: [cnt]: A language model trained on various tasks using prompts can learn and generalize to new tasks in a zero-shot manner. [15 Oct 2021]
- Language Modeling Is Compression: [cnt]: Used as a lossless compressor, a model trained primarily on text compresses ImageNet patches to 43.4% and LibriSpeech samples to 16.4% of their raw size, beating domain-specific compressors like PNG (58.5%) or FLAC (30.3%). [19 Sep 2023]
- LLMs Represent Space and Time: [cnt]: Large language models learn world models of space and time from text-only training. [3 Oct 2023]
- Math-solving optimized LLM WizardMath: [cnt]: Developed by adapting Evol-Instruct and reinforcement learning techniques, these models excel at math-related benchmarks such as GSM8k and MATH. git (⭐9.2k) [18 Aug 2023] / Math-solving plugin: Wolfram Alpha
- Large Language Models for Software Engineering: [cnt]: Survey and Open Problems, Large Language Models (LLMs) for Software Engineering (SE) applications, such as code generation, testing, repair, and documentation. [5 Oct 2023]
- LLMs for Chip Design: Domain-Adapted LLMs for Chip Design [31 Oct 2023]
Large Language Models (in 2023) / GPT series release date
Evolutionary Tree of Large Language Models / GPT series release date
- A Survey of Large Language Models: [cnt] / git (⭐9.8k) [31 Mar 2023] contd.
- LLM evolutionary tree: [cnt]: A curated list of practical guide resources of LLMs (LLMs Tree, Examples, Papers) git (⭐9.1k) [26 Apr 2023]
Build an LLMs from scratch: picoGPT and lit-gpt / GPT series release date
- lit-gpt: Hackable implementation of state-of-the-art open-source LLMs based on nanoGPT. Supports flash attention, 4-bit and 8-bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed. git (⭐9.4k) [Mar 2023]
- pix2code (⭐12k): Generating Code from a Graphical User Interface Screenshot. Trained dataset as a pair of screenshots and simplified intermediate script for HTML, utilizing image embedding for CNN and text embedding for LSTM, encoder and decoder model. Early adoption of image-to-code. [May 2017] -> Screenshot to code (⭐16k): Turning Design Mockups Into Code With Deep Learning [Oct 2017] ref
LLM Materials for East Asian Languages / Japanese
- LLM 研究プロジェクト: LLM research project; a list of blog posts [27 Jul 2023]
- ブレインパッド社員が投稿した Qiita 記事まとめ: Summary of Qiita articles posted by BrainPad employees [Jul 2023]
- rinna: rinna の 36 億パラメータの日本語 GPT 言語モデル: 3.6 billion parameter Japanese GPT language model [17 May 2023]
- rinna: bilingual-gpt-neox-4b: Japanese-English bilingual large language model [17 May 2023]
- New Era of Computing - ChatGPT がもたらした新時代: The new era brought about by ChatGPT [May 2023]
- 大規模言語モデルで変わる ML システム開発: ML system development that changes with large-scale language models [Mar 2023]
- GPT-4 登場以降に出てきた ChatGPT/LLM に関する論文や技術の振り返り: Review of ChatGPT/LLM papers and technologies that have emerged since the advent of GPT-4 [Jun 2023]
- LLM を制御するには何をするべきか?: How to control LLM [Jun 2023]
- 1. 生成 AI のマルチモーダルモデルでできること: What can be done with multimodal models of generative AI / 2. 生成 AI のマルチモーダリティに関する技術調査: Technical survey on the multimodality of generative AI [Jun 2023]
- LLM の推論を効率化する量子化技術調査: Survey of quantization techniques to improve efficiency of LLM reasoning [Sep 2023]
- LLM の出力制御や新モデルについて: About LLM output control and new models [Sep 2023]
- Azure OpenAI を活用したアプリケーション実装のリファレンス (⭐264): Microsoft Japan reference architecture for application implementation using Azure OpenAI [Jun 2023]
- 生成 AI・LLM のツール拡張に関する論文の動向調査: Survey of trends in papers on tool extensions for generative AI and LLM [Sep 2023]
- LLM の学習・推論の効率化・高速化に関する技術調査: Technical survey on improving the efficiency and speed of LLM learning and inference [Sep 2023]
Learning and Supplementary Materials / Korean
- Attention Is All You Need: [cnt]: 🏆 The Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. [12 Jun 2017] Illustrated transformer
- Must read: the 100 most cited AI papers in 2022: doc [8 Mar 2023]
- The Best Machine Learning Resources: doc [20 Aug 2017]
- What are the most influential current AI Papers?: NLLG Quarterly arXiv Report 06/23 git (⭐8) [31 Jul 2023]
- gpt4free (⭐60k) for educational purposes only [Mar 2023]
- Comparing Adobe Firefly, Dalle-2, OpenJourney, Stable Diffusion, and Midjourney: Generative AI for images [20 Jun 2023]
- Open Problem and Limitation of RLHF: [cnt]: Provides an overview of open problems and the limitations of RLHF [27 Jul 2023]
- IbrahimSobh/llms (⭐266): Language models introduction with simple code. [Jun 2023]
- DeepLearning.ai Short courses [2023]
- Deep Learning cheatsheets for Stanford's CS 230 (⭐6.3k): Super VIP Cheatsheet: Deep Learning [Nov 2019]
- Best-of Machine Learning with Python (⭐16k):🏆A ranked list of awesome machine learning Python libraries. [Nov 2020]
Section 10: General AI Tools and Extensions / OSS Alternatives for OpenAI Code Interpreter (aka. Advanced Data Analytics)
- Vercel AI: Vercel AI Playground / Vercel AI SDK git (⭐9.1k) [May 2023]
- Quora Poe A chatbot service that gives access to GPT-4, gpt-3.5-turbo, Claude from Anthropic, and a variety of other bots. [Feb 2023]
Section 11: Datasets for LLM Training / OSS Alternatives for OpenAI Code Interpreter (aka. Advanced Data Analytics)
- LLM-generated datasets:
- Self-Instruct: [cnt]: Seed task pool with a set of human-written instructions (bootstrap loop sketched after this list). [20 Dec 2022]
- Self-Alignment with Instruction Backtranslation: [cnt]: Without human seeding, use LLM to produce instruction-response pairs. The process involves two steps: self-augmentation and self-curation. [11 Aug 2023]
- SQuAD: The Stanford Question Answering Dataset (SQuAD): 100,000+ question-answer pairs on 500+ Wikipedia articles. [16 Jun 2016]
- RedPajama: LLaMA training dataset of over 1.2 trillion tokens git (⭐4.5k) [17 Apr 2023]
- 大規模言語モデルのデータセットまとめ: Summary of datasets for large language models [Apr 2023]
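A sketch of the Self-Instruct bootstrap loop: prompt an LLM with sampled tasks from the pool, filter the candidates, and grow the pool. `generate()` and `filter_ok()` are hypothetical stubs for an LLM call and the paper's ROUGE-based filtering.

```python
# Sketch of the Self-Instruct bootstrap loop (stub helpers only).
import random

seed_tasks = ["Write a haiku about autumn.", "Explain recursion to a child."]

def generate(prompt):
    return f"task-{random.random():.6f}"  # stand-in for an LLM completion call

def filter_ok(candidate, pool):
    return candidate not in pool          # real pipeline: ROUGE-L + heuristics

def self_instruct(pool, target=100):
    while len(pool) < target:
        examples = "\n".join(random.sample(pool, min(4, len(pool))))
        candidate = generate(f"Here are some tasks:\n{examples}\n"
                             f"Write one new task:")
        if filter_ok(candidate, pool):
            pool.append(candidate)       # accepted instruction joins the pool
    return pool

print(len(self_instruct(list(seed_tasks), target=10)))
```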
Challenges in evaluating AI systems / Math
- Pretraining on the Test Set Is All You Need: [cnt]
- On that note, in the satirical Pretraining on the Test Set Is All You Need paper, the author trains a small 1M parameter LLM that outperforms all other models, including the 1.3B phi-1.5 model. This is achieved by training the model on all downstream academic benchmarks. It appears to be a subtle criticism underlining how easily benchmarks can be "cheated" intentionally or unintentionally (due to data contamination). cite [13 Sep 2023]
9. Awesome Cakephp
Migration
- 🍰 Upgrade/Migration Guide - Official migration guide.
Templating
- 🍰 Templating (⭐1) - HTML snippets as value objects, (Font) icons, and templating topics.