
๐ŸŒŸ ๐‹๐‹๐Œ ๐๐ซ๐จ๐ฏ๐ข๐๐ž๐ซ'๐ฌ ๐‘๐ž๐ฅ๐ž๐š๐ฌ๐ž (2024 1H) & ๐Ÿ“ 2024 ๐‹๐‹๐Œ ๐’๐ฎ๐ซ๐ฏ๐ž๐ฒ (on Training / Data / RAG / Serving / Agent)

🌟 LLM Provider's Release (2024 1H): A Comprehensive Overview

Curiosity: What patterns can we retrieve from the rapid pace of LLM releases in 2024? How do these innovations connect to the broader evolution of the field?

2024's first half witnessed an unprecedented surge in LLM releases, with 21 major models from leading providers. This comprehensive overview retrieves insights from release patterns, technical innovations, and market dynamics to understand where the field is heading.

Release Timeline Overview

```mermaid
gantt
    title LLM Releases 2024 1H Timeline
    dateFormat YYYY-MM-DD
    section Major Releases
    GPT-4o (OpenAI)           :2024-05-13, 1d
    Llama-3 (Meta)            :2024-04-18, 1d
    Claude-3 (Anthropic)      :2024-03-04, 1d
    Gemini-1.5 (Google)       :2024-03-08, 1d
    section Open Source
    Qwen-2 (Alibaba)          :2024-06-07, 1d
    DeepSeek-V2               :2024-05-07, 1d
    Phi-3 (Microsoft)         :2024-04-22, 1d
    section Specialized
    Solar-Mini-ja (Upstage)   :2024-05-22, 1d
    Mistral-Large             :2024-02-26, 1d
```

21 LLM Releases: Complete Catalog

| # | Model | Provider | Release Date | Key Features | News | Paper |
|---|-------|----------|--------------|--------------|------|-------|
| 1 | Qwen-2 | Alibaba Group | 2024.06.07 | Multilingual, large-scale | Link | - |
| 2 | Solar-Mini-ja | Upstage | 2024.05.22 | Japanese-optimized | Link | - |
| 3 | Yi-Large | 01.AI | 2024.05.13 | Large-scale model | Link | - |
| 4 | Yi-1.5 | 01.AI | 2024.05.13 | Enhanced version | Link | arXiv |
| 5 | GPT-4o | OpenAI | 2024.05.13 | Omni-modal, faster | Link | - |
| 6 | Qwen-Max | Alibaba Group | 2024.05.11 | Maximum performance | Link | - |
| 7 | DeepSeek-V2 | DeepSeek | 2024.05.07 | Efficient architecture | Link | arXiv |
| 8 | Snowflake-Arctic | Snowflake | 2024.04.24 | Enterprise-focused | Link | - |
| 9 | Phi-3 | Microsoft | 2024.04.22 | Small language model | Link | arXiv |
| 10 | Llama-3 | Meta | 2024.04.18 | Open-source leader | Link | - |
| 11 | Mixtral-8x22B | Mistral AI | 2024.04.17 | Mixture of experts | Link | - |
| 12 | Reka-Core | Reka AI | 2024.04.15 | Multimodal | Link | arXiv |
| 13 | Command-R-Plus | Cohere | 2024.04.04 | Enterprise RAG | Link | - |
| 14 | DBRX | Databricks | 2024.03.27 | Open-source SOTA | Link | - |
| 15 | Gemini-1.5 | Google | 2024.03.08 | Long context | Link | arXiv |
| 16 | Claude-3 | Anthropic | 2024.03.04 | Safety-focused | Link | - |
| 17 | Mistral-Large | Mistral AI | 2024.02.26 | European leader | Link | - |
| 18 | Gemma | Google | 2024.02.21 | Open models | Link | arXiv |
| 19 | Qwen-1.5 | Alibaba Group | 2024.02.04 | Multilingual | Link | - |
| 20 | Solar-Mini | Upstage | 2024.01.25 | Efficient Korean | Link | - |
| 21 | Solar-10.7B | Upstage | 2023.12.23 | Top pre-trained | Link | arXiv |

Provider Distribution

```mermaid
pie title LLM Releases by Provider (2024 1H)
    "Alibaba Group" : 3
    "Upstage" : 3
    "Google" : 2
    "Mistral AI" : 2
    "01.AI" : 2
    "Others" : 9
```

Retrieve: Analysis of release patterns reveals several key trends:

  1. Open Source Acceleration: Major releases from Meta (Llama-3), Alibaba (Qwen series), and Databricks (DBRX)
  2. Multimodal Expansion: GPT-4o, Gemini-1.5, Reka-Core emphasize vision capabilities
  3. Efficiency Focus: Phi-3, Solar-Mini demonstrate small model excellence
  4. Regional Specialization: Solar-Mini-ja (Japanese), Qwen series (Chinese)

Innovate: These releases show the field moving toward:

  • More efficient architectures (DeepSeek-V2, Phi-3)
  • Better multilingual support (Qwen, Solar)
  • Enterprise-ready solutions (Snowflake Arctic, Command-R-Plus)
  1. ๐๐ฐ๐ž๐ง-2 (โ€‹Alibaba Groupโ€‹, 2024.06.07)
  2. ๐’๐จ๐ฅ๐š๐ซ-๐Œ๐ข๐ง๐ข-๐ฃ๐š (โ€‹Upstageโ€‹, 2024.05.22)
  3. ๐˜๐ข-๐‹๐š๐ซ๐ ๐ž (โ€‹01.AIโ€‹, 2024.05.13)
  4. ๐˜๐ข-1.5 (โ€‹01.AIโ€‹, 2024.05.13)
  5. ๐†๐๐“-4๐จ (โ€‹OpenAIโ€‹, 2024.05.13)
  6. ๐๐ฐ๐ž๐ง-๐Œ๐š๐ฑ (โ€‹Alibaba Groupโ€‹, 2024.05.11)
  7. ๐ƒ๐ž๐ž๐ฉ๐’๐ž๐ž๐ค-๐•2 (DeepSeek, 2024.05.07)
  8. ๐’๐ง๐จ๐ฐ๐Ÿ๐ฅ๐š๐ค๐ž-๐€๐ซ๐œ๐ญ๐ข๐œ (โ€‹Snowflakeโ€‹, 2024.04.24)
  9. ๐๐ก๐ข-3 (โ€‹Microsoftโ€‹, 2024.04.22)
  10. ๐‹๐ฅ๐š๐ฆ๐š-3 (โ€‹Meta Facebookโ€‹, 2024.04.18)
  11. ๐Œ๐ข๐ฑ๐ญ๐ซ๐š๐ฅ-8๐ฑ22๐ (โ€‹Mistral AIโ€‹, 2024.04.17)
  12. ๐‘๐ž๐ค๐š-๐‚๐จ๐ซ๐ž (โ€‹Reka AIโ€‹โ€‹, 2024.04.15)
  13. ๐‚๐จ๐ฆ๐ฆ๐š๐ง๐-๐‘-๐๐ฅ๐ฎ๐ฌ (โ€‹Cohereโ€‹, 2024.04.04)
  14. ๐ƒ๐๐‘๐— (โ€‹Databricksโ€‹, 2024.03.27)
  15. ๐†๐ž๐ฆ๐ข๐ง๐ข-1.5 (โ€‹Googleโ€‹, 2024.03.08)
  16. ๐‚๐ฅ๐š๐ฎ๐๐ž-3 (โ€‹Anthropicโ€‹, 2024.03.04)
  17. ๐Œ๐ข๐ฌ๐ญ๐ซ๐š๐ฅ-๐‹๐š๐ซ๐ ๐ž (โ€‹Mistral AIโ€‹, 2024.02.26)
  18. ๐†๐ž๐ฆ๐ฆ๐š (โ€‹Googleโ€‹, 2024.02.21)
  19. ๐๐ฐ๐ž๐ง-1.5 (โ€‹Alibaba Groupโ€‹, 2024.02.04)
  20. ๐’๐จ๐ฅ๐š๐ซ-๐Œ๐ข๐ง๐ข (โ€‹Upstageโ€‹, 2024.01.25)
  21. ๐’๐จ๐ฅ๐š๐ซ-10.7๐ (โ€‹Upstageโ€‹, 2023.12.23)

๐Ÿ“ 2024 LLM Survey: Comprehensive Research Overview

Retrieve: What are the latest research trends across training, data, RAG, serving, and agents? This section compiles essential survey papers that capture the state of the art.

Essential Reading: These surveys provide comprehensive overviews of rapidly evolving LLM research areas.

*Figure: LLM 2024 survey overview*

Survey Categories Overview

```mermaid
graph TB
    A[2024 LLM Surveys] --> B[Training]
    A --> C[Data]
    A --> D[RAG]
    A --> E[Serving]
    A --> F[Agent]

    B --> B1[Self-Evolution]
    B --> B2[Continual Learning]
    B --> B3[Pre-trained Models]

    C --> C1[Datasets]
    C --> C2[Data Selection]
    C --> C3[Instruction Tuning]

    D --> D1[RALM Survey]
    D --> D2[AIGC RAG]
    D --> D3[LLM RAG]

    E --> E1[Inference]
    E --> E2[Invocation Methods]
    E --> E3[Resource Efficiency]

    F --> F1[Multimodal Agents]
    F --> F2[Multi-Agents]
    F --> F3[Personal Agents]

    style A fill:#e1f5ff
    style B fill:#fff3cd
    style C fill:#d4edda
    style D fill:#f8d7da
    style E fill:#e7d4f8
    style F fill:#ffe5e5
```

📚 Training Surveys

Retrieve: How do LLMs evolve and adapt? These surveys explore self-evolution, continual learning, and transfer learning.

| Survey | Date | Focus | arXiv | GitHub |
|--------|------|-------|-------|--------|
| Self-Evolution of LLMs | 2024.04.22 | Autonomous improvement mechanisms | Link | Repo |
| Continual Learning of LLMs | 2024.04.25 | Lifelong learning approaches | Link | Repo |
| Continual Learning with PTMs | 2024.01.29 | Pre-trained model adaptation | Link | Repo |

📊 Data Surveys

Innovate: Data quality and selection are critical for LLM performance. These surveys explore dataset curation and optimization.

| Survey | Date | Focus | arXiv | GitHub |
|--------|------|-------|-------|--------|
| Datasets for LLMs | 2024.02.28 | Comprehensive dataset catalog | Link | Repo |
| Data Selection for LMs | 2024.02.26 | Selection strategies | Link | Repo |
| Data Selection for Instruction Tuning | 2024.02.04 | Instruction data curation | Link | Repo |

๐Ÿ” RAG Surveys

Retrieve: Retrieval-Augmented Generation is transforming how LLMs access knowledge. These surveys cover the latest RAG research.

| Survey | Date | Focus | arXiv | GitHub |
|--------|------|-------|-------|--------|
| RAG and RAU Survey | 2024.04.30 | RALM in NLP | Link | Repo |
| RAG for AIGC | 2024.02.29 | AI-generated content | Link | Repo |
| RAG for LLMs | 2023.12.18 | Comprehensive RAG overview | Link | Repo |

⚡ Serving Surveys

Innovate: Efficient inference and serving are crucial for production deployment. These surveys explore optimization strategies.

| Survey | Date | Focus | arXiv | GitHub |
|--------|------|-------|-------|--------|
| LLM Inference Unveiled | 2024.02.26 | Roofline model insights | Link | Repo |
| Effective LLM Service Invocation | 2024.02.05 | LLMaaS strategies | Link | Repo |
| Resource-Efficient LLMs | 2024.01.01 | Efficiency optimization | Link | Repo |

🤖 Agent Surveys

Retrieve: AI agents represent the next frontier. These surveys explore multimodal, multi-agent, and personal agent systems.

| Survey | Date | Focus | arXiv | GitHub |
|--------|------|-------|-------|--------|
| Large Multimodal Agents | 2024.02.23 | Vision-language agents | Link | Repo |
| LLM-based Multi-Agents | 2024.01.21 | Multi-agent systems | Link | Repo |
| Personal LLM Agents | 2024.01.10 | Personalization & security | Link | Repo |

```mermaid
graph LR
    A[2024 LLM Research] --> B[Training<br/>3 surveys]
    A --> C[Data<br/>3 surveys]
    A --> D[RAG<br/>3 surveys]
    A --> E[Serving<br/>3 surveys]
    A --> F[Agent<br/>3 surveys]

    style A fill:#e1f5ff
    style B fill:#fff3cd
    style C fill:#d4edda
    style D fill:#f8d7da
    style E fill:#e7d4f8
    style F fill:#ffe5e5
```

Key Takeaways

Retrieve: These 15 comprehensive surveys cover the essential areas of LLM research: training methodologies, data strategies, RAG systems, serving optimization, and agent architectures.

Innovate: By studying these surveys, you can retrieve the latest research insights and innovate on your own LLM applications, staying at the forefront of this rapidly evolving field.

Curiosity โ†’ Retrieve โ†’ Innovation: Start with curiosity about LLM capabilities, retrieve knowledge from these surveys, and innovate by applying cutting-edge techniques to your projects.

Information about Tokens in LLMs

Why do we keep talking about โ€œtokensโ€ in LLMs instead of words?

It turns out to be much more efficient, in terms of model performance, to break words into sub-word units (tokens)!

The approach used in most modern LLMs since GPT-1 is Byte Pair Encoding (BPE). The idea is to use as tokens the sub-word units that appear most often in the training data. The algorithm works as follows (a minimal code sketch follows the list):

  • We start with a character-level tokenization
  • We count the pair frequencies
  • We merge the most frequent pair
  • We repeat the process until the dictionary is as big as we want it to be
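
Here is a minimal, self-contained Python sketch of that merge loop over a toy corpus. The corpus and helper names (`get_pair_counts`, `merge_pair`) are illustrative only, and the sketch omits details real tokenizers handle, such as end-of-word markers and byte-level fallback.

```python
from collections import Counter

def get_pair_counts(words):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(words, pair):
    """Rewrite every word, replacing occurrences of `pair` with one merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# Toy corpus: word -> frequency, starting from a character-level tokenization.
corpus = {"lower": 5, "lowest": 2, "newer": 6, "wider": 3}
words = {tuple(w): f for w, f in corpus.items()}

num_merges = 10  # the dictionary-size knob discussed below
merges = []
for _ in range(num_merges):
    pairs = get_pair_counts(words)
    if not pairs:
        break
    best = max(pairs, key=pairs.get)  # most frequent adjacent pair
    merges.append(best)
    words = merge_pair(words, best)

print(merges)  # learned merge rules, most frequent first
print(words)   # corpus rewritten with merged sub-word tokens
```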

The size of the dictionary becomes a hyperparameter that we can adjust based on our training data. For example, GPT-1 uses a dictionary of roughly 40K merges; GPT-2, GPT-3, and ChatGPT use roughly 50K; and Llama 3 uses about 128K.
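
To see such a dictionary in practice, you can inspect a real BPE vocabulary with OpenAI's tiktoken package (a quick illustration, assuming tiktoken is installed; the exact splits and vocabulary size depend on the encoding you pick):

```python
import tiktoken  # pip install tiktoken

# cl100k_base is the BPE vocabulary used by GPT-3.5/GPT-4-era models.
enc = tiktoken.get_encoding("cl100k_base")

word = "tokenization"
ids = enc.encode(word)

print(enc.n_vocab)                     # dictionary size (~100K for cl100k_base)
print(ids)                             # token ids for the word
print([enc.decode([i]) for i in ids])  # sub-word pieces, e.g. something like ['token', 'ization']
```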

*Figure: Tokens from words in LLMs*

This post is licensed under CC BY 4.0 by the author.