LLaMA-Meta家的平民大模型

2025年09月13日 19:13:30
134
平民大模型 聊天机器人 AI问答系统 AI内容生成
llama meta-llama/llama

这是Meta(脸书母公司)开源的大语言模型家族,从70亿参数到700亿参数应有尽有,主打一个“轻量能跑、开源免费”。普通人下载后,在消费级显卡上就能微调,不用再眼巴巴看着大厂模型流口水~

项目大小 1.12 KB
涉及语言 Python 94.65% Shell 5.35%
许可协议 LICENSE
仓库同步说明
  • • 同步需要仓库写入权限以创建目标仓库
  • • 使用平台账号授权登录后将同步到您平台下的个人仓库

Note of deprecation

Thank you for developing with Llama models. As part of the Llama 3.1 release, we’ve consolidated GitHub repos and added some additional repos as we’ve expanded Llama’s functionality into being an e2e Llama Stack. Please use the following repos going forward:

  • llama-models - Central repo for the foundation models including basic utilities, model cards, license and use policies
  • PurpleLlama - Key component of Llama Stack focusing on safety risks and inference time mitigations
  • llama-toolchain - Model development (inference/fine-tuning/safety shields/synthetic data generation) interfaces and canonical implementations
  • llama-agentic-system - E2E standalone Llama Stack system, along with opinionated underlying interface, that enables creation of agentic applications
  • llama-cookbook - Community driven scripts and integrations

If you have any questions, please feel free to file an issue on any of the above repos and we will do our best to respond in a timely manner.

Thank you!

(Deprecated) Llama 2

We are unlocking the power of large language models. Llama 2 is now accessible to individuals, creators, researchers, and businesses of all sizes so that they can experiment, innovate, and scale their ideas responsibly.

This release includes model weights and starting code for pre-trained and fine-tuned Llama language models — ranging from 7B to 70B parameters.

This repository is intended as a minimal example to load Llama 2 models and run inference. For more detailed examples leveraging Hugging Face, see llama-cookbook.

Updates post-launch

See UPDATES.md. Also for a running list of frequently asked questions, see here.

Download

In order to download the model weights and tokenizer, please visit the Meta website and accept our License.

Once your request is approved, you will receive a signed URL over email. Then run the download.sh script, passing the URL provided when prompted to start the download.

Pre-requisites: Make sure you have wget and md5sum installed. Then run the script: ./download.sh.

Keep in mind that the links expire after 24 hours and a certain amount of downloads. If you start seeing errors such as 403: Forbidden, you can always re-request a link.

Access to Hugging Face

We are also providing downloads on Hugging Face. You can request access to the models by acknowledging the license and filling the form in the model card of a repo. After doing so, you should get access to all the Llama models of a version (Code Llama, Llama 2, or Llama Guard) within 1 hour.

Quick Start

You can follow the steps below to quickly get up and running with Llama 2 models. These steps will let yo***un quick inference locally. For more examples, see the Llama 2 cookbook repository.

  1. In a conda env with PyTorch / CUDA available clone and download this repository.

  2. In the top-level directory run:

    1
    pip install -e .
  3. Visit the Meta website and register to download the model/s.

  4. Once registered, you will get an email with a URL to download the models. You will need this URL when yo***un the download.sh script.

  5. Once you get the email, navigate to your downloaded llama repository and run the download.sh script.

    • Make sure to grant execution permissions to the download.sh script
    • During this process, you will be prompted to enter the URL from the email.
    • Do not use the “Copy Link” option but rather make sure to manually copy the link from the email.
  6. Once the model/s you want have been downloaded, you can run the model locally using the command below:

1
2
3
4
torchrun --nproc_per_node 1 example_chat_completion.py \
    --ckpt_dir llama-2-7b-chat/ \
    --tokenizer_path tokenizer.model \
    --max_seq_len 512 --max_batch_size 6

Note

  • Replace llama-2-7b-chat/ with the path to your checkpoint directory and tokenizer.model with the path to your tokenizer model.
  • The –nproc_per_node should be set to the MP value for the model you are using.
  • Adjust the max_seq_len and max_batch_size parameters as needed.
  • This example runs the example_chat_completion.py found in this repository but you can change that to a different .py file.

Inference

Different models require different model-parallel (MP) values:

Model MP
7B 1
13B 2
70B 8

All models support sequence length up to 4096 tokens, but we pre-allocate the cache according to max_seq_len and max_batch_size values. So set those according to your hardware.

Pretrained Models

These models are not finetuned for chat or Q&A. They should be prompted so that the expected answer is the natural continuation of the prompt.

See example_text_completion.py for some examples. To illustrate, see the command below to run it with the llama-2-7b model (nproc_per_node needs to be set to the MP value):

1
2
3
4
torchrun --nproc_per_node 1 example_text_completion.py \
    --ckpt_dir llama-2-7b/ \
    --tokenizer_path tokenizer.model \
    --max_seq_len 128 --max_batch_size 4

Fine-tuned Chat Models

The fine-tuned models were trained for dialogue applications. To get the expected features and performance for them, a specific formatting defined in chat_completion
needs to be followed, including the INST and <> tags, BOS and EOS tokens, and the whitespaces and breaklines in between (we recommend calling strip() on inputs to avoid double-spaces).

You can also deploy additional classifiers for filtering out inputs and outputs that are deemed unsafe. See the llama-cookbook repo for an example of how to add a safety checker to the inputs and outputs of your inference code.

Examples using llama-2-7b-chat:

1
2
3
4
torchrun --nproc_per_node 1 example_chat_completion.py \
    --ckpt_dir llama-2-7b-chat/ \
    --tokenizer_path tokenizer.model \
    --max_seq_len 512 --max_batch_size 6

Llama 2 is a new technology that carries potential risks with use. Testing conducted to date has not — and could not — cover all scenarios.
In order to help developers address these risks, we have created the Responsible Use Guide. More details can be found in our research paper as well.

Issues

Please report any software “bug”, or other problems with the models through one of the following means:

Model Card

See MODEL_CARD.md.

License

Our model and weights are licensed for both researchers and commercial entities, upholding the principles of openness. Our mission is to empower individuals, and industry through this opportunity, while fostering an environment of discovery and ethical AI advancements.

See the LICENSE file, as well as our accompanying Acceptable Use Policy

References

  1. Research Paper
  2. Llama 2 technical overview
  3. Open Innovation AI Research Community

For common questions, the FAQ can be found here which will be kept up to date over time as new questions arise.

Original Llama

The repo for the original llama release is in the llama_v1 branch.


                

                

免责声明 © 2026 - 虚宝阁

本站部分源码来源于网络,版权归属原开发者,用户仅获得使用权。依据《计算机软件保护条例》第十六条,禁止:

  • 逆向工程破解技术保护措施
  • 未经许可的分发行为
  • 去除源码中的原始版权标识

※ 本站源码仅用于学习和研究,禁止用于商业用途。如有侵权, 请及时联系我们进行处理。

侵权举报请提供: 侵权页面URL | 权属证明模板

响应时效:收到完整材料后48小时内处理

相关推荐

Stable Diffusion AI绘画界的 扛把子

Stable Diffusion AI绘画界的 扛把子

提到AI画图,没人能绕开Stable Diffusion!由Stability AI开源,支持文本生成图像、图像修复、风格迁移,关键是完全免费商用(非商用更没问题),普通电脑装个WebUI就能玩到飞起。

41831 2025-09-13
Whisper-OpenAI的语音魔术师

Whisper-OpenAI的语音魔术师

OpenAI开源的语音识别模型,能把语音转文字、文字转语音,还支持99种语言!关键是准确率超高,连带口音的中文、英文都能轻松识别,简直是会议记录、视频字幕的救星。

89153 2025-09-13
scira: AI 驱动搜索引擎

scira: AI 驱动搜索引擎

极简的 AI 驱动搜索引擎,支持引用来源

10827 2025-10-20
n8n: AI自动化工作流工具

n8n: AI自动化工作流工具

基于节点的自动化工作流工具,能帮助用户轻松创建和管理复杂的自动化流程,无需编写大量代码。并且内置了AI能力,支持 400+ 应用和服务!

146989 2025-09-26
YOLOv8-目标检测界的闪电侠

YOLOv8-目标检测界的闪电侠

YOLO系列的最新版,主打“又快又准”的目标检测。能瞬间识别图片/视频里的人、车、动物、物体,在普通显卡上就能实时处理视频流,工业级场景都在用它。

47034 2025-09-13
node-qrcode

node-qrcode

qr code generator

7955 2025-10-17
Claudable:基于 Next.js 框架的网站生成器

Claudable:基于 Next.js 框架的网站生成器

把你用自然语言描述的应用想法,直接变成可以运行的网站代码。Claudable 背后依赖强大的 AI 编程助手,主要是 Claude Code,也支持 Cursor CLI 来理解你的需求并生成代码。你不需要懂复杂的 API 设置、数据库配置或者部署流程:用简单的语言告诉 Claudable 你想要什么应用。

2525 2025-09-15
DeepFaceLab

DeepFaceLab

换脸圈的“老大哥”,开源界扛把子,功能强到离谱。不管是图片换脸、长视频换脸,还是修复模糊人脸,它都能搞定。网上很多高质量换脸视频都出自它手,但毕竟是专业级选手,上手难度稍高,得花点时间研究流程,学会了就能玩出花。

18651 2025-09-14

仓库下载

gitee

GitHub 下载代理

文件信息

文件名
文件大小
文件类型
代理耗时