Microsoft’s new phi-1.5 1.3B model outperforms llama2-7b in benchmarks

Microsoft Research yesterday released a new language model called phi-1.5. phi-1.5 is a Transformer with 1.3 billion parameters and is best suited for prompts written in the QA format, the chat format, or the code format.
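
To make the three prompt styles concrete, here is a minimal Python sketch that builds one prompt of each kind. The exact templates (a trailing "Answer:" line, "Speaker:" turn prefixes, a function signature followed by a docstring) and the helper names qa_prompt, chat_prompt, and code_prompt are illustrative assumptions, not templates taken from Microsoft's announcement.

```python
from typing import List, Tuple


def qa_prompt(question: str) -> str:
    """QA style (assumed layout): the question, a blank line, then an 'Answer:' cue."""
    return f"{question.strip()}\n\nAnswer:"


def chat_prompt(turns: List[Tuple[str, str]], next_speaker: str) -> str:
    """Chat style (assumed layout): 'Speaker: text' lines, ending with an open turn."""
    lines = [f"{speaker}: {text.strip()}" for speaker, text in turns]
    lines.append(f"{next_speaker}:")  # the model is expected to continue this turn
    return "\n".join(lines)


def code_prompt(signature: str, docstring: str) -> str:
    """Code style (assumed layout): a function signature plus a docstring to complete."""
    return f'{signature}\n    """{docstring}"""\n'


if __name__ == "__main__":
    print(qa_prompt("Why is the sky blue?"))
    print("---")
    print(chat_prompt([("Alice", "Can you suggest a weekend project?")], "Bob"))
    print("---")
    print(code_prompt("def is_prime(n: int) -> bool:", "Return True if n is prime."))
```

Each helper only returns a string, so the result can be passed to whichever text-generation interface you use. Ending the prompt on an open cue ("Answer:", "Bob:", or an unfinished function body) suits a next-word prediction model, which simply continues the text it is given.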

The new model was trained on a variety of data sources, including subsets of Python code, Q&A content from StackOverflow, competition code from code_contests, and synthetic Python textbooks and exercises generated by gpt-3.5-turbo-0301, augmented with a new data source consisting of various synthetic NLP texts.

According to the Microsoft Research team, phi-1.5 demonstrates nearly state-of-the-art performance among models with fewer than 10 billion parameters on benchmarks testing common sense, language understanding, and logical reasoning. phi-1.5 beats Meta’s llama-2 7b on the AGIEval score and is nearly on par with llama-2 7b on GPT4ALL’s benchmark suite, as measured with LM-Eval Harness.

Microsoft released this open-source model to provide the research community with a non-restricted small model to explore vital safety challenges.

phi-1.5 model details:

  • Architecture: a Transformer-based model with next-word prediction objective
  • Dataset size: 30B tokens
  • Training tokens: 150B tokens
  • Precision: fp16
  • GPUs: 32xA100-40G
  • Training time: 8 days
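
A few derived figures follow from the numbers listed above. The short Python sketch below works them out; it assumes that the 8 days are wall-clock time with all 32 GPUs in use throughout, and that the fp16 weights take 2 bytes per parameter (weights only, ignoring activations and optimizer state). These are back-of-the-envelope estimates, not figures reported by Microsoft.

```python
# Figures quoted in the article.
PARAMS = 1.3e9            # model parameters
DATASET_TOKENS = 30e9     # dataset size in tokens
TRAINING_TOKENS = 150e9   # tokens seen during training
NUM_GPUS = 32             # A100-40G GPUs
TRAINING_DAYS = 8

# Assumption: fp16 stores each parameter in 2 bytes.
BYTES_PER_PARAM_FP16 = 2


def derived_stats() -> dict:
    """Compute rough estimates implied by the published training details."""
    seconds = TRAINING_DAYS * 24 * 3600  # assumed continuous wall-clock training
    return {
        # How many times the dataset is traversed on average.
        "passes_over_dataset": TRAINING_TOKENS / DATASET_TOKENS,
        # Total GPU-hours, assuming all GPUs ran for the whole period.
        "gpu_hours": NUM_GPUS * TRAINING_DAYS * 24,
        # Aggregate and per-GPU training throughput.
        "tokens_per_second": TRAINING_TOKENS / seconds,
        "tokens_per_gpu_second": TRAINING_TOKENS / seconds / NUM_GPUS,
        # Size of the weights alone in fp16, in gigabytes (10^9 bytes).
        "fp16_weight_gb": PARAMS * BYTES_PER_PARAM_FP16 / 1e9,
    }


if __name__ == "__main__":
    for name, value in derived_stats().items():
        print(f"{name}: {value:,.1f}")
```

Under these assumptions the model makes about 5 passes over the 30B-token dataset, training costs about 6,144 GPU-hours, and the fp16 weights occupy roughly 2.6 GB.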

The new phi-1.5 model is available on Hugging Face.
