| Their Ling 2 model had 1 trillion parameters, with 50B active parameters. They made the same commitment for the flash model too, a 104B model with 7B active parameters. |