From 1409bafd2bac9da59207c050a02e49fced68a919 Mon Sep 17 00:00:00 2001
From: 0x000011b <0x000011b@waifu.club>
Date: Sun, 1 Jan 2023 11:50:30 -0300
Subject: [PATCH] chore: update ROADMAP

---
 ROADMAP.md | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/ROADMAP.md b/ROADMAP.md
index 5614459..00d306a 100644
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -25,9 +25,13 @@ For anyone who's interested in the actual details, here's a TL;DR version of the
 - The model checkpoint is hosted on HuggingFace under [Pygmalion-AI/pygmalion-350m](https://huggingface.co/Pygmalion-AI/pygmalion-350m).
   - **Note:** Inference should not be done on the regular HuggingFace web UI since we need to do some prompt trickery and response parsing. To play around with the model, [try out this notebook](https://colab.research.google.com/drive/1K55_MCagEDD9EmWhjCi3Bm66vJM88m6P?usp=sharing).
 - We have written a [userscript which can anonymize and dump your CharacterAI chats](./extras/characterai-dumper/), and made [a website where you can upload them](https://dump.nopanda.io/) to be used as training data for future models. If you're interested in contributing, please read through [this Rentry](https://rentry.org/f8peb) for more information.
+- We released a prototype 1.3B model fine-tuned on a new dataset, which includes the anonymized CharacterAI data.
+  - It's hosted on HuggingFace under [Pygmalion-AI/pygmalion-1.3b](https://huggingface.co/Pygmalion-AI/pygmalion-1.3b).
+  - We've already received feedback from several users (thank you to everyone who took the time to test out the model and write to us!) and identified several shortcomings in it.

 ## Next Steps

-- We will attempt to fine-tune OPT-1.3B. For that, we'll need:
-  - More hardware, which we'll probably rent out in the cloud;
-  - More high-quality data, which will hopefully be covered by the contributed CAI logs and some other good datasets we can get our hands on.
+- We're on the lookout for more high-quality data sources, and we still welcome [new CharacterAI dumps](https://dump.nopanda.io/).
+- We plan on training a new 1.3B model, but only after brainstorming and making changes to our data processing pipeline in an attempt to address the problems we've seen in the current 1.3B.
+  - Feel free to join the Matrix server if you want to join in on the discussions.
+- Once we have a decently performing 1.3B model, we plan to set up some way for people to rate/rank bot responses and send that data to us; it will then be used to train a reward model for a PPO fine-tune, similar to what OpenAI did for ChatGPT. We might join forces with other communities for this step as well.