This is a really great video; it was very thorough and had great analogies. I didn't know about partially unfreezing a model; that's a neat fact!
I've had many office hours with people bringing finetune-related questions who, 99% of the time, I believe would be better off with semantic search, and perhaps with Hypothetical Document Embeddings (HyDE) in the rarest of cases.
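To make that concrete, here's a rough sketch of what I mean by semantic search with HyDE. It assumes the sentence-transformers package, and `draft_hypothetical_answer` is a stand-in for whatever LLM call you'd use to draft the hypothetical document:

```python
# Minimal sketch of semantic search, with HyDE as an optional twist.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

corpus = [
    "Fine-tuning unfreezes part of a pretrained model and keeps training it on new data.",
    "Semantic search embeds documents and questions into the same vector space.",
    "Prompt engineering shapes model behavior without touching its weights.",
]
corpus_embeddings = model.encode(corpus, normalize_embeddings=True)

def draft_hypothetical_answer(question: str) -> str:
    # Placeholder: real HyDE uses an LLM to write a plausible (possibly
    # wrong) answer passage, then searches with *that* instead of the query.
    return f"A short passage that plausibly answers: {question}"

def search(question: str, use_hyde: bool = False, k: int = 2):
    query_text = draft_hypothetical_answer(question) if use_hyde else question
    query_embedding = model.encode(query_text, normalize_embeddings=True)
    # Cosine similarity reduces to a dot product on normalized vectors.
    scores = corpus_embeddings @ query_embedding
    top = np.argsort(-scores)[:k]
    return [(corpus[i], float(scores[i])) for i in top]

print(search("how is finetuning different from prompting?", use_hyde=True))
```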
a history of transfer learning
long live NLU, rip NLP
fine tuning is tweaking a model for a task
only similarity b/w finetuning and q/a search is that they both use embeddings at some point
fine tuning unfreezes part of a model -- does not stop confabulation (hallucination)
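As a sketch of what "unfreezing part of a model" looks like in practice (plain PyTorch, toy model, nothing from the video):

```python
import torch
import torch.nn as nn

# Toy model standing in for a pretrained network; the layer split is made up.
model = nn.Sequential(
    nn.Linear(128, 128), nn.ReLU(),   # pretrained "body"
    nn.Linear(128, 128), nn.ReLU(),   # pretrained "body"
    nn.Linear(128, 10),               # head we want to adapt
)

# Freeze everything so the pretrained weights stay put...
for param in model.parameters():
    param.requires_grad = False

# ...then unfreeze only the last layer. This is the cheap end of the
# spectrum; unfreezing the whole model is where the cost blows up.
for param in model[-1].parameters():
    param.requires_grad = True

# The optimizer only ever sees the trainable slice.
optimizer = torch.optim.AdamW(
    [p for p in model.parameters() if p.requires_grad], lr=1e-4
)

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"training {trainable} of {total} parameters")
```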
unfreezing an entire model is expensive af
models barf out patterns, they do not have a theory of mind or knowledge
bigger models are more convincing, but even the largest model will never know itself (as an information store)
finetuning is way more difficult than prompt engineering (10,000x harder)
finetuning at scale is very hard -- how much do we share in alignment
cost of fine tuning goes up with more data -- needs constant retraining
instruct -> question + body of info -> is answer in here?
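That instruct step is really just string assembly; a sketch (prompt wording is mine, not quoted from the video):

```python
# Stuff retrieved passages into an instruct-style prompt and ask
# "is the answer in here?" -- template wording is illustrative only.
def build_prompt(question: str, passages: list[str]) -> str:
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer the question using ONLY the passages below. "
        "If the answer is not in them, say you don't know.\n\n"
        f"Passages:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

print(build_prompt(
    "what does finetuning actually change?",
    ["Fine-tuning unfreezes part of a pretrained model and keeps training it."],
))
```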
finetuning teaches a model to write a new pattern
formulate -> research -> criticize -> answer
dewey decimal is indexing on a smaller set of data -- compile all the relevant research and scale it
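Putting those last two notes together, the whole thing is a small pipeline: formulate a query, research against a prebuilt index (the dewey decimal step), criticize the draft, then answer. A skeleton with the model calls stubbed out as placeholders:

```python
# Skeleton of formulate -> research -> criticize -> answer.
# llm() is a stub standing in for a real model call; retrieve() is the
# "card catalog" step: look things up in an index instead of retraining.
def llm(prompt: str) -> str:
    return f"<model output for: {prompt[:40]}...>"  # placeholder

def retrieve(query: str, index: dict[str, str], k: int = 3) -> list[str]:
    # Toy lookup: rank docs by word overlap with the query.
    words = set(query.lower().split())
    ranked = sorted(
        index.values(),
        key=lambda doc: len(words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def answer(question: str, index: dict[str, str]) -> str:
    query = llm(f"Rewrite as a search query: {question}")           # formulate
    passages = retrieve(query, index)                               # research
    draft = llm(f"Answer '{question}' using only: {passages}")
    critique = llm(f"List unsupported claims in: {draft}")          # criticize
    return llm(f"Revise the draft given the critique: {critique}")  # answer

print(answer("what does finetuning change?",
             {"doc1": "finetuning unfreezes part of a model"}))
```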