Notes on OpenAI Q&A Finetuning GPT-3 Vs Semantic Search - Which to Use, When, and Why

A great video about finetuning vs semantic search. Finetuning teaches a model to write new patterns, not to have a theory of mind.

Bram Adams
Overall Thoughts

This is a really great video; it is very thorough and has great analogies. I didn't know that finetuning unfreezes only part of the model; that's a neat fact!

I've had many office hours where people came in with finetuning questions, and I believe that 99% of the time they would be better off with semantic search, plus perhaps Hypothetical Document Embeddings (HyDE) in the rarest of cases.

Notated Transcript


a history of transfer learning


long live NLU, rip NLP


fine tuning is tweaking a task


only similarity b/w finetuning and q/a search is that they both use embeddings at some point


fine tuning unfreezes part of a model -- does not stop confabulation (hallucination)


unfreezing an entire model is expensive af


models barf out patterns, they do not have a theory of mind or knowledge

bigger models are more convincing, but even the largest model will never know itself (as an information store)


finetuning is way more difficult than prompt engineering (10,000x harder)


finetuning at scale is very hard -- how much do we share in alignment


cost of fine tuning goes up with more data -- needs constant retraining


instruct -> question + body of info -> is answer in here?
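
To make the instruct pattern above concrete, here is a minimal sketch of how the prompt might be assembled: an instruction, the body of info, then the question. The function name `build_prompt`, the prompt wording, and the `NOT FOUND` fallback are my own assumptions for illustration, not something taken from the video.

```python
def build_prompt(question: str, passages: list[str]) -> str:
    """Assemble an instruct-style prompt: instruction, body of info, question.

    Hypothetical helper; the wording of the instruction is an assumption.
    """
    # Number each passage so the model (and a human reader) can refer to it.
    body = "\n\n".join(f"[{i + 1}] {p.strip()}" for i, p in enumerate(passages))
    return (
        "Answer the question using only the passages below. "
        "If the answer is not in the passages, reply exactly: NOT FOUND.\n\n"
        f"Passages:\n{body}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )
```

The resulting string would be sent to whatever instruct model is in use; the point is that the knowledge lives in the passages, not in the model's weights.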


finetuning teaches model to write a new pattern


formulate -> research -> criticize -> answer
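
One way to read the four steps above is as a small pipeline where each stage is a swappable function. The sketch below is my interpretation, not the video's implementation: the name `answer_with_research` and the stage signatures are assumptions, and the stages in the usage lines are trivial stand-ins for what would really be model calls and a search backend.

```python
from typing import Callable


def answer_with_research(
    question: str,
    formulate: Callable[[str], list[str]],
    research: Callable[[str], list[str]],
    criticize: Callable[[str, list[str]], list[str]],
    answer: Callable[[str, list[str]], str],
) -> str:
    """Run formulate -> research -> criticize -> answer (hypothetical sketch)."""
    queries = formulate(question)                        # turn the question into search queries
    findings = [doc for q in queries for doc in research(q)]  # gather candidate material
    kept = criticize(question, findings)                 # drop material that does not hold up
    return answer(question, kept)                        # write the answer from what survived


# Toy stand-in stages, only to show the data flow.
notes = ["fine tuning unfreezes part of a model", "dewey decimal is indexing"]
result = answer_with_research(
    "what does fine tuning unfreeze?",
    formulate=lambda q: [q],
    research=lambda q: [n for n in notes if set(q.lower().split()) & set(n.split())],
    criticize=lambda q, docs: [d for d in docs if "unfreezes" in d],
    answer=lambda q, docs: docs[0] if docs else "NOT FOUND",
)
```

Keeping the stages as separate callables means the research step can be swapped for semantic search without touching the rest.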


dewey decimal is indexing on a smaller set of data; compile all the relevant research and scale it
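
The indexing idea above is roughly what semantic search does: embed every document once, then rank documents by similarity to the embedded query. Below is a minimal sketch under a loud assumption: `embed` is a toy word-count vector standing in for a real embedding model, so it only matches on shared words, not on meaning. All function names here are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts (assumption)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_index(docs: list[str]) -> list[tuple[str, Counter]]:
    """Embed each document once, up front (the 'card catalog')."""
    return [(d, embed(d)) for d in docs]


def search(index: list[tuple[str, Counter]], query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]
```

With a real embedding model swapped in for `embed`, the rest of the structure stays the same, and adding new knowledge means adding rows to the index rather than retraining anything.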


