Posts

Showing posts from July, 2024

AI Reading Notes: Prompt Engineering, Agent and RAG

Prompt Engineering and Reasoning
https://arxiv.org/pdf/2212.09597
Two types of reasoning enhancement:
- Strategy-based enhancement
  - Prompt engineering
    - Single-stage enhancement
      - Few-shot
      - Chain of thought
    - Multi-stage enhancement: enhance through multiple rounds of input and output
      - Define specific follow-up questions
      - Inject additional context at each round
  - Process optimization: optimize the whole inference and training process
    - Self-optimization: use an extra module to rate and correct the output of a single rationale
    - Ensemble-optimization: execute multiple rationales in parallel and take a majority vote
    - Iterative-optimization: rate the outputs and iteratively fine-tune the model on the good ones
  - External engine: optimize with the help of external tools
    - Physical simulator: use the simulator’s output as a prompt to the LM
    - Code interpreter: convert the LM output into code and execute it
    - Other tools such as a calculator or a search API
- Knowledge-based enhancement
  - Implicit knowledge
    - Use prompt to
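
The ensemble-optimization idea in the notes above (several rationales, then a majority vote over their final answers) can be sketched roughly as follows. This is a minimal illustration, not the survey's implementation: `sample_fn` is a hypothetical stand-in for one sampled LM call that returns only the final answer, and the calls run sequentially here rather than in parallel.

```python
from collections import Counter
from typing import Callable, List


def majority_vote(answers: List[str]) -> str:
    """Return the most common answer; ties go to the answer seen first."""
    if not answers:
        raise ValueError("need at least one answer")
    return Counter(answers).most_common(1)[0][0]


def ensemble_answer(sample_fn: Callable[[str], str], prompt: str, n: int = 5) -> str:
    """Sample n answers for the same prompt and keep the majority answer.

    sample_fn is a hypothetical placeholder (an assumption, not a real API):
    it should run one rationale and return just the extracted final answer.
    """
    answers = [sample_fn(prompt) for _ in range(n)]
    return majority_vote(answers)
```

With a real model, `sample_fn` would typically sample at a non-zero temperature so that the rationales differ; identical samples would make the vote pointless.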

AI Reading Notes: Deep Learning and Large Language Model Basics

GPT
High-level understandings
- Computational irreducibility: some computations cannot be reduced to a quick shortcut and have to be carried out step by step.
- Tension between learnability and computational irreducibility: learning compresses data by leveraging the regularities inside it, but computational irreducibility implies a limit to those regularities, so the data can only be compressed so far.
- Tradeoff between capability and trainability: the more you want a system to make “true use” of its computational capabilities, the more it’s going to show computational irreducibility, and the less it’s going to be trainable. And the more it’s fundamentally trainable, the less it’s going to be able to do sophisticated computation.
- ChatGPT successfully “captures the essence” of human language and the thinking behind it, and has the potential to be a “world model”.
Popular neural networks
- Convolutional neural network (CNN)
  - Instead of a fully connected feed-forward network, only connect a n
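
The CNN note above is cut off, so the following is only a sketch under the assumption that it describes the usual local connectivity: each output depends on a small window of neighboring inputs, and the same weights are reused at every position. The function name `conv1d` and the 1D setting are illustrative choices, not taken from the post.

```python
from typing import List


def conv1d(signal: List[float], kernel: List[float]) -> List[float]:
    """Valid-mode 1D cross-correlation.

    Each output value is computed from only len(kernel) neighboring inputs
    (local connectivity), and the same kernel is applied at every position
    (weight sharing), unlike a fully connected layer.
    """
    k = len(kernel)
    if k == 0 or k > len(signal):
        raise ValueError("kernel must be non-empty and no longer than the signal")
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]
```

A fully connected layer mapping 4 inputs to 2 outputs would need 8 weights; the 3-tap kernel here needs only 3, regardless of the signal length.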