
Showing posts with the label software development

Thoughts And Lessons Learned from Building An LLM App

Recently I built an LLM app that summarizes news over a period for a user. The app includes the following features:

- RSS Subscription: users can subscribe to RSS channels.
- Daily News Crawl: automatically crawls news entries from RSS channels each day.
- Preference Survey: the LLM surveys users to understand their news preferences.
- News Summaries: generates daily or weekly summaries based on user preferences. Users can expand summary entries to fetch and summarize content from reference URLs, and user preferences are updated based on click history.
- Question Answering: the LLM answers user questions based on crawled news (e.g., "What are the latest AI trends?").

My tech stack:

- Frontend: Next.js + custom Express.js server + Tailwind CSS.
  - Avoid server components to simplify client-side state interaction logic.
  - Use server components only when components are mostly independent (e.g., the signin/signup page vs. the news summary app page).
  - Custom Express.js server for flexible middle...
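The daily crawl step above can be sketched in a few lines. This is a minimal illustration, not the app's actual implementation: it parses an RSS 2.0 document with the standard library and extracts entry titles and links. The sample feed and field names are assumptions; the real app would fetch subscribed feeds over HTTP and persist the entries.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed standing in for a subscribed RSS channel.
SAMPLE_RSS = """<rss version="2.0"><channel>
<title>Example Feed</title>
<item><title>AI trends in 2024</title><link>https://example.com/a</link></item>
<item><title>New LLM release</title><link>https://example.com/b</link></item>
</channel></rss>"""

def crawl_entries(rss_xml):
    """Parse an RSS 2.0 document and return (title, link) pairs for each item."""
    root = ET.fromstring(rss_xml)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]

entries = crawl_entries(SAMPLE_RSS)
```

Each crawled entry would then be stored and later fed to the summarization step.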

Notes for ICML Physics of LLM Talk

Source: https://youtu.be/yBL7J0kgldU?si=koiBhKpq3Cp1M8G7

Research methodology:
- Deconstruct LLM behavior into building blocks: structure, knowledge, reasoning, etc.
- Study in a controlled way: an idealized environment where you control the data and tweak the parameters.
- Highly repeatable experiments: ~100M-parameter models, universal laws, 1x H100 within a day.
- Probe the inner workings.

Knowledge extraction:
- Two types of data: biographies of N individuals, and QA data that extracts facts about those N individuals from the biographies.
- Training data: the N biographies + N/2 of the QA data. Test data: the other N/2 of the QA data.
- If the model performs well on the held-out N/2 individuals' biography questions, it has knowledge extraction capability.
- Option 1: pretrain with both the N biographies and the N/2 QA data. Result: good knowledge extraction.
- Option 2: pretrain with biography data only, then fine-tune with QA. Result: bad knowledge extraction.
- Option 3: augment the biography data for each person, pretrain with biographies, and fine-tune with QA ...
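The train/test split behind the knowledge-extraction experiment can be sketched as follows; this is my own illustration of the setup, not code from the talk. All N biographies go into pretraining, QA pairs for half the individuals are used in training (Option 1) or fine-tuning (Options 2/3), and QA for the other half measures extraction on unseen individuals.

```python
import random

def make_split(n, seed=0):
    """Split N individuals into training-QA and held-out-QA halves."""
    rng = random.Random(seed)
    people = list(range(n))
    rng.shuffle(people)
    half = n // 2
    return {
        "pretrain_bios": list(range(n)),    # all N biographies are pretrained on
        "train_qa": sorted(people[:half]),  # QA pairs for N/2 individuals
        "test_qa": sorted(people[half:]),   # held-out QA for the other N/2
    }

split = make_split(100)
```

The key property is that no test individual's QA pair appears in training, so good test accuracy means facts were extracted from the biographies rather than memorized from QA.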

MIT Efficient ML Course Notes and Highlights

Personal highlights:
- Memory movement is more expensive than computation; network latency is more significant than computation.
- With the same memory consumption, we want the network to do as much computation as possible to increase accuracy.
- Common techniques: pruning, quantization, distillation.
- Different levels of grouping and granularity are used in pruning, quantization, and parallel execution.
- Common evaluation and optimization criteria: weight significance, activation significance; tensor-wise, channel-wise, batch-wise, ...; L2 loss, KL divergence, accuracy, latency, number of computations, memory usage.
- Common ideas for optimizing a neural network structure using the above techniques:
  - Treat an architecture option as a trainable parameter with an additional loss or KL divergence; optimize the architecture parameters alongside the regular weights, either together or by freezing one and optimizing the other iteratively.
  - Iteratively prune/quantize/distill, and evaluate after fine-tuning in each round.
  - Ablation study: delete...

AI Reading Notes: Prompt Engineering, Agent and RAG

Prompt Engineering and Reasoning (https://arxiv.org/pdf/2212.09597)

Two types of reasoning enhancement:
- Strategy-based enhancement
  - Prompt engineering
    - Single-stage enhancement: few-shot, chain of thought.
    - Multi-stage enhancement: enhance through multiple rounds of input and output; define specific follow-up questions; inject additional context at each round.
  - Process optimization: optimize the whole inference and training process.
    - Self-Optimization: rate and correct the output of one rationale using an extra module.
    - Ensemble-Optimization: execute multiple rationales in parallel and take a majority vote.
    - Iterative-Optimization: rate the outputs and iteratively fine-tune the model on the good ones.
  - External engine: optimize with the help of external tools.
    - Physical simulator: use the physical simulator's output as a prompt to the LM.
    - Code interpreter: convert LM output into code and execute it.
    - Other tools, such as a calculator or search API.
- Knowledge-based enhancement
  - Implicit knowledge: use prompt to...
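The Ensemble-Optimization idea (execute multiple rationales in parallel and take a majority vote, as in self-consistency decoding) reduces to a vote over the final answers. A minimal sketch, assuming the final answer has already been extracted from each sampled chain of thought:

```python
from collections import Counter

def majority_vote(answers):
    """Return the most common final answer across sampled rationales."""
    return Counter(answers).most_common(1)[0][0]

# Hypothetical final answers extracted from five sampled chains of thought.
final = majority_vote(["42", "42", "17", "42", "17"])
# → "42"
```

The vote discards the reasoning paths themselves, so rationales that disagree in their steps but converge on the same answer reinforce each other.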
Prompt Engineering and Reasoning https://arxiv.org/pdf/2212.09597 2 types of reasoning enhancement Strategy based enhancement Prompt engineering Single stage enhancement Few shot Chain of thoughts Multi stage enhancement enhance through multiple round of input and output Define specific follow up questions Inject additional context at each round Process optimization - optimize the whole inference and training process Self-Optimization: rate and correct the output from one rationale by using extra module Ensemble-Optimization: Execute multiple rationale in parallel and do majority vote Iterative-optimization: rate the output and iteratively fine tune the model with good output External engine - optimize with help of external tools Physical simulator: use physical simulator’s output as prompt to LM code interpreter: convert LM output into code and execute other tool like calculator, search api Knowledge based enhancement Implicit knowledge Use prompt to...