-->
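FlashAttention installation notes
These notes record the problems I ran into while installing and compiling FlashAttention, so that I do not fall into the same traps again.
CUDA environment variables
Problem
When compiling flash-attn, the build cannot find the CUDA headers or the compiler. The typical errors are:
- nvcc not found
- cannot find cuda.h
Troubleshooting
- Run ls /usr/local/cuda/include to confirm that headers such as cusparse.h and cusparse_v2.h exist.
- Run which nvcc or nvcc --version to check whether nvcc is on the PATH.
Fix
Set the CUDA paths explicitly in the environment variables:
export CUDA_HOME=/usr/local/cuda
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
After this, nvcc can be found and the header and library paths are resolved correctly.
CUDA runtime does not match the version PyTorch was built with
Using conda to install p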
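The troubleshooting steps in the FlashAttention notes above can also be scripted. The following is a minimal sketch, not part of the original notes: the function name `check_cuda_env` and the keys of the returned dictionary are hypothetical, and the default path `/usr/local/cuda` is taken from the commands quoted above.

```python
import os
import shutil


def check_cuda_env(cuda_home=None):
    """Report whether the CUDA toolchain looks visible to a source build.

    Hypothetical helper: it mirrors the manual checks (which nvcc, looking
    for headers under the include directory) and changes nothing.
    """
    # Fall back to CUDA_HOME, then to the default location used in the notes.
    cuda_home = cuda_home or os.environ.get("CUDA_HOME", "/usr/local/cuda")
    lib64 = os.path.join(cuda_home, "lib64")
    ld_paths = os.environ.get("LD_LIBRARY_PATH", "").split(os.pathsep)
    return {
        "cuda_home": cuda_home,
        # Full path to nvcc if it is on PATH, otherwise None.
        "nvcc_path": shutil.which("nvcc"),
        # True if cuda.h exists under <cuda_home>/include.
        "cuda_h_found": os.path.isfile(os.path.join(cuda_home, "include", "cuda.h")),
        # True if <cuda_home>/lib64 is listed in LD_LIBRARY_PATH.
        "lib64_on_ld_library_path": lib64 in ld_paths,
    }
```

Calling `check_cuda_env()` before running the build and printing the result shows at a glance which of the three exports is still missing.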
Listening and note-taking
Tips for note-taking:
- Don't take notes just for the sake of taking notes. Take them in a way that is valuable to you.
- Concentrate on the important words (usually nouns, verbs, etc.) that carry the key information.
- Omit unimportant words.
- Use a consistent system of punctuation and abbreviation that makes sense to you. (It does not need to make sense to others.)
- Make the most of abbreviations, symbols and shapes.
Summary writing
Gist: the main or essential part
Summary: a comprehensive
Discussion skills
An effective discussion requires:
- being an attentive audience
- reaching a conclusion (decisions); otherwise the discussion is ineffective
Tips for participating in a discussion:
For freshmen:
- ask a question so that the speaker can clarify or elaborate on a point that you don't understand
- ask cause-and-effect questions
For junior students:
- explain that you find the speaker's opinions or points interesting and describe why
- build on what the speaker has said and extend their opinion / points
- pa
Final part of paper writing
Results section
In this section, the specific data of the work and a detailed analysis of the data should be fully presented.
Three moves for writing the results part:
- Provide preparatory information
- Describe the data in figures and tables
- Report the results based on the data
Writing requirements for results:
- any data given in this section must be meaningful
- the presentation of the data should be concise, free of verbiage, and crystal clear
- this section should also be well-written and shoul
M/G/1 queuing system
G stands for a general distribution, i.e. the more general situation. In this case, the service time of the server has a general distribution with mean $X=E[x]=1/\mu$ and standard deviation $\sigma_x$, where $\mu$ is the mean service rate (services per second, etc.).
The mean waiting time of a service request is given by the P-K formula:
$$W=\frac{\lambda E[x^2]}{2(1-\rho)}$$
Using Little's Law, we can obtain the mean number of service requests in the buffer:
$$N_Q=\lambda W=\frac{\lambda^2 E[x
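The P-K formula and the Little's Law step from the M/G/1 excerpt above can be checked numerically. The sketch below is illustrative only: the function name `mg1_metrics` is hypothetical, and it assumes the standard definitions $\rho=\lambda X=\lambda/\mu$ and $E[x^2]=\sigma_x^2+X^2$, which the excerpt does not state explicitly.

```python
def mg1_metrics(arrival_rate, mean_service, std_service):
    """Return (rho, W, N_Q) for an M/G/1 queue using the P-K formula.

    Assumptions (not spelled out in the excerpt): rho = lambda * X and
    E[x^2] = sigma_x^2 + X^2.
    """
    rho = arrival_rate * mean_service  # server utilization
    if not 0 <= rho < 1:
        raise ValueError("the queue is only stable for 0 <= rho < 1")
    second_moment = std_service ** 2 + mean_service ** 2  # E[x^2]
    wait = arrival_rate * second_moment / (2 * (1 - rho))  # W (P-K formula)
    queue_len = arrival_rate * wait  # N_Q = lambda * W (Little's Law)
    return rho, wait, queue_len


# Exponential service (sigma_x = X) versus deterministic service (sigma_x = 0)
# at the same load: less service-time variance means a shorter wait.
print(mg1_metrics(0.5, 1.0, 1.0))
print(mg1_metrics(0.5, 1.0, 0.0))
```

With $\lambda=0.5$ and $X=1$, the exponential case gives $W=1$ and the deterministic case gives $W=0.5$, which shows how the variance of the service time enters through $E[x^2]$.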
How to write a good introduction
Importance
- it is the most-read part
- it determines the reader's attitude
- it is the most delicate part
The earlier you write the introduction, the more you will have to polish it in the end. It is strongly recommended to write the introduction after you finish the main body, or at least the draft of the whole paper.
Writing grammar
Tense
- Present Simple / Present Continuous: this pair is used to state accepted facts and truths.
- Past Simple / Present Perfect: this pair is used to discuss the knowledge
Paper publication issues
Categories of papers:
- General article
- Theoretical article
- Technical article
- Report of significant achievements
- Other kinds of articles
Classification of journals:
- Journal
- Rapid communication / letters
- Review
- Proceedings
- Other technical documents: standards, patents, etc.
Procedures of paper submission:
In advance: choose target journals
- necessary selection: read a few papers published in the journal; this is best done before you write your paper
- one paper can only be submitted to one journal
Ergodic process: As nothing is generated or lost in the queuing system, the arrival rate of packages equals the departure rate.
Little's Law
Stochastic processes and Markov chains
Given an appropriate state space and time variable, a stochastic process can be properly described. If the time variable is discrete, the process is called a discrete-time process; otherwise, it is called a continuous-time process.
$\mu$ is the number of customers that can be served per unit time (service rate)
$\lambda$
BangyaoWang
不啻微芒,造炬成阳