weekly-summary-20250608
Tech
Prompt Hub
https://smith.langchain.com/hub/
Microsoft launches NLWeb: Microsoft’s Protocol for AI-Powered Website Search
https://glama.ai/blog/2025-06-01-what-is-nlweb
Data visualization components for Vue
https://github.com/graphieros/vue-data-ui
How do you measure the share of code contributed by AI?
https://mp.weixin.qq.com/s/6_vPZU5fz8oRPulwWrlqBg
Interview with core Cursor team members: our key judgments about AI programming
https://mp.weixin.qq.com/s/Woqnmyg3Nmn2J3P7Z1WZ4w
Video:
https://www.youtube.com/watch?v=sLaxGAL_Pl0
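They noted that the current bottleneck for coding models is not only model capability itself, but also that the “feedback mechanism” is not yet well designed. Evals, for example, used to come up a lot, but without real, effective feedback signals (such as whether the user kept the code or accepted the model's suggestion), it is hard for a model to truly learn what a “good” edit is.
In some cases you do need to split the task into smaller parts and have the model get each part right, which reduces the sparse-reward problem.
A reasonable future direction is probably a combination of mechanisms. For example, one mechanism could digest 100 million tokens in a single pass; each token yields little information, but the model gets a rough picture of the codebase. Then, when you actually need to do something, the model remembers which parts are relevant and refreshes its memory of those. This approach may be the most promising.
For example, an agent like o3 keeps fetching content until it has built the right context, and only then solves the problem. I expect future models to call tools continuously for a long time before making a decision.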
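The "keep fetching until the context is right" behavior can be sketched as a simple loop. This is only an illustration: `gather_context`, the `tool`, `is_sufficient`, and `next_query` callbacks, and the toy `files` dictionary are all hypothetical names made up here, not anything described in the interview or taken from a real agent framework.

```python
from typing import Callable


def gather_context(
    task: str,
    tool: Callable[[str], str],
    is_sufficient: Callable[[list[str]], bool],
    next_query: Callable[[str, list[str]], str],
    max_calls: int = 20,
) -> list[str]:
    """Call a tool repeatedly until the collected context looks sufficient.

    All callbacks are hypothetical stand-ins: a real agent would let the
    model choose the next query and judge sufficiency itself.
    """
    context: list[str] = []
    for _ in range(max_calls):  # hard cap so the loop always terminates
        if is_sufficient(context):
            break
        context.append(tool(next_query(task, context)))
    return context


# Toy stand-in for a codebase: the "tool" just reads a file by name.
files = {"a.py": "import os", "b.py": "def parse(x): ...", "c.py": "print('hi')"}
names = list(files)

ctx = gather_context(
    task="fix parse",
    tool=lambda name: files[name],
    is_sufficient=lambda c: any("def parse" in s for s in c),
    next_query=lambda task, c: names[len(c)],  # naive: read files in order
)
print(ctx)
```

In this toy run the loop stops after the second file, as soon as the snippet defining `parse` is in the context.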
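I think “long context” or “codebase-specific models” will matter a lot. As long as a model can reuse previously accumulated knowledge and its understanding of the code structure, instead of re-understanding everything each time, it will be much more efficient, and it only needs to output the key information when generating an answer.
– This may be exactly what llms.txt is for.
This is quite inspiring for our own AI products (such as narrative visualization): how can we use RL to judge the quality of generated results?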
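One possible starting point for that question, as a minimal sketch: turn the user feedback signals mentioned in the interview (was the suggestion accepted, was the output kept) into a per-subtask reward, so the signal is denser than a single end-of-task score. The field names `accepted` and `kept_fraction` and the 0.5 weight are assumptions chosen for illustration, not something the interview specifies.

```python
from dataclasses import dataclass


@dataclass
class SubtaskFeedback:
    """Hypothetical feedback record for one sub-step of a larger generation task."""
    accepted: bool        # did the user accept the model's suggestion?
    kept_fraction: float  # share of the output still present later, 0.0 to 1.0


def subtask_reward(fb: SubtaskFeedback, accept_weight: float = 0.5) -> float:
    """Blend acceptance and retention into a reward in [0, 1].

    The weighting is arbitrary and would need tuning against real data.
    """
    if not 0.0 <= fb.kept_fraction <= 1.0:
        raise ValueError("kept_fraction must be in [0, 1]")
    accepted_score = 1.0 if fb.accepted else 0.0
    return accept_weight * accepted_score + (1.0 - accept_weight) * fb.kept_fraction


def dense_rewards(feedback: list[SubtaskFeedback]) -> list[float]:
    """One reward per subtask instead of one sparse reward for the whole task."""
    return [subtask_reward(fb) for fb in feedback]


steps = [
    SubtaskFeedback(accepted=True, kept_fraction=1.0),
    SubtaskFeedback(accepted=True, kept_fraction=0.5),
    SubtaskFeedback(accepted=False, kept_fraction=0.0),
]
rewards = dense_rewards(steps)
print(rewards)  # [1.0, 0.75, 0.0]
```

For something like narrative visualization, "kept" could plausibly mean that the user left a generated chart or caption in the final story, but that mapping is a guess and would have to be validated.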
Breakthrough Method of Agile (ai-driven) Development
https://github.com/bmadcode/BMAD-METHOD
Will the Model Eat Your Stack?
https://www.dbreunig.com/2025/05/27/will-the-model-eat-your-stack.html
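We need to think about what we can build at this stage that will hold up for a while and is relatively less exposed to LLM upgrades.
Thoughts
Excerpts from good articles
What is an AI engineer
Asked what an AI engineer is, Janvi said the core of an AI Product Engineer is building products on top of models. That involves a lot of experimentation and prototyping, and ultimately shipping the product to production. The foundation is still software engineering, but it requires some domain-specific skills such as fine-tuning, writing good prompts, hosting open-source models, and building reliable evals.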
https://www.zhihu.com/question/10260155433/answer/1911925562181678282
Study it, and even the hard becomes easy!
Are the things of this world hard or easy? Study them, and even the hard becomes easy!
Hype Coding
https://simonwillison.net/2025/May/31/steve-krouse/
There’s a new kind of coding I call “hype coding” where you fully give into the hype, and what’s coming right around the corner, that you lose sight of what’s possible today. Everything is changing so fast that nobody has time to learn any tool, but we should aim to use as many as possible. Any limitation in the technology can be chalked up to a ‘skill issue’ or that it’ll be solved in the next AI release next week. Thinking is dead. Turn off your brain and let the computer think for you. Scroll on tiktok while the armies of agents code for you. If it isn’t right, tell it to try again. Don’t read. Feed outputs back in until it works. If you can’t get it to work, wait for the next model or tool release. Maybe you didn’t use enough MCP servers? Don’t forget to add to the hype cycle by aggrandizing all your successes. Don’t read this whole tweet, because it’s too long. Get an AI to summarize it for you. Then call it “cope”. Most importantly, immediately mischaracterize “hype coding” to mean something different than this definition. Oh the irony! The people who don’t care about details don’t read the details about not reading the details