When we looked, use of “goblin” in ChatGPT had risen by 175% after the launch of GPT‑5.1, while “gremlin” had risen by 52%.
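This striking data shows that a seemingly harmless preference can spread quickly through a model, underscoring the importance of monitoring model behavior and responding promptly when it changes.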
The release includes DeepSeek-V4-Pro (1.6T total / 49B active) and DeepSeek-V4-Flash (284B total / 13B active), both trained natively at 1M context length.
The sheer scale of the DeepSeek V4 models is striking, and it points to significant progress in long-context processing.
These papers suggest that strategic data engineering and inference-time optimization can substitute for raw parameter count.
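This view proposes improving model performance through data engineering and inference-time optimization, offering a new direction for model optimization.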
🔹 **DeepSeek-V4-Flash:** 284B total / 13B active params. Your fast, efficient, and economical choice.
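DeepSeek-V4-Flash is markedly smaller than the Pro version: 284 billion total parameters and 13 billion active. Its active-to-total ratio is about 4.6%, slightly higher than Pro's. This design lets it respond faster and cost less while preserving performance, which suits applications that need quick responses.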
🔹 **DeepSeek-V4-Pro:** 1.6T total / 49B active params. Performance rivaling the world's top closed-source models.
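This gives the concrete parameter figures for DeepSeek-V4-Pro: 1.6 trillion total parameters and 49 billion active. That scale far exceeds most open-source models and approaches the top closed-source ones. The parameter efficiency ratio (active / total) is about 3%, which indicates sparse activation and is likely key to its balance of performance and efficiency.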
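The active-to-total ratios mentioned in the two notes above can be checked with a few lines of arithmetic. This is only a sketch: the parameter counts are the ones quoted in the excerpts, while the dictionary layout and the `active_ratio` helper are names made up for illustration.

```python
# Parameter counts in billions, as quoted in the excerpts above.
# The dictionary layout and function name are illustrative assumptions.
MODELS = {
    "DeepSeek-V4-Pro": {"total_b": 1600, "active_b": 49},
    "DeepSeek-V4-Flash": {"total_b": 284, "active_b": 13},
}


def active_ratio(total_b: float, active_b: float) -> float:
    """Return the fraction of parameters that are active per token."""
    return active_b / total_b


for name, params in MODELS.items():
    print(f"{name}: {active_ratio(**params):.1%} of parameters active")
```

Both ratios land in the low single digits, which is consistent with the notes' reading that the models rely on sparse activation.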
Two variants are available: **Sakana Fugu Mini 🐟**, optimized with latency in mind, and **Sakana Fugu Ultra 🐡**, the full orchestration system, optimized for performance for demanding tasks.
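The article mentions two variants, Mini (latency-optimized) and Ultra (performance-optimized), but gives no concrete figures for how they differ, such as the percentage reduction in latency or the gain in throughput. Without quantified numbers it is hard to assess how the two variants differ in practice.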
GPQAD | 94.4 | 90.9 | 92.7 | 92.4 | **95.1**
LCBv6 | 90.3 | 92.1 | 92.4 | 90.4 | **93.2**
SWEPro | 48.4 | 51.2 | _53.4_ | 51.3 | **54.2**
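The comparison table shows Sakana Fugu Ultra ahead of its competitors on all three benchmarks: 95.1 on GPQAD (above Gemini 3.1's 94.4), 93.2 on LCBv6 (above GPT 5.4's 92.1), and 54.2 on SWEPro (above Opus 4.6's 53.4). These figures suggest that its multi-model orchestration strategy does deliver a performance gain, with a clear advantage on scientific reasoning tasks in particular.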
The best-performing model across these three metrics was a pair of independent linear trends: one for reasoning models and one for non-reasoning models.
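This model-selection result (best on all three metrics, 100%) suggests that splitting models into reasoning and non-reasoning classes gives the best predictive model. It provides strong statistical evidence that reasoning capability may be a key driver of accelerating AI progress. However, the article does not explain in detail how a reasoning model is defined, which may affect the reliability of the result.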
Reasoning models show both a one-off jump in performance and a roughly 2-3x faster trend compared to non-reasoning models.
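This is an important comparison: reasoning models are improving 2-3x faster than non-reasoning models. That is a substantial speed-up, and it hints that the breakthrough in reasoning may mark a turning point in AI development. However, the article gives no specific benchmark data to support the multiplier, so it should be treated with caution.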
The best-performing model across these three metrics was a pair of independent linear trends: one for reasoning models and one for non-reasoning models.
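This finding indicates that reasoning and non-reasoning models really do follow markedly different trajectories. The model with separate linear trends performed best on all three metrics, beating the alternatives 100% of the time, which provides strong statistical evidence for the claim that AI capabilities are accelerating.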
Reddit, Shutterstock, and News Corp are making hundreds of millions a year licensing their high-quality data to companies training AI, and those contracts are growing about 20 percent annually, according to their quarterly filings.
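These figures reveal the large economic value of the market for AI training data and show that high-quality data has become a strategic asset for AI companies. Traditional content companies are turning into "input companies" for AI, a shift that changes their business models and redefines the central place of data in the AI ecosystem.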
SOTA models of different architectures and parameter scales exhibit highly consistent failure patterns on the same set of hard samples, suggesting that the performance bottleneck stems from shared deficiencies in training data rather than architecture itself.
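Most people assume that models with different architectures will have different failure modes and weaknesses. The authors find instead that SOTA models, regardless of architecture and parameter scale, fail in highly consistent ways on the same hard samples. This suggests that the performance bottleneck comes from shared deficiencies in training data rather than from architectural differences, which challenges the conventional case for model diversity.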
Without any architectural modification, MinerU2.5-Pro achieves 95.69 on OmniDocBench v1.6, improving over the same-architecture baseline by 2.71 points and surpassing all existing methods including models with over 200× more parameters.
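Most people assume that a larger model architecture necessarily brings better performance. Using only data engineering and training-strategy optimization, and keeping the 1.2B-parameter architecture unchanged, the authors surpass existing models with more than 200 times as many parameters. This challenges the industry consensus that bigger is better and demonstrates the importance of data quality.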
reserved words
Perhaps a sort of protobuf is better.
The consensus is reached in the same way as for transactions, i.e. using the hashgraph consensus algorithm. The only difference is that the events concerned in the hashgraph now contain another type of data instead of transactions.
Not necessarily, how to store received events is an implementation detail. One could dump them in an array on a side. Can be as efficient as array of pointers to events. Where idx of this array is event's position in total order.
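The side-array idea in the note above can be sketched as follows. This is a hypothetical illustration, not any hashgraph implementation: the class name, its methods, and the extra reverse-lookup dictionary are all assumptions added for the example.

```python
class OrderedEventLog:
    """Stores events so that the list index equals the position in total order.

    Hypothetical sketch: events are appended only once consensus has fixed
    their position, so the index never changes afterwards.
    """

    def __init__(self):
        self._events = []    # index == position in total order
        self._position = {}  # event id -> index (an added convenience)

    def append(self, event_id, payload):
        """Record the next event in total order and return its position."""
        if event_id in self._position:
            raise ValueError(f"event {event_id!r} is already ordered")
        position = len(self._events)
        self._events.append((event_id, payload))
        self._position[event_id] = position
        return position

    def at(self, position):
        """Return the (event_id, payload) pair at a total-order position."""
        return self._events[position]

    def position_of(self, event_id):
        """Return the total-order position of a known event."""
        return self._position[event_id]


log = OrderedEventLog()
log.append("e1", {"kind": "transaction"})
log.append("e2", {"kind": "other-data"})
print(log.at(1), log.position_of("e1"))
```

Lookup by position is a plain list index, which is the efficiency the note compares to an array of pointers to events.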
research data life cycle
Annotated with RDA Tags: Working groups
an object-oriented approach to data modelling – where data is described in terms of classes, attributes, and associations
Conceptual data model: describes the semantics of a domain, being the scope of the model. For example, it may be a model of the interest area of an organization or industry. This consists of entity classes, representing kinds of things of significance in the domain, and relationship assertions about associations between pairs of entity classes. A conceptual schema specifies the kinds of facts or propositions that can be expressed using the model. In that sense, it defines the allowed expressions in an artificial 'language' with a scope that is limited by the scope of the model.
"Data models for different systems are arbitrarily different. The result of this is that complex interfaces are required between systems that share data. These interfaces can account for between 25-70% of the cost of current systems".
The term data model can refer to two distinct but closely related concepts
A data model can sometimes be referred to as a data structure, especially in the context of programming languages.
Sometimes it refers to an abstract formalization of the objects and relationships found in a particular application domain
A data model[1][2][3][4][5] is an abstract model that organizes elements of data and standardizes how they relate to one another and to the properties of real-world entities.
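The claim that individual learning may depend on the behavior of others highlights the importance of viewing the learning environment as a system of multiple interacting participants.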
You own your success.
ReconfigBehSci. (2022, January 24). @STWorg @FraserNelson @GrahamMedley no worse- he took Medley’s comment that Sage model the scenarios the government asks them to consider to mean that they basically set out to find the justification for what the government already wanted to do. Complete failure to distinguish between inputs and outputs of a model [Tweet]. @SciBeh. https://twitter.com/SciBeh/status/1485625862645075970
ReconfigBehSci [@SciBeh]. (2021, November 1). RT @HJWesteneng: Growth advantage and extrapolation of AY.4.2 based on Sanger Institute data in the UK (multilevel multinomial model). Base… [Tweet]. Twitter. https://twitter.com/SciBeh/status/1455467011509731332
Dr Nisreen Alwan 🌻. (2020, March 14). Our letter in the Times. ‘We request that the government urgently and openly share the scientific evidence, data and modelling it is using to inform its decision on the #Covid_19 public health interventions’ @richardhorton1 @miriamorcutt @devisridhar @drannewilson @PWGTennant https://t.co/YZamKCheXH [Tweet]. @Dr2NisreenAlwan. https://twitter.com/Dr2NisreenAlwan/status/1238726765469749248
Country State (belongs to country) City (belongs to State) Neighborhood (belongs to city)
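One way to read the hierarchy above is as a chain of "belongs to" references, each level pointing at its parent. The sketch below assumes that reading; the class and field names, the `full_path` helper, and the placeholder values are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Country:
    name: str


@dataclass(frozen=True)
class State:
    name: str
    country: Country  # State belongs to Country


@dataclass(frozen=True)
class City:
    name: str
    state: State  # City belongs to State


@dataclass(frozen=True)
class Neighborhood:
    name: str
    city: City  # Neighborhood belongs to City


def full_path(neighborhood: Neighborhood) -> str:
    """Walk the parent references from the neighborhood up to the country."""
    city = neighborhood.city
    state = city.state
    parts = [state.country.name, state.name, city.name, neighborhood.name]
    return " > ".join(parts)


# Placeholder values, not real data.
country = Country("Country A")
state = State("State B", country)
city = City("City C", state)
print(full_path(Neighborhood("Neighborhood D", city)))
```

Storing only the parent reference keeps each level normalized: a neighborhood's country is derived by walking up the chain rather than being stored twice.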
Liu, C., Yang, Y., Chen, B., Cui, T., Shang, F., & Li, R. (2022). Revealing spatio-temporal interaction patterns behind complex cities. ArXiv:2201.02117 [Physics]. http://arxiv.org/abs/2201.02117
Padilla, L., Hosseinpour, H., Fygenson, R., Howell, J., Chunara, R., & Bertini, E. (2021). Effects of COVID-19 Uncertainty Visualizations on Novice Risk Estimates. PsyArXiv. https://doi.org/10.31234/osf.io/6axc7
Cepelewicz, J. (n.d.). The Hard Lessons of Modeling the Coronavirus Pandemic. Quanta Magazine. Retrieved February 11, 2021, from https://www.quantamagazine.org/the-hard-lessons-of-modeling-the-coronavirus-pandemic-20210128/
Nsoesie, E. O., Oladeji, O., Abah, A. S. A., & Ndeffo-Mbah, M. L. (2021). Forecasting influenza-like illness trends in Cameroon using Google Search Data. Scientific Reports, 11(1), 6713. https://doi.org/10.1038/s41598-021-85987-9
Cantwell, G. T., Kirkley, A., & Newman, M. E. J. (2020). The friendship paradox in real and model networks. ArXiv:2012.03991 [Physics]. http://arxiv.org/abs/2012.03991
We love dbt because of the values it embodies. Individual transformations are SQL SELECT statements, without side effects. Transformations are explicitly connected into a graph. And support for testing is first-class. dbt is hugely enabling for an important class of users, adapting software engineering principles to a slightly different domain with great ergonomics. For users who already speak SQL, dbt’s tooling is unparalleled.
when using [[dbt]] the [[transformations]] are [[SQL statements]] - already something that our team knows
The attribution data model
In reality, it’s impossible to know exactly why someone converted to being a customer. The best thing that we can do as analysts, is provide a pretty good guess. In order to do that, we’re going to use an approach called positional attribution. This means, essentially, that we’re going to weight the importance of various touches (customer interactions with a brand) based on their position (the order they occur in within the customer’s lifetime).
To do this, we’re going to build a table that represents every “touch” that someone had before becoming a customer, and the channel that led to that touch.
One of the goals of an [[attribution data model]] is to understand why someone [[converted]] to being a customer. This is impossible to do accurately, but this is where analysis comes in.
There are some [[approaches to attribution]], one of those is [[positional attribution]]
[[positional attribution]] means weighting the importance of touch points, or customer interactions, by their position within the customer lifetime.
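A minimal sketch of positional weighting follows. The excerpt does not give the weights, so the 40% first touch / 40% last touch split with the remainder shared across middle touches is an assumption, as are the function names and the sample channels.

```python
def positional_weights(n_touches, first=0.4, last=0.4):
    """Return one weight per touch, in order, summing to 1.

    The 40/20/40 split is an assumed scheme, not taken from the source.
    """
    if n_touches <= 0:
        return []
    if n_touches == 1:
        return [1.0]
    if n_touches == 2:
        # No middle touches: split credit in proportion to first and last.
        total = first + last
        return [first / total, last / total]
    middle = (1.0 - first - last) / (n_touches - 2)
    return [first] + [middle] * (n_touches - 2) + [last]


def attribute(touches):
    """Sum positional credit per channel for an ordered list of touches."""
    credit = {}
    weights = positional_weights(len(touches))
    for channel, weight in zip(touches, weights):
        credit[channel] = credit.get(channel, 0.0) + weight
    return credit


# Hypothetical customer journey, ordered from first touch to last.
print(attribute(["paid_search", "email", "email", "direct"]))
```

The list passed to `attribute` plays the role of the per-customer touch table described in the excerpt: one entry per touch, ordered by when it happened, labeled with the channel that led to it.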
ORWG Virtual Meeting 08/09/2020 https://www.youtube.com/playlist?list=PLOA0aRJ90NxvXtMt5Si5ukmR9LYfvDueB (n.d.)
Karatayev, Vadim A., Madhur Anand, and Chris T. Bauch. ‘Local Lockdowns Outperform Global Lockdown on the Far Side of the COVID-19 Epidemic Curve’. Proceedings of the National Academy of Sciences 117, no. 39 (29 September 2020): 24575–80. https://doi.org/10.1073/pnas.2014385117.
Stock, James H. ‘Data Gaps and the Policy Response to the Novel Coronavirus’. Working Paper. Working Paper Series. National Bureau of Economic Research, March 2020. https://doi.org/10.3386/w26902.
Baqaee, D., Farhi, E., Mina, M. J., & Stock, J. H. (2020). Reopening Scenarios (Working Paper No. 27244; Working Paper Series). National Bureau of Economic Research. https://doi.org/10.3386/w27244
Another Ruby gem, Spira, allows graph data to be used as model objects
Fernández-Villaverde, J., & Jones, C. I. (2020). Estimating and Simulating a SIRD Model of COVID-19 for Many Countries, States, and Cities (Working Paper No. 27128; Working Paper Series). National Bureau of Economic Research. https://doi.org/10.3386/w27128
Brooks, H. Z., Kanjanasaratool, U., Kureh, Y. H., & Porter, M. A. (2020). Disease Detectives: Using Mathematics to Forecast the Spread of Infectious Diseases [Preprint]. SocArXiv. https://doi.org/10.31235/osf.io/mvn9z
Gupta, H., & Porter, M. A. (2020). Mixed Logit Models and Network Formation. ArXiv:2006.16516 [Physics, Stat]. http://arxiv.org/abs/2006.16516
Rahman, M., Ali, G. G. M. N., Li, X. J., Paul, K. C., & Chong, P. H. J. (2020). Twitter and Census Data Analytics to Explore Socioeconomic Factors for Post-COVID-19 Reopening Sentiment [Preprint]. PsyArXiv. https://doi.org/10.31234/osf.io/fz4ry
The diagram was generated with rails-erd
Domain Model
Response to “Modelling the pandemic”: Reconsidering the quality of evidence from epidemiological models. (2020). https://www.bmj.com/content/369/bmj.m1567/rr-0
Krönke, J., Wunderling, N., Winkelmann, R., Staal, A., Stumpf, B., Tuinenburg, O. A., & Donges, J. F. (2020). Dynamics of tipping cascades on complex networks. Physical Review E, 101(4), 042311. https://doi.org/10.1103/PhysRevE.101.042311
Edelsbrunner, P. A., & Thurn, C. (2020, April 22). Improving the Utility of Non-Significant Results for Educational Research. https://doi.org/10.31234/osf.io/j93a2
Etilé, F., Johnston, D., Frijters, P., & Shields, M. (2020, April 16). Psychological Resilience to Major Socioeconomic Life Events. https://doi.org/10.31234/osf.io/vp48c
The Web Annotation Data Model specification describes a structured model and format to enable annotations to be shared and reused across different hardware and software platforms.
The publication of this web standard changed everything. I look forward to true testing of interoperable open annotation. The publication of the standard nearly three years ago was a game changer, but the game is still in progress. The future potential is unlimited!
This article is a great example of a research model in measuring outcomes of adult learning.
On the other hand, a resource may be generic in that as a concept it is well specified but not so specifically specified that it can only be represented by a single bit stream. In this case, other URIs may exist which identify a resource more specifically. These other URIs identify resources too, and there is a relationship of genericity between the generic and the relatively specific resource.
I was not aware of this page when the Web Annotations WG was working through its specifications. The word "Specific Resource" used in the Web Annotations Data Model Specification always seemed adequate, but now I see that it was actually quite a good fit.
modelling UK parliament
The importance of models may need to be underscored in this age of “big data” and “data mining”. Data, no matter how big, can only tell you what happened in the past. Unless you’re a historian, you actually care about the future — what will happen, what could happen, what would happen if you did this or that. Exploring these questions will always require models. Let’s get over “big data” — it’s time for “big modeling”.
Num / Num summarizes a graph with nodes / arcs.
The underlying graph model is not explicitly mentioned here or in the plugin's README.