230 Matching Annotations
  1. Sep 2023
    1. During the discussion, Musk latched on to a key fact the team had discovered: The neural network did not work well until it had been trained on at least a million video clips.
    2. By early 2023, the neural network planner project had analyzed 10 million clips of video collected from the cars of Tesla customers. Did that mean it would merely be as good as the average of human drivers? “No, because we only use data from humans when they handled a situation well,” Shroff explained. Human labelers, many of them based in Buffalo, New York, assessed the videos and gave them grades. Musk told them to look for things “a five-star Uber driver would do,” and those were the videos used to train the computer.
    3. The “neural network planner” that Shroff and others were working on took a different approach. “Instead of determining the proper path of the car based on rules,” Shroff says, “we determine the car’s proper path by relying on a neural network that learns from millions of examples of what humans have done.” In other words, it’s human imitation. Faced with a situation, the neural network chooses a path based on what humans have done in thousands of similar situations. It’s like the way humans learn to speak and drive and play chess and eat spaghetti and do almost everything else; we might be given a set of rules to follow, but mainly we pick up the skills by observing how other people do them.
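
      What's described here is essentially imitation learning (behavior cloning): treat the recorded human driving as supervised labels and fit a model that maps a situation to the action a well-rated driver took. A minimal sketch of that framing, with made-up feature and action arrays and a linear policy standing in for the real network (nothing here reflects Tesla's actual system):

        import numpy as np

        # Hypothetical training data: each row of X encodes a driving situation
        # (distances, speeds, lane geometry, ...); each row of Y is the action a
        # well-rated human driver took (steering angle, acceleration).
        rng = np.random.default_rng(0)
        X = rng.normal(size=(10_000, 32))          # 10k example situations
        Y = rng.normal(size=(10_000, 2))           # the humans' recorded actions

        # Behavior cloning reduces to supervised learning: find a policy that
        # reproduces the human actions. Here the "policy" is just linear least
        # squares; a real system would use a deep network.
        W, *_ = np.linalg.lstsq(X, Y, rcond=None)

        def policy(situation: np.ndarray) -> np.ndarray:
            """Predict an action for a new situation by imitating the training data."""
            return situation @ W

        print(policy(X[0]))  # action the cloned policy would take in the first situation
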
    1. Wang et al. "Scientific discovery in the age of artificial intelligence", Nature, 2023.

      A paper about the current state of using AI/ML for scientific discovery, connected with the AI4Science workshops at major conferences.

      (NOTE: since Springer/Nature don't allow public PDFs to be linked without a paywall, we can't use Hypothesis directly on the PDF of the paper; this link is to the website version of it.)

  2. Aug 2023
    1. Title: Delays, Detours, and Forks in the Road: Latent State Models of Training Dynamics. Authors: Michael Y. Hu, Angelica Chen, Naomi Saphra, Kyunghyun Cho. Note: This paper seems cool, using older, interpretable machine learning models (graphical models) to understand what is going on inside a deep neural network.

      Link: https://arxiv.org/pdf/2308.09543.pdf

  3. Jul 2023
    1. Shayan Shirahmad Gale Bagi, Zahra Gharaee, Oliver Schulte, and Mark Crowley. Generative Causal Representation Learning for Out-of-Distribution Motion Forecasting. In International Conference on Machine Learning (ICML), Honolulu, Hawaii, USA, Jul. 2023.

  4. Jun 2023
    1. We use the same model and architecture as GPT-2

      What do they mean by "model" here? If they have retrained on more data, with a slightly different architecture, then the model weights after training must be different.

    1. Recent work in computer vision has shown that common image datasets contain a non-trivial amount of near-duplicate images. For instance CIFAR-10 has 3.3% overlap between train and test images (Barz & Denzler, 2019). This results in an over-reporting of the generalization performance of machine learning systems.

      CIFAR-10 performance results are overestimates since some of the training data is essentially in the test set.
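
      A quick way to estimate this kind of overlap is to hash every training image with a perceptual hash and count how many test images collide. A rough sketch using a simple average hash in NumPy (the placeholder arrays and the exact-collision criterion are illustrative, not the near-duplicate procedure Barz & Denzler used):

        import numpy as np

        def average_hash(img: np.ndarray, size: int = 8) -> int:
            """Downscale a grayscale image to size x size blocks and hash it by
            thresholding each block against the mean brightness."""
            h, w = img.shape
            small = img[:h - h % size, :w - w % size]
            small = small.reshape(size, h // size, size, w // size).mean(axis=(1, 3))
            bits = (small > small.mean()).flatten()
            return int("".join("1" if b else "0" for b in bits), 2)

        # Placeholder data standing in for CIFAR-10 train/test images (32x32 grayscale).
        rng = np.random.default_rng(0)
        train = rng.integers(0, 256, size=(500, 32, 32))
        test = rng.integers(0, 256, size=(100, 32, 32))

        train_hashes = {average_hash(im) for im in train}
        overlap = sum(average_hash(im) in train_hashes for im in test)
        print(f"test images colliding with a training hash: {overlap / len(test):.1%}")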

    1. Blog post comparing ASG (Auto Segmentation Criterion - yes, the last letter doesn't match) to CTC (Connectionist Temporal Classification) for aligning speech recognition model outputs with a transcript.
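
      For reference, CTC is available directly in PyTorch as nn.CTCLoss; a minimal, hedged usage sketch with random tensors standing in for real acoustic-model outputs and transcripts:

        import torch
        import torch.nn as nn

        # Dummy shapes: T time frames, N batch size, C output classes (class 0 = blank),
        # S transcript length.
        T, N, C, S = 50, 4, 20, 10

        # Frame-level log-probabilities, e.g. from an acoustic model.
        log_probs = torch.randn(T, N, C, requires_grad=True).log_softmax(dim=2)
        targets = torch.randint(1, C, (N, S), dtype=torch.long)      # transcript label ids
        input_lengths = torch.full((N,), T, dtype=torch.long)
        target_lengths = torch.full((N,), S, dtype=torch.long)

        # CTC marginalizes over every alignment of the transcript to the frames,
        # so no frame-level alignment labels are needed.
        ctc_loss = nn.CTCLoss(blank=0)
        loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)
        loss.backward()
        print(loss.item())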

  5. May 2023
    1. Minimum sample size for external validation of a clinical prediction model with a binary outcome

      Minimum sample size for external validation of a clinical prediction model with a binary outcome

  6. Mar 2023
    1. we have turned to machine learning, an ingenious way of disclaiming responsibility for anything. Machine learning is like money laundering for bias. It's a clean, mathematical apparatus that gives the status quo the aura of logical inevitability. The numbers don't lie.

      Machine learning like money laundering for bias

    1. RSFCR can directly model non-linear effects and interactions to perform accurate prediction without making any prior assumptions about the underlying data.

      Important. Effects and interactions can be modeled to make accurate predictions without having to satisfy any prior assumption.

    2. The aims of this manuscript can be summarised as: (i) examination of extensions of PLANNCR method (PLANNCR extended) for the development and validation of prognostic clinical prediction models with competing events, (ii) systematic evaluation of model-predictive performance for ML techniques (PLANNCR original, PLANNCR extended, RSFCR) and SM (cause-specific Cox, Fine-Gray) regarding discrimination and calibration, (iii) investigation of the potential role of ML in contrast to conventional regression methods for CRs in non-complex eSTS data (small/medium sample size, low dimensional setting), (iv) practical utility of the methods for prediction

      Aims of the study

    3. Nowadays, there is a growing interest in applying machine learning (ML) for prediction (diagnosis or prognosis) of clinical outcomes [12, 13] which has sparked a debate regarding the added value of ML techniques versus SM in the medical field. Criticism is attributed to ML prediction models. Despite no assumptions about the data structure are made, and being able to naturally incorporate interactions between predictive features, they are prone to overfitting of the training data and they lack extensive assessment of predictive accuracy (i.e., absence of calibration curves) [14, 15]. On the other hand, traditional regression methods are considered straightforward to use and harder to overfit. That being said, they do make certain (usually strong) assumptions such as the proportional hazards over time for the Cox model, and require manual pre-specification of interaction terms.

      Pros and cons of machine learning versus traditional regression survival analysis such as KM-SV

    4. In health research, several chronic diseases are susceptible to competing risks (CRs). Initially, statistical models (SM) were developed to estimate the cumulative incidence of an event in the presence of CRs. As recently there is a growing interest in applying machine learning (ML) for clinical prediction, these techniques have also been extended to model CRs but literature is limited. Here, our aim is to investigate the potential role of ML versus SM for CRs within non-complex data (small/medium sample size, low dimensional setting).

      Comparison between statistical models and machine learning models for competing risks.
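
      From the SM side, the cause-specific Cox model is a common baseline: fit one Cox regression per event type, treating the competing event as censoring at its event time. A rough sketch with the lifelines library (synthetic data and column names are placeholders, not the paper's setup):

        import numpy as np
        import pandas as pd
        from lifelines import CoxPHFitter

        # Synthetic competing-risks data: event = 0 (censored), 1 (event of interest),
        # 2 (competing event). Covariates x1, x2 are placeholders.
        rng = np.random.default_rng(0)
        n = 500
        df = pd.DataFrame({
            "time": rng.exponential(5, n),
            "event": rng.choice([0, 1, 2], size=n, p=[0.3, 0.4, 0.3]),
            "x1": rng.normal(size=n),
            "x2": rng.normal(size=n),
        })

        # Cause-specific Cox for event type 1: competing events (type 2) are treated
        # as censored at their event time.
        cs = df.assign(event1=(df["event"] == 1).astype(int))
        cox_event1 = CoxPHFitter().fit(cs[["time", "event1", "x1", "x2"]],
                                       duration_col="time", event_col="event1")
        cox_event1.print_summary()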

  7. Feb 2023
    1. No new physics and no new mathematics were discovered by the AI. The AI did, however, deduce something from the existing math and physics that no one else had yet seen. Skynet is not coming for us yet.

    1. https://pair.withgoogle.com/

      People + AI Research (PAIR) is a multidisciplinary team at Google that explores the human side of AI by doing fundamental research, building tools, creating design frameworks, and working with diverse communities.

    1. There’s a holy trinity in machine learning: models, data, and compute. Models are algorithms that take inputs and produce outputs. Data refers to the examples the algorithms are trained on. To learn something, there must be enough data with enough richness that the algorithms can produce useful output. Models must be flexible enough to capture the complexity in the data. And finally, there has to be enough computing power to run the algorithms.

      “Holy trinity” of machine learning: models, data, and compute

      Models in 1990s, starting with convolutional neural networks for computer vision.

      Data in 2009 in the form of labeled images from Stanford AI researchers.

      Compute in 2006 with Nvidia’s CUDA programming language for GPUs.

      AlexNet in 2012 combined all of these.

  8. Jan 2023
    1. We can have a machine learning model which gives more than 90% accuracy for classification tasks but fails to recognize some classes properly due to imbalanced data or the model is actually detecting features that do not make sense to be used to predict a particular class.

      Quality measures for a machine learning model
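
      A quick way to see the problem is to compare plain accuracy with per-class metrics on a deliberately imbalanced toy problem; a small scikit-learn sketch:

        import numpy as np
        from sklearn.metrics import (accuracy_score, balanced_accuracy_score,
                                     classification_report)

        # Toy imbalanced problem: 95% of samples are class 0.
        y_true = np.array([0] * 95 + [1] * 5)
        # A lazy "model" that always predicts the majority class.
        y_pred = np.zeros_like(y_true)

        print("accuracy:         ", accuracy_score(y_true, y_pred))           # 0.95
        print("balanced accuracy:", balanced_accuracy_score(y_true, y_pred))  # 0.50
        print(classification_report(y_true, y_pred, zero_division=0))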

  9. Dec 2022
    1. In zero-shot learning, the model must be able to generalize what it has learned from previous examples in order to perform a task it has never been trained on. This means the model must be able to transfer the knowledge it acquired on a given task to a new task, without needing specific training examples for that new task.

      0-shot learning
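
      For a concrete example of what zero-shot use looks like in practice, here is a hedged sketch with the Hugging Face zero-shot-classification pipeline (the model name, input sentence, and labels are only illustrative):

        from transformers import pipeline

        # NLI-based zero-shot classification: the model was never trained on these
        # specific labels; it transfers what it learned about entailment.
        classifier = pipeline("zero-shot-classification",
                              model="facebook/bart-large-mnli")

        result = classifier(
            "The new GPU drivers cut our training time in half.",
            candidate_labels=["machine learning", "cooking", "politics"],
        )
        print(result["labels"][0], result["scores"][0])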

    1. The collocation results can be used to correct the sensor data to more closely match the data from the reference instrument. This correction process helps account for known bias and unknown interferences from weather and other pollutants and is typically done by developing an algorithm. An algorithm can be a simple equation or more sophisticated process (e.g., set of rules, machine learning) that is applied to the sensor data. This section further discusses the process of correcting sensor data

      correction factors for collocated sensors using ML
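
      A minimal sketch of such a correction algorithm: during collocation, fit a regression from the raw sensor readings (plus weather covariates) to the reference instrument, then apply it to later sensor data. The linear form, variable names, and synthetic numbers below are assumptions for illustration:

        import numpy as np
        from sklearn.linear_model import LinearRegression

        # Collocation period: raw sensor readings alongside a reference monitor.
        rng = np.random.default_rng(0)
        raw = rng.uniform(5, 50, 200)                     # sensor PM2.5, ug/m3
        rh = rng.uniform(20, 90, 200)                     # relative humidity, %
        temp = rng.uniform(0, 35, 200)                    # temperature, C
        reference = 0.7 * raw - 0.05 * rh + 0.1 * temp + rng.normal(0, 1, 200)

        X = np.column_stack([raw, rh, temp])
        correction = LinearRegression().fit(X, reference)

        # Apply the correction to later, non-collocated sensor data.
        new_X = np.column_stack([[30.0], [65.0], [22.0]])
        print(correction.predict(new_X))                  # corrected estimate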

    1. Emergent abilities are not present in small models but can be observed in large models.

      Here's a lovely blog by Jason Wei that pulls together 137 examples of 'emergent abilities of large language models'. Emergence is a phenomenon seen in contemporary AI research, where a model will be really bad at a task at smaller scales, then go through some discontinuous change which leads to significantly improved performance.

  10. Nov 2022
    1. Eamonn Keogh is an assistant professor of Computer Science at the University of California, Riverside. His research interests are in Data Mining, Machine Learning and Information Retrieval. Several of his papers have won best paper awards, including papers at SIGKDD and SIGMOD. Dr. Keogh is the recipient of a 5-year NSF Career Award for "Efficient Discovery of Previously Unknown Patterns and Relationships in Massive Time Series Databases".

      Look into Eamonn Keogh's papers that won "best paper awards"

    1. “The metaphor is that the machine understands what I’m saying and so I’m going to interpret the machine’s responses in that context.”

      Interesting metaphor for why humans are happy to trust outputs from generative models

    1. The rapid increase in both the quantity and complexity of data that are being generated daily in the field of environmental science and engineering (ESE) demands accompanied advancement in data analytics. Advanced data analysis approaches, such as machine learning (ML), have become indispensable tools for revealing hidden patterns or deducing correlations for which conventional analytical methods face limitations or challenges. However, ML concepts and practices have not been widely utilized by researchers in ESE. This feature explores the potential of ML to revolutionize data analysis and modeling in the ESE field, and covers the essential knowledge needed for such applications. First, we use five examples to illustrate how ML addresses complex ESE problems. We then summarize four major types of applications of ML in ESE: making predictions; extracting feature importance; detecting anomalies; and discovering new materials or chemicals. Next, we introduce the essential knowledge required and current shortcomings in ML applications in ESE, with a focus on three important but often overlooked components when applying ML: correct model development, proper model interpretation, and sound applicability analysis. Finally, we discuss challenges and future opportunities in the application of ML tools in ESE to highlight the potential of ML in this field.

      The rapidly growing quantity and complexity of data generated daily in environmental science and engineering (ESE) demand matching advances in data analytics. Advanced data analysis approaches such as machine learning (ML) have become indispensable tools for revealing hidden patterns or deducing correlations where conventional analytical methods face limitations or challenges. However, ML concepts and practices have not been widely adopted. This feature explores the potential of ML to revolutionize data analysis and modeling in the ESE field and covers the essential knowledge needed for such applications. First, five examples illustrate how ML addresses complex ESE problems. Then, four major types of ML applications in ESE are summarized: making predictions, extracting feature importance, detecting anomalies, and discovering new materials or chemicals. Next, the essential knowledge required and the current shortcomings of ML applications in ESE are introduced, focusing on three important but often overlooked components: correct model development, proper model interpretation, and sound applicability analysis. Finally, the challenges and future opportunities of applying ML tools in ESE are discussed to highlight the potential of ML in this field.

    1. "On the Opportunities and Risks of Foundation Models" This is a large report by the Center for Research on Foundation Models at Stanford. They are creating and promoting the use of these models and trying to coin this name for them. They are also simply called large pre-trained models. So take it with a grain of salt, but also it has a lot of information about what they are, why they work so well in some domains and how they are changing the nature of ML research and application.

    1. Technology like this, which lets you “talk” to people who’ve died, has been a mainstay of science fiction for decades. It’s an idea that’s been peddled by charlatans and spiritualists for centuries. But now it’s becoming a reality—and an increasingly accessible one, thanks to advances in AI and voice technology. 
  11. Oct 2022
    1. There's no market for a machine-learning autopilot, or content moderation algorithm, or loan officer, if all it does is cough up a recommendation for a human to evaluate. Either that system will work so poorly that it gets thrown away, or it works so well that the inattentive human just button-mashes "OK" every time a dialog box appears.

      ML algorithms must work or not work

  12. Sep 2022
  13. Aug 2022
  14. Jul 2022
    1. because it only needs to engage a portion of the model to complete a task, as opposed to other architectures that have to activate an entire AI model to run every request.

      I don't really understand this: in Z-code there are tasks that other competing systems would need to run the entire model for, all over again, while Z-code can handle them by engaging only a portion of the model...
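
      The quoted passage is describing sparse routing as in mixture-of-experts models: a gate picks one (or a few) experts per input, so most of the model's parameters stay idle for any given request, rather than the whole model running every time. A toy sketch of top-1 routing (this is not the actual Z-code architecture, just the general idea):

        import numpy as np

        rng = np.random.default_rng(0)
        d, n_experts = 16, 4

        # Each "expert" is its own small weight matrix; the gate is a linear scorer.
        experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]
        gate = rng.normal(size=(d, n_experts))

        def moe_layer(x: np.ndarray) -> np.ndarray:
            """Top-1 mixture-of-experts: only the best-scoring expert runs."""
            scores = x @ gate                      # one score per expert
            chosen = int(np.argmax(scores))        # sparse routing decision
            return experts[chosen] @ x             # the other experts stay idle

        x = rng.normal(size=d)
        print(moe_layer(x).shape)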

  15. Jun 2022
    1. determine the caliphate; and another group led by Mu'awiya in the Levant, who demanded revenge for Uthman's blood. He defeated the first group in the Battle of the Camel; but in the end,

      this is another post

    1. Discussion of the paper:

      Ghojogh B, Ghodsi A, Karray F, Crowley M. Theoretical Connection between Locally Linear Embedding, Factor Analysis, and Probabilistic PCA. Proceedings of the Canadian Conference on Artificial Intelligence [Internet]. 2022 May 27; Available from: https://caiac.pubpub.org/pub/7eqtuyyc

  16. Apr 2022
  17. Feb 2022
  18. Jan 2022
    1. We are definitely living in interesting times!

      The problem with machine learning, in my eyes, seems to be the lack of transparency in the field. After all, what makes the data we are researching valuable? If we collect so much data, why is only 0.5% of it being studied? There seems to be a lot missing, and there are big opportunities here that aren't being used properly.

  19. Dec 2021
  20. Oct 2021
  21. Sep 2021
    1. a class of attacks that were enabled by Privacy Badger’s learning. Essentially, since Privacy Badger adapts its behavior based on the way that sites you visit behave, a dedicated attacker could manipulate the way Privacy Badger acts: what it blocks and what it allows. In theory, this can be used to identify users (a form of fingerprinting) or to extract some kinds of information from the pages they visit
  22. Jul 2021
  23. Jun 2021
    1. The problem is, algorithms were never designed to handle such tough choices. They are built to pursue a single mathematical goal, such as maximizing the number of soldiers’ lives saved or minimizing the number of civilian deaths. When you start dealing with multiple, often competing, objectives or try to account for intangibles like “freedom” and “well-being,” a satisfactory mathematical solution doesn’t always exist.

      We do better with algorithms where the utility function can be expressed mathematically. When we try to design for utility/goals that include human values, it's much more difficult.

    2. many other systems that are already here or not far off will have to make all sorts of real ethical trade-offs

      And the problem is that, even human beings are not very sensitive to how this can be done well. Because there is such diversity in human cultures, preferences, and norms, deciding whose values to prioritise is problematic.

  24. May 2021
  25. Apr 2021
    1. Machine learning app development has been gaining traction among companies from all over the world. When dealing with this part of machine learning application development, you need to remember that machine learning can recognize only the patterns it has seen before. Therefore, the data is crucial for your objectives. If you’ve ever wondered how to build a machine learning app, this article will answer your question.

    1. The insertion of an algorithm’s predictions into the patient-physician relationship also introduces a third party, turning the relationship into one between the patient and the health care system. It also means significant changes in terms of a patient’s expectation of confidentiality. “Once machine-learning-based decision support is integrated into clinical care, withholding information from electronic records will become increasingly difficult, since patients whose data aren’t recorded can’t benefit from machine-learning analyses,” the authors wrote.

      There is some work being done on federated learning, where the algorithm works on decentralised data that stays in place with the patient and the ML model is brought to the patient so that their data remains private.
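
      A toy sketch of the federated averaging idea: each site takes gradient steps on its own data and shares only model weights, which a server averages (plain NumPy, a linear model, and synthetic data; purely illustrative):

        import numpy as np

        rng = np.random.default_rng(0)
        n_features = 5

        def local_update(w, X, y, lr=0.1, steps=20):
            """Gradient steps on one hospital's data; the data never leaves the site."""
            for _ in range(steps):
                grad = 2 * X.T @ (X @ w - y) / len(y)
                w = w - lr * grad
            return w

        # Three hospitals, each with private (X, y) data.
        sites = [(rng.normal(size=(40, n_features)), rng.normal(size=40)) for _ in range(3)]

        w_global = np.zeros(n_features)
        for _ in range(10):  # federated rounds
            local_models = [local_update(w_global.copy(), X, y) for X, y in sites]
            w_global = np.mean(local_models, axis=0)  # server only sees model weights

        print(w_global)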

  26. Mar 2021
  27. Feb 2021
  28. Jan 2021
    1. I present the Data Science Venn Diagram… hacking skills, math and stats knowledge, and substantive expertise.

      An understanding of advanced statistics is a must as the methodologies get more complex and new methods, such as machine learning, are being created.

    1. Zappos created models to predict customer apparel sizes, which are cached and exposed at runtime via microservices for use in recommendations.

      There is another company, Virtusize, doing the same thing: size prediction and recommendation.

  29. Dec 2020
  30. Nov 2020
  31. Oct 2020
    1. A statistician is the exact same thing as a data scientist or machine learning researcher with the differences that there are qualifications needed to be a statistician, and that we are snarkier.
    1. numerically evaluate the derivative of a function specified by a computer program

      I understand what they're saying, but one should be careful here not to confuse this with numerical differentiation à la finite differences.
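
      To make the distinction concrete: automatic differentiation propagates exact derivative rules through the program, while finite differences perturbs the input numerically. A minimal forward-mode sketch with dual numbers (illustrative only, not how any particular library implements it):

        import math

        class Dual:
            """Forward-mode autodiff value: carries f(x) and f'(x) together."""
            def __init__(self, val, dot=0.0):
                self.val, self.dot = val, dot
            def __mul__(self, other):
                other = other if isinstance(other, Dual) else Dual(other)
                return Dual(self.val * other.val,
                            self.val * other.dot + self.dot * other.val)
            __rmul__ = __mul__
            def sin(self):
                return Dual(math.sin(self.val), math.cos(self.val) * self.dot)

        def f(x):
            return (x * x).sin() if isinstance(x, Dual) else math.sin(x * x)

        x0 = 1.3
        exact = f(Dual(x0, 1.0)).dot                      # autodiff: exact to machine precision
        approx = (f(x0 + 1e-5) - f(x0 - 1e-5)) / 2e-5     # finite differences: truncation error
        print(exact, approx, 2 * x0 * math.cos(x0 ** 2))  # analytic check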

  32. Sep 2020
    1. For example, the one-pass (hardware) translator generated a symbol table and reverse Polish code as in conventional software interpretive languages. The translator hardware (compiler) operated at disk transfer speeds and was so fast there was no need to keep and store object code, since it could be quickly regenerated on-the-fly. The hardware-implemented job controller performed conventional operating system functions. The memory controller provided

      Hardware assisted compiler is a fantastic idea. TPUs from Google are essentially this. They're hardware assistance for matrix multiplication operations for machine learning workloads created by tools like TensorFlow.

  33. Aug 2020
  34. Jul 2020
    1. Determine whether the person using my computer is me by training an ML model on data about how I use my computer. This is a project for the Intrusion Detection Systems course at Columbia University.
    1. Our membership inference attack exploits the observation that machine learning models often behave differently on the data that they were trained on versus the data that they "see" for the first time.

      How well would this work on some of the more recent zero-shot models?
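
      The core of the attack is a decision rule on how confidently the model treats a sample, since training members tend to get unusually high confidence / low loss. A toy sketch of that rule (the fixed 0.9 threshold and the confidence values are placeholders; the paper trains shadow models instead):

        import numpy as np

        def member_guess(model_confidence: float) -> bool:
            """Guess 'was in the training set' when the model is unusually confident.
            In the original attack a shadow-model-trained classifier replaces this
            fixed threshold; 0.9 here is just a placeholder."""
            return model_confidence > 0.9

        # Typical pattern: confidences on training members vs. unseen samples.
        train_conf = np.array([0.99, 0.97, 0.95, 0.88, 0.99])   # seen during training
        test_conf = np.array([0.71, 0.93, 0.62, 0.55, 0.80])    # never seen

        guesses = [member_guess(c) for c in np.concatenate([train_conf, test_conf])]
        truth = [True] * len(train_conf) + [False] * len(test_conf)
        accuracy = np.mean([g == t for g, t in zip(guesses, truth)])
        print(f"attack accuracy on this toy data: {accuracy:.0%}")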

    1. data leakage (data from outside of your test set making it back into your test set and biasing the results)

      This sounds like the inverse of “snooping”, where information about the test data is inadvertently built into the model.
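
      The usual guard against this is to fit every preprocessing step on the training split only, for example inside a scikit-learn Pipeline so cross-validation never lets test-fold statistics leak into the model; a small sketch:

        import numpy as np
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 10))
        y = rng.integers(0, 2, size=200)

        # Leaky: the scaler saw the whole dataset before cross-validation split it.
        X_leaky = StandardScaler().fit_transform(X)
        leaky_scores = cross_val_score(LogisticRegression(), X_leaky, y, cv=5)

        # Safe: the scaler is re-fit on each training fold inside the pipeline.
        pipe = make_pipeline(StandardScaler(), LogisticRegression())
        safe_scores = cross_val_score(pipe, X, y, cv=5)
        print(leaky_scores.mean(), safe_scores.mean())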

  35. Jun 2020
  36. May 2020
    1. the network typically learns to use h(t) as a kind of lossy summary of the task-relevant aspects of the past sequence of inputs up to t

      The hidden state h(t) is a high-level representation of whatever happened until time step t.

    2. Parameter sharing makes it possible to extend and apply the model to examples of different forms (different lengths, here) and generalize across them. If we had separate parameters for each value of the time index, we could not generalize to sequence lengths not seen during training, nor share statistical strength across different sequence lengths and across different positions in time. Such sharing is particularly important when a specific piece of information can occur at multiple positions within the sequence.

      RNNs use the same parameters at each time step. This allows the model to generalize the inferred "meaning", even when it is inferred at different steps.
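
      Both points, the lossy summary h(t) and the sharing of parameters across time steps, show up directly in the basic RNN update; a minimal NumPy sketch:

        import numpy as np

        rng = np.random.default_rng(0)
        d_in, d_hidden = 3, 8

        # One set of parameters, reused at every time step (parameter sharing).
        W_xh = rng.normal(size=(d_hidden, d_in)) * 0.1
        W_hh = rng.normal(size=(d_hidden, d_hidden)) * 0.1
        b = np.zeros(d_hidden)

        def run_rnn(inputs):
            """h(t) is a fixed-size, lossy summary of x(1..t), whatever the length."""
            h = np.zeros(d_hidden)
            for x_t in inputs:                       # works for any sequence length
                h = np.tanh(W_xh @ x_t + W_hh @ h + b)
            return h

        short_seq = rng.normal(size=(5, d_in))
        long_seq = rng.normal(size=(50, d_in))
        print(run_rnn(short_seq).shape, run_rnn(long_seq).shape)   # same summary size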

    1. Machine learning is an application of artificial intelligence (AI) that provides systems the ability to automatically learn and improve from experience without being explicitly programmed