- Feb 2023
-
motherduck.com
-
A huge percentage of the data that gets processed is less than 24 hours old. By the time data is a week old, it is probably 20 times less likely to be queried than data from the most recent day. After a month, data mostly just sits there.
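The 20x-per-week drop-off above can be read as a rough exponential decay in query likelihood with data age. A back-of-envelope sketch (the decay model and numbers are an illustration, not from the article):

```python
import math

# Rough model: query likelihood decays so that week-old data is
# ~20x less likely to be queried than day-old data.
DECAY_PER_WEEK = 20
rate = math.log(DECAY_PER_WEEK) / 7  # per-day decay rate

def relative_query_likelihood(age_days):
    """Likelihood of a query touching data of this age, relative to day 0."""
    return math.exp(-rate * age_days)

print(round(relative_query_likelihood(7), 3))  # 0.05, i.e. 1/20
print(relative_query_likelihood(30))           # month-old data: tiny
```

Under this toy model, month-old data is queried orders of magnitude less often than fresh data, which matches the "data mostly just sits there" observation.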
-
Customer data sizes followed a power-law distribution. The largest customer had double the storage of the next largest, who in turn had double the next, and so on.
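The halving pattern described above is a geometric series, so total storage concentrates heavily at the top. A quick sketch (illustrative numbers only):

```python
# Illustrative sketch: if each customer stores half as much as the
# next-largest one, customer sizes follow a geometric decay.
sizes = [100 / 2**rank for rank in range(20)]  # top customer = 100 units
total = sum(sizes)

top1_share = sizes[0] / total
top3_share = sum(sizes[:3]) / total

# The series sum approaches 2x the largest customer, so the single
# largest customer holds about half of all storage.
print(round(top1_share, 3))  # ~0.5
print(round(top3_share, 3))  # ~0.875
```

This is one way to see why "most customers have moderate data" and "a few customers hold most of the bytes" are both true at once.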
-
the vast majority of customers had less than a terabyte of total data storage. There were, of course, customers with huge amounts of data, but most organizations, even some fairly large enterprises, had moderate data sizes.
-
Most applications do not need to process massive amounts of data. This has led to a resurgence in data management systems with traditional architectures; SQLite, Postgres, and MySQL are all growing strongly, while "NoSQL" and even "NewSQL" systems are stagnating.
SQL still shines over NoSQL
-
The most surprising thing that I learned was that most of the people using BigQuery don't really have Big Data. Even the ones who do tend to run workloads that touch only a small fraction of their dataset sizes.
-
- Nov 2022
-
doordash.engineering
-
Rebuilding a search architecture with big data tools.
-
- Nov 2021
-
ethnographymatters.net
-
May 13, 2013 | Big Data Needs Thick Data, by Tricia Wang

Editor's Note: Tricia provides an excellent segue between last month's "Ethnomining" Special Edition and this month's on "Talking to Companies about Ethnography." She offers further thoughts building on our collective discussion (perhaps bordering on obsession?) with the big data trend. With nuance she tackles and reinvents some of the terminology circulating in the various industries that wish to make use of social research. In the wake of big data, ethnographers, she suggests, can offer thick data. In the face of derisive mention of "anecdotes" we ought to stand up to defend the value of stories.

[image from Mark Smiciklas at Intersection Consulting]

Big Data can have enormous appeal. Who wants to be thought of as a small thinker when there is an opportunity to go BIG? The positivistic bias in favor of Big Data (a term often used to describe the quantitative data that is produced through analysis of enormous datasets) as an objective way to understand our world presents challenges for ethnographers. What are ethnographers to do when our research is seen as insignificant or invaluable? Can we simply ignore Big Data as too muddled in hype to be useful? No. Ethnographers must engage with Big Data. Otherwise our work can be all too easily shoved into another department, minimized as a small line item on a budget, and relegated to the small data corner. But how can our kind of research be seen as equally important to algorithmically processed data? What is the ethnographer's 10-second elevator pitch to a room of data scientists? …and GO! Big Data produces so much information that it needs something more to bridge and/or reveal knowledge gaps. That's why ethnographic work holds such enormous value in the era of Big Data.
Lacking the conceptual words to quickly position the value of ethnographic work in the context of Big Data, I have begun, over the last year, to employ the term Thick Data (with a nod to Clifford Geertz!) to advocate for integrative approaches to research. Thick Data: ethnographic approaches that uncover the meaning behind Big Data visualization and analysis. Thick Data analysis primarily relies on human brain power to process a small "N" while big data analysis requires computational power (of course with humans writing the algorithms) to process a large "N". Big Data reveals insights with a particular range of data points, while Thick Data reveals the social context of and connections between data points. Big Data delivers numbers; thick data delivers stories. Big data relies on machine learning; thick data relies on human learning.
-
- Mar 2021
-
www.nytimes.com
-
N.Y.’s Vaccine Websites Weren’t Working. He Built a New One for $50.
An example of big data made available to the public.
-
- Feb 2021
-
www.xml.com
-
There are two directions to explore: first, applying the principle of independence between the sources and the knowledge-management layer, and second, fine-tuning the balance between automatic processing and manual curation.
-
- Mar 2020
-
colaboracion.dnp.gov.co
-
Regarding citizens, they must have data literacy, that is, the capabilities to navigate their own data ecosystems in order to produce, appropriate, communicate, and use data.
Data literacy.
-
data-driven innovation is the use of data through the application of analytics techniques to improve or create new goods, services, or processes that contribute to the diversification and sophistication of the economy and to the generation of social value, as a new source of growth (OECD, 2015).
Data innovation means creating goods, services, or processes by applying analytics techniques.
-
The collection, storage, and processing of data gives rise to information, from which it is possible to obtain knowledge.
Definition of Data, Information, and Knowledge. This definition was already proposed by Knowledge Management more than 40 years ago.
-
the conditions for data exploitation must be fostered through public intervention, correcting the government failures that prevent the emergence of enabling elements; this, by leveraging a public asset that is generated routinely and massively and that, by its nature, is not created by the market: public data.
This document addresses Open Data and Big Data for the creation of new markets.
-
the PND 2014-2018 is the only direct antecedent that expressly establishes the need for a public policy on data exploitation
Data exploitation in Colombia was first raised, at the normative level, in the PND 2014-2018.
-
- Aug 2019
-
www.cambridge.org
-
“patient sovereignty” will now become an important debate. In particular, the ownership of data in healthcare, while already an important topic of discussion, will become an even more complex argument.
A discussion of data in healthcare.
-
m-health offers predominantly interconnectivity between patients and healthcare professionals while IoT devices offer the ability to collect information and perform procedures with increasingly minimal invasion. Finally, big data gives healthcare professionals an opportunity to spot trends and patterns for both individual patients and groups of patients, improving the speed of diagnosis and disease prevention. In the next section the third and final pillar of Health 4.0; design, is discussed
How the technologies interact in Health 4.0.
-
- Jun 2019
-
harvardmagazine.com
-
In marketing, familiar uses of big data include “recommendation engines” like those used by companies such as Netflix and Amazon to make purchase suggestions based on the prior interests of one customer as compared to millions of others.
Jonathan Shaw presents "Big Data" as a beneficial tool for our society. He describes how "Big Data" can help reveal awareness and trends, especially in industry; for example, deriving consumers' purchase patterns from a large volume of information. However, having so much information can also be an obstacle to finding the specific details you are looking for. Understanding the strengths and weaknesses of handling "Big Data" will be key to progress in our society.
-
- Nov 2017
-
www.datavisor.com
-
They have a very simplistic view of the activity being monitored, distilling it down into only a few dimensions for the rule to interrogate.
The number of dimensions needs to be large; in conventional database systems these dimensions are few.
-
- Sep 2016
-
www.chronicle.com
-
often private companies whose technologies power the systems universities use for predictive analytics and adaptive courseware
-
the use of data in scholarly research about student learning; the use of data in systems like the admissions process or predictive-analytics programs that colleges use to spot students who should be referred to an academic counselor; and the ways colleges should treat nontraditional transcript data, alternative credentials, and other forms of documentation about students’ activities, such as badges, that recognize them for nonacademic skills.
Useful breakdown. Research, predictive models, and recognition are quite distinct from one another and the approaches to data that they imply are quite different. In a way, the "personalized learning" model at the core of the second topic is close to the Big Data attitude (collect all the things and sense will come through eventually) with corresponding ethical problems. Though projects vary greatly, research has a much more solid base in both ethics and epistemology than the kind of Big Data approach used by technocentric outlets. The part about recognition, though, opens the most interesting door. Microcredentials and badges are a part of a broader picture. The data shared in those cases need not be so comprehensive and learners have a lot of agency in the matter. In fact, when then-Ashoka Charles Tsai interviewed Mozilla executive director Mark Surman about badges, the message was quite clear: badges are a way to rethink education as a learner-driven "create your own path" adventure. The contrast between the three models reveals a lot. From the abstract world of research, to the top-down models of Minority Report-style predictive educating, all the way to a form of heutagogy. Lots to chew on.
-
- Jul 2016
-
hackeducation.com
-
I could have easily chosen a different prepositional phrase. "Convivial Tools in an Age of Big Data.” Or “Convivial Tools in an Age of DRM.” Or “Convivial Tools in an Age of Venture-Funded Education Technology Startups.” Or “Convivial Tools in an Age of Doxxing and Trolls."
The Others.
-
-
www.businessinsider.com
-
"We know the day before the course starts which students are highly unlikely to succeed,"
Easier to do with a strict model for success.
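The quoted claim implies a pre-course risk model. A minimal, purely illustrative sketch of how such a flag might be computed; the feature names, weights, and threshold are invented for illustration, not taken from the article:

```python
import math

def dropout_risk(prior_gpa, credits_completed_ratio, is_first_term):
    """Toy logistic score; weights are invented for illustration."""
    z = 1.5 - 0.8 * prior_gpa - 2.0 * credits_completed_ratio + 0.6 * is_first_term
    return 1 / (1 + math.exp(-z))  # probability-like score in (0, 1)

# Flag students above a threshold before the course starts.
students = [
    {"id": "a", "prior_gpa": 3.6, "credits_completed_ratio": 0.9, "is_first_term": 0},
    {"id": "b", "prior_gpa": 1.8, "credits_completed_ratio": 0.2, "is_first_term": 1},
]
flagged = [
    s["id"] for s in students
    if dropout_risk(s["prior_gpa"], s["credits_completed_ratio"], s["is_first_term"]) > 0.5
]
print(flagged)  # only student "b" crosses the threshold in this toy model
```

Note how the ethical tension in the annotation shows up directly in the code: the "strict model for success" is whatever the weights encode.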
-
- Apr 2016
-
allthingsanalytics.com
-
“fundamentally if we want to realize the potential of human networks to change how we work then we need analytics to transform information into insight otherwise we will be drowning in a sea of content and deafened by a cacophony of voices”
Marie Wallace's perspective on the potential of big data analytics, specifically analysis of human networks, in the context of creating a smarter workplace.
-
- Jan 2016
-
www.profweb.ca
-
Dominique Cardon is calling on businesses to disclose the purpose of their algorithms.
-
it’s more the harmful threat of algorithms that we should be worried about.
-
-
lancasteronline.com
-
However, a Fitbit device Risley was wearing told a different story, the affidavit shows.
-
- Dec 2015
-
mfeldstein.com
-
your system is able to flag at least a critical mass of videos taught in the Mueller method as having a bigger educational impact on students than the average educational video, by some measure you have identified
Sounds like a neat description of what many Big Data enthusiasts are actually trying to do. Some Big Data positivists do go so far as to claim that the “inference engine” will eventually be powerful enough to find meaning. But this distinction is within the Big Data field, not between it and other fields.
-
sufficiently rich information
-
It’s educators who come up with hypotheses and test them using a large data set.
And we need an ever-larger data set, right?
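The "ever-larger data set" intuition has a simple statistical basis: the standard error of an estimate shrinks as 1/sqrt(N), so each doubling of data buys less precision than the last. A quick sketch (generic statistics, not from the article):

```python
import math

# Standard error of a mean shrinks as 1 / sqrt(N): quadrupling the
# data only halves the uncertainty, which is one reason big-data
# approaches keep demanding more data for marginal gains.
def standard_error(sample_std, n):
    return sample_std / math.sqrt(n)

print(standard_error(10, 100))     # 1.0
print(standard_error(10, 10_000))  # 0.1: 100x the data, only 10x the precision
```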
-
Some were done this way on purpose but based on intuitions by classroom teachers.
Isn’t Big Data partly about reverse-engineering these intuitions?
-
hadoop thingamabob back end
-
a good example of the kind of insight that big data is completely blind to
Not sure it follows directly, but also important to point out.
-
-
mfeldstein.com
-
As long as the content in SmartBooks is locked down, then it is possible to run machine learning algorithms against the clicks of millions of students using that content. To the degree that the platform is opened up for custom, newly created books, the controlled experiment goes away and the possibility of big data analysis goes with it.
Not sure it follows…
-
they are making a bet against the software as a replacement for the teacher and against big data
-
-
larrycuban.wordpress.com
-
numbers have to be interpreted by those who do the daily work of classroom teaching
-
-
hackeducation.com
-
As usual, @AudreyWatters puts things in proper perspective.
-
- Nov 2015
-
europa.eu
-
That's why I say that data is the new oil for the digital age
-
-
mfeldstein.com
-
grows exponentially.
As we get into “Big Data”, individual datapoints become less important.
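One way to see why individual datapoints matter less at scale: removing any single point from a large sample barely moves an aggregate. A small demonstration (illustrative data, not from the article):

```python
# With large N, any single datapoint barely moves an aggregate:
data = list(range(1_000_000))
mean_all = sum(data) / len(data)
mean_without_max = sum(data[:-1]) / (len(data) - 1)

# Dropping even the largest value shifts the mean by ~0.5
# out of ~500,000: a relative change of about one in a million.
print(abs(mean_all - mean_without_max))
```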
-
- Jun 2015
-
-
A start-up from the USA is testing a new tool for minimizing the consequences of data theft: it checks whether stolen data is being offered for sale in hidden parts of the web.
-