Hypothesis

119 Matching Annotations

Sep 2024
www.platformer.news www.platformer.news

Why note-taking apps don't make us smarter

1
1. chrisaldrich 09 Sep 2024
  
  in Public
  
  databases are not designed to be browsed.
  
  Casey Newton makes this blanket statement. Any real evidence for this beyond his "gut"?
  
  Many "paper machines" like Niklas Luhmann's zettelkasten were almost custom made not just for searching, but for browsing through regularly much like commonplace books.
  
  Perhaps the question is really, how is your particular database designed?
  
  databases card index as database database design commonplace books knowledge management
Visit annotations in context

Tags

knowledge management

databases

card index as database

commonplace books

database design

Annotators

chrisaldrich

URL

platformer.news/why-note-taking-apps-dont-make-us/
Aug 2024
pinokio.computer pinokio.computer

Pinokio

1
1. chrisaldrich 21 Aug 2024
  
  in Public
  
  https://pinokio.computer/ Pinokio is a browser that lets you install, run, and programmatically control ANY application, automatically.
  
  Pinokio Pinocchio browsers bots servers databases artificial intelligence apps Friends of the Link 2024-08-21
Visit annotations in context

Tags

Pinocchio

bots

Friends of the Link 2024-08-21

databases

Pinokio

artificial intelligence apps

browsers

servers

Annotators

chrisaldrich

URL

pinokio.computer/
Apr 2024
www.newyorker.com www.newyorker.com

A New History of Arabia, Written in Stone

1
1. chrisaldrich 11 Apr 2024
  
  in Public
  
  Inscriptions, Al-Jallad explained, tend to cluster on higher ground, where nomadic herders could keep an easier watch for predators. In a landscape with no other traces of human civilization, the rocks preserved the nomads’ names and genealogies, along with descriptions of their animals, their wars, their journeys, and their rituals. There were prayers to deities, worries about the lack of rain, and complaints about the cruelty of Romans.
  
  safaitic script stone inscriptions nomadic life genealogy databases
Visit annotations in context

Tags

genealogy databases

nomadic life

stone inscriptions

safaitic script

Annotators

chrisaldrich

URL

newyorker.com/culture/culture-desk/a-new-history-of-arabia-written-in-stone
Feb 2024
www.youtube.com www.youtube.com

Antedating the OED

1
1. chrisaldrich 02 Feb 2024
  
  in Public
  
  Antedating the OED
  
  concordances antedatings Oxford English Dictionary databases watch dipstick white lies
Visit annotations in context

Tags

databases

antedatings

watch

white lies

dipstick

Oxford English Dictionary

concordances

Annotators

chrisaldrich

URL

youtube.com/watch
Nov 2023
cinematreasures.org cinematreasures.org

Cinema Treasures

1
1. chrisaldrich 20 Nov 2023
  
  in Public
  
  https://cinematreasures.org/ Cinema Treasures
  
  movie theaters movie palaces databases
Visit annotations in context

Tags

movie palaces

databases

movie theaters

Annotators

chrisaldrich

URL

cinematreasures.org/
Oct 2023
www.academia.edu www.academia.edu

(99+) Academia.edu

1
1. chrisaldrich 27 Oct 2023
  
  in Public
  
  https://www.academia.edu/community/VDDN6V
  
  Evina Stein has a nice list of digital manuscript databases
  
  manuscript studies manuscript databases compilation Evina Stein
Visit annotations in context

Tags

manuscript databases

manuscript studies

Evina Stein

compilation

Annotators

chrisaldrich

URL

academia.edu/community/VDDN6V
Sep 2023
www.reddit.com www.reddit.com

r/Zettelkasten - What are the disadvantages of the zettelkasten method?

1
1. chrisaldrich 24 Sep 2023
  
  in Public
  
  I wonder what you think of a distinction between the more traditional 'scholar's box', and the proto-databases that were used to write dictionaries and then for projects such as the Mundaneum. I can't help feeling there's a significant difference between a collection of notes meant for a single person, and a collection meant to be used collaboratively. But not sure exactly how to characterize this difference. Seems to me that there's a tradition that ended up with the word processor, and another one that ended up with the database. I feel that the word processor, unlike the database, was a dead end.
  
  reply to u/atomicnotes at https://www.reddit.com/r/Zettelkasten/comments/16njtfx/comment/k1tuc9c/?utm_source=reddit&utm_medium=web2x&context=3
  
  u/atomicnotes, this is an excellent question. (Though I'd still like to come to terms with people who don't think it acts as a knowledge management system, there's obviously something I'm missing.)
  
  Some of your distinction comes down to how one is using their zettelkasten and what sorts of questions are being asked of it. One of the earliest descriptions I've seen that begins to get at the difference is the description by Beatrice Webb of her notes (appendix C) in My Apprenticeship. As she describes what she's doing, I get the feeling that she's taking the same broad sort of notes we're all used to, but it's obvious from her discussion that she's also using her slips as a traditional database, but is lacking modern vocabulary to describe it as such.
  
  Early efforts like the OED, TLL, the Wb, and even Gertrud Bauer's Coptic linguistic zettelkasten of the late 1970s were narrow enough in scope and data collected to make them almost dead simple to define, organize and use as databases on paper. Of course how they were used to compile their ultimate reference books was a bit more complex in form than the basic data from which they stemmed.
  
  The Mundaneum had a much more complex flavor because it required a standardized system for everyone to work in concert against much more freeform as well as more complex forms of collected data and still be able to search for the answers to specific questions. While still somewhat database flavored, it was dramatically different from the others because of its scope and the much broader sorts of questions one could ask of it. I think that if you ask yourself what sorts of affordances you get from the two different groups (databases and word processors, or even their typewriter precursors), you find even more answers.
  
  Typewriters and word processors allowed one to get words down on paper faster by an order of magnitude or two and, in combination with reproduction equipment, made it a lot easier to spin off copies of a document for small-scale and local mass distribution. They do allow a few affordances like higher readability (compared with less standardized and slower handwriting), quick search (at least in the digital era), and moving pieces of text around (also in digital). Much beyond this, they aren't tremendously helpful as a composition tool. As a thinking tool, typewriters and word processors aren't significantly better than their analog predecessors, so you don't gain a huge amount of leverage by using them.
  
  On the other hand, databases and their spreadsheet brethren offer a lot more, particularly in digital realms. Data collection and collation become much easier. One can also form a massive variety of queries on such collected data, not to mention making calculations on those data or subjecting them to statistical analyses. Searching, sorting, and making direct comparisons also become far easier and quicker to do once you've amassed the data you need. Here again, Beatrice Webb's early experience and descriptions are very helpful, as is Hollerith's early work with punch cards and census data and the speed with which the results could be used.
  
  Now if you compare the affordances offered by each of these in the digital era and plot their shifts against increasing computer processing power, you'll see that the value of the word processor stays relatively flat while the database shows much more significant movement.
  
  Surely there is a lot more at play, particularly at scale and when taking network effects into account, but perhaps this quick sketch may explain to you a bit of the difference you've described.
  
  Another difference you may be seeing/feeling is that of contextualization. Databases usually have much smaller and more discrete amounts of data cross-indexed (for example: a subject's name versus weight with a value in pounds or kilograms.) As a result the amount of context required to use them is dramatically lower compared to the sorts of data you might keep in an average atomic/evergreen note, which may need to be more heavily recontextualized for you when you need to use it in conjunction with other similar notes which may also need you to recontextualize them and then use them against or with one another.
  
  Some of this is why the cards in the Thesaurus Linguae Latinae are easier to use and understand out of the box (presuming you know Latin) than those you might find in the Mundaneum. They'll also be far easier to use than a stranger's notes which will require even larger contextualization for you, especially when you haven't spent the time scaffolding the related and often unstated knowledge around them. This is why others' zettelkasten will be more difficult (but not wholly impossible) for a stranger to use. You might apply the analogy of context gaps between children and adults for a typical Disney animated movie to the situation. If you're using someone else's zettelkasten, you'll potentially be able to follow a base level story the way a child would view a Disney cartoon. Compare this to the zettelkasten's creator who will not only see that same story, but will have a much higher level of associative memory at play to see and understand a huge level of in-jokes, cultural references, and other associations that an adult watching the Disney movie will understand that the child would completely miss.
  
  I'm curious to hear your thoughts on how this all plays out for your way of conceptualizing it.
  
  reply collaborative zettelkasten databases zettelkasten as database word processors typewriters affordances contextualization
Visit annotations in context

Tags

affordances

word processors

contextualization

zettelkasten as database

databases

collaborative zettelkasten

reply

typewriters

Annotators

chrisaldrich

URL

reddit.com/r/Zettelkasten/comments/16njtfx/what_are_the_disadvantages_of_the_zettelkasten/
Aug 2023
zenodo.org zenodo.org

D3.1 FAIR Policy Landscape Analysis

1
1. WHPrivate 25 Aug 2023
  
  in Public
  
  research data life cycle
  
  Annotated with RDA Tags: Working groups
  
  FAIR Data Maturity Model WG FAIR for Research Software (FAIR4RS) WG FAIR for Virtual Research Environments WG FAIRsharing Registry: Connecting data policies, standards and databases RDA WG rda_graph
Visit annotations in context

Tags

FAIR Data Maturity Model WG

FAIR for Research Software (FAIR4RS) WG

FAIR for Virtual Research Environments WG

rda_graph

FAIRsharing Registry: Connecting data policies, standards and databases RDA WG

Annotators

WHPrivate

URL

zenodo.org/record/5537032
stephanus.tlg.uci.edu stephanus.tlg.uci.edu

TLG - Home

1
1. chrisaldrich 03 Aug 2023
  
  in Public
  
  https://stephanus.tlg.uci.edu/
  
  Thesaurus Linguae Graecae (TLG) University of California Irvine databases Greek lexicography
Visit annotations in context

Tags

University of California Irvine

databases

Thesaurus Linguae Graecae (TLG)

Greek lexicography

Annotators

chrisaldrich

URL

stephanus.tlg.uci.edu/
www.degruyter.com www.degruyter.com

Bibliotheca Teubneriana Latina Online

1
1. chrisaldrich 03 Aug 2023
  
  in Public
  
  The BTL Online database provides electronic access to all editions of Latin texts published in the Bibliotheca Teubneriana, ranging from antiquity and late antiquity to medieval and neo-Latin texts. A total of approximately 13 million word forms are thus accessible electronically.
  
  https://www.degruyter.com/database/btl/html?lang=en
  
  De Gruyter Bibliotheca Teubneriana Latina (BTL) Latin texts Latin databases
Visit annotations in context

Tags

Latin texts

Latin

databases

De Gruyter

Bibliotheca Teubneriana Latina (BTL)

Annotators

chrisaldrich

URL

degruyter.com/database/btl/html
May 2023
techleadjournal.dev techleadjournal.dev

#131 - Data Essentials in Software Architecture - Pramod Sadalage

1
1. bruno.caimar 12 May 2023
  
  in Public
  
  Make sure you are mentoring someone, always. Make sure you’re always mentoring others. And I have found that mentoring others gives me new perspectives on things. Because I may be telling them things, but I’m learning a lot while I’m telling them.
  
  Great
  
  databases refactoring agile pramod
Visit annotations in context

Tags

agile

databases

refactoring

pramod

Annotators

bruno.caimar

URL

techleadjournal.dev/episodes/131/
Apr 2023
en.wikipedia.org en.wikipedia.org

Lace card - Wikipedia

1
1. chrisaldrich 27 Apr 2023
  
  in Public
  
  Lace card
  
  lace cards punched cards databases DDoS denial of service edge-notched cards read
Visit annotations in context

Tags

databases

punched cards

edge-notched cards

denial of service

lace cards

read

DDoS

Annotators

chrisaldrich

URL

en.wikipedia.org/wiki/Lace_card
typewriterdatabase.com typewriterdatabase.com

The Typewriter Database - Version Epsilon

1
1. chrisaldrich 16 Apr 2023
  
  in Public
  
  https://typewriterdatabase.com/
  
  typewriters databases
Visit annotations in context

Tags

databases

typewriters

Annotators

chrisaldrich

URL

typewriterdatabase.com/
Mar 2023
web.archive.org web.archive.org

Ancient Egyptian Dictionary

2
1. chrisaldrich 28 Mar 2023
  
  in Public
  
  The experiences of Erman and his collaborators show only too clearly that the book form is by no means suited to presenting such a body of material.
  
  For some research the book form is just not conducive to the most productive work. Both the experiences of Beatrice Webb (My Apprenticeship, Appendix C) and Adolf Erman (on the Wb) show that database forms for sorting, filtering, and comparing have been highly productive and provide a wealth of information which simply couldn't be obtained otherwise.
  
  Beatrice Webb Adolph Erman databases relational databases books reading practices note taking Wörterbuch der ägyptischen Sprache
2. chrisaldrich 28 Mar 2023
  
  in Public
  
  The starting point and center of the work on the Ancient Egyptian Dictionary is the creation of an exhaustive corpus of Egyptian texts.
  
  In the early twentieth century one might have created a card index to study a large textual corpus, but in the twenty-first one is more likely to rely on a relational database instead.
  
  digital humanities relational databases zettelkasten as database
Visit annotations in context

Tags

note taking

relational databases

Beatrice Webb

zettelkasten as database

reading practices

Adolph Erman

books

digital humanities

Wörterbuch der ägyptischen Sprache

databases

Annotators

chrisaldrich

URL

web.archive.org/web/20180627163317/https://aaew.bbaw.de/wbhome/Broschuere/index.html
niklas-luhmann-archiv.de niklas-luhmann-archiv.de

ZK II: Zettel 9/8b2 - Niklas Luhmann-Archiv

1
1. chrisaldrich 23 Mar 2023
  
  in Public
  
  9/8b2 "Multiple storage" as a necessity for storing complex information (information to be evaluated in complex ways).
  
  Seems like from a historical perspective hierarchical databases were more prevalent in the 1960s and relational databases didn't exist until the 1970s. (check references for this historically)
  
  Of course one must consider that within a card index or zettelkasten the ideas of both could have co-existed in essence even if they weren't named as such. Some of the business use cases as early as 1903 (earlier?) would have shown the idea of multiple storage and relational database usage. Beatrice Webb's usage of her notes in a database-like way may have indicated this as well.
  
  hierarchical databases relational databases multiple storage databases card index for business Niklas Luhmann's zettelkasten Beatrice Webb
Visit annotations in context

Tags

hierarchical databases

Niklas Luhmann's zettelkasten

card index for business

relational databases

Beatrice Webb

databases

multiple storage

Annotators

chrisaldrich

URL

niklas-luhmann-archiv.de/bestand/zettelkasten/zettel/ZK_2_NB_9-8b2_V
projects.propublica.org projects.propublica.org

University of California, Berkeley — The Repatriation Project

1
1. chrisaldrich 04 Mar 2023
  
  in Public
  
  University of California, Berkeley — The Repatriation Project, by Ash Ngu, Andrea Suozzo
  
  repatriation Native American Graves Protection and Repatriation Act (NAGPRA) Indigenous cultures University of California Berkeley databases
Visit annotations in context

Tags

databases

Native American Graves Protection and Repatriation Act (NAGPRA)

University of California Berkeley

repatriation

Indigenous cultures

Annotators

chrisaldrich

URL

projects.propublica.org/repatriation-nagpra-database/institution/university-california-berkeley/
Jan 2023
refubium.fu-berlin.de refubium.fu-berlin.de

Comprehensive Coptic Lexicon: Including Loanwords from Ancient Greek v 1: Version 1

1
1. chrisaldrich 18 Jan 2023
  
  in Public
  
  https://refubium.fu-berlin.de/handle/fub188/24570
  
  Some interesting programming and structured data with relationship to the Gertrud Bauer Zettelkasten Online.
  
  Gertrud Bauer Zettelkasten Online Database and Dictionary of Greek Loanwords in Coptic (DDGLC) loanwords ancient Greek Coptic lexicons dictionaries non-Semitic Afro-Asiatic languages preclassical Greek postclassical Greek datasets databases
Visit annotations in context

Tags

non-Semitic Afro-Asiatic languages

loanwords

lexicons

postclassical Greek

datasets

Database and Dictionary of Greek Loanwords in Coptic (DDGLC)

ancient Greek

Gertrud Bauer Zettelkasten Online

databases

dictionaries

Coptic

preclassical Greek

Annotators

chrisaldrich

URL

refubium.fu-berlin.de/handle/fub188/24570
userpage.fu-berlin.de userpage.fu-berlin.de

Bauer Archive Online

1
1. chrisaldrich 16 Jan 2023
  
  in Public
  
  After browsing through a variety of the cards in Gertrud Bauer's Zettelkasten Online it becomes obvious that the collection was created specifically as a paper-based database for search, retrieval, and research. The examples and data within it are much more narrowly circumscribed for a specific use than those of other researchers like Niklas Luhmann whose collection spanned a much broader variety of topics and areas of knowledge.
  
  This particular use case makes the database nature of zettelkasten more apparent than some others, particularly modern (post-2013) zettelkasten of a more personal nature.
  
  I'm reminded here of the use case(s) described by Beatrice Webb in My Apprenticeship for scientific note taking, by which she more broadly meant database creation and use.
  
  Gertrud Bauer's zettelkasten Gertrud Bauer Zettelkasten Online card index as database scientific note taking Beatrice Webb Niklas Luhmann's zettelkasten databases Database and Dictionary of Greek Loanwords in Coptic (DDGLC)
Visit annotations in context

Tags

Niklas Luhmann's zettelkasten

Gertrud Bauer's zettelkasten

Beatrice Webb

Database and Dictionary of Greek Loanwords in Coptic (DDGLC)

scientific note taking

Gertrud Bauer Zettelkasten Online

databases

card index as database

Annotators

chrisaldrich

URL

userpage.fu-berlin.de/johnkatrin/bauer1/index.html
www.geschkult.fu-berlin.de www.geschkult.fu-berlin.de

Database and Dictionary of Greek Loanwords in Coptic (DDGLC)

1
1. chrisaldrich 16 Jan 2023
  
  in Public
  
  Database and Dictionary of Greek Loanwords in Coptic (DDGLC) https://www.geschkult.fu-berlin.de/en/e/ddglc/index.html
  
  Greek Coptic loanwords linguistics databases dictionaries Free University of Berlin
Visit annotations in context

Tags

databases

loanwords

linguistics

dictionaries

Coptic

Greek

Free University of Berlin

Annotators

chrisaldrich

URL

geschkult.fu-berlin.de/en/e/ddglc/index.html
geniza.princeton.edu geniza.princeton.edu

Search Documents

1
1. chrisaldrich 09 Jan 2023
  
  in Public
  
  https://geniza.princeton.edu/en/documents/
  
  Princeton Geniza Lab databases Cairo Geniza
Visit annotations in context

Tags

Cairo Geniza

databases

Princeton Geniza Lab

Annotators

chrisaldrich

URL

geniza.princeton.edu/en/documents/
Dec 2022
tellico-project.org tellico-project.org

Tellico – Collection management software, free and simple

1
1. chrisaldrich 30 Dec 2022
  
  in Public
  
  https://tellico-project.org/
  
  Tellico: Collection management software, free and simple
  
  ᔥ Fernando Borretti in Unbundling Tools for Thought (12/29/2022 15:59:17)
  
  tools apps collections collection management databases IndieWeb Tellico
Visit annotations in context

Tags

tools

collection management

databases

Tellico

apps

collections

IndieWeb

Annotators

chrisaldrich

URL

tellico-project.org/
dirt.substack.com dirt.substack.com

Dirt: Worldbuilding, Pt. 1

1
1. tarkowski 28 Dec 2022
  
  in Public
  
  Good stories are of diminishing importance — ironic, given how audiences were traditionally drawn into a world, like that of The Odyssey, by way of a single character’s journey.
  
  There's a Le Guin quote in the piece, on how a world of the story needs to be described in the story, and "that's tricky business". And this quote argues about the diminishing role of narratives. So what takes over their role? I think that, broadly speaking, databases and information: any narrative today quickly gets surrounded with coral-like growth of commentary, reviews, fanfiction, databases of world's details. This has some reference to Johnson's work on database as a modern media form.
  
  worldbuilding databases
Visit annotations in context

Tags

databases

worldbuilding

Annotators

tarkowski

URL

dirt.substack.com/p/dirt-worldbuilding-pt-1
genizalab.princeton.edu genizalab.princeton.edu

History of the Princeton Geniza Lab

1
1. chrisaldrich 16 Dec 2022
  
  in Public
  
  Keeping track of research materials used to require an excellent memory, exceptional bookkeeping skills or blind luck; now we have databases.
  
  Love the phrasing of this. :)
  
  luck databases research methods
Visit annotations in context

Tags

luck

databases

research methods

Annotators

chrisaldrich

URL

genizalab.princeton.edu/about/history-princeton-geniza-lab
Nov 2022
www.youtube.com www.youtube.com

LA Public Library

1
1. chrisaldrich 19 Nov 2022
  
  in Public
  
  Genealogy Garage: Researching at the Huntington Library
  https://www.youtube-nocookie.com/embed/0f2j2K6JWGg
  
  Julie Huffman jhuffman@lapl.org (host)
  
  Stephanie Arias
  
  Anne Blecksmith
  
  Li Wei Yang
  
  Clay Stalls cstalls@huntington.org
  
  ECPP
  
  Early California Population Project: Database of Baptism, Marriage, and Burial Records from California Missions
  
  https://huntington.org/early-california-population-project
  
  Tips https://researchguides.huntington.org/ecpp/tips
  
  Family Histories: A guide to resources for family history research at The Huntington Library
  
  https://researchguides.huntington.org/familyhistories
  
  Huntington Library
  
  Visit checklist
  
  Create a library account via Aeon: https://aeon.huntington.org
  
  Request rare materials via Aeon: https://researchguides.huntington.org/aeon
  
  Review Reading Room policies and Conditions of Use: https://researchguides.huntington.org/usingthelibrary/usingthelibrary
  
  Schedule an appointment: https://huntingtonlibrary.libcal.com
  
  genealogy Huntington Library Genealogy Garage Los Angeles Public Library watch Early California Population Project genealogy databases California Missions Jewish ancestry events
Visit annotations in context

Tags

Los Angeles Public Library

Huntington Library

genealogy databases

events

genealogy

Early California Population Project

California Missions

watch

Genealogy Garage

Jewish ancestry

Annotators

chrisaldrich

URL

youtube.com/channel/UCCIKzqWUnkomsnkelPMvgsQ
Oct 2022
Local file Local file

On intellectual craftsmanship (1952)

1
1. chrisaldrich 01 Oct 2022
  
  in Public
  
  A career-line study of the presidents, all cabinet members, and all members of the Supreme Court. This I already have on IBM cards from the constitutional period through Truman's second term, but I want to expand the items used and analyze it afresh.
  
  Notice that it's not just notes, but data on IBM cards that he's using for research here. This sort of data analysis is much easier now, but is also of the sort detailed by Beatrice Webb in her scientific note taking.
  
  databases IBM cards scientific note taking
Tags

databases

scientific note taking

IBM cards

Annotators

chrisaldrich
Sep 2022
www.youtube.com www.youtube.com

How To Choose The Right Database? - YouTube

1
1. Abstractigakis 20 Sep 2022
  
  in Public
  
  How To Choose The Right Database?
  
  very different opinions than theo
  
  #databases
Visit annotations in context

Tags

#databases

Annotators

Abstractigakis

URL

youtube.com/watch
Aug 2022
kevin.burke.dev kevin.burke.dev

Reddit’s database has two tables | Kevin Burke

2
1. pyxelr 14 Aug 2022
  
  in Public
  
  Instead, they keep a Thing Table and a Data Table. Everything in Reddit is a Thing: users, links, comments, subreddits, awards, etc. Things keep common attributes like up/down votes, a type, and creation date. The Data table has three columns: thing id, key, value. There’s a row for every attribute. There’s a row for title, url, author, spam votes, etc. When they add new features they didn’t have to worry about the database anymore. They didn’t have to add new tables for new things or worry about upgrades. Easier for development, deployment, maintenance.
  
  Reddit uses only 2 tables, at the cost of not being able to use cool relational features
  
  databases Reddit
2. pyxelr 14 Aug 2022
  
  in Public
  
  Schema updates are very slow when you get bigger. Adding a column to 10 million rows takes locks and doesn’t work. They used replication for backup and for scaling. Schema updates and maintaining replication is a pain.
  
  Schema updates and replications are not easy to handle
  
  databases Reddit
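The Thing/Data layout quoted in the first annotation can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions, not Reddit's actual schema: the table names `thing` and `data`, the column names, and the helpers `make_store`, `add_thing`, and `get_attrs` are all hypothetical.

```python
import sqlite3


def make_store():
    """Create an in-memory database with a two-table (thing + data) layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Common attributes shared by every kind of object (assumed columns).
        CREATE TABLE thing (
            id      INTEGER PRIMARY KEY,
            type    TEXT NOT NULL,
            ups     INTEGER NOT NULL DEFAULT 0,
            downs   INTEGER NOT NULL DEFAULT 0,
            created TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        );
        -- One row per attribute: thing id, key, value.
        CREATE TABLE data (
            thing_id INTEGER NOT NULL REFERENCES thing(id),
            key      TEXT NOT NULL,
            value    TEXT,
            PRIMARY KEY (thing_id, key)
        );
        """
    )
    return conn


def add_thing(conn, type_, **attrs):
    """Insert a thing and store each keyword argument as a key/value row."""
    cur = conn.execute("INSERT INTO thing (type) VALUES (?)", (type_,))
    thing_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO data (thing_id, key, value) VALUES (?, ?, ?)",
        [(thing_id, key, str(value)) for key, value in attrs.items()],
    )
    return thing_id


def get_attrs(conn, thing_id):
    """Reassemble a thing's attributes from its key/value rows."""
    rows = conn.execute(
        "SELECT key, value FROM data WHERE thing_id = ?", (thing_id,)
    )
    return dict(rows.fetchall())
```

With this layout, adding a new kind of attribute is just another `INSERT` into `data`, so no schema change is needed. The trade-off noted in the annotation also shows up here: every value is stored as text, and the database cannot enforce per-attribute types or foreign keys on them.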
Visit annotations in context

Tags

Reddit

databases

Annotators

pyxelr

URL

kevin.burke.dev/kevin/reddits-database-has-two-tables/
multimediaman.blog multimediaman.blog

How the index card launched the information age

1
1. chrisaldrich 05 Aug 2022
  
  in Public
  
  In 1896, Dewey formed a partnership with Herman Hollerith and the Tabulating Machine Company (TMC) to provide the punch cards used for the electro-mechanical counting system of the US government census operations. Dewey’s relationship with Hollerith is significant as TMC would be renamed International Business Machines (IBM) in 1924 and become an important force in the information age and creator of the first relational database.
  
  Melvil Dewey Herman Hollerith 1896 1926 IBM Tabulating Machine Company census relational databases punched cards
Visit annotations in context

Tags

relational databases

1896

Herman Hollerith

IBM

Tabulating Machine Company

Melvil Dewey

punched cards

1926

census

Annotators

chrisaldrich

URL

multimediaman.blog/2016/09/30/how-the-index-card-launched-the-information-age/
Jul 2022
Local file Local file

My Apprenticeship

3
1. chrisaldrich 22 Jul 2022
  
  in Public
  
  It was not until we had completely re-sorted all our innumerable sheets of paper according to subjects, thus bringing together all the facts relating to each, whatever the trade concerned, or the place or the date—and had shuffled and reshuffled these sheets according to various tentative hypotheses—that a clear, comprehensive and verifiable theory of the working and results of Trade Unionism emerged in our minds; to be embodied, after further researches by way of verification, in our Industrial Democracy (1897).
  
  Beatrice Webb was using her custom note taking system in the lead up to the research that resulted in the publication of Industrial Democracy (1897).
  
  Is there evidence that she was practicing this note taking/database practice earlier than this?
  
  Beatrice Webb card index as database databases note taking Industrial Democracy (book) 1897
2. chrisaldrich 22 Jul 2022
  
  in Public
  
  On many occasions we have been compelled to break off the writing of a particular chapter, or even of a particular paragraph, in order to test, by reshuffling the whole of our notes dealing with a particular subject, a particular place, a particular organisation or a particular date, the relative validity of hypotheses as to cause and effect. I may remark, parenthetically, that we have found this “game with reality”, this building up of one hypothesis and knocking it down in favour of others that had been revealed or verified by a new shuffle of the notes—especially when we severally “backed” rival hypotheses—a most stimulating recreation! In that way alone have we been able “to put our bias out of gear”, and to make our order of thought correspond, not with our own prepossessions, but with the order of things discovered by our investigations.
  
  Beatrice Webb's note taking system here shows indications of being actively used as a database system!
  
  intellectual history Beatrice Webb databases
3. chrisaldrich 19 Jul 2022
  
  in Public
  
  An instance may be given of the necessity of the “separate sheet” system. Among the many sources of information from which we constructed our book The Manor and the Borough were the hundreds of reports on particular boroughs made by the Municipal Corporation Commissioners in 1835. These four huge volumes are well arranged and very fully indexed; they were in our own possession; we had read them through more than once; and we had repeatedly consulted them on particular points. We had, in fact, used them as if they had been our own bound notebooks, thinking that this would suffice. But, in the end, we found ourselves quite unable to digest and utilise this material until we had written out every one of the innumerable facts on a separate sheet of paper, so as to allow of the mechanical absorption of these sheets among our other notes; of their complete assortment by subjects; and of their being shuffled and reshuffled to test hypotheses as to suggested co-existences and sequences.
  
  It sounds as though Webb had the mass of data, but what she really wanted was a database she could more easily query for her work and research. As a result, she took the flat file data and made it into a manually sortable and searchable database.
  
  note taking Beatrice Webb databases archives flat files edge-notched cards computer science search discovery
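The "shuffle and reshuffle" practice described in these three annotations maps loosely onto grouping records by different fields. The following is a minimal sketch of that idea, not a reconstruction of Webb's actual system: the `Slip` fields (subject, place, year, fact) are assumed from the sorting keys named in the quotations, and `reshuffle` is a hypothetical helper.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Slip:
    """One fact on one separate sheet (field names are assumptions)."""
    subject: str
    place: str
    year: int
    fact: str


def reshuffle(slips, field):
    """Group slips into piles by one field, like re-sorting the sheets."""
    piles = defaultdict(list)
    for slip in slips:
        piles[getattr(slip, field)].append(slip)
    return dict(piles)
```

Calling `reshuffle(slips, "subject")` and then `reshuffle(slips, "year")` on the same list mirrors re-sorting one stack of sheets by subject and then by date. A bound notebook corresponds to a list that can only be read in its original order.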
Tags

note taking

archives

1897

Beatrice Webb

computer science

Industrial Democracy (book)

databases

intellectual history

card index as database

search

edge-notched cards

flat files

discovery

Annotators

chrisaldrich
en.wikipedia.org en.wikipedia.org

Paper data storage - Wikipedia

1
1. chrisaldrich 19 Jul 2022
  
  in Public
  
  https://en.wikipedia.org/wiki/Paper_data_storage
  
  read paper data storage punched cards edge-notched cards note taking databases
Visit annotations in context

Tags

note taking

databases

punched cards

edge-notched cards

paper data storage

read

Annotators

chrisaldrich

URL

en.wikipedia.org/wiki/Paper_data_storage
May 2022
www.newyorker.com www.newyorker.com

Future Reading

1
1. chrisaldrich 20 May 2022
  
  in Public
  
  Perseus, a site, based at Tufts, specializing in Greek and Latin
  
  Perseus Project Latin Greek databases classics
Visit annotations in context

Tags

Latin

databases

classics

Greek

Perseus Project

Annotators

chrisaldrich

URL

newyorker.com/magazine/2007/11/05/future-reading
kiranrao.ca kiranrao.ca

Changing Tires at 100mph: A Guide to Zero Downtime Migrations | Kiran Rao

1
1. pyxelr 08 May 2022
  
  in Public
  
  Create the new empty table
  
  Write to both old and new table
  
  Copy data (in chunks) from old to new
  
  Validate consistency
  
  Switch reads to new table
  
  Stop writes to the old table
  
  Cleanup old table
  
  7 steps required while migrating to a new table
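
  The seven steps can be sketched end to end against an in-memory SQLite database. This is only an illustration under stated assumptions: the table names `users_old` and `users_new`, the `email` column, and the tiny chunk size are hypothetical, and a trigger stands in for the dual writes that a real application would usually perform in its own code (it mirrors inserts only, not updates or deletes).

```python
import sqlite3

CHUNK = 2  # deliberately tiny so the batching is visible


def migrate(conn: sqlite3.Connection) -> None:
    cur = conn.cursor()
    # 1. Create the new empty table (hypothetical schema).
    cur.execute(
        "CREATE TABLE users_new (id INTEGER PRIMARY KEY, email TEXT NOT NULL)"
    )
    # 2. Write to both tables: a trigger mirrors new inserts (assumption:
    #    a real system would dual-write from the application instead).
    cur.execute(
        """
        CREATE TRIGGER users_dual_write AFTER INSERT ON users_old
        BEGIN
            INSERT OR REPLACE INTO users_new (id, email)
            VALUES (NEW.id, NEW.email);
        END
        """
    )
    # 3. Copy existing rows in chunks, paging on the primary key.
    last_id = 0
    while True:
        rows = cur.execute(
            "SELECT id, email FROM users_old WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, CHUNK),
        ).fetchall()
        if not rows:
            break
        # OR IGNORE: a row may already have arrived through the dual write.
        cur.executemany(
            "INSERT OR IGNORE INTO users_new (id, email) VALUES (?, ?)", rows
        )
        last_id = rows[-1][0]
    # 4. Validate consistency before anything is switched over.
    old = cur.execute("SELECT id, email FROM users_old ORDER BY id").fetchall()
    new = cur.execute("SELECT id, email FROM users_new ORDER BY id").fetchall()
    if old != new:
        raise RuntimeError("old and new tables differ; aborting migration")
    # 5. Switch reads to users_new (an application-side change, not shown).
    # 6. Stop writes to the old table.
    cur.execute("DROP TRIGGER users_dual_write")
    # 7. Clean up the old table.
    cur.execute("DROP TABLE users_old")
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users_old (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO users_old (email) VALUES (?)",
    [("a@example.com",), ("b@example.com",), ("c@example.com",)],
)
migrate(conn)
migrated = conn.execute("SELECT COUNT(*) FROM users_new").fetchone()[0]
```

  Validating before dropping anything keeps the migration reversible up to step 6, which is probably the main reason the steps are ordered this way.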
  
  SQL databases
Visit annotations in context

Tags

databases

SQL

Annotators

pyxelr

URL

kiranrao.ca/2022/05/04/zero-downtime-migrations.html
Mar 2022
hypothes.is hypothes.is

Hipótesis

1
1. Itzel_irais 01 Mar 2022
  
  in Public
  
  https://www.oxfordjournals.org/nar/database/summary/813
  
  Biodatabases nardbs Immunological databases
Visit annotations in context

Tags

Biodatabases

Immunological databases

nardbs

Annotators

Itzel_irais

URL

hypothes.is/search
Jan 2022
www.slavevoyages.org www.slavevoyages.org

Trans-Atlantic Slave Trade

1
1. chrisaldrich 30 Jan 2022
  
  in Public
  
  Explore the Origins and Forced Relocations of Enslaved Africans Across the Atlantic World The SlaveVoyages website is a collaborative digital initiative that compiles and makes publicly accessible records of the largest slave trades in history. Search these records to learn about the broad origins and forced relocations of more than 12 million African people who were sent across the Atlantic in slave ships, and hundreds of thousands more who were trafficked within the Americas. Explore where they were taken, the numerous rebellions that occurred, the horrific loss of life during the voyages, the identities and nationalities of the perpetrators, and much more.
  
  https://www.slavevoyages.org/
  
  bookmark slavery United States history databases SlaveVoyages
Visit annotations in context

Tags

slavery

databases

bookmark

SlaveVoyages

history

United States

Annotators

chrisaldrich

URL

slavevoyages.org/
www.nytimes.com www.nytimes.com

Opinion | Is Slavery an Evil Beyond Measure?

1
1. chrisaldrich 30 Jan 2022
  
  in Public
  
  It is thanks to decades of painstaking, difficult work that we know a great deal about the scale of human trafficking across the Atlantic Ocean and about the people aboard each ship. Much of that research is available to the public in the form of the SlaveVoyages database. A detailed repository of information on individual ships, individual voyages and even individual people, it is a groundbreaking tool for scholars of slavery, the slave trade and the Atlantic world. And it continues to grow. Last year, the team behind SlaveVoyages introduced a new data set with information on the domestic slave trade within the United States, titled “Oceans of Kinfolk.”
  
  human trafficking slavery middle passage SlaveVoyages databases
Visit annotations in context

Tags

slavery

databases

human trafficking

SlaveVoyages

middle passage

Annotators

chrisaldrich

URL

nytimes.com/2022/01/28/opinion/slavery-voyages-data-sets.html
www.youtube.com www.youtube.com

Intro to Graph Thinking

1
1. chrisaldrich 13 Jan 2022
  
  in Public
  
  https://www.youtube.com/watch?v=z3Tvjf0buc8
  
  graph thinking
  
  intuitive
  
  speed, agility
  
  adaptability
  
  graph thinking: focuses on relationships to turn data into information and uses patterns to find meaning
  
  property graph data model
  
  relationships (connectors with verbs which can have properties)
  
  nodes (have names and can have properties)
  
  Examples:
  
  Purchase recommendations for products in real time
  
  Fraud detection
  
  Use for dependency analysis
  
  graphs graph thinking networks neo4j marketing category theory databases information theory relationships patterns graph theory mathematics watch
Visit annotations in context

Tags

neo4j

networks

marketing

information theory

category theory

graph theory

databases

graph thinking

graphs

watch

patterns

mathematics

relationships

Annotators

chrisaldrich

URL

youtube.com/watch
www.goedel.io www.goedel.io

Tools for Thought but not for Search?

1
1. chrisaldrich 13 Jan 2022
  
  in Public
  
  https://www.goedel.io/p/tools-for-thought-but-not-for-search
  
  Searching for two ingredients in an effort to find a recipe that will allow their use should be de rigueur in a personal knowledge manager; sadly, that doesn't appear to be the case.
  
  This sort of simple search not working in these tools is just silly.
  
  They should be able to search across blocks, pages, and even provide graph views to help in this process. Where are all the overlaps of these words within one's database?
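
  One way such an "overlap" search could work is a small inverted index whose per-word sets are intersected. This is a minimal sketch, not how any particular tool implements search; the note titles and texts are made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical notes: title -> body text.
notes = {
    "frittata": "Whisk eggs, add spinach and feta.",
    "pancakes": "Flour, eggs, milk.",
    "salad": "Spinach, walnuts, vinaigrette.",
}


def build_index(notes: dict) -> dict:
    """Map each lower-cased word to the set of notes that contain it."""
    index = defaultdict(set)
    for note_id, text in notes.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            index[word].add(note_id)
    return index


def search_all(index: dict, *terms: str) -> set:
    """Return the notes that contain every one of the given terms."""
    sets = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*sets) if sets else set()


index = build_index(notes)
both = search_all(index, "eggs", "spinach")
```

  The same index could be keyed by block rather than by page, which is the distinction the annotation's tags point at.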
  
  databases search recipes tools for thought personal knowledge management blocks versus pages
Visit annotations in context

Tags

recipes

databases

tools for thought

search

personal knowledge management

blocks versus pages

Annotators

chrisaldrich

URL

goedel.io/p/tools-for-thought-but-not-for-search
Jul 2021
testdriven.io testdriven.io

Developing and Testing an Asynchronous API with FastAPI and Pytest

1
1. pyxelr 19 Jul 2021
  
  in Public
  
  databases is an async SQL query builder that works on top of the SQLAlchemy Core expression language.
  
  databases Python package
  
  Python databases SQLAlchemy
Visit annotations in context

Tags

SQLAlchemy

Python

databases

Annotators

pyxelr

URL

testdriven.io/blog/fastapi-crud/
Jun 2021
drewdevault.com drewdevault.com

How to store data forever

7
1. pyxelr 13 Jun 2021
  
  in Public
  
  This is where off-site backups come into play. For this purpose, I recommend Borg backup. It has sophisticated features for compression and encryption, and allows you to mount any version of your backups as a filesystem to recover the data from. Set this up on a cronjob as well for as frequently as you feel the need to make backups, and send them off-site to another location, which itself should have storage facilities following the rest of the recommendations from this article. Set up another cronjob to run borg check and send you the results on a schedule, so that their conspicuous absence may indicate that something fishy is going on. I also use Prometheus with Pushgateway to make a note every time that a backup is run, and set up an alarm which goes off if the backup age exceeds 48 hours. I also have periodic test alarms, so that the alert manager’s own failures are noticed.
  
  Solution for human failures and existential threats:
  
  Borg backup on a cronjob
  
  Prometheus with Pushgateway
  
  databases backup
2. pyxelr 13 Jun 2021
  
  in Public
  
  RAID is complicated, and getting it right is difficult. You don’t want to wait until your drives are failing to learn about a gap in your understanding of RAID. For this reason, I recommend ZFS to most. It automatically makes good decisions for you with respect to mirroring and parity, and gracefully handles rebuilds, sudden power loss, and other failures. It also has features which are helpful for other failure modes, like snapshots. Set up Zed to email you reports from ZFS. Zed has a debug mode, which will send you emails even for working disks — I recommend leaving this on, so that their conspicuous absence might alert you to a problem with the monitoring mechanism. Set up a cronjob to do monthly scrubs and review the Zed reports when they arrive. ZFS snapshots are cheap - set up a cronjob to take one every 5 minutes, perhaps with zfs-auto-snapshot.
  
  ZFS is recommended (and not only for beginners) over the more complicated RAID
  
  databases ZFS RAID
3. pyxelr 13 Jun 2021
  
  in Public
  
  these days hardware RAID is almost always a mistake. Most operating systems have software RAID implementations which can achieve the same results without a dedicated RAID card.
  
  According to the author, software RAID is preferable to hardware RAID
  
  databases RAID
4. pyxelr 13 Jun 2021
  
  in Public
  
  Failing disks can show signs of it in advance — degraded performance, or via S.M.A.R.T reports. Learn the tools for monitoring your storage medium, such as smartmontools, and set it up to report failures to you (and test the mechanisms by which the failures are reported to you).
  
  Preventive maintenance of disk failures
  
  databases
5. pyxelr 13 Jun 2021
  
  in Public
  
  RAID gets more creative with three or more hard drives, utilizing parity, which allows it to reconstruct the contents of failed hard drives from still-online drives.
  
  If you are using RAID with parity and one of the three drives fails, you can still recover its contents thanks to the XOR operation
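
  The parity idea can be illustrated in a few lines. This is a toy model, assuming two data "drives" and one parity "drive" represented as equal-length byte strings; real RAID implementations work on striped blocks and rotate parity across drives.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equally sized byte blocks together, byte by byte."""
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("blocks must have equal length")
    out = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


# Hypothetical contents of two data drives (8 bytes each).
drive_a = b"hello wo"
drive_b = b"rld!!!!!"

# The third drive stores the parity of the other two.
parity = xor_blocks(drive_a, drive_b)

# If drive B fails, XOR-ing the survivors reconstructs it.
recovered_b = xor_blocks(drive_a, parity)
```

  Because XOR is its own inverse, the same function both computes the parity and rebuilds whichever single drive is lost.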
  
  databases RAID
6. pyxelr 13 Jun 2021
  
  in Public
  
  A more reliable solution is to store the data on a hard drive. However, hard drives are rated for a limited number of read/write cycles, and can be expected to fail eventually.
  
  Hard drives are a better lifetime option than microSD cards but still not ideal
  
  databases
7. pyxelr 13 Jun 2021
  
  in Public
  
  The worst way I can think of is to store it on a microSD card. These fail a lot. I couldn’t find any hard data, but anecdotally, 4 out of 5 microSD cards I’ve used have experienced failures resulting in permanent data loss.
  
  microSD cards aren't recommended for storing lifetime data
  
  databases
Visit annotations in context

Tags

databases

ZFS

backup

RAID

Annotators

pyxelr

URL

drewdevault.com/2020/04/22/How-to-store-data-forever.html
Mar 2021
antonz.org antonz.org

SQLite is not a toy database

1
1. pyxelr 28 Mar 2021
  
  in Public
  
  The console is a killer SQLite feature for data analysis: more powerful than Excel and more simple than pandas. One can import CSV data with a single command, the table is created automatically
  
  SQLite makes it fairly easy to import and analyse data. For example:
  
  .import --csv city.csv city
  
  select count(*) from city;
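
  The two commands above belong to the SQLite console. A rough equivalent from Python's standard library might look like the following; the CSV contents are invented for the example, and, as with the console import, every column is created without a declared type.

```python
import csv
import io
import sqlite3

# Hypothetical stand-in for city.csv.
CSV_TEXT = (
    "city,country,population\n"
    "Berlin,DE,3645000\n"
    "Paris,FR,2161000\n"
    "Rome,IT,2873000\n"
)


def import_csv(conn: sqlite3.Connection, table: str, text: str) -> None:
    """Create `table` from the CSV header row and load the remaining rows."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    conn.commit()


conn = sqlite3.connect(":memory:")
import_csv(conn, "city", CSV_TEXT)
city_count = conn.execute("SELECT count(*) FROM city").fetchone()[0]
```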
  
  SQLite DataScience databases
Visit annotations in context

Tags

SQLite

databases

DataScience

Annotators

pyxelr

URL

antonz.org/sqlite-is-not-a-toy-database/
antonz.org antonz.org

How to create a 1M record table with a single query

1
1. pyxelr 28 Mar 2021
  
  in Public
  
  This is not a problem if your DBMS supports SQL recursion: lots of data can be generated with a single query. The WITH RECURSIVE clause comes to the rescue.
  
  WITH RECURSIVE can help you quickly generate a series of random data.
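
  A minimal sketch of the idea, run through Python's `sqlite3` module: a recursive common table expression counts from 1 to 1,000 and each row receives a pseudo-random value. The table and column names (`random_data`, `id`, `value`) and the row count are assumptions made for the example, not taken from the article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE random_data AS
    WITH RECURSIVE seq(n) AS (
        SELECT 1
        UNION ALL
        SELECT n + 1 FROM seq WHERE n < 1000  -- stop condition
    )
    -- random() yields a signed 64-bit integer; reduce it to 0..99
    SELECT n AS id, abs(random() % 100) AS value
    FROM seq
    """
)
row_count, min_value, max_value = conn.execute(
    "SELECT count(*), min(value), max(value) FROM random_data"
).fetchone()
```

  Raising the bound in the `WHERE` clause is all it takes to generate a larger table.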
  
  SQL DataScience databases
Visit annotations in context

Tags

DataScience

databases

SQL

Annotators

pyxelr

URL

antonz.org/random-table/
Oct 2020
dev.to dev.to

How to Use Google Sheets as a Database (Responsibly) - DEV Community 👩‍💻👨‍💻

1
1. pyxelr 13 Oct 2020
  
  in Public
  
  Queries became impractically slow around the 500,000 cell mark, but were still below 2 seconds for a 100,000 cell query. Therefore, if you anticipate a dataset larger than a few hundred thousand cells, it would probably be smart to choose a more scalable option.
  
  Scalability of Google Sheets. They have a hard limit of 5,000,000 cells (including blank ones)
  
  GoogleSheets databases
Visit annotations in context

Tags

GoogleSheets

databases

Annotators

pyxelr

URL

dev.to/hacubu/how-to-use-google-sheets-as-a-database-responsibly-3ohk
Sep 2020
duckdb.org duckdb.org

DuckDB - An embeddable SQL OLAP database management system

1
1. pyxelr 26 Sep 2020
  
  in Public
  
  DuckDB is an embeddable SQL OLAP database management system
  
  A database that, like SQLite, requires no server, while offering some of the advantages of PostgreSQL
  
  databases DataEngineering
Visit annotations in context

Tags

databases

DataEngineering

Annotators

pyxelr

URL

duckdb.org/
Aug 2020
www.splitgraph.com www.splitgraph.com

Port 5432 is open: introducing the Splitgraph Data Delivery Network

1
1. pyxelr 29 Aug 2020
  
  in Public
  
  The Splitgraph DDN is a single SQL endpoint that lets you query over 40,000 public datasets hosted on or proxied by Splitgraph. You can connect to it from most PostgreSQL clients and BI tools without having to install anything else. It supports all read-only SQL constructs, including filters and aggregations. It even lets you run joins across distinct datasets.
  
  Splitgraph - efficient DDN (Data Delivery Network):
  
  connect to it from most PostgreSQL clients and BI tools without having to install anything else
  
  you can query 40k+ public datasets hosted on or proxied by Splitgraph
  
  supports all read-only SQL constructs (even SQL joins across datasets)
  
  databases DataScience
Visit annotations in context

Tags

DataScience

databases

Annotators

pyxelr

URL

splitgraph.com/
bahr.dev bahr.dev

Archive your AWS data to reduce storage cost

1
1. pyxelr 19 Aug 2020
  
  in Public
  
  Archive your AWS data to reduce storage cost
  
  By archiving data on AWS, we can reduce storage costs by up to 97%
  
  AWS databases cloudcomputing
Visit annotations in context

Tags

databases

AWS

cloudcomputing

Annotators

pyxelr

URL

bahr.dev/2020/08/07/archiving-data/
Jul 2020
dev.to dev.to

When to choose NoSQL over SQL? - DEV Community 👩‍💻👨‍💻

2
1. pyxelr 10 Jul 2020
  
  in Public
  
  So in brief, for our application service, if we understand the access patterns very well, they’re repeatable, they’re consistent, and scalability is a big factor, then NoSQL is a perfect choice.
  
  When NoSQL is a perfect choice
  
  NoSQL databases
2. pyxelr 10 Jul 2020
  
  in Public
  
  Comparison Time … 🤞
  
  Brief comparison of 8 aspects between SQL vs NoSQL
  
  SQL NoSQL databases
Visit annotations in context

Tags

NoSQL

databases

SQL

Annotators

pyxelr

URL

dev.to/ombharatiya/when-to-choose-nosql-over-sql-536p
May 2020
muldoon.cloud muldoon.cloud

Rules of thumb for a 1x developer

1
1. pyxelr 02 May 2020
  
  in Public
  
  Which database technology to choose
  
  Which database to choose (advice from an Amazon employee):
  
  SQL - ad hoc queries and/or support of ACID and transactions
  
  NoSQL - otherwise. NoSQL is getting better with transactions and PostgreSQL is getting better with availability, scalability, durability
  
  programming databases SQL NoSQL PostgreSQL
Visit annotations in context

Tags

NoSQL

programming

PostgreSQL

databases

SQL

Annotators

pyxelr

URL

muldoon.cloud/programming/2020/04/17/programming-rules-thumb.html
Apr 2020
web.archive.org web.archive.org

Folksonomies: how to do things with words on social media | OxfordWords blog

1
1. chrisaldrich 29 Apr 2020
  
  in Public
  
  From a narratological perspective, it would probably be fair to say that most databases are tragic. In their design, the configuration of their user interfaces, the selection of their contents, and the indexes that manage their workings, most databases are limited when set against the full scope of the field of information they seek to map and the knowledge of the people who created them. In creating a database, we fight against the constraints of the universe – the categories we use to sort out the world; the limitations of time and money and technology – and succumb to them.
  
  databases are tragic!
  
  databases database antipattern
Visit annotations in context

Tags

databases

database antipattern

Annotators

chrisaldrich

URL

web.archive.org/web/20190403094328/https://blog.oxforddictionaries.com/2018/10/30/folksonomies-things-words-social-media/
dev.to dev.to

What software technologies will earn you the highest pay?

1
1. pyxelr 29 Apr 2020
  
  in Public
  
  Finding a database management system that works for you
  
  Well paid database technologies:
  
  programming salary databases
Visit annotations in context

Tags

programming

databases

salary

Annotators

pyxelr

URL

dev.to/educative/what-software-technologies-will-earn-you-the-highest-pay-3fc3
rakyll.medium.com rakyll.medium.com

Things I Wished More Developers Knew About Databases

1
1. pyxelr 26 Apr 2020
  
  in Public
  
  I’m sharing a few insights I specifically found useful for developers who are not specialized in this domain.
  
  Insights on databases from a Google engineer:
  
  You are lucky if 99.999% of the time network is not a problem.
  
  ACID has many meanings.
  
  Each database has different consistency and isolation capabilities.
  
  Optimistic locking is an option when you can’t hold a lock.
  
  There are anomalies other than dirty reads and data loss.
  
  My database and I don’t always agree on ordering.
  
  Application-level sharding can live outside the application.
  
  AUTOINCREMENT’ing can be harmful.
  
  Stale data can be useful and lock-free.
  
  Clock skews happen between any clock sources.
  
  Latency has many meanings.
  
  Evaluate performance requirements per transaction.
  
  Nested transactions can be harmful.
  
  Transactions shouldn’t maintain application state.
  
  Query planners can tell a lot about databases.
  
  Online migrations are complex but possible.
  
  Significant database growth introduces unpredictability.
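
  Of the points above, optimistic locking is perhaps the easiest to show in code. The sketch below assumes a hypothetical `accounts` table with a `version` column: a writer only succeeds if the row still carries the version it originally read, so no lock is held between reading and writing.

```python
import sqlite3


def update_balance(
    conn: sqlite3.Connection, account_id: int, new_balance: int, expected_version: int
) -> bool:
    """Write only if nobody changed the row since `expected_version` was read."""
    cur = conn.execute(
        "UPDATE accounts SET balance = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_balance, account_id, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 rows touched means we lost the race


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, balance INTEGER NOT NULL, version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO accounts VALUES (1, 100, 0)")

# Two writers both read version 0, then try to write in turn.
first_ok = update_balance(conn, 1, 150, expected_version=0)
second_ok = update_balance(conn, 1, 80, expected_version=0)
final_row = conn.execute(
    "SELECT balance, version FROM accounts WHERE id = 1"
).fetchone()
```

  The losing writer is expected to re-read the row and retry, or to report the conflict to the user.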
  
  databases
Visit annotations in context

Tags

databases

Annotators

pyxelr

URL

rakyll.medium.com/things-i-wished-more-developers-knew-about-databases-2d0178464f78
news.ycombinator.com news.ycombinator.com

Falcon is a free, open-source SQL editor with inline data visualization | Hacker News

1
1. pyxelr 17 Apr 2020
  
  in Public
  
  1) Redash and Falcon focus on people that want to do visualizations on top of SQL 2) Superset, Tableau and PowerBI focus on people that want to do visualizations with a UI 3) Metabase and SeekTable focus on people that want to do quick analysis (they are the closest to an Excel replacement)
  
  Comparison of data analysis tools:
  
  1) Redash & Falcon - SQL focus
  
  2) Superset, Tableau & PowerBI - UI workflow
  
  3) Metabase & SeekTable - Excel-like experience
  
  DataScience databases SQL
Visit annotations in context

Tags

DataScience

databases

SQL

Annotators

pyxelr

URL

news.ycombinator.com/item
Mar 2020
beepb00p.xyz beepb00p.xyz

Against unnecessary databases | beepb00p

3
1. pyxelr 02 Mar 2020
  
  in Public
  
  supporting this field is extremely easy If you keep raw data, it's just a matter of adding a getter method to the Article class.
  
  Way of supporting a new field in JSON is much easier than in a relational database:
  
  @property
  def highlights(self) -> Sequence[Highlight]:
      default = []  # defensive to handle older export formats that had no annotations
      jsons = self.json.get('annotations', default)
      return list(map(Highlight, jsons))
  
  databases Python
2. pyxelr 02 Mar 2020
  
  in Public
  
  query language doesn't necessarily mean a database. E.g. see pandas which is capable of what SQL is capable of, and even more convenient than SQL for our data exploration purposes.
  
  A query language does not always imply a database; see pandas, for example
  
  databases Python
3. pyxelr 02 Mar 2020
  
  in Public
  
  cachew lets you cache function calls into an sqlite database on your disk in a matter of single decorator (similar to functools.lru_cache). The difference from functools.lru_cache is that cached data is persisted between program runs, so next time you call your function, it will only be a matter of reading from the cache.
  
  cachew tool isolates the complexity of database access patterns in a Python library
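
  The underlying idea, a decorator that persists results in SQLite, can be sketched with the standard library alone. This is not cachew's API or implementation; `persistent_cache`, the `cache` table, and the JSON-based keys are hypothetical choices that only work for JSON-serialisable arguments and results.

```python
import functools
import json
import os
import sqlite3
import tempfile


def persistent_cache(db_path: str):
    """Cache a function's results in an SQLite file (illustrative only)."""

    def decorator(func):
        conn = sqlite3.connect(db_path)
        conn.execute(
            "CREATE TABLE IF NOT EXISTS cache "
            "(key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )

        @functools.wraps(func)
        def wrapper(*args):
            key = json.dumps([func.__name__, args])
            row = conn.execute(
                "SELECT value FROM cache WHERE key = ?", (key,)
            ).fetchone()
            if row is not None:
                return json.loads(row[0])  # cache hit: skip the computation
            result = func(*args)
            conn.execute(
                "INSERT INTO cache (key, value) VALUES (?, ?)",
                (key, json.dumps(result)),
            )
            conn.commit()
            return result

        return wrapper

    return decorator


db_file = os.path.join(tempfile.mkdtemp(), "cache.sqlite")
calls = []  # records how often the real function body runs


@persistent_cache(db_file)
def slow_square(n: int) -> int:
    calls.append(n)
    return n * n


first = slow_square(12)
second = slow_square(12)  # served from the SQLite file
```

  Because the results live in a file rather than in memory, a later run of the program pointing at the same path would also hit the cache.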
  
  databases Python
Visit annotations in context

Tags

Python

databases

Annotators

pyxelr

URL

beepb00p.xyz/unnecessary-db.html
Feb 2020
beepb00p.xyz beepb00p.xyz

Against unnecessary databases | beepb00p

4
1. pyxelr 21 Feb 2020
  
  in Public
  
  Imagine that you're using a database to export them, so your schema is: TABLE Article(STRING id, STRING url, STRING title, DATETIME added). One day, the developers expose highlights (or annotations) from the private API and your export script starts receiving it in the response JSON. It's quite useful data to have! However, your database can't just magically change to conform to the new field.
  
  The relational model can sometimes tie your hands, unlike JSON
  
  databases
2. pyxelr 21 Feb 2020
  
  in Public
  
  Storage saved by using a database instead of plaintext is marginal and not worth the effort.
  
  Databases save some space used by data, but it's marginal
  
  databases
3. pyxelr 21 Feb 2020
  
  in Public
  
  if necessary use databases as an intermediate layer to speed access up and as an additional interface to your data Nothing wrong with using databases for caching if you need it!
  
  You may want to use databases for:
  
  speeding access up
  
  creating additional layer
  
  caching
  
  databases
4. pyxelr 21 Feb 2020
  
  in Public
  
  I want to argue very strongly against forcing the data in the database, unless it's really inevitable.
  
  After scraping some data, don't go immediately to databases, unless it's a great stream of data
  
  databases
Visit annotations in context

Tags

databases

Annotators

pyxelr

URL

beepb00p.xyz/unnecessary-db.html
Jan 2020
earthworks.stanford.edu earthworks.stanford.edu

Global GIS : volcanoes of the world ; volcano basic data in EarthWorks

1
1. jferrer 03 Jan 2020
  
  in Public
  
  Another version of the volcano database
  
  F2.9 T3.4 global databases
Visit annotations in context

Tags

global databases

F2.9

T3.4

Annotators

jferrer

URL

earthworks.stanford.edu/catalog/harvard-glb-volc
catalog.data.gov catalog.data.gov

Global Volcano Locations Database - Data.gov

1
1. jferrer 03 Jan 2020
  
  in Public
  
  Known location of known volcanoes with activity records
  
  F2.9 T3.4 global databases
Visit annotations in context

Tags

global databases

F2.9

T3.4

Annotators

jferrer

URL

catalog.data.gov/dataset/global-volcano-locations-database
Nov 2019
github.com github.com

Thoughts on Foreign Keys? · Issue #331 · github/gh-ost

3
1. pyxelr 17 Nov 2019
  
  in Public
  
  FKs don't work well with online schema migrations.
  
  3rd reason why at GitHub they don't rely on Foreign Keys: Working with online schema migrations.
  
  FKs impose a lot of constraints on what's possible and what's not possible
  
  databases
2. pyxelr 17 Nov 2019
  
  in Public
  
  FKs are a performance impact. The fact they require indexes is likely fine, since those indexes are needed anyhow. But the lookup made for each insert/delete is an overhead.
  
  2nd reason why at GitHub they don't rely on Foreign Keys: FK performance impact
  
  databases
3. pyxelr 17 Nov 2019
  
  in Public
  
  FKs are in your way to shard your database. Your app is accustomed to rely on FK to maintain integrity, instead of doing it on its own. It may even rely on FK to cascade deletes (shudder). When eventually you want to shard or extract data out, you need to change & test the app to an unknown extent.
  
  1st reason why at GitHub they don't rely on Foreign Keys: the app comes to rely on FKs to maintain integrity instead of enforcing it itself, which gets in the way of sharding
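
  What "doing it on its own" might look like in the application is sketched below: the tables declare no foreign keys, and a hypothetical `delete_user` helper removes child rows and the parent row inside a single transaction. The `users` and `posts` tables are invented for the example and say nothing about GitHub's actual schema.

```python
import sqlite3


def delete_user(conn: sqlite3.Connection, user_id: int) -> None:
    """Application-level cascade: children first, then the parent."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute("DELETE FROM posts WHERE user_id = ?", (user_id,))
        conn.execute("DELETE FROM users WHERE id = ?", (user_id,))


conn = sqlite3.connect(":memory:")
# Note: no FOREIGN KEY clause; integrity is the application's job here.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE posts ("
    "id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, body TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)", [(1, "ada"), (2, "grace")]
)
conn.executemany(
    "INSERT INTO posts (user_id, body) VALUES (?, ?)",
    [(1, "first"), (1, "second"), (2, "third")],
)
conn.commit()

delete_user(conn, 1)
remaining_users = conn.execute("SELECT id, name FROM users").fetchall()
remaining_posts = conn.execute("SELECT user_id, body FROM posts").fetchall()
```

  Keeping this logic in the application means the two tables could later live on different shards, at the cost of the database no longer guaranteeing the relationship.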
  
  databases
Visit annotations in context

Tags

databases

Annotators

pyxelr

URL

github.com/github/gh-ost/issues/331
Sep 2019
www.prisma.io www.prisma.io

Comparing Database Types: How Database Types Evolved to Meet Different Needs | Prisma

30
1. pyxelr 29 Sep 2019
  
  in Public
  
  To address the availability concern, new architectures were developed to minimize the impact of partitions. For instance, splitting data sets into smaller ranges called shards can minimize the amount of data that is unavailable during partitions. Furthermore, mechanisms to automatically alter the roles of various cluster members based on network conditions allow them to regain availability quickly
  
  Qualities of NewSQL - mainly minimisation of the impact of partitions
  
  databases
2. pyxelr 29 Sep 2019
  
  in Public
  
  typically less flexible and generalized than their more conventional relational counterparts. They also usually only offer a subset of full SQL and relational features, which means that they might not be able to handle certain kinds of usage. Many NewSQL implementations also store a large part of or their entire dataset in the computer's main memory. This improves performance at the cost of greater risk to unpersisted changes
  
  Differences between NewSQL and relational databases:
  
  typically less flexible and generalized
  
  usually only offer a subset of full SQL and relational features, which means that they might not be able to handle certain kinds of usage.
  
  many NewSQL implementations also store a large part of or their entire dataset in the computer's main memory. This improves performance at the cost of greater risk to unpersisted changes.
  
  databases
3. pyxelr 29 Sep 2019
  
  in Public
  
  using a mixture of different database types is the best approach for handling the data of your projects
  
  Many times mixing different databases is a good approach.
  
  For example:
  
  store user information - relational databases
  
  configuration values - in-memory key-value store
  
  databases
4. pyxelr 29 Sep 2019
  
  in Public
  
  best suited for use cases with high volumes of relational data in distributed, cloud-like environments
  
  Best use case for NewSQL
  
  databases
5. pyxelr 29 Sep 2019
  
  in Public
  
  CAP theorem is a statement about the trade offs that distributed databases must make between availability and consistency. It asserts that in the event of a network partition, a distributed database can choose either to remain available or remain consistent, but it cannot do both. Cluster members in a partitioned network can continue operating, leading to at least temporary inconsistency. Alternatively, at least some of the disconnected members must refuse to alter their data during the partition to ensure data consistency
  
  CAP Theorem relating to distributed databases
  
  databases
6. pyxelr 29 Sep 2019
  
  in Public
  
  NewSQL databases: bringing modern scalability and performance to the traditional relational pattern
  
  NewSQL databases - designed with scalability and modern performance requirements in mind. They follow the relational structure and semantics, but are built using a more modern, scalable design. Rise in popularity in 2010s.
  
  Examples:
  
  MemSQL
  
  VoltDB
  
  Spanner
  
  Calvin
  
  CockroachDB
  
  FaunaDB
  
  yugabyteDB
  
  databases
7. pyxelr 29 Sep 2019
  
  in Public
  
  aggregate queries like summing, averaging, and other analytics-oriented processes can be difficult or impossible
  
  Disadvantage of column databases
  
  databases
8. pyxelr 29 Sep 2019
  
  in Public
  
  Column-family databases are good when working with applications that require great performance for row-based operations and high scalability
  
  Advantage of column databases. They also collect row data in a cluster on the same machine, simplifying data sharding and scaling
  
  databases
9. pyxelr 29 Sep 2019
  
  in Public
  
  it helps to think of column family databases as key-value databases where each key (row identifier) returns a dictionary of arbitrary attributes and their values (the column names and their values)
  
  Tip to remember the idea of column databases
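
  That mental model translates almost literally into a dictionary of dictionaries. The row keys and column names below are made up; the point is only that each row carries its own set of columns.

```python
# Row key -> {column name: value}; rows need not share the same columns.
column_family = {
    "user:1": {"name": "Ada", "email": "ada@example.com"},
    "user:2": {"name": "Grace", "last_login": "2019-09-29", "theme": "dark"},
}


def get_row(row_key: str) -> dict:
    """Return every column stored for one row (empty if the row is unknown)."""
    return column_family.get(row_key, {})


def get_column(row_key: str, column: str, default=None):
    """Return a single column value, or `default` if this row lacks it."""
    return get_row(row_key).get(column, default)


theme_2 = get_column("user:2", "theme")
theme_1 = get_column("user:1", "theme")  # this row never defined "theme"
```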
  
  databases
10. pyxelr 29 Sep 2019
  
  in Public
  
  Column-family databases: databases with flexible columns to bridge the gap between relational and document databases
  
  Column-family databases - also called non-relational column stores, wide-column databases or column databases. Rise in popularity in 2000s. They look very similar to relational databases. They have structures called column families, which contain rows of data, each of which defines its own format. Therefore, each row in a column family defines its own schema.
  
  Examples:
  
  Cassandra
  
  HBase
  
  databases
11. pyxelr 29 Sep 2019
  
  in Public
  
  querying for the connection between two users of a social media site in a relational database is likely to require multiple table joins and therefore be rather resource intensive. This same query would be straightforward in a graph database that directly maps connections
  
  Social media prefers graph databases over relational ones
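
  When connections are stored directly, finding how two users are linked becomes a plain graph traversal rather than a chain of joins. A minimal sketch with a breadth-first search over a hypothetical `follows` adjacency map:

```python
from collections import deque

# Hypothetical "who follows whom" edges.
follows = {
    "alice": {"bob"},
    "bob": {"carol"},
    "carol": {"dave"},
    "dave": set(),
    "erin": {"alice"},
}


def shortest_path(graph: dict, start: str, goal: str):
    """Breadth-first search; returns the list of users linking start to goal."""
    if start == goal:
        return [start]
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        for neighbour in graph.get(path[-1], ()):
            if neighbour in seen:
                continue
            if neighbour == goal:
                return path + [neighbour]
            seen.add(neighbour)
            queue.append(path + [neighbour])
    return None  # no connection in this direction


path = shortest_path(follows, "alice", "dave")
```

  In a relational schema, each hop of this path would typically correspond to another self-join on the table of connections.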
  
  databases
12. pyxelr 29 Sep 2019
  
  in Public
  
  Graph databases are most useful when working with data where the relationships or connections are highly important
  
  Major use of graph databases
  
  databases
13. pyxelr 29 Sep 2019
  
  in Public
  
  network databases require step-by-step traversal to travel between items and are limited in the types of relationships they can represent.
  
  Difference between network databases (a legacy, pre-relational model) and graph databases (NoSQL)
  
  databases
14. pyxelr 29 Sep 2019
  
  in Public
  
  Graph databases: mapping relationships by focusing on how connections between data are meaningful
  
  Graph databases - establish connections using the concepts of nodes, edges, and properties. Rise in popularity in 2000s.
  
  Examples:
  
  Neo4j
  
  JanusGraph
  
  Dgraph
  
  databases
15. pyxelr 29 Sep 2019
  
  in Public
  
  Document databases: Storing all of an item's data in flexible, self-describing structures
  
  Document databases - also known as document-oriented databases or document stores, share the basic access and retrieval semantics of key-value stores. Rise in popularity in 2009.
  
  They also use keys to uniquely identify data, therefore the line between advanced key-value stores and document databases can be fairly unclear.
  
  Instead of storing arbitrary blobs of data, document databases store data in structured formats called documents, often using formats like JSON, BSON, or XML.
  
  Examples:
  
  MongoDB
  
  RethinkDB
  
  Couchbase
  
  databases
16. pyxelr 29 Sep 2019
  
  in Public
  
  Document databases are a good choice for rapid development because you can change the properties of the data you want to save at any point without altering existing structures or data. You only need to backfill records if you want to. Each document within the database stands on its own with its own system of organization. If you're still figuring out your data structure and your data is mainly composed of discrete entries that don't include a lot of cross references, a document database might be a good place to start. Be careful, however, as the extra flexibility means that you are responsible for maintaining the consistency and structure of your data, which can be extremely challenging
  
  Pros and cons of document databases
  
  databases
17. pyxelr 29 Sep 2019
  
  in Public
  
  Though the data within documents is organized within a structure, document databases do not prescribe any specific format or schema
  
  Therefore, unlike in key-value stores, the content stored in document databases can be queried and analysed
  
  databases
18. pyxelr 29 Sep 2019
  
  in Public
  
  Key-value stores are often used to store configuration data, state information, and any data that might be represented by a dictionary or hash in a programming language. Key-value stores provide fast, low-complexity access to this type of data
  
  Use and advantages of key-value stores
  
  databases
19. pyxelr 29 Sep 2019
  
  in Public
  
  Key-value databases: simple, dictionary-style lookups for basic storage and retrieval
  
  Key-value databases - one of the simplest database types. Initially introduced in 1970s (rise in popularity: 2000-2010). Work by storing arbitrary data accessible through a specific key.
  
  to store data, you provide a key and the blob of data you wish to save, for example a JSON object, an image, or plain text.
  
  to retrieve data, you provide the key and will then be given the blob of data back.
  
  Examples:
  
  Redis
  
  memcached
  
  etcd
  
  databases
20. pyxelr 29 Sep 2019
  
  in Public
  
  NoSQL databases: modern alternatives for data that doesn't fit the relational paradigm
  
  NoSQL databases - the name stands for either non-SQL or not only SQL, the latter clarifying that they sometimes allow SQL-like querying.
  
  4 types:
  
  Key-value
  
  Document
  
  Graph
  
  Column-family
  
  databases
21. pyxelr 29 Sep 2019
  
  in Public
  
  relational databases are often a good fit for any data that is regular, predictable, and benefits from the ability to flexibly compose information in various formats. Because relational databases work off of a schema, it can be more challenging to alter the structure of data after it is in the system. However, the schema also helps enforce the integrity of the data, making sure values match the expected formats, and that required information is included. Overall, relational databases are a solid choice for many applications because applications often generate well-ordered, structured data
  
  Pros and cons of relational databases
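
As a rough illustration of a schema enforcing integrity, here is a sketch using Python's built-in `sqlite3` module. The `accounts` table and its columns are made up for the example. SQLite is lenient about declared column types, so the sketch relies on `NOT NULL`, `UNIQUE`, and `CHECK` constraints rather than on type checking.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE accounts (
        id      INTEGER PRIMARY KEY,
        email   TEXT NOT NULL UNIQUE,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )
    """
)
insert_sql = "INSERT INTO accounts (email, balance) VALUES (?, ?)"
conn.execute(insert_sql, ("ada@example.com", 10))  # a valid row

bad_rows = [
    (None, 5),                # required value is missing
    ("ada@example.com", 5),   # duplicates an existing email
    ("bob@example.com", -1),  # violates the CHECK constraint
]
rejected = []
for row in bad_rows:
    try:
        conn.execute(insert_sql, row)
    except sqlite3.IntegrityError as exc:
        # The database refuses the row instead of storing bad data.
        rejected.append(str(exc))

row_count = conn.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]
```

All three bad rows are refused and only the valid one is stored. The cost is the one named in the quote: changing these rules later means altering the table itself.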
  
  databases
22. pyxelr 29 Sep 2019
  
  in Public
  
  querying language called SQL, or structured query language, was created to access and manipulate data stored with that format
  
  SQL was created for relational databases
  
  databases
23. pyxelr 29 Sep 2019
  
  in Public
  
  Relational databases: working with tables as a standard solution to organize well-structured data
  
  Relational databases - oldest general purpose database type still widely used today. They comprise the majority of databases currently used in production. Initially introduced in 1969.
  
  They organise data using tables - structures that impose a schema on the records that they hold.
  
  each column has a name and a data type
  
  each row represents an individual record
  
  Examples:
  
  MySQL
  
  MariaDB
  
  PostgreSQL
  
  SQLite
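
A small sketch of the table model described above, using SQLite through Python's built-in `sqlite3` module. The `books` table and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The table definition is the schema: every column has a name and a data type.
conn.execute(
    "CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT NOT NULL, year INTEGER)"
)

# Each inserted tuple becomes one row, that is, one individual record.
conn.executemany(
    "INSERT INTO books (title, year) VALUES (?, ?)",
    [("Dune", 1965), ("Neuromancer", 1984), ("Snow Crash", 1992)],
)

# PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
columns = [
    (name, col_type)
    for _cid, name, col_type, *_rest in conn.execute("PRAGMA table_info(books)")
]

# SQL selects rows by the values in their columns.
titles_after_1980 = [
    title
    for (title,) in conn.execute(
        "SELECT title FROM books WHERE year > ? ORDER BY year", (1980,)
    )
]
```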
  
  databases
24. pyxelr 29 Sep 2019
  
  in Public
  
  database schema is a description of the logical structure of a database or the elements it contains. Schemas often include declarations for the structure of individual entries, groups of entries, and the individual attributes that database entries are comprised of. These may also define data types and additional constraints to control the type of data that may be added to the structure
  
  Database schema
  
  databases
25. pyxelr 29 Sep 2019
  
  in Public
  
  Network databases: mapping more flexible connections with non-hierarchical links
  
  Network databases - built on the foundation provided by hierarchical databases by adding additional flexibility. Initially introduced in the late 1960s. Instead of always having a single parent, as in hierarchical databases, network database entries can have more than one parent, which effectively allows them to model more complex relationships.
  
  Examples:
  
  IDMS
  
  Have graph-like structure
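
To make the "more than one parent" idea concrete, here is a sketch in plain Python. It does not reflect how IDMS stores or navigates records; the record names and the `links` list are hypothetical.

```python
from collections import defaultdict

# Hypothetical (child, parent) links. A child may appear with several parents.
links = [
    ("order-1", "customer-ada"),
    ("order-1", "warehouse-east"),
    ("order-2", "customer-ada"),
]

parents = defaultdict(set)
children = defaultdict(set)
for child, parent in links:
    parents[child].add(parent)
    children[parent].add(child)

# order-1 belongs to both a customer and a warehouse, which a strict
# single-parent tree could not express without duplicating the record.
order_1_parents = sorted(parents["order-1"])
ada_orders = sorted(children["customer-ada"])
```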
  
  databases
26. pyxelr 29 Sep 2019
  
  in Public
  
  Hierarchical databases: using parent-child relationships to map data into trees
  
  Hierarchical databases - the next evolution in database development. Initially introduced in the 1960s. They encode a relationship between items where every record has a single parent.
  
  Examples:
  
  Filesystems
  
  DNS
  
  LDAP directories
  
  Have tree-like structure
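
The single-parent rule can be sketched as a mapping from each record to its parent. The filesystem-like names below are hypothetical, and the helpers are illustrations rather than any real database's API.

```python
# Hypothetical tree: every record names exactly one parent (the root has None).
parent_of = {
    "/": None,
    "etc": "/",
    "home": "/",
    "passwd": "etc",
    "ada": "home",
}


def path_to(record):
    """Follow parent links up to the root and return the path root-first."""
    path = []
    while record is not None:
        path.append(record)
        record = parent_of[record]
    return list(reversed(path))


def children_of(record):
    """Return the records whose parent is the given record."""
    return sorted(name for name, parent in parent_of.items() if parent == record)


passwd_path = path_to("passwd")
root_children = children_of("/")
```

Reaching a record means walking the hierarchy one level at a time, which is the traversal overhead often cited against this model.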
  
  databases
27. pyxelr 29 Sep 2019
  
  in Public
  
  Hierarchical databases are not used much today due to their limited ability to organize most data and because of the overhead of accessing data by traversing the hierarchy
  
  Hierarchical databases aren't used as much anymore
  
  databases
28. pyxelr 29 Sep 2019
  
  in Public
  
  The first flat file databases represented information in regular, machine parse-able structures within files. Data is stored in plain text, which limits the type of content that can be represented within the database itself. Sometimes, a special character or other indicator is chosen to use as a delimiter, or marker for when one field ends and the next begins. For example, a comma is used in CSV (comma-separated values) files, while colons or white-space are used in many data files in Unix-like systems
  
  Flat-file databases - 1st type of databases with a simple data structure for organising small amounts of local data.
  
  Examples:
  
  /etc/passwd and /etc/fstab on Linux and Unix-like systems
  
  CSV files
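
A short sketch of reading the two delimiter styles mentioned above, using only the Python standard library. The sample line and CSV text are made-up values, not taken from a real system.

```python
import csv
import io

# One record in the colon-delimited style of /etc/passwd (values are made up).
passwd_line = "ada:x:1000:1000:Ada Lovelace:/home/ada:/bin/bash"
passwd_fields = ["name", "password", "uid", "gid", "gecos", "home", "shell"]
account = dict(zip(passwd_fields, passwd_line.split(":")))

# CSV uses commas as the delimiter; quoting lets a field contain a comma.
csv_text = 'name,city\nAda,London\n"Hopper, Grace",New York\n'
rows = list(csv.DictReader(io.StringIO(csv_text)))
```

Every value comes back as plain text, so the reader has to know from outside the file that `uid` is a number or that two files refer to the same person.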
  
  databases
29. pyxelr 29 Sep 2019
  
  in Public
  
  Some advantages of this format
  
  Advantages of flat-file format:
  
  has robust, flexible toolkit
  
  easily managed without specialised software
  
  easy to understand and work with
  
  databases
30. pyxelr 29 Sep 2019
  
  in Public
  
  While flat file databases are simple, they are very limited in the level of complexity they can handle
  
  Disadvantages of flat-file databases:
  
  system that reads or manipulates the data cannot make easy connections between the data represented
  
  usually don't have any type of user or data concurrency features either
  
  usually only practical for systems with small read or write requirements. For example, many operating systems use flat-files to store configuration data
  
  databases
Visit annotations in context

Tags

databases

Annotators

pyxelr

URL

prisma.io/blog/comparison-of-database-models-1iz9u29nwn37
Dec 2018
blog.jonudell.net blog.jonudell.net

Designing for least knowledge

1
1. daveh70 27 Dec 2018
  
  in Public
  
  principle of least knowledge - Systems should be designed to limit the operator's access to user data.
  
  web development databases privacy
Visit annotations in context

Tags

databases

privacy

web development

Annotators

daveh70

URL

blog.jonudell.net/2018/12/27/designing-for-least-knowledge/
Aug 2018
avant.org avant.org

A Brief History of Databases

1
1. nichtich 22 Aug 2018
  
  in Public
  
  Electronic Recording Machine Accounting (ERMA)
  
  It also introduced bank account numbers, see https://en.wikipedia.org/wiki/Electronic_Recording_Machine,_Accounting
  
  bookkeeping databases history
Visit annotations in context

Tags

bookkeeping

databases

history

Annotators

nichtich

URL

avant.org/project/history-of-databases/
May 2018
hypothes.is hypothes.is

Hypothesis

1
1. urmilakhanna074 08 May 2018
  
  in Public
  
  hi there check out the SAS Basics Training and Tutorial with better Explanation on the Analytical tools and data workflow in the databases https://www.youtube.com/watch?v=wNQLAUaaMHE
  
  SAS Basics Dataworkflow Databases Datawarehousing formats of Data representation
Visit annotations in context

Tags

Datawarehousing

SAS Basics

Dataworkflow

Databases

formats of Data representation

Annotators

urmilakhanna074

URL

hypothes.is/users/urmilakhanna074
Oct 2017
eng.uber.com eng.uber.com

Why Uber Engineering Switched from Postgres to MySQL

2
1. tilgovi 05 Oct 2017
  
  in Public
  
  MySQL’s replication architecture means that if bugs do cause table corruption, the problem is unlikely to cause a catastrophic failure.
  
  I can't follow the reasoning here. I guess it's not guaranteed to replicate the corruption like Postgres would, but it seems totally possible to trigger similar or identical corruption because the implementation of the logical statement would be similar on the replica.
  
  databases postgres mysql
2. tilgovi 05 Oct 2017
  
  in Public
  
  The bug we ran into only affected certain releases of Postgres 9.2 and has been fixed for a long time now. However, we still find it worrisome that this class of bug can happen at all. A new version of Postgres could be released at any time that has a bug of this nature, and because of the way replication works, this issue has the potential to spread into all of the databases in a replication hierarchy.
  
  Not really a criticism of Postgres so much as it is a criticism of software in general.
  
  software databases
Visit annotations in context

Tags

databases

software

postgres

mysql

Annotators

tilgovi

URL

eng.uber.com/postgres-to-mysql-migration/
Aug 2017
cosette.cs.washington.edu cosette.cs.washington.edu

Cosette: An Automated SQL Solver

1
1. daveh70 17 Aug 2017
  
  in Public
  
  Cosette - an automated prover for checking equivalence of SQL queries. From the Database Group at the University of Washington.
  
  https://medium.com/@uwdb/introducing-cosette-527898504bd6
  
  programming databases web development
Visit annotations in context

Tags

programming

databases

web development

Annotators

daveh70

URL

cosette.cs.washington.edu/
Jun 2016
blog.jonudell.net blog.jonudell.net

Annotation is not (only) web comments

1
1. Enkerli 21 Jun 2016
  
  in Public
  
  If the RRID is well-formed, and if the lookup found the right record, a human validator tags it a valid RRID — one that can now be associated mechanically with occurrences of the same resource in other contexts. If the RRID is not well-formed, or if the lookup fails to find the right record, a human validator tags the annotation as an exception and can discuss with others how to handle it. If an RRID is just missing, the validator notes that with another kind of exception tag.
  
  Sounds a lot like the way reference managers work. In many cases, people keep the invalid or badly-formed results.
  
  reference managers Zotero ProCite Endnote BibTeX Mendeley information science library librarians citation databases
Visit annotations in context

Tags

citation databases

Endnote

Zotero

reference managers

BibTeX

ProCite

library

librarians

Mendeley

information science

Annotators

Enkerli

URL

blog.jonudell.net/2016/04/24/annotation-is-not-only-web-comments/
Apr 2016
biosharing.org biosharing.org

BioSharing: biodbcore-000595: GigaDB

1
1. scotted400 12 Apr 2016
  
  in Public
  
  Giga Science Database
  
  For more about GigaDB, see the paper in Database Journal: http://database.oxfordjournals.org/content/2014/bau018.full
  
  GigaDB databases open data
Visit annotations in context

Tags

open data

databases

GigaDB

Annotators

scotted400

URL

biosharing.org/biodbcore-000595
Jan 2016
pgexercises.com pgexercises.com

PostgreSQL Exercises

1
1. daveh70 16 Jan 2016
  
  in Public
  
  Exercises for learning PostgreSQL.
  
  sql databases programming
Visit annotations in context

Tags

programming

databases

sql

Annotators

daveh70

URL

pgexercises.com/
Dec 2015
math.mit.edu math.mit.edu

CT4S.pdf

2
1. bbarker 06 Dec 2015
  
  in Public
  
  Data gathering is ubiquitous in science. Giant databases are currently being mined for unknown patterns, but in fact there are many (many) known patterns that simply have not been catalogued. Consider the well-known case of medical records. A patient’s medical history is often known by various individual doctor-offices but quite inadequately shared between them. Sharing medical records often means faxing a hand-written note or a filled-in house-created form between offices.
  
  category theory patterns EHR EMR data databases
2. bbarker 01 Dec 2015
  
  in Public
  
  I will use a mathematical tool called ologs, or ontology logs, to give some structure to the kinds of ideas that are often communicated in pictures like the one on the cover. Each olog inherently offers a framework in which to record data about the subject. More precisely it encompasses a database schema, which means a system of interconnected tables that are initially empty but into which data can be entered.
  
  category theory databases term ontology ologs schema
Visit annotations in context

Tags

ologs

schema

ontology

category theory

databases

EMR

data

patterns

term

EHR

Annotators

bbarker

URL

math.mit.edu/~dspivak/teaching/sp13/CT4S.pdf
May 2015
www.mendeley.com www.mendeley.com

Import citations into your library using the Mendeley Web Importer | Mendeley

1
1. rvidal 05 May 2015
  
  in Public
  
  Supported sites
  
  Wow, so many!
  
  databases metadata
Visit annotations in context

Tags

databases

metadata

Annotators

rvidal

URL

mendeley.com/import/
Oct 2014
antirez.com antirez.com

Redis cluster, no longer vaporware. - Antirez weblog

1
1. tilgovi 09 Oct 2014
  
  in Public
  
  This in turn means that Redis Cluster does not have to take meta data in the data structures in order to attempt a value merge, and that the fancy commands and data structures supported by Redis are also supported by Redis Cluster. So no additional memory overhead, no API limits, no limits in the amount of elements a value can contain, but less safety during partitions.
  
  A solid trade-off, I think, and says a lot about the intended use cases.
  
  redis distributed systems databases
Visit annotations in context

Tags

databases

distributed systems

redis

Annotators

tilgovi

URL

antirez.com/news/79
Sep 2014
www.aerospike.com www.aerospike.com

Aerospike Technology

2
1. tilgovi 08 Sep 2014
  
  in Public
  
  Fast restart. If a server is temporarily taken down, this capability restores the index from a saved copy, eliminating delays due to index rebuilding.
  
  This point seems to be in direct contradiction to the claim above that "Indexes (primary and secondary) are always stored in DRAM for fast access and are never stored on Solid State Drives (SSDs) to ensure low wear."
  
  databases storage
2. tilgovi 08 Sep 2014
  
  in Public
  
  Unlike other databases that use the linux file system that was built for rotational drives, Aerospike has implemented a log structured file system to access flash – raw blocks on SSDs – directly.
  
  Does this really mean to suggest that Aerospike bypasses the linux block device layer? Is there a kernel driver? Does this mean I can't use any filesystem I want and know how to administrate? Is the claim that the "linux file system" (which I take to mean, I guess, the virtual file system layer) "built for rotation drives" even accurate? We've had ram disks for a long, long time. And before that we've had log structured filesystems, too, and even devices that aren't random access like tape drives. Seems like dubious claims all around.
  
  databases storage filesystems
Visit annotations in context

Tags

databases

filesystems

storage

Annotators

tilgovi

URL

aerospike.com/technology/
